Periodic Batch Auction

The defining architectural choice of Promovolve is that auctions run ahead of time, not per-request. When content is crawled (default schedule: 2am daily via Quartz cron), the system runs a full multi-phase auction and caches results in DData for instant serve-time lookups.

Auction Pipeline

┌─────────────────────────┐
│ Page Classification     │  LLM-based (Gemini/OpenAI/Anthropic)
│                         │  → IAB categories + confidence scores
└────────┬────────────────┘
         ▼
┌─────────────────────────┐
│ Category Ranking        │  TaxonomyRankerEntity per (category, site)
│                         │  → Thompson-sampled weights, 7-day half-life
└────────┬────────────────┘
         ▼
┌─────────────────────────┐
│ Bid Collection          │  CategoryBidderEntity (5 virtual shards)
│                         │  → CampaignDistributor (8 workers)
│                         │  → CampaignEntity bid responses
└────────┬────────────────┘
         ▼
┌─────────────────────────┐
│ Candidate Shortlisting  │  Fair selection: 1 per campaign, fill remainder
│                         │  → Top K per slot (default K=3)
└────────┬────────────────┘
         ▼
┌─────────────────────────┐
│ ServeIndex Caching      │  DData WriteLocal, 120-minute TTL
│                         │  → Replicated to all API nodes via gossip
└─────────────────────────┘
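
The Candidate Shortlisting stage ("1 per campaign, fill remainder", top K per slot) can be sketched roughly as follows. This is only an illustration of the selection rule as described in the diagram, not Promovolve's actual code: the `Bid` type, its `score` field, and the `shortlist` function are hypothetical names, and how the real system scores or tie-breaks bids is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    campaign_id: str
    creative_id: str
    score: float  # hypothetical ranking score; the real scoring is not specified here


def shortlist(bids: list[Bid], k: int = 3) -> list[Bid]:
    """Pick up to k candidates: one per campaign first, then fill the remainder."""
    ranked = sorted(bids, key=lambda b: b.score, reverse=True)
    chosen: list[Bid] = []
    seen_campaigns: set[str] = set()

    # Pass 1: at most one bid per campaign, best score first.
    for bid in ranked:
        if len(chosen) == k:
            break
        if bid.campaign_id not in seen_campaigns:
            chosen.append(bid)
            seen_campaigns.add(bid.campaign_id)

    # Pass 2: fill any remaining slots with the best leftover bids.
    for bid in ranked:
        if len(chosen) == k:
            break
        if bid not in chosen:
            chosen.append(bid)

    return chosen


bids = [Bid("a", "a1", 0.9), Bid("a", "a2", 0.8), Bid("b", "b1", 0.5)]
print([b.creative_id for b in shortlist(bids)])  # ['a1', 'b1', 'a2']
```

With only two campaigns bidding, the third slot falls back to campaign `a`'s second creative. With three or more campaigns, each slot would go to a different campaign.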

Periodic Re-Auction

Between crawl cycles, the system runs periodic re-auctions every 5 minutes (promovolve.auction.reauction-interval) for recent content within the 48-hour recency window. Additionally, event-driven re-auctions trigger on campaign/advertiser state changes.
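
As a rough sketch of how the two triggers combine, the check below re-auctions a page when either the interval has elapsed or a state change has arrived, provided the page is still inside the recency window. The function name, its parameters, and the idea of a per-page "last auction" timestamp are assumptions made for illustration. The source specifies only the 5-minute interval, the 48-hour window, and the existence of event-driven triggers.

```python
from datetime import datetime, timedelta

REAUCTION_INTERVAL = timedelta(minutes=5)  # promovolve.auction.reauction-interval
RECENCY_WINDOW = timedelta(hours=48)


def should_reauction(
    now: datetime,
    classified_at: datetime,
    last_auction: datetime,
    state_changed: bool = False,
) -> bool:
    """Return True if a page should be re-auctioned at time `now`."""
    # Content outside the recency window never participates.
    if now - classified_at > RECENCY_WINDOW:
        return False
    # Event-driven trigger: campaign or advertiser state changed.
    if state_changed:
        return True
    # Periodic trigger: the re-auction interval has elapsed.
    return now - last_auction >= REAUCTION_INTERVAL


now = datetime(2024, 1, 2, 12, 0)
recent = now - timedelta(hours=3)
print(should_reauction(now, recent, now - timedelta(minutes=5)))        # True
print(should_reauction(now, recent, now - timedelta(minutes=1)))        # False
print(should_reauction(now, recent, now - timedelta(minutes=1), True))  # True
```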

Content Recency Window

Only pages classified within the last 48 hours participate in auctions. Every 5 minutes, AuctioneerEntity runs cleanup to remove classifications older than 48 hours.
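
The cleanup pass amounts to filtering classifications by timestamp. The sketch below assumes, purely for illustration, that classifications are held as a mapping from page URL to classification time. The storage layout inside AuctioneerEntity is not described here, and `prune_stale` is a hypothetical name.

```python
from datetime import datetime, timedelta

RECENCY_WINDOW = timedelta(hours=48)


def prune_stale(
    classifications: dict[str, datetime], now: datetime
) -> dict[str, datetime]:
    """Keep only pages classified within the recency window."""
    cutoff = now - RECENCY_WINDOW
    return {url: ts for url, ts in classifications.items() if ts >= cutoff}


now = datetime(2024, 1, 3, 2, 0)
pages = {
    "/fresh": now - timedelta(hours=1),
    "/borderline": now - timedelta(hours=48),
    "/stale": now - timedelta(hours=48, minutes=1),
}
print(sorted(prune_stale(pages, now)))  # ['/borderline', '/fresh']
```

Because a pass like this runs only every 5 minutes, an entry could in principle outlive the 48-hour mark by up to one cleanup interval before it is removed.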

Key Configuration

Parameter              Value            Env Var
Re-auction interval    5 minutes        REAUCTION_INTERVAL
Content recency        48 hours
Crawl cron schedule    "0 0 2 * * ?"    Per-site config
Crawl max depth        2                Per-site config
Crawl concurrency      5                Per-site config
ServeIndex TTL         120 minutes
Taxonomy ask timeout   800ms
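
The crawl schedule uses Quartz cron syntax, which has a leading seconds field and an optional trailing year field, unlike five-field Unix cron. The helper below is a small illustrative sketch that labels each field of an expression. `describe_quartz_cron` is a hypothetical name, not part of Quartz or Promovolve, and it does not validate field values.

```python
QUARTZ_FIELDS = (
    "seconds",
    "minutes",
    "hours",
    "day_of_month",
    "month",
    "day_of_week",
)


def describe_quartz_cron(expr: str) -> dict[str, str]:
    """Label the fields of a Quartz cron expression (6 fields, optional year)."""
    parts = expr.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    fields = dict(zip(QUARTZ_FIELDS, parts))
    if len(parts) == 7:
        fields["year"] = parts[6]
    return fields


fields = describe_quartz_cron("0 0 2 * * ?")
print(fields["hours"], fields["minutes"], fields["seconds"])  # 2 0 0
```

Read this way, the default `"0 0 2 * * ?"` fires at 02:00:00 every day: `*` matches every day of the month and every month, and `?` leaves the day of the week unspecified.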