Write Consistency Levels
Akka Distributed Data (DData) supports multiple write consistency levels. Promovolve chooses the level per operation based on how critical durability is for that operation.
Consistency Choices (from ServeIndexDData.scala)
| Operation | Consistency | Timeout | Retries | Rationale |
|---|---|---|---|---|
| Put (full replacement) | WriteLocal | — | — | Speed; next auction refreshes |
| Append (single candidate) | WriteLocal | — | — | Speed; dedup prevents issues |
| CPM update | WriteLocal | — | — | Best-effort price refresh |
| FilterByCreativeIds | WriteLocal | — | — | Batch cleanup |
| Remove (takedown) | WriteMajority | 800ms | 5 (200ms backoff) | Must be durable |
| RemoveCampaignFromKey | WriteMajority | 800ms | 5 | Must be durable |
| RemoveCreativeFromKey | WriteMajority | 800ms | 5 | Must be durable |
| RemoveBySite | WriteMajority | 800ms | 5 | Must be durable |
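The split in the table can be encoded as a simple policy lookup. This is an illustrative sketch only, assuming nothing about the real code: the `Consistency`, `WritePolicy`, and `policyFor` names are made up here, not taken from ServeIndexDData.scala.

```scala
import scala.concurrent.duration._

// Illustrative types standing in for Akka's consistency settings.
sealed trait Consistency
case object WriteLocal extends Consistency
final case class WriteMajority(timeout: FiniteDuration) extends Consistency

final case class WritePolicy(
    consistency: Consistency,
    maxRetries: Int,
    initialBackoff: FiniteDuration
)

// Hypothetical helper mapping each operation (by name) to the settings
// in the table above.
def policyFor(op: String): WritePolicy = op match {
  // Hot-path writes: speed over durability; gossip and the TTL sweep
  // repair any lost write.
  case "Put" | "Append" | "CpmUpdate" | "FilterByCreativeIds" =>
    WritePolicy(WriteLocal, maxRetries = 0, initialBackoff = Duration.Zero)
  // Takedowns: must be durable, so wait for a majority and retry on timeout.
  case "Remove" | "RemoveCampaignFromKey" |
       "RemoveCreativeFromKey" | "RemoveBySite" =>
    WritePolicy(WriteMajority(800.millis), maxRetries = 5,
                initialBackoff = 200.millis)
  case other =>
    throw new IllegalArgumentException(s"unknown operation: $other")
}
```

Centralizing the choice like this keeps the durability decision in one place instead of scattering timeouts and retry counts across call sites.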
Why WriteLocal for Puts?
Auction results are written frequently and losing one write is not catastrophic:
- The next crawl cycle produces fresh results
- Gossip replicates to other nodes within seconds (2s gossip interval)
- Stale data is caught by the TTL sweep
Why WriteMajority for Removes?
Removes must be durable. If a remove only reaches one node and that node crashes:
- The entry reappears on restart from other nodes’ copies
- A “zombie” creative that was supposed to be taken down continues serving
- This is a compliance/safety concern (paused campaigns, suspended advertisers)
WriteMajority ensures the remove is acknowledged by a majority of nodes before returning.
Retry Strategy
MaxRemoveRetries = 5
InitialRetryBackoff = 200.millis
If a WriteMajority write times out (after 800ms), the remove is retried with exponential backoff starting at 200ms. After 5 consecutive failures, the failure is logged and the stale entry will be caught by the next TTL sweep.
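The retry loop described above can be sketched in plain Scala. This is a minimal, self-contained sketch, not the actual implementation: `retryWithBackoff` and `attempt` are hypothetical names, and `attempt` stands in for the WriteMajority update (returning whether it was acknowledged in time).

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

// Retry a fallible action, doubling the backoff after each failure.
// Gives up after `retriesLeft` retries beyond the initial attempt.
@tailrec
def retryWithBackoff(attempt: () => Boolean,
                     retriesLeft: Int,
                     backoff: FiniteDuration): Boolean =
  if (attempt()) true
  else if (retriesLeft <= 0) false // give up; the TTL sweep catches it later
  else {
    Thread.sleep(backoff.toMillis)
    retryWithBackoff(attempt, retriesLeft - 1, backoff * 2)
  }
```

With `MaxRemoveRetries = 5` and `InitialRetryBackoff = 200.millis`, a call such as `retryWithBackoff(tryRemove, 5, 200.millis)` would wait 200ms, 400ms, 800ms, 1600ms, and 3200ms between successive attempts before giving up.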
Eventual Consistency Window
WriteLocal operations have a brief window (typically <2s, matching gossip interval) where different API nodes see different ServeIndex contents. This means:
- Two concurrent requests to different nodes might get different creatives
- A just-written entry might not be visible everywhere immediately
These are acceptable because:
- Thompson Sampling already introduces per-request randomness
- The 15-minute RL window averages over many decisions
- Budget and pause checks at serve time catch any “shouldn’t serve” cases
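The visibility window can be illustrated with a toy model. This uses plain `Map`s rather than Akka DData, and the key/value names are made up: the point is only that a WriteLocal put is visible on the writing node immediately, and on a peer only after the next gossip merge.

```scala
// Two "replicas" of the ServeIndex, modeled as plain maps.
var replicaA = Map.empty[String, String]
var replicaB = Map.empty[String, String]

replicaA += ("site-1" -> "creative-42") // WriteLocal on node A

assert(replicaA.contains("site-1"))  // visible locally at once
assert(!replicaB.contains("site-1")) // node B cannot see it yet

replicaB = replicaB ++ replicaA // next gossip round (< ~2s) merges state

assert(replicaB("site-1") == "creative-42") // now visible everywhere
```

During the window between the local write and the merge, two requests hitting node A and node B can see different ServeIndex contents, which is exactly the divergence described above.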