Rework call cache eviction to use ctid-based join deletes #6477
Open
The ttl_max_contracts parameter limited how many contracts got processed per invocation. With the upcoming switch to a more efficient delete strategy that doesn't iterate by contract, this parameter no longer maps to anything meaningful. Remove it from the trait, CLI, and all implementations.
Replace the two-loop structure (fetch stale contracts, then delete their call_cache rows in tiny batches) with a single adaptive loop that joins call_cache to call_meta directly and deletes by ctid. The old approach did `WHERE contract_address = ANY(N addresses)` on the 6.9B row call_cache table repeatedly, which required expensive index scans. The new approach lets Postgres join the small filtered call_meta result against call_cache using the contract_address index, then deletes by physical row address (ctid), which is the fastest possible delete path. Also removes the GRAPH_STORE_STALE_CALL_CACHE_CONTRACTS_BATCH_SIZE env var since batch size is now fully controlled by AdaptiveBatchSize.
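The single adaptive loop described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the SQL text, the `BatchSize` type (a simplified stand-in for `AdaptiveBatchSize`), the 2x growth cap, and the `evict_stale_rows` function are all assumptions made for the example, and the database call is abstracted behind a closure so the loop logic stands on its own.

```rust
use std::time::Duration;

// Hypothetical SQL for one batch ($1 = cutoff date, $2 = batch limit).
// The column names follow the commit message; the exact query in the PR may differ.
const DELETE_BATCH_SQL: &str = "
    delete from call_cache
     where ctid = any(array(
        select cc.ctid
          from call_meta cm
          join call_cache cc on cc.contract_address = cm.contract_address
         where cm.accessed_at < $1
         limit $2))";

/// Simplified stand-in for AdaptiveBatchSize (assumed behavior):
/// scale the batch size so that one batch takes roughly `target` time.
struct BatchSize {
    size: i64,
    target: Duration,
}

impl BatchSize {
    fn adapt(&mut self, elapsed: Duration) {
        // Guard against division by zero for very fast batches.
        let ratio = self.target.as_secs_f64() / elapsed.as_secs_f64().max(0.001);
        // Grow by at most 2x per step; never drop below one row.
        self.size = ((self.size as f64 * ratio.min(2.0)) as i64).max(1);
    }
}

/// Runs `delete_batch(limit)` until it reports zero deleted rows.
/// `delete_batch` is assumed to execute DELETE_BATCH_SQL with the given limit
/// and return (rows deleted, time taken).
fn evict_stale_rows<F>(mut delete_batch: F, initial_size: i64, target: Duration) -> u64
where
    F: FnMut(i64) -> (u64, Duration),
{
    let mut batch = BatchSize { size: initial_size, target };
    let mut total = 0u64;
    loop {
        let (deleted, elapsed) = delete_batch(batch.size);
        if deleted == 0 {
            break;
        }
        total += deleted;
        batch.adapt(elapsed);
    }
    total
}
```

The point of the sketch is that the loop no longer iterates over contracts: each pass asks Postgres for up to `limit` stale row addresses and deletes them, and only the limit changes between passes.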
…ache When --max-contracts is passed alongside --ttl-days, compute an effective TTL that limits eviction to at most that many contracts. The effective TTL is determined by looking at the (max_contracts+1)th oldest stale entry in call_meta; if it exists, the cutoff is set to its accessed_at date so that contracts at that date are excluded. When multiple contracts share the boundary date, fewer than max_contracts may be deleted since the cutoff operates at date granularity.
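To make the boundary behavior concrete, here is a small in-memory sketch of the effective TTL computation. It is an illustration under stated assumptions rather than the PR's implementation: contract ages are modeled as whole days since `accessed_at`, "stale" is assumed to mean strictly older than the TTL, and the function names `effective_ttl_days` and `contracts_evicted` are hypothetical. The real code works against `call_meta` in the database.

```rust
/// Returns the TTL (in days) that is actually applied.
/// `ages_in_days` holds one entry per contract: days since its accessed_at date.
fn effective_ttl_days(ages_in_days: &[u32], ttl_days: u32, max_contracts: usize) -> u32 {
    // Assumption: a contract is stale when it is strictly older than the TTL.
    let mut stale: Vec<u32> = ages_in_days
        .iter()
        .copied()
        .filter(|&age| age > ttl_days)
        .collect();
    // Oldest first.
    stale.sort_unstable_by(|a, b| b.cmp(a));
    // Index `max_contracts` is the (max_contracts + 1)th oldest stale entry.
    match stale.get(max_contracts) {
        // Use its age as the cutoff, so contracts at that date are excluded.
        Some(&boundary) => boundary,
        // Fewer stale contracts than the cap: the requested TTL applies unchanged.
        None => ttl_days,
    }
}

/// Counts the contracts that fall strictly beyond the effective TTL.
fn contracts_evicted(ages_in_days: &[u32], effective_ttl: u32) -> usize {
    ages_in_days.iter().filter(|&&age| age > effective_ttl).count()
}
```

With ages `[100, 90, 80, 80, 70, 10]`, a TTL of 30 days, and a cap of 3 contracts, the fourth-oldest stale entry has age 80, so the effective TTL becomes 80 and only the two contracts aged 100 and 90 are evicted. This is the case the commit message describes, where a shared boundary date means fewer than `max_contracts` contracts are deleted.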
Return a StaleCallCacheResult struct with effective_ttl_days, cache_entries_deleted, and contracts_deleted so the graphman command can report what actually happened to the user.
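A result type along these lines might look like the sketch below. Only the struct name and the three field names come from the commit message; the field types, the `Display` implementation, and the wording of the report line are assumptions for illustration.

```rust
use std::fmt;

/// Summary of one eviction run. Field types are assumed, not taken from the PR.
#[derive(Debug, Clone, PartialEq, Eq)]
struct StaleCallCacheResult {
    /// TTL actually applied, which may exceed the requested TTL
    /// when --max-contracts narrows the cutoff.
    effective_ttl_days: usize,
    /// Number of call_cache rows removed.
    cache_entries_deleted: usize,
    /// Number of contracts whose entries were removed.
    contracts_deleted: usize,
}

impl fmt::Display for StaleCallCacheResult {
    // Hypothetical report format for the graphman command output.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Removed {} cache entries for {} contracts (effective TTL: {} days)",
            self.cache_entries_deleted, self.contracts_deleted, self.effective_ttl_days
        )
    }
}
```

Returning the effective TTL alongside the counts lets the command show the operator when the cap changed the cutoff they asked for.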
The old eviction approach ran `WHERE contract_address = ANY(...)` against a multi-billion-row `call_cache` table repeatedly, requiring expensive index scans for each batch of contracts. The new ctid-based join lets Postgres scan the small filtered `call_meta` result, join it against `call_cache` using the `contract_address` index, and delete by physical row address, the fastest possible delete path.

The old `--ttl-max-contracts` parameter limited how many contracts were fetched and iterated per invocation, which no longer maps to anything meaningful with the join-based approach. The new `--max-contracts` instead adjusts the TTL cutoff date so that at most that many contracts fall below it, giving operators a way to cap the scope of a single eviction run.

Changes