Summary
The gossip_signatures HashMap in crates/storage/src/store.rs:265 has no size cap. It is only pruned when finalization advances (prune_gossip_signatures at line 607, called from update_checkpoints at line 508). During a finalization stall, every gossip attestation from every validator at every slot is inserted and never removed.
Observed Impact
In the devnet4 test_1 run (2026-04-07/08, ~18.5h), finalization stalled at slot 10733 for 6+ hours. During this period:
- Node 0 (aggregator) accumulated attestations indefinitely, with attestation_count in produced blocks growing monotonically from ~5 to 797
- gossip_signatures entries piled up across all unfinalized slots
- This contributed to the RocksDB file descriptor exhaustion crash on node 0
Root Cause
gossip_signatures relies entirely on finalization-based pruning:
    pub fn prune_gossip_signatures(&mut self, finalized_slot: u64) -> usize {
        let mut gossip = self.gossip_signatures.lock().unwrap();
        // Keep only entries newer than the finalized slot.
        gossip.retain(|_, entry| entry.data.slot > finalized_slot);
        // ...
    }
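When finalization stops, finalized_slot never advances, so the retain predicate keeps every entry and nothing is ever evicted. With 4-6 validators each producing an attestation every 4-second slot (900 slots/hour), this is ~3,600-5,400 entries/hour, each carrying a ValidatorSignature (~3 KB XMSS signature).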
By contrast, known_payloads (cap=512) and new_payloads (cap=64) are correctly bounded with FIFO eviction via PayloadBuffer.
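The growth rate above can be checked with a small back-of-the-envelope helper. This is only an illustrative sketch: the function names are hypothetical, it assumes exactly one gossip attestation per validator per slot, and it counts signature bytes only (3,000 bytes as a stand-in for "~3 KB"), ignoring map and attestation-data overhead.

```rust
/// Entries inserted per hour, assuming one attestation per validator per slot.
/// `slot_seconds` must be non-zero.
pub fn entries_per_hour(validators: u64, slot_seconds: u64) -> u64 {
    let slots_per_hour = 3600 / slot_seconds;
    validators * slots_per_hour
}

/// Approximate signature bytes retained after a stall of `stall_hours`,
/// assuming nothing is evicted during the stall.
pub fn stalled_signature_bytes(
    validators: u64,
    slot_seconds: u64,
    stall_hours: u64,
    signature_bytes: u64,
) -> u64 {
    entries_per_hour(validators, slot_seconds) * stall_hours * signature_bytes
}
```

Under these assumptions, 6 validators over a 6-hour stall give 32,400 entries, or roughly 97 MB of signature data. The real figure depends on how entries are keyed and deduplicated, so treat it as an upper-bound estimate rather than a measurement.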
Suggested Fix
Add a hard cap to gossip_signatures with FIFO or slot-based eviction, similar to the PayloadBuffer pattern used for aggregated payloads. This ensures bounded memory usage regardless of finalization progress.
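A minimal sketch of what a capped map with FIFO eviction could look like is shown below. It is not the actual PayloadBuffer implementation: BoundedGossipSignatures, GossipEntry, SigKey, and the (validator index, slot) key shape are all hypothetical stand-ins, and the real key and entry types in store.rs may differ.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical key: (validator index, slot). The real key type may differ.
pub type SigKey = (u64, u64);

/// Stand-in for the stored entry (attestation data plus signature bytes).
#[derive(Debug, Clone, PartialEq)]
pub struct GossipEntry {
    pub slot: u64,
    pub signature: Vec<u8>,
}

/// Map with a hard capacity and FIFO (insertion-order) eviction.
pub struct BoundedGossipSignatures {
    capacity: usize,
    entries: HashMap<SigKey, GossipEntry>,
    /// Keys in insertion order; always holds exactly the keys of `entries`.
    order: VecDeque<SigKey>,
}

impl BoundedGossipSignatures {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Inserts an entry and evicts the oldest insertions while over capacity.
    /// Returns the number of evicted entries.
    pub fn insert(&mut self, key: SigKey, entry: GossipEntry) -> usize {
        // Only new keys get a queue slot; replacing a value keeps its position.
        if self.entries.insert(key, entry).is_none() {
            self.order.push_back(key);
        }
        let mut evicted = 0;
        while self.entries.len() > self.capacity {
            match self.order.pop_front() {
                Some(oldest) => {
                    if self.entries.remove(&oldest).is_some() {
                        evicted += 1;
                    }
                }
                None => break,
            }
        }
        evicted
    }

    /// Finalization-based pruning, mirroring the existing `retain` logic.
    /// Returns the number of removed entries.
    pub fn prune(&mut self, finalized_slot: u64) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, e| e.slot > finalized_slot);
        // Drop queue slots whose entries were just pruned.
        let entries = &self.entries;
        self.order.retain(|k| entries.contains_key(k));
        before - self.entries.len()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn contains(&self, key: &SigKey) -> bool {
        self.entries.contains_key(key)
    }
}
```

With this shape, finalization-based pruning still runs as before, while the cap bounds memory during a stall. FIFO eviction drops the oldest insertions first; a slot-based variant would instead evict the entries with the lowest slot, which may be preferable if gossip arrives out of order.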