The described behavior is a classic cache stampede (also called the “thundering herd” problem): a large set of hot keys share the same expiration time, so they all expire simultaneously. When that happens, many requests miss the cache at the same moment and fall back to the backend database, creating a sudden spike in database load and causing increased latency and poor user experience.
The most effective fix that preserves freshness is to introduce jitter (a small, randomized variance) into the TTL values so cached objects do not all expire at the same time. With TTL jitter, each item still has a bounded lifetime, so freshness is maintained, but expirations are spread out over time. This smooths the cache-miss rate and prevents synchronized regeneration, reducing bursts of database traffic. Adding TTL jitter is a common best practice in high-load caching systems precisely because it avoids mass-expiration events.
Option A (increase TTL) might reduce how frequently a stampede occurs, but it does not solve the root cause: keys can still align and expire together, and a longer TTL also reduces freshness if product data changes. Option B (more replicas, disable cluster mode) changes topology and read scaling, but it does not prevent synchronized expiry or the resulting backend spike; disabling cluster mode could also reduce scalability for large keyspaces. Option D (more shards) can improve cache throughput capacity, but it does not address the core issue that backend load spikes when items expire simultaneously; lowering the replica count could even reduce read availability.
Therefore, C is the correct solution: keep TTL-based freshness, but add a randomized component (for example, TTL = baseTTL + random(0..jitter)) so expirations are distributed and the backend database is protected from sudden bursts.
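As a concrete illustration of the TTL = baseTTL + random(0..jitter) pattern, here is a minimal Python sketch. The `InMemoryCache` class and the `BASE_TTL`/`JITTER_MAX` values are hypothetical stand-ins for illustration; any cache client with a `set(key, value, ttl)` call (e.g., a Redis-style client) would be used the same way.

```python
import random
import time

# Hypothetical minimal cache used only to demonstrate jittered expiry;
# in production this would be a real cache client (e.g., Redis).
class InMemoryCache:
    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl_seconds):
        # Store the value alongside its absolute expiry time.
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

BASE_TTL = 300    # example: 5-minute baseline freshness window
JITTER_MAX = 60   # example: up to 1 extra minute of random spread

def ttl_with_jitter(base_ttl=BASE_TTL, jitter_max=JITTER_MAX):
    # TTL = baseTTL + random(0..jitter): each key gets a slightly
    # different lifetime, so bulk-loaded keys expire at staggered
    # times instead of all at once.
    return base_ttl + random.uniform(0, jitter_max)

# Even if all these keys are written in the same instant, their
# expirations are spread across a 60-second window.
cache = InMemoryCache()
for product_id in range(5):
    cache.set(f"product:{product_id}", {"id": product_id}, ttl_with_jitter())
```

The jitter window is a tuning knob: a wider window spreads backend regeneration load more thinly but lets some items live slightly longer than the baseline TTL, so it should stay small relative to the acceptable staleness bound.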