In high-concurrency systems, Redis has become standard cache middleware: it effectively reduces database pressure and improves response times. Caching is not a cure-all, however. In practice we often face a serious problem - cache penetration.
When it happens, the cache layer is effectively bypassed and a flood of requests hits the database directly, causing system performance to drop sharply or even bringing the system down.
How cache penetration works
What is cache penetration
Cache penetration means querying data that does not exist at all. Because the cache has no entry for it, the request passes straight through the cache layer to the database. The database cannot find the data either, so nothing is written back to the cache, and every subsequent request for the same key hits the database again.
Typical scenarios and hazards
Client ---> Redis (miss) ---> Database (no matching row) ---> Cache not updated ---> Same flow repeats on every request
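To make the flow concrete, below is a minimal sketch of the unprotected cache-aside read path that the rest of this article hardens. Field names such as redisTemplate and userMapper, and the fastjson-style JSON helper, are illustrative assumptions rather than code from a specific project; the point is that a non-existent id is never written to the cache, so the database is hit on every repeat request.

// Vulnerable cache-aside read path (illustrative names; fastjson-style serialization assumed)
public User getUserById(Long userId) {
    String key = "user:" + userId;
    String cached = redisTemplate.opsForValue().get(key);      // 1. try the cache
    if (cached != null) {
        return JSON.parseObject(cached, User.class);            // 2. cache hit
    }
    User user = userMapper.selectById(userId);                  // 3. miss: query the database
    if (user != null) {
        // 4. only existing rows are cached; a non-existent id never gets a cache entry,
        //    so every later query for that id falls through to the database again
        redisTemplate.opsForValue().set(key, JSON.toJSONString(user), 1, TimeUnit.HOURS);
    }
    return user;
}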
The main hazards of cache penetration:
- Database pressure surges: large numbers of useless queries land directly on the database
- Slow system response: excessive database load degrades overall performance
- Wasted resources: invalid queries consume CPU and I/O for nothing
- Security risk: the behavior can be exploited as a denial-of-service attack vector
There are usually two situations for cache penetration:
- Normal business queries: the requested data simply does not exist
- Malicious attacks: deliberately constructing non-existent keys and sending them in large volumes
The following are six effective prevention strategies.
Strategy 1: Empty value cache
Principle
Caching empty values is the simplest and most direct anti-penetration strategy. When the database has no row for a key, we still cache the "empty result" (usually a null marker or a dedicated placeholder value) with a relatively short expiration time. The next request for the same non-existent key is then answered directly from the cache instead of querying the database again.
Implementation example
@Service
public class UserServiceImpl implements UserService {

    // JSON refers to a fastjson-style helper; userMapper.selectById is a MyBatis-style query (names representative)
    @Autowired
    private StringRedisTemplate redisTemplate;

    @Autowired
    private UserMapper userMapper;

    private static final String KEY_PREFIX = "user:";
    private static final String EMPTY_VALUE = "{}";              // empty value marker
    private static final long EMPTY_VALUE_EXPIRE_SECONDS = 300;  // expiration for empty values
    private static final long NORMAL_EXPIRE_SECONDS = 3600;      // expiration for normal values

    @Override
    public User getUserById(Long userId) {
        String redisKey = KEY_PREFIX + userId;

        // 1. Query the cache
        String userJson = redisTemplate.opsForValue().get(redisKey);

        // 2. Cache hit
        if (userJson != null) {
            // Check whether it is the empty value marker
            if (EMPTY_VALUE.equals(userJson)) {
                return null; // Return an empty result without touching the database
            }
            // Normal cache entry: deserialize and return
            return JSON.parseObject(userJson, User.class);
        }

        // 3. Cache miss: query the database
        User user = userMapper.selectById(userId);

        // 4. Write back to the cache
        if (user != null) {
            // The row exists: cache the real value
            redisTemplate.opsForValue().set(redisKey, JSON.toJSONString(user),
                    NORMAL_EXPIRE_SECONDS, TimeUnit.SECONDS);
        } else {
            // The row does not exist: cache the empty value marker with a short TTL
            redisTemplate.opsForValue().set(redisKey, EMPTY_VALUE,
                    EMPTY_VALUE_EXPIRE_SECONDS, TimeUnit.SECONDS);
        }
        return user;
    }
}
Pros and cons analysis
Advantages
- Simple implementation without additional components
- Low invasiveness to the system
- Immediate results
Disadvantages
- May occupy extra cache space
- A large number of cached empty values can lower overall cache efficiency
- Cannot cope with large-scale malicious attacks (every new key still reaches the database once)
- Short-lived data inconsistency is possible: after new data is inserted, the cache may still return the empty marker until it expires (see the sketch below for one way to shorten this window)
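A minimal sketch of that mitigation, assuming the same redisTemplate and userMapper fields as in the service above (the createUser method name and the delete-then-write choice are illustrative): when a new row is written, the cached empty marker for its key is evicted so the next read repopulates the cache from the database.

// Hedged sketch: evict a possibly cached empty marker when the real data is created
public void createUser(User user) {
    userMapper.insert(user);                       // 1. write the database first
    String redisKey = KEY_PREFIX + user.getId();
    redisTemplate.delete(redisKey);                // 2. drop a cached EMPTY_VALUE, if any
    // Alternatively, write the fresh value immediately so the next read is a cache hit:
    // redisTemplate.opsForValue().set(redisKey, JSON.toJSONString(user), NORMAL_EXPIRE_SECONDS, TimeUnit.SECONDS);
}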
Strategy 2: Bloom filter
Principle
A Bloom filter is a space-efficient probabilistic data structure used to test whether an element belongs to a set. Its defining property is that it can produce false positives (an element that was never added may be reported as present) but never false negatives (an element that was added is always reported as present).
A Bloom filter consists of a long bit array and a set of hash functions. To insert an element, each hash function is applied to it and the corresponding bit positions are set to 1. To query, the same hashes are computed and the corresponding bits are checked: if any bit is 0, the element definitely does not exist; if all bits are 1, the element may exist.
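The space/accuracy trade-off follows from the standard sizing formulas m = -n*ln(p)/(ln 2)^2 bits and k = (m/n)*ln 2 hash functions, for n expected elements and a target false-positive rate p. The small sketch below (not part of any implementation in this article) evaluates them for the parameters used later:

// Standard Bloom filter sizing formulas, evaluated for 1,000,000 keys at a 1% error rate
public final class BloomFilterSizing {
    public static void main(String[] args) {
        long n = 1_000_000;     // expected number of keys
        double p = 0.01;        // acceptable false-positive rate
        double m = -n * Math.log(p) / Math.pow(Math.log(2), 2); // required bits
        double k = (m / n) * Math.log(2);                        // optimal number of hash functions
        System.out.printf("bits ~ %.0f (about %.1f MB), hash functions ~ %d%n",
                m, m / 8 / 1024 / 1024, (long) Math.ceil(k));
        // Prints roughly 9.6 million bits (about 1.1 MB) and 7 hash functions.
    }
}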
Implementation example
This example uses the RedisBloom module (Redis 4.0+ supports loadable modules; RedisBloom must be installed on the server):
@Service
public class ProductServiceWithBloomFilter implements ProductService {

    // BF.RESERVE / BF.ADD / BF.EXISTS are sent as raw RedisBloom commands;
    // JSON is a fastjson-style helper and the mapper method names are representative
    private static final Logger log = LoggerFactory.getLogger(ProductServiceWithBloomFilter.class);

    @Autowired
    private StringRedisTemplate redisTemplate;

    @Autowired
    private ProductMapper productMapper;

    private static final String BLOOM_FILTER_NAME = "product_filter";
    private static final String CACHE_KEY_PREFIX = "product:";
    private static final long CACHE_EXPIRE_SECONDS = 3600;

    // Initialize the Bloom filter; executed once when the application starts
    @PostConstruct
    public void initBloomFilter() {
        // Check whether the Bloom filter already exists
        Boolean exists = redisTemplate.execute((RedisCallback<Boolean>) connection ->
                connection.exists(BLOOM_FILTER_NAME.getBytes()));
        if (!Boolean.TRUE.equals(exists)) {
            // Create a Bloom filter sized for about 1 million elements with a 0.01 error rate
            redisTemplate.execute((RedisCallback<Object>) connection ->
                    connection.execute("BF.RESERVE", BLOOM_FILTER_NAME.getBytes(),
                            "0.01".getBytes(), "1000000".getBytes()));
            // Load all product IDs into the Bloom filter
            List<Long> allProductIds = productMapper.selectAllIds();
            for (Long id : allProductIds) {
                redisTemplate.execute((RedisCallback<Boolean>) connection ->
                        (long) connection.execute("BF.ADD", BLOOM_FILTER_NAME.getBytes(),
                                id.toString().getBytes()) != 0);
            }
        }
    }

    @Override
    public Product getProductById(Long productId) {
        String cacheKey = CACHE_KEY_PREFIX + productId;

        // 1. Ask the Bloom filter whether the ID can exist at all
        Boolean mayExist = redisTemplate.execute((RedisCallback<Boolean>) connection ->
                (long) connection.execute("BF.EXISTS", BLOOM_FILTER_NAME.getBytes(),
                        productId.toString().getBytes()) != 0);
        // If the Bloom filter says the ID does not exist, return immediately
        if (!Boolean.TRUE.equals(mayExist)) {
            return null;
        }

        // 2. Query the cache
        String productJson = redisTemplate.opsForValue().get(cacheKey);
        if (productJson != null) {
            return JSON.parseObject(productJson, Product.class);
        }

        // 3. Query the database
        Product product = productMapper.selectById(productId);

        // 4. Update the cache
        if (product != null) {
            redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(product),
                    CACHE_EXPIRE_SECONDS, TimeUnit.SECONDS);
        } else {
            // Bloom filter false positive: the product does not exist in the database.
            // Consider recording such cases to tune the Bloom filter parameters.
            log.warn("Bloom filter false positive for productId: {}", productId);
        }
        return product;
    }

    // When a new product is created, its ID must also be added to the Bloom filter
    public void addProductToBloomFilter(Long productId) {
        redisTemplate.execute((RedisCallback<Boolean>) connection ->
                (long) connection.execute("BF.ADD", BLOOM_FILTER_NAME.getBytes(),
                        productId.toString().getBytes()) != 0);
    }
}
Pros and cons analysis
Advantages
- High space efficiency and small memory usage
- Fast queries: time complexity is O(k), where k is the number of hash functions
- It can effectively filter most non-existent ID queries
- Can be used in combination with other policies
Disadvantages
- False positives are possible
- Elements cannot be removed from a standard Bloom filter
- All existing IDs must be loaded in advance, which does not suit data that changes frequently
- The implementation is more complex and the Bloom filter itself must be maintained
- Periodic rebuilding may be needed to keep up with data changes (see the rebuild sketch after this list)
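One common way to handle the last two points is to rebuild the filter from the database on a schedule. The sketch below is an illustrative arrangement rather than part of the original service: it assumes the same StringRedisTemplate, a productMapper.selectAllIds() query (name assumed), and RedisBloom. It builds the new filter under a temporary key and swaps it in with RENAME so readers never see a half-built filter.

// Hedged sketch: periodically rebuild the Bloom filter so deleted IDs eventually disappear
@Scheduled(cron = "0 30 3 * * ?")   // every day at 03:30
public void rebuildBloomFilter() {
    String tmpKey = "product_filter_rebuild";
    redisTemplate.delete(tmpKey);
    // Reserve a fresh filter (BF.RESERVE <key> <error_rate> <capacity>)
    redisTemplate.execute((RedisCallback<Object>) connection ->
            connection.execute("BF.RESERVE", tmpKey.getBytes(), "0.01".getBytes(), "1000000".getBytes()));
    // Reload the current set of IDs from the database
    for (Long id : productMapper.selectAllIds()) {
        redisTemplate.execute((RedisCallback<Object>) connection ->
                connection.execute("BF.ADD", tmpKey.getBytes(), id.toString().getBytes()));
    }
    // Atomically replace the live filter
    redisTemplate.rename(tmpKey, BLOOM_FILTER_NAME);
}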
Strategy 3: Request parameter verification
Principle
Request parameter validation fights cache penetration at the business level. By checking the legality of request parameters, obviously unreasonable requests are filtered out before they ever reach the cache or database tiers. This approach is particularly effective against malicious attacks.
Implementation example
@RestController
@RequestMapping("/api/user")
public class UserController {

    @Autowired
    private UserService userService;

    @GetMapping("/{userId}")
    public ResponseEntity<?> getUserById(@PathVariable String userId) {
        // 1. Basic format validation
        if (!userId.matches("\\d+")) {
            return ResponseEntity.badRequest().body("UserId must be numeric");
        }

        // 2. Basic range validation
        long id = Long.parseLong(userId);
        if (id <= 0 || id > 100000000) { // assumed valid ID range
            return ResponseEntity.badRequest().body("UserId out of valid range");
        }

        // 3. Delegate to the business service
        User user = userService.getUserById(id);
        if (user == null) {
            return ResponseEntity.notFound().build();
        }
        return ResponseEntity.ok(user);
    }
}
Parameter verification can also be added at the service layer:
@Service
public class UserServiceImpl implements UserService {

    private static final Logger log = LoggerFactory.getLogger(UserServiceImpl.class);

    // Whitelist: only IDs with these prefixes are allowed (example rule)
    private static final Set<String> ID_PREFIXES = Set.of("100", "200", "300");

    @Override
    public User getUserById(Long userId) {
        // More complex business-rule validation
        String idStr = userId.toString();
        boolean valid = false;
        for (String prefix : ID_PREFIXES) {
            if (idStr.startsWith(prefix)) {
                valid = true;
                break;
            }
        }
        if (!valid) {
            log.warn("Attempt to access invalid user ID pattern: {}", userId);
            return null;
        }
        // Normal business logic...
        return getUserFromCacheOrDb(userId);
    }
}
Pros and cons analysis
Advantages
- Simple implementation without additional components
- Intercepts obviously unreasonable requests as early as possible
- Business rules allow fine-grained control
- Reduce the overall burden on the system
Disadvantages
- Cannot cover every kind of illegal request
- Designing sensible validation rules requires good knowledge of the business
- May introduce complex business logic
- Overly strict validation can hurt the experience of legitimate users
Strategy 4: Rate limiting and circuit breaking
Principle
Rate limiting controls how frequently the system can be accessed and keeps bursts of traffic from overwhelming it. Circuit breaking temporarily rejects part of the traffic when the system is overloaded in order to protect it. Combined, the two mechanisms effectively contain the systemic risk that cache penetration creates.
Implementation example
Use Spring Boot + Resilience4j to implement rate limiting and circuit breaking:
@Configuration
public class ResilienceConfig {

    @Bean
    public RateLimiterRegistry rateLimiterRegistry() {
        RateLimiterConfig config = RateLimiterConfig.custom()
                .limitRefreshPeriod(Duration.ofSeconds(1))
                .limitForPeriod(100)                  // 100 requests per second
                .timeoutDuration(Duration.ofMillis(25))
                .build();
        return RateLimiterRegistry.of(config);
    }

    @Bean
    public CircuitBreakerRegistry circuitBreakerRegistry() {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)             // a 50% failure rate opens the circuit
                .slidingWindowSize(100)               // based on the last 100 calls
                .minimumNumberOfCalls(10)             // at least 10 calls before the breaker can trip
                .waitDurationInOpenState(Duration.ofSeconds(10)) // how long the circuit stays open
                .build();
        return CircuitBreakerRegistry.of(config);
    }
}

@Service
public class ProductServiceWithResilience {

    private static final Logger log = LoggerFactory.getLogger(ProductServiceWithResilience.class);

    private final ProductMapper productMapper;
    private final StringRedisTemplate redisTemplate;
    private final RateLimiter rateLimiter;
    private final CircuitBreaker circuitBreaker;

    public ProductServiceWithResilience(
            ProductMapper productMapper,
            StringRedisTemplate redisTemplate,
            RateLimiterRegistry rateLimiterRegistry,
            CircuitBreakerRegistry circuitBreakerRegistry) {
        this.productMapper = productMapper;
        this.redisTemplate = redisTemplate;
        this.rateLimiter = rateLimiterRegistry.rateLimiter("productService");
        this.circuitBreaker = circuitBreakerRegistry.circuitBreaker("productService");
    }

    public Product getProductById(Long productId) {
        // 1. Apply the rate limiter, then 2. the circuit breaker, around the actual lookup
        return rateLimiter.executeSupplier(() ->
                circuitBreaker.executeSupplier(() -> doGetProduct(productId)));
    }

    private Product doGetProduct(Long productId) {
        String cacheKey = "product:" + productId;

        // Query the cache
        String productJson = redisTemplate.opsForValue().get(cacheKey);
        if (productJson != null) {
            return JSON.parseObject(productJson, Product.class);
        }

        // Query the database
        Product product = productMapper.selectById(productId);

        // Update the cache (TTL units not shown in the original; 1 hour and 5 minutes assumed)
        if (product != null) {
            redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(product), 1, TimeUnit.HOURS);
        } else {
            // Cache the empty value with a short TTL
            redisTemplate.opsForValue().set(cacheKey, "", 5, TimeUnit.MINUTES);
        }
        return product;
    }

    // Fallback used when the circuit breaker rejects calls
    private Product fallbackMethod(Long productId, Throwable t) {
        log.warn("Circuit breaker triggered for productId: {}", productId, t);
        // Return a default product, or fetch one from a local cache
        return new Product(productId, "Temporary Unavailable", 0.0);
    }
}
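Note that fallbackMethod above is declared but never wired in. One possible way to attach it is Resilience4j's Decorators helper; the sketch below is an illustrative arrangement (not shown in the original code) that reuses the same rateLimiter and circuitBreaker fields and falls back only on rate-limit and open-circuit rejections.

// Hedged sketch: compose rate limiter, circuit breaker and fallback with Resilience4j Decorators
public Product getProductByIdWithFallback(Long productId) {
    Supplier<Product> decorated = Decorators
            .ofSupplier(() -> doGetProduct(productId))
            .withCircuitBreaker(circuitBreaker)   // open circuit -> CallNotPermittedException
            .withRateLimiter(rateLimiter)         // over the limit -> RequestNotPermitted
            .withFallback(Arrays.asList(CallNotPermittedException.class, RequestNotPermitted.class),
                    t -> fallbackMethod(productId, t))
            .decorate();
    return decorated.get();
}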
Pros and cons analysis
Advantages
- Provide system-level protection
- Can effectively deal with burst traffic and malicious attacks
- Ensure system stability and availability
- Dynamic adjustments can be made in combination with the monitoring system
Disadvantages
- May affect the experience of normal users
- Tuning the configuration is not easy
- Fallback (degradation) strategies still have to be designed carefully
- Cannot completely solve the cache penetration problem, but only mitigate its impact
Strategy 5: Cache warm-up
Principle
Cache warm-up means loading data that is likely to be queried into the cache ahead of time, at system startup or at scheduled points, so that user requests do not hit the database because of cold-cache misses. Against cache penetration, warm-up fills the cache with valid data in advance and so reduces the chance that a request has to query the database directly.
Implementation example
@Component
public class CacheWarmUpTask {

    // Mapper method names and TTL units below are representative; adapt them to your own data layer
    private static final Logger log = LoggerFactory.getLogger(CacheWarmUpTask.class);

    @Autowired
    private ProductMapper productMapper;

    @Autowired
    private CategoryMapper categoryMapper;

    @Autowired
    private StringRedisTemplate redisTemplate;

    @Autowired
    private RedisBloomFilter bloomFilter; // project-specific Bloom filter wrapper

    // Warm up the cache when the application starts
    @PostConstruct
    public void warmUpCacheOnStartup() {
        // Run the warm-up asynchronously so it does not block application startup
        CompletableFuture.runAsync(this::warmUpHotProducts);
    }

    // Refresh the cache of popular products every day at 2 a.m.
    @Scheduled(cron = "0 0 2 * * ?")
    public void scheduledWarmUp() {
        warmUpHotProducts();
    }

    private void warmUpHotProducts() {
        log.info("Starting product cache warm-up...");
        long startTime = System.currentTimeMillis();
        try {
            // 1. Fetch the list of popular products (for example, the top 5000 by sales)
            List<Product> hotProducts = productMapper.selectHotProducts(5000);

            // 2. Update the cache and the Bloom filter
            for (Product product : hotProducts) {
                String cacheKey = "product:" + product.getId();
                redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(product), 6, TimeUnit.HOURS);
                // Update the Bloom filter
                bloomFilter.add("product_filter", product.getId().toString());
            }

            // 3. Also warm up some frequently used aggregate data
            List<Category> categories = categoryMapper.selectAll();
            for (Category category : categories) {
                String cacheKey = "category:" + category.getId();
                List<Long> productIds = productMapper.selectIdsByCategory(category.getId());
                redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(productIds), 12, TimeUnit.HOURS);
            }

            long duration = System.currentTimeMillis() - startTime;
            log.info("Cache warm-up completed, time consumed: {} ms, products preheated: {}",
                    duration, hotProducts.size());
        } catch (Exception e) {
            log.error("Cache warm-up failed", e);
        }
    }
}
Pros and cons analysis
Advantages
- Improve access performance after system startup
- Reduce cache cold start issues
- Can be refreshed regularly to keep data fresh
- Avoid users waiting
Disadvantages
- Cannot cover every possible data access
- Consumes additional system resources
- Does not help for long-tail (unpopular) data
- The warm-up data set must be chosen carefully to avoid wasting resources
Strategy 6: Layered filtering
Principle
The layered filtering strategy combines several anti-penetration measures into a multi-layer defense. By placing filtering conditions at different levels, it blocks cache penetration as early and as cheaply as possible while preserving system performance. A typical chain is: front-end filtering -> API gateway filtering -> application layer filtering -> cache layer filtering -> database protection.
Implementation example
Here is a comprehensive example of multi-layer protection:
// 1. Gateway layer filtering (using Spring Cloud Gateway)
@Configuration
public class GatewayFilterConfig {

    @Bean
    public RouteLocator customRouteLocator(RouteLocatorBuilder builder) {
        return builder.routes()
                .route("product_route", r -> r.path("/api/product/**")
                        // Path format validation
                        .and().predicate(exchange -> {
                            String path = exchange.getRequest().getURI().getPath();
                            // For /api/product/{id} paths, make sure the id is numeric and in range
                            if (path.matches("/api/product/\\d+")) {
                                String id = path.substring(path.lastIndexOf('/') + 1);
                                long productId = Long.parseLong(id);
                                return productId > 0 && productId < 10000000; // sanity range check
                            }
                            return true;
                        })
                        // Rate limiting and circuit breaking filters
                        .filters(f -> f.requestRateLimiter()
                                .rateLimiter(RedisRateLimiter.class,
                                        c -> c.setReplenishRate(10).setBurstCapacity(20))
                                .and()
                                .circuitBreaker(c -> c.setName("productCB")
                                        .setFallbackUri("forward:/fallback")))
                        .uri("lb://product-service"))
                .build();
    }
}

// 2. Application layer filtering (Resilience4j + Bloom filter)
@Service
public class ProductServiceImpl implements ProductService {

    private static final Logger log = LoggerFactory.getLogger(ProductServiceImpl.class);

    private final StringRedisTemplate redisTemplate;
    private final ProductMapper productMapper;
    private final RateLimiter rateLimiter;
    private final CircuitBreaker circuitBreaker;

    // Local Guava Bloom filter used as a second line of defense
    private BloomFilter<String> localBloomFilter;

    @Value("${cache.expire-seconds:3600}") // property name reconstructed; the original was truncated
    private int cacheExpireSeconds;

    // Constructor injection...

    @PostConstruct
    public void initLocalFilter() {
        // Create a local Bloom filter as secondary protection
        localBloomFilter = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8),
                1000000,   // expected number of elements
                0.001      // false-positive rate
        );
        // Initialize the local Bloom filter data (mapper method name representative)
        List<Long> allProductIds = productMapper.selectAllIds();
        for (Long id : allProductIds) {
            localBloomFilter.put(id.toString());
        }
    }

    @Override
    public Product getProductById(Long productId) {
        String productIdStr = productId.toString();

        // 1. Local Bloom filter pre-check
        if (!localBloomFilter.mightContain(productIdStr)) {
            log.debug("Product filtered by local bloom filter: {}", productId);
            return null;
        }

        // 2. Redis Bloom filter second check
        Boolean mayExist = redisTemplate.execute((RedisCallback<Boolean>) connection ->
                (long) connection.execute("BF.EXISTS", "product_filter".getBytes(),
                        productIdStr.getBytes()) != 0);
        if (!Boolean.TRUE.equals(mayExist)) {
            log.debug("Product filtered by Redis bloom filter: {}", productId);
            return null;
        }

        // 3. Apply rate limiting and circuit breaking
        try {
            return rateLimiter.executeSupplier(() ->
                    circuitBreaker.executeSupplier(() -> getProductFromCacheOrDb(productId)));
        } catch (RequestNotPermitted e) {
            log.warn("Request rate limited for product: {}", productId);
            throw new ServiceException("Service is busy, please try again later");
        } catch (CallNotPermittedException e) {
            log.warn("Circuit breaker open for product queries");
            throw new ServiceException("Service is temporarily unavailable");
        }
    }

    private Product getProductFromCacheOrDb(Long productId) {
        String cacheKey = "product:" + productId;

        // 4. Query the cache
        String cachedValue = redisTemplate.opsForValue().get(cacheKey);
        if (cachedValue != null) {
            // Handle the cached-empty-value case
            if (cachedValue.isEmpty()) {
                return null;
            }
            return JSON.parseObject(cachedValue, Product.class);
        }

        // 5. Query the database (with database-level protection)
        Product product;
        try {
            product = productMapper.selectById(productId);
        } catch (Exception e) {
            log.error("Database error when querying product: {}", productId, e);
            throw new ServiceException("System error, please try again later");
        }

        // 6. Update the cache (empty values are cached too)
        if (product != null) {
            redisTemplate.opsForValue().set(cacheKey, JSON.toJSONString(product),
                    cacheExpireSeconds, TimeUnit.SECONDS);
            // Make sure both Bloom filters contain this ID
            redisTemplate.execute((RedisCallback<Boolean>) connection ->
                    (long) connection.execute("BF.ADD", "product_filter".getBytes(),
                            productId.toString().getBytes()) != 0);
            localBloomFilter.put(productId.toString());
        } else {
            // Cache the empty value with a short expiration
            redisTemplate.opsForValue().set(cacheKey, "", 60, TimeUnit.SECONDS);
        }
        return product;
    }
}
Pros and cons analysis
Advantages
- Provides all-round protection for the system
- The layers complement one another and form a complete line of defense
- Policies at each level can be configured flexibly
- Minimize resource waste and performance loss
Disadvantages
- High complexity
- The configuration of the individual layers must be kept consistent
- May increase system response time
- Relatively high maintenance costs
Summary
Preventing cache penetration is not just a coding problem; it is also an important part of system design and operations.
In practice, the right combination of strategies should be chosen based on the specific business scenario and the scale of the system. A single strategy is rarely enough on its own, while a combination provides much more complete protection. Regular monitoring and performance evaluation are also necessary to keep the cache layer running efficiently.