Introduction
In the previous article, we built a reusable caching library that provides:
- L1 Cache using EhCache
- L2 Cache using Redis or GemFire
- Strategy Pattern based provider selection
- Kafka and Spring Event based invalidation
- Auto configuration
- Metrics and monitoring
While this solution works well for most workloads, large-scale production systems face additional challenges:
- Multiple pods rebuilding the same cache simultaneously
- Cache stampede during peak traffic
- Cold cache after deployments
- Expensive database queries triggered repeatedly
- Traffic spikes after cache expiration
To solve these problems, we will enhance our caching library with:
- Distributed Locking
- Cache Warming
- Cache Stampede Protection
- Double Check Locking
- Randomized TTL
- Background Refresh
- Cache Preloading
The goal remains unchanged:
Microservices should only add a dependency and configure properties.
The library handles the complexity.
Current Architecture
            Client
              |
          Controller
              |
           Service
              |
    Platform Cache Library
              |
   -------------------------
   |                       |
L1 Cache               L2 Cache
EhCache              Redis/GemFire
              |
           Database
New Enhanced Architecture
                Client
                  |
              Controller
                  |
               Service
                  |
        Platform Cache Library
                  |
   ------------------------------------
   |              |                   |
L1 Cache       L2 Cache          Lock Manager
EhCache      Redis/GemFire            |
                                Distributed Lock
                                      |
                                Redis/GemFire
Additional Components:
- Cache Warming Engine
- Stampede Protection
- Background Refresh Engine
- Distributed Lock Service
Problem 1: Cache Stampede
Assume the key product:1001 expires at 10:00:00 AM, and 5000 requests for it arrive at 10:00:01 AM.
Without protection:
5000 Requests
|
Cache Miss
|
5000 DB Calls
Result:
- Database overload
- Thread pool exhaustion
- Increased latency
Solution: Distributed Locking
Only one pod should rebuild a given cache entry.
All other requests for that entry wait until the rebuilt value is available in the cache.
Lock Provider Interface
Create a new package:
com.company.cache.lock
DistributedLockProvider
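public interface DistributedLockProvider {

    // Tries to take the lock without blocking. timeoutSeconds is the lease:
    // the lock is meant to expire on its own if it is never released.
    boolean acquireLock(String lockKey, long timeoutSeconds);

    // Releases the lock taken by acquireLock.
    void releaseLock(String lockKey);
}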
Redis Lock Implementation
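import java.time.Duration;
import java.util.Collections;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;

@Component
@ConditionalOnProperty(name = "cache.l2.type", havingValue = "redis")
public class RedisLockProvider implements DistributedLockProvider {

    // Deletes the key only if it still holds this owner's token, so a lock that
    // expired and was re-acquired by another pod is not deleted by mistake.
    private static final DefaultRedisScript<Long> RELEASE = new DefaultRedisScript<>(
        "if redis.call('get', KEYS[1]) == ARGV[1] then "
            + "return redis.call('del', KEYS[1]) else return 0 end", Long.class);

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    // Tokens of the locks currently held by this instance.
    private final ConcurrentHashMap<String, String> tokens = new ConcurrentHashMap<>();

    @Override
    public boolean acquireLock(String lockKey, long timeout) {
        String token = UUID.randomUUID().toString();
        // Atomic set-if-absent with a TTL; the TTL frees the lock if the owner crashes.
        Boolean success = redisTemplate.opsForValue()
            .setIfAbsent(lockKey, token, Duration.ofSeconds(timeout));
        if (Boolean.TRUE.equals(success)) {
            tokens.put(lockKey, token);
            return true;
        }
        return false;
    }

    @Override
    public void releaseLock(String lockKey) {
        String token = tokens.remove(lockKey);
        if (token != null) {
            redisTemplate.execute(RELEASE, Collections.singletonList(lockKey), token);
        }
    }
}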
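Why the lock value and the release step matter: a lock with a TTL can expire while its owner is still loading from the database, and a second pod can then acquire it. If the first pod later deletes the key unconditionally, it removes the second pod's lock. The following sketch models that sequence in memory with a manual clock. FakeLockStore and all of its methods are hypothetical names used only for illustration; they are not Redis or Spring APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a key space with per-key expiry (illustration only). */
class FakeLockStore {
    private final Map<String, String> owners = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long now = 0; // manual clock, in seconds

    /** Stores the token only if the key is free, with a time-to-live. */
    boolean setIfAbsent(String key, String token, long ttlSeconds) {
        expireIfDue(key);
        if (owners.containsKey(key)) {
            return false;
        }
        owners.put(key, token);
        expiresAt.put(key, now + ttlSeconds);
        return true;
    }

    /** Unconditional delete: removes the lock no matter who holds it. */
    void delete(String key) {
        owners.remove(key);
        expiresAt.remove(key);
    }

    /** Compare-and-delete: removes the lock only if the caller still owns it. */
    boolean deleteIfOwner(String key, String token) {
        expireIfDue(key);
        if (token.equals(owners.get(key))) {
            delete(key);
            return true;
        }
        return false;
    }

    /** Returns the current owner token, or null if the key is free or expired. */
    String owner(String key) {
        expireIfDue(key);
        return owners.get(key);
    }

    private void expireIfDue(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= now) {
            delete(key);
        }
    }
}
```

In this model, pod A takes the lock with a 10-second lease, the clock moves to second 11, and pod B takes the same lock. A compare-and-delete by pod A is then a no-op, while an unconditional delete would leave pod B working without a lock.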
GemFire Lock Implementation
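import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.stereotype.Component;

@Component
@ConditionalOnProperty(name = "cache.l2.type", havingValue = "gemfire")
public class GemFireLockProvider implements DistributedLockProvider {

    // Caveat: these are in-JVM locks. They serialize rebuilds within one pod only;
    // coordinating across pods needs a cluster-wide lock, for example GemFire's
    // DistributedLockService.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    @Override
    public boolean acquireLock(String key, long timeout) {
        // Non-blocking attempt. A ReentrantLock has no lease, so timeout is not enforced here.
        return locks.computeIfAbsent(key, k -> new ReentrantLock()).tryLock();
    }

    @Override
    public void releaseLock(String key) {
        ReentrantLock lock = locks.get(key);
        // Guard against unknown keys and against unlocking from a non-owner thread.
        if (lock != null && lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }
}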
Enhanced Cache Manager
Current implementation:
L1
↓
L2
↓
DB
Updated flow:
L1
↓
L2
↓
Acquire Distributed Lock
↓
Check Cache Again
↓
Load DB
↓
Update Cache
↓
Release Lock
Double Check Locking
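@SuppressWarnings("unchecked")
public <T> T get(String key, Supplier<T> supplier, long ttl) {
    // 1. L1 lookup
    Object local = localCache.get(key);
    if (local != null) {
        return (T) local;
    }
    // 2. L2 lookup
    Object distributed = provider.get(key);
    if (distributed != null) {
        localCache.put(key, distributed);
        return (T) distributed;
    }
    // 3. Miss in both tiers: only the lock owner rebuilds the entry
    String lockKey = "lock:" + key;
    boolean acquired = lockProvider.acquireLock(lockKey, 10);
    if (acquired) {
        try {
            // Second check: another pod may have populated L2 in the meantime
            distributed = provider.get(key);
            if (distributed != null) {
                localCache.put(key, distributed);
                return (T) distributed;
            }
            T value = supplier.get();
            if (value != null) {
                provider.put(key, value, ttl);
                localCache.put(key, value);
            }
            return value;
        } finally {
            lockProvider.releaseLock(lockKey);
        }
    }
    // 4. Lock held elsewhere: wait for the owner, then fall back to a direct
    //    load so that callers do not get null for data that exists
    T waited = waitForCache(key);
    return waited != null ? waited : supplier.get();
}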
Waiting Requests Strategy
Requests that do not acquire the lock poll L2 until the lock owner publishes the value.
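@SuppressWarnings("unchecked")
private <T> T waitForCache(String key) {
    int retries = 10; // 10 x 100 ms = at most about 1 second
    while (retries-- > 0) {
        Object value = provider.get(key);
        if (value != null) {
            return (T) value;
        }
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and stop waiting
            Thread.currentThread().interrupt();
            break;
        }
    }
    return null; // the owner did not publish the value in time
}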
This reduces:
5000 DB Calls
↓
1 DB Call
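To make the reduction concrete, here is a minimal single-JVM model of the flow: check, lock, check again, load, publish, with everyone else polling. Plain maps stand in for L2 and for the lock store, and StampedeDemo, simulate, and the 50 ms loader delay are hypothetical choices for illustration, not part of the library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Single-JVM model of the double-checked flow (illustration only). */
class StampedeDemo {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final Map<String, Boolean> locks = new ConcurrentHashMap<>();

    Object get(String key, Supplier<Object> loader) throws InterruptedException {
        Object value = cache.get(key);
        if (value != null) {
            return value;                                   // first check
        }
        String lockKey = "lock:" + key;
        if (locks.putIfAbsent(lockKey, Boolean.TRUE) == null) { // lock acquired
            try {
                value = cache.get(key);                     // second check, under the lock
                if (value == null) {
                    value = loader.get();
                    cache.put(key, value);
                }
                return value;
            } finally {
                locks.remove(lockKey);
            }
        }
        // Lock held by another request: poll until the owner publishes the value.
        for (int i = 0; i < 200; i++) {
            value = cache.get(key);
            if (value != null) {
                return value;
            }
            Thread.sleep(10);
        }
        throw new IllegalStateException("value was not published in time");
    }

    /** Fires concurrent reads of one missing key; returns how many reached the loader. */
    static int simulate(int requests) {
        StampedeDemo demo = new StampedeDemo();
        AtomicInteger dbCalls = new AtomicInteger();
        Supplier<Object> loader = () -> {
            dbCalls.incrementAndGet();
            try {
                Thread.sleep(50);                           // pretend the query is slow
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "product-1001";
        };
        ExecutorService pool = Executors.newFixedThreadPool(16);
        CountDownLatch done = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                try {
                    demo.get("product:1001", loader);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return dbCalls.get();
    }
}
```

A request that takes the lock after the owner has already released it finds the value on the second check, which is why the loader runs once regardless of how the threads interleave.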
Problem 2: Simultaneous Expiry
Assume 100000 keys are written at about the same time with the same TTL of 300 seconds.
They all expire together, and the resulting misses reach the database as a single traffic spike.
Randomized TTL
Add jitter.
Utility Class
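import java.util.concurrent.ThreadLocalRandom;

public final class TtlUtil {

    private TtlUtil() {}

    // Returns ttl plus a random jitter in [0, ttl * jitterPercent / 100).
    public static long calculateTtl(long ttl, int jitterPercent) {
        long maxJitter = ttl * jitterPercent / 100;
        if (maxJitter <= 0) {
            return ttl; // nextLong(bound) rejects a non-positive bound
        }
        return ttl + ThreadLocalRandom.current().nextLong(maxJitter);
    }
}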
Update Cache Write
long finalTtl = TtlUtil.calculateTtl(ttl, 20); // 20% jitter
provider.put(key, value, finalTtl);
Example:
Base TTL = 300, jitter = 20%
Actual TTL (always between 300 and 359, because the jitter is only added):
312
345
356
303
324
No mass expiration.
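A quick way to see the effect is to count how many keys expire in the busiest second, with and without jitter. The sketch below is a hypothetical simulation (ExpirySpread and its method are illustrative names); it assumes all keys are written at time zero and uses a fixed seed so that runs are repeatable.

```java
import java.util.Random;

/** Counts how many keys expire in the busiest second (illustration only). */
class ExpirySpread {
    static int peakExpirationsPerSecond(int keys, long baseTtl, int jitterPercent, long seed) {
        Random random = new Random(seed);
        long maxJitter = baseTtl * jitterPercent / 100;
        // One bucket per possible expiry second, from 0 up to baseTtl + maxJitter.
        int[] perSecond = new int[(int) (baseTtl + maxJitter) + 1];
        for (int i = 0; i < keys; i++) {
            // Additive jitter in [0, maxJitter), as in ttl + jitter.
            long jitter = maxJitter > 0 ? (long) (random.nextDouble() * maxJitter) : 0;
            perSecond[(int) (baseTtl + jitter)]++;
        }
        int peak = 0;
        for (int count : perSecond) {
            peak = Math.max(peak, count);
        }
        return peak;
    }
}
```

With 100000 keys, a base TTL of 300 and no jitter, every key lands in the same second. With 20% jitter the expirations spread over 60 distinct seconds, so the busiest second sees roughly one sixtieth of the keys.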
Problem 3: Cold Cache After Deployment
Deployment occurs.
All pods restart.
Cache becomes empty.
Every request hits DB.
Cache Warming
Preload cache during startup.
New Package
com.company.cache.warming
CacheWarmer Interface
public interface CacheWarmer {
void warm();
}
Product Cache Warmer Example
@Component
public class ProductCacheWarmer implements CacheWarmer {

    @Autowired
    private ProductRepository repository;

    @Autowired
    private MultiLevelCacheManager cache;

    @Override
    public void warm() {
        // Preload the top 100 products with a one-hour TTL
        repository.findTop100Products()
            .forEach(product ->
                cache.put("product:" + product.getId(), product, 3600));
    }
}
Startup Listener
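@Component
@ConditionalOnProperty(name = "cache.warming.enabled", havingValue = "true")
public class CacheWarmupRunner implements ApplicationRunner {

    private static final Logger log = LoggerFactory.getLogger(CacheWarmupRunner.class);

    @Autowired
    private List<CacheWarmer> warmers;

    @Override
    public void run(ApplicationArguments args) {
        for (CacheWarmer warmer : warmers) {
            try {
                warmer.warm();
            } catch (RuntimeException e) {
                // A failed warmer should not block startup; the cache fills lazily instead.
                log.warn("Cache warmer {} failed", warmer.getClass().getSimpleName(), e);
            }
        }
    }
}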
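One thing to watch with a startup runner: after a deployment every pod starts at about the same time, so every pod would run the same warm-up queries. A possible way to avoid that, sketched here under assumptions, is to reuse the distributed lock so that only the first pod warms the shared cache. The interface is redeclared so the sketch compiles on its own; InMemoryLockProvider, SingleWarmup, and the lock key name are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Same shape as the lock interface in this article, redeclared for a standalone sketch. */
interface DistributedLockProvider {
    boolean acquireLock(String lockKey, long timeoutSeconds);
    void releaseLock(String lockKey);
}

/** In-memory stand-in for a shared lock store; it ignores the lease (illustration only). */
class InMemoryLockProvider implements DistributedLockProvider {
    private final Map<String, Boolean> held = new ConcurrentHashMap<>();

    @Override
    public boolean acquireLock(String lockKey, long timeoutSeconds) {
        return held.putIfAbsent(lockKey, Boolean.TRUE) == null;
    }

    @Override
    public void releaseLock(String lockKey) {
        held.remove(lockKey);
    }
}

/** Runs a warm-up task on whichever pod wins the lock; the other pods skip it. */
class SingleWarmup {
    static final String WARMUP_LOCK = "lock:warmup"; // hypothetical key name

    static boolean warmIfLeader(DistributedLockProvider locks, Runnable warmer, long leaseSeconds) {
        if (!locks.acquireLock(WARMUP_LOCK, leaseSeconds)) {
            return false; // another pod is warming, or warmed within the lease window
        }
        try {
            warmer.run();
            // Deliberately not released on success: with a real lock store the lease
            // expiring re-enables warming, so pods that start a moment later skip it.
            return true;
        } catch (RuntimeException e) {
            locks.releaseLock(WARMUP_LOCK); // let another pod try after a failure
            throw e;
        }
    }
}
```

This only helps for data warmed into the shared L2 tier; each pod's L1 still fills on first access. The lease would need to be long enough to cover a rolling deployment.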
Microservice Configuration
cache.warming.enabled=true
Selective Warming
Warm only critical data.
Examples:
- Products
- Reference Data
- Country Codes
- Configuration
- Feature Flags
Avoid warming:
- Orders
- Invoices
- Transactions
Background Refresh
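Instead of waiting for entries to expire, refresh hot keys proactively.
The refresh interval should be shorter than the TTL; otherwise an entry can expire before it is refreshed.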
Refresh Configuration
cache.refresh.enabled=true
cache.refresh.interval=300
Refresh Scheduler
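// fixedDelayString is read as milliseconds by default. The timeUnit attribute
// (Spring Framework 5.3.10 and later) makes cache.refresh.interval=300 mean 300 seconds.
@Scheduled(
    fixedDelayString = "${cache.refresh.interval}",
    timeUnit = TimeUnit.SECONDS)
public void refresh() {
    hotKeys.forEach(key -> {
        Object value = refreshProvider.load(key);
        // Keep the existing entry if the source returns nothing
        if (value != null) {
            cache.put(key, value, 300);
        }
    });
}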
Tracking Hot Keys
Store frequently used keys.
ConcurrentHashMap<
String,
AtomicLong>
hitCounter;
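As a minimal sketch of how such a counter map could feed the refresh job, the class below records a hit per key and returns the most frequently read keys. HotKeyTracker, recordHit, and topKeys are hypothetical names, not part of the library described here.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Counts cache hits per key and reports the most frequently read keys (illustration only). */
class HotKeyTracker {
    private final ConcurrentHashMap<String, AtomicLong> hitCounter = new ConcurrentHashMap<>();

    /** Call on every cache read; safe under concurrent access. */
    void recordHit(String key) {
        hitCounter.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    /** Returns up to limit keys, ordered by hit count, highest first. */
    List<String> topKeys(int limit) {
        // Copy the counts first so that sorting works on stable values
        // even while other threads keep recording hits.
        List<Map.Entry<String, Long>> snapshot = new ArrayList<>();
        hitCounter.forEach((key, hits) ->
            snapshot.add(new AbstractMap.SimpleImmutableEntry<String, Long>(key, hits.get())));
        snapshot.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, snapshot.size()); i++) {
            top.add(snapshot.get(i).getKey());
        }
        return top;
    }
}
```

The map in this sketch grows with every distinct key, so a real tracker would likely need a size bound or a periodic reset.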
Auto Configuration Updates
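@ConfigurationProperties(prefix = "cache")
public class CacheProperties {

    // Nested objects so that keys such as cache.warming.enabled bind correctly.
    private final Warming warming = new Warming();
    private final Refresh refresh = new Refresh();
    private final Lock lock = new Lock();
    private final Ttl ttl = new Ttl();

    public static class Warming { private boolean enabled; }
    public static class Refresh { private boolean enabled; private int interval; }
    public static class Lock { private boolean enabled; private int timeout; }
    public static class Ttl { private int jitter; }

    // Getters and setters are omitted for brevity. JavaBean binding needs a
    // getter for each nested object and a setter for each leaf property.
}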
Updated Properties
Redis
cache.enabled=true
cache.l1.enabled=true
cache.l2.type=redis
cache.lock.enabled=true
cache.lock.timeout=10
cache.ttl.jitter=20
cache.warming.enabled=true
cache.refresh.enabled=true
cache.refresh.interval=300
cache.invalidation.type=kafka
GemFire
cache.enabled=true
cache.l2.type=gemfire
cache.lock.enabled=true
cache.lock.timeout=10
cache.warming.enabled=true
cache.refresh.enabled=true
New Metrics
Add:
cache.lock.acquired
cache.lock.failed
cache.warming.count
cache.refresh.count
cache.stampede.prevented
cache.wait.count
Micrometer:
Counter.builder(
"cache.stampede.prevented")
.register(registry)
.increment();
Final Production Architecture
              Client
                |
          Load Balancer
                |
     -------------------------
     |          |            |
   Pod1       Pod2         Pod3
     |          |            |
  EhCache    EhCache      EhCache
      \         |          /
       \        |         /
          Redis/GemFire
                |
        Distributed Lock
                |
            Database
Features Provided by Library:
✓ L1 Cache (EhCache)
✓ L2 Cache (Redis/GemFire)
✓ Distributed Locking
✓ Cache Stampede Protection
✓ Double Check Locking
✓ Randomized TTL
✓ Cache Warming
✓ Background Refresh
✓ Kafka Invalidation
✓ Spring Event Invalidation
✓ Metrics & Monitoring
✓ Auto Configuration
What Changes in Microservices?
Almost nothing.
Add dependency:
<dependency>
    <groupId>com.company.platform</groupId>
    <artifactId>platform-cache-lib</artifactId>
</dependency>
Configure:
cache.l2.type=redis
cache.warming.enabled=true
cache.lock.enabled=true
Optional:
@Component
public class ProductWarmer implements CacheWarmer {

    @Override
    public void warm() {
        // preload important data
    }
}
The library handles everything else.
Conclusion
With these enhancements, the caching framework evolves from a simple L1/L2 cache abstraction into a full-fledged enterprise caching platform. It now protects downstream systems from cache stampedes, automatically warms critical data after deployments, coordinates cache rebuilds across pods using distributed locks, and proactively refreshes hot data before expiration.
The result is a production-ready caching platform that can be adopted consistently across dozens of Spring Boot microservices with minimal code changes and centralized operational control.