Building Scalable and Consistent Microservices Caching Architectures
Caching is one of the most effective techniques for improving application performance. Modern microservices often implement a multi-level cache architecture:
- L1 Cache: In-memory cache inside the application (EhCache, Caffeine)
- L2 Cache: Distributed cache such as Redis or GemFire
- Database: The system of record
While implementing L1 caching, many teams adopt a seemingly simple approach:
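@Scheduled(fixedDelay = 300000) // 300,000 ms = 5 minutes after the previous run finishes
public void refreshCache() {
    cache.clear();                       // cache is empty until the reload below completes
    cache.putAll(repository.findAll());  // reloads every record, whether or not it is used
}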
At first glance, this appears straightforward. Every few minutes the application reloads the cache from the database. Unfortunately, this approach creates several architectural problems in distributed systems.
Understanding L1 Cache
L1 cache resides inside each application instance.
Pod-1 → Local Cache
Pod-2 → Local Cache
Pod-3 → Local Cache
Because each pod maintains its own cache, keeping them synchronized becomes challenging.
Problem 1: Stale Data
Consider a scheduler running every five minutes.
10:00 AM → Cache refreshed
10:02 AM → Database updated
10:04 AM → Application still serves old data
10:05 AM → Next refresh occurs
For three minutes, users receive outdated information. In the worst case, an update that lands just after a refresh stays invisible for almost the full five-minute interval.
This becomes dangerous for:
- Customer profiles
- Order status
- Account balances
- Holiday calendars
- Pricing information
The larger the refresh interval, the longer the stale data window.
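The relationship between the refresh interval and the stale window can be sketched with a few lines of arithmetic. This is a minimal illustration, not production code: the class and method names are hypothetical, and it assumes the refresh itself takes no time and runs on a fixed grid.

```java
import java.time.Duration;

// Hypothetical helper: how long does an update stay invisible to the cache?
final class StalenessWindow {

    // refreshInterval: time between scheduled refreshes (assumed fixed)
    // sinceLastRefresh: how long after the last refresh the database was updated
    static Duration staleFor(Duration refreshInterval, Duration sinceLastRefresh) {
        long intervalMs = refreshInterval.toMillis();
        long elapsedMs = sinceLastRefresh.toMillis() % intervalMs;
        // The update becomes visible only at the next scheduled refresh.
        return Duration.ofMillis(intervalMs - elapsedMs);
    }
}
```

With a 5-minute interval and an update 2 minutes after the last refresh, `staleFor` returns 3 minutes. With a 30-minute interval, the same update stays stale for 28 minutes.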
Problem 2: Database Traffic Spikes
Assume:
- 20 application pods
- 100,000 cached records
- Refresh interval of 5 minutes
Every pod executes:
SELECT * FROM CUSTOMER_MASTER;
This results in:
20 pods × 100,000 records = 2,000,000 rows read every 5 minutes (24,000,000 rows per hour)
Instead of reducing database traffic, the cache creates periodic traffic spikes.
Database Load
    /\        /\        /\
   /  \      /  \      /  \
--/----\----/----\----/----\----
These spikes can negatively impact business transactions.
Problem 3: Poor Horizontal Scalability
A properly designed distributed system should scale efficiently.
With scheduler-based refresh:
| Number of Pods | Refresh Jobs |
| --- | --- |
| 2 | 2 |
| 10 | 10 |
| 50 | 50 |
As the system scales, refresh traffic increases linearly.
Refresh Load = Number of Pods × Cache Size
Your infrastructure grows, but so does unnecessary work.
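To make the formula concrete, here is a small sketch that computes the refresh load for a given deployment. The class and method names are hypothetical, and the numbers in the usage note reuse the earlier assumptions (100,000 cached records, a 5-minute interval).

```java
// Hypothetical helper for the formula: Refresh Load = Number of Pods × Cache Size
final class RefreshLoad {

    // Rows read from the database each time every pod refreshes once.
    static long rowsPerCycle(int pods, long cachedRecords) {
        return pods * cachedRecords;
    }

    // Rows read per hour; assumes the interval divides evenly into 60 minutes.
    static long rowsPerHour(int pods, long cachedRecords, int refreshIntervalMinutes) {
        long cyclesPerHour = 60 / refreshIntervalMinutes;
        return rowsPerCycle(pods, cachedRecords) * cyclesPerHour;
    }
}
```

For 2, 10, and 50 pods, `rowsPerCycle` gives 200,000, 1,000,000, and 5,000,000 rows respectively. At 20 pods, `rowsPerHour(20, 100_000, 5)` comes to 24,000,000 rows, regardless of how many of those rows changed.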
Problem 4: Refreshing Unused Data
Suppose:
- Cache contains 100,000 records.
- Only 5,000 records are frequently accessed.
The scheduler still reloads all 100,000 records.
This means:
- CPU waste
- Memory waste
- Network overhead
- Database overhead
In this example, 95% of the refresh work reloads records that are rarely read and may provide no business value.
Problem 5: Cache Stampede
When all pods refresh simultaneously:
Pod-1 ----\
Pod-2 -----\
Pod-3 ------> Database/Redis
Pod-4 -----/
Pod-5 ----/
Suddenly:
- Database CPU increases.
- Redis experiences heavy load.
- Network traffic spikes.
This synchronized burst is a form of cache stampede, also known as the thundering herd problem.
Problem 6: Slow Application Startup
Many applications preload caches during startup.
@PostConstruct
public void loadCache() {
    refreshCache();
}
During:
- Deployments
- Auto-scaling events
- Pod restarts
Every new pod loads the entire dataset.
Ten pods starting simultaneously can generate substantial load on the backend systems.
Problem 7: Inconsistent Data Across Pods
Consider three pods:
Pod-A refreshed at 10:00
Pod-B refreshed at 10:02
Pod-C refreshed at 10:04
The load balancer sends requests randomly.
        Load Balancer
              |
    ---------------------
    |         |         |
  Pod-A     Pod-B     Pod-C
   V1        V2        V3
Different users may receive different versions of the same data.
This inconsistency becomes difficult to troubleshoot.
Better Approach 1: Cache Aside Pattern
Request
|
L1 Cache
|
Miss
|
Redis
|
Miss
|
Database
Only requested data enters the cache.
Benefits:
- Smaller cache size
- Better memory usage
- Reduced database load
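A minimal sketch of the cache-aside idea, using only the JDK. The `CacheAside` class and its `loader` function are hypothetical names for illustration; in a real service the loader would be the call to the next level (Redis or the database), and a library such as Caffeine would normally replace the plain map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache-aside wrapper: entries are loaded only when requested.
final class CacheAside<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;               // stand-in for Redis/database lookup
    private final AtomicInteger loads = new AtomicInteger();

    CacheAside(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // On a miss, load the single requested entry and keep it for later reads.
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    void invalidate(K key) {
        cache.remove(key);
    }

    int size() {
        return cache.size();
    }

    int loadCount() {
        return loads.get();
    }
}
```

Reading key `"a"` twice and key `"b"` once triggers two loads and leaves two entries in the cache. Nothing is loaded for keys that nobody asks for, which is the main difference from a full scheduled reload.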
Better Approach 2: Event-Driven Cache Invalidation
Whenever data changes:
Update Database
|
Publish Event
|
Kafka/SNS/Solace
|
Invalidate L1 Caches
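Every application instance removes the stale entry as soon as it receives the event, so staleness is bounded by event-delivery latency rather than by a refresh interval.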
The next request automatically loads fresh information.
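The flow above can be sketched in plain Java. Everything here is a simplified assumption: `InvalidationTopic` is a hypothetical in-process stand-in for a real broker topic (Kafka, SNS, or Solace), `PodCache` represents one application instance, and a `Map` plays the role of the database.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process stand-in for a broker topic.
final class InvalidationTopic {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String key) {
        subscribers.forEach(s -> s.accept(key));
    }
}

// One application instance with its own L1 cache.
final class PodCache {
    private final Map<String, String> l1 = new ConcurrentHashMap<>();
    private final Map<String, String> database;

    PodCache(Map<String, String> database, InvalidationTopic topic) {
        this.database = database;
        // Drop the local copy whenever a change event arrives.
        topic.subscribe(key -> l1.remove(key));
    }

    String get(String key) {
        // A miss (including one caused by invalidation) reloads from the database.
        return l1.computeIfAbsent(key, k -> database.get(k));
    }
}

// Write path: update the system of record first, then publish the event.
final class DataWriter {
    static void update(Map<String, String> database, InvalidationTopic topic,
                       String key, String value) {
        database.put(key, value);
        topic.publish(key);
    }
}
```

In this sketch, two pods that have both cached `order-1` as `PENDING` return `SHIPPED` on their next read after `DataWriter.update` runs. A real broker adds delivery latency and the possibility of lost messages, which this in-process version does not model.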
Better Approach 3: L1 + L2 Architecture
Application
|
L1 Cache
|
Redis
|
Database
Request flow:
- Check L1.
- On miss, check Redis.
- On Redis miss, query database.
- Store results in both caches.
This architecture provides:
- Low-latency reads: L1 hits avoid a network round trip entirely.
- Reduced database load.
- High scalability.
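The request flow can be sketched as a two-level lookup. This is an illustration under stated assumptions: `TwoLevelCache` is a hypothetical class, a shared `Map` stands in for Redis, and a `Function` stands in for the database query. A real implementation would use a Redis client and add expiry to both levels.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical L1 + L2 lookup for one application instance.
final class TwoLevelCache {
    private final Map<String, String> l1 = new ConcurrentHashMap<>(); // per-pod, in-memory
    private final Map<String, String> l2;                             // shared stand-in for Redis
    private final Function<String, String> database;                  // stand-in for a DB query

    TwoLevelCache(Map<String, String> l2, Function<String, String> database) {
        this.l2 = l2;
        this.database = database;
    }

    String get(String key) {
        String value = l1.get(key);          // 1. check L1
        if (value != null) {
            return value;
        }
        value = l2.get(key);                 // 2. on miss, check L2
        if (value == null) {
            value = database.apply(key);     // 3. on L2 miss, query the database
            if (value == null) {
                return null;                 // unknown key: nothing to cache
            }
            l2.put(key, value);              // 4a. store in L2 for other pods
        }
        l1.put(key, value);                  // 4b. store in L1 for this pod
        return value;
    }
}
```

When two pods share the same L2 map, the first read of a key hits the database once. The second pod then finds the value in L2, so the database is not queried again.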
Better Approach 4: Time-To-Live (TTL)
Instead of refreshing everything:
@PlatformCacheable(ttl = 300)
Each entry expires on its own schedule, and only entries that are requested again after expiring are reloaded.
Advantages:
- Smaller refresh workload.
- Reduced backend traffic.
- Better cache efficiency.
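To show what per-entry expiry does under the hood, here is a minimal TTL cache in plain Java. It is a sketch, not the implementation behind the annotation above: `TtlCache` is a hypothetical class, the clock is injected so the behaviour can be tested, and the usage note assumes the `ttl = 300` value is in seconds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical per-entry TTL cache: entries are reloaded only when read after expiry.
final class TtlCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;       // e.g. System::currentTimeMillis
    private final Function<K, V> loader;    // stand-in for Redis/database lookup

    TtlCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) -> {
            if (old != null && old.expiresAtMillis > now) {
                return old;                                   // still fresh: keep it
            }
            return new Entry<V>(loader.apply(k), now + ttlMillis); // missing or expired: reload
        });
        return entry.value;
    }
}
```

With a TTL of 300,000 ms, a key loaded at time 0 is served from the cache until just before the 300-second mark and reloaded on the first read at or after it. Keys that are never read again are simply never reloaded.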
When Scheduler Refresh Can Work
Schedulers are acceptable for relatively static reference data.
Examples:
✅ Country codes
✅ Currency list
✅ Product categories
✅ Configuration data
✅ State master data
Schedulers are usually unsuitable for:
❌ Customer profiles
❌ Order status
❌ Account balances
❌ Inventory quantities
❌ User preferences
Recommended Architecture
            Database
               |
          Update Event
               |
    -----------------------
    |          |          |
  Pod-1      Pod-2      Pod-3
    |          |          |
 L1 Cache   L1 Cache   L1 Cache
     \         |         /
             Redis
Components:
- L1: Caffeine or EhCache
- L2: Redis
- Event-driven invalidation
- TTL as a safety net for entries whose invalidation event is missed
Final Thoughts
Using schedulers to refresh L1 caches appears simple but introduces:
- Stale data
- Database spikes
- Cache stampedes
- Scalability problems
- Inconsistent user experiences
Modern microservice architectures favor:
- Cache-aside patterns
- Distributed L2 caches
- Event-driven invalidation
- Time-based expiration
The primary objective of caching is not merely improving speed. It is delivering fast, scalable, and consistent data access while minimizing load on backend systems.
A scheduler-based L1 refresh often achieves the opposite.