What really happens when 10,000 users hit your Spring Boot API simultaneously? Does Spring Boot create 10,000 threads? Does Tomcat queue requests? And can Java Virtual Threads genuinely change backend scalability?
If you’ve worked on Spring Boot systems in production, you’ve likely tuned thread pools, investigated latency spikes, or stared at CPU graphs wondering why throughput collapsed under load.
Understanding concurrency is no longer optional for backend engineers.
With Java 21 introducing Virtual Threads, the concurrency conversation has become even more interesting.
In this article, we’ll walk through:
- Spring Boot request lifecycle
- Tomcat worker threads
- Platform Threads vs Virtual Threads
- Concurrent execution behavior
- Race conditions and synchronization fixes
- Async execution patterns
- Load testing concurrency
- Production best practices
This write-up expands on Spring Boot concurrency concepts including thread handling, worker pools, and virtual threading models.
1. Understanding Spring Boot Request Lifecycle
Before discussing concurrency, we need to understand how a request flows through Spring Boot.
When an HTTP request arrives, Spring Boot does not magically process it by itself.
There is an embedded web server involved.
By default:
- Spring Boot + MVC → Embedded Tomcat
- Spring Boot + WebFlux (reactive) → Reactor Netty
For traditional MVC applications, Tomcat owns request execution.
Request Flow
Client Request
│
▼
Tomcat Connector
│
▼
Worker Thread Allocation
│
▼
DispatcherServlet
│
▼
Controller
│
▼
Service Layer
│
▼
Database / External API
│
▼
HTTP Response
Here’s what actually happens:
Step 1 — Client Sends Request
A browser, mobile app, or downstream service calls:
GET /orders/123
Step 2 — Tomcat Accepts Connection
Tomcat listens on port 8080 by default.
Incoming connections are accepted by Tomcat connectors.
Step 3 — Worker Thread Assignment
Tomcat picks a thread from its worker pool.
That thread becomes responsible for:
- request parsing
- controller invocation
- business logic execution
- database access
- response writing
One request typically maps to one worker thread.
Step 4 — Spring MVC Processing
Tomcat invokes:
DispatcherServlet
DispatcherServlet routes request execution.
DispatcherServlet
│
├── Handler Mapping
├── Controller
├── Service
└── Response Serialization
Step 5 — Response Returned
Thread finishes work.
Tomcat returns thread to the pool.
Thread reuse begins.
2. Tomcat Worker Threads Deep Dive
Many developers think:
“Spring Boot handles concurrency.”
Technically true.
But Tomcat’s worker pool drives concurrency in Spring MVC systems.
Default Thread Pool Behavior
Tomcat manages:
- worker threads
- request queue
- active connections
Common properties:
server:
  tomcat:
    threads:
      max: 200
      min-spare: 10
    accept-count: 100
What Do These Settings Mean?
max Threads
server.tomcat.threads.max=200
Maximum worker threads available.
If 300 requests arrive:
- 200 execute
- remaining requests wait
min-spare Threads
Minimum idle threads maintained.
Helps reduce thread creation overhead.
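accept-count
Length of the operating system's backlog queue for incoming connections.
Tomcat falls back on it once it has reached its connection limit (server.tomcat.max-connections, 8192 by default).
Simplified example:
200 threads busy
connection limit reached
100 connections waiting in the backlog
new connection arrives
Result:
connection refused or timed out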
Restaurant Analogy
Think of Tomcat like a restaurant.
Customers = HTTP Requests
Waiters = Worker Threads
Kitchen = Business Logic
Queue = accept-count
Scenario:
5 Waiters Available
10 Waiting Spots
20 Customers Arrive
Execution:
5 served immediately
10 waiting
5 rejected
This is essentially how overloaded APIs behave.
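The restaurant scenario can be reproduced with a plain java.util.concurrent.ThreadPoolExecutor. The sketch below is only a model of the analogy (5 workers, a queue of 10, 20 arrivals), not a description of Tomcat's internals; the class name and the simulate helper are made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RestaurantPoolSketch {

    /** Returns {accepted, rejected} for the given pool shape and number of arrivals. */
    static int[] simulate(int waiters, int queueSlots, int customers) throws InterruptedException {
        // Fixed number of workers plus a bounded waiting queue
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                waiters, waiters, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSlots));
        // Every task blocks until the latch opens, so no worker frees up while customers arrive
        CountDownLatch kitchenOpen = new CountDownLatch(1);
        int accepted = 0;
        int rejected = 0;
        for (int i = 0; i < customers; i++) {
            try {
                pool.execute(() -> {
                    try {
                        kitchenOpen.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            } catch (RejectedExecutionException e) {
                // Workers busy and queue full: the default policy rejects the task
                rejected++;
            }
        }
        kitchenOpen.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] {accepted, rejected};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = simulate(5, 10, 20);
        // prints "accepted=15 rejected=5"
        System.out.println("accepted=" + result[0] + " rejected=" + result[1]);
    }
}
```

Of the 15 accepted tasks, 5 start immediately and 10 sit in the queue, which matches the numbers in the analogy.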
3. Concurrent Thread Execution Example (Java 8 Platform Threads)
Let’s see traditional concurrency.
Fixed Thread Pool Example
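ExecutorService executor = Executors.newFixedThreadPool(3);

for (int i = 1; i <= 10; i++) {
    int task = i;
    // The lambda returns a value, so it is a Callable and may throw InterruptedException
    executor.submit(() -> {
        System.out.println(
            Thread.currentThread().getName() + " processing " + task);
        Thread.sleep(2000);
        return null;
    });
}

// Stop accepting new tasks; already submitted tasks still run to completion
executor.shutdown();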
What Happens Internally?
Only 3 platform threads exist.
Execution timeline:
Thread-1 → Task1
Thread-2 → Task2
Thread-3 → Task3
Task4 waits
Task5 waits
Task6 waits
After completion:
Thread-1 reused → Task4
Thread-2 reused → Task5
This is concurrent execution with bounded resources.
Sample Output (the exact interleaving varies between runs)
pool-1-thread-1 processing 1
pool-1-thread-2 processing 2
pool-1-thread-3 processing 3
pool-1-thread-1 processing 4
pool-1-thread-2 processing 5
Notice:
Threads are reused.
Not recreated.
4. The Cost of Platform Threads
Traditional Java threads are OS-backed threads.
Creating them is not free.
Each thread consumes:
- native memory
- stack allocation
- OS scheduling cost
- context switching overhead
Imagine:
50,000 concurrent requests
Creating 50,000 platform threads is usually a terrible idea.
Why?
Memory explosion.
Scheduler pressure.
CPU degradation.
Blocking I/O Problem
Suppose controller logic calls database.
@GetMapping("/users")
public List<User> users() {
    return repository.findAll();
}
During DB call:
Thread = blocked on I/O
The worker thread waits for the database and does no useful work in the meantime.
Yet memory is still consumed.
This becomes expensive at scale.
5. Enter Virtual Threads (Java 21)
Java 21 finalized Virtual Threads (JEP 444), the headline feature of Project Loom, after preview releases in Java 19 and 20.
This changes concurrency economics.
Traditional Model
1 Request
│
▼
1 Platform Thread
│
▼
OS Scheduling
Virtual Thread Model
1 Request
│
▼
1 Virtual Thread
│
▼
JVM Scheduler
│
▼
Small Platform Thread Pool
Virtual Threads are:
- lightweight
- JVM managed
- cheap to create
- optimized for blocking workloads
Massive Concurrency Example
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 1; i <= 100000; i++) {
        int task = i;
        executor.submit(() -> {
            Thread.sleep(1000);
            return task;
        });
    }
}
Traditional platform threads?
Probably impossible or highly unstable.
Virtual threads?
Designed for this model.
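To see this on your own machine, a minimal sketch along the following lines can help. It assumes Java 21 or later; the class name, the runBlockingTasks helper, and the task count and sleep duration are illustrative choices, and the elapsed time depends on the hardware.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Requires Java 21 or later
class VirtualThreadSketch {

    /** Runs the given number of blocking tasks on virtual threads and returns how many completed. */
    static int runBlockingTasks(int tasks, Duration block) throws Exception {
        List<Future<Integer>> futures = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 1; i <= tasks; i++) {
                int task = i;
                futures.add(executor.submit(() -> {
                    Thread.sleep(block); // blocking call; the virtual thread is parked meanwhile
                    return task;
                }));
            }
        } // close() waits until every submitted task has finished

        int completed = 0;
        for (Future<Integer> future : futures) {
            future.get(); // rethrows if a task failed
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        int completed = runBlockingTasks(10_000, Duration.ofMillis(200));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(completed + " tasks finished in " + elapsedMs + " ms");
    }
}
```

If the tasks ran one after another they would need over half an hour (10,000 × 200 ms); running them as virtual threads should bring the total much closer to the duration of a single task.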
6. Platform Threads vs Virtual Threads
| Feature | Platform Threads | Virtual Threads |
|---|---|---|
| Ownership | OS | JVM |
| Creation Cost | High | Low |
| Memory Usage | Higher | Lower |
| Blocking I/O | Expensive | Efficient |
| Large Scale Concurrency | Limited | Excellent |
| Existing Code Compatibility | Yes | Yes |
Important Reality Check
Virtual Threads do not magically solve everything.
They help primarily when workloads are:
✅ I/O bound
✅ Database heavy
✅ API integration heavy
They do not automatically improve:
❌ CPU-bound computation
❌ inefficient SQL
❌ poor application architecture
7. Enabling Virtual Threads in Spring Boot 3.2+
Spring Boot makes adoption surprisingly simple.
application.properties
spring.threads.virtual.enabled=true
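That’s it, provided the application runs on Java 21 or later.
Spring Boot (3.2 and newer) begins using Virtual Threads where supported, including Tomcat request handling and @Async task execution.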
You can also create explicit executors.
@Bean
Executor executor() {
    return Executors.newVirtualThreadPerTaskExecutor();
}
8. Race Conditions: The Hidden Concurrency Bug
Concurrency is not just about speed.
It’s about correctness.
Consider:
@Service
public class CounterService {

    private int counter = 0;

    public int increment() {
        counter++;
        return counter;
    }
}
Looks harmless.
Under concurrency?
Dangerous.
What Actually Happens
Suppose:
counter=20
Two requests execute simultaneously.
Thread-1
Reads:
20
Thread-2
Reads:
20
Thread-1
Writes:
21
Thread-2
Writes:
21
Expected:
22
Actual:
21
Classic race condition.
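The lost update is easy to provoke outside Spring. The sketch below uses a plain class with the same unsynchronized counter++ and hammers it from several threads; the class name, the hammer helper, and the thread and iteration counts are arbitrary illustrative choices. The actual total differs from run to run, so no fixed output is shown.

```java
class LostUpdateSketch {

    private int counter = 0;

    // Same shape as CounterService.increment(): read, add one, write back
    int increment() {
        counter++;
        return counter;
    }

    int value() {
        return counter;
    }

    /** Increments one shared counter from several threads and returns the final value. */
    static int hammer(int threads, int perThread) throws InterruptedException {
        LostUpdateSketch shared = new LostUpdateSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join(); // join() makes the workers' writes visible to this thread
        }
        return shared.value();
    }

    public static void main(String[] args) throws InterruptedException {
        int expected = 8 * 100_000;
        int actual = hammer(8, 100_000);
        System.out.println("expected=" + expected + " actual=" + actual);
    }
}
```

On a multi-core machine the actual value usually comes out below the expected one, because increments that overlap overwrite each other.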
9. Fixing Race Conditions
Option 1 — AtomicInteger
Best lightweight fix.
private final AtomicInteger counter = new AtomicInteger();

public int increment() {
    return counter.incrementAndGet();
}
Atomic operations guarantee thread safety.
Option 2 — synchronized
public synchronized int increment() {
    counter++;
    return counter;
}
Simple.
Reliable.
But can reduce throughput.
Option 3 — ReentrantLock
More control.
private final Lock lock = new ReentrantLock();

public int increment() {
    lock.lock();
    try {
        counter++;
        return counter;
    } finally {
        lock.unlock();
    }
}
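Useful for advanced concurrency control such as tryLock, timeouts, and fairness.
On Java 21 it also avoids pinning a virtual thread to its carrier thread, which can happen when code blocks inside a synchronized section.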
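A quick way to convince yourself that all three options are correct is to run the same multi-threaded stress loop against each. The sketch below is a hypothetical harness: the Counter interface, the three small classes, and the hammer helper are illustrative names, not part of any library.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class SafeCounterSketch {

    interface Counter {
        int increment();
        int get();
    }

    static final class AtomicCounter implements Counter {
        private final AtomicInteger counter = new AtomicInteger();
        public int increment() { return counter.incrementAndGet(); }
        public int get() { return counter.get(); }
    }

    static final class SynchronizedCounter implements Counter {
        private int counter;
        public synchronized int increment() { return ++counter; }
        public synchronized int get() { return counter; }
    }

    static final class LockedCounter implements Counter {
        private final Lock lock = new ReentrantLock();
        private int counter;
        public int increment() {
            lock.lock();
            try { return ++counter; } finally { lock.unlock(); }
        }
        public int get() {
            lock.lock();
            try { return counter; } finally { lock.unlock(); }
        }
    }

    /** Increments the counter from several threads and returns the final value. */
    static int hammer(Counter counter, int threads, int perThread) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Each line prints 400000: no increments are lost
        System.out.println(hammer(new AtomicCounter(), 4, 100_000));
        System.out.println(hammer(new SynchronizedCounter(), 4, 100_000));
        System.out.println(hammer(new LockedCounter(), 4, 100_000));
    }
}
```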
10. Async Execution in Spring Boot
Not every task should block request threads.
Examples:
- email sending
- PDF generation
- notifications
- report processing
Java 8 — @Async
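Enable async on a configuration class.
@Configuration
@EnableAsync
public class AsyncConfig {
}
Create async service.
The method must be called from another Spring bean so that the call goes through the async proxy.
@Async
public CompletableFuture<String> process() throws InterruptedException {
    Thread.sleep(3000); // simulated slow work
    return CompletableFuture.completedFuture("done");
}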
Controller returns quickly.
Heavy work continues separately.
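Outside Spring, the same hand-off can be approximated with CompletableFuture.supplyAsync and an explicit executor. This is a rough sketch of the general shape, not a description of how @Async is implemented; the class name and the latch that stands in for slow work are assumptions made so the example behaves deterministically.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncHandoffSketch {

    /** Starts the slow work on the executor and returns immediately with a future. */
    static CompletableFuture<String> process(ExecutorService executor, CountDownLatch workMayFinish) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                workMayFinish.await(); // stands in for a slow email or PDF job
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "done";
        }, executor);
    }

    static String demo() throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            CountDownLatch workMayFinish = new CountDownLatch(1);
            CompletableFuture<String> future = process(executor, workMayFinish);
            // The caller has its future back although the work has not finished yet
            boolean returnedEarly = !future.isDone();
            workMayFinish.countDown();
            return returnedEarly + ":" + future.get();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "true:done"
    }
}
```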
Custom Executor
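@Bean
Executor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(10);
    executor.setMaxPoolSize(50);
    // The pool grows beyond the core size only when the queue is full,
    // and the default queue is effectively unbounded
    executor.setQueueCapacity(100);
    executor.setThreadNamePrefix("async-");
    executor.initialize();
    return executor;
}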
Production systems should almost always configure executors explicitly.
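One detail that is easy to miss when configuring an executor: a java.util.concurrent.ThreadPoolExecutor, which Spring's ThreadPoolTaskExecutor delegates to, only grows past its core size once its queue is full. The sketch below illustrates that with small made-up numbers (core 2, max 4, 6 blocking tasks); the class and helper names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthSketch {

    /** Submits blocking tasks and reports how many threads the pool created. */
    static int poolSizeUnderLoad(int core, int max, BlockingQueue<Runnable> queue, int tasks)
            throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 30L, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until measured
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int size = pool.getPoolSize();
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        // Unbounded queue: extra tasks are queued, the pool stays at core size. Prints 2
        System.out.println(poolSizeUnderLoad(2, 4, new LinkedBlockingQueue<>(), 6));
        // Queue of 2: once it is full, the pool grows to max size. Prints 4
        System.out.println(poolSizeUnderLoad(2, 4, new ArrayBlockingQueue<>(2), 6));
    }
}
```

In other words, a max pool size without a bounded queue capacity is a setting that never takes effect.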
Java 21 Virtual Thread Async
@Bean
Executor executor() {
    return Executors.newVirtualThreadPerTaskExecutor();
}
Simpler scalability story.
Less thread pool tuning.
11. Load Testing Spring Boot Concurrency
Theory is useful.
Measurement matters more.
ApacheBench Example
ab -n 10000 -c 200 \
   http://localhost:8080/orders
Parameters:
-n = total requests
-c = concurrency level
Example Test
Total Requests: 10,000
Concurrency: 200
Observe:
- response time
- throughput
- active threads
- CPU usage
- GC activity
Common Failure Pattern
Under heavy load:
CPU low
Latency high
Requests waiting
Cause?
Usually blocked worker threads.
Not CPU exhaustion.
Virtual Threads can dramatically improve this scenario when waiting dominates execution time.
12. Production Lessons Learned
After enough production incidents, some patterns become obvious.
Lesson 1 — Bigger Thread Pools Are Not Always Better
Wrong instinct:
Performance issue?
Increase max threads to 5000.
Usually bad idea.
Symptoms:
- memory pressure
- context switching
- degraded throughput
Lesson 2 — Database Pools Become Bottlenecks
Your application may support:
500 concurrent threads
But HikariCP:
maximumPoolSize=20
Now 480 threads wait.
Concurrency shifted.
Problem not solved.
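The effect can be modelled without a database. In the sketch below a Semaphore stands in for the connection pool and a short sleep stands in for a query; the numbers are scaled down (50 application threads, 5 connections) and all names are hypothetical. It is a model of the bottleneck, not of HikariCP itself.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionPoolBottleneckSketch {

    /** Returns the highest number of simulated queries that ever ran at the same time. */
    static int peakConcurrentQueries(int appThreads, int poolSize, int requests)
            throws InterruptedException {
        Semaphore connections = new Semaphore(poolSize); // stand-in for the connection pool
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService app = Executors.newFixedThreadPool(appThreads);
        for (int i = 0; i < requests; i++) {
            app.execute(() -> {
                try {
                    connections.acquire(); // threads without a connection wait here
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // simulated query
                    } finally {
                        inFlight.decrementAndGet();
                        connections.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        app.shutdown();
        app.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int peak = peakConcurrentQueries(50, 5, 200);
        System.out.println("peak concurrent queries: " + peak + " (pool size 5)");
    }
}
```

However many application threads are added, the peak never exceeds the pool size; the extra threads only queue in front of the pool.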
Lesson 3 — Blocking External APIs Matter
Third-party APIs frequently dominate latency.
Example:
App logic → 20ms
External API → 2.5 sec
Threads spend most of life waiting.
Ideal use case for Virtual Threads.
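A back-of-the-envelope calculation shows why. With one thread per request, the number of in-flight requests cannot exceed the number of worker threads, so by Little's Law the throughput ceiling is roughly threads divided by time per request. The sketch assumes the 200-thread pool from earlier and that every request holds its thread for the full 2.52 seconds; the class and method names are illustrative.

```java
import java.util.Locale;

class ThroughputCeilingSketch {

    /** Upper bound on requests per second when each request occupies one thread. */
    static double maxRequestsPerSecond(int threads, double secondsPerRequest) {
        return threads / secondsPerRequest;
    }

    public static void main(String[] args) {
        double secondsPerRequest = 0.020 + 2.5; // app logic + external API
        double ceiling = maxRequestsPerSecond(200, secondsPerRequest);
        // prints "ceiling: 79.4 requests/second"
        System.out.println(String.format(Locale.ROOT, "ceiling: %.1f requests/second", ceiling));
    }
}
```

About 79 requests per second, with the CPU mostly idle, is the kind of ceiling that cheap-to-block threads are meant to lift.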
Lesson 4 — Measure Before Optimizing
Always collect:
- thread dumps
- heap metrics
- CPU profiling
- latency distribution
- database timings
Performance tuning without metrics becomes guesswork.
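For latency in particular, look at percentiles rather than the average. The sketch below uses the nearest-rank method on a handful of invented sample values (in milliseconds) to show how a single slow request distorts the mean; the class name and the data are made up for illustration.

```java
import java.util.Arrays;

class LatencyPercentileSketch {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] samplesMs = {12, 15, 11, 14, 13, 16, 12, 15, 14, 2500};
        double mean = Arrays.stream(samplesMs).average().orElse(0);
        System.out.println("mean=" + mean);                      // prints "mean=262.2"
        System.out.println("p50=" + percentile(samplesMs, 50));  // prints "p50=14"
        System.out.println("p99=" + percentile(samplesMs, 99));  // prints "p99=2500"
    }
}
```

The mean of 262 ms describes none of the requests: most took about 14 ms and one took 2.5 seconds.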
13. When Should You Use What?
Use Platform Threads When
You have:
- moderate traffic
- mature Java 8 stack
- CPU-heavy workloads
- predictable thread requirements
Use Virtual Threads When
You have:
- high concurrent traffic
- blocking I/O
- REST integrations
- database waiting
- Java 21 adoption path
Consider Reactive When
You need:
- extreme scalability
- event-driven streaming
- fully non-blocking architecture
But remember:
Reactive introduces complexity.
Virtual Threads preserve familiar programming models.
Final Thoughts
Concurrency in Spring Boot is not just about adding more threads.
It’s about understanding:
- request lifecycle
- Tomcat worker behavior
- blocking vs non-blocking execution
- synchronization correctness
- resource bottlenecks
Platform Threads served Java extremely well for years.
Virtual Threads do not replace engineering discipline.
But they dramatically improve how we think about scalable, blocking applications.
The next time someone says:
“Spring Boot handles concurrency automatically.”
You’ll know what’s actually happening under the hood.