Introduction
In the previous article, we learned that a Stream is not a data structure but a processing pipeline that transforms data through a series of operations. We explored concepts such as lazy evaluation, internal iteration, and the difference between intermediate and terminal operations.
Now it’s time to examine the operations that form the heart of every Stream pipeline.
Whether you’re building a REST API, processing millions of records from a database, transforming Kafka events, or generating reports, you’ll spend most of your time using intermediate operations.
These operations allow you to:
- Filter unwanted data.
- Transform one object into another.
- Flatten nested collections.
- Remove duplicates.
- Sort results.
- Limit returned records.
- Skip unwanted records.
Understanding how these operations work individually—and more importantly, how they work together—is essential for writing clean, efficient, and maintainable Java applications.
In this article, we’ll explore each operation in depth, understand how the JVM executes them, and apply them to real-world Spring Boot microservices.
Learning Objectives
By the end of this article, you will be able to:
- Understand every intermediate Stream operation.
- Explain how lazy evaluation affects execution.
- Build efficient Stream pipelines.
- Understand the internal processing order.
- Apply intermediate operations in enterprise applications.
- Avoid common performance pitfalls.
- Refactor legacy Java loops into modern Stream pipelines.
What Are Intermediate Operations?
Intermediate operations transform one Stream into another.
Unlike terminal operations, they do not execute immediately.
Instead, they build a processing pipeline that is executed only when a terminal operation is encountered.
For example:
List<String> employeeNames = employees.stream()
.filter(Employee::isActive)
.map(Employee::getName)
.sorted()
.toList();
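Nothing happens until toList() is called. (Stream.toList() requires Java 16 or later; on older versions, use collect(Collectors.toList()).)
The Stream library first assembles the pipeline and only runs it once the terminal operation starts. Stateless operations such as filter() and map() are fused into a single traversal of the source; sorted() is the exception, because it has to collect every element before it can emit the first one.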
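Laziness is easy to observe by recording when the predicate actually runs. The following is a minimal sketch; the class name, the log list, and the sample names are assumptions made for illustration, and the side effect inside filter() exists only to make the timing visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any real application.
class LazyEvaluationDemo {

    // Records every predicate call so we can see when filter() really executes.
    static final List<String> log = new ArrayList<>();

    // Builds the pipeline but does not attach a terminal operation.
    static Stream<String> buildPipeline(List<String> names) {
        return names.stream()
                .filter(name -> {
                    log.add("filter:" + name); // side effect for demonstration only
                    return name.length() > 3;
                })
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        Stream<String> pipeline = buildPipeline(List.of("Asha", "Raj", "Meera"));
        System.out.println("Predicate calls after building: " + log.size());   // 0

        List<String> result = pipeline.toList(); // terminal operation triggers execution
        System.out.println("Predicate calls after toList(): " + log.size());   // 3
        System.out.println(result);                                            // [ASHA, MEERA]
    }
}
```

Building the pipeline only records what should happen; the predicate is not invoked until the terminal operation pulls elements through it.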
Understanding Pipeline Execution
Many developers incorrectly assume Streams execute one operation at a time.
They imagine this sequence:
Filter entire collection
↓
Map entire collection
↓
Sort everything
↓
Collect results
That is not how Streams work.
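Instead, stateless operations such as filter() and map() process one element at a time: each element travels as far down the pipeline as it can before the next element is pulled from the source.
Employee 1
↓
filter()
↓
map()
↓
Employee 2
↓
filter()
↓
map()
↓
sorted() (buffers until every element has arrived)
↓
toList()
sorted() is the exception. It is a stateful operation that cannot emit anything until it has seen every element, so it acts as a barrier: elements reach it one by one, are buffered, and only then continue downstream in sorted order.
For the stateless stages, this avoids building an intermediate collection after every step.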
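The actual order of execution can be checked by logging each stage. This is a small sketch with hypothetical names; it runs the same pipeline with and without sorted() so the difference between stateless stages and a stateful barrier shows up in the trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class for tracing pipeline execution order.
class ExecutionOrderDemo {

    // Returns the order in which the stages saw each element.
    static List<String> trace(List<String> source, boolean withSort) {
        List<String> events = new ArrayList<>();
        Stream<String> stream = source.stream()
                .filter(s -> { events.add("filter " + s); return true; })
                .map(s -> { events.add("map " + s); return s; });
        if (withSort) {
            stream = stream.sorted(); // stateful: buffers everything before emitting
        }
        stream.forEach(s -> events.add("consume " + s));
        return events;
    }

    public static void main(String[] args) {
        // Stateless stages interleave per element:
        // [filter b, map b, consume b, filter a, map a, consume a]
        System.out.println(trace(List.of("b", "a"), false));

        // sorted() holds elements back until all have arrived:
        // [filter b, map b, filter a, map a, consume a, consume b]
        System.out.println(trace(List.of("b", "a"), true));
    }
}
```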
filter()
Purpose
Filters elements based on a condition.
Method signature
Stream<T> filter(Predicate<? super T> predicate)
Example
Suppose we need only active employees.
Traditional Java
List<Employee> active = new ArrayList<>();
for(Employee employee : employees){
if(employee.isActive()){
active.add(employee);
}
}
Using Streams
List<Employee> active = employees.stream()
.filter(Employee::isActive)
.toList();
The intent is immediately clear.
Enterprise Example
Retrieve all approved loan applications.
List<LoanApplication> approved = loans.stream()
.filter(LoanApplication::isApproved)
.toList();
Best Practices
- Keep predicates simple.
- Avoid modifying objects inside filter().
- Prefer method references where possible.
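One way to keep predicates simple is to give each condition a name and combine them with Predicate.and(). The sketch below assumes a hypothetical Employee record with name, active, and salary fields; the threshold is illustrative.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical demo class; the Employee record is an assumption for illustration.
class FilterDemo {

    record Employee(String name, boolean active, double salary) {}

    // Small, named, side-effect-free predicates are easy to read and to test.
    static final Predicate<Employee> IS_ACTIVE = Employee::active;
    static final Predicate<Employee> EARNS_OVER_50K = e -> e.salary() > 50_000;

    static List<String> activeHighEarners(List<Employee> employees) {
        return employees.stream()
                .filter(IS_ACTIVE.and(EARNS_OVER_50K)) // compose instead of nesting conditions
                .map(Employee::name)
                .toList();
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
                new Employee("Asha", true, 72_000),
                new Employee("Raj", false, 90_000),
                new Employee("Meera", true, 40_000));
        System.out.println(activeHighEarners(employees)); // [Asha]
    }
}
```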
map()
Purpose
Transforms one object into another.
Method signature
<R> Stream<R> map(Function<? super T, ? extends R> mapper)
Example
Convert employees into employee names.
List<String> names = employees.stream()
.map(Employee::getName)
.toList();
The output Stream now contains String objects instead of Employee objects.
Enterprise Example
Entity → DTO conversion.
List<EmployeeResponse> response = employees.stream()
.map(EmployeeMapper::toResponse)
.toList();
This is one of the most common Stream operations in Spring Boot REST APIs.
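To make the entity-to-DTO step concrete, here is a self-contained sketch. The Employee and EmployeeResponse records and the static EmployeeMapper are hypothetical stand-ins; a real project might use a hand-written mapper, a framework-managed bean, or a mapping library instead.

```java
import java.util.List;

// Hypothetical demo class; all nested types are assumptions for illustration.
class MappingDemo {

    record Employee(long id, String name, String department, double salary) {}

    // The response deliberately omits salary, a common reason for using DTOs.
    record EmployeeResponse(long id, String displayName) {}

    static final class EmployeeMapper {
        private EmployeeMapper() {}

        static EmployeeResponse toResponse(Employee e) {
            return new EmployeeResponse(e.id(), e.name() + " (" + e.department() + ")");
        }
    }

    static List<EmployeeResponse> toResponses(List<Employee> employees) {
        return employees.stream()
                .map(EmployeeMapper::toResponse) // Stream<Employee> -> Stream<EmployeeResponse>
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(toResponses(List.of(new Employee(1, "Asha", "IT", 72_000))));
    }
}
```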
mapToInt(), mapToLong(), mapToDouble()
Primitive Streams eliminate unnecessary boxing.
Instead of:
employees.stream()
.map(Employee::getSalary)
Use:
employees.stream()
.mapToDouble(Employee::getSalary)
.sum();
Benefits:
- Better performance.
- Lower memory usage.
- Fewer temporary objects.
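A short sketch of what the primitive variant offers beyond sum(). It assumes a hypothetical Employee record whose salary is a primitive double; DoubleStream also provides average(), min(), max(), and summaryStatistics().

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;

// Hypothetical demo class; the Employee record is an assumption for illustration.
class PrimitiveStreamDemo {

    record Employee(String name, double salary) {}

    static double totalSalary(List<Employee> employees) {
        return employees.stream()
                .mapToDouble(Employee::salary) // DoubleStream: no Double boxing per element
                .sum();
    }

    // One pass yields count, sum, min, max, and average together.
    static DoubleSummaryStatistics salaryStats(List<Employee> employees) {
        return employees.stream()
                .mapToDouble(Employee::salary)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
                new Employee("Asha", 1000.0),
                new Employee("Raj", 2000.0),
                new Employee("Meera", 3000.0));
        System.out.println(totalSalary(employees));              // 6000.0
        System.out.println(salaryStats(employees).getAverage()); // 2000.0
    }
}
```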
flatMap()
The Problem
Consider a customer with multiple accounts.
Customer
  Accounts
    Savings
    Current
    Salary
Suppose we have:
List<Customer>
Each customer contains:
List<Account>
Using map() produces:
Stream<List<Account>>
We actually want:
Stream<Account>
This is exactly what flatMap() does.
Example
List<Account> accounts = customers.stream()
.flatMap(customer -> customer.getAccounts().stream())
.toList();
Nested collections become a single continuous Stream.
Enterprise Example
Flatten all order items across every customer.
orders.stream()
.flatMap(order -> order.getItems().stream())
.toList();
This avoids nested loops and results in highly readable code.
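The difference between map() and flatMap() is clearest when both are run on the same data. The sketch below uses hypothetical Customer and Account records.

```java
import java.util.List;

// Hypothetical demo class; the records are assumptions for illustration.
class FlatMapDemo {

    record Account(String type) {}
    record Customer(String name, List<Account> accounts) {}

    // map() keeps the nesting: one inner list per customer.
    static List<List<Account>> nested(List<Customer> customers) {
        return customers.stream()
                .map(Customer::accounts)
                .toList();
    }

    // flatMap() replaces each customer with the contents of its account list.
    static List<Account> flat(List<Customer> customers) {
        return customers.stream()
                .flatMap(customer -> customer.accounts().stream())
                .toList();
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
                new Customer("Asha", List.of(new Account("Savings"), new Account("Current"))),
                new Customer("Raj", List.of(new Account("Salary"))));
        System.out.println(nested(customers).size()); // 2 (lists)
        System.out.println(flat(customers).size());   // 3 (accounts)
    }
}
```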
distinct()
Removes duplicate elements.
Example
employees.stream()
.map(Employee::getDepartment)
.distinct()
.toList();
Output
IT
Finance
HR
Operations
Internally, distinct() uses equals() and hashCode().
For custom objects, ensure these methods are implemented correctly.
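The effect of a missing equals()/hashCode() pair is easy to reproduce. In this sketch (all type names are hypothetical), a plain class falls back to identity comparison, while a record gets value-based equals() and hashCode() generated from its components.

```java
import java.util.stream.Stream;

// Hypothetical demo class contrasting identity and value equality under distinct().
class DistinctDemo {

    // No equals()/hashCode(): two instances are equal only if they are the same object.
    static final class PlainDepartment {
        final String name;
        PlainDepartment(String name) { this.name = name; }
    }

    // Records derive equals()/hashCode() from their components.
    record Department(String name) {}

    static int distinctPlainCount() {
        return Stream.of(new PlainDepartment("IT"), new PlainDepartment("IT"))
                .distinct()
                .toList()
                .size();
    }

    static int distinctRecordCount() {
        return Stream.of(new Department("IT"), new Department("IT"))
                .distinct()
                .toList()
                .size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPlainCount());  // 2: duplicates are not detected
        System.out.println(distinctRecordCount()); // 1: duplicates are removed
    }
}
```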
sorted()
Sorts Stream elements.
Natural ordering
employees.stream()
.map(Employee::getName)
.sorted()
.toList();
Custom sorting
employees.stream()
.sorted(Comparator.comparing(Employee::getSalary))
.toList();
Reverse order
employees.stream()
.sorted(
Comparator.comparing(Employee::getSalary)
.reversed())
.toList();
Multi-Level Sorting
employees.stream()
.sorted(
Comparator.comparing(Employee::getDepartment)
.thenComparing(Employee::getName))
.toList();
This is frequently used in reporting APIs.
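A common variation is mixing directions, for example department ascending and salary descending. One detail worth knowing is that calling reversed() at the end of a chain reverses the whole chain, so the sketch below (hypothetical Employee record and sample data) reverses only the secondary comparator.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; the Employee record is an assumption for illustration.
class SortingDemo {

    record Employee(String name, String department, double salary) {}

    // Department ascending, then salary descending within each department.
    // reversed() is applied to the salary comparator only, not to the whole chain.
    static final Comparator<Employee> BY_DEPARTMENT_THEN_SALARY_DESC =
            Comparator.comparing(Employee::department)
                    .thenComparing(Comparator.comparingDouble(Employee::salary).reversed());

    static List<String> sortedNames(List<Employee> employees) {
        return employees.stream()
                .sorted(BY_DEPARTMENT_THEN_SALARY_DESC)
                .map(Employee::name)
                .toList();
    }

    public static void main(String[] args) {
        List<Employee> employees = List.of(
                new Employee("Asha", "IT", 90_000),
                new Employee("Raj", "Finance", 70_000),
                new Employee("Meera", "IT", 120_000));
        System.out.println(sortedNames(employees)); // [Raj, Meera, Asha]
    }
}
```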
limit()
Returns only the first N elements.
employees.stream()
.limit(10)
.toList();
Enterprise Example
Top 10 highest-paid employees.
employees.stream()
.sorted(
Comparator.comparing(Employee::getSalary)
.reversed())
.limit(10)
.toList();
Useful for dashboards and leaderboards.
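limit() is also a short-circuiting operation: once N elements have passed through, the pipeline stops pulling from the source. A small sketch with an infinite source (the class and method names are hypothetical) shows that the pipeline still terminates.

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class showing that limit() short-circuits an infinite source.
class LimitDemo {

    static List<Integer> firstEvenSquares(int n) {
        return Stream.iterate(1, i -> i + 1) // infinite source: 1, 2, 3, ...
                .filter(i -> i % 2 == 0)
                .map(i -> i * i)
                .limit(n)                    // stops the pipeline after n results
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // [4, 16, 36]
    }
}
```

Note that this early exit does not help when sorted() precedes limit(), as in the top-10 example: the sort still has to consume the entire source first.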
skip()
Skips the first N elements.
employees.stream()
.skip(20)
.limit(10)
.toList();
Commonly used for pagination.
Pagination Example
List<Employee> page = employees.stream()
.skip((long) pageNumber * pageSize) // zero-based page number; long arithmetic avoids int overflow
.limit(pageSize)
.toList();
Although useful for in-memory collections, database pagination should generally be handled by the database (for example, using SQL OFFSET/FETCH or Spring Data pagination) rather than loading all rows into memory.
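For in-memory lists, the skip/limit pair can be wrapped in a small helper. This is a sketch under the assumption that page numbers are zero-based; the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical helper for paginating an in-memory list.
class PaginationDemo {

    // pageNumber is zero-based: page 0 holds the first pageSize elements.
    static <T> List<T> page(List<T> items, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        return items.stream()
                .skip((long) pageNumber * pageSize) // widen first to avoid int overflow
                .limit(pageSize)
                .toList();
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(items, 1, 3)); // [4, 5, 6]
        System.out.println(page(items, 2, 3)); // [7]
        System.out.println(page(items, 5, 3)); // [] (past the end)
    }
}
```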
Combining Intermediate Operations
A realistic Stream pipeline might look like this:
List<EmployeeResponse> response = employees.stream()
.filter(Employee::isActive)
.filter(employee -> employee.getSalary() > 50000)
.sorted(Comparator.comparing(Employee::getDepartment))
.map(EmployeeMapper::toResponse)
.limit(100)
.toList();
Each operation has a single responsibility, making the pipeline easy to read and maintain.
Enterprise Case Study
Imagine a banking microservice responsible for returning premium customers.
Business Rules:
- Customer must be active.
- Account balance must exceed ₹1,000,000.
- Sort by customer name.
- Convert to API response.
- Return the first 100 records.
Implementation:
List<CustomerResponse> premiumCustomers = customers.stream()
.filter(Customer::isActive)
.filter(customer -> customer.getBalance() > 1_000_000)
.sorted(Comparator.comparing(Customer::getName))
.map(CustomerMapper::toResponse)
.limit(100)
.toList();
The business rules are expressed directly in code without temporary collections or nested loops.
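For readers who want to run the case study, here is a self-contained version. The Customer and CustomerResponse records, the toResponse method, and the sample data are assumptions made for illustration; the pipeline itself follows the business rules listed above.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical runnable version of the premium-customer pipeline.
class PremiumCustomersDemo {

    record Customer(String name, boolean active, double balance) {}
    record CustomerResponse(String name, double balance) {}

    // Stand-in for CustomerMapper::toResponse.
    static CustomerResponse toResponse(Customer c) {
        return new CustomerResponse(c.name(), c.balance());
    }

    static List<CustomerResponse> premiumCustomers(List<Customer> customers) {
        return customers.stream()
                .filter(Customer::active)                        // rule 1: active
                .filter(c -> c.balance() > 1_000_000)            // rule 2: balance threshold
                .sorted(Comparator.comparing(Customer::name))    // rule 3: sort by name
                .map(PremiumCustomersDemo::toResponse)           // rule 4: convert to response
                .limit(100)                                      // rule 5: first 100 records
                .toList();
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
                new Customer("Zara", true, 2_000_000),
                new Customer("Asha", true, 1_500_000),
                new Customer("Raj", false, 5_000_000),   // inactive: excluded
                new Customer("Meera", true, 500_000));   // below threshold: excluded
        System.out.println(premiumCustomers(customers)); // Asha first, then Zara
    }
}
```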
Performance Tips
- Apply filter() early to reduce the number of elements flowing through the pipeline.
- Use primitive streams (mapToInt(), mapToLong(), mapToDouble()) for numeric operations.
- Avoid expensive operations such as sorted() unless necessary.
- Prefer database filtering and pagination when working with persistent data.
- Keep intermediate operations stateless and free of side effects.
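The first tip can be made measurable by counting comparator calls. The sketch below is illustrative rather than a benchmark: the class name, data size, and filter condition are assumptions, and the exact counts depend on the JDK's sort implementation, so only the relative difference matters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo: how filter placement changes the work done by sorted().
class FilterEarlyDemo {

    static List<Integer> shuffledNumbers(int n) {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            data.add(i);
        }
        Collections.shuffle(data, new Random(42)); // fixed seed for repeatability
        return data;
    }

    // Runs the pipeline and returns how many comparisons the sort performed.
    static int comparisons(List<Integer> data, boolean filterFirst) {
        AtomicInteger count = new AtomicInteger();
        Comparator<Integer> counting = (a, b) -> {
            count.incrementAndGet();
            return Integer.compare(a, b);
        };
        Stream<Integer> stream = data.stream();
        stream = filterFirst
                ? stream.filter(n -> n % 100 == 0).sorted(counting)  // sorts 1% of the data
                : stream.sorted(counting).filter(n -> n % 100 == 0); // sorts everything
        stream.toList();
        return count.get();
    }

    public static void main(String[] args) {
        List<Integer> data = shuffledNumbers(10_000);
        System.out.println("filter then sort: " + comparisons(data, true));
        System.out.println("sort then filter: " + comparisons(data, false));
    }
}
```

Both orderings produce the same result, but filtering first leaves sorted() with far fewer elements to compare.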
Common Mistakes
Using map() instead of flatMap()
This often results in nested collections (Stream<List<T>>) when a flat sequence (Stream<T>) is required.
Performing Database Calls Inside Stream Operations
Avoid invoking repositories or external services inside map() or filter(). This can lead to N+1 query problems, unpredictable latency, and code that’s difficult to test.
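One common alternative is to collect the keys first, fetch everything in one bulk call, and then map against the in-memory result. The sketch below uses a hypothetical CustomerRepository backed by a Map and counting its "queries"; it is not a real Spring Data interface, and findNamesByIds is an assumed method name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo of replacing per-element lookups with one bulk lookup.
class BatchLookupDemo {

    record Order(long id, long customerId) {}
    record OrderView(long orderId, String customerName) {}

    // Stand-in for a database-backed repository; counts simulated round trips.
    static final class CustomerRepository {
        private final Map<Long, String> table;
        int queries = 0;

        CustomerRepository(Map<Long, String> table) {
            this.table = table;
        }

        // Assumed bulk method: one round trip for many ids.
        Map<Long, String> findNamesByIds(Set<Long> ids) {
            queries++;
            Map<Long, String> result = new HashMap<>();
            for (Long id : ids) {
                String name = table.get(id);
                if (name != null) {
                    result.put(id, name);
                }
            }
            return result;
        }
    }

    static List<OrderView> toViews(List<Order> orders, CustomerRepository repository) {
        // Step 1: gather the distinct keys without touching the repository.
        Set<Long> customerIds = orders.stream()
                .map(Order::customerId)
                .collect(Collectors.toSet());

        // Step 2: a single bulk lookup instead of one call per order.
        Map<Long, String> names = repository.findNamesByIds(customerIds);

        // Step 3: a pure in-memory map() with no I/O inside the pipeline.
        return orders.stream()
                .map(order -> new OrderView(order.id(), names.get(order.customerId())))
                .toList();
    }

    public static void main(String[] args) {
        CustomerRepository repository = new CustomerRepository(Map.of(1L, "Asha", 2L, "Raj"));
        List<Order> orders = List.of(new Order(10, 1), new Order(11, 2), new Order(12, 1));
        System.out.println(toViews(orders, repository));
        System.out.println("queries: " + repository.queries); // 1
    }
}
```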
Overly Complex Pipelines
If a pipeline becomes difficult to read, extract portions into well-named helper methods.
Interview Questions
Q. What is the difference between map() and flatMap()?
map() transforms one element into another. flatMap() transforms one element into a Stream and then flattens all resulting Streams into a single Stream.
Q. Why is filter() considered lazy?
Because it does not execute until a terminal operation begins processing the pipeline.
Q. Does distinct() use equals()?
Yes. distinct() relies on equals() and hashCode() to identify duplicate elements.
Q. Should pagination be implemented using skip() and limit() for database queries?
Generally no. For large datasets, pagination should be delegated to the database. skip() and limit() are most appropriate for in-memory collections or already-materialized Streams.
Hands-On Exercise
Build a Spring Boot REST endpoint that:
- Retrieves a list of bank customers.
- Filters active customers.
- Filters customers with balances above ₹500,000.
- Sorts them by city and then by name.
- Converts entities to response DTOs.
- Returns the first 25 results.
Next, implement the same logic in traditional Java 7 style, using loops and temporary collections. Compare the two versions for readability, maintainability, and the amount of code required.
Summary
Intermediate operations are the building blocks of every Stream pipeline. They allow developers to express complex business rules declaratively while benefiting from lazy evaluation and a fluent programming model.
Choosing the right operation—filter() for selection, map() for transformation, flatMap() for flattening nested structures, distinct() for deduplication, sorted() for ordering, and limit()/skip() for slicing—results in code that is both expressive and maintainable.
Understanding these operations is essential before exploring terminal operations and collectors, where Streams become even more powerful.
Coming Up Next
Part 6 – Stream Terminal Operations: Mastering collect(), reduce(), count(), findFirst(), findAny(), min(), max(), anyMatch(), allMatch(), noneMatch(), and toList()
In the next article, we’ll explore how Stream pipelines produce results, examine reduction operations, compare collectors, and build production-ready aggregation pipelines used in Spring Boot microservices.