Building Your First RAG Application Using Spring AI
“Theory teaches concepts. Projects teach engineering.”
Introduction
In the previous articles, we covered:
Part 1
- LLMs
- Tokens
- Transformers
Part 2
- Embeddings
- Vector databases
- Semantic search
Part 3
- RAG architecture
- Retrieval pipelines
- Chunking
- Similarity search
Now it is time to build our first application.
Our goal:
Upload documents and ask questions about them.
This is the foundation of:
- Company assistants
- Architecture copilots
- HR bots
- Support systems
- Documentation assistants
What We Are Building
Enterprise Knowledge Assistant
Users can:
✔ Upload PDF documents.
✔ Generate embeddings.
✔ Store vectors.
✔ Ask questions.
✔ Receive answers from documents.
Architecture
             +------------------+
             |       User       |
             +---------+--------+
                       |
                       v
             +------------------+
             | Spring Boot API  |
             +---------+--------+
                       |
        +--------------+-------------+
        |                            |
        v                            v
+---------------+            +----------------+
| Vector Store  |            | OpenAI / LLM   |
| PGVector      |            | GPT / Claude   |
+---------------+            +----------------+
        ^
        |
 +------+------+
 | Embeddings  |
 +-------------+
Technology Stack
| Component | Technology |
|---|---|
| Language | Java 21 |
| Framework | Spring Boot 3 |
| AI Framework | Spring AI |
| Vector DB | PGVector |
| LLM | OpenAI |
| Database | PostgreSQL |
| Build | Maven |
Why Spring AI?
Spring AI gives Java developers:
- Chat clients
- Embeddings
- Vector stores
- Prompt templates
- RAG support
- Advisors
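It feels very similar to Spring Data: you code against portable interfaces such as ChatClient and VectorStore, and the starters wire up the concrete provider.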
Project Structure
src
├── controller
├── service
├── config
├── documents
└── resources
Step 1: Add Dependencies
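<!-- Versions are assumed to be managed by the Spring AI BOM (spring-ai-bom). -->
<!-- Spring AI 1.0 renamed these starters to spring-ai-starter-model-openai and
     spring-ai-starter-vector-store-pgvector; use the names that match your version. -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
</dependency>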
Step 2: PostgreSQL with PGVector
Docker:
services:
  postgres:
    image: pgvector/pgvector:pg16
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: ragdb
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
Start:
docker compose up -d
Enable Vector Extension
CREATE EXTENSION IF NOT EXISTS vector;
Application Properties
spring.datasource.url=jdbc:postgresql://localhost:5432/ragdb
spring.datasource.username=postgres
spring.datasource.password=password
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.vectorstore.pgvector.initialize-schema=true
Step 3: Configure Chat Client
@Configuration
public class AiConfig {
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.build();
    }
}
Step 4: Embedding Model
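Spring AI auto-configures an EmbeddingModel bean from the OpenAI starter, so you can inject it directly:
@Autowired
EmbeddingModel embeddingModel;
It converts text such as:
Spring Boot deployment
into a numeric vector (an embedding) that can be compared with other vectors.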
Step 5: Vector Store
@Autowired
VectorStore vectorStore;
This stores:
- Text
- Embeddings
- Metadata
Understanding the Flow
Document
↓
Chunking
↓
Embedding
↓
PGVector
Creating Documents
Document document = new Document("""
Spring Boot applications can be deployed
using Docker containers on AWS ECS.
""");
Storing Documents
vectorStore.add(List.of(document));
This automatically:
- Generates embeddings.
- Stores vectors.
- Saves content.
Similarity Search
List<Document> documents =
    vectorStore.similaritySearch(
        "How do I deploy Spring applications?"
    );
The store embeds the query and returns the documents whose vectors are closest to it, even when they share few keywords with the question.
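To make "semantically similar" concrete, here is a minimal plain-Java sketch of what a similarity search does under the hood: score every stored vector against the query vector and keep the best matches. It is not Spring AI code. The class name, the hand-made three-dimensional vectors, and the choice of cosine similarity as the metric are all assumptions made for illustration; real embeddings have hundreds or thousands of dimensions.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a vector store: ranks stored texts by cosine similarity. */
final class SimilaritySearchSketch {

    /** Cosine similarity: dot(a, b) / (|a| * |b|). Assumes equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k stored texts most similar to the query vector, best match first. */
    static List<String> topK(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        // Hypothetical, hand-made "embeddings"; a real EmbeddingModel would produce these.
        Map<String, double[]> store = Map.of(
                "Deploy Spring Boot with Docker on ECS", new double[] {0.9, 0.1, 0.0},
                "Leave policy for employees", new double[] {0.0, 0.2, 0.9},
                "Kubernetes deployment guide", new double[] {0.8, 0.3, 0.1});
        // Pretend embedding of "How do I deploy Spring applications?"
        double[] query = {1.0, 0.0, 0.0};
        System.out.println(topK(store, query, 2));
    }
}
```

With these made-up vectors, the two deployment texts rank above the leave policy, because their vectors point in nearly the same direction as the query.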
Question Answering Service
@Service
public class ChatService {
    private final ChatClient chatClient;
    public ChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    public String ask(String question) {
        return chatClient.prompt()
            .user(question)
            .call()
            .content();
    }
}
But This Isn’t RAG Yet
This is merely:
Question
↓
LLM
We need:
Question
↓
Retrieve Documents
↓
Context
↓
LLM
QuestionAnswerAdvisor
Spring AI provides:
QuestionAnswerAdvisor
Example:
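String answer =
    chatClient.prompt()
        .user(question)
        .advisors(
            new QuestionAnswerAdvisor(vectorStore))
        .call()
        .content();
This is RAG. To use it in the application, inject the VectorStore into ChatService and replace the body of ask with this call, so that answers are grounded in the stored documents.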
What Happens Internally?
Question
↓
Embedding
↓
Vector Search
↓
Top Documents
↓
Prompt Construction
↓
LLM
↓
Answer
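The flow above can be sketched in plain Java to show where each step sits. This is only an outline of the idea, not how QuestionAnswerAdvisor is implemented: the Retriever and Llm interfaces, the class name, and the prompt wording are hypothetical stand-ins, and the "model" in main simply echoes the prompt so the example runs without any API key.

```java
import java.util.List;
import java.util.function.Function;

/** Plain-Java outline of the retrieve-then-generate flow; every collaborator is a stub. */
final class RagFlowSketch {

    /** Hypothetical retriever: embeds the question and returns the most relevant chunks. */
    interface Retriever extends Function<String, List<String>> {}

    /** Hypothetical model client: maps a prompt to an answer. */
    interface Llm extends Function<String, String> {}

    /** Prompt construction: retrieved chunks become the context placed before the question. */
    static String buildPrompt(String question, List<String> chunks) {
        return "Answer using only the context below.\n\n"
                + "Context:\n" + String.join("\n---\n", chunks) + "\n\n"
                + "Question:\n" + question;
    }

    static String answer(String question, Retriever retriever, Llm llm) {
        List<String> chunks = retriever.apply(question); // embedding + vector search
        String prompt = buildPrompt(question, chunks);   // prompt construction
        return llm.apply(prompt);                        // LLM call
    }

    public static void main(String[] args) {
        Retriever retriever = q -> List.of(
                "Spring Boot applications can be deployed using Docker containers on AWS ECS.");
        Llm echo = prompt -> "PROMPT SENT TO MODEL:\n" + prompt; // stub instead of a real model
        System.out.println(answer("How do we deploy Spring Boot?", retriever, echo));
    }
}
```

Running it prints the full prompt the model would receive, which is a useful way to see that the LLM only ever answers from whatever the retriever returned.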
REST Controller
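@RestController
@RequestMapping("/chat")
public class ChatController {
    private final ChatService service;
    // Constructor injection, consistent with ChatService
    public ChatController(ChatService service) {
        this.service = service;
    }
    @GetMapping
    public String ask(@RequestParam String question) {
        return service.ask(question);
    }
}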
Example Request
GET /chat?question=How%20do%20we%20deploy%20Spring%20Boot%3F
Response:
Spring Boot applications can be deployed
using Docker containers on AWS ECS.
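Because the question travels as a query parameter, spaces and the trailing question mark have to be encoded by the caller. A small sketch of building the request URL with the JDK follows; the class name and the base URL (localhost on port 8080, Spring Boot's default port) are assumptions for a local run.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ChatUrlSketch {

    /** Builds the /chat URL with the question encoded as a query parameter. */
    static String chatUrl(String baseUrl, String question) {
        return baseUrl + "/chat?question=" + URLEncoder.encode(question, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed local address; change it to wherever the API is running.
        System.out.println(chatUrl("http://localhost:8080", "How do we deploy Spring Boot?"));
        // prints http://localhost:8080/chat?question=How+do+we+deploy+Spring+Boot%3F
    }
}
```

URLEncoder uses form encoding, so spaces become "+" rather than "%20"; both forms are decoded back to spaces when the parameter is read on the server.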
Adding Metadata
Map<String, Object> metadata =
    Map.of(
        "team", "platform",
        "year", "2026"
    );
Document document =
    new Document(content, metadata);
Metadata Filtering
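Restrict the search to documents whose metadata matches a filter expression:
SearchRequest request =
    SearchRequest.builder()
        .query(question)
        .topK(5)
        .filterExpression("team == 'platform'")
        .build();
List<Document> results = vectorStore.similaritySearch(request);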
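Conceptually, a metadata filter narrows the candidate set before similarity ranking is applied. The plain-Java sketch below shows only that narrowing step on an in-memory list. The Chunk record, the class name, and the second sample document are hypothetical and exist purely for illustration; a real vector store evaluates the filter inside the database.

```java
import java.util.List;
import java.util.Map;

final class MetadataFilterSketch {

    /** Minimal stand-in for a stored chunk: text plus metadata. */
    record Chunk(String text, Map<String, Object> metadata) {}

    /** Keeps only chunks whose metadata has key == value; ranking would happen afterwards. */
    static List<Chunk> filter(List<Chunk> chunks, String key, Object value) {
        return chunks.stream()
                .filter(c -> value.equals(c.metadata().get(key)))
                .toList();
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(
                new Chunk("Deploy with Docker on AWS ECS.",
                        Map.of("team", "platform", "year", "2026")),
                new Chunk("Sample HR text.", // made-up content for the example
                        Map.of("team", "hr", "year", "2026")));
        // Only the platform team's chunk survives the filter.
        filter(chunks, "team", "platform").forEach(c -> System.out.println(c.text()));
    }
}
```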
Understanding topK
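topK is the number of chunks the search returns, ranked by similarity:
Top 1 = Only the single most relevant chunk
Top 5 = Broader context
Top 20 = The most context, but also more noise and more tokens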
Typical values:
- 3
- 5
- 10
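A quick way to reason about these values is to estimate how many tokens retrieval adds to each prompt. The sketch below is simple arithmetic; the figure of 800 tokens per chunk is an assumption chosen for the example, so substitute whatever chunk size your splitter actually produces.

```java
final class ContextBudgetSketch {

    /** Rough upper bound on the context tokens that retrieval adds to one prompt. */
    static int contextTokens(int topK, int tokensPerChunk) {
        return topK * tokensPerChunk;
    }

    public static void main(String[] args) {
        int tokensPerChunk = 800; // assumed chunk size, not a measured value
        for (int topK : new int[] {3, 5, 10}) {
            System.out.printf("topK=%d -> about %d context tokens%n",
                    topK, contextTokens(topK, tokensPerChunk));
        }
    }
}
```

Under that assumption, moving from 3 to 10 raises the added context from about 2400 to about 8000 tokens per request, which affects both cost and latency.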
Prompt Engineering
Default:
Answer the question.
Better:
Answer only from the supplied context.
If information is unavailable,
say "I don't know."
Prompt Template
PromptTemplate template =
new PromptTemplate("""
Answer using only the context.
Context:
{context}
Question:
{question}
""");
Hallucination Reduction
Bad:
Explain deployment.
Good:
Explain deployment. Answer only from the retrieved documents, and say "I don't know" if they do not cover it.
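Grounding only works if the retrieved text actually ends up inside the prompt. The sketch below shows, in plain Java, how the {context} and {question} placeholders of the prompt template shown earlier might be filled. It is a naive string replacement written for illustration, not Spring AI's template engine, and the class and method names are hypothetical.

```java
import java.util.Map;

final class TemplateRenderSketch {

    /**
     * Replaces each {name} placeholder with its value. Placeholders without a value
     * are left untouched. Naive by design: values are not escaped.
     */
    static String render(String template, Map<String, String> variables) {
        String result = template;
        for (Map.Entry<String, String> e : variables.entrySet()) {
            result = result.replace("{" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = """
                Answer using only the context.
                Context:
                {context}
                Question:
                {question}
                """;
        System.out.println(render(template, Map.of(
                "context", "Spring Boot applications can be deployed using Docker containers on AWS ECS.",
                "question", "How do we deploy Spring Boot?")));
    }
}
```

Printing the rendered prompt during development is a cheap check that the context really contains the passages you expect before blaming the model for a wrong answer.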
PDF Loading
Spring AI supports:
- PDFs
- Text files
- Markdown
Example:
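Resource resource =
    new FileSystemResource("guide.pdf");
// Assumes the spring-ai-pdf-document-reader module is on the classpath.
List<Document> documents = new PagePdfDocumentReader(resource).get();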
Text Splitting
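TokenTextSplitter splitter =
    new TokenTextSplitter();
// documents: the List<Document> read from the source file
List<Document> chunks = splitter.apply(documents);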
Produces:
Chunk 1
Chunk 2
Chunk 3
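To see what a splitter does to a text, here is a minimal plain-Java chunker. It is a simplification, not TokenTextSplitter: it counts whitespace-separated words instead of model tokens, and the overlap parameter (words shared between neighbouring chunks) is an illustrative extra. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkingSketch {

    /**
     * Splits text into chunks of at most chunkSize words. Each chunk starts
     * (chunkSize - overlap) words after the previous one, so neighbours share
     * `overlap` words.
     */
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("need chunkSize > overlap >= 0");
        }
        if (text.isBlank()) {
            return List.of();
        }
        String[] words = text.trim().split("\\s+");
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last chunk has reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "Spring Boot applications can be deployed using Docker containers on AWS ECS";
        split(text, 5, 2).forEach(System.out::println);
    }
}
```

With a chunk size of 5 and an overlap of 2, the 12-word sentence becomes four chunks, and each chunk repeats the last two words of the one before it so that a sentence cut at a boundary is still readable in at least one chunk.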
Full Document Pipeline
PDF
↓
Text Extraction
↓
Chunking
↓
Embeddings
↓
PGVector
Production Architecture
S3
↓
Document Service
↓
Embedding Service
↓
PGVector
↓
Spring AI API
↓
Users
AWS Version
S3
↓
Lambda
↓
Titan Embeddings
↓
OpenSearch
↓
Bedrock
↓
Spring Boot
Real Enterprise Use Cases
Architecture Assistant
Upload:
- HLD
- LLD
Ask:
Explain the request flow.
API Assistant
Upload:
- Swagger
- OpenAPI
Ask:
Which API creates users?
Support Assistant
Upload:
- Incident reports
Ask:
Have we seen similar issues before?
HR Assistant
Upload:
- Policies
Ask:
How many leave days are allowed?
Common Mistakes
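Chunk Size Too Large
Each chunk mixes several topics, so its embedding is diluted and retrieval becomes imprecise.
Chunk Size Too Small
Chunks lose the surrounding context the LLM needs to answer.
No Metadata
Searches cannot be narrowed by team, year, or source.
Poor Documents
Garbage in, garbage out: outdated or contradictory sources produce wrong answers.
Large topK
Irrelevant chunks crowd the prompt, raising cost and distracting the model.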
Interview Questions
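What is Spring AI?
A framework for building AI applications in Spring, with portable abstractions for chat models, embeddings, and vector stores.
What is VectorStore?
The Spring AI interface for storing documents with their embeddings and running similarity searches over them.
What does QuestionAnswerAdvisor do?
It implements RAG: it retrieves documents relevant to the question from a VectorStore and adds them to the prompt as context before the LLM is called.
Why PGVector?
It stores vectors in PostgreSQL, so embeddings live alongside relational data without a separate database.
What is topK?
The number of most similar documents the search returns.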
Homework
Build:
Employee Handbook Assistant
Documents:
- Leave policy.
- Travel policy.
- WFH policy.
Endpoints:
POST /documents
GET /chat
Capstone Enhancement
Add:
- PDF upload
- Metadata filtering
- User authentication
- Chat history
- Citations
What We Learned
✔ Spring AI integrates AI into Spring Boot.
✔ PGVector stores embeddings.
✔ QuestionAnswerAdvisor enables RAG.
✔ Similarity search retrieves context.
✔ Prompts reduce hallucinations.
✔ Java developers can build AI systems with minimal new concepts.
Coming Next
Part 5 — Prompt Engineering for Enterprise AI
Topics:
- System prompts
- Role prompting
- Few-shot prompting
- Chain of Thought
- Structured output
- JSON generation
- Guardrails
- AI response control
Because in AI systems:
The prompt is becoming the new API.
“RAG is where AI stops being a chatbot and starts becoming an enterprise application.”