AI for Java Architects – Part 4

Building Your First RAG Application Using Spring AI

“Theory teaches concepts. Projects teach engineering.”


Introduction

In the previous articles we learned:

Part 1

  • LLMs
  • Tokens
  • Transformers

Part 2

  • Embeddings
  • Vector databases
  • Semantic search

Part 3

  • RAG architecture
  • Retrieval pipelines
  • Chunking
  • Similarity search

Now it is time to build our first application.

Our goal:

Upload documents and ask questions about them.

This is the foundation of:

  • Company assistants
  • Architecture copilots
  • HR bots
  • Support systems
  • Documentation assistants

What We Are Building

Enterprise Knowledge Assistant

Users can:

✔ Upload PDF documents.

✔ Generate embeddings.

✔ Store vectors.

✔ Ask questions.

✔ Receive answers from documents.


Architecture

                    +------------------+
                    |     User         |
                    +---------+--------+
                              |
                              v
                    +------------------+
                    | Spring Boot API  |
                    +---------+--------+
                              |
               +--------------+-------------+
               |                            |
               v                            v
       +---------------+          +----------------+
       | Vector Store  |          | OpenAI / LLM   |
       | PGVector      |          | GPT / Claude   |
       +---------------+          +----------------+
               ^
               |
        +------+------+
        | Embeddings  |
        +-------------+

Technology Stack

Component      Technology
Language       Java 21
Framework      Spring Boot 3
AI Framework   Spring AI
Vector DB      PGVector
LLM            OpenAI
Database       PostgreSQL
Build          Maven

Why Spring AI?

Spring AI gives Java developers:

  • Chat clients
  • Embeddings
  • Vector stores
  • Prompt templates
  • RAG support
  • Advisors

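It feels very similar to Spring Data: you code against portable abstractions such as ChatClient and VectorStore, and auto-configuration wires in the provider you chose.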


Project Structure

src
 ├── controller
 ├── service
 ├── config
 ├── documents
 └── resources

Step 1: Add Dependencies

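<!-- Versions are omitted on the assumption that the spring-ai-bom is imported
     in dependencyManagement. These artifact IDs match Spring AI milestone
     releases; in 1.0.0-M7 and later the starters are named
     spring-ai-starter-model-openai and spring-ai-starter-vector-store-pgvector. -->
<dependency>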
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
</dependency>

Step 2: PostgreSQL with PGVector

Docker Compose file:

services:
  postgres:
    image: pgvector/pgvector:pg16
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: ragdb
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password

Start:

docker compose up -d

Enable Vector Extension

CREATE EXTENSION IF NOT EXISTS vector;

Application Properties

spring.datasource.url=jdbc:postgresql://localhost:5432/ragdb
spring.datasource.username=postgres
spring.datasource.password=password

spring.ai.openai.api-key=${OPENAI_API_KEY}

spring.ai.vectorstore.pgvector.initialize-schema=true

Step 3: Configure Chat Client

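import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AiConfig {

    // ChatClient.Builder is auto-configured by the Spring AI starter.
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.build();
    }
}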

Step 4: Embedding Model

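With the OpenAI starter on the classpath, Spring AI auto-configures an EmbeddingModel bean that you can inject:

@Autowired
EmbeddingModel embeddingModel;

This converts text such as:

Spring Boot deployment

into a numeric vector.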


Step 5: Vector Store

@Autowired
VectorStore vectorStore;

This stores:

  • Text
  • Embeddings
  • Metadata

Understanding the Flow

Document
     ↓
Chunking
     ↓
Embedding
     ↓
PGVector

Creating Documents

Document document = new Document("""
Spring Boot applications can be deployed
using Docker containers on AWS ECS.
""");

Storing Documents

vectorStore.add(List.of(document));

This automatically:

  • Generates embeddings.
  • Stores vectors.
  • Saves content.

Similarity Search

List<Document> documents =
        vectorStore.similaritySearch(
                "How do I deploy Spring applications?"
        );

The store embeds the query and returns the stored chunks whose vectors are closest to it, so matches are based on meaning rather than exact keywords.
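"Closest" is usually measured with a metric such as cosine similarity. The following is a minimal plain-Java sketch of that metric, included only to make the idea concrete. The class name CosineSimilarity is hypothetical, and this is not how Spring AI or PGVector implement the search.

```java
// Plain-Java illustration; not Spring AI's or PGVector's implementation.
final class CosineSimilarity {

    // Returns cos(theta) between a and b: dot(a, b) / (|a| * |b|).
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // Convention chosen here for zero vectors.
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] query = {1.0f, 0.0f};
        float[] sameDirection = {2.0f, 0.0f};
        float[] orthogonal = {0.0f, 3.0f};
        System.out.println(cosine(query, sameDirection)); // 1.0
        System.out.println(cosine(query, orthogonal));    // 0.0
    }
}
```

Vectors pointing the same way score 1.0 regardless of length, which is why the metric suits embeddings.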


Question Answering Service

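import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class ChatService {

    private final ChatClient chatClient;

    public ChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    // Sends the question straight to the LLM; no documents are retrieved yet.
    public String ask(String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}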

But This Isn’t RAG Yet

This is merely:

Question
    ↓
LLM

We need:

Question
     ↓
Retrieve Documents
     ↓
Context
     ↓
LLM

QuestionAnswerAdvisor

Spring AI provides:

QuestionAnswerAdvisor

Example:

String answer =
        chatClient.prompt()
                .user(question)
                .advisors(
                    new QuestionAnswerAdvisor(
                        vectorStore))
                .call()
                .content();

This is RAG.


What Happens Internally?

Question
      ↓
Embedding
      ↓
Vector Search
      ↓
Top Documents
      ↓
Prompt Construction
      ↓
LLM
      ↓
Answer
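The flow above can be modelled in a few lines of plain Java. This is a toy sketch of the control flow only: the names RagPipeline and Retriever are hypothetical, the LLM is represented as a simple function from prompt to answer, and none of it is taken from QuestionAnswerAdvisor's source.

```java
import java.util.List;
import java.util.function.Function;

// Toy model of the retrieve-then-generate flow (hypothetical names).
final class RagPipeline {

    // Stands in for "embed the question and run a vector search".
    interface Retriever {
        List<String> retrieve(String question, int topK);
    }

    private final Retriever retriever;
    private final Function<String, String> llm; // prompt in, answer out

    RagPipeline(Retriever retriever, Function<String, String> llm) {
        this.retriever = retriever;
        this.llm = llm;
    }

    String ask(String question) {
        // 1. Retrieve the most relevant chunks for the question.
        List<String> chunks = retriever.retrieve(question, 3);
        // 2. Build a prompt that carries the chunks as context.
        String prompt = "Context:\n" + String.join("\n---\n", chunks)
                + "\n\nQuestion:\n" + question;
        // 3. Let the model answer from that prompt.
        return llm.apply(prompt);
    }
}
```

Passing the retriever and the model in as parameters keeps each step replaceable, which is roughly the role the advisor plays around ChatClient.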

REST Controller

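import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/chat")
public class ChatController {

    private final ChatService service;

    // Constructor injection, consistent with ChatService.
    public ChatController(ChatService service) {
        this.service = service;
    }

    @GetMapping
    public String ask(
            @RequestParam String question) {

        return service.ask(question);
    }
}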

Example Request

GET /chat?question=How%20do%20we%20deploy%20Spring%20Boot%3F

Response:

Spring Boot applications can be deployed
using Docker containers on AWS ECS.
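A question containing spaces or a question mark has to be URL-encoded before it goes into the query string. Here is a small sketch using only the JDK. The class name ChatRequestUri is hypothetical, and the base URL http://localhost:8080 assumes Spring Boot's default port.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ChatRequestUri {

    // Builds the /chat URI with the question encoded as a query parameter.
    static URI build(String baseUrl, String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/chat?question=" + encoded);
    }

    public static void main(String[] args) {
        // Prints http://localhost:8080/chat?question=How+do+we+deploy+Spring+Boot%3F
        System.out.println(build("http://localhost:8080", "How do we deploy Spring Boot?"));
    }
}
```

URLEncoder uses form encoding, so spaces become "+" rather than "%20"; both forms are decoded back to spaces when the query parameter is read.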

Adding Metadata

Map<String,Object> metadata =
        Map.of(
            "team", "platform",
            "year", "2026"
        );

Document document =
        new Document(content, metadata);

Metadata Filtering

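Search with a filter expression over the metadata:

SearchRequest request =
        SearchRequest.builder()
            .query(question)
            .topK(5)
            .filterExpression("team == 'platform'")
            .build();

List<Document> results =
        vectorStore.similaritySearch(request);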
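Conceptually, a metadata filter narrows the set of candidate chunks, and similarity ranking then applies only to what remains. A plain-Java sketch of that idea follows. The names MetadataFilter, Chunk, and whereEquals are hypothetical, and a real vector store evaluates the filter inside the database rather than in application code.

```java
import java.util.List;
import java.util.Map;

final class MetadataFilter {

    // A stored chunk: its text plus arbitrary key/value metadata.
    record Chunk(String text, Map<String, Object> metadata) {}

    // Keeps only chunks whose metadata has the given key set to the given value.
    static List<Chunk> whereEquals(List<Chunk> chunks, String key, Object value) {
        return chunks.stream()
                .filter(c -> value.equals(c.metadata().get(key)))
                .toList();
    }
}
```

For example, whereEquals(chunks, "team", "platform") drops every chunk that was not tagged by the platform team, so an HR policy can never leak into an answer about deployments.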

Understanding topK

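topK = 1 returns only the single most similar chunk.

topK = 5 adds supporting context around it.

topK = 20 returns far more context, at the cost of more prompt tokens and more loosely related chunks.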

Typical values:

  • 3
  • 5
  • 10
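To make the parameter concrete, here is a plain-Java sketch of top-K selection over chunks that have already been scored. The names TopK and Scored are hypothetical; a vector store performs the equivalent step inside its query.

```java
import java.util.Comparator;
import java.util.List;

final class TopK {

    // A candidate chunk with its similarity score (higher is better).
    record Scored(String text, double score) {}

    // Returns the k highest-scoring candidates, best first.
    static List<Scored> select(List<Scored> candidates, int k) {
        return candidates.stream()
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(k)
                .toList();
    }
}
```

Raising k never changes which chunk comes first; it only appends lower-scoring chunks, which is why large values tend to add noise.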

Prompt Engineering

Default:

Answer the question.

Better:

Answer only from the supplied context.

If information is unavailable,
say "I don't know."

Prompt Template

PromptTemplate template =
        new PromptTemplate("""
Answer using only the context.

Context:
{context}

Question:
{question}
""");

Hallucination Reduction

Bad:

Explain deployment.

Good:

Answer only from the retrieved documents.

PDF Loading

Spring AI supports:

  • PDFs
  • Text files
  • Markdown

Example:

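Resource resource = new FileSystemResource("guide.pdf");
// Requires the spring-ai-pdf-document-reader module.
List<Document> pages = new PagePdfDocumentReader(resource).get();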

Text Splitting

TokenTextSplitter splitter =
        new TokenTextSplitter();

List<Document> chunks =
        splitter.apply(List.of(document));

Produces:

Chunk 1

Chunk 2

Chunk 3
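To see what a splitter does, here is a simplified plain-Java sketch. It counts words rather than tokens and adds an overlap between consecutive chunks; both are assumptions made for illustration, and the names WordChunker, chunkSize, and overlap are hypothetical. It does not describe TokenTextSplitter's actual algorithm.

```java
import java.util.ArrayList;
import java.util.List;

final class WordChunker {

    // Splits text into chunks of at most chunkSize words. Consecutive chunks
    // share `overlap` words so text cut at a boundary keeps some context.
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.strip();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", List.of(words).subList(start, end)));
            if (end == words.length) {
                break; // Last chunk reached; avoid emitting a trailing fragment.
            }
        }
        return chunks;
    }
}
```

For instance, split("a b c d e", 3, 1) gives two chunks, "a b c" and "c d e", with the word "c" shared between them.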

Full Document Pipeline

PDF
   ↓
Text Extraction
   ↓
Chunking
   ↓
Embeddings
   ↓
PGVector

Production Architecture

S3
 ↓
Document Service
 ↓
Embedding Service
 ↓
PGVector
 ↓
Spring AI API
 ↓
Users

AWS Version

S3
 ↓
Lambda
 ↓
Titan Embeddings
 ↓
OpenSearch
 ↓
Bedrock
 ↓
Spring Boot

Real Enterprise Use Cases

Architecture Assistant

Upload:

  • HLD
  • LLD

Ask:

Explain the request flow.


API Assistant

Upload:

  • Swagger
  • OpenAPI

Ask:

Which API creates users?


Support Assistant

Upload:

  • Incident reports

Ask:

Which past incidents are similar to this one?


HR Assistant

Upload:

  • Policies

Ask:

How many leave days are allowed?


Common Mistakes

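Chunk Size Too Large

Each chunk mixes several topics, so its embedding is diluted and retrieval becomes imprecise.


Chunk Size Too Small

Chunks lose the surrounding context needed to answer the question.


No Metadata

Results cannot be narrowed by team, date, or document type.


Poor Documents

Garbage in, garbage out: outdated or badly extracted text produces bad answers.


Large topK

Too much loosely related context crowds the prompt and raises token cost.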


Interview Questions

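What is Spring AI?

A Spring project that provides portable abstractions (chat, embeddings, vector stores) for building AI applications.


What is VectorStore?

Spring AI's abstraction for storing documents with their embeddings and running similarity searches over them.


What does QuestionAnswerAdvisor do?

It retrieves documents similar to the user's question from a VectorStore and adds them to the prompt as context before the LLM is called.


Why PGVector?

It stores and searches vectors inside PostgreSQL, so no separate vector database is needed.


What is topK?

The maximum number of documents returned by a similarity search.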


Homework

Build:

Employee Handbook Assistant

Documents:

  • Leave policy.
  • Travel policy.
  • WFH policy.

Endpoints:

POST /documents

GET /chat

Capstone Enhancement

Add:

  • PDF upload
  • Metadata filtering
  • User authentication
  • Chat history
  • Citations

What We Learned

✔ Spring AI integrates AI into Spring Boot.

✔ PGVector stores embeddings.

✔ QuestionAnswerAdvisor enables RAG.

✔ Similarity search retrieves context.

✔ Prompts reduce hallucinations.

✔ Java developers can build AI systems with minimal new concepts.


Coming Next

Part 5 — Prompt Engineering for Enterprise AI

Topics:

  • System prompts
  • Role prompting
  • Few-shot prompting
  • Chain of Thought
  • Structured output
  • JSON generation
  • Guardrails
  • AI response control

Because in AI systems:

The prompt is becoming the new API.


“RAG is where AI stops being a chatbot and starts becoming an enterprise application.”
