Tokens, Embeddings and Vector Databases for Java Developers

“Databases store data. Vector databases store meaning.”

Introduction

In the previous article, we learned:

What an LLM is.
How transformers work.
Tokens and context windows.
Hallucinations.
Training and inference.

At this point, many developers ask:

If LLMs are trained on old data, how can they answer questions about my company’s documents?

The answer lies in one of the most important concepts in modern AI:

Embeddings

Embeddings are the foundation of:

RAG
Semantic search
Document retrieval
AI assistants
Recommendation systems
Knowledge bases

If LLMs are the brain, embeddings are the memory system.

Why Traditional Search Fails

Suppose your document contains:

Spring Boot microservice deployment guide.

A user searches:

How do I deploy Java services?

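Traditional SQL search on the user's words:

SELECT *
FROM documents
WHERE content LIKE '%deploy Java services%'

Problems:

The phrase “deploy Java services” never appears in the document.
“Java” does not appear at all, even though Spring Boot is a Java framework.
Loosening the pattern to '%deploy%' does match “deployment”, but it also matches every other document that mentions deploying anything.

Keyword search matches characters, not meaning.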

Humans search by meaning.
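
To make the gap concrete, here is a minimal sketch of whole-word keyword matching, roughly what a simple index without stemming does. The class and method names are made up for illustration. It compares the user's query with the document from the example and finds no word in common.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class KeywordMatchDemo {

    // Lower-cases the text and splits it into a set of whole words.
    static Set<String> words(String text) {
        Set<String> result = new HashSet<>(
                Arrays.asList(text.toLowerCase().split("[^a-z0-9]+")));
        result.remove(""); // split can yield an empty first element
        return result;
    }

    // Returns the words that appear in both the query and the document.
    static Set<String> sharedWords(String query, String document) {
        Set<String> shared = words(query);
        shared.retainAll(words(document));
        return shared;
    }

    public static void main(String[] args) {
        String document = "Spring Boot microservice deployment guide.";
        String query = "How do I deploy Java services?";
        // "deploy" vs "deployment" and "services" vs "microservice"
        // are different words, so nothing overlaps.
        System.out.println(sharedWords(query, document));
    }
}
```

The document is clearly relevant to the question, yet an exact word match gives the search engine nothing to rank it by.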

Semantic Search

Humans understand:

Car
Vehicle
Automobile

as related concepts.

Computers traditionally do not.

Embeddings solve this problem.

What is an Embedding?

An embedding converts text into numbers.

Example:

Java Developer

becomes:

[0.234, -0.612, 0.912, 0.341, ...]

The numbers themselves are meaningless to humans.

But mathematically:

Similar concepts become close together.

Example

Consider:

Java Developer
Spring Boot Engineer
Backend Architect

These vectors may be close.

Whereas:

Football Player
Cooking Recipe
Mountain Trekking

may be far away.

Java Analogy

Imagine:

class Employee {
    String skill;
}

Traditional search:

employee.skill.equals("Java")

Embedding search:

similarity(employee.skill, "Spring Boot")

The second approach understands meaning.

The Vector Space

Imagine a graph.

Words become coordinates.

Similar meanings cluster together.

Java
Spring
Hibernate
Microservices

appear close.

While:

Cricket
Pizza
Vacation

appear elsewhere.

This is called:

Vector Space

Dimensions

Embeddings often have:

384 dimensions
768 dimensions
1024 dimensions
1536 dimensions
3072 dimensions

Example:

[0.12, 0.45, -0.89, ...]

A 1536-dimensional vector contains 1536 numbers.

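More dimensions let a model capture finer distinctions in meaning, but they also cost more storage and make search slower. Bigger is not automatically better.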
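
The storage cost is easy to estimate. The sketch below assumes each number is stored as a 4-byte float and ignores any index overhead; both are assumptions, and real databases add their own bookkeeping on top. The class name is hypothetical.

```java
class EmbeddingStorage {

    // Raw bytes for `vectorCount` vectors, assuming 4-byte floats and no index overhead.
    static long rawBytes(int dimensions, long vectorCount) {
        return (long) dimensions * Float.BYTES * vectorCount;
    }

    public static void main(String[] args) {
        for (int d : new int[] {384, 768, 1024, 1536, 3072}) {
            System.out.printf("%d dimensions: %d bytes per vector%n", d, rawBytes(d, 1));
        }
        // One million chunks at 1536 dimensions.
        System.out.println(rawBytes(1536, 1_000_000L) + " bytes for one million vectors");
    }
}
```

Under these assumptions a 1536-dimensional vector takes 6,144 bytes, so one million chunks need about 6 GB before any index is built.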

Why Numbers?

Machines perform mathematics.

Suppose:

Java Developer
Spring Engineer

Their vectors:

A = [1, 2, 3]

B = [1.1, 2.1, 3.1]

These are close.

Meanwhile:

Pizza Recipe

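C = [-3, 1, 0.5]

points in a very different direction, so it is far away. Direction matters more than size here: a vector such as [10, 20, 30] is just A scaled by 10, so a direction-based measure would treat it as identical to A.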

Similarity Search

The fundamental question:

How close are two vectors?

This leads to:

Cosine Similarity

Range:

1.0     Very Similar

0.0     Unrelated

-1.0    Opposite
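
Cosine similarity is the dot product of two vectors divided by the product of their lengths. Here is a minimal sketch in plain Java. The class name is made up, and the three-number vectors are toy values rather than real embeddings.

```java
class CosineSimilarity {

    // cos(a, b) = (a . b) / (|a| * |b|)
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3};
        double[] b = {1.1, 2.1, 3.1};     // almost the same direction as a
        double[] unrelated = {3, 0, -1};  // perpendicular to a (dot product is 0)
        double[] opposite = {-1, -2, -3}; // a flipped around

        System.out.println(cosine(a, b));         // close to 1.0
        System.out.println(cosine(a, unrelated)); // 0.0
        System.out.println(cosine(a, opposite));  // close to -1.0
    }
}
```

In practice, scores for real text embeddings tend to sit between 0 and 1. Strongly negative values are uncommon.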

Example:

Document	Similarity
Spring Boot	0.95
Java API	0.91
Hibernate	0.88
Cricket	0.10

The retriever returns the documents with the highest scores.

Why This Matters

Suppose the user asks:

How do I scale Java services?

The document contains:

Spring Boot microservices scaling guide.

Keyword search may fail.

Embedding search is likely to succeed, because the two sentences land close together in vector space.

How Embeddings Are Created

Architecture:

Text
   ↓
Embedding Model
   ↓
Vector

Examples:

Java Developer

becomes:

[0.123, 0.555, -0.873 ...]

Popular Embedding Models

OpenAI embeddings
Amazon Titan embeddings
BGE
E5
Instructor
Sentence Transformers

These models specialize in meaning extraction.

The Document Pipeline

Enterprise AI usually follows:

PDF
Word
Wiki
Confluence
Database
       ↓
Text Extraction
       ↓
Chunking
       ↓
Embedding
       ↓
Vector Database

What is Chunking?

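A 500-page document rarely fits in an LLM's context window, and a single embedding for the whole document would blur all of its topics together.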

Therefore:

Documents are split.

Example:

Page 1 → Chunk 1
Page 2 → Chunk 2
Page 3 → Chunk 3

Each chunk gets an embedding.

Example

Document:

Spring Boot Deployment Guide

Chunk 1:

Docker configuration

Chunk 2:

Kubernetes deployment

Chunk 3:

AWS deployment

Each chunk becomes searchable.

Chunking Strategies

Fixed Size

Every chunk has the same length, for example 500 words.

Sliding Window

Overlap:

Chunk 1: 1–500

Chunk 2: 451–950

The 50-word overlap keeps context that would otherwise be cut at a chunk boundary.
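
A sliding window is only a few lines of Java. This is a minimal sketch that counts whitespace-separated words. Production pipelines often count tokens instead, and the class and parameter names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SlidingWindowChunker {

    // Splits text into chunks of `chunkSize` words; neighbours share `overlap` words.
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Need chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap; // how far the window moves each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tiny demo: windows of 3 words that overlap by 1 word.
        System.out.println(chunk("one two three four five six seven", 3, 1));
    }
}
```

With a chunk size of 500 and an overlap of 50, the window moves 450 words at a time, so the second chunk starts at word 451.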

Semantic Chunking

Split by:

Sections
Headings
Topics

Usually produces better results.
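
As a rough illustration, the sketch below splits a document wherever a new heading starts. It assumes headings are lines that begin with "# ", which is only one possible convention. The sample text in the usage is invented, and real semantic chunkers are usually more sophisticated.

```java
import java.util.ArrayList;
import java.util.List;

class HeadingChunker {

    // Starts a new chunk at every line that begins with "# " (assumed heading marker).
    static List<String> chunkByHeading(String text) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : text.split("\\R")) {
            if (line.startsWith("# ")) {
                flush(current, chunks);
            }
            current.append(line).append('\n');
        }
        flush(current, chunks);
        return chunks;
    }

    // Saves the chunk collected so far (if it has any text) and resets the buffer.
    private static void flush(StringBuilder current, List<String> chunks) {
        String chunk = current.toString().trim();
        if (!chunk.isEmpty()) {
            chunks.add(chunk);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        String guide = "# Docker configuration\nBuild the image.\n"
                + "# Kubernetes deployment\nApply the manifests.\n"
                + "# AWS deployment\nPick a region.";
        chunkByHeading(guide).forEach(c -> System.out.println(c + "\n---"));
    }
}
```

Each chunk now covers exactly one topic, so its embedding describes that topic instead of an arbitrary 500-word slice.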

Metadata

Documents also contain metadata.

Example:

{
  "document": "Architecture.pdf",
  "author": "Rahul",
  "team": "Platform",
  "year": 2026
}

This allows filtering.

Example:

Search only documents owned by the Platform team.
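
Filtering on metadata can be sketched with a record and a stream. The Chunk record and the filter method are hypothetical names. A real vector database applies such filters inside the query instead of in application code.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MetadataFilterDemo {

    // A chunk of text together with its metadata (illustrative type).
    record Chunk(String text, Map<String, Object> metadata) {}

    // Keeps only the chunks whose metadata has the given value under the given key.
    static List<Chunk> filter(List<Chunk> chunks, String key, Object value) {
        return chunks.stream()
                .filter(c -> value.equals(c.metadata().get(key)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(
                new Chunk("Service boundaries", Map.of(
                        "document", "Architecture.pdf", "author", "Rahul",
                        "team", "Platform", "year", 2026)),
                new Chunk("Quarterly budget", Map.of(
                        "document", "Budget.pdf", "team", "Finance", "year", 2026)));

        // Only the Platform team's chunk survives the filter.
        filter(chunks, "team", "Platform").forEach(c -> System.out.println(c.text()));
    }
}
```

Narrowing the candidates first means the similarity search only ranks chunks the user is allowed, or likely, to care about.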

What is a Vector Database?

Traditional databases store rows.

Vector databases store:

Vector + Metadata + Document

Example:

ID	Vector	Document
1	[0.22…]	Spring Boot
2	[0.56…]	Kafka
3	[0.91…]	AWS

Why SQL Databases Struggle

SQL excels at:

WHERE department = 'IT'

But struggles with:

Find documents similar to this sentence.

Vector databases are optimized for:

Nearest neighbors
Similarity search
Semantic retrieval
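
To see what "nearest neighbors" means, here is a brute-force sketch that scores every stored vector against the query and keeps the best k. The Entry record and the two-dimensional vectors are invented for illustration and are not real embeddings. A vector database avoids this full scan by using approximate indexes such as HNSW.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class BruteForceSearch {

    // One stored row: id, vector and the text it came from (illustrative type).
    record Entry(int id, double[] vector, String document) {}

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scans every entry and returns the k most similar, best first.
    static List<Entry> topK(List<Entry> entries, double[] query, int k) {
        return entries.stream()
                .sorted(Comparator.comparingDouble(
                        (Entry e) -> cosine(e.vector(), query)).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> store = List.of(
                new Entry(1, new double[] {0.9, 0.1}, "Spring Boot"),
                new Entry(2, new double[] {0.7, 0.6}, "Kafka"),
                new Entry(3, new double[] {0.1, 0.9}, "AWS"));
        double[] query = {1.0, 0.2}; // toy query vector

        topK(store, query, 2).forEach(e -> System.out.println(e.id() + " " + e.document()));
    }
}
```

The scan is fine for a few thousand vectors. At millions of vectors the per-query cost is what makes a dedicated index worthwhile.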

Popular Vector Databases

PGVector

PostgreSQL extension.

Excellent for enterprise applications.

Chroma

Simple and lightweight.

Great for learning.

FAISS

A similarity-search library built by Meta, not a full database.

Extremely fast.

Pinecone

Managed cloud service.

Weaviate

Enterprise vector platform.

Why PGVector Excites Java Developers

You already know:

PostgreSQL
JDBC
JPA

Example:

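-- <=> is pgvector's cosine distance operator: smaller means more similar.
-- (<-> would order by Euclidean distance instead.)
-- query_vector stands for the embedding of the user's question.
SELECT *
FROM documents
ORDER BY embedding <=> query_vector
LIMIT 5;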

Semantic search using SQL.
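
From Java, the same query can be run with plain JDBC. The sketch below assumes a documents table with a content column and an embedding column of pgvector's vector type. Those names, and the helper class itself, are assumptions. The query vector is sent as pgvector's text form, for example [0.1,0.2,0.3], and cast on the server.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class PgVectorSearch {

    // Formats a float array as pgvector's text representation, e.g. [0.5,-1.0,2.25].
    static String toVectorLiteral(float[] vector) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(vector[i]);
        }
        return sb.append(']').toString();
    }

    // Returns the content of the `limit` rows closest to the query vector by cosine distance.
    static List<String> findSimilar(Connection conn, float[] queryVector, int limit)
            throws SQLException {
        String sql = "SELECT content FROM documents "
                + "ORDER BY embedding <=> CAST(? AS vector) LIMIT ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, toVectorLiteral(queryVector));
            ps.setInt(2, limit);
            try (ResultSet rs = ps.executeQuery()) {
                List<String> results = new ArrayList<>();
                while (rs.next()) {
                    results.add(rs.getString("content"));
                }
                return results;
            }
        }
    }
}
```

Calling findSimilar(connection, queryVector, 5) would return the five closest chunks, provided the query vector has the same number of dimensions as the stored column.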

The Retrieval Process

Suppose the user asks:

How do I deploy Spring Boot on AWS?

Steps:

Question
      ↓
Embedding
      ↓
Vector Search
      ↓
Top Documents
      ↓
LLM
      ↓
Answer

This is the foundation of RAG.
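
The whole flow fits in one small method. In this sketch the Retriever interface stands in for "embed the question and search the vector store", and the LLM is any function from prompt to answer. Both are placeholders rather than a real library API, and the prompt wording is just one reasonable choice.

```java
import java.util.List;
import java.util.function.Function;

class RetrievalFlow {

    // Placeholder for: embed the question, then run a vector search.
    interface Retriever {
        List<String> topChunks(String question, int k);
    }

    // Puts the retrieved chunks in front of the question.
    static String buildPrompt(String question, List<String> chunks) {
        StringBuilder sb = new StringBuilder(
                "Answer the question using only the context below.\n\nContext:\n");
        for (int i = 0; i < chunks.size(); i++) {
            sb.append('[').append(i + 1).append("] ").append(chunks.get(i)).append('\n');
        }
        sb.append("\nQuestion: ").append(question).append('\n');
        return sb.toString();
    }

    // Question -> top chunks -> prompt -> LLM -> answer.
    static String answer(String question, Retriever retriever, Function<String, String> llm) {
        List<String> chunks = retriever.topChunks(question, 3);
        return llm.apply(buildPrompt(question, chunks));
    }

    public static void main(String[] args) {
        Retriever fakeRetriever = (q, k) -> List.of("AWS deployment", "Docker configuration");
        Function<String, String> echoLlm = prompt -> prompt; // stand-in: returns the prompt itself
        System.out.println(answer("How do I deploy Spring Boot on AWS?", fakeRetriever, echoLlm));
    }
}
```

The LLM never searches anything itself. It only sees the chunks that retrieval placed in its prompt, which is why retrieval quality matters so much.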

Java Example

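// embeddingModel and vectorStore are illustrative placeholders, not a specific library's API.
String query = "Deploy Spring Boot on AWS";

// 1. Turn the question into a vector.
float[] queryVector = embeddingModel.embed(query);

// 2. Fetch the five chunks whose vectors are closest to it.
List<Document> docs =
        vectorStore.similaritySearch(queryVector, 5);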

Simple.

Powerful.

Spring AI Example

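// Spring AI 1.0 builder style; early milestone releases used SearchRequest.query(question).
List<Document> documents =
        vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(5).build()
        );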

Spring AI abstracts the complexity.

Why Vector Search is Revolutionary

Traditional systems answer:

What exactly matches?

AI systems answer:

What means something similar?

That is a fundamental shift.

Enterprise Applications

Knowledge Assistant

Ask:

What is our leave policy?

Architecture Assistant

Ask:

Explain this design.

Production Support Bot

Ask:

Similar incidents in the past?

API Assistant

Ask:

Which endpoint creates users?

Challenges

Embeddings are not perfect.

Problems:

Poor chunking
Wrong embedding models
Missing metadata
Low-quality documents

Good RAG systems depend heavily on retrieval quality.

Interview Questions

1. What is an embedding?

A numerical representation of text.

2. Why are embeddings needed?

To perform semantic search.

3. What is cosine similarity?

A measure of vector similarity.

4. Why do vector databases exist?

To efficiently search similar vectors.

5. What is chunking?

Splitting documents into smaller sections.

Hands-On Exercise

Take 20 architecture documents.

Try:

Extract text.
Split into chunks.
Generate embeddings.
Store in PGVector.
Search semantically.

This single exercise will teach more than hours of theory.

Key Takeaways

✔ LLMs process text as tokens.

✔ Embeddings capture meaning as vectors.

✔ Similar concepts produce similar vectors.

✔ Vector databases store semantic information.

✔ Chunking is critical.

✔ Similarity search powers modern AI.

✔ Embeddings are the foundation of RAG.

Coming Next

Part 3 — Retrieval Augmented Generation (RAG)

We will finally build:

Document search systems.
Company knowledge assistants.
PDF chatbots.
Architecture assistants.

Topics:

RAG architecture.
Retrieval pipelines.
Context assembly.
Re-ranking.
Citations.
Hallucination reduction.
Enterprise RAG patterns.

Because once we understand RAG, we stop asking:

What does AI know?

and begin asking:

What knowledge should AI use?

“The database revolution taught us how to store data. The vector revolution teaches us how to store meaning.”
