Tokens, Embeddings and Vector Databases for Java Developers
“Databases store data. Vector databases store meaning.”
Introduction
In the previous article, we learned:
- What an LLM is.
- How transformers work.
- Tokens and context windows.
- Hallucinations.
- Training and inference.
At this point, many developers ask:
If LLMs are trained on old data, how can they answer questions about my company’s documents?
The answer lies in one of the most important concepts in modern AI:
Embeddings
Embeddings are the foundation of:
- RAG
- Semantic search
- Document retrieval
- AI assistants
- Recommendation systems
- Knowledge bases
If LLMs are the brain, embeddings are the memory system.
Why Traditional Search Fails
Suppose your document contains:
Spring Boot microservice deployment guide.
A user searches:
How do I deploy Java services?
Traditional SQL search:
SELECT *
FROM documents
WHERE content LIKE '%deploy Java services%'
Problems:
- “services” ≠ “microservice”
- “deploy” ≠ “deployment”
- “Java services” ≠ “Spring Boot”
The searched phrase never appears in the document, so the query returns nothing.
Keyword search depends on literal text matches.
Humans search by meaning.
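The same failure can be reproduced in a few lines of Java. This is a minimal sketch, assuming a naive matcher that requires every query word to appear literally in the document; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Locale;

class KeywordSearchDemo {

    // Returns true only if every word of the query appears literally in the document.
    static boolean matchesAllKeywords(String document, String query) {
        String doc = document.toLowerCase(Locale.ROOT);
        return Arrays.stream(query.toLowerCase(Locale.ROOT).split("\\W+"))
                .filter(word -> !word.isEmpty())
                .allMatch(doc::contains);
    }

    public static void main(String[] args) {
        String document = "Spring Boot microservice deployment guide.";
        // "java" and "services" do not occur in the document, so it is missed.
        System.out.println(matchesAllKeywords(document, "deploy Java services")); // false
        // A query that reuses the document's own wording does match.
        System.out.println(matchesAllKeywords(document, "deployment guide"));     // true
    }
}
```

The document is clearly relevant to the first query, yet the matcher rejects it because the wording differs.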
Semantic Search
Humans understand:
Car
Vehicle
Automobile
as related concepts.
Computers traditionally do not.
Embeddings solve this problem.
What is an Embedding?
An embedding is a list of numbers (a vector) that represents a piece of text.
Example:
Java Developer
becomes:
[0.234, -0.612, 0.912, 0.341, ...]
The numbers themselves are meaningless to humans.
But mathematically:
Similar concepts become close together.
Example
Consider:
Java Developer
Spring Boot Engineer
Backend Architect
These vectors may be close.
Whereas:
Football Player
Cooking Recipe
Mountain Trekking
may be far away.
Java Analogy
Imagine:
class Employee {
String skill;
}
Traditional search:
employee.skill.equals("Java")
Embedding search:
similarity(employee.skill, "Spring Boot")
The second approach compares meaning instead of characters.
The Vector Space
Imagine a graph.
Words become coordinates.
Similar meanings cluster together.
Java
Spring
Hibernate
Microservices
appear close.
While:
Cricket
Pizza
Vacation
appear elsewhere.
This is called:
Vector Space
Dimensions
Embeddings often have:
- 384 dimensions
- 768 dimensions
- 1024 dimensions
- 1536 dimensions
- 3072 dimensions
Example:
[0.12, 0.45, -0.89, ...]
A 1536-dimensional vector contains 1536 numbers.
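More dimensions can capture finer distinctions in meaning, but every extra dimension adds storage and search cost, so bigger is not automatically better.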
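Dimension count also drives storage. The sketch below estimates the raw size of one million vectors at each of the sizes listed above, assuming 4-byte floats and ignoring any index overhead. The corpus size and class name are hypothetical.

```java
class EmbeddingStorageEstimate {

    // Bytes needed to store `count` vectors of `dimensions` 32-bit floats (no index overhead).
    static long rawBytes(long count, int dimensions) {
        return count * dimensions * Float.BYTES;
    }

    public static void main(String[] args) {
        int[] sizes = {384, 768, 1024, 1536, 3072};
        long chunks = 1_000_000L; // hypothetical corpus size
        for (int d : sizes) {
            double gib = rawBytes(chunks, d) / (1024.0 * 1024.0 * 1024.0);
            System.out.printf("%4d dims -> %.2f GiB for %,d vectors%n", d, gib, chunks);
        }
    }
}
```

A single 1536-dimensional vector takes 1536 × 4 = 6144 bytes, so the choice of model has a direct effect on database size.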
Why Numbers?
Machines perform mathematics.
Suppose:
Java Developer
Spring Engineer
Their vectors:
A = [1, 2, 3]
B = [1.1, 2.1, 3.1]
These are close.
Meanwhile:
Pizza Recipe
C = [3, -2, 0.5]
points in a different direction and is far away.
Similarity Search
The fundamental question:
How close are two vectors?
This leads to:
Cosine Similarity
Cosine Similarity
Range:
1.0 Very Similar
0.0 Unrelated
-1.0 Opposite
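Cosine similarity is the dot product of two vectors divided by the product of their lengths. Here is a minimal Java sketch of that formula; the 2-dimensional vectors in `main` are hypothetical and chosen so the three reference values can be checked by hand.

```java
class CosineSimilarityDemo {

    // cos(a, b) = (a . b) / (|a| * |b|)
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] east = {1, 0};
        System.out.println(cosine(east, new double[] {2, 0}));  // same direction: 1.0
        System.out.println(cosine(east, new double[] {0, 3}));  // perpendicular: 0.0
        System.out.println(cosine(east, new double[] {-4, 0})); // opposite direction: -1.0
    }
}
```

The score depends only on the direction of the vectors, not on their length.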
Example:
| Document | Similarity to query |
|---|---|
| Spring Boot | 0.95 |
| Java API | 0.91 |
| Hibernate | 0.88 |
| Cricket | 0.10 |
The retriever returns the documents with the highest scores.
Why This Matters
Suppose the user asks:
How do I scale Java services?
The document contains:
Spring Boot microservices scaling guide.
Keyword search may fail.
Embedding search succeeds.
How Embeddings Are Created
Architecture:
Text
↓
Embedding Model
↓
Vector
Example:
Java Developer
becomes:
[0.123, 0.555, -0.873, ...]
Popular Embedding Models
- OpenAI embeddings
- Amazon Titan embeddings
- BGE
- E5
- Instructor
- Sentence Transformers
These models specialize in meaning extraction.
The Document Pipeline
Enterprise AI usually follows:
PDF
Word
Wiki
Confluence
Database
↓
Text Extraction
↓
Chunking
↓
Embedding
↓
Vector Database
What is Chunking?
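An LLM's context window is limited, and an embedding model accepts only a limited amount of text per call, so a 500-page document cannot be processed in one piece.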
Therefore:
Documents are split.
Example:
Page 1 → Chunk 1
Page 2 → Chunk 2
Page 3 → Chunk 3
Each chunk gets an embedding.
Example
Document:
Spring Boot Deployment Guide
Chunk 1:
Docker configuration
Chunk 2:
Kubernetes deployment
Chunk 3:
AWS deployment
Each chunk becomes searchable.
Chunking Strategies
Fixed Size
Every chunk has the same length, for example 500 words.
Sliding Window
Consecutive chunks overlap, here by 50 words:
Chunk 1: 1–500
Chunk 2: 451–950
Maintains context.
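A sliding window is straightforward to implement. The following is a minimal sketch, assuming chunks are measured in whitespace-separated words; the class name and parameters are illustrative, and real pipelines often count tokens instead of words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SlidingWindowChunker {

    // Splits text into chunks of `size` words; consecutive chunks share `overlap` words.
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("Require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = size - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + size, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "one two three four five six seven eight nine ten";
        // Windows of 4 words with 1 word of overlap.
        chunk(text, 4, 1).forEach(System.out::println);
    }
}
```

With a size of 500 and an overlap of 50, each new chunk starts 450 words after the previous one.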
Semantic Chunking
Split by:
- Sections
- Headings
- Topics
Usually produces better results.
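One simple way to approximate semantic chunking is to split at headings. The sketch below assumes the extracted text uses Markdown-style headings (lines starting with `#`); that convention and the class name are assumptions, and other document formats would need a different boundary rule.

```java
import java.util.ArrayList;
import java.util.List;

class HeadingChunker {

    // Starts a new chunk at every line that begins with '#'.
    static List<String> splitByHeading(String text) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : text.split("\\R")) {
            if (line.startsWith("#")) {
                flush(current, chunks);
            }
            current.append(line).append('\n');
        }
        flush(current, chunks);
        return chunks;
    }

    // Adds the accumulated text as a chunk (if it is not blank) and resets the buffer.
    private static void flush(StringBuilder current, List<String> chunks) {
        String chunk = current.toString().trim();
        if (!chunk.isEmpty()) {
            chunks.add(chunk);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        String guide = String.join("\n",
                "# Docker configuration", "Build the image.",
                "# Kubernetes deployment", "Apply the manifest.");
        splitByHeading(guide).forEach(c -> System.out.println(c + "\n---"));
    }
}
```

Each chunk keeps its heading, which gives the embedding model useful context about the section's topic.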
Metadata
Documents also contain metadata.
Example:
{
"document": "Architecture.pdf",
"author": "Rahul",
"team": "Platform",
"year": 2026
}
This allows filtering.
Example:
Search only documents owned by the Platform team.
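Metadata filtering can be pictured as a plain predicate applied to each chunk. This is a minimal in-memory sketch, assuming metadata is kept as a key-value map; the `Chunk` record and the sample data are hypothetical, and the code needs Java 16 or later because it uses records.

```java
import java.util.List;
import java.util.Map;

class MetadataFilterDemo {

    // A chunk of text plus the metadata attached to its source document.
    record Chunk(String text, Map<String, Object> metadata) {}

    // Keeps only chunks whose metadata has the given key set to the given value.
    static List<Chunk> filter(List<Chunk> chunks, String key, Object value) {
        return chunks.stream()
                .filter(c -> value.equals(c.metadata().get(key)))
                .toList();
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(
                new Chunk("Service mesh rollout plan", Map.of("team", "Platform", "year", 2026)),
                new Chunk("Quarterly budget summary", Map.of("team", "Finance", "year", 2025)));
        // Narrow the candidate set before running the similarity search.
        filter(chunks, "team", "Platform").forEach(c -> System.out.println(c.text()));
    }
}
```

Vector databases typically apply such filters inside the query itself, so that similarity is only computed over the matching subset.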
What is a Vector Database?
Traditional databases store rows.
Vector databases store:
Vector + Metadata + Document
Example:
| ID | Vector | Document |
|---|---|---|
| 1 | [0.22…] | Spring Boot |
| 2 | [0.56…] | Kafka |
| 3 | [0.91…] | AWS |
Why SQL Databases Struggle
SQL excels at:
WHERE department = 'IT'
But a plain relational engine, without a vector type or index, struggles with:
Find documents similar to this sentence.
Vector databases are optimized for:
- Nearest neighbors
- Similarity search
- Semantic retrieval
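To see what a nearest-neighbor query does, here is a brute-force version in plain Java. It is a sketch, not how a vector database is implemented: it scores every stored row against the query, whereas real systems use approximate indexes such as HNSW to avoid a full scan. The `Entry` record and the 3-dimensional vectors are hypothetical.

```java
import java.util.Comparator;
import java.util.List;

class BruteForceVectorSearch {

    // One stored row: an id, its embedding and the original text.
    record Entry(int id, double[] vector, String document) {}

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Scores every entry against the query and returns the k most similar ones.
    static List<Entry> nearest(List<Entry> entries, double[] query, int k) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> cosine(e.vector(), query)).reversed())
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        List<Entry> rows = List.of(
                new Entry(1, new double[] {0.9, 0.1, 0.0}, "Spring Boot"),
                new Entry(2, new double[] {0.1, 0.9, 0.0}, "Kafka"),
                new Entry(3, new double[] {0.0, 0.1, 0.9}, "AWS"));
        double[] query = {0.8, 0.2, 0.1}; // toy query vector
        nearest(rows, query, 2).forEach(e -> System.out.println(e.document()));
    }
}
```

The full scan is fine for a few thousand vectors; beyond that, the indexing a vector database provides starts to matter.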
Popular Vector Databases
PGVector
PostgreSQL extension.
A good fit when your application data already lives in PostgreSQL.
Chroma
Simple and lightweight.
Great for learning.
FAISS
An open-source similarity-search library from Meta rather than a full database.
Extremely fast.
Pinecone
Managed cloud service.
Weaviate
Enterprise vector platform.
Why PGVector Excites Java Developers
You already know:
- PostgreSQL
- JDBC
- JPA
Example:
SELECT *
FROM documents
ORDER BY embedding <-> query_vector -- <-> is L2 distance; use <=> for cosine distance
LIMIT 5;
Semantic search using SQL.
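From Java, the same query can be issued through plain JDBC. The sketch below assumes a table `documents(content text, embedding vector(n))` and passes the query vector as pgvector's text form, cast with `::vector`; the table, column, and class names are illustrative. It orders by `<=>`, pgvector's cosine distance operator, to match the cosine similarity discussed earlier (`<->` would order by Euclidean distance).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class PgVectorQuery {

    // pgvector accepts a vector written as text, for example [0.1,0.2,0.3].
    static String toVectorLiteral(float[] vector) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < vector.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(vector[i]);
        }
        return sb.append(']').toString();
    }

    // Returns the content of the `limit` rows closest to the query vector.
    static List<String> topMatches(Connection conn, float[] queryVector, int limit) throws SQLException {
        String sql = "SELECT content FROM documents ORDER BY embedding <=> ?::vector LIMIT ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, toVectorLiteral(queryVector));
            ps.setInt(2, limit);
            List<String> results = new ArrayList<>();
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    results.add(rs.getString("content"));
                }
            }
            return results;
        }
    }
}
```

Running `topMatches` requires a PostgreSQL connection with the pgvector extension installed; the query vector must come from the same embedding model that produced the stored vectors.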
The Retrieval Process
Suppose the user asks:
How do I deploy Spring Boot on AWS?
Steps:
Question
↓
Embedding
↓
Vector Search
↓
Top Documents
↓
LLM
↓
Answer
This is the foundation of RAG.
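The first half of this flow, up to the point where the LLM is called, can be sketched as a single method. The `Embedder` and `Retriever` interfaces are hypothetical stand-ins for an embedding model and a vector store, and the prompt wording is only an example.

```java
import java.util.List;

class RetrievalPipelineSketch {

    // Hypothetical stand-ins for an embedding model and a vector store.
    interface Embedder { float[] embed(String text); }
    interface Retriever { List<String> topChunks(float[] queryVector, int k); }

    // Embeds the question, fetches the closest chunks and assembles the prompt for the LLM.
    static String buildPrompt(String question, Embedder embedder, Retriever retriever, int k) {
        float[] queryVector = embedder.embed(question);
        List<String> chunks = retriever.topChunks(queryVector, k);
        StringBuilder prompt = new StringBuilder("Answer using only the context below.\n\nContext:\n");
        for (String chunk : chunks) {
            prompt.append("- ").append(chunk).append('\n');
        }
        return prompt.append("\nQuestion: ").append(question).toString();
    }

    public static void main(String[] args) {
        // Fake implementations so the sketch runs without any external service.
        Embedder fakeEmbedder = text -> new float[] {1f, 0f};
        Retriever fakeRetriever = (vector, k) -> List.of("chunk A", "chunk B");
        System.out.println(buildPrompt("How do I deploy Spring Boot on AWS?", fakeEmbedder, fakeRetriever, 2));
    }
}
```

The resulting prompt is what gets sent to the LLM, which is how the model can answer from documents it was never trained on.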
Java Example
String query = "Deploy Spring Boot on AWS";
// embeddingModel and vectorStore are placeholders for your embedding client and vector store
float[] vector = embeddingModel.embed(query);
List<Document> docs =
vectorStore.similaritySearch(vector);
Simple.
Powerful.
Spring AI Example
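// Builder API from Spring AI 1.0; earlier milestones used SearchRequest.query(question)
List<Document> documents =
vectorStore.similaritySearch(
SearchRequest.builder().query(question).topK(5).build()
);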
Spring AI abstracts the complexity.
Why Vector Search is Revolutionary
Traditional systems answer:
What exactly matches?
AI systems answer:
What means something similar?
That is a fundamental shift.
Enterprise Applications
Knowledge Assistant
Ask:
What is our leave policy?
Architecture Assistant
Ask:
Explain this design.
Production Support Bot
Ask:
Similar incidents in the past?
API Assistant
Ask:
Which endpoint creates users?
Challenges
Embeddings are not perfect.
Problems:
- Poor chunking
- Wrong embedding models
- Missing metadata
- Low-quality documents
Good RAG systems depend heavily on retrieval quality.
Interview Questions
1. What is an embedding?
A numerical representation of text.
2. Why are embeddings needed?
To perform semantic search.
3. What is cosine similarity?
A measure of vector similarity.
4. Why do vector databases exist?
To efficiently search similar vectors.
5. What is chunking?
Splitting documents into smaller sections.
Hands-On Exercise
Take 20 architecture documents.
Try:
- Extract text.
- Split into chunks.
- Generate embeddings.
- Store in PGVector.
- Search semantically.
This single exercise will teach more than hours of theory.
Key Takeaways
✔ LLMs process text as tokens.
✔ Embeddings represent meaning as vectors.
✔ Similar concepts produce similar vectors.
✔ Vector databases store semantic information.
✔ Chunking is critical.
✔ Similarity search powers modern AI.
✔ Embeddings are the foundation of RAG.
Coming Next
Part 3 — Retrieval Augmented Generation (RAG)
We will finally build:
- Document search systems.
- Company knowledge assistants.
- PDF chatbots.
- Architecture assistants.
Topics:
- RAG architecture.
- Retrieval pipelines.
- Context assembly.
- Re-ranking.
- Citations.
- Hallucination reduction.
- Enterprise RAG patterns.
Because once we understand RAG, we stop asking:
What does AI know?
and begin asking:
What knowledge should AI use?
“The database revolution taught us how to store data. The vector revolution teaches us how to store meaning.”