NOVA RAG Agent & Docker Model Runner
Yesterday I was telling you about the “NOVA Chat Agent” (Ref: N.O.V.A. Chat Agent & Docker Model Runner).
When working with local SLMs, or even TLMs (Tiny Language Models - I’m claiming authorship of this term), you can find yourself limited by the model’s knowledge (small model, small knowledge). That’s where RAG Agents (Retrieval-Augmented Generation) come into play.
In the NOVA library, I’ve implemented a RAG Agent that can be used to perform similarity searches in a document database, and then you can use those similarities by passing them to the “Chat Agent” to generate a response “augmented” by the retrieved documents.
Currently, the NOVA RAG Agent supports two modes: it can work in memory and persist its vector store to a JSON file, or it can use an “external” vector store backed by a Redis database and its vector capabilities. The first mode (JSON) is handy for testing or when you have few documents, while the second (Redis) is better suited to large document sets and makes it easy to add documents over time.
Let’s see how to use the RAG Agent with the “in memory” (JSON) mode with a code example.
RAG Agent and JSON store
Here’s a code example of using the RAG Agent:
You’ll need the ai/mxbai-embed-large:latest model for this code, which you can run locally with Docker Model Runner:
docker model pull ai/mxbai-embed-large:latest
✋ You can of course use another embeddings model, but you’ll need to test it to verify if it fits your use case, your way of “chunking” your documents, and your similarity threshold. The mxbai-embed-large model is a good starting point for most use cases.
Initialize your Go project and install the library:
go mod init rag-demo
go get github.com/snipwise/nova@latest
touch main.go
Create a main.go file and copy the following code into it:
main.go:
package main
import (
"context"
"fmt"
"strings"
"github.com/snipwise/nova/nova-sdk/agents"
"github.com/snipwise/nova/nova-sdk/agents/rag"
"github.com/snipwise/nova/nova-sdk/models"
)
func main() {
ctx := context.Background()
storePathFile := "./store/animals.json"
// Initial documents to load
txtChunks := []string{
"Squirrels run in the forest",
"Birds fly in the sky",
"Frogs swim in the pond",
"Fishes swim in the sea",
"Lions roar in the savannah",
"Eagles soar above the mountains",
"Dolphins leap out of the ocean",
"Bears fish in the river",
"Tigers prowl in the jungle",
"Whales sing in the ocean",
"Owls hoot at night",
"Monkeys swing in the trees",
"Butterflies flutter in the garden",
"Bees buzz around flowers",
}
agent, err := rag.NewAgent(
ctx,
agents.Config{
EngineURL: "http://localhost:12434/engines/llama.cpp/v1",
},
models.Config{
Name: "ai/mxbai-embed-large:latest",
},
rag.WithJsonStore(storePathFile),
// DocumentLoadModeSkipDuplicates:
// will skip loading documents that are already in the store (based on content hash)
rag.WithDocuments(txtChunks, rag.DocumentLoadModeSkipDuplicates),
)
if err != nil {
panic(err)
}
fmt.Println("✅ RAG Agent created with JSON store and initial documents")
// Note: with WithJsonStore + WithDocuments, the store is persisted automatically
// when new documents are added. With DocumentLoadModeSkipDuplicates, documents are
// only added if they are not already in the store.
fmt.Printf("📁 Store file location: %s\n", storePathFile)
fmt.Println(strings.Repeat("=", 60))
// Test similarity search
queries := []string{
"What animals live in water?",
"Which creatures can fly?",
"Animals in the forest",
"What animals are in the jungle?",
"Who sings in the ocean?",
"Which animals are active at night?",
"Which animals are found in the trees?",
"Which animals buzz around flowers?",
"Which animals are big cats?",
}
for _, query := range queries {
fmt.Printf("\n🔍 Query: %s\n", query)
fmt.Println(strings.Repeat("-", 60))
similarities, err := agent.SearchSimilar(query, 0.6)
if err != nil {
fmt.Printf("❌ Error searching: %v\n", err)
continue
}
if len(similarities) == 0 {
fmt.Println("No similar documents found (threshold: 0.6)")
} else {
for i, sim := range similarities {
fmt.Printf("%d. [%.3f] %s\n", i+1, sim.Similarity, sim.Prompt)
}
}
}
}
Now, you can run your program:
go mod tidy
go run main.go
The JSON store will be created automatically at ./store/animals.json and the initial documents will be loaded. You’ll then see the similarity search results for each query. You should get output like this:
✅ RAG Agent created with JSON store and initial documents
📁 Store file location: ./store/animals.json
============================================================
🔍 Query: What animals live in water?
------------------------------------------------------------
1. [0.643] Fishes swim in the sea
🔍 Query: Which creatures can fly?
------------------------------------------------------------
1. [0.708] Birds fly in the sky
🔍 Query: Animals in the forest
------------------------------------------------------------
1. [0.754] Squirrels run in the forest
2. [0.689] Tigers prowl in the jungle
3. [0.653] Monkeys swing in the trees
🔍 Query: What animals are in the jungle?
------------------------------------------------------------
1. [0.737] Tigers prowl in the jungle
🔍 Query: Who sings in the ocean?
------------------------------------------------------------
1. [0.639] Whales sing in the ocean
🔍 Query: Which animals are active at night?
------------------------------------------------------------
1. [0.616] Owls hoot at night
🔍 Query: Which animals are found in the trees?
------------------------------------------------------------
1. [0.666] Squirrels run in the forest
2. [0.648] Monkeys swing in the trees
3. [0.619] Tigers prowl in the jungle
🔍 Query: Which animals buzz around flowers?
------------------------------------------------------------
1. [0.809] Bees buzz around flowers
2. [0.649] Butterflies flutter in the garden
🔍 Query: Which animals are big cats?
------------------------------------------------------------
No similar documents found (threshold: 0.6)
============================================================
The program displays similar documents for each query, along with their similarity score. You can adjust the similarity threshold (0.6 in this example) to get more or fewer results depending on your needs.
You can see that the RAG Agent is fairly straightforward to use. In the next section, I’ll show you a version with the “Redis Store”.
You can find the complete code for this RAG Agent in the samples folder of the NOVA library, in the file main.go, along with some documentation: rag-agent-guide-en.md (work in progress).
RAG Agent and Redis store
Now let’s look at an example with the Redis store. First, you’ll need to have a Redis instance running locally or remotely. You can use Docker Compose for that:
compose.yml:
services:
redis-server:
image: redis:8.2.3-alpine3.22
container_name: nova-redis-vector-store
ports:
- "6379:6379"
volumes:
- ./data:/data
environment:
# Redis persistence: save every 30s if at least 1 key changed
- REDIS_ARGS=--save 30 1
restart: unless-stopped
✋ Note: Redis 8.2.3 includes RediSearch natively for vector similarity search, so there’s no need for redis-stack: standard Redis 8.x has built-in vector support!
Start your Redis instance:
docker compose up -d
Now let’s get to the Go code for using the RAG Agent with Redis as the vector store. Here’s a code example:
main.go:
package main
import (
"context"
"fmt"
"github.com/snipwise/nova/nova-sdk/agents"
"github.com/snipwise/nova/nova-sdk/agents/rag"
"github.com/snipwise/nova/nova-sdk/agents/rag/stores"
"github.com/snipwise/nova/nova-sdk/models"
)
func main() {
// This example demonstrates DocumentLoadModeSkipDuplicates
// Run this program multiple times - it will NOT create duplicates!
ctx := context.Background()
// Configuration
engineURL := "http://localhost:12434/engines/llama.cpp/v1"
embeddingModel := "ai/mxbai-embed-large:latest"
// documents
documents := []string{
"Squirrels run in the forest and collect acorns for winter",
"Birds fly in the sky and migrate south during winter",
"Frogs swim in the pond and catch insects with their tongues",
"Bears hibernate in caves during the cold winter months",
"Rabbits hop through meadows and live in underground burrows",
}
// Create RAG agent with Redis and DocumentLoadModeSkipDuplicates
ragAgent, err := rag.NewAgent(
ctx,
agents.Config{
Name: "SkipDuplicatesDemo",
EngineURL: engineURL,
},
models.Config{
Name: embeddingModel,
},
rag.WithRedisStore(stores.RedisConfig{
Address: "localhost:6379",
Password: "",
DB: 0,
IndexName: "skip_duplicates_demo",
}, 1024),
rag.WithDocuments(documents, rag.DocumentLoadModeSkipDuplicates),
)
if err != nil {
fmt.Printf("❌ Failed to create RAG agent: %v\n", err)
return
}
fmt.Println("✅ RAG Agent created with Redis store and initial documents")
fmt.Println()
fmt.Println("Testing Similarity Search...")
// Test similarity search
query := "What do animals do in winter?"
fmt.Printf("🔍 Query: %s\n", query)
fmt.Println()
results, err := ragAgent.SearchTopN(query, 0.3, 3)
if err != nil {
fmt.Printf("❌ Failed to search: %v\n", err)
return
}
fmt.Printf("📊 Top %d results (similarity > 0.3):\n", len(results))
for i, result := range results {
fmt.Printf("%d. [%.3f] %s\n", i+1, result.Similarity, result.Prompt)
}
fmt.Println()
}
Run the program:
go run main.go
You should get output like this:
✅ RAG Agent created with Redis store and initial documents
Testing Similarity Search...
🔍 Query: What do animals do in winter?
📊 Top 3 results (similarity > 0.3):
1. [0.663] Birds fly in the sky and migrate south during winter
2. [0.646] Squirrels run in the forest and collect acorns for winter
3. [0.619] Bears hibernate in caves during the cold winter months
You can find the complete code for this RAG Agent in the samples folder of the NOVA library, in the file main.go.
A sample project using the Chat Agent and the RAG Agent
I’ve created a small CLI that uses the Chat Agent “augmented” with the RAG Agent to generate content from the terminal based on an XML knowledge base. You can find it on Codeberg: chat-rag-cli. I use it, for example, like this:
./chat-rag-cli \
prompt "How to define a struct in Golang? Please use your knowledge to answer." \
--instructions context.md \
--output result.md
To summarize the code for using both agents together, here are the main steps:
- Search for similarities with the RAG Agent using the user’s question as the search query
- Build a message list for the Chat Agent, including the similar documents in the instruction prompt
- Call the Chat Agent with the constructed message list
Here’s a code snippet that illustrates the similarity search with the RAG Agent and the prompt construction for the Chat Agent:
userQuestion := "How to define a struct in Golang? Please use your knowledge to answer."
var knowledgeBase string
// Search for similar documents in the RAG Agent
// We use a similarity threshold of 0.5 and want to retrieve the top 3 most similar documents
similarities, err := ragAgent.SearchTopN(userQuestion, 0.5, 3)
if err != nil {
fmt.Printf("❌ Error searching: %v\n", err)
}
if len(similarities) == 0 {
fmt.Println("No similar documents found (threshold: 0.5)")
} else {
fmt.Printf("📚 Found %d similar documents:\n", len(similarities))
for i, sim := range similarities {
fmt.Printf("📗 %d. [%.3f] %s\n", i+1, sim.Similarity, sim.Prompt)
knowledgeBase += sim.Prompt + "\n"
}
knowledgeBase = fmt.Sprintf(
"Here is some knowledge that might be useful for answering the user's question:\n%s",
knowledgeBase,
)
}
messagesList := []messages.Message{}
if knowledgeBase != "" {
messagesList = append(messagesList, messages.Message{
Role: roles.System,
Content: knowledgeBase,
})
}
messagesList = append(messagesList, messages.Message{
Role: roles.User,
Content: userQuestion,
})
That’s all for today. Next time, we’ll see how to use the Crew Agent to orchestrate multiple Chat Agents, coupled with a RAG Agent, while being mindful of the conversational memory context size. Stay tuned!
✋ Note: NOVA provides a few helpers for “chunking” text, markdown and XML documents: nova-sdk/agents/rag/chunks. I plan to add more in the future.