RAGLib Documentation

Welcome to RAGLib, a comprehensive library of Retrieval-Augmented Generation (RAG) techniques with a unified RAGTechnique.apply() API for research and production environments.

What is RAGLib? 🤔

RAGLib provides a modular, extensible framework for implementing and experimenting with different RAG techniques. Each technique follows a consistent interface, making it easy to:

Compare different RAG approaches on your data
Compose techniques into complex pipelines
Extend the library with your own custom techniques
Scale from prototyping to production

Key Features ✨

Unified Interface

All techniques implement the same apply() method, ensuring consistency across the library.
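
To make the idea of a shared entry point concrete, here is a minimal sketch that does not import RAGLib. The names `SketchTechnique`, `SketchResult`, `SentenceSplitter`, and `KeywordFilter` are hypothetical stand-ins invented for illustration; the only detail borrowed from this page is that a result exposes a `payload` dictionary. The real base class and return type may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchResult:
    """Hypothetical result object; RAGLib's real return type may differ."""
    success: bool
    payload: Dict[str, Any] = field(default_factory=dict)


class SketchTechnique(ABC):
    """Hypothetical base class: every technique exposes one apply() entry point."""

    @abstractmethod
    def apply(self, *args: Any, **kwargs: Any) -> SketchResult:
        """Run the technique and wrap its output in a SketchResult."""


class SentenceSplitter(SketchTechnique):
    """Toy chunker: splits text on periods."""

    def apply(self, text: str) -> SketchResult:
        chunks = [part.strip() for part in text.split(".") if part.strip()]
        return SketchResult(success=True, payload={"chunks": chunks})


class KeywordFilter(SketchTechnique):
    """Toy retriever: keeps chunks that contain the query term."""

    def apply(self, chunks: List[str], query: str) -> SketchResult:
        hits = [c for c in chunks if query.lower() in c.lower()]
        return SketchResult(success=True, payload={"hits": hits})


# Both techniques are driven the same way: call apply(), read the payload.
split = SentenceSplitter().apply("RAG retrieves context. Then it generates an answer.")
found = KeywordFilter().apply(split.payload["chunks"], query="context")
print(found.payload["hits"])
```

Because every step returns the same kind of object, the output of one technique can be handed to the next without technique-specific glue code.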

Modular Design

Mix and match components as needed. The core is lightweight, and heavy dependencies are optional.

Production Ready

Built with scalability and performance in mind, suitable for both research and production.

Extensible

Easy-to-use plugin system for adding new techniques and adapters.

Benchmarking

Built-in tools for comparing techniques and measuring performance.
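
As a rough illustration of what comparing techniques can involve, the sketch below measures recall@k and wall-clock latency for two toy retrievers. It does not use RAGLib's built-in tools: `recall_at_k`, `benchmark`, and the fixed-ranking retrievers are hypothetical helpers written for this example only.

```python
import time
from typing import Callable, Dict, List, Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k ranked ids."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def benchmark(
    retrievers: Dict[str, Callable[[str], List[str]]],
    query: str,
    relevant_ids: Set[str],
    k: int = 2,
) -> Dict[str, Dict[str, float]]:
    """Run each retriever once and record recall@k and elapsed seconds."""
    report = {}
    for name, retrieve in retrievers.items():
        start = time.perf_counter()
        ranked = retrieve(query)
        elapsed = time.perf_counter() - start
        report[name] = {
            "recall_at_k": recall_at_k(ranked, relevant_ids, k),
            "seconds": elapsed,
        }
    return report


# Two toy retrievers that return fixed rankings (stand-ins for real techniques).
retrievers = {
    "alphabetical": lambda query: ["doc_a", "doc_b", "doc_c"],
    "reversed": lambda query: ["doc_c", "doc_b", "doc_a"],
}
report = benchmark(retrievers, query="What is RAG?", relevant_ids={"doc_a", "doc_b"})
for name, metrics in report.items():
    print(name, metrics["recall_at_k"])
```

A single timed run is noisy; for a meaningful comparison, repeat each query several times and average over a labeled query set.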

Quick Start 🚀

from raglib.techniques import DenseRetriever
from raglib.adapters import InMemoryVectorStore

# Create a technique backed by an in-memory vector store
technique = DenseRetriever(
    vectorstore=InMemoryVectorStore()
)

# Apply to your documents
results = technique.apply(
    documents=["Your document content..."],
    query="What is RAG?"
)

Get Started →

Architecture Overview

RAGLib is organized into several key components:

Core Components

RAGTechnique: Base class for all techniques
TechniqueRegistry: Central registry for technique discovery
TechniqueMeta: Metadata for technique description and categorization
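
One way a base class, a registry, and a metadata record can fit together is sketched below. This is an assumption about the general pattern, not RAGLib's actual implementation: `SketchMeta`, `SketchRegistry`, the `register` decorator, and `find_by_category` are hypothetical names, and the real `TechniqueRegistry` and `TechniqueMeta` may expose different fields and methods.

```python
from dataclasses import dataclass
from typing import Dict, List, Type


@dataclass(frozen=True)
class SketchMeta:
    """Hypothetical metadata record describing one technique."""
    name: str
    category: str
    description: str = ""


class SketchRegistry:
    """Hypothetical registry mapping technique names to classes."""

    _techniques: Dict[str, Type] = {}

    @classmethod
    def register(cls, technique_cls: Type) -> Type:
        """Class decorator: index the class under its meta.name."""
        cls._techniques[technique_cls.meta.name] = technique_cls
        return technique_cls

    @classmethod
    def get(cls, name: str) -> Type:
        """Look up a registered technique class by name."""
        return cls._techniques[name]

    @classmethod
    def find_by_category(cls, category: str) -> List[str]:
        """List the names of all techniques in one category."""
        return sorted(
            name
            for name, technique in cls._techniques.items()
            if technique.meta.category == category
        )


@SketchRegistry.register
class EchoRetriever:
    meta = SketchMeta(name="echo_retriever", category="retrieval")

    def apply(self, query: str) -> List[str]:
        return [query]


@SketchRegistry.register
class LineChunker:
    meta = SketchMeta(name="line_chunker", category="chunking")

    def apply(self, text: str) -> List[str]:
        return text.splitlines()


print(SketchRegistry.find_by_category("retrieval"))
chunker = SketchRegistry.get("line_chunker")()
print(chunker.apply("first line\nsecond line"))
```

Registering through a decorator keeps discovery declarative: a new technique becomes findable by name or category as soon as its module is imported.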

Technique Categories

Chunking: Split documents into processable segments
Retrieval: Find relevant information from knowledge bases
Reranking: Improve retrieval quality through reordering
Generation: Produce final answers using retrieved context
Orchestration: Coordinate multiple techniques in complex workflows
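
The five categories above can be read as stages of one pipeline. The sketch below wires toy versions of each stage together using plain functions; every function here is a hypothetical simplification (word-overlap scoring instead of embeddings, a string template instead of a language model) and none of it is RAGLib code.

```python
from typing import List


def chunk(document: str, words_per_chunk: int = 5) -> List[str]:
    """Chunking: group the document's words into fixed-size windows."""
    words = document.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def retrieve(chunks: List[str], query: str, top_k: int = 2) -> List[str]:
    """Retrieval: keep the chunks that share the most words with the query."""
    terms = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(terms & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def rerank(candidates: List[str], query: str) -> List[str]:
    """Reranking: move candidates containing the exact query phrase to the front."""
    return sorted(candidates, key=lambda c: query.lower() not in c.lower())


def generate(context: List[str], query: str) -> str:
    """Generation: a string template stands in for a language-model call."""
    return f"Q: {query} | Best supporting passage: {context[0]}"


def answer(document: str, query: str) -> str:
    """Orchestration: run the four stages in order."""
    candidates = retrieve(chunk(document), query)
    return generate(rerank(candidates, query), query)


document = (
    "RAG combines retrieval with generation. "
    "Chunking splits documents into segments. "
    "Reranking reorders retrieved candidates."
)
print(answer(document, "chunking splits"))
```

Keeping each stage behind its own function means any one of them can be swapped for a stronger implementation without touching the others.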

Adapters

Embedders: Convert text to vector representations
Vector Stores: Store and retrieve embeddings efficiently
LLM Adapters: Interface with different language models
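
To show how an embedder and a vector store divide the work, here is a small self-contained sketch. `VocabularyEmbedder`, `ListVectorStore`, and their `embed`, `add`, and `search` methods are hypothetical names chosen for this example; RAGLib's own adapters, such as `InMemoryVectorStore` and `DummyEmbedder`, may have different interfaces. An LLM adapter is left out to keep the example short.

```python
import math
from typing import List, Sequence, Tuple


class VocabularyEmbedder:
    """Toy embedder: counts how often each vocabulary word occurs in the text."""

    def __init__(self, vocabulary: Sequence[str]) -> None:
        self.vocabulary = list(vocabulary)

    def embed(self, text: str) -> List[float]:
        words = text.lower().split()
        return [float(words.count(term)) for term in self.vocabulary]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ListVectorStore:
    """Toy vector store: keeps (text, vector) pairs in a list and scans them all."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, text: str, vector: List[float]) -> None:
        self._items.append((text, vector))

    def search(self, query_vector: List[float], top_k: int = 1) -> List[Tuple[str, float]]:
        scored = [
            (text, cosine_similarity(query_vector, vector))
            for text, vector in self._items
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


embedder = VocabularyEmbedder(
    ["vector", "stores", "embeddings", "language", "models", "text"]
)
store = ListVectorStore()
for doc in ["vector stores keep embeddings", "language models generate text"]:
    store.add(doc, embedder.embed(doc))

top = store.search(embedder.embed("which stores hold embeddings"), top_k=1)
print(top[0][0])
```

The store never sees raw text semantics, only vectors, which is why either side can be replaced independently: a real embedding model on one side, an approximate nearest-neighbor index on the other.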

Quick Example

from raglib.techniques import DenseRetriever, FixedSizeChunker
from raglib.adapters import InMemoryVectorStore, DummyEmbedder

# Initialize components
chunker = FixedSizeChunker(chunk_size=512)
embedder = DummyEmbedder()
vectorstore = InMemoryVectorStore()
retriever = DenseRetriever(embedder=embedder, vectorstore=vectorstore)

# Process documents
documents = ["Your documents here..."]
chunks = chunker.apply(documents)
retriever.apply(chunks.payload["chunks"], mode="index")

# Query
query = "What is the main topic?"
results = retriever.apply(query, mode="retrieve", top_k=5)

Next Steps

Getting Started: Set up RAGLib and run your first example
Techniques: Browse the complete catalog of available techniques
API Reference: Detailed API documentation

Community & Contributing

RAGLib is an open-source project welcoming contributions from the community. Whether you're fixing bugs, adding new techniques, or improving documentation, we'd love to have you involved!

📖 Read our Contributing Guide
🐛 Report issues on GitHub
💬 Join discussions in GitHub Discussions