Getting Started¶
This guide will help you get up and running with RAGLib quickly.
Installation¶
Basic Installation¶
Install RAGLib using pip:
pip install rag-techlib
This installs the core library with lightweight default adapters.
Optional Dependencies¶
RAGLib supports optional dependencies for different use cases:
# For FAISS-based vector storage
pip install rag-techlib[faiss]
# For LLM integrations (OpenAI, Transformers)
pip install rag-techlib[llm]
# For development and testing
pip install rag-techlib[dev]
# Install everything
pip install rag-techlib[faiss,llm,dev]
Quick Start¶
Let's build a simple RAG pipeline:
1. Basic Example¶
from raglib.techniques import (
FixedSizeChunker,
DenseRetriever,
HyDE
)
from raglib.adapters import (
InMemoryVectorStore,
DummyEmbedder
)
# Initialize components
chunker = FixedSizeChunker(chunk_size=512, overlap=50)
embedder = DummyEmbedder(dim=384) # Fallback embedder
vectorstore = InMemoryVectorStore()
retriever = DenseRetriever(embedder=embedder, vectorstore=vectorstore)
# Note: HyDE requires an LLM adapter for query expansion
# Sample documents
documents = [
"RAGLib is a library for building retrieval-augmented generation systems.",
"It provides a unified interface for different RAG techniques.",
"You can easily compose techniques into complex pipelines.",
]
# Step 1: Chunk documents
chunks_result = chunker.apply(documents)
chunks = chunks_result.payload["chunks"]
print(f"Created {len(chunks)} chunks")
# Step 2: Index chunks
index_result = retriever.apply(chunks, mode="index")
print(f"Indexed {index_result.payload['indexed_count']} chunks")
# Step 3: Retrieve relevant chunks
query = "What is RAGLib used for?"
retrieve_result = retriever.apply(query, mode="retrieve", top_k=3)
relevant_chunks = retrieve_result.payload["chunks"]
print(f"Retrieved {len(relevant_chunks)} relevant chunks")
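# Step 4: Generate answer
# NOTE: `generator` is not defined in this snippet. Construct a generation
# technique (backed by an LLM adapter) before running this step.
generate_result = generator.apply(
    query=query,
    context=relevant_chunks
)
print(f"Generated answer: {generate_result.payload['answer']}")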
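To make the `chunk_size` and `overlap` parameters used above concrete, here is a minimal, library-independent sketch of fixed-size chunking with a sliding window. The function name `fixed_size_chunks` is hypothetical, and splitting by characters is an assumption; RAGLib's `FixedSizeChunker` may count tokens or treat boundaries differently.

```python
def fixed_size_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = fixed_size_chunks(sample, chunk_size=10, overlap=3)
for piece in pieces:
    print(piece)
```

Each chunk starts `chunk_size - overlap` characters after the previous one, so consecutive chunks repeat the last `overlap` characters. A larger overlap keeps more context across boundaries at the cost of indexing more chunks.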
2. Advanced Chunking Techniques¶
RAGLib includes several sophisticated chunking techniques optimized for different document types:
from raglib.techniques import (
ContentAwareChunker,
RecursiveChunker,
DocumentSpecificChunker,
PropositionalChunker,
ParentDocumentChunker
)
from raglib.schemas import Document
# Academic paper with hierarchical structure
document = Document(
id="research_paper",
text="""
# Abstract
This paper presents a novel approach to information retrieval.
## Introduction
Information retrieval has evolved significantly with the advent of neural networks.
### Background
Previous work has focused on traditional keyword-based approaches.
## Methodology
We propose a hybrid approach combining neural and symbolic methods.
""",
meta={"type": "academic", "domain": "computer_science"}
)
# Content-aware chunking respects document structure
content_chunker = ContentAwareChunker(chunk_size=300, overlap=50)
result = content_chunker.apply(document)
content_chunks = result.payload["chunks"]
print(f"Content-aware chunking: {len(content_chunks)} chunks")
for chunk in content_chunks[:2]:
    print(f" - Chunk: {chunk.text[:100]}...")
# Recursive chunking with hierarchical splitting
recursive_chunker = RecursiveChunker(
chunk_size=250,
overlap=30,
separators=["\n\n", "\n", ". ", " "] # Custom separator hierarchy
)
result = recursive_chunker.apply(document)
recursive_chunks = result.payload["chunks"]
print(f"Recursive chunking: {len(recursive_chunks)} chunks")
# Document-specific chunking adapts to document type
doc_chunker = DocumentSpecificChunker(chunk_size=400, overlap=40)
result = doc_chunker.apply(document)
doc_chunks = result.payload["chunks"]
print(f"Document-specific chunking: {len(doc_chunks)} chunks")
# Propositional chunking focuses on semantic units
prop_chunker = PropositionalChunker(chunk_size=200, overlap=20)
result = prop_chunker.apply(document)
prop_chunks = result.payload["chunks"]
print(f"Propositional chunking: {len(prop_chunks)} chunks")
# Parent-document chunking maintains context hierarchy
parent_chunker = ParentDocumentChunker(
child_chunk_size=150,
parent_chunk_size=600,
overlap=25
)
result = parent_chunker.apply(document)
parent_data = result.payload
print("Parent-document chunking:")
print(f" - Child chunks: {len(parent_data['child_chunks'])}")
print(f" - Parent chunks: {len(parent_data['parent_chunks'])}")
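The separator hierarchy passed to `RecursiveChunker` above is easier to reason about with a small standalone sketch. The function below is hypothetical and deliberately simplified: it drops the separators, applies no overlap, and does not merge small neighbouring pieces, all of which the real chunker may handle differently.

```python
def recursive_split(text: str, chunk_size: int, separators: list[str]) -> list[str]:
    """Split on the coarsest separator first, recursing with finer ones on oversized pieces."""
    text = text.strip()
    if not text:
        return []
    if len(text) <= chunk_size:
        return [text]
    if not separators:
        # No separators left: fall back to hard cuts of chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    head, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(head):
        chunks.extend(recursive_split(piece, chunk_size, rest))
    return chunks


text = "First paragraph. It has two sentences.\n\nSecond paragraph is short."
parts = recursive_split(text, chunk_size=30, separators=["\n\n", "\n", ". ", " "])
for part in parts:
    print(repr(part))
```

Paragraph breaks are tried first, so a paragraph that already fits stays intact; only the oversized first paragraph is split further at sentence boundaries.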
3. Comparing Chunking Strategies¶
from raglib.registry import TechniqueRegistry
from raglib.schemas import Document
# Get all chunking techniques
chunking_techniques = TechniqueRegistry.find_by_category("chunking")
# Test document
test_doc = Document(
id="test",
text="This is a test document. It has multiple sentences and paragraphs.\n\nThis is the second paragraph with more content for testing purposes.",
meta={"source": "test"}
)
# Compare different chunking approaches
print("Chunking Strategy Comparison:")
print("-" * 40)
for name, technique_class in chunking_techniques.items():
    try:
        chunker = technique_class(chunk_size=100, overlap=20)
        result = chunker.apply(test_doc)
        if result.success:
            chunks = result.payload["chunks"]
            # Guard against an empty chunk list to avoid dividing by zero
            avg_length = sum(len(c.text) for c in chunks) / max(len(chunks), 1)
            print(f"{name}:")
            print(f" - Chunks: {len(chunks)}")
            print(f" - Avg length: {avg_length:.1f} chars")
            print(f" - Coverage: {result.meta.get('coverage_ratio', 'N/A')}")
        else:
            print(f"{name}: Failed - {result.error}")
    except Exception as e:
        # Techniques with a different constructor or payload shape land here
        print(f"{name}: Error - {e}")
    print()
4. Sparse Retrieval Techniques¶
RAGLib includes a comprehensive set of sparse retrieval techniques that use lexical matching rather than dense embeddings:
from raglib.registry import TechniqueRegistry
from raglib.schemas import Document
# Create test documents
documents = [
Document(id="ml", text="Machine learning algorithms learn patterns from data automatically."),
Document(id="nlp", text="Natural language processing helps computers understand human language."),
Document(id="ir", text="Information retrieval systems find relevant documents efficiently."),
Document(id="ai", text="Artificial intelligence encompasses machine learning and deep learning.")
]
# Get all sparse retrieval techniques
sparse_techniques = TechniqueRegistry.find_by_category("sparse_retrieval")
query = "machine learning algorithms"
print("Sparse Retrieval Comparison:")
print("-" * 50)
for name, technique_class in sparse_techniques.items():
    try:
        retriever = technique_class(docs=documents)
        result = retriever.apply(query=query, top_k=3)
        if result.success:
            hits = result.payload["hits"]
            top_score = hits[0].score if hits else 0.0
            print(f"{name.upper()}:")
            print(f" - Results: {len(hits)}")
            print(f" - Top score: {top_score:.4f}")
            print(f" - Best match: {hits[0].doc_id if hits else 'None'}")
            # Show technique-specific info
            if name == "lexical_matcher":
                print(f" - Mode: {result.meta.get('mode', 'N/A')}")
            elif name == "splade":
                expanded = result.meta.get('expanded_terms', [])
                print(f" - Expanded terms: {len(expanded)}")
            elif name == "lexical_transformer":
                print(f" - Attention weight: {result.meta.get('attention_weight', 'N/A')}")
        else:
            print(f"{name}: Failed - {result.error}")
    except Exception as e:
        print(f"{name}: Error - {e}")
    print()
# Example: Using BM25 for quick sparse retrieval
print("Quick BM25 Example:")
BM25 = TechniqueRegistry.get("bm25")
bm25 = BM25(docs=documents)
result = bm25.apply(query="information retrieval", top_k=2)
if result.success:
    for i, hit in enumerate(result.payload["hits"], 1):
        doc = next(d for d in documents if d.id == hit.doc_id)
        print(f"{i}. {hit.doc_id} (score: {hit.score:.3f}): {doc.text}")
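For readers who want to see what a BM25 score is made of, the following standalone sketch ranks the same four example texts without RAGLib. It uses the common Lucene-style IDF variant with `k1=1.5` and `b=0.75`; the function name, the regex tokenizer, and these defaults are assumptions for illustration and may not match the library's `bm25` technique.

```python
import math
import re
from collections import Counter


def bm25_scores(query: str, docs: dict[str, str], k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    """Score every document against the query with a basic BM25 formula."""
    tokenized = {doc_id: re.findall(r"\w+", text.lower()) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs
    # Document frequency: in how many documents each term occurs
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    scores = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in re.findall(r"\w+", query.lower()):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents need more hits for the same score
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


docs = {
    "ml": "Machine learning algorithms learn patterns from data automatically.",
    "nlp": "Natural language processing helps computers understand human language.",
    "ir": "Information retrieval systems find relevant documents efficiently.",
    "ai": "Artificial intelligence encompasses machine learning and deep learning.",
}
scores = bm25_scores("machine learning algorithms", docs)
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Documents that share no term with the query score exactly zero, which is the main behavioural difference from dense retrieval: there is no notion of semantic closeness, only weighted term overlap.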
5. Advanced Vector Retrieval Techniques¶
RAGLib includes state-of-the-art dense retrieval techniques that use semantic embeddings for finding relevant information:
from raglib.registry import TechniqueRegistry
from raglib.schemas import Document, Chunk
from raglib.adapters import DummyEmbedder
# Sample documents for vector retrieval
documents = [
Document(id="ai", text="Artificial intelligence is transforming how we process information."),
Document(id="ml", text="Machine learning algorithms can learn patterns from data automatically."),
Document(id="dl", text="Deep learning uses neural networks with multiple layers for complex tasks."),
Document(id="nlp", text="Natural language processing enables computers to understand human language."),
Document(id="cv", text="Computer vision allows machines to interpret and analyze visual information."),
]
# Convert to chunks for retrieval
chunks = [
Chunk(id=doc.id, text=doc.text, start_idx=0, end_idx=len(doc.text), doc_id=doc.id)
for doc in documents
]
# Initialize embedder
embedder = DummyEmbedder(dim=384)
print("Advanced Vector Retrieval Techniques Showcase:")
print("=" * 60)
# 1. FAISS-based High-Performance Retrieval
print("\n1. FAISS Retrieval (High-Performance Vector Search)")
FaissRetriever = TechniqueRegistry.get("faiss_retriever")
faiss_retriever = FaissRetriever(embedder=embedder, index_type="flat")
# Index chunks
faiss_retriever.add_chunks(chunks)
result = faiss_retriever.apply(query="understanding human language", top_k=3)
if result.success:
    hits = result.payload["hits"]
    print(f"Found {len(hits)} relevant chunks using FAISS:")
    for i, hit in enumerate(hits, 1):
        chunk = next(c for c in chunks if c.id == hit.doc_id)
        print(f" {i}. {chunk.text[:60]}... (score: {hit.score:.3f})")
# 2. Dual Encoder for Asymmetric Retrieval
print("\n2. Dual Encoder Retrieval (Asymmetric Query/Document Encoding)")
DualEncoder = TechniqueRegistry.get("dual_encoder")
dual_encoder = DualEncoder(
query_embedder=embedder,
doc_embedder=embedder,
similarity="cosine"
)
dual_encoder.add_chunks(chunks)
result = dual_encoder.apply(query="neural networks for complex problems", top_k=3)
if result.success:
    hits = result.payload["hits"]
    print("Dual encoder results:")
    for i, hit in enumerate(hits, 1):
        chunk = next(c for c in chunks if c.id == hit.doc_id)
        print(f" {i}. {chunk.text[:60]}... (score: {hit.score:.3f})")
# 3. ColBERT for Token-Level Matching
print("\n3. ColBERT Retrieval (Token-Level Late Interaction)")
ColBERT = TechniqueRegistry.get("colbert_retriever")
colbert = ColBERT(embedder=embedder, max_tokens=64)
colbert.add_chunks(chunks)
result = colbert.apply(query="machines analyze visual data", top_k=3)
if result.success:
    hits = result.payload["hits"]
    print("ColBERT token-level matching results:")
    for i, hit in enumerate(hits, 1):
        chunk = next(c for c in chunks if c.id == hit.doc_id)
        print(f" {i}. {chunk.text[:60]}... (score: {hit.score:.3f})")
# 4. Multi-Query Retrieval with Query Expansion
print("\n4. Multi-Query Retrieval (Query Expansion & Fusion)")
MultiQuery = TechniqueRegistry.get("multi_query_retriever")
# Note: In real usage, you'd use an actual LLM for query generation
try:
    multi_query = MultiQuery(
        base_retriever=faiss_retriever,
        num_queries=3,
        fusion_method="rrf"
    )
    result = multi_query.apply(query="automated pattern recognition", top_k=3)
    if result.success:
        hits = result.payload["hits"]
        print("Multi-query with RRF fusion results:")
        for i, hit in enumerate(hits, 1):
            chunk = next(c for c in chunks if c.id == hit.doc_id)
            print(f" {i}. {chunk.text[:60]}... (score: {hit.score:.3f})")
except Exception as e:
    print(f"Multi-query requires LLM adapter. Demo error: {e}")
# 5. Multi-Vector Retrieval with Document Segmentation
print("\n5. Multi-Vector Retrieval (Document Segmentation)")
MultiVector = TechniqueRegistry.get("multi_vector_retriever")
multi_vector = MultiVector(
embedder=embedder,
segment_size=50,
aggregation_method="max"
)
multi_vector.add_chunks(chunks)
result = multi_vector.apply(query="learning from data", top_k=3)
if result.success:
    hits = result.payload["hits"]
    print("Multi-vector segmentation results:")
    for i, hit in enumerate(hits, 1):
        chunk = next(c for c in chunks if c.id == hit.doc_id)
        print(f" {i}. {chunk.text[:60]}... (score: {hit.score:.3f})")
print("\n" + "=" * 60)
print("All vector retrieval techniques support the same interface!")
print("You can swap between techniques without changing your pipeline code.")
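The multi-query example above selects `fusion_method="rrf"`. As a rough illustration of what reciprocal rank fusion does, the sketch below merges several ranked ID lists into one. The function name and the constant `k=60` (a commonly used default) are assumptions; the library's own fusion may differ in detail.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each appearance contributes 1 / (k + rank) to an item's score."""
    fused: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Three hypothetical result lists, one per generated query variant
rankings = [
    ["ml", "dl", "ai"],
    ["dl", "ml", "cv"],
    ["ml", "nlp", "dl"],
]
fused = reciprocal_rank_fusion(rankings)
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Because only ranks are used, RRF can combine retrievers whose raw scores are on different scales, which is why it is a convenient default when fusing results from several queries or techniques.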
6. Using the CLI¶
RAGLib provides a command-line interface for quick experimentation:
# Run the quick start example
raglib-cli quick-start
# Run a specific example
raglib-cli run-example e2e_toy
# Test all chunking techniques
python examples/chunking_benchmark.py
# Build documentation
raglib-cli docs-build
# List all available techniques
python -c "from raglib.registry import TechniqueRegistry; print('\\n'.join(TechniqueRegistry.list().keys()))"
Core Concepts¶
RAGTechnique Interface¶
All techniques in RAGLib implement the same interface:
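# TechniqueResult is assumed to be importable from raglib.core alongside RAGTechnique
from raglib.core import RAGTechnique, TechniqueResult
class MyTechnique(RAGTechnique):
    def apply(self, *args, **kwargs):
        # Your implementation here
        return TechniqueResult(
            success=True,
            payload={"result": "your_data"}
        )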
TechniqueResult¶
Every technique returns a TechniqueResult object:
result = technique.apply(data)
if result.success:
    data = result.payload
    print(f"Operation succeeded: {data}")
else:
    print(f"Operation failed: {result.error}")
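The result-object pattern can be tried without installing anything. The sketch below uses a stand-in dataclass, `SketchResult`, carrying the four fields this guide reads (`success`, `payload`, `error`, `meta`); it is not RAGLib's `TechniqueResult`, and `WordCounter` is a hypothetical toy technique.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchResult:
    """Stand-in result type with the fields used throughout this guide."""
    success: bool
    payload: dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None
    meta: dict[str, Any] = field(default_factory=dict)


class WordCounter:
    """Toy technique: counts words and reports failures through the result object."""

    def apply(self, text: Any) -> SketchResult:
        if not isinstance(text, str):
            return SketchResult(success=False, error=f"expected str, got {type(text).__name__}")
        return SketchResult(success=True, payload={"word_count": len(text.split())})


for value in ["retrieval augmented generation", 42]:
    result = WordCounter().apply(value)
    if result.success:
        print(f"Operation succeeded: {result.payload}")
    else:
        print(f"Operation failed: {result.error}")
```

Returning failures as data rather than raising lets a pipeline decide per step whether to stop, retry, or fall back to another technique.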
Registration System¶
Techniques are automatically discoverable through the registry:
from raglib.registry import TechniqueRegistry
# List all registered techniques
techniques = TechniqueRegistry.list()
print(techniques.keys())
# Get a specific technique
ChunkerClass = TechniqueRegistry.get("fixed_size_chunker")
chunker = ChunkerClass(chunk_size=256)
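One plausible way such a registry can work is a decorator that records each class under a name. The sketch below is an assumption about the general pattern, not RAGLib's implementation; `SketchRegistry`, its `register` decorator, and the `category` attribute are hypothetical names.

```python
from typing import Callable


class SketchRegistry:
    """Minimal name -> class registry populated by a decorator."""
    _techniques: dict[str, type] = {}

    @classmethod
    def register(cls, name: str, category: str) -> Callable[[type], type]:
        def decorator(technique_cls: type) -> type:
            technique_cls.category = category  # remember the category on the class
            cls._techniques[name] = technique_cls
            return technique_cls
        return decorator

    @classmethod
    def get(cls, name: str) -> type:
        return cls._techniques[name]

    @classmethod
    def find_by_category(cls, category: str) -> dict[str, type]:
        return {n: t for n, t in cls._techniques.items()
                if getattr(t, "category", None) == category}

    @classmethod
    def list(cls) -> dict[str, type]:
        return dict(cls._techniques)  # copy, so callers cannot mutate the registry


@SketchRegistry.register("upper", category="demo")
class Upper:
    def apply(self, text: str) -> str:
        return text.upper()


UpperClass = SketchRegistry.get("upper")
print(UpperClass().apply("registered"))
print(sorted(SketchRegistry.list().keys()))
```

Registration happens at import time, when the decorated class body is executed, which is what makes techniques discoverable by name without the caller importing each class directly.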
Working with Adapters¶
Adapters provide interfaces to external services and libraries:
Embedders¶
from raglib.adapters import DummyEmbedder
# Fallback embedder (no external dependencies)
embedder = DummyEmbedder(dim=384)
# With sentence-transformers (requires llm extras)
# from raglib.adapters import SentenceTransformerEmbedder
# embedder = SentenceTransformerEmbedder("all-MiniLM-L6-v2")
Vector Stores¶
from raglib.adapters import InMemoryVectorStore
# In-memory storage (good for development)
vectorstore = InMemoryVectorStore()
# With FAISS (requires faiss extras)
# from raglib.adapters import FaissVectorStore
# vectorstore = FaissVectorStore(dimension=384)
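To show how an embedder and a vector store fit together, here is a dependency-free sketch of both roles. `HashEmbedder` and `ListVectorStore` are hypothetical stand-ins, and their `embed`, `add`, and `search` methods are assumed names rather than RAGLib's adapter API; the hashing trick produces deterministic but semantically meaningless vectors, much like a fallback embedder would.

```python
import hashlib
import math


class HashEmbedder:
    """Deterministic toy embedder: hashes each token into one of `dim` buckets."""

    def __init__(self, dim: int = 16) -> None:
        self.dim = dim

    def embed(self, text: str) -> list[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % self.dim
            vec[bucket] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]  # unit length, so dot product equals cosine


class ListVectorStore:
    """Brute-force in-memory store: scores every stored vector on each search."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items.append((item_id, vector))

    def search(self, query_vec: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        scored = [(item_id, sum(q * v for q, v in zip(query_vec, vec)))
                  for item_id, vec in self._items]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


embedder = HashEmbedder(dim=16)
store = ListVectorStore()
texts = {
    "a": "in-memory storage is good for development",
    "b": "faiss scales to large collections",
}
for item_id, text in texts.items():
    store.add(item_id, embedder.embed(text))

hits = store.search(embedder.embed(texts["a"]), top_k=2)
for item_id, score in hits:
    print(f"{item_id}: {score:.3f}")
```

Keeping the two roles behind small interfaces is what allows a development setup to be swapped for sentence-transformers and FAISS later without touching the retrieval code.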
Configuration and Environment¶
Environment Variables¶
RAGLib respects standard environment variables:
# OpenAI API key (for LLM generators)
export OPENAI_API_KEY="your-api-key"
# Hugging Face token (for some models)
export HF_TOKEN="your-token"
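If you want your own code to fail early when a key is missing, a small helper along these lines can be useful. `require_env` is a hypothetical helper, not part of RAGLib, and how the library itself reads these variables is not specified here.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before using adapters that need it")
    return value


# Optional variables can be read with a plain lookup that tolerates absence
hf_token = os.environ.get("HF_TOKEN")
print("HF_TOKEN set:", hf_token is not None)
```

Calling `require_env("OPENAI_API_KEY")` at startup surfaces a missing key immediately instead of at the first generation request.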
Configuration Files¶
You can use configuration files to manage complex setups:
# raglib_config.yaml
chunking:
  technique: "fixed_size_chunker"
  chunk_size: 512
  overlap: 50
retrieval:
  technique: "dense_retriever"
  top_k: 5
generation:
  technique: "llm_generator"
  model: "gpt-3.5-turbo"
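How RAGLib consumes such a file is not covered in this guide, so the following is only a sketch of one way to work with it. The dictionary mirrors the YAML above (it is what a YAML parser such as PyYAML's `yaml.safe_load` would return for that file), and the helpers `split_stage` and `validate` are hypothetical.

```python
config = {
    "chunking": {"technique": "fixed_size_chunker", "chunk_size": 512, "overlap": 50},
    "retrieval": {"technique": "dense_retriever", "top_k": 5},
    "generation": {"technique": "llm_generator", "model": "gpt-3.5-turbo"},
}


def split_stage(stage_config: dict) -> tuple[str, dict]:
    """Separate the technique name from its keyword arguments."""
    params = dict(stage_config)  # copy so the original config is not modified
    name = params.pop("technique")
    return name, params


def validate(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    problems = []
    for stage in ("chunking", "retrieval", "generation"):
        if "technique" not in config.get(stage, {}):
            problems.append(f"{stage}: missing 'technique'")
    chunking = config.get("chunking", {})
    if "chunk_size" in chunking and chunking.get("overlap", 0) >= chunking["chunk_size"]:
        problems.append("chunking: overlap must be smaller than chunk_size")
    return problems


print(validate(config))
name, params = split_stage(config["chunking"])
print(name, params)
```

The `(name, params)` pair maps naturally onto a registry lookup followed by construction, for example `TechniqueRegistry.get(name)(**params)`, assuming the constructor accepts those keyword arguments.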
Next Steps¶
Now that you have RAGLib running:
- Explore Techniques: Check out the techniques catalog
- Build Pipelines: Learn about composing techniques
- Add Custom Techniques: Extend RAGLib with your own implementations
- Optimize Performance: Learn about production deployment strategies
Getting Help¶
- Documentation: Browse the complete API reference
- Examples: Check out the examples/ directory in the repository
- Issues: Report bugs or request features on GitHub
- Discussions: Join community discussions for help and ideas