Skip to content

Techniques API 🛠️

This page provides detailed API documentation for all RAGLib techniques organized by category.

Quick Reference 📋

Technique Category Purpose Key Parameters
FixedSizeChunker Chunking Split into fixed-size chunks chunk_size, overlap
SemanticChunker Chunking Semantically-aware chunking similarity_threshold
SentenceWindowChunker Chunking Sentence-based windowing window_size, step_size
DenseRetriever Retrieval Embedding-based retrieval top_k, similarity_threshold
FAISSRetriever Retrieval High-performance vector search index_type, top_k
DualEncoder Retrieval Asymmetric query/doc encoding similarity, top_k
ColBERTRetriever Retrieval Token-level late interaction max_tokens, top_k
MultiQueryRetriever Retrieval Query expansion with fusion num_queries, fusion_method
MultiVectorRetriever Retrieval Multi-vector document representation segment_size, aggregation_method
BM25 Retrieval Keyword-based retrieval top_k, k1, b
CrossEncoderReRanker Reranking Neural reranking model_name, top_k
MMRReRanker Reranking Diversity-aware reranking diversity_lambda
HyDE Generation Hypothetical document embeddings model_name, temperature

Usage Patterns 🚀

Basic Technique Usage

from raglib.techniques import FixedSizeChunker

# Initialize technique
chunker = FixedSizeChunker(chunk_size=512, overlap=50)

# Apply technique
result = chunker.apply(documents)

# Check result
if result.success:
    chunks = result.payload["chunks"]
    print(f"Created {len(chunks)} chunks")
else:
    print(f"Error: {result.error}")

Chaining Techniques

from raglib.techniques import FixedSizeChunker, DenseRetriever, LLMGenerator

# Setup pipeline components
chunker = FixedSizeChunker(chunk_size=512)
retriever = DenseRetriever(top_k=5)
generator = LLMGenerator(model_name="gpt-3.5-turbo")

# Execute pipeline
chunks = chunker.apply(documents)
if chunks.success:
    retrieved = retriever.apply(query, chunks.payload["chunks"])
    if retrieved.success:
        answer = generator.apply(query, retrieved.payload["documents"])

Error Handling

All techniques return consistent TechniqueResult objects:

result = technique.apply(input_data)

if result.success:
    output = result.payload
    metadata = result.metadata
else:
    print(f"Technique failed: {result.error}")
    error_context = result.metadata

Technique Categories 📁

Chunking Techniques 📄

Break documents into processable segments:

  • FixedSizeChunker: Split text into chunks of fixed character length
  • SemanticChunker: Create chunks based on semantic similarity
  • SentenceWindowChunker: Use sentence boundaries with overlapping windows

Retrieval Techniques 🔍

Find relevant information from document collections:

  • DenseRetriever: Vector-based semantic retrieval using embeddings
  • BM25: Traditional keyword-based sparse retrieval with in-memory indexing

Reranking Techniques 🎯

Improve initial retrieval results:

  • CrossEncoderReRanker: Deep learning-based relevance scoring
  • MMRReRanker: Balance relevance with diversity using Maximal Marginal Relevance

Generation Techniques ✍️

Create responses from retrieved context:

  • LLMGenerator: Generate answers using large language models
  • TemplateResponse: Rule-based template filling
  • HyDE: Hypothetical document embeddings for enhanced retrieval

Utility Techniques 🔧

Helper techniques for testing and development:

  • EchoTechnique: Echo input with optional modifications
  • NullTechnique: No-operation technique for pipeline testing

Advanced Usage 🎯

Custom Technique Development

from raglib.core import RAGTechnique, TechniqueResult
from raglib.schemas import TechniqueMeta

class MyTechnique(RAGTechnique):
    def __init__(self, custom_param: str):
        meta = TechniqueMeta(
            name="my_technique",
            category="custom",
            description="My custom technique"
        )
        super().__init__(meta)
        self.custom_param = custom_param

    def apply(self, input_data) -> TechniqueResult:
        try:
            # Your logic here
            result = self._process(input_data)
            return TechniqueResult(
                success=True,
                payload=result,
                metadata={"processing_info": "success"}
            )
        except Exception as e:
            return TechniqueResult(
                success=False,
                error=str(e)
            )

Registry Integration

from raglib.registry import TechniqueRegistry

# Register your technique
TechniqueRegistry.register("my_technique", MyTechnique)

# Use via registry
technique_class = TechniqueRegistry.get("my_technique")
technique = technique_class(custom_param="value")

💡 For detailed parameter documentation and examples, see the individual technique source files or use the interactive help:

from raglib.techniques import FixedSizeChunker
help(FixedSizeChunker)