
v0.1.4

@aravind10x aravind10x released this 31 Dec 18:11
· 3 commits to main since this release

What's Changed

A new SDK that lets you optimize each module of the RAG pipeline (data ingestion, retrieval, generation) separately.

Basic Usage:

from ragbuilder import RAGBuilder

# Initialize and optimize
builder = RAGBuilder.from_source_with_defaults(input_source='data.pdf')
results = builder.optimize()

# Run a query through the complete pipeline
response = results.invoke("What is HNSW?")

# View optimization summary
print(results.summary())

Advanced Configuration
For fine-grained control, you can customize every aspect:

from ragbuilder import RAGBuilder
from ragbuilder.config import (
    DataIngestOptionsConfig,
    RetrievalOptionsConfig,
    GenerationOptionsConfig
)

# Configure data ingestion
data_ingest_config = DataIngestOptionsConfig(
    input_source="data.pdf",
    document_loaders=[
        {"type": "pymupdf"},
        {"type": "unstructured"}
    ],
    chunking_strategies=[{
        "type": "RecursiveCharacterTextSplitter",
        "chunker_kwargs": {"separators": ["\n\n", "\n", " ", ""]}
    }],
    chunk_size={"min": 500, "max": 2000, "stepsize": 500},
    embedding_models=[{
        "type": "openai",
        "model_kwargs": {"model": "text-embedding-3-large"}
    }]
)

# Configure retrieval
retrieval_config = RetrievalOptionsConfig(
    retrievers=[
        {
            "type": "vector_similarity",
            "retriever_k": [20],
            "weight": 0.5
        },
        {
            "type": "bm25",
            "retriever_k": [20],
            "weight": 0.5
        }
    ],
    rerankers=[{
        "type": "BAAI/bge-reranker-base"
    }],
    top_k=[3, 5]
)

# Initialize with custom configs
builder = RAGBuilder(
    data_ingest_config=data_ingest_config,
    retrieval_config=retrieval_config
)

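# Run the optimization (as in Basic Usage) so that `results` is defined
results = builder.optimize()

# Access individual components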
vectorstore = results.data_ingest.get_vectorstore()
docs = results.retrieval.invoke("What is RAG?")
answer = results.generation.invoke("What is RAG?")
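To get a feel for how large the data ingestion search space in the example above is, here is a minimal standalone sketch. It does not call ragbuilder. It assumes that `chunk_size` is expanded as an inclusive range from `min` to `max` in steps of `stepsize`, and that every combination of options is a candidate; how RAGBuilder actually samples this space is not stated in these notes. The helper `expand_range` is hypothetical.

```python
from itertools import product


def expand_range(spec):
    """Expand a {"min", "max", "stepsize"} dict into a list of values.

    Assumption: "max" is inclusive. This helper is illustrative only and
    is not part of the ragbuilder API.
    """
    return list(range(spec["min"], spec["max"] + 1, spec["stepsize"]))


# Options taken from the data_ingest_config example above
loaders = ["pymupdf", "unstructured"]
chunkers = ["RecursiveCharacterTextSplitter"]
chunk_sizes = expand_range({"min": 500, "max": 2000, "stepsize": 500})
embeddings = ["text-embedding-3-large"]

# Every combination of loader, chunker, chunk size and embedding model
candidates = list(product(loaders, chunkers, chunk_sizes, embeddings))

print(chunk_sizes)      # [500, 1000, 1500, 2000]
print(len(candidates))  # 8
```

Under these assumptions the example config describes 8 ingestion candidates. Adding another loader or narrowing `stepsize` multiplies that number, which is worth keeping in mind when sizing an optimization run.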

Full Changelog: 0.0.22...v0.1.4