High quality resources & applications for LLMs, multi-modal models and VectorDBs

lancedb/vectordb-recipes

VectorDB-recipes


Dive into building GenAI applications! This repository contains examples, applications, starter code, and tutorials to help you kickstart your GenAI projects.
  • These are built using LanceDB, a free, open-source, serverless vectorDB that requires no setup.
  • It integrates with the Python data ecosystem, so you can use it in your existing data pipelines with pandas, Arrow, Pydantic, etc.
  • LanceDB has a native TypeScript SDK that lets you run vector search in serverless functions!

Join our community for support - Discord • Twitter

This repository is divided into 2 sections:

  • Examples - Get right into the code with minimal introduction, aimed at getting you from an idea to PoC within minutes!
  • Applications - Ready-to-use Python and web apps built with applied LLMs, VectorDB, and GenAI tools

The following examples are organized into different tables to make similar types of examples easily accessible.

Sections

  • Build from Scratch - Build applications/examples from scratch using LanceDB for efficient vector-based document retrieval.
  • Multimodal - Build a multimodal search application that takes text or images as queries.
  • RAG - Build a variety of RAG applications by loading data from different formats and querying with text.
  • Vector Search - Build a vector search application using different search algorithms.
  • Chatbot - Build a chatbot application that uses user queries to retrieve relevant context and generate coherent, context-aware replies.
  • Evaluation - Evaluate reference and candidate texts to measure their performance on various metrics.
  • AI Agents - Design an application powered by AI agents that exchange information, coordinate tasks, and achieve shared goals effectively.
  • Recommender Systems - Build recommendation systems that generate personalized recommendations and enhance user experience.
  • Concepts - Concepts from the LLM application pipeline that help ensure accurate information retrieval.

🌟 New 🌟

  • Advanced RAG: Context Enrichment Window - Open In Colab

Build from Scratch

Build applications/examples using LanceDB for efficient vector-based document retrieval.

Build from Scratch Interactive Notebook & Scripts
Build RAG from Scratch Open In Colab LLM beginner
Local RAG from Scratch with Llama3 Python local LLM beginner
Multi-Head RAG from Scratch Python LLM local LLM beginner

Multimodal

Create a multimodal search application using LanceDB for efficient vector-based retrieval of text and image data. Input text or image queries to find the most relevant documents and images from your corpus.

Multimodal Interactive Notebook & Scripts Blog
Multimodal CLIP: DiffusionDB Open In Colab Python LLM beginner Ghost
Multimodal CLIP: YouTube videos Open In Colab Python LLM beginner Ghost
Cambrian-1: Vision centric exploration of images Kaggle LLM intermediate Ghost
Multimodal Jina CLIP-V2 : Food Search Open In Colab Python beginner

RAG

Develop a Retrieval-Augmented Generation (RAG) application using LanceDB for efficient vector-based information retrieval. Input text queries to retrieve relevant documents and generate comprehensive answers by combining retrieved information.

RAG Interactive Notebook & Scripts Blog
RAG with Contextual Retrieval and Hybrid search Open In Colab LLM intermediate Ghost
RAG with Matryoshka Embeddings and LlamaIndex Open In Colab LLM intermediate
RAG with IBM Watsonx Open In Colab LLM watsonx LLM beginner
Improve RAG with Re-ranking Open In Colab LLM beginner Ghost
Instruct-Multitask Open In Colab Python LLM beginner Ghost
Improve RAG with HyDE Open In Colab LLM intermediate Ghost
Improve RAG with LOTR Open In Colab LLM intermediate Ghost
Advanced RAG: Context Enrichment Window Open In Colab LLM intermediate Ghost
Advanced RAG: Late Chunking Open In Colab LLM intermediate Ghost
Advanced RAG: Parent Document Retriever Open In Colab LLM intermediate Ghost
Corrective RAG with Langgraph Open In Colab LLM intermediate Ghost
Contextual-Compression-with-RAG Open In Colab local LLM intermediate Ghost
Improve RAG with FLARE Open In Colab local LLM LLM advanced Ghost
Agentic RAG Open In Colab LLM advanced
GraphRAG Open In Colab LLM intermediate Ghost
GraphRAG with CSV File Open In Colab LLM intermediate Ghost
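
The retrieve-then-generate flow described for this section can be sketched in plain Python. This is a minimal illustration, not code from the notebooks: `embed`, `retrieve`, and `build_prompt` are hypothetical helpers, the bag-of-words "embedding" is a toy stand-in for a real embedding model, and the in-memory list stands in for a LanceDB table. The final prompt would be passed to an LLM of your choice.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the question into one LLM prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


corpus = [
    "LanceDB is an open-source serverless vector database.",
    "Bananas are rich in potassium.",
    "RAG combines retrieval with text generation.",
]
question = "What is LanceDB?"
context = retrieve(question, corpus, k=1)
prompt = build_prompt(question, context)
print(prompt)
```

In a real pipeline the retrieval step would be a vector search against a LanceDB table, and the prompt would be sent to the model that generates the answer.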

Vector Search

Build a vector search application using LanceDB for efficient vector-based document retrieval. Input text queries to find the most relevant documents from your corpus.

Vector Search Interactive Notebook & Scripts Blog
Inbuilt Hybrid Search Open In Colab LLM beginner
Hybrid search BM25 & lancedb Open In Colab LLM beginner Ghost
NER powered Semantic Search Open In Colab local LLM beginner Ghost
Vector Arithmetic with LanceDB Open In Colab LLM beginner Ghost
Summarize and Search Reddit Posts Open In Colab beginner
Imagebind demo app hf spaces intermediate
Search Within Images Open In Colab local LLM intermediate Ghost
Zero Shot Object Detection with CLIP Open In Colab intermediate
Vector Search with TransformersJS JS LLM advanced
Accelerate Vector Search Applications Using OpenVINO Open In Colab local LLM advanced Ghost
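
Several of the examples above combine vector search with keyword search (hybrid search). One common way to merge two ranked result lists is reciprocal rank fusion (RRF). The sketch below is an assumption about how such a merge can work, not necessarily the method each notebook uses. The function name, the document ids, and the constant `k=60` (a conventional default) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken alphabetically by id.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from two different searchers.
vector_ranking = ["doc-a", "doc-b", "doc-c"]
keyword_ranking = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why RRF is a popular default when the two scoring scales are not directly comparable.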

Chatbot

Create a chatbot application using LanceDB for efficient vector-based response generation. Input user queries to retrieve relevant context and generate coherent, context-aware replies.

Chatbot Interactive Notebook & Scripts Blog
Databricks DBRX Website Bot Python Databricks LLM beginner
CLI-based SDK Manual Chatbot with Phidata Python local LLM beginner
YouTube transcript search bot Open In Colab Python JS LLM intermediate
Langchain: Code Docs QA bot Open In Colab Python JS LLM intermediate
Chatbot with any website using Crawl4AI Open In Colab Python LLM beginner
Context-Aware Chatbot using Llama 2 & LanceDB Open In Colab local LLM advanced Ghost

Evaluation

Develop an evaluation application. Input reference and candidate texts to measure their performance on various metrics.

Evaluation Interactive Notebook & Scripts Blog
Evaluating Prompts with Prompttools Open In Colab LLM local LLM advanced
Evaluating RAG with RAGAs Open In Colab LLM intermediate
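
To make "measure reference and candidate texts on a metric" concrete, here is a minimal sketch of one simple metric, unigram-overlap F1. It is an illustration only: the listed notebooks use Prompttools and RAGAs, which provide richer metrics, and the helper names below are hypothetical.

```python
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_f1(reference: str, candidate: str) -> float:
    """F1 over overlapping unigrams between a reference and a candidate text."""
    ref, cand = Counter(tokens(reference)), Counter(tokens(candidate))
    overlap = sum((ref & cand).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "LanceDB is a serverless vector database"
candidate = "LanceDB is a vector database"
score = unigram_f1(reference, candidate)
print(round(score, 3))
```

Here every candidate token appears in the reference (precision 1.0), but the candidate misses one of the six reference tokens (recall 5/6), so the score is below 1.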

AI Agents

Design an AI agent coordination application with LanceDB for efficient vector-based communication and collaboration. Input queries to enable AI agents to exchange information, coordinate tasks, and achieve shared goals effectively.

AI Agents Interactive Notebook & Scripts Blog
AI email assistant with Composio Open In Colab LLM beginner
Assistant Bot with OpenAI Swarm Open In Colab LLM intermediate
AI Trends Searcher with CrewAI Open In Colab LLM beginner Ghost
SuperAgent Autogen Open In Colab LLM intermediate
AI Agents: Reducing Hallucination Open In Colab Python JS LLM advanced Ghost
Multi Document Agentic RAG Open In Colab LLM advanced Ghost

Recommender Systems

Create a recommender system application with LanceDB for efficient vector-based item recommendation. Input user preferences or item features to generate personalized recommendations and enhance user experience.

Recommender Systems Interactive Notebook & Scripts Blog
Movie Recommender Open In Colab Python beginner
Product Recommender Open In Colab Python intermediate
Arxiv paper recommender Open In Colab Python LLM beginner
Music Recommender Open In Colab intermediate
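
The idea of turning user preferences into vector-based recommendations can be sketched as follows. This is a minimal sketch under stated assumptions: the item names and two-dimensional vectors are made up, `recommend` is a hypothetical helper, and a real application would store the item embeddings in a LanceDB table and run a vector search instead of the brute-force loop.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(liked_ids, items, k=2):
    """Average the liked items into a user profile, then rank unseen items."""
    liked = [items[i] for i in liked_ids]
    dims = len(liked[0])
    profile = [sum(vec[d] for vec in liked) / len(liked) for d in range(dims)]
    candidates = [
        (item_id, cosine(profile, vec))
        for item_id, vec in items.items()
        if item_id not in liked_ids
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in candidates[:k]]


# Hypothetical item embeddings: (sci-fi-ness, romance-ness).
items = {
    "space-opera": [0.9, 0.1],
    "alien-thriller": [0.8, 0.2],
    "rom-com": [0.1, 0.9],
    "period-drama": [0.2, 0.8],
}
picks = recommend({"space-opera"}, items, k=2)
print(picks)
```

Averaging liked items is the simplest possible user profile. Weighting by rating or recency is a natural extension.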

Concepts

Check out concepts from the LLM application pipeline that help ensure accurate information retrieval.

Concepts Interactive Notebook Blog
A Primer on Text Chunking and its Types Open In Colab beginner Ghost
Langchain LlamaIndex Chunking Open In Colab beginner Ghost
Create structured dataset using Instructor Python beginner
Comparing Cohere Rerankers with LanceDB beginner Ghost
Product Quantization: Compress High Dimensional Vectors intermediate Ghost
LLMs, RAG, & the missing storage layer for AI intermediate Ghost
Fine-Tuning LLM using PEFT & QLoRA Open In Colab local LLM advanced Ghost
Extracting Complex tables-text from PDFs using LlamaParse Open In Colab LLM LlamaCloud beginner

Projects & Applications

These are ready-to-use applications built using the LanceDB serverless vector database. You can explore these open-source projects, use parts of them in your own projects, or build your applications on top of them.

Node applications powered by LanceDB

Project Name Description Screenshot
Writing assistant Writing assistant app using langchain.js with LanceDB. It gives you real-time, relevant suggestions and facts based on your written text to help with your writing. Writing assistant
Sentence auto complete Sentence auto-complete app using langchain.js with LanceDB. It gives you real-time, relevant auto-complete suggestions and facts based on your written text to help with your writing. You can also upload your own data source as a PDF file and switch between GPT models to get faster results. Sentence auto complete
Project Name Description Screenshot
YOLOExplorer Iterate on your YOLO / CV datasets using SQL, Vector semantic search, and more within seconds YOLOExplorer
Website Chatbot (Deployable Vercel Template) Create a chatbot from the sitemap of any website/docs of your choice. Built using the vectorDB serverless native JavaScript package. Chatbot
Chat with multiple URL/website Conversational AI for any website with Mistral, BGE embeddings & LanceDB webui_aa
Talk with Podcast Talk with a YouTube podcast using Ollama and insanely-fast-whisper demo
HR chatbot HR chatbot - ask your personal queries using a zero-shot ReAct agent & tools image
Advanced Chatbot with Parler TTS This chatbot app uses LanceDB hybrid search, FTS, and a reranker with the Parler TTS library. image
Multi-Modal Search Engine Create a multi-modal search engine app to search images using either images or text Search
Multimodal Myntra Fashion Search Engine This app uses OpenAI's CLIP to build a search engine that understands both text and images. image
Multilingual-RAG Multilingual RAG with Cohere embeddings and support for 100+ languages image
GTE MLX RAG MLX-based RAG model with LanceDB API support image
Healthcare Chatbot Healthcare chatbot using a domain-specific LLM & embedding model image

🌟 New! 🌟 Applied GenAI and VectorDB course on Udacity
Learn about GenAI and vectorDBs using LanceDB in the recently launched Udacity course.

Contributing Examples

If you're working on some cool applications that you'd like to add to this repo, please open a PR!