Commit
Merge pull request #334 from Shreyanand/milvus
Add Milvus database compatibility with the RAG recipe
rhatdan authored May 16, 2024
2 parents 0df104d + ef4b6f0 commit ae88cd7
Showing 14 changed files with 216 additions and 111 deletions.
12 changes: 6 additions & 6 deletions .github/workflows/rag.yaml
@@ -5,16 +5,16 @@ on:
     branches:
       - main
     paths:
-      - ./recipes/common/Makefile.common
-      - ./recipes/natural_language_processing/rag/**
-      - .github/workflows/rag.yaml
+      - 'recipes/common/Makefile.common'
+      - 'recipes/natural_language_processing/rag/**'
+      - '.github/workflows/rag.yaml'
   push:
     branches:
       - main
     paths:
-      - ./recipes/common/Makefile.common
-      - ./recipes/natural_language_processing/rag/**
-      - .github/workflows/rag.yaml
+      - 'recipes/common/Makefile.common'
+      - 'recipes/natural_language_processing/rag/**'
+      - '.github/workflows/rag.yaml'
 
   workflow_dispatch:

1 change: 1 addition & 0 deletions .gitignore
@@ -12,3 +12,4 @@ recipes/common/bin/*
 */.venv/
 training/cloud/examples
 training/instructlab/instructlab
+vector_dbs/milvus/volumes/milvus/*
33 changes: 29 additions & 4 deletions recipes/natural_language_processing/rag/README.md
@@ -4,7 +4,7 @@ This demo provides a simple recipe to help developers start to build out their o
 
 There are a few options today for local Model Serving, but this recipe will use [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) and their OpenAI compatible Model Service. There is a Containerfile provided that can be used to build this Model Service within the repo, [`model_servers/llamacpp_python/base/Containerfile`](/model_servers/llamacpp_python/base/Containerfile).
 
-In order for the LLM to interact with our documents, we need them stored and available in such a manner that we can retrieve a small subset of them that are relevant to our query. To do this we employ a Vector Database alongside an embedding model. The embedding model converts our documents into numerical representations, vectors, such that similarity searches can be easily performed. The Vector Database stores these vectors for us and makes them available to the LLM. In this recipe we will use [chromaDB](https://docs.trychroma.com/) as our Vector Database.
+In order for the LLM to interact with our documents, we need them stored and available in such a manner that we can retrieve a small subset of them that are relevant to our query. To do this we employ a Vector Database alongside an embedding model. The embedding model converts our documents into numerical representations, vectors, such that similarity searches can be easily performed. The Vector Database stores these vectors for us and makes them available to the LLM. In this recipe we can use [chromaDB](https://docs.trychroma.com/) or [Milvus](https://milvus.io/) as our Vector Database.
 
 Our AI Application will connect to our Model Service via its OpenAI compatible API. In this example we rely on [Langchain's](https://python.langchain.com/docs/get_started/introduction) python package to simplify communication with our Model Service and we use [Streamlit](https://streamlit.io/) for our UI layer. Below please see an example of the RAG application.

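[Editor's note] To make the embedding step described in the added paragraph concrete, here is a minimal sketch (not part of the recipe) using the sentence-transformers package and the BAAI/bge-base-en-v1.5 model that this recipe defaults to; the document strings and query are illustrative:

```python
# Minimal sketch: an embedding model turns documents into vectors that can
# be compared by similarity -- the operation a Vector Database serves at scale.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

documents = [
    "Podman is a daemonless container engine.",
    "Milvus is an open-source vector database.",
]
query = "Which vector database can I use?"

doc_vectors = model.encode(documents)   # one 768-dim vector per document
query_vector = model.encode(query)

# Cosine similarity: the highest-scoring document is the most relevant one.
scores = util.cos_sim(query_vector, doc_vectors)
print(documents[scores.argmax()])
```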
@@ -78,16 +78,41 @@ snapshot_download(repo_id="BAAI/bge-base-en-v1.5",
 
 ### Deploy the Vector Database
 
-To deploy the Vector Database service locally, simply use the existing ChromaDB image.
+To deploy the Vector Database service locally, simply use the existing ChromaDB or Milvus image. The Vector Database is ephemeral and will need to be re-populated each time the container restarts. When implementing RAG in production, you will want a long running and backed up Vector Database.
+
+#### ChromaDB
 ```bash
 podman pull chromadb/chroma
 ```
 ```bash
 podman run --rm -it -p 8000:8000 chroma
 ```
 
-This Vector Database is ephemeral and will need to be re-populated each time the container restarts. When implementing RAG in production, you will want a long running and backed up Vector Database.
+#### Milvus
+```bash
+podman pull milvusdb/milvus:master-20240426-bed6363f
+```
+```bash
+podman run -it \
+    --name milvus-standalone \
+    --security-opt seccomp:unconfined \
+    -e ETCD_USE_EMBED=true \
+    -e ETCD_CONFIG_PATH=/milvus/configs/embedEtcd.yaml \
+    -e COMMON_STORAGETYPE=local \
+    -v $(pwd)/volumes/milvus:/var/lib/milvus \
+    -v $(pwd)/embedEtcd.yaml:/milvus/configs/embedEtcd.yaml \
+    -p 19530:19530 \
+    -p 9091:9091 \
+    -p 2379:2379 \
+    --health-cmd="curl -f http://localhost:9091/healthz" \
+    --health-interval=30s \
+    --health-start-period=90s \
+    --health-timeout=20s \
+    --health-retries=3 \
+    milvusdb/milvus:master-20240426-bed6363f \
+    milvus run standalone 1> /dev/null
+```
+Note: For running the Milvus instance, make sure you have the `$(pwd)/volumes/milvus` directory and `$(pwd)/embedEtcd.yaml` file as shown in this repository. These are required by the database for its operations.
 
 
 ### Build the Model Service
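[Editor's note] The `embedEtcd.yaml` referenced in the added Note ships in the repository (see the `vector_dbs/milvus/Containerfile` later in this commit) but its contents are not shown in this diff. A minimal version, assuming Milvus's documented standalone defaults for its embedded etcd, looks roughly like:

```yaml
# Hypothetical reconstruction -- the real file is in the repo, not this diff.
# Values follow the defaults Milvus documents for standalone embedded etcd.
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
quota-backend-bytes: 4294967296
auto-compaction-mode: revision
auto-compaction-retention: '1000'
```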
1 change: 1 addition & 0 deletions recipes/natural_language_processing/rag/app/Containerfile
@@ -16,6 +16,7 @@ COPY requirements.txt .
 RUN pip install --upgrade pip
 RUN pip install --no-cache-dir --upgrade -r /rag/requirements.txt
 COPY rag_app.py .
+COPY manage_vectordb.py .
 EXPOSE 8501
 ENV HF_HUB_CACHE=/rag/models/
 ENTRYPOINT [ "streamlit", "run" ,"rag_app.py" ]
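[Editor's note] Once built, this image is wired to the other services through the environment variables `rag_app.py` reads (`MODEL_ENDPOINT`, `VECTORDB_VENDOR`, `VECTORDB_HOST`, `VECTORDB_PORT`). A sketch of a local run, assuming the image was tagged `rag` and the addresses are placeholders for wherever the Model Service and Milvus actually run:

```bash
# Sketch only: tag `rag` and the host addresses are assumptions, not from this commit.
podman run --rm -it -p 8501:8501 \
    -e MODEL_ENDPOINT=http://10.88.0.1:8001 \
    -e VECTORDB_VENDOR=milvus \
    -e VECTORDB_HOST=10.88.0.1 \
    -e VECTORDB_PORT=19530 \
    rag
```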
81 changes: 81 additions & 0 deletions recipes/natural_language_processing/rag/app/manage_vectordb.py
@@ -0,0 +1,81 @@
from langchain_community.vectorstores import Chroma
from chromadb import HttpClient
from chromadb.config import Settings
import chromadb.utils.embedding_functions as embedding_functions
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
from langchain_community.vectorstores import Milvus
from pymilvus import MilvusClient
from pymilvus import connections, utility

class VectorDB:
    def __init__(self, vector_vendor, host, port, collection_name, embedding_model):
        self.vector_vendor = vector_vendor
        self.host = host
        self.port = port
        self.collection_name = collection_name
        self.embedding_model = embedding_model

    def connect(self):
        # Connection logic
        print(f"Connecting to {self.host}:{self.port}...")
        if self.vector_vendor == "chromadb":
            self.client = HttpClient(host=self.host,
                                     port=self.port,
                                     settings=Settings(allow_reset=True,))
        elif self.vector_vendor == "milvus":
            self.client = MilvusClient(uri=f"http://{self.host}:{self.port}")
        return self.client

    def populate_db(self, documents):
        # Logic to populate the VectorDB with vectors
        e = SentenceTransformerEmbeddings(model_name=self.embedding_model)
        print("Populating VectorDB with vectors...")
        if self.vector_vendor == "chromadb":
            embedding_func = embedding_functions.SentenceTransformerEmbeddingFunction(model_name=self.embedding_model)
            collection = self.client.get_or_create_collection(self.collection_name,
                                                              embedding_function=embedding_func)
            if collection.count() < 1:
                db = Chroma.from_documents(
                    documents=documents,
                    embedding=e,
                    collection_name=self.collection_name,
                    client=self.client
                )
                print("DB populated")
            else:
                db = Chroma(client=self.client,
                            collection_name=self.collection_name,
                            embedding_function=e,
                            )
                print("DB already populated")

        elif self.vector_vendor == "milvus":
            connections.connect(host=self.host, port=self.port)
            if not utility.has_collection(self.collection_name):
                db = Milvus.from_documents(
                    documents,
                    e,
                    collection_name=self.collection_name,
                    connection_args={"host": self.host, "port": self.port},
                )
                print("DB populated")
            else:
                print("DB already populated")
                db = Milvus(
                    e,
                    collection_name=self.collection_name,
                    connection_args={"host": self.host, "port": self.port},
                )
        return db

    def clear_db(self):
        print("Clearing VectorDB...")
        try:
            if self.vector_vendor == "chromadb":
                self.client.delete_collection(self.collection_name)
            elif self.vector_vendor == "milvus":
                self.client.drop_collection(self.collection_name)
            print("Cleared DB")
        except Exception:
            print("Couldn't clear the collection, possibly because it doesn't exist")
36 changes: 0 additions & 36 deletions recipes/natural_language_processing/rag/app/populate_vectordb.py

This file was deleted.

86 changes: 29 additions & 57 deletions recipes/natural_language_processing/rag/app/rag_app.py
@@ -1,91 +1,68 @@
 from langchain_openai import ChatOpenAI
 from langchain_core.prompts import ChatPromptTemplate
 from langchain_core.runnables import RunnablePassthrough
-from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
 from langchain.text_splitter import CharacterTextSplitter
 from langchain_community.callbacks import StreamlitCallbackHandler
-from langchain_community.vectorstores import Chroma
+from langchain_community.document_loaders import TextLoader
 from langchain_community.document_loaders import PyPDFLoader
-from langchain.schema.document import Document
-from chromadb import HttpClient
-from chromadb.config import Settings
-import chromadb.utils.embedding_functions as embedding_functions
-import streamlit as st
+from manage_vectordb import VectorDB
 import tempfile
-import uuid
+import streamlit as st
 import os
 
 model_service = os.getenv("MODEL_ENDPOINT","http://0.0.0.0:8001")
 model_service = f"{model_service}/v1"
 chunk_size = os.getenv("CHUNK_SIZE", 150)
 embedding_model = os.getenv("EMBEDDING_MODEL","BAAI/bge-base-en-v1.5")
+vdb_vendor = os.getenv("VECTORDB_VENDOR", "chromadb")
 vdb_host = os.getenv("VECTORDB_HOST", "0.0.0.0")
 vdb_port = os.getenv("VECTORDB_PORT", "8000")
 vdb_name = os.getenv("VECTORDB_NAME", "test_collection")
 
+vdb = VectorDB(vdb_vendor, vdb_host, vdb_port, vdb_name, embedding_model)
+vectorDB_client = vdb.connect()
+def split_docs(raw_documents):
+    text_splitter = CharacterTextSplitter(separator = ".",
+                                          chunk_size=int(chunk_size),
+                                          chunk_overlap=0)
+    docs = text_splitter.split_documents(raw_documents)
+    return docs
 
-vectorDB_client = HttpClient(host=vdb_host,
-                             port=vdb_port,
-                             settings=Settings(allow_reset=True,))
-
-def clear_vdb():
-    global vectorDB_client
-    try:
-        vectorDB_client.delete_collection(vdb_name)
-        print("Cleared DB")
-    except:
-        pass
 
 def read_file(file):
     file_type = file.type
 
     if file_type == "application/pdf":
         temp = tempfile.NamedTemporaryFile()
         with open(temp.name, "wb") as f:
             f.write(file.getvalue())
         loader = PyPDFLoader(temp.name)
-        pages = loader.load()
-        text = "".join([p.page_content for p in pages])
 
     if file_type == "text/plain":
-        text = file.read().decode()
-
-    return text
+        temp = tempfile.NamedTemporaryFile()
+        with open(temp.name, "wb") as f:
+            f.write(file.getvalue())
+        loader = TextLoader(temp.name)
+    raw_documents = loader.load()
+    return raw_documents
 
 st.title("📚 RAG DEMO")
 with st.sidebar:
     file = st.file_uploader(label="📄 Upload Document",
-                            type=[".txt",".pdf"],
-                            on_change=clear_vdb
-                            )
+                            type=[".txt",".pdf"],
+                            on_change=vdb.clear_db
+                            )
 
 ### populate the DB ####
 os.environ["TOKENIZERS_PARALLELISM"] = "false"
 
-embedding_func = embedding_functions.SentenceTransformerEmbeddingFunction(model_name=embedding_model)
-e = SentenceTransformerEmbeddings(model_name=embedding_model)
-
-collection = vectorDB_client.get_or_create_collection(vdb_name,
-                                                      embedding_function=embedding_func)
-if collection.count() < 1 and file != None:
-    print("populating db")
+if file != None:
     text = read_file(file)
-    raw_documents = [Document(page_content=text,
-                              metadata={"":""})]
-    text_splitter = CharacterTextSplitter(separator = ".",
-                                          chunk_size=int(chunk_size),
-                                          chunk_overlap=0)
-    docs = text_splitter.split_documents(raw_documents)
-    for doc in docs:
-        collection.add(
-            ids=[str(uuid.uuid1())],
-            metadatas=doc.metadata,
-            documents=doc.page_content
-        )
-if file == None:
-    print("Empty VectorDB")
+    documents = split_docs(text)
+    db = vdb.populate_db(documents)
+    retriever = db.as_retriever(threshold=0.75)
 else:
-    print("DB already populated")
+    retriever = {}
+    print("Empty VectorDB")
 
 
 ########################
 
 if "messages" not in st.session_state:
@@ -95,11 +72,6 @@ def read_file(file):
 for msg in st.session_state.messages:
     st.chat_message(msg["role"]).write(msg["content"])
 
-db = Chroma(client=vectorDB_client,
-            collection_name=vdb_name,
-            embedding_function=e
-            )
-retriever = db.as_retriever(threshold=0.75)
 
 llm = ChatOpenAI(base_url=model_service,
                  api_key="EMPTY",
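[Editor's note] The diff is truncated at the `ChatOpenAI` construction. For context, a retriever and an OpenAI-compatible LLM like these are typically composed into a chain using the imported `ChatPromptTemplate` and `RunnablePassthrough`; the sketch below shows the likely wiring, with the prompt wording as an assumption since it is not visible in this diff:

```python
# Sketch of how retriever and llm are typically combined in a Langchain RAG chain.
# The prompt text is illustrative, not taken from this commit.
prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the following context:
{context}

Question: {input}
"""
)

chain = (
    {"context": retriever, "input": RunnablePassthrough()}
    | prompt
    | llm
)

response = chain.invoke("What does this document say about Milvus?")
```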
1 change: 1 addition & 0 deletions recipes/natural_language_processing/rag/app/requirements.txt
@@ -4,4 +4,5 @@ chromadb
 sentence-transformers
 streamlit
 pypdf
+pymilvus
12 changes: 5 additions & 7 deletions vector_dbs/README.md
@@ -1,12 +1,10 @@
 # Directory to store vector_dbs files
+This directory has Makefiles and Containerfiles for open source vector databases. The built container images are used by recipes like `rag` to provide required database functions.
 
-[Chroma](https://www.trychroma.com/) is the open-source embedding database.
+## Chroma
+[Chroma](https://www.trychroma.com/) is an AI-native open-source embedding database.
 Chroma makes it easy to build LLM apps by making knowledge, facts, and skills
 pluggable for LLMs.
 
-chromadb is an the AI-native open-source embedding database.
-
-This container image is used by recipes like `rag` to provide required database
-functions.
-
-Use the included Makefile to build the container image.
+## Milvus
+[Milvus](https://milvus.io/) is an open-source vector database built to power embedding similarity search and AI applications. It is highly scalable and offers many production-ready features for search.
2 changes: 1 addition & 1 deletion vector_dbs/Makefile → vector_dbs/chromadb/Makefile
@@ -3,4 +3,4 @@ APPIMAGE ?= quay.io/ai-lab/${APP}:latest
 
 .PHONY: build
 build:
-	podman build -f chromadb/Containerfile -t ${APPIMAGE} .
+	podman build -f Containerfile -t ${APPIMAGE} .
2 changes: 2 additions & 0 deletions vector_dbs/milvus/Containerfile
@@ -0,0 +1,2 @@
FROM docker.io/milvusdb/milvus:master-20240426-bed6363f
ADD embedEtcd.yaml /milvus/configs/embedEtcd.yaml
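[Editor's note] By analogy with the chromadb Makefile change above, this image can presumably be built the same way from inside `vector_dbs/milvus/`; the tag below is an assumption following the repo's `quay.io/ai-lab/${APP}:latest` pattern, not something shown in this commit:

```bash
# Assumed build command, mirroring the chromadb Makefile's pattern.
cd vector_dbs/milvus
podman build -f Containerfile -t quay.io/ai-lab/milvus:latest .
```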