Querying vectordb with AgentAI #20

adivik2000 · 2023-09-14T18:42:27Z

This PR adds initial support for querying a vector database (Chroma DB for now) with agentai.

A query model for Chroma looks like this ->

class Query(BaseModel):
    """Query Model to search the vector database. If query_embeddings is provided, query_texts will be ignored."""

    query_embeddings: Optional[List[Embedding]] = Field(None, description="Embedding for the query to search")
    query_texts: Optional[List[str]] = Field(None, description="Simplified query from the user to search")
    k: int = Field(..., description="The number of results requested")
    include: Include = Field(
        ["documents", "embeddings", "metadatas", "distances"], description="Data to include in results"
    )
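
For reviewers, here is a minimal sketch of how the fields of `Query` could map onto the keyword arguments of Chroma's `Collection.query` (`query_embeddings`, `query_texts`, `n_results`, `include`). It uses a plain dataclass as a stand-in for the pydantic model so that it runs without dependencies. `QuerySketch` and `to_chroma_kwargs` are hypothetical names used only for illustration; this is not the code in vectordb.py.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class QuerySketch:
    """Dependency-free stand-in for the pydantic Query model (hypothetical)."""

    k: int
    query_embeddings: Optional[List[List[float]]] = None
    query_texts: Optional[List[str]] = None
    include: List[str] = field(
        default_factory=lambda: ["documents", "embeddings", "metadatas", "distances"]
    )


def to_chroma_kwargs(query: QuerySketch) -> Dict[str, Any]:
    """Build keyword arguments for Chroma's Collection.query from a query."""
    if query.query_embeddings is None and query.query_texts is None:
        raise ValueError("Provide query_embeddings or query_texts")
    kwargs: Dict[str, Any] = {"n_results": query.k, "include": list(query.include)}
    if query.query_embeddings is not None:
        # Embeddings take precedence; query_texts is ignored, as the docstring says.
        kwargs["query_embeddings"] = query.query_embeddings
    else:
        kwargs["query_texts"] = query.query_texts
    return kwargs


example = QuerySketch(
    k=3,
    query_texts=["where does food come from"],
    include=["documents", "distances"],
)
print(to_chroma_kwargs(example))
```

Passing only one of `query_embeddings` or `query_texts` keeps the precedence rule from the docstring explicit at the call site.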

An example of how this can be used ->

@tool(registry=db_registry)
def query_vector_db(query: Query):
    """
    Ask the vector database a question
    """
    print(f"Querying vector database: {query}")
    results = client_db.get_docs(query=query)
    return results


question = """Search for the content about where food comes from in the vector database.
    Get me three results from the vector database and include the documents and distances."""

conversation = Conversation()
conversation.add_message(
    "user",
    question,
)

chat_response = chat_complete_execute_fn(conversation, tool_registry=db_registry, model="gpt-3.5-turbo")
print(chat_response)

Outputs ->

({'ids': [['90834f80-0432-475e-af9b-9688215db92d', 'a3c0e748-0937-46b1-a167-5aa01a70bbac', '81bed12d-1a84-4b4b-bd09-9fa964240278']], 'distances': [[0.7584866881370544, 1.0528839826583862, 1.372355341911316]], 'metadatas': None, 'embeddings': None, 'documents': [["CHAPTER.... 1 Agricultural Practices", "In order to provide food for a large population- regular production.. patterns can be identified.", "Storage\n1.3 ......Preparation of Soil"]]}, {'query': {'query_texts': ['where does food come from'], 'k': 3, 'include': ['documents', 'distances']}}, <function query_vector_db at 0x168888400>)
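
The return value above appears to be a three-item tuple: the results from the vector database, the arguments the model chose for the call, and the function that was executed. Assuming that shape, here is a small sketch of pairing each returned document with its ID and distance for the first query text. `top_hits` is a hypothetical helper, and the sample dictionary is copied from the output above.

```python
from typing import Any, Dict, List, Tuple

# First item of the tuple printed above: one inner list per query text.
results: Dict[str, Any] = {
    "ids": [[
        "90834f80-0432-475e-af9b-9688215db92d",
        "a3c0e748-0937-46b1-a167-5aa01a70bbac",
        "81bed12d-1a84-4b4b-bd09-9fa964240278",
    ]],
    "distances": [[0.7584866881370544, 1.0528839826583862, 1.372355341911316]],
    "metadatas": None,   # not requested via `include`
    "embeddings": None,  # not requested via `include`
    "documents": [[
        "CHAPTER.... 1 Agricultural Practices",
        "In order to provide food for a large population- regular production.. patterns can be identified.",
        "Storage\n1.3 ......Preparation of Soil",
    ]],
}


def top_hits(results: Dict[str, Any], query_index: int = 0) -> List[Tuple[str, float, str]]:
    """Return (id, distance, document) triples for one query text, closest first.

    Requires that "documents" and "distances" were included in the query.
    """
    ids = results["ids"][query_index]
    distances = results["distances"][query_index]
    documents = results["documents"][query_index]
    hits = list(zip(ids, distances, documents))
    return sorted(hits, key=lambda hit: hit[1])


for doc_id, distance, document in top_hits(results):
    print(f"{distance:.3f}  {doc_id}  {document[:40]!r}")
```

Fields left out of `include` come back as `None`, as `metadatas` and `embeddings` do here, so a helper like this should only index the fields that were requested.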

Document parsing is currently limited to PDFs, using Unstructured and Azure Document Intelligence (Form Recognizer). It can be expanded as needed.

Some of the code comes from other PRs that are currently open (that code doesn't have to be reviewed in this PR). Please leave a comment after merging the earlier ones and I'll resolve the conflicts.

Files to review:

In the docs folder:

Two notebooks: one with Unstructured and the other with Azure Document Intelligence

In the agentai folder:

Two files: parsers.py and vectordb.py

review-notebook-app · 2023-09-14T18:42:31Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

adivik2000 added 4 commits August 24, 2023 14:28

Initial Parser of Document Intelligence/Form Recognizer

eeb50ad

changes to table extraction

939fe5c

pydantic model and related changes

3911c3b

vectordb querying with agentai

ffe89c8
