Commit c4b9b17 by ajosh0504, committed Jul 23, 2024 (1 parent: 1915dc1). Showing 13 changed files with 119 additions and 11 deletions.
# 📘 What is RAG?

![](/img/screenshots/10-rag/rag.png)

RAG, short for Retrieval Augmented Generation, is a technique to enhance the quality of responses generated by a large language model (LLM) by augmenting its pre-trained knowledge with information retrieved from external sources. This results in more accurate responses from the LLM by grounding them in real, contextually relevant data.
# 📘 When to use RAG?

RAG is best suited for the following:
* Tasks that require very specific information that you don't expect to be present in the LLM's parametric knowledge, i.e., information that is not widely available on the internet
* Tasks that require information from multiple different data sources
* Tasks that involve basic question-answering or summarization on a piece of information

Do not expect success on complex multi-step tasks involving deductive reasoning or long-term planning. These are better suited to agentic workflows.

Here are some examples of tasks/questions that **DO NOT** require or cannot be achieved with RAG:

> Who was the first president of the United States?

The information required to answer this question is very likely present in the parametric knowledge of most LLMs. Hence, this question can be answered using a simple prompt to an LLM.

> How has the trend in the average daily calorie intake among adults changed over the last decade in the United States, and what impact might this have on obesity rates? Additionally, can you provide a graphical representation of the trend in obesity rates over this period?

This question involves multiple sub-tasks such as data aggregation, visualization, and reasoning. Hence, this is a good use case for an AI agent rather than RAG.

Here are some use cases for RAG:

> What is the travel reimbursement policy for meals for my company?

The information required to answer this question is most likely not present in the parametric knowledge of available LLMs. However, this question can easily be answered using RAG on a knowledge base consisting of your company's data.

> Hi, I'm having trouble installing your software on my Windows 10 computer. It keeps giving me an error message saying 'Installation failed: Error code 1234'. How can I resolve this issue?

Again, this question requires troubleshooting information for a specific piece of software, the documentation for which might not be widely available, but it can be answered using RAG.
# 📘 Components of a RAG system

RAG systems have two main components: **Retrieval** and **Generation**.

## Retrieval

Retrieval involves processing your data and constructing a knowledge base in a way that lets you efficiently retrieve relevant information from it. It typically involves three main steps:

* **Chunking**: Break down large pieces of information into smaller segments, or chunks.

* **Embedding**: Convert a piece of information such as text, images, audio, or video into an array of numbers, a.k.a. a vector.

* **Semantic search**: Retrieve the most relevant documents from the knowledge base based on embedding similarity with the query vector.

## Generation

Generation involves crafting a prompt that contains all the instructions and information required by the LLM to generate accurate answers to user queries.
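To make these steps concrete, here is a minimal, hedged sketch of both components in Python. It assumes the Sentence Transformers library with the `all-MiniLM-L6-v2` model, a naive character-based chunker, and an in-memory similarity search; it illustrates the flow, not the lab's exact code.

```
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed embedding model; not necessarily the one used in the lab
model = SentenceTransformer("all-MiniLM-L6-v2")

# Chunking: naive fixed-size character chunks for illustration
def chunk(text: str, size: int = 200) -> list:
    return [text[i:i + size] for i in range(0, len(text), size)]

docs = ["<the text of your source documents goes here>"]
chunks = [c for doc in docs for c in chunk(doc)]

# Embedding: convert each chunk into a vector (normalized for cosine similarity)
chunk_embeddings = model.encode(chunks, normalize_embeddings=True)

# Semantic search: rank chunks by similarity to the query embedding
query = "What is the travel reimbursement policy for meals?"
query_embedding = model.encode(query, normalize_embeddings=True)
scores = chunk_embeddings @ query_embedding
top_chunks = [chunks[i] for i in np.argsort(scores)[::-1][:3]]

# Generation: craft a prompt that grounds the LLM in the retrieved context
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(top_chunks) + "\n\n"
    f"Question: {query}"
)
# `prompt` would then be passed to the LLM of your choice
```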
# 📘 Tools, libraries, and concepts

## [datasets](https://huggingface.co/docs/datasets/en/index)

Library used to download a dataset of MongoDB Developer Center tutorials from Hugging Face.
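As a hedged illustration of how the dataset might be pulled down; the dataset name below is a placeholder, not necessarily the one used in this lab:

```
from datasets import load_dataset

# Placeholder dataset name; substitute the actual Hugging Face dataset used in the lab
data = load_dataset("mongodb/devcenter-tutorials", split="train")
print(data[0])  # inspect the first record
```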
## [RecursiveCharacterTextSplitter](https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/split_by_token/)

A LangChain text splitter that first splits documents by a list of characters and then recursively merges characters into tokens until the specified chunk size is reached.
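A short sketch, assuming the LangChain v0.1 import path and illustrative chunking parameters:

```
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Illustrative chunk size and overlap; tune these for your data
splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=30)
chunks = splitter.split_text("A long MongoDB Developer Center tutorial goes here...")
print(len(chunks))
```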
## [Sentence Transformers](https://sbert.net/)

Python library for accessing, using, and training open-source embedding models.
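A minimal sketch; the model name is an example, not necessarily the lab's choice:

```
from sentence_transformers import SentenceTransformer

# Example open-source embedding model
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["What is RAG?", "Retrieval Augmented Generation"])
print(embeddings.shape)  # (2, embedding_dimension)
```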
## [PyMongo](https://pymongo.readthedocs.io/en/stable/)

Python driver for MongoDB. Used to connect to MongoDB databases, and to delete documents from and insert documents into a MongoDB collection.
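A hedged sketch of the PyMongo operations named above; the connection string, database, and collection names are placeholders:

```
from pymongo import MongoClient

# Placeholder connection string and namespace
client = MongoClient("mongodb+srv://<user>:<password>@<cluster-url>")
collection = client["rag_lab"]["knowledge_base"]

collection.delete_many({})  # remove any existing documents
collection.insert_many([
    {"Content": "a chunk of tutorial text", "embedding": [0.01, 0.02, 0.03]},
])
```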
# 📘 Semantic search in MongoDB

In MongoDB, you can semantically search through your data using MongoDB Atlas Vector Search.

To perform vector search on your data in MongoDB, you need to create a vector search index. An example vector search index definition looks as follows:
```
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "euclidean | cosine | dotProduct"
    },
    {
      "type": "filter",
      "path": "symbol"
    },
    ...
  ]
}
```
In the index definition, you specify the path to the embedding field (`path`), the number of dimensions in the embedding vectors (`numDimensions`), and a similarity metric that specifies how to determine nearest neighbors (`similarity`). You can also index filter fields, which allow you to pre-filter on certain metadata to narrow the scope of the vector search.
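As a hedged aside, an index like the one above could also be created programmatically with PyMongo, assuming a recent driver version (4.7+) that supports Atlas Search index management; the connection string, namespace, and index name below are placeholders.

```
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

# Placeholder connection string and namespace
collection = MongoClient("mongodb+srv://<user>:<password>@<cluster-url>")["rag_lab"]["knowledge_base"]

index_model = SearchIndexModel(
    definition={
        "fields": [
            {"type": "vector", "path": "embedding", "numDimensions": 1536, "similarity": "cosine"},
            {"type": "filter", "path": "symbol"},
        ]
    },
    name="vector_index",
    type="vectorSearch",  # the type parameter requires PyMongo 4.7+ and an Atlas cluster
)
collection.create_search_index(model=index_model)
```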
Vector search in MongoDB takes the form of an aggregation pipeline stage. It always needs to be the first stage in the pipeline and can be followed by other stages to further process the semantic search results. An example pipeline including the `$vectorSearch` stage is as follows:
```
[
  {
    "$vectorSearch": {
      "index": "vector_index",
      "path": "embedding",
      "filter": {"symbol": "ABMD"},
      "queryVector": [0.02421053, -0.022372592,...],
      "numCandidates": 150,
      "limit": 10
    }
  },
  {
    "$project": {
      "_id": 0,
      "Content": 1,
      "score": {"$meta": "vectorSearchScore"}
    }
  }
]
```
In this example, you can see a vector search query with a pre-filter. The `limit` field in the query definition specifies how many documents to return from the vector search.

The `$project` stage that follows returns only the `Content` field of each document, along with the similarity score from the vector search.
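For context, here is a hedged sketch of running such a pipeline with PyMongo. The connection string and namespace are placeholders, and the query vector is truncated; in practice it must come from the same embedding model used to embed the documents, with the same number of dimensions as the index (1536 in the example above).

```
from pymongo import MongoClient

# Placeholder connection string and namespace
collection = MongoClient("mongodb+srv://<user>:<password>@<cluster-url>")["rag_lab"]["knowledge_base"]

# Truncated for brevity; generate this by embedding the user's query
query_vector = [0.02421053, -0.022372592]  # ...remaining dimensions omitted

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "filter": {"symbol": "ABMD"},
            "queryVector": query_vector,
            "numCandidates": 150,
            "limit": 10,
        }
    },
    {"$project": {"_id": 0, "Content": 1, "score": {"$meta": "vectorSearchScore"}}},
]

for doc in collection.aggregate(pipeline):
    print(doc["score"], doc["Content"])
```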
File renamed without changes.
This file was deleted.
File renamed without changes.
File renamed without changes.
# 📘 Tools, libraries, and concepts

Memory is important for the LLM to have multi-turn conversations with the user.

In this lab, you will persist chat messages in a separate MongoDB collection, indexed by session ID.

For each new user query, you will fetch previous messages for that session from the collection and pass them to the LLM.

Then, once the LLM has generated a response to the query, you will write the query and the LLM's answer to the collection as two separate entries that share the same session ID.
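Here is a minimal, hedged sketch of this pattern with PyMongo; the connection string, collection name, and field names are illustrative assumptions, not the lab's exact schema.

```
from datetime import datetime, timezone
from pymongo import MongoClient

# Placeholder connection string; collection and field names are illustrative
history = MongoClient("mongodb+srv://<user>:<password>@<cluster-url>")["rag_lab"]["chat_history"]
history.create_index("session_id")  # index chat messages by session ID

def get_history(session_id: str) -> list:
    """Fetch previous messages for a session, oldest first."""
    return list(history.find({"session_id": session_id}).sort("timestamp", 1))

def add_message(session_id: str, role: str, content: str) -> None:
    """Store a single message (user query or LLM answer) for a session."""
    history.insert_one({
        "session_id": session_id,
        "role": role,
        "content": content,
        "timestamp": datetime.now(timezone.utc),
    })

# Usage: fetch history, pass it to the LLM (not shown), then persist both turns
previous = get_history("session-123")
add_message("session-123", "user", "What is our meal reimbursement policy?")
add_message("session-123", "assistant", "<LLM answer goes here>")
```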
Binary file not shown.
Binary file not shown.