Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 47 changed files with 133 additions and 122 deletions.
Binary file modified (BIN, +628 Bytes, 110%): src/docs/_build/doctrees/auto_examples/Basic-RAG/BasicRAG_ingest.doctree
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
# Sphinx build info version 1 | ||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. | ||
config: 58797109ea50b041e451aad9460566a5 | ||
config: 4e9c7fafa68d58ea0265316a26496cf3 | ||
tags: 645f666f9bcd5a90fca523b33c5a78b7 |
Binary file modified (BIN, +0 Bytes, 100%): src/docs/_build/html/_downloads/40ffe2716096f331549183db9c0ece72/Retriver-GUI_jupyter.zip
Binary file not shown.
Binary file modified (BIN, +0 Bytes, 100%): src/docs/_build/html/_downloads/7c6daaeaa6e5520da795fa975d498452/Retriver-GUI_python.zip
Binary file not shown.
Binary file modified (BIN, +315 Bytes, 110%): src/docs/_build/html/_downloads/d30c8b1c6e4654b2ad3d2a98fac0be74/Basic-RAG_python.zip
Binary file not shown.
Binary file modified (BIN, +331 Bytes, 100%): src/docs/_build/html/_downloads/f9939c7be8f2cbb228881fcceb9ea19d/Basic-RAG_jupyter.zip
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,37 +1,37 @@ | ||
LLMs | ||
===== | ||
|
||
GRAG offers two ways to run LLMs locally, | ||
GRAG offers two ways to run LLMs locally: | ||
|
||
1. LlamaCPP | ||
2. HuggingFace | ||
|
||
To run LLMs using HuggingFace | ||
############################# | ||
This is the easiest way to get started but does not offer as much | ||
This is the easiest way to get started, but does not offer as much | ||
flexibility. | ||
If using a config file (*config.ini*), just change the `model_name` to | ||
the HuggingFace repo id. *Note that if the models are gated, make sure to | ||
provide an auth token* | ||
|
||
To run LLMs using LlamaCPP | ||
############################# | ||
Steps to start with llama.cpp: | ||
LlamaCPP requires models in the form of `.gguf` file. You can either download these model files online, | ||
or | ||
|
||
1. Clone the `llama.cpp <https://github.com/ggerganov/llama.cpp>`_ repository. | ||
``git clone https://github.com/ggerganov/llama.cpp.git`` | ||
2. Change directory to `llama.cpp` using `cd llama.cpp` | ||
3. To inference using GPU, which is necessary for most models. | ||
* Make sure you have CUDA installed (check using ``nvcc --version``) | ||
* Follow steps from the `llama.cpp documentation <https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#cublas>`_. | ||
How to quantize models. | ||
************************ | ||
To quantize the model, run: | ||
``python -m grag.quantize.quantize`` | ||
|
||
*Note: While inferencing if model is not utilizing GPU check the `BLAS=1` in the outputs and* | ||
*if it is not then try reinstalling using*:: | ||
After running the above command, user will be prompted with the following: | ||
|
||
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir | ||
1. **Path** where the user wants to clone the `llama.cpp` repo. You can find the repository, `llama.cpp <https://github.com/ggerganov/llama.cpp>`_. | ||
|
||
*or follow the solution provided by* | ||
`this Stack Overflow post <https://stackoverflow.com/questions/76963311/llama-cpp-python-not-using-nvidia-gpu-cuda>`_ | ||
2. Input the **model path**: | ||
|
||
How to quantize models. | ||
************************ | ||
* If user wants to download a model from `HuggingFace <https://huggingface.co/models>`_, the user should provide the repository path from HuggingFace. | ||
|
||
* If the user has the model downloaded locally, then user will be instructed to copy the model and input the name of the model directory. | ||
|
||
3. Finally, the user will be prompted to enter **quantization** settings (recommended Q5_K_M or Q4_K_M, etc.). For more details, check `llama.cpp/examples/quantize/quantize.cpp <https://github.com/ggerganov/llama.cpp/blob/master/examples/quantize/quantize.cpp#L19>`_. |
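
The HuggingFace section of the diff above says to point `model_name` in *config.ini* at a HuggingFace repo id. As a minimal sketch of that edit, assuming the key lives in a section named `llm` (a hypothetical name; the diff does not show the layout of the config file), the standard-library `configparser` module can rewrite the value while leaving other keys in place. The helper names and the repo id used in the usage note are illustrative only, not part of GRAG.

```python
import configparser
import io

# Hypothetical section name: the source only states that config.ini has a
# `model_name` key, not which section it belongs to.
SECTION = "llm"


def set_model_name(config_text: str, repo_id: str) -> str:
    """Return the config text with `model_name` set to the given repo id."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(SECTION):
        parser.add_section(SECTION)
    parser[SECTION]["model_name"] = repo_id
    # Serialize the updated configuration back to INI text.
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def get_model_name(config_text: str) -> str:
    """Read `model_name` back out of the config text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser[SECTION]["model_name"]
```

For example, `set_model_name(open("config.ini").read(), "some-org/some-model")` would return the updated text, which can then be written back to the file. Note that `configparser` drops comments when it rewrites a file, so hand-editing may be preferable for a heavily commented config.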