From a9cf2fef327c0ea7512691b99f587c635e875476 Mon Sep 17 00:00:00 2001
From: Arjun Bingly
Date: Sat, 27 Apr 2024 11:01:03 -0400
Subject: [PATCH 1/6] Update docs get_started.installations with pip and gpu

---
 src/docs/get_started.installation.rst | 126 +++++++++++++++++++++++++-
 1 file changed, 124 insertions(+), 2 deletions(-)

diff --git a/src/docs/get_started.installation.rst b/src/docs/get_started.installation.rst
index ab8d682..0a2971b 100644
--- a/src/docs/get_started.installation.rst
+++ b/src/docs/get_started.installation.rst
@@ -1,10 +1,132 @@
 Installation
 ===============
-*Since we are in the development phase, we have not published to pypi yet.*
+
+To install the package
+
+* ``pip install grag``
+
+
+Since this package is still under development, install from source to check out the latest features:
 
 * ``git clone`` the repository
 * ``pip install .`` from the repository
 * *For Developers*: ``pip install -e .``
 
-Further customization can be made on the config file, `src/config.ini`.
+GPU and Hardware acceleration support
+-----------------------------------------
+
+GRAG uses ``llama.cpp`` to run LLM inference locally. It supports a number of hardware acceleration backends to speed up
+inference as well as backend-specific options. See the
+`llama.cpp README <https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#blas-build>`_ for a full list.
+
+Below are some of the supported backends.
+
+* Note that the instructions below are tailored for Linux and macOS users; Windows users should add ``$env:`` before
+  defining environment variables.
+
+.. code-block:: console
+    $env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
+    pip install llama-cpp-python
+
+#. OpenBLAS (CPU)
+
+To install with OpenBLAS, set the `LLAMA_BLAS` and `LLAMA_BLAS_VENDOR` environment variables before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
+    pip install llama-cpp-python
+
+
+#. CUDA (Nvidia-GPU)
+
+To install with CUDA support, set the `LLAMA_CUDA=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_CUDA=on"
+    pip install grag
+
+
+#. Metal (macOS)
+
+To install with Metal (MPS), set the `LLAMA_METAL=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_METAL=on"
+    pip install grag
+
+
+#. CLBlast (OpenCL)
+
+To install with CLBlast, set the `LLAMA_CLBLAST=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_CLBLAST=on"
+    pip install grag
+
+
+#. hipBLAS (AMD ROCm)
+
+To install with hipBLAS / ROCm support for AMD cards, set the `LLAMA_HIPBLAS=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_HIPBLAS=on"
+    pip install grag
+
+
+#. Vulkan
+
+To install with Vulkan support, set the `LLAMA_VULKAN=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_VULKAN=on"
+    pip install grag
+
+
+#. Kompute
+
+To install with Kompute support, set the `LLAMA_KOMPUTE=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_KOMPUTE=on"
+    pip install grag
+
+
+#. SYCL
+
+To install with SYCL support, set the `LLAMA_SYCL=on` environment variable before installing:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
+    pip install grag
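+
+In POSIX shells, the build flags can also be passed inline, which scopes the variable to that single command. A
+minimal sketch, using the CUDA flag as an example:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_CUDA=on" pip install grag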
+
+
+For more details and troubleshooting, please refer to `llama-cpp-python <https://llama-cpp-python.readthedocs.io/en/latest/>`_.
+
+Upgrading and Reinstalling
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+If you want to change hardware acceleration support, or did not originally install with hardware acceleration
+support, simply rebuild ``llama-cpp-python`` using the instructions below.
+
+To upgrade and rebuild ``llama-cpp-python``, add the ``--upgrade --force-reinstall --no-cache-dir``
+flags to the ``pip install`` command along with the necessary environment variables listed above
+to ensure the package is rebuilt from source.
+
+Example usage for reinstalling with CUDA support:
+
+.. code-block:: console
+
+    CMAKE_ARGS="-DLLAMA_CUDA=on"
+    pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
+
+
+
+*Note that one does not have to reinstall the grag package.*

From 09bd6c38049a23b8a58d0a9fb327ba66bb2df31e Mon Sep 17 00:00:00 2001
From: Arjun Bingly
Date: Sat, 27 Apr 2024 11:52:31 -0400
Subject: [PATCH 2/6] Update docs get_started.installations venv

---
 src/docs/get_started.installation.rst | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/docs/get_started.installation.rst b/src/docs/get_started.installation.rst
index 0a2971b..193f7b4 100644
--- a/src/docs/get_started.installation.rst
+++ b/src/docs/get_started.installation.rst
@@ -1,10 +1,25 @@
 Installation
 ===============
 
-To install the package
+Virtual Environment
+^^^^^^^^^^^^^^^^^^^^
+
+We strongly recommend using a virtual environment for installing the package.
+
+Follow the instructions below to create and activate a virtual environment.
+
+* ``python -m venv .gragvenv``
+* ``source .gragvenv/bin/activate``
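+
+On Windows, the activation script lives under ``Scripts`` instead of ``bin``, so the equivalent command would be:
+
+.. code-block:: console
+
+    .gragvenv\Scripts\activate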
+
+Install from pip
+^^^^^^^^^^^^^^^^^^
+
+To install the package from pip:
 
 * ``pip install grag``
 
+Install from git
+^^^^^^^^^^^^^^^^^
 
 Since this package is still under development, install from source to check out the latest features:
 
 * ``git clone`` the repository
 * ``pip install .`` from the repository
 * *For Developers*: ``pip install -e .``
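+
+For example (repository URL and directory left as placeholders):
+
+.. code-block:: console
+
+    git clone <repository-url>
+    cd <repository-directory>
+    pip install .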
 
 
 GPU and Hardware acceleration support
------------------------------------------
+--------------------------------------
 
 GRAG uses ``llama.cpp`` to run LLM inference locally. It supports a number of hardware acceleration backends to speed up
 inference as well as backend-specific options. See the
@@ -128,5 +143,4 @@ Example usage for reinstalling with CUDA support:
     pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
 
 
-
 *Note that one does not have to reinstall the grag package.*

From 10084eef1e88501069bc09b36e5682bebd927a79 Mon Sep 17 00:00:00 2001
From: Arjun Bingly
Date: Sun, 28 Apr 2024 12:10:38 -0400
Subject: [PATCH 3/6] Add doc builds to gitignore

---
 .gitignore | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.gitignore b/.gitignore
index 7469dfb..3592035 100644
--- a/.gitignore
+++ b/.gitignore
@@ -167,3 +167,6 @@ cython_debug/
 **/others
 data/
 full_report/Latex_report/Capstone5_report_v2.tex
+
+# Docs
+src/docs/_*

From f25e50d2c77b1ee8499eb2d9f106087b89bec0d8 Mon Sep 17 00:00:00 2001
From: Arjun Bingly
Date: Sun, 28 Apr 2024 12:26:11 -0400
Subject: [PATCH 4/6] Update docs get_started.installations console codeblock

---
 src/docs/get_started.installation.rst | 51 ++++++++++++++------------
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/src/docs/get_started.installation.rst b/src/docs/get_started.installation.rst
index 193f7b4..99722d4 100644
--- a/src/docs/get_started.installation.rst
+++ b/src/docs/get_started.installation.rst
@@ -29,7 +29,7 @@ Since this package is still under development, install from source to check out
 
 
 GPU and Hardware acceleration support
---------------------------------------
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 GRAG uses ``llama.cpp`` to run LLM inference locally. It supports a number of hardware acceleration backends to speed up
 inference as well as backend-specific options. See the
@@ -41,84 +41,85 @@ Below are some of the supported backends.
   defining environment variables.
 
 .. code-block:: console
+
     $env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
     pip install llama-cpp-python
 
-#. OpenBLAS (CPU)
+**1. OpenBLAS (CPU)**
 
-To install with OpenBLAS, set the `LLAMA_BLAS` and `LLAMA_BLAS_VENDOR` environment variables before installing:
+To install with OpenBLAS, set the ``LLAMA_BLAS`` and ``LLAMA_BLAS_VENDOR`` environment variables before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
     pip install llama-cpp-python
 
 
-#. CUDA (Nvidia-GPU)
+**2. CUDA (Nvidia-GPU)**
 
-To install with CUDA support, set the `LLAMA_CUDA=on` environment variable before installing:
+To install with CUDA support, set the ``LLAMA_CUDA=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_CUDA=on"
     pip install grag
 
 
-#. Metal (macOS)
+**3. Metal (macOS)**
 
-To install with Metal (MPS), set the `LLAMA_METAL=on` environment variable before installing:
+To install with Metal (MPS), set the ``LLAMA_METAL=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_METAL=on"
     pip install grag
 
 
-#. CLBlast (OpenCL)
+**4. CLBlast (OpenCL)**
 
-To install with CLBlast, set the `LLAMA_CLBLAST=on` environment variable before installing:
+To install with CLBlast, set the ``LLAMA_CLBLAST=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_CLBLAST=on"
     pip install grag
 
 
-#. hipBLAS (AMD ROCm)
+**5. hipBLAS (AMD ROCm)**
 
-To install with hipBLAS / ROCm support for AMD cards, set the `LLAMA_HIPBLAS=on` environment variable before installing:
+To install with hipBLAS / ROCm support for AMD cards, set the ``LLAMA_HIPBLAS=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_HIPBLAS=on"
     pip install grag
 
 
-#. Vulkan
+**6. Vulkan**
 
-To install with Vulkan support, set the `LLAMA_VULKAN=on` environment variable before installing:
+To install with Vulkan support, set the ``LLAMA_VULKAN=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_VULKAN=on"
     pip install grag
 
 
-#. Kompute
+**7. Kompute**
 
-To install with Kompute support, set the `LLAMA_KOMPUTE=on` environment variable before installing:
+To install with Kompute support, set the ``LLAMA_KOMPUTE=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_KOMPUTE=on"
     pip install grag
 
 
-#. SYCL
+**8. SYCL**
 
-To install with SYCL support, set the `LLAMA_SYCL=on` environment variable before installing:
+To install with SYCL support, set the ``LLAMA_SYCL=on`` environment variable before installing:
 
-.. code-block:: console
+.. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
     pip install grag

From 40812218bfa3f47d5f6461d6d0adb2f6bcbcac9d Mon Sep 17 00:00:00 2001
From: Sanchit Vijay
Date: Mon, 29 Apr 2024 13:51:29 -0400
Subject: [PATCH 5/6] Update get_started.installation.rst

---
 src/docs/get_started.installation.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/docs/get_started.installation.rst b/src/docs/get_started.installation.rst
index 99722d4..176dccd 100644
--- a/src/docs/get_started.installation.rst
+++ b/src/docs/get_started.installation.rst
@@ -43,7 +43,7 @@ Below are some of the supported backends.
 .. code-block:: console
 
     $env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
-    pip install llama-cpp-python
+    pip install grag
 
 **1. OpenBLAS (CPU)**
 
@@ -52,7 +52,7 @@ To install with OpenBLAS, set the ``LLAMA_BLAS`` and ``LLAMA_BLAS_VENDOR`` envir
 .. code-block:: bash
 
     CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
-    pip install llama-cpp-python
+    pip install grag
 
 **2. CUDA (Nvidia-GPU)**

From ffda398da5d5da76fe09bd8644e9123fffe41c0e Mon Sep 17 00:00:00 2001
From: Arjun Bingly
Date: Tue, 30 Apr 2024 19:13:00 -0400
Subject: [PATCH 6/6] export CMAKE args

---
 src/docs/get_started.installation.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/docs/get_started.installation.rst b/src/docs/get_started.installation.rst
index 176dccd..0a7d4a0 100644
--- a/src/docs/get_started.installation.rst
+++ b/src/docs/get_started.installation.rst
@@ -51,7 +51,7 @@ To install with OpenBLAS, set the ``LLAMA_BLAS`` and ``LLAMA_BLAS_VENDOR`` envir
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
+    export CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
     pip install grag
 
@@ -61,7 +61,7 @@ To install with CUDA support, set the ``LLAMA_CUDA=on`` environment variable bef
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_CUDA=on"
+    export CMAKE_ARGS="-DLLAMA_CUDA=on"
     pip install grag
 
@@ -71,7 +71,7 @@ To install with Metal (MPS), set the ``LLAMA_METAL=on`` environment variable bef
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_METAL=on"
+    export CMAKE_ARGS="-DLLAMA_METAL=on"
     pip install grag
 
@@ -81,7 +81,7 @@ To install with CLBlast, set the ``LLAMA_CLBLAST=on`` environment variable befor
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_CLBLAST=on"
+    export CMAKE_ARGS="-DLLAMA_CLBLAST=on"
     pip install grag
 
@@ -91,7 +91,7 @@ To install with hipBLAS / ROCm support for AMD cards, set the ``LLAMA_HIPBLAS=on
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_HIPBLAS=on"
+    export CMAKE_ARGS="-DLLAMA_HIPBLAS=on"
     pip install grag
 
@@ -101,7 +101,7 @@ To install with Vulkan support, set the ``LLAMA_VULKAN=on`` environment variable
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_VULKAN=on"
+    export CMAKE_ARGS="-DLLAMA_VULKAN=on"
     pip install grag
 
@@ -111,7 +111,7 @@ To install with Kompute support, set the ``LLAMA_KOMPUTE=on`` environment variab
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_KOMPUTE=on"
+    export CMAKE_ARGS="-DLLAMA_KOMPUTE=on"
     pip install grag
 
@@ -121,7 +121,7 @@ To install with SYCL support, set the ``LLAMA_SYCL=on`` environment variable bef
 
 .. code-block:: bash
 
-    CMAKE_ARGS="-DLLAMA_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
+    export CMAKE_ARGS="-DLLAMA_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
     pip install grag
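+
+To verify that the chosen backend was actually compiled in, a quick smoke test along these lines should work (the
+model path below is a placeholder; any local GGUF model will do). With ``verbose=True``, llama.cpp prints its backend
+and layer-offload information at load time, and ``n_gpu_layers=-1`` requests that all layers be offloaded:
+
+.. code-block:: console
+
+    python -c "from llama_cpp import Llama; Llama(model_path='path/to/model.gguf', n_gpu_layers=-1, verbose=True)"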