Added initial cleaned code
KaranrajM committed Apr 11, 2023
1 parent 14ccff7 commit 3fc65f8
Showing 16 changed files with 1,595 additions and 12 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -0,0 +1,7 @@
.env
venv/
.DS_Store
.idea/
__pycache__/
.pytest_cache
gcp_credentials.json
24 changes: 24 additions & 0 deletions Dockerfile
@@ -0,0 +1,24 @@
FROM continuumio/anaconda3
WORKDIR /root
RUN apt-get update && apt-get install -y curl file
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH=$PATH:/root/.cargo/bin
ENV GOOGLE_APPLICATION_CREDENTIALS=gcp_credentials.json
RUN apt-get install -y build-essential
RUN wget --no-check-certificate https://dl.xpdfreader.com/xpdf-tools-linux-4.04.tar.gz && \
tar -xvf xpdf-tools-linux-4.04.tar.gz && cp xpdf-tools-linux-4.04/bin64/pdftotext /usr/local/bin
RUN apt-get install ffmpeg -y
COPY requirements-prod.txt /root/
RUN pip3 install -r requirements-prod.txt
COPY gcp_credentials.json /root/
COPY ./main.py /root/
COPY ./query_with_gptindex.py /root/
COPY ./cloud_storage.py /root/
COPY ./query_with_langchain.py /root/
COPY ./io_processing.py /root/
COPY ./translator.py /root/
COPY ./database_functions.py /root/
COPY ./query_with_tfidf.py /root/
COPY ./Titles.csv /root/
EXPOSE 8000
COPY script.sh /root/
ENTRYPOINT ["bash","script.sh"]
84 changes: 72 additions & 12 deletions README.md
@@ -1,22 +1,82 @@
![](https://33333.cdn.cke-cs.com/kSW7V9NHUXugvhoQeFaf/images/97c89d65af49e975df2b478d364bbf193c67647202428ab7.png)
# Jugalbandi API: Factual Question Answering over an arbitrary number of documents

## Jugalbandi API
[Jugalbandi API](https://api.jugalbandi.ai/docs) is a system of APIs that allows users to build Q&A-style applications on their private and public datasets. The system exposes OpenAPI 3.0 endpoints using FastAPI.

Jugalbandi API is a system of APIs that allows users to build Q&A style applications on their private and public datasets. The system creates Open API 3.0 specification endpoints using FastAPI.

---
# 🔧 1. Installation

## How to use?
To use the code, you need to follow these steps:

To use the Jugalbandi APIs, follow the steps below to get started:
1. Clone the repository from GitHub:

```bash
git clone git@github.com:OpenNyAI/jugalbandi-api.git
```

1. Visit [https://api.jugalbandi.ai/docs](https://api.jugalbandi.ai/docs)
2. Scroll to the `upload-file` endpoint to upload the document.
3. Once you have uploaded the file(s), you will receive a `uuid` number for that document set. Keep this number handy, as you will need it to query the document set.
4. With the `uuid` in hand, scroll up and select the query endpoint you want to use. We currently support three implementations: `query_using_gptindex`, `query_with_langchain` (recommended), and `query_using_voice` (recommended for voice interfaces). You can use any of them, but we are constantly refining our LangChain implementation.
5. Pass the `uuid` number and run the query.
2. The code requires **Python 3.7 or higher** and some additional Python packages. To install these packages, run the following command in your terminal:

```bash
pip install -r requirements-dev.txt
```

3. You will need a GCP account to store the uploaded documents and indices in a bucket, and to host a PostgreSQL database that stores the API logs.

4. Navigate to the repository directory. Create a file named **gcp_credentials.json** containing the service account credentials of your GCP account. The file should have roughly the following format:

```json
{
"type": "service_account",
"project_id": "<your-project-id>",
"private_key_id": "<your-private-key-id>",
"private_key": "<your-private-key>",
"client_email": "<your-client-email>",
"client_id": "<your-client-id>",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "<your-client-cert-url>"
}
```

5. In addition to the gcp_credentials.json file, create a file named **.env** to hold the development credentials, and add the following variables. Set the OpenAI API key, the path to the gcp_credentials.json file, the GCP bucket name, and your database connection details appropriately.

```bash
OPENAI_API_KEY=<your_openai_api_key>
GOOGLE_APPLICATION_CREDENTIALS=<path-to-gcp_credentials.json>
BUCKET_NAME=<your_gcp_bucket_name>
DATABASE_NAME=<your_db_name>
DATABASE_USERNAME=<your_db_username>
DATABASE_PASSWORD=<your_db_password>
DATABASE_IP=<your_db_public_ip>
DATABASE_PORT=5432
```
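
The variables listed above can be checked before the server starts. The following is a minimal sketch and not part of this commit: the helper name `missing_settings` is hypothetical, and it assumes the `.env` values have already been loaded into the process environment by whatever mechanism you use.

```python
import os

# Variable names copied from the .env example above.
REQUIRED_SETTINGS = (
    "OPENAI_API_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "BUCKET_NAME",
    "DATABASE_NAME",
    "DATABASE_USERNAME",
    "DATABASE_PASSWORD",
    "DATABASE_IP",
    "DATABASE_PORT",
)


def missing_settings(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper: pass a mapping to check it instead of os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]
```

Calling `missing_settings()` with no arguments inspects the current environment; an empty list means every variable from the example is present, and any returned names can be reported before running `uvicorn`.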

# 🏃🏻 2. Running

Once the installation steps above are complete, run the following command in a terminal from the root directory of the repository:

```bash
uvicorn main:app
```

# 🚀 3. Deployment

This repository comes with a Dockerfile. You can use this Dockerfile to deploy your version of the application to Cloud Run.
Update the Dockerfile to reflect any changes you make. (Note: the given Dockerfile deploys the base code without errors, provided you add the required environment variables (those listed in the .env file) to either the Dockerfile or the Cloud Run revision.)

# 👩‍💻 4. Usage

To use the Jugalbandi APIs directly without cloning the repo, follow these steps:

1. Visit [https://api.jugalbandi.ai/docs](https://api.jugalbandi.ai/docs).
2. Scroll to the `/upload-files` endpoint to upload the documents.
3. Once you have uploaded the file(s), you will receive a `uuid number` for that document set. Keep this number handy, as you will need it to query the document set.
4. With the `uuid number` in hand, scroll up and select the query endpoint you want to use. We currently support three implementations: `query-with-gptindex`, `query-with-langchain` (recommended), and `query-using-voice` (recommended for voice interfaces). You can use any of them, but we are constantly refining our LangChain implementation.
5. Use the `uuid number` to run the query.
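
The query step can also be scripted rather than run from the docs page. The sketch below only builds the request URL and makes several assumptions that the README does not confirm: that the query endpoints accept a GET request, and that the query parameters are named `uuid_number` and `query_string`. Both names and the helper `build_query_url` are hypothetical; check the `/docs` page for the actual names before relying on this.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.jugalbandi.ai"


def build_query_url(endpoint, uuid_number, query_string, base_url=BASE_URL):
    """Build a query URL for one of the query endpoints (hypothetical helper).

    ASSUMPTION: the parameter names "uuid_number" and "query_string" are
    placeholders, not taken from the API; confirm them on the /docs page.
    """
    params = urlencode({"uuid_number": uuid_number, "query_string": query_string})
    return f"{base_url}/{endpoint.strip('/')}?{params}"


url = build_query_url("query-with-langchain", "<your-uuid-number>", "What is this document about?")
```

The resulting URL can then be passed to any HTTP client. Keeping URL construction separate from the request makes it easy to switch between the three query endpoints by changing a single argument.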

## Feature request and contribution

* We are currently in the alpha stage and welcome all the input, feedback and contributions we can get.
* You should visit our project board to see what we are prioritizing.
* Kindly visit our project board to see what we are prioritizing.

