From 5c75001ef53eea6e4aa0eeeff31fb0f5d6c4b477 Mon Sep 17 00:00:00 2001
From: Matthew Harris
Date: Wed, 10 Jul 2024 12:45:45 -0400
Subject: [PATCH] Documentation updates

---
 .env.example | 51 +++++++++++++++--------------
 README.md    | 91 ++++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 108 insertions(+), 34 deletions(-)

diff --git a/.env.example b/.env.example
index 0a1117e1..80de2759 100644
--- a/.env.example
+++ b/.env.example
@@ -27,17 +27,20 @@ RECIPE_DB_CONN_STRING=postgresql://${POSTGRES_RECIPE_USER}:${POSTGRES_RECIPE_PAS
 #==================================================#
 #                Recipes AI Settings                #
 #==================================================#
-# You can leave these as-is for quick start
+# These control how recipes are retrieved and generated using LLMs.
 #
+# If you are using Azure OpenAI. Note, in Playground in Azure, you can 'View code' to get these
 #RECIPES_OPENAI_API_TYPE=azure
 #RECIPES_OPENAI_API_KEY=
-#RECIPES_OPENAI_API_ENDPOINT=
+#RECIPES_OPENAI_API_ENDPOINT=<your endpoint, ending in .openai.azure.com/>
 #RECIPES_OPENAI_API_VERSION=2024-02-15-preview
-#RECIPES_BASE_URL=
-#RECIPES_MODEL=gpt-4-turbo
+#RECIPES_MODEL=
+#
+# Leave these as-is for quick start
 #RECIPES_OPENAI_TEXT_COMPLETION_DEPLOYMENT_NAME=text-embedding-ada-002
+#RECIPES_BASE_URL=${RECIPES_OPENAI_API_ENDPOINT}
 
-# gpt-4o only available on OpenAI
+# OpenAI example
 RECIPES_OPENAI_API_TYPE=openai
 RECIPES_OPENAI_API_KEY=
 RECIPES_MODEL=gpt-4o
@@ -61,33 +64,33 @@ IMAGE_HOST=http://localhost:3080/images
 #==================================================#
 #                    API Settings                   #
 #==================================================#
-# To get this go to https://hapi.humdata.org/docs#/,
-# select the the encode_identifier endpoint, click the 'Try it out' button,
-# Enter a name and you email and click send. The response will have your token.
-# Note also, the URL for the api is set in ./ingestion/ingestion.config
+# This token is just your encoded email address. To generate it, see the instructions here:
+# https://hdx-hapi.readthedocs.io/en/latest/getting-started/
 HAPI_API_TOKEN=
 
 #==================================================#
 #                 Assistant Settings                #
 #==================================================#
-# Needed when updating an assistant, see assistants/openai_assistants. Leave blank to create new
+# Parameters for the AI assistant used in the chat interface, to serve recipes and carry out
+# on-the-fly analysis
+#
+#
 # If you are using Azure OpenAI. Note, in Playground in Azure, you can 'View code' to get these
 #ASSISTANTS_API_TYPE=azure
-#ASSISTANTS_API_KEY=
-#ASSISTANTS_ID=
-#ASSISTANTS_BASE_URL=
+#ASSISTANTS_API_KEY=
+#ASSISTANTS_ID=
+#ASSISTANTS_BASE_URL=<your endpoint, ending in .openai.azure.com/>
 #ASSISTANTS_API_VERSION=2024-02-15-preview
-#ASSISTANTS_MODEL=gpt4-o
-#ASSISTANTS_BOT_NAME="Humanitarian AI Assistant"
-
+#ASSISTANTS_MODEL=
+#ASSISTANTS_BOT_NAME=
-# OPENAI
-OPENAI_API_KEY=
+# If you are using OpenAI directly (i.e. not Azure)
+ASSISTANTS_API_TYPE=openai
+OPENAI_API_KEY=
 ASSISTANTS_API_KEY=${OPENAI_API_KEY}
-ASSISTANTS_API_TYPE=openai
-ASSISTANTS_ID=
+ASSISTANTS_ID=
+ASSISTANTS_MODEL=
+ASSISTANTS_BOT_NAME=
 ASSISTANTS_BASE_URL=""
-ASSISTANTS_MODEL=gpt-4o
-ASSISTANTS_BOT_NAME="Humanitarian AI Assistant"
 
 #==================================================#
 #                Deployments Settings               #
@@ -106,11 +109,11 @@ RECIPE_SERVER_API=http://server:8080/
 #==================================================#
 #                  Chainlit Settings                #
 #==================================================#
-# Used with Literal.ai to get telemetry and voting, can be left blank for quick start.
+# Used with Literal.ai to get telemetry and voting, can be left blank if running locally
 LITERAL_API_KEY=
 
 # Run "chainlit create-secret" to get this.
-# WARNING!!!! You MUST run this to update the defaults below if deploying online
+# WARNING!!!! These are test values, OK for a quick start. Do not deploy online with these as-is; regenerate them
 CHAINLIT_AUTH_SECRET="1R_FKRaiv0~5bqoQurBx34ctOD8kM%a=YvIx~fVmYLVd>B5vWa>e9rDX?6%^iCOv"
 USER_LOGIN=muppet-data-chef
 USER_PASSWORD=hB%1b36!!8-v
diff --git a/README.md b/README.md
index b545f372..940f8310 100644
--- a/README.md
+++ b/README.md
@@ -41,19 +41,90 @@ This repo contains a docker-compose environment that will run the following comp
 
 # Quick start
 
-1. Copy `.env.example` to `.env` and set variables according to instructions in the file. Most variables be left as-is, but at a minimum you will need to set variables in these sections (see `.env.example` for instructions on how to set them):
-    - API Settings - Needed for ingesting data from data sources
-    - Recipes AI Settings - Set to your LLM deployment accordingly
-    - Assistant Settings - Set to your LLM deployment accordingly
-2. `cd data && python3 download_demo_data.py && cd ..`
-3. `docker compose up -d --build`
-4. `docker compose exec chat python create_update_assistant.py`
-5. Update `.env` file and set ASSISTANTS_ID to the value returned from the previous step
-6. `docker compose up -d`
-7. Go to [http://localhost:8000/](http://localhost:8000/)
+1. Install Docker if you don't have it already, see [here](https://www.docker.com/products/docker-desktop/)
+2. Check out the Data Recipes AI GitHub repo
+Go to the [repo](https://github.com/datakind/data-recipes-ai) on GitHub and click the big green '<> Code' button. This provides a few options: you can download a zip file, or check the code out with git. If you have Git installed, a common method is ...
+`git clone https://github.com/datakind/data-recipes-ai.git`
+
+3. Populate your `.env` file with important settings to get started
+
+First, copy `.env.example` in your repo to `.env` in the same location, then adjust the following variables.
+
+If using **Azure OpenAI**, you will need to set these in your `.env` ...
+
+```
+RECIPES_OPENAI_API_TYPE=azure
+RECIPES_OPENAI_API_KEY=
+RECIPES_OPENAI_API_ENDPOINT=<your endpoint, ending in .openai.azure.com/>
+RECIPES_OPENAI_API_VERSION=
+RECIPES_MODEL=
+
+ASSISTANTS_API_TYPE=azure
+ASSISTANTS_API_KEY=
+ASSISTANTS_ID=
+ASSISTANTS_BASE_URL=<your endpoint, ending in .openai.azure.com/>
+ASSISTANTS_API_VERSION=2024-02-15-preview
+ASSISTANTS_MODEL=
+ASSISTANTS_BOT_NAME=
+
+```
+
+Note: In Azure Playground, you can view code for your assistant, which provides most of the variables above
+
+If using **OpenAI directly**, you will instead need to set these ...
+
+```
+RECIPES_OPENAI_API_TYPE=openai
+RECIPES_OPENAI_API_KEY=
+RECIPES_MODEL=
+RECIPES_OPENAI_TEXT_COMPLETION_DEPLOYMENT_NAME=text-embedding-ada-002
+
+ASSISTANTS_API_TYPE=openai
+OPENAI_API_KEY=
+ASSISTANTS_API_KEY=${OPENAI_API_KEY}
+ASSISTANTS_ID=
+ASSISTANTS_MODEL=
+ASSISTANTS_BOT_NAME=
+```
+
+Not needed for the quick start, but if you want to run data ingestion with the new HDX API, you will also need to set ...
+
+`HAPI_API_TOKEN=`
+
+4. Download sample Humanitarian Data Exchange (HDX) API data
+
+For a quick start, we have prepared a sample dataset extracted from the new [HDX API](https://hdx-hapi.readthedocs.io/en/latest/). You can also run the ingestion yourself (see below), but this demo file should get you started quickly.
+
+From [this Google folder](https://drive.google.com/drive/folders/1E4G9HM-QzxdXVNkgP3fQXsuNcABWzdus?usp=sharing), download the file starting with 'datadb' and save it into the 'data' folder of your repo.
+
+Note: If you use Python, you can also download this file by running the following in the top directory of your checked-out repo: `pip3 install gdown && cd data && python3 download_demo_data.py && cd ..`
+
+5. Start your environment
+
+`docker compose up -d --build`
+
+6. If you don't have one already, create an AI Assistant on OpenAI (or Azure OpenAI)
+
+Data Recipes AI uses OpenAI-style assistants, which support running code and searching user-supplied data. We have provided a script that does everything for you automatically.
+
+In a terminal, navigate to the repo top folder and run `docker compose exec chat python create_update_assistant.py`
+
+Make a note of the assistant ID, then edit your `.env` file and use it to set the `ASSISTANTS_ID` variable.
+
+Note: (i) If you rerun `create_update_assistant.py` once `ASSISTANTS_ID` is set, the script will update the assistant rather than create a new one; (ii) you can also add your own data (PDF, DOCX, CSV, XLSX files) for the assistant to use; see the section 'Adding your own files for the assistant to analyze' below.
+
+7. Restart so the assistant ID is picked up: `docker compose up -d`
+
+8. Go to [http://localhost:8000/](http://localhost:8000/) and sign in using the values in your `.env` file for `USER_LOGIN` and `USER_PASSWORD`
+
+The steps above are mostly one-time. Going forward, you only need to stop and start the environment as follows:
+
+- To stop the environment: `docker compose stop`
+- To start the environment: `docker compose up -d`, then go to [http://localhost:8000/](http://localhost:8000/)
+- To start with a rebuild: `docker compose up -d --build`
 
 ## Using Recipes
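Taken together, the quick start above boils down to roughly the following terminal session. This is a sketch, not part of the patch itself: it assumes a bash-like shell with Docker and git already installed, uses `cp` for the copy step, and leaves the two `.env` edits (the LLM settings, then `ASSISTANTS_ID`) as manual steps in your editor.

```
# Quick-start recap (sketch): assumes bash, with Docker and git installed
git clone https://github.com/datakind/data-recipes-ai.git
cd data-recipes-ai

# Copy the example settings, then edit .env to set the Recipes AI / Assistant variables above
cp .env.example .env

# Build and start the environment
docker compose up -d --build

# Create (or later update) the OpenAI assistant and note the ID it prints
docker compose exec chat python create_update_assistant.py

# Put that ID into .env as ASSISTANTS_ID, then restart so it is picked up
docker compose up -d

# Sign in at http://localhost:8000/ with USER_LOGIN and USER_PASSWORD from .env
```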