diff --git a/packages/backend/src/ai.json b/packages/backend/src/ai.json
index 7b0dab948..0a0eded9a 100644
--- a/packages/backend/src/ai.json
+++ b/packages/backend/src/ai.json
@@ -13,7 +13,8 @@
       "config": "chatbot/ai-studio.yaml",
      "readme": "# Locallm\n\nThis repo contains artifacts that can be used to build and run LLM (Large Language Model) services locally on your Mac using Podman. These containerized LLM services help developers quickly prototype new LLM-based applications without relying on any externally hosted services. Since they are already containerized, they also help developers move from prototype to production more quickly.\n\n## Current Locallm Services:\n\n* [Chatbot](#chatbot)\n* [Text Summarization](#text-summarization)\n* [Fine-tuning](#fine-tuning)\n\n### Chatbot\n\nA simple chatbot using the Gradio UI. Learn how to build and run this model service here: [Chatbot](/chatbot/).\n\n### Text Summarization\n\nAn LLM app that can summarize arbitrarily long text inputs. Learn how to build and run this model service here: [Text Summarization](/summarizer/).\n\n### Fine-tuning\n\nThis application allows a user to select a model and a dataset they'd like to fine-tune that model on. Once the application finishes, it outputs a new fine-tuned model for the user to apply to other LLM services. Learn how to build and run this model training job here: [Fine-tuning](/finetune/).\n\n## Architecture\n![Architecture diagram](https://raw.githubusercontent.com/MichaelClifford/locallm/main/assets/arch.jpg)\n\nThe diagram above shows the general architecture for each of the individual model services contained in this repo. The core code available here is the \"LLM Task Service\" and the \"API Server\", bundled together under `model_services`. With an appropriately chosen model downloaded onto your host, `model_services/builds` contains the Containerfiles required to build an ARM or an x86 (with CUDA) image, depending on your needs. These model services are intended to be lightweight and run with smaller hardware footprints (hence the Locallm name), but they can be run on any hardware that supports containers and scaled up if needed.\n\nWe also provide demo \"AI Applications\" under `ai_applications` for each model service as an example of how developers could interact with the model service for their own needs.",
       "models": [
-        "llama-2-7b-chat.Q5_K_S"
+        "llama-2-7b-chat.Q5_K_S",
+        "mistral-7b-instruct-v0.1.Q4_K_M"
       ]
     },
     {
@@ -29,7 +30,8 @@
       "config": "summarizer/ai-studio.yaml",
      "readme": "# Summarizer\n\nThis model service is intended to be used for text summarization tasks. It can ingest an arbitrarily long text input. If the input is shorter than the model's maximum context window, it is summarized directly. If the input is longer than the maximum context window, it is divided into appropriately sized chunks; each chunk is summarized, and a final \"summary of summaries\" is produced as the service's final output.",
       "models": [
-        "llama-2-7b-chat.Q5_K_S"
+        "llama-2-7b-chat.Q5_K_S",
+        "mistral-7b-instruct-v0.1.Q4_K_M"
       ]
     }
   ],
@@ -43,6 +45,16 @@
       "popularity": 3,
       "license": "?",
       "url": "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf"
+    },
+    {
+      "id": "mistral-7b-instruct-v0.1.Q4_K_M",
+      "name": "Mistral-7B-Instruct-v0.1-GGUF",
+      "description": "The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) generative text model, fine-tuned using a variety of publicly available conversation datasets. For full details of this model, please read the [release blog post](https://mistral.ai/news/announcing-mistral-7b/).",
+      "hw": "CPU",
+      "registry": "Hugging Face",
+      "popularity": 3,
+      "license": "Apache-2.0",
+      "url": "https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf"
     }
   ],
   "categories": [