Commit d58410a

update mistral model to K_M and image updates to ai-lab
Signed-off-by: sallyom <[email protected]>
sallyom committed Mar 29, 2024
1 parent e68db5d commit d58410a
Showing 15 changed files with 31 additions and 30 deletions.
.github/workflows/model_servers.yaml (2 changes: 1 addition & 1 deletion)

@@ -49,7 +49,7 @@ jobs:

      - name: Download model
        working-directory: ./model_servers/llamacpp_python/
-       run: make llama-2-7b-chat.Q5_K_S.gguf
+       run: make mistral-7b-instruct-v0.1.Q4_K_M.gguf

      - name: Set up Python
        uses: actions/setup-python@…
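To reproduce this CI step locally, the Makefile targets touched later in this commit can be chained; a minimal sketch (the `install` and `test` targets appear in the Makefile diff below):

```bash
# mirror the workflow: fetch the model, then set up and run the tests
cd model_servers/llamacpp_python/
make mistral-7b-instruct-v0.1.Q4_K_M.gguf   # same target the "Download model" step runs
make install                                # pip install -r tests/requirements-test.txt
make test
```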
ai-lab-recipes-images.md (10 changes: 4 additions & 6 deletions)

@@ -1,8 +1,8 @@
## Images (x86_64, aarch64) currently built from GH Actions in this repository

-- quay.io/redhat-et/locallm-model-service:latest
+- quay.io/ai-lab/llamacpp-python:latest
- quay.io/redhat-et/locallm-text-summarizer:latest
-- quay.io/redhat-et/locallm-chatbot:latest
+- quay.io/ai-lab/chatbot:latest
- quay.io/redhat-et/locallm-rag:latest
- quay.io/redhat-et/locallm-codegen:latest
- quay.io/redhat-et/locallm-chromadb:latest
@@ -11,9 +11,7 @@

## Model Images (x86_64, aarch64) currently in `quay.io/redhat-et/locallm-*`

-- quay.io/redhat-et/locallm-llama-2-7b:latest
-  - [model download link](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf)
-- quay.io/redhat-et/locallm-mistral-7b-gguf:latest
-  - [model download link](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf)
+- quay.io/ai-lab/mistral-7b-instruct:latest
+  - [model download link](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf)
- quay.io/redhat-et/locallm-codellama-7b-gguf:latest
  - [model download link](https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf)
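The model images above package only the .gguf file (see the models/Containerfile diff below), so a model can be copied out of an image without running a server; a sketch, where the container name and the explicit `sh` command are illustrative:

```bash
# extract the packaged model file from the image
podman create --name model-tmp quay.io/ai-lab/mistral-7b-instruct:latest sh
podman cp model-tmp:/model/mistral-7b-instruct-v0.1.Q4_K_M.gguf .
podman rm model-tmp
```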
model_servers/llamacpp_python/Makefile (5 changes: 4 additions & 1 deletion)

@@ -5,13 +5,16 @@ build:
llama-2-7b-chat.Q5_K_S.gguf:
	curl -s -S -L -f https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf -z $@ -o $@.tmp && mv -f $@.tmp $@ 2>/dev/null || rm -f $@.tmp $@

+mistral-7b-instruct-v0.1.Q4_K_M.gguf:
+	curl -s -S -L -f https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf -z $@ -o $@.tmp && mv -f $@.tmp $@ 2>/dev/null || rm -f $@.tmp $@
+
.PHONY: install
install:
	pip install -r tests/requirements-test.txt

.PHONY: run
run:
-	podman run -it -d -p 8001:8001 -v ./models:/locallm/models:ro,Z -e MODEL_PATH=models/llama-2-7b-chat.Q5_K_S.gguf -e HOST=0.0.0.0 -e PORT=8001 --net=host ghcr.io/redhat-et/model_servers
+	podman run -it -d -p 8001:8001 -v ./models:/locallm/models:ro,Z -e MODEL_PATH=models/mistral-7b-instruct-v0.1.Q4_K_M.gguf -e HOST=0.0.0.0 -e PORT=8001 --net=host ghcr.io/redhat-et/model_servers

.PHONY: test
test:
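The new target copies the download idiom from the llama target above: `-z $@` makes curl skip the transfer when the local file is already up to date, and writing to `$@.tmp` before `mv`-ing into place keeps a failed download from leaving a truncated model file where the server would try to load it. The same idiom as a standalone script, a sketch using the URL from this commit:

```bash
#!/usr/bin/env bash
set -euo pipefail

url="https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf"
out="mistral-7b-instruct-v0.1.Q4_K_M.gguf"

# only re-download when the remote copy is newer than an existing local one
zflag=()
[ -f "$out" ] && zflag=(-z "$out")

# -sS: quiet but still report errors; -L: follow redirects; -f: fail on HTTP errors
if curl -sS -L -f "${zflag[@]}" -o "$out.tmp" "$url"; then
    # an up-to-date local copy yields no (or an empty) body; keep the old file then
    if [ -s "$out.tmp" ]; then mv -f "$out.tmp" "$out"; else rm -f "$out.tmp"; fi
else
    rm -f "$out.tmp"
    exit 1
fi
```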
model_servers/llamacpp_python/README.md (4 changes: 2 additions & 2 deletions)

@@ -20,7 +20,7 @@ At the time of this writing, 2 models are known to work with this service
- **Llama2-7b**
- Download URL: [https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf)
- **Mistral-7b**
-  - Download URL: [https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf)
+  - Download URL: [https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf)

```bash
cd ../models
@@ -29,7 +29,7 @@ cd ../
```
or
```bash
-make -f Makefile models/llama-2-7b-chat.Q5_K_S.gguf
+make -f Makefile models/mistral-7b-instruct-v0.1.Q4_K_M.gguf
```

### Deploy Model Service
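Once the service is up (for example via the `run` target above), a quick smoke test; a sketch assuming the llama-cpp-python server's OpenAI-compatible endpoints on the port 8001 used throughout this commit:

```bash
# list the model the server has loaded
curl -sS http://localhost:8001/v1/models

# minimal chat completion request
curl -sS http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello."}]}'
```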
model_servers/llamacpp_python/tests/conftest.py (2 changes: 1 addition & 1 deletion)

@@ -12,7 +12,7 @@
)
],
extra_environment_variables={
"MODEL_PATH": "models/llama-2-7b-chat.Q5_K_S.gguf",
"MODEL_PATH": "models/mistral-7b-instruct-v0.1.Q4_K_M.gguf",
"HOST": "0.0.0.0",
"PORT": "8001"
},
model_servers/llamacpp_python/tooling_options.ipynb (2 changes: 1 addition & 1 deletion)

@@ -23,7 +23,7 @@
"This notebook assumes that the playground image is running locally. Once built, you can use the below to start the model service image. \n",
"\n",
"```bash\n",
"podman run -it -p 8000:8000 -v <YOUR-LOCAL-PATH>/locallm/models:/locallm/models:Z -e MODEL_PATH=models/llama-2-7b-chat.Q5_K_S.gguf playground\n",
"podman run -it -p 8000:8000 -v <YOUR-LOCAL-PATH>/locallm/models:/locallm/models:Z -e MODEL_PATH=models/mistral-7b-instruct-v0.1.Q4_K_M.gguf playground\n",
"```"
]
},
models/Containerfile (4 changes: 2 additions & 2 deletions)

@@ -1,9 +1,9 @@
#https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf
-#https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf
+#https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf
#https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf
#https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
# podman build --build-arg MODEL_URL=https://... -t quay.io/yourimage .
FROM registry.access.redhat.com/ubi9/ubi-micro:9.3-13
-ARG MODEL_URL
+ARG MODEL_URL=https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_M.gguf
WORKDIR /model
ADD $MODEL_URL .
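With `MODEL_URL` now defaulting to the Q4_K_M Mistral file, a plain build bakes that model into the image, and the build-arg still swaps in any of the URLs from the header comments. A sketch (tags follow ai-lab-recipes-images.md; the build context is assumed to be the repo root):

```bash
# default build: packages mistral-7b-instruct-v0.1.Q4_K_M.gguf
podman build -t quay.io/ai-lab/mistral-7b-instruct:latest models/

# override: package the CodeLlama model instead
podman build \
  --build-arg MODEL_URL=https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf \
  -t quay.io/redhat-et/locallm-codellama-7b-gguf:latest \
  models/
```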
recipes/natural_language_processing/chatbot/ai-lab.yaml (4 changes: 2 additions & 2 deletions)

@@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
-image: quay.io/redhat-et/locallm-model-service:latest
+image: quay.io/ai-lab/llamacpp-python:latest
- name: streamlit-chat-app
contextdir: .
containerfile: builds/Containerfile
@@ -24,4 +24,4 @@ application:
- amd64
ports:
- 8501
-image: quay.io/redhat-et/locallm-chatbot:latest
+image: quay.io/ai-lab/chatbot:latest
recipes/natural_language_processing/chatbot/…

@@ -8,7 +8,7 @@ spec:
initContainers:
- name: model-file
image: quay.io/ai-lab/mistral-7b-instruct:latest
-command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_S.gguf", "/shared/"]
+command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_M.gguf", "/shared/"]
volumeMounts:
- name: model-file
mountPath: /shared
@@ -29,7 +29,7 @@ spec:
- name: PORT
value: 8001
- name: MODEL_PATH
-value: /model/mistral-7b-instruct-v0.1.Q4_K_S.gguf
+value: /model/mistral-7b-instruct-v0.1.Q4_K_M.gguf
image: quay.io/ai-lab/llamacpp-python:latest
name: chatbot-model-service
ports:
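The init container stages the model: the quay.io/ai-lab/mistral-7b-instruct image carries only the .gguf under /model, and `/usr/bin/install` copies it into the shared volume that the llamacpp-python container then reads through MODEL_PATH. A way to check the staging, as a sketch — the `chatbot` deployment name is assumed, since the manifest's metadata is not shown in this hunk:

```bash
# verify the init container copied the model into the shared volume
kubectl exec deploy/chatbot -- ls -lh /model/mistral-7b-instruct-v0.1.Q4_K_M.gguf
```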
recipes/natural_language_processing/codegen/ai-lab.yaml

@@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
-image: quay.io/redhat-et/locallm-model-service:latest
+image: quay.io/ai-lab/llamacpp-python:latest
- name: codegen-app
contextdir: .
containerfile: builds/Containerfile
recipes/natural_language_processing/codegen/…

@@ -3,5 +3,5 @@ WantedBy=codegen.service

[Image]
Image=quay.io/redhat-et/locallm-codellama-7b-gguf:latest
-Image=quay.io/redhat-et/locallm-model-service:latest
+Image=quay.io/ai-lab/llamacpp-python:latest
Image=quay.io/redhat-et/locallm-codegen:latest
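This looks like a Quadlet-style `.image` unit that pre-fetches the recipe's images before `codegen.service` starts. A hedged sketch of exercising it, assuming the quadlet files are installed where systemd user units are picked up:

```bash
# regenerate units from the quadlet files, then start the app
systemctl --user daemon-reload
systemctl --user start codegen.service
podman images --format '{{.Repository}}:{{.Tag}}' | grep -E 'ai-lab|locallm'
```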
recipes/natural_language_processing/rag/ai-lab.yaml (4 changes: 2 additions & 2 deletions)

@@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
-image: quay.io/redhat-et/locallm-model-service:latest
+image: quay.io/ai-lab/llamacpp-python:latest
- name: chromadb-server
contextdir: ../../../vector_dbs/chromadb
containerfile: Containerfile
@@ -34,4 +34,4 @@ application:
- amd64
ports:
- 8501
-image: quay.io/redhat-et/locallm-rag:latest
+image: quay.io/redhat-et/locallm-rag:latest
recipes/natural_language_processing/summarizer/ai-lab.yaml (4 changes: 2 additions & 2 deletions)

@@ -15,7 +15,7 @@ application:
- amd64
ports:
- 8001
-image: quay.io/redhat-et/locallm-model-service:latest
+image: quay.io/ai-lab/llamacpp-python:latest
- name: streamlit-summary-app
contextdir: .
containerfile: builds/Containerfile
@@ -24,4 +24,4 @@ application:
- amd64
ports:
- 8501
-image: quay.io/redhat-et/locallm-text-summarizer:latest
+image: quay.io/redhat-et/locallm-text-summarizer:latest
recipes/natural_language_processing/summarizer/…

@@ -2,6 +2,6 @@
WantedBy=summarizer.service

[Image]
-Image=quay.io/redhat-et/locallm-mistral-7b-gguf:latest
-Image=quay.io/redhat-et/locallm-model-service:latest
+Image=quay.io/ai-lab/mistral-7b-instruct:latest
+Image=quay.io/ai-lab/llamacpp-python:latest
Image=quay.io/redhat-et/locallm-text-summarizer:latest
recipes/natural_language_processing/summarizer/…

@@ -7,8 +7,8 @@ metadata:
spec:
initContainers:
- name: model-file
-image: quay.io/redhat-et/locallm-mistral-7b-gguf:latest
-command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_S.gguf", "/shared/"]
+image: quay.io/ai-lab/mistral-7b-instruct:latest
+command: ['/usr/bin/install', "/model/mistral-7b-instruct-v0.1.Q4_K_M.gguf", "/shared/"]
volumeMounts:
- name: model-file
mountPath: /shared
@@ -29,8 +29,8 @@ spec:
- name: PORT
value: 8001
- name: MODEL_PATH
-value: /model/mistral-7b-instruct-v0.1.Q4_K_S.gguf
-image: quay.io/redhat-et/locallm-model-service:latest
+value: /model/mistral-7b-instruct-v0.1.Q4_K_M.gguf
+image: quay.io/ai-lab/llamacpp-python:latest
name: summarizer-model-service
ports:
- containerPort: 8001
