add whisper quadlet & update docs
sallyom committed Mar 22, 2024
1 parent 03536c2 commit 4619249
Showing 15 changed files with 259 additions and 81 deletions.
109 changes: 109 additions & 0 deletions audio-to-text/README.md
@@ -0,0 +1,109 @@
# Audio to Text Application

This sample application provides a simple recipe to transcribe an audio file,
intended to help developers start building out their own custom audio-to-text applications.
It consists of two main components: the Model Service and the AI Application.

There are a few options today for local Model Serving, but this recipe will use [`whisper-cpp`](https://github.com/ggerganov/whisper.cpp.git)
and its OpenAI compatible Model Service. A Containerfile that can be used to build this Model Service is provided in the repo at
[`model_servers/whispercpp/Containerfile`](/model_servers/whispercpp/Containerfile).

Our AI Application will connect to our Model Service via its OpenAI compatible API.

![](/assets/whisper.png)

# Build the Application

In order to build this application we will need a model, a Model Service and an AI Application.

* [Download a model](#download-a-model)
* [Build the Model Service](#build-the-model-service)
* [Deploy the Model Service](#deploy-the-model-service)
* [Build the AI Application](#build-the-ai-application)
* [Deploy the AI Application](#deploy-the-ai-application)
* [Interact with the AI Application](#interact-with-the-ai-application)
* [Input audio files](#input-audio-files)

### Download a model

If you are just getting started, we recommend using [ggerganov/whisper.cpp](https://huggingface.co/ggerganov/whisper.cpp).
This is a well-performing, mid-sized model with an apache-2.0 license.
Pre-converted whisper models are simple to download from [huggingface.co](https://huggingface.co)
here: https://huggingface.co/ggerganov/whisper.cpp. There are a number of options, but we recommend starting with `ggml-small.bin`.

The recommended model can be downloaded using the code snippet below:

```bash
cd models
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
cd ../
```
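If `wget` is not available, `curl` can be used instead. This assumes you are in the repository root and the `models/` directory exists, matching the snippet above:

```bash
# Download ggml-small.bin into the models/ directory (equivalent to the wget snippet above)
curl -L -o models/ggml-small.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
```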

_A full list of supported open models is forthcoming._


### Build the Model Service

The Model Service can be built from the root directory with the following code snippet:

```bash
cd model_servers/whispercpp
podman build -t whispercppserver .
```

### Deploy the Model Service

The local Model Service relies on a volume mount to the localhost to access the model files. You can start your local Model Service using the following podman command:

```bash
podman run --rm -it \
        -p 8001:8001 \
        -v /local/path/to/locallm/models:/locallm/models \
        -e MODEL_PATH=models/<model-filename> \
        -e HOST=0.0.0.0 \
        -e PORT=8001 \
        whispercppserver
```
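Once the Model Service container is up, a quick way to sanity-check it is to POST an audio file to its inference endpoint. The endpoint path comes from this recipe's `MODEL_SERVICE_ENDPOINT`; the form fields below (`file`, `response_format`) follow the whisper.cpp example server's API, and `sample.wav` is a placeholder for any 16-bit WAV file you have on hand:

```bash
# Send a local WAV file to the model service and print the transcription as JSON
curl http://localhost:8001/inference \
  -F file=@sample.wav \
  -F response_format=json
```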

### Build the AI Application

Now that the Model Service is running we want to build and deploy our AI Application. Use the provided Containerfile to build the AI Application
image from the `audio-to-text/` directory.

```bash
cd audio-to-text
podman build -t audio-to-text . -f builds/Containerfile
```
### Deploy the AI Application

Make sure the Model Service is up and running before starting this container image.
When starting the AI Application container image we need to direct it to the correct `MODEL_SERVICE_ENDPOINT`.
This could be any appropriately hosted Model Service (running locally or in the cloud) using an OpenAI compatible API.
The following podman command can be used to run your AI Application:

```bash
podman run --rm -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://0.0.0.0:8001/inference audio-to-text
```
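As a quick check that the container came up (assuming the default port mapping above), you can poll the Streamlit port before opening it in a browser:

```bash
# Wait until the Streamlit app responds on port 8501
until curl -s -o /dev/null http://localhost:8501; do sleep 1; done && echo "UI is up"
```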

### Interact with the AI Application

Once the Streamlit application is up and running, you should be able to access it at [`http://localhost:8501`](http://localhost:8501).
From here, you can upload audio files from your local machine and transcribe them as shown below.

By using this recipe and getting this starting point established,
users should now have an easier time customizing and building their own audio-to-text applications.

#### Input audio files

Whisper.cpp requires 16-bit WAV audio files as input.
To convert your input audio files to 16-bit WAV format you can use `ffmpeg` like this:

```bash
ffmpeg -i <input.mp3> -ar 16000 -ac 1 -c:a pcm_s16le <output.wav>
```
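To convert a whole directory of audio files at once, a small loop over `ffmpeg` works as well. This sketch assumes the inputs are `.mp3` files in the current directory and writes the converted `.wav` files alongside them:

```bash
# Convert every .mp3 in the current directory to 16 kHz mono 16-bit WAV
for f in *.mp3; do
  ffmpeg -i "$f" -ar 16000 -ac 1 -c:a pcm_s16le "${f%.mp3}.wav"
done
```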

<p align="center">
<img src="../assets/whisper.png" width="70%">
</p>


File renamed without changes.
File renamed without changes.
File renamed without changes.
30 changes: 30 additions & 0 deletions audio-to-text/quadlet/README.md
@@ -0,0 +1,30 @@
### Run audio-text locally as a podman pod

There are pre-built images and a pod definition to run this audio-to-text example application.
This sample converts an audio waveform (.wav) file to text.

To run locally:

```bash
podman kube play ./quadlet/audio-text.yaml
```
To monitor locally:

```bash
podman pod list
podman ps
podman logs <name of container from the above>
```

The application should be accessible at `http://localhost:8501`. It will take a few minutes for the model to load.
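To stop and remove the pod when you are done (this assumes a podman version with `kube down` support; older versions use `podman play kube --down`):

```bash
podman kube down ./quadlet/audio-text.yaml
```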

### Run audio-text as a systemd service

```bash
cp audio-text.yaml /etc/containers/systemd/audio-text.yaml
cp audio-text.kube.example /etc/containers/systemd/audio-text.kube
cp audio-text.image /etc/containers/systemd/audio-text.image
# Optional: verify the generated units
/usr/libexec/podman/quadlet --dryrun
systemctl daemon-reload
systemctl start audio-text
```
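Once started, you can check on the service with systemd's usual tooling (the unit name `audio-text` is assumed to match the quadlet files above):

```bash
systemctl status audio-text
journalctl -u audio-text -f
```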
7 changes: 7 additions & 0 deletions audio-to-text/quadlet/audio-text.image
@@ -0,0 +1,7 @@
[Install]
WantedBy=audio-text.service

[Image]
Image=quay.io/redhat-et/locallm-whisper-ggml-small:latest
Image=quay.io/redhat-et/locallm-whisper-service:latest
Image=quay.io/redhat-et/locallm-audio-to-text:latest
16 changes: 16 additions & 0 deletions audio-to-text/quadlet/audio-text.kube.example
@@ -0,0 +1,16 @@
[Unit]
Description=Audio-to-text sample application (whisper.cpp model service and Streamlit front end)
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=%t/containers

[Kube]
# Point to the yaml file in the same directory
Yaml=audio-text.yaml

[Service]
Restart=always

[Install]
WantedBy=default.target
45 changes: 45 additions & 0 deletions audio-to-text/quadlet/audio-text.yaml
@@ -0,0 +1,45 @@
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: audio-to-text
  name: audio-to-text
spec:
  initContainers:
  - name: model-file
    image: quay.io/redhat-et/locallm-whisper-ggml-small:latest
    command: ['/usr/bin/install', "/model/ggml-small.bin", "/shared/"]
    volumeMounts:
    - name: model-file
      mountPath: /shared
  containers:
  - env:
    - name: MODEL_SERVICE_ENDPOINT
      value: http://0.0.0.0:8001/inference
    image: quay.io/redhat-et/locallm-audio-to-text:latest
    name: audio-to-text
    ports:
    - containerPort: 8501
      hostPort: 8501
    securityContext:
      runAsNonRoot: true
  - env:
    - name: HOST
      value: 0.0.0.0
    - name: PORT
      value: 8001
    - name: MODEL_PATH
      value: /model/ggml-small.bin
    image: quay.io/redhat-et/locallm-whisper-service:latest
    name: whisper-model-service
    ports:
    - containerPort: 8001
      hostPort: 8001
    securityContext:
      runAsNonRoot: true
    volumeMounts:
    - name: model-file
      mountPath: /model
  volumes:
  - name: model-file
    emptyDir: {}
File renamed without changes.
46 changes: 46 additions & 0 deletions model_servers/whispercpp/README.md
@@ -0,0 +1,46 @@
## Whisper

Whisper models are useful for converting audio files to text. The sample application [audio-to-text](../audio-to-text/README.md)
describes how to run an inference application. This document describes how to build a service for a Whisper model.

### Build model service

To build a Whisper model service container image from this directory,

```bash
podman build -t whisper:image .
```

### Download Whisper model

You can download the model from HuggingFace. Whisper models of various sizes are available and can be found
[here](https://huggingface.co/ggerganov/whisper.cpp). We will be using the `small` model, which is about 466 MB.

- **small**
- Download URL: [https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin](https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin)

```bash
cd ../models
wget --no-config --quiet --show-progress -O ggml-small.bin <Download URL>
cd ../
```

### Deploy Model Service

Deploy the Model Service and volume mount the model of choice.
Here, we are mounting the `ggml-small.bin` model downloaded above.

```bash
# Note: the :Z may need to be omitted from the model volume mount if not running on Linux

podman run --rm -it \
-p 8001:8001 \
-v /local/path/to/locallm/models/ggml-small.bin:/models/ggml-small.bin:Z,ro \
-e HOST=0.0.0.0 \
-e MODEL_PATH=/models/ggml-small.bin \
-e PORT=8001 \
whisper:image
```

By default, a sample `jfk.wav` file is included in the whisper image and can be used for testing.
The environment variable `AUDIO_FILE` can be set to point to your own audio file, overriding the default `/app/jfk.wav` within the whisper image.
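As a sketch of how that override might look (the host paths and the `/audio/...` mount target are placeholders; `AUDIO_FILE` and the `/app/jfk.wav` default come from the image as described above):

```bash
# Mount your own 16-bit WAV into the container and point AUDIO_FILE at it
podman run --rm -it \
        -p 8001:8001 \
        -v /local/path/to/locallm/models/ggml-small.bin:/models/ggml-small.bin:ro \
        -v /local/path/to/my-audio.wav:/audio/my-audio.wav:ro \
        -e HOST=0.0.0.0 \
        -e PORT=8001 \
        -e MODEL_PATH=/models/ggml-small.bin \
        -e AUDIO_FILE=/audio/my-audio.wav \
        whisper:image
```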
4 changes: 4 additions & 0 deletions model_servers/whispercpp/run.sh
@@ -0,0 +1,4 @@
#!/bin/bash

# Start the whisper.cpp example server with translation enabled (-tr),
# using the model, host, and port supplied via environment variables
# (HOST and PORT fall back to 0.0.0.0 and 8001).
./server -tr --model ${MODEL_PATH} --host ${HOST:=0.0.0.0} --port ${PORT:=8001}

1 change: 1 addition & 0 deletions models/Containerfile
@@ -1,6 +1,7 @@
#https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_S.gguf
#https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF/resolve/main/mistral-7b-instruct-v0.1.Q4_K_S.gguf
#https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf
#https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
# podman build --build-arg MODEL_URL=https://... -t quay.io/yourimage .
FROM registry.access.redhat.com/ubi9/ubi-micro:9.3-13
ARG MODEL_URL
2 changes: 1 addition & 1 deletion playground/README.md
@@ -69,4 +69,4 @@ podman run --rm -it -d \
-v Local/path/to/locallm/models:/locallm/models:ro,Z \
-e CONFIG_PATH=models/<config-filename> \
playground:image
```
```
77 changes: 0 additions & 77 deletions whisper-playground/README.md

This file was deleted.

3 changes: 0 additions & 3 deletions whisper-playground/run.sh

This file was deleted.
