# Audio to Text Application

This sample application is a simple recipe for transcribing audio files, intended to help developers
start building out their own custom LLM enabled audio-to-text applications. It consists of two main
components: the Model Service and the AI Application.

There are a few options today for local Model Serving, but this recipe will use [`whisper-cpp`](https://github.com/ggerganov/whisper.cpp.git)
and its OpenAI compatible Model Service. A Containerfile that can be used to build this Model Service is provided in the repo at
[`model_servers/whisper/Containerfile`](/model_servers/whisper/Containerfile).

Our AI Application will connect to the Model Service via its OpenAI compatible API.

![](/assets/whisper.png)

# Build the Application

In order to build this application we will need a model, a Model Service and an AI Application.

* [Download a model](#download-a-model)
* [Build the Model Service](#build-the-model-service)
* [Deploy the Model Service](#deploy-the-model-service)
* [Build the AI Application](#build-the-ai-application)
* [Deploy the AI Application](#deploy-the-ai-application)
* [Interact with the AI Application](#interact-with-the-ai-application)
  * [Input audio files](#input-audio-files)
### Download a model

If you are just getting started, we recommend using [ggerganov/whisper.cpp](https://huggingface.co/ggerganov/whisper.cpp),
a well performing mid-sized model with an Apache 2.0 license. Pre-converted whisper models are simple to download from
[huggingface.co](https://huggingface.co) at https://huggingface.co/ggerganov/whisper.cpp. There are a number of options,
but we recommend starting with `ggml-small.bin`.

The recommended model can be downloaded using the code snippet below:

```bash
cd models
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin
cd ../
```
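
As a quick sanity check after the download, the file should be roughly 466 MB:

```bash
ls -lh models/ggml-small.bin
```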

_A full list of supported open models is forthcoming._

### Build the Model Service

The Model Service can be built from the root directory with the following code snippet:

```bash
cd model_servers/whispercpp
podman build -t whispercppserver .
```
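
You can confirm the image is available with:

```bash
podman images whispercppserver
```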

### Deploy the Model Service

The local Model Service relies on a volume mount from the host to access the model files. You can start your local Model Service using the following podman command:

```bash
podman run --rm -it \
        -p 8001:8001 \
        -v /local/path/to/locallm/models:/locallm/models \
        -e MODEL_PATH=models/<model-filename> \
        -e HOST=0.0.0.0 \
        -e PORT=8001 \
        whispercppserver
```

### Build the AI Application

Now that the Model Service is running, we want to build and deploy our AI Application. Use the provided Containerfile to build the AI Application
image from the `audio-to-text/` directory.

```bash
cd audio-to-text
podman build -t audio-to-text . -f builds/Containerfile
```

### Deploy the AI Application

Make sure the Model Service is up and running before starting this container image.
When starting the AI Application container image we need to direct it to the correct `MODEL_SERVICE_ENDPOINT`.
This could be any appropriately hosted Model Service (running locally or in the cloud) using an OpenAI compatible API.
The following podman command can be used to run your AI Application:

```bash
podman run --rm -it -p 8501:8501 -e MODEL_SERVICE_ENDPOINT=http://0.0.0.0:8001/inference audio-to-text
```
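
If the AI Application container cannot reach the Model Service at `0.0.0.0` (behavior differs by platform and network mode), podman's `host.containers.internal` hostname is one alternative worth trying; treat this as a sketch to adapt to your own network setup:

```bash
podman run --rm -it -p 8501:8501 \
    -e MODEL_SERVICE_ENDPOINT=http://host.containers.internal:8001/inference \
    audio-to-text
```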

### Interact with the AI Application

Once the streamlit application is up and running, you should be able to access it at `http://localhost:8501`.
From here, you can upload audio files from your local machine and translate them, as shown below.

Everything should now be up and running, with the application available at [`http://localhost:8501`](http://localhost:8501).
With this recipe as a starting point, users should have an easier time customizing and building their own LLM enabled
audio-to-text applications.

#### Input audio files

Whisper.cpp requires 16-bit WAV audio files as input.
To convert your input audio files to 16-bit WAV format you can use `ffmpeg` like this:

```bash
ffmpeg -i <input.mp3> -ar 16000 -ac 1 -c:a pcm_s16le <output.wav>
```
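
If you have a directory of files to convert, the same command drops into a small shell loop (filenames here are illustrative):

```bash
# Convert every .mp3 in the current directory to 16-bit mono WAV
for f in *.mp3; do
    ffmpeg -i "$f" -ar 16000 -ac 1 -c:a pcm_s16le "${f%.mp3}.wav"
done
```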

<p align="center">
  <img src="../assets/whisper.png" width="70%">
</p>

---

### Run audio-text locally as a podman pod

There are pre-built images and a pod definition to run this audio-to-text example application.
This sample converts an audio waveform (.wav) file to text.

To run locally,

```bash
podman kube play ./quadlet/audio-to-text.yaml
```

To monitor locally,

```bash
podman pod list
podman ps
podman logs <name of container from above>
```

The application should be accessible at `http://localhost:8501`. It will take a few minutes for the model to load.
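
When you are finished, the pod can be torn down with the matching `podman kube down` command:

```bash
podman kube down ./quadlet/audio-to-text.yaml
```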

### Run audio-text as a systemd service

```bash
cp audio-text.yaml /etc/containers/systemd/audio-text.yaml
cp audio-text.kube.example /etc/containers/audio-text.kube
cp audio-text.image /etc/containers/audio-text.image
# Optional: preview the generated units first
/usr/libexec/podman/quadlet --dryrun
systemctl daemon-reload
systemctl start audio-text
```

---

[Install]
WantedBy=audio-text.service

[Image]
Image=quay.io/redhat-et/locallm-whisper-ggml-small:latest
Image=quay.io/redhat-et/locallm-whisper-service:latest
Image=quay.io/redhat-et/locallm-audio-to-text:latest

---

[Unit]
Description=Python script to run against downloaded LLM
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=%t/containers

[Kube]
# Point to the yaml file in the same directory
Yaml=audio-text.yaml

[Service]
Restart=always

[Install]
WantedBy=default.target

---

apiVersion: v1
kind: Pod
metadata:
  labels:
    app: audio-to-text
  name: audio-to-text
spec:
  initContainers:
  - name: model-file
    image: quay.io/redhat-et/locallm-whisper-ggml-small:latest
    command: ['/usr/bin/install', "/model/ggml-small.bin", "/shared/"]
    volumeMounts:
    - name: model-file
      mountPath: /shared
  containers:
  - env:
    - name: MODEL_SERVICE_ENDPOINT
      value: http://0.0.0.0:8001/inference
    image: quay.io/redhat-et/locallm-audio-to-text:latest
    name: audio-to-text
    ports:
    - containerPort: 8501
      hostPort: 8501
    securityContext:
      runAsNonRoot: true
  - env:
    - name: HOST
      value: "0.0.0.0"
    - name: PORT
      value: "8001"
    - name: MODEL_PATH
      value: /model/ggml-small.bin
    image: quay.io/redhat-et/locallm-whisper-service:latest
    name: whisper-model-service
    ports:
    - containerPort: 8001
      hostPort: 8001
    securityContext:
      runAsNonRoot: true
    volumeMounts:
    - name: model-file
      mountPath: /model
  volumes:
  - name: model-file
    emptyDir: {}

---

## Whisper

Whisper models are useful for converting audio files to text. The sample application [audio-to-text](../audio-to-text/README.md)
describes how to run an inference application. This document describes how to build a service for a Whisper model.

### Build model service

To build a Whisper model service container image from this directory:

```bash
podman build -t whisper:image .
```

### Download Whisper model

You can download the model from HuggingFace. There are various Whisper models available, varying in size, which can be found
[here](https://huggingface.co/ggerganov/whisper.cpp). We will be using the `small` model, which is about 466 MB.

- **small**
  - Download URL: [https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin](https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin)

```bash
cd ../models
wget --no-config --quiet --show-progress -O ggml-small.bin <Download URL>
cd ../
```

### Deploy Model Service

Deploy the Model Service and volume mount the model of choice.
Here, we are mounting the `ggml-small.bin` model downloaded above.

```bash
# Note: the :Z may need to be omitted from the model volume mount if not running on Linux
podman run --rm -it \
        -p 8001:8001 \
        -v /local/path/to/locallm/models/ggml-small.bin:/models/ggml-small.bin:Z,ro \
        -e HOST=0.0.0.0 \
        -e MODEL_PATH=/models/ggml-small.bin \
        -e PORT=8001 \
        whisper:image
```

By default, a sample `jfk.wav` file is included in the whisper image and can be used for testing.
The environment variable `AUDIO_FILE` can be set to your own audio file to override the default `/app/jfk.wav` within the whisper image.
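
For example, a local WAV file could be mounted in and pointed to via `AUDIO_FILE`; the host paths below are hypothetical:

```bash
podman run --rm -it \
    -p 8001:8001 \
    -v /local/path/to/locallm/models/ggml-small.bin:/models/ggml-small.bin:ro \
    -v /local/path/to/sample.wav:/app/sample.wav:ro \
    -e MODEL_PATH=/models/ggml-small.bin \
    -e AUDIO_FILE=/app/sample.wav \
    whisper:image
```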

---

#!/bin/bash

# Launch the whisper.cpp server in translate mode, defaulting HOST and PORT
# when they are not supplied in the environment.
./server -tr --model "${MODEL_PATH}" --host "${HOST:=0.0.0.0}" --port "${PORT:=8001}"