-
Notifications
You must be signed in to change notification settings - Fork 41
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
doc: ai-studio.yaml changes proposal (#253)
* doc: ai-studio.yaml changes proposal Signed-off-by: Jeff MAURY <[email protected]> Co-authored-by: Philippe Martin <[email protected]>
- Loading branch information
Showing
1 changed file
with
100 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
# Motivation | ||
|
||
Today, there is no notion of ordering between the containers. But we know that we have a dependency between | ||
the client application and the container that is running the model. | ||
|
||
The second issue is that there is no concept of starting point for a container so today we rely only on the | ||
container being started by the container engine and we know that this is not adequate for the model service container | ||
|
||
So this is handle by a kind of dirty fix: the containers are all started in parallel but as the client application | ||
will fail because the model service is started (as it take a while), so we are trying to restart the client application | ||
until the model service is properly started. | ||
|
||
The purpose of this change is to propose an update to the ai-studio.yaml so that it is as much generic as it | ||
could be and inspired from the Compose specification. | ||
|
||
## Proposed changes | ||
|
||
Define a condition for the container to be properly started: this would be based on the readinessProbe that can already | ||
be defined in a Kubernetes container. In the first iteration, we would support only the ```exec``` field. If | ||
```readinessProbe``` is defined, then we would check for the healthcheck status field to be ```healthy``` | ||
|
||
So the current chatbot file would be updated from: | ||
|
||
```yaml | ||
application: | ||
type: language | ||
name: chatbot | ||
description: This is a LLM chatbot application that can interact with a llamacpp model-service | ||
containers: | ||
- name: chatbot-inference-app | ||
contextdir: ai_applications | ||
containerfile: builds/Containerfile | ||
- name: chatbot-model-service | ||
contextdir: model_services | ||
containerfile: base/Containerfile | ||
model-service: true | ||
backend: | ||
- llama | ||
arch: | ||
- arm64 | ||
- amd64 | ||
- name: chatbot-model-servicecuda | ||
contextdir: model_services | ||
containerfile: cuda/Containerfile | ||
model-service: true | ||
backend: | ||
- llama | ||
gpu-env: | ||
- cuda | ||
arch: | ||
- amd64 | ||
``` | ||
to | ||
```yaml | ||
application: | ||
type: language | ||
name: chatbot | ||
description: This is a LLM chatbot application that can interact with a llamacpp model-service | ||
containers: | ||
- name: chatbot-inference-app | ||
contextdir: ai_applications | ||
containerfile: builds/Containerfile | ||
readinessProbe: # added | ||
exec: # added | ||
command: # added | ||
- curl -f localhost:8080 || exit 1 # added | ||
- name: chatbot-model-service | ||
contextdir: model_services | ||
containerfile: base/Containerfile | ||
model-service: true | ||
readinessProbe: # added | ||
exec: # added | ||
command: # added | ||
- curl -f localhost:7860 || exit 1 # added | ||
backend: | ||
- llama | ||
arch: | ||
- arm64 | ||
- amd64 | ||
- name: chatbot-model-service | ||
contextdir: model_services | ||
containerfile: cuda/Containerfile | ||
model-service: true | ||
readinessProbe: # added | ||
exec: # added | ||
command: # added | ||
- curl -f localhost:7860 || exit 1 # added | ||
backend: | ||
- llama | ||
gpu-env: | ||
- cuda | ||
arch: | ||
- amd64 | ||
``` | ||
From the Podman Desktop API point of view, this would require extending the | ||
[ContainerCreateOptions](https://podman-desktop.io/api/interfaces/ContainerCreateOptions) structure to support the | ||
HealthCheck option. |