doc: ai-studio.yaml changes proposal (#253)

Signed-off-by: Jeff MAURY <[email protected]>
Co-authored-by: Philippe Martin <[email protected]>
New file: docs/proposals/ai-studio.md

# Motivation

Today, there is no notion of ordering between the containers, even though we know there is a dependency between
the client application and the container that is running the model.

The second issue is that there is no concept of when a container is considered properly started: today we rely only
on the container being started by the container engine, and we know that this is not adequate for the model service
container.

This is currently handled by a kind of dirty fix: the containers are all started in parallel, but since the client
application fails while the model service is not yet ready (starting it takes a while), we keep restarting the client
application until the model service is properly started.

The purpose of this change is to propose an update to the ai-studio.yaml file so that it is as generic as possible
and inspired by the Compose specification.

## Proposed changes

Define a condition for the container to be considered properly started: this would be based on the readinessProbe that
can already be defined in a Kubernetes container. In the first iteration, we would support only the ```exec``` field. If
```readinessProbe``` is defined, then we would check for the healthcheck status field to be ```healthy```.
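
For reference, the new field could be modeled along these lines in the studio's TypeScript code; the interface and
property names below are an illustrative assumption, not existing ai-studio types:

```typescript
// Illustrative sketch of the proposed schema, limited to the exec probe supported
// in the first iteration. Names are assumptions, not existing ai-studio types.
export interface ExecAction {
  // command executed inside the container, e.g. ["curl -f localhost:8080 || exit 1"]
  command: string[];
}

export interface ReadinessProbe {
  // only the exec form is supported in the first iteration of this proposal
  exec: ExecAction;
}

export interface AIStudioContainer {
  name: string;
  contextdir: string;
  containerfile: string;
  'model-service'?: boolean;
  backend?: string[];
  arch?: string[];
  'gpu-env'?: string[];
  readinessProbe?: ReadinessProbe; // new optional field introduced by this proposal
}
```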

So the current chatbot file would be updated from:

```yaml
application:
  type: language
  name: chatbot
  description: This is a LLM chatbot application that can interact with a llamacpp model-service
  containers:
    - name: chatbot-inference-app
      contextdir: ai_applications
      containerfile: builds/Containerfile
    - name: chatbot-model-service
      contextdir: model_services
      containerfile: base/Containerfile
      model-service: true
      backend:
        - llama
      arch:
        - arm64
        - amd64
    - name: chatbot-model-servicecuda
      contextdir: model_services
      containerfile: cuda/Containerfile
      model-service: true
      backend:
        - llama
      gpu-env:
        - cuda
      arch:
        - amd64
```
to
```yaml
application:
  type: language
  name: chatbot
  description: This is a LLM chatbot application that can interact with a llamacpp model-service
  containers:
    - name: chatbot-inference-app
      contextdir: ai_applications
      containerfile: builds/Containerfile
      readinessProbe:                            # added
        exec:                                    # added
          command:                               # added
            - curl -f localhost:8080 || exit 1   # added
    - name: chatbot-model-service
      contextdir: model_services
      containerfile: base/Containerfile
      model-service: true
      readinessProbe:                            # added
        exec:                                    # added
          command:                               # added
            - curl -f localhost:7860 || exit 1   # added
      backend:
        - llama
      arch:
        - arm64
        - amd64
    - name: chatbot-model-servicecuda
      contextdir: model_services
      containerfile: cuda/Containerfile
      model-service: true
      readinessProbe:                            # added
        exec:                                    # added
          command:                               # added
            - curl -f localhost:7860 || exit 1   # added
      backend:
        - llama
      gpu-env:
        - cuda
      arch:
        - amd64
```
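
With such probes in place, the studio could replace the restart loop described in the motivation by waiting for the
model service container to report a ```healthy``` status before considering the application ready. A minimal polling
sketch, where ```getHealthStatus``` is an assumed callback (for example backed by a container inspect call) rather
than an existing ai-studio or Podman Desktop function:

```typescript
// Hypothetical sketch: poll a container's health status until it reports "healthy".
type HealthStatus = 'starting' | 'healthy' | 'unhealthy';

async function waitForHealthy(
  getHealthStatus: () => Promise<HealthStatus>,
  timeoutMs = 120_000,
  pollMs = 2_000,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await getHealthStatus()) === 'healthy') {
      return; // the readinessProbe condition is met, dependent containers can start
    }
    await new Promise(resolve => setTimeout(resolve, pollMs));
  }
  throw new Error('container did not report a healthy status before the timeout');
}
```
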
From the Podman Desktop API point of view, this would require extending the
[ContainerCreateOptions](https://podman-desktop.io/api/interfaces/ContainerCreateOptions) structure to support the
HealthCheck option.
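
A rough sketch of that mapping, assuming the extended ```ContainerCreateOptions``` gains a ```HealthCheck``` field
shaped like the container engine's ```HealthConfig``` (```Test```, ```Interval```, ```Retries```). The helper below is
illustrative only and reuses the ```ReadinessProbe``` type sketched earlier:

```typescript
// Illustrative mapping from the proposed readinessProbe to an engine-style health
// check. The HealthConfig field names follow the Docker/Podman engine API; the
// HealthCheck field on ContainerCreateOptions is the extension proposed here and
// does not exist yet.
interface HealthConfig {
  Test: string[];      // ["CMD-SHELL", "<command>"] runs the command through a shell
  Interval?: number;   // time between checks, in nanoseconds
  Retries?: number;    // consecutive failures before the container is marked "unhealthy"
}

function toHealthCheck(probe: ReadinessProbe): HealthConfig {
  return {
    // the YAML examples above express the probe as a single shell-style command string
    Test: ['CMD-SHELL', probe.exec.command.join(' ')],
    Interval: 5_000_000_000, // 5 seconds
    Retries: 20,
  };
}
```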
