feat: adding InferenceManager #444
Conversation
Force-pushed from 265d0e6 to c758db3
},
Labels: {
  ...config.labels,
  [LABEL_INFERENCE_SERVER]: JSON.stringify(config.modelsInfo.map(model => model.id)),
The server is started only for the first model, so why are the labels about all models?
For now we only support one model, but in the future the Inference server may support several models (see containers/ai-lab-recipes#72). So using something like MODEL_ID does not make sense, as we want to be able to link more than one model to a single Inference server.
So if you find such a container, you will think it supports several models, whereas it has been started with a single one.
If it only supports one, only one model id will be listed here.
My concern is that on line 105 only the first element of the array is considered, but not here.
If you go just a little higher, at line 103, you can see why I am only using the first element at line 105: currently we do not support more than one model. I still include all of them in the label, because the guard at line 103 prevents having more than one.
In the future, when we support more than one, we will already have the proper logic to handle multiples, as we will simply remove the guard at line 103.
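To make the exchange above concrete, here is a minimal sketch (not code from this PR) of how the label could be written and read back. The label key LABEL_INFERENCE_SERVER is taken from the diff above; the ModelInfo and InferenceServerConfig shapes, the helper names, and the label value are assumptions for illustration only.

```typescript
// Minimal sketch, not the PR's actual code. LABEL_INFERENCE_SERVER appears in
// the diff above; the value and the type shapes below are assumptions.
const LABEL_INFERENCE_SERVER = 'ai-lab-inference-server';

interface ModelInfo {
  id: string;
}

interface InferenceServerConfig {
  labels?: { [key: string]: string };
  modelsInfo: ModelInfo[];
}

// Writing side: the guard mirrors the "line 103" check discussed above, while the
// label stores a JSON array so the format already accommodates several models.
function buildLabels(config: InferenceServerConfig): { [key: string]: string } {
  if (config.modelsInfo.length > 1) {
    throw new Error('Inference servers currently support a single model.');
  }
  return {
    ...config.labels,
    [LABEL_INFERENCE_SERVER]: JSON.stringify(config.modelsInfo.map(model => model.id)),
  };
}

// Reading side: recover the model ids linked to a running container from its labels.
function getServedModelIds(labels: { [key: string]: string }): string[] {
  const raw = labels[LABEL_INFERENCE_SERVER];
  return raw ? (JSON.parse(raw) as string[]) : [];
}
```

With this shape, dropping the single-model guard is the only change needed once multi-model servers land, which matches the reasoning above.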
LGTM
Fix typo that breaks pull on vllm
What does this PR do?
Adding a fully autonomous InferenceManager. It replaces most of the logic that previously lived inside the PlaygroundManager. This is the first step toward having a Service page, specific to the models and independent of the playground.

Requires the listImages() method with optional options (podman-desktop/podman-desktop#6257).

Screenshot / video of UI
What issues does this PR fix or reference?
Fixes #434
How to test this PR?