feat: adding InferenceManager #444
Conversation
Force-pushed from 265d0e6 to c758db3
},
Labels: {
  ...config.labels,
  [LABEL_INFERENCE_SERVER]: JSON.stringify(config.modelsInfo.map(model => model.id)),
The server is started only for the first model, so why are the labels about all models?
For now we only support one model, but in the future the Inference server may support several models (see containers/ai-lab-recipes#72). So using something like MODEL_ID does not make sense, as we want to be able to link more than one model to a single Inference server.
So if you find such a container, you will think it supports several models, whereas it has been started with a single one.
If it only supports one, only one model id will be listed here.
My concern is that on line 105 only the first element of the array is considered, but not here.
If you go just a little higher, at line 103, you can see why I am only using the first element at line 105: currently we do not support more than one model. I still include all of them in the label, because the guard at line 103 prevents having more than one.
In the future, when we support more than one, we will already have the proper logic to handle multiples, as we will simply remove the guard at line 103.
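To make the exchange above concrete, here is a minimal sketch (not code from this PR) of how the label could be written and read back. The label key LABEL_INFERENCE_SERVER is taken from the diff above; the ModelInfo and InferenceServerConfig shapes, the helper names, and the label value are assumptions for illustration only.

```typescript
// Minimal sketch, not the PR's actual code. LABEL_INFERENCE_SERVER appears in
// the diff above; the value and the type shapes below are assumptions.
const LABEL_INFERENCE_SERVER = 'ai-lab-inference-server';

interface ModelInfo {
  id: string;
}

interface InferenceServerConfig {
  labels?: { [key: string]: string };
  modelsInfo: ModelInfo[];
}

// Writing side: the guard mirrors the "line 103" check discussed above, while the
// label stores a JSON array so the format already accommodates several models.
function buildLabels(config: InferenceServerConfig): { [key: string]: string } {
  if (config.modelsInfo.length > 1) {
    throw new Error('Inference servers currently support a single model.');
  }
  return {
    ...config.labels,
    [LABEL_INFERENCE_SERVER]: JSON.stringify(config.modelsInfo.map(model => model.id)),
  };
}

// Reading side: recover the model ids linked to a running container from its labels.
function getServedModelIds(labels: { [key: string]: string }): string[] {
  const raw = labels[LABEL_INFERENCE_SERVER];
  return raw ? (JSON.parse(raw) as string[]) : [];
}
```

With this shape, dropping the single-model guard is the only change needed once multi-model servers land, which matches the reasoning above.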
LGTM
Fix typo that breaks pull on vllm
What does this PR do?
Adding a fully autonomous InferenceManager. It replaces most of the logic that previously lived inside the PlaygroundManager. This is the first step toward having a Service page, specific to the models and independent of the playground.

Requires the listImages() method with optional options (podman-desktop/podman-desktop#6257).

Screenshot / video of UI
What issues does this PR fix or reference?
Fixes #434
How to test this PR?