
feat: adding InferenceType enum #1186

Merged

Conversation

@axel7083 axel7083 commented Jun 10, 2024

What does this PR do?

Adding the InferenceType enum, which allows an InferenceProvider (see #1161) to specify its type; in this context, that could be llama-cpp, whisper-cpp, etc.

Notable change

  • Adding a new InferenceType enum
  • Adding an optional backend string property to the ModelInfo interface
  • Adding an optional backend string property to the Recipe interface
  • Adding a getByType method to the InferenceProviderRegistry
  • The InferenceServer will check the backend property of models to select the InferenceProvider
  • Improving the InferenceProvider constructor
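
The changes above could look roughly like the following sketch (TypeScript, matching the extension's language). Only InferenceType, the backend property, and getByType are named in this PR; the enum members, the InferenceProvider shape, and the register method are assumptions, and the real implementation may differ:

```typescript
// Hypothetical sketch of the PR's additions; names beyond InferenceType,
// backend, and getByType are assumptions.
enum InferenceType {
  LLAMA_CPP = 'llama-cpp',
  WHISPER_CPP = 'whisper-cpp',
  NONE = 'none',
}

interface InferenceProvider {
  type: InferenceType;
  enabled(): boolean;
}

interface ModelInfo {
  id: string;
  // Optional backend added by this PR: which provider type can serve the model.
  backend?: string;
}

class InferenceProviderRegistry {
  private readonly providers: InferenceProvider[] = [];

  register(provider: InferenceProvider): void {
    this.providers.push(provider);
  }

  // getByType: return every registered provider matching the given type.
  getByType(type: InferenceType): InferenceProvider[] {
    return this.providers.filter(provider => provider.type === type);
  }
}
```

With a registry like this, the inference server can look up providers by a model's backend instead of assuming a single hard-coded llama-cpp provider.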

Screenshot / video of UI

N/A; no visual change.

What issues does this PR fix or reference?

Fixes #1181
Part of #1111

How to test this PR?

  • Unit tests have been provided

Side effect

Now, trying to start an InferenceServer with the WhisperModel will raise an error, as we do not have any InferenceProvider for it. The same applies to the facebook/detr-resnet-101 model.
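
A minimal sketch of that selection-and-failure behaviour, under stated assumptions: selectProvider and all type shapes here are hypothetical helper names, not code from the PR; only the backend property and the "no provider found" error scenario are described above.

```typescript
// Assumed enum and shapes for illustration; not the PR's actual code.
enum InferenceType {
  LLAMA_CPP = 'llama-cpp',
  WHISPER_CPP = 'whisper-cpp',
  NONE = 'none',
}

interface InferenceProvider {
  type: InferenceType;
  enabled(): boolean;
}

interface ModelInfo {
  id: string;
  backend?: string; // optional backend introduced by this PR
}

// Pick an enabled provider matching the model's backend, or fail loudly.
function selectProvider(providers: InferenceProvider[], model: ModelInfo): InferenceProvider {
  const wanted = (model.backend ?? InferenceType.NONE) as InferenceType;
  const candidates = providers.filter(p => p.type === wanted && p.enabled());
  if (candidates.length === 0) {
    // Matches the side effect described above: a model whose backend has no
    // registered InferenceProvider (e.g. a whisper model) now raises an error.
    throw new Error(`no enabled provider could be found for backend "${wanted}"`);
  }
  return candidates[0];
}
```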


ℹ️ This is good: this is exactly what we expect!

@axel7083 axel7083 force-pushed the feature/adding-backend-recipes-models branch from 936dc6b to e389061 Compare June 10, 2024 09:19
@axel7083 axel7083 marked this pull request as ready for review June 10, 2024 09:44
@axel7083 axel7083 requested review from benoitf and a team as code owners June 10, 2024 09:45
@axel7083 axel7083 requested review from jeffmaury, lstocchi and feloy June 10, 2024 09:45
@axel7083 axel7083 requested a review from lstocchi June 10, 2024 13:26

@lstocchi lstocchi left a comment


Overall LGTM

Two things we should improve (even in separate PRs):

  1. We should add a label displaying the backend in the models table.
  2. I don't like seeing the error "no enabled provider could be found..." when clicking on create service with an unsupported model. I would prefer not to see the "create new inference server" button at all; otherwise I may think I'm doing something wrong and that there is some way to make it work.

@axel7083
Contributor Author

> Overall LGTM
>
> Two things we should improve (even in separate PRs):
>
>   1. We should add a label displaying the backend in the models table.
>   2. I don't like seeing the error "no enabled provider could be found..." when clicking on create service with an unsupported model. I would prefer not to see the "create new inference server" button at all; otherwise I may think I'm doing something wrong and that there is some way to make it work.

Yeah, I totally agree. This PR is the first step of #1111; we can add more items there to improve how we handle this relation.


@jeffmaury jeffmaury left a comment


LGTM

axel7083 added 5 commits June 11, 2024 10:37
@axel7083 axel7083 force-pushed the feature/adding-backend-recipes-models branch from d9e2fb4 to 3caacd7 Compare June 11, 2024 08:37
@axel7083 axel7083 enabled auto-merge (squash) June 11, 2024 08:46
@axel7083 axel7083 merged commit 5ddb5c5 into containers:main Jun 11, 2024
4 checks passed
Development

Successfully merging this pull request may close these issues.

Model & Recipes should precise which backend they support (E.g. llamacpp, whispercpp)
3 participants