
feat(inference): introducing InferenceProviders #1161

Merged — 7 commits merged into containers:main on Jun 7, 2024

Conversation

axel7083 (Contributor) commented on Jun 5, 2024

What does this PR do?

This PR introduces the InferenceProvider interface, an abstraction used to create InferenceServers. In today's implementation we do not distinguish between backends (llamacpp, whispercpp, etc.).

This is the first step in abstracting the inference providers, making it easier to customize the inference server with new providers in the future (whispercpp, llamacpp-cuda, ollama, etc.).

Documentation

(screenshot of the documented InferenceProvider interface)

What issues does this PR fix or reference?

Fixes #1112

How to test this PR?

  • unit tests have been provided

Manually (recommended)

  • Create an inference server
  • Start chatbot recipe

@axel7083 axel7083 requested review from benoitf and a team as code owners June 5, 2024 12:04
@axel7083 axel7083 requested review from lstocchi, jeffmaury and feloy June 5, 2024 12:04
feloy (Contributor) commented on Jun 5, 2024

In the near future we will want to run inference servers in Kubernetes clusters too (typically OpenShift AI). I cannot see anything in this architecture that would block that, but I just want to be sure you have this scenario in mind.

axel7083 (Contributor, Author) commented on Jun 5, 2024

In the near future we will want to run inference servers in Kubernetes clusters too (typically OpenShift AI). I cannot see anything in this architecture that would block that, but I just want to be sure you have this scenario in mind.

Thanks @feloy for this feedback. I did not have this in mind, but I think this would make it easier than our current architecture, as we could create a KubernetesInferenceProvider responsible for creating the pod in the Kubernetes cluster.

lstocchi (Contributor) left a comment

Codewise LGTM. Also tested and works fine. Nice job!!

jeffmaury (Contributor) left a comment

The dependency on Podman Desktop 1.11 does not seem to be an absolute requirement and could be relaxed with a few changes, so I would delay the merge, as @slemeur's approval should be required.

axel7083 (Contributor, Author) commented on Jun 6, 2024

The dependency on Podman Desktop 1.11 does not seem to be an absolute requirement and could be relaxed with a few changes, so I would delay the merge, as @slemeur's approval should be required.

@jeffmaury I reverted to @podman-desktop/api 1.10.3. As mentioned, it is not an absolute requirement.

@axel7083 axel7083 requested a review from jeffmaury June 6, 2024 10:06
@axel7083 axel7083 force-pushed the feature/inference-provider branch from 7086f1c to f9f3c98 Compare June 6, 2024 14:01
@axel7083 axel7083 merged commit d2ea36b into containers:main Jun 7, 2024
4 checks passed
@axel7083 axel7083 mentioned this pull request Jun 10, 2024
1 task
Successfully merging this pull request may close: InferenceServer should support multiple backend (#1112)