chore: auto load model on /chat/completions request #900

Merged
louis-jan merged 1 commit into dev from chore/auto-load-model-on-chat-completions
Jul 22, 2024

Conversation

louis-jan
Contributor

@louis-jan louis-jan commented Jul 22, 2024

Describe Your Changes

This PR adds automatic model loading to /chat/completions and /embeddings requests in the JS layer.

Note: We use cortexso/nomic-embed-text-v1 as the default embedding model when no model is specified in the request. The embedding model is pulled automatically if it does not exist locally.
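
A minimal sketch of the fallback described in the note, for illustration only. `ModelStore`, `exists`, `pull`, and `resolveEmbeddingModel` are hypothetical names, not this project's API; only the default model id comes from the PR description. The PR does not say whether the automatic pull applies only to the default model, so this sketch assumes it applies to whichever model was resolved.

```typescript
// Default model id taken from the PR description.
const DEFAULT_EMBEDDING_MODEL = "cortexso/nomic-embed-text-v1";

// Hypothetical abstraction over local model storage (assumption, not the real API).
interface ModelStore {
  exists(modelId: string): Promise<boolean>;
  pull(modelId: string): Promise<void>;
}

// Resolve the model for an /embeddings request: fall back to the default
// when none is given, and pull the model if it is not available locally.
async function resolveEmbeddingModel(
  store: ModelStore,
  requested?: string,
): Promise<string> {
  const trimmed = requested ? requested.trim() : "";
  const modelId = trimmed !== "" ? trimmed : DEFAULT_EMBEDDING_MODEL;
  if (!(await store.exists(modelId))) {
    await store.pull(modelId);
  }
  return modelId;
}

// In-memory stand-in used to exercise the sketch without a real engine.
class InMemoryModelStore implements ModelStore {
  readonly local = new Set<string>();
  readonly pulled: string[] = [];

  async exists(modelId: string): Promise<boolean> {
    return this.local.has(modelId);
  }

  async pull(modelId: string): Promise<void> {
    this.pulled.push(modelId);
    this.local.add(modelId);
  }
}
```

With the in-memory store, a request without a model resolves to the default and triggers one pull; a later request for the same model finds it locally and skips the pull.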

The CLI update should be covered by the refactoring that makes the CLI send REST requests.

sequenceDiagram
    participant User
    participant API_Controller
    participant Engine_Server
    participant Model

    User->>API_Controller: Chat request
    API_Controller->>Engine_Server: Check Model Running
    Engine_Server-->>API_Controller: Model status
    alt Model is not running
        API_Controller->>Engine_Server: Start Model command
        Engine_Server->>Model: Start Model
        Model-->>Engine_Server: Model started
        Engine_Server-->>API_Controller: Model ready
    end
    API_Controller->>Model: Send Chat Completions
    Model-->>API_Controller: Return Completions
    API_Controller-->>User: Send Response
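
The flow in the diagram above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: `EngineServer`, `isModelRunning`, `startModel`, `chatCompletions`, and `handleChatRequest` are hypothetical names, and `FakeEngine` is an in-memory stand-in for the real engine server.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical view of the engine server as seen from the API controller.
interface EngineServer {
  isModelRunning(modelId: string): Promise<boolean>;
  startModel(modelId: string): Promise<void>;
  chatCompletions(modelId: string, messages: ChatMessage[]): Promise<string>;
}

// Controller-side logic: check the model status, start the model only when
// it is not running, then forward the chat completion request.
async function handleChatRequest(
  engine: EngineServer,
  modelId: string,
  messages: ChatMessage[],
): Promise<string> {
  const running = await engine.isModelRunning(modelId);
  if (!running) {
    await engine.startModel(modelId);
  }
  return engine.chatCompletions(modelId, messages);
}

// In-memory fake that records how often a model start was requested.
class FakeEngine implements EngineServer {
  readonly running = new Set<string>();
  startCalls = 0;

  async isModelRunning(modelId: string): Promise<boolean> {
    return this.running.has(modelId);
  }

  async startModel(modelId: string): Promise<void> {
    this.startCalls += 1;
    this.running.add(modelId);
  }

  async chatCompletions(modelId: string, messages: ChatMessage[]): Promise<string> {
    if (!this.running.has(modelId)) {
      throw new Error(`model ${modelId} is not running`);
    }
    const last = messages[messages.length - 1];
    return `echo: ${last ? last.content : ""}`;
  }
}
```

Keeping the status check in the controller means a client never has to issue a separate start command before its first chat request, and a second request for an already running model skips the start step.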

Fixes Issues

  • Closes #

Self Checklist

  • Added relevant comments, esp in complex areas
  • Updated docs (for bug fixes / features)
  • Created issues for follow-up changes or refactoring needed

@louis-jan louis-jan force-pushed the chore/auto-load-model-on-chat-completions branch 4 times, most recently from f77b9c8 to d0ad07b Compare July 22, 2024 10:03
@louis-jan louis-jan force-pushed the chore/auto-load-model-on-chat-completions branch from d0ad07b to c39f4c1 Compare July 22, 2024 10:07
@louis-jan louis-jan merged commit 749cf0c into dev Jul 22, 2024
2 checks passed
@louis-jan louis-jan deleted the chore/auto-load-model-on-chat-completions branch July 22, 2024 11:35