
Feat/unload model api #97

Closed

wants to merge 5 commits

Conversation

thunhuanh

@thunhuanh thunhuanh commented Oct 31, 2023

Add API to unload model.
Resolve #86

@tikikun tikikun requested review from hiro-v and tikikun October 31, 2023 09:44
Contributor

@tikikun tikikun left a comment


Correct approach. Please do some manual testing and merge, @vuonghoainam. Thanks.

@hiro-v
Contributor

hiro-v commented Nov 2, 2023

Hi @thunhuanh,
Thank you for your contribution. It's great.

However, the server is not working as expected: it stops responding after I send the DELETE request to unload the model.

  1. Load the model:
curl --location 'http://localhost:3928/inferences/llamacpp/loadModel' \
--header 'Content-Type: application/json' \
--data '{
    "llama_model_path": "<model_path>",
    "ctx_len": 2048,
    "ngl": 100,
    "embedding": true
}'
  2. Test the model to make sure it's working correctly:
curl --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--header 'Access-Control-Allow-Origin: *' \
--data '{
        "messages": [
            {"content": "[INST] Write code to solve the following coding problem that obeys the constraints and passes the example test cases. Please wrap your code answer using ```:{prompt}[/INST]", "role": "system"},
            {"content": "python code for fibonacci", "role": "user"},
            {"content": "Here is a Python code for Fibonacci sequence:\n```def fib(n):if n <= 1:return else:return fib(n-1) + fib(n-2)```This code takes an integer `n` as input and returns the `n`-th Fibonacci number.", "role": "assistant"},
            {"content": "please continue", "role": "user"}
        ],
        "stream": true,
        "model": "gpt-3.5-turbo",
        "max_tokens": 2048,
        "stop": ["hello"],
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "temperature": 0
     }'
  3. Try to unload the model (per your code change):
curl --location --request DELETE 'http://localhost:3928/inferences/llamacpp/unloadmodel' \
--header 'Content-Type: application/json' \
--data ''

However, after step 3, the server stops working and I cannot repeat step 1 (the behavior is similar to killing the process).
What we expect is that after step 3, I can call loadModel again.

Could you please check and make the necessary changes? Thanks.

{
  Json::Value jsonResp;
  if (model_loaded) {
    llama.unloadModel();
Contributor


As I tested, the server stops working after this line; the lines below never execute, so no result is returned.
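For context, the truncated handler above can be sketched end-to-end as follows. This is a self-contained illustration in plain C++, not the project's actual code: LlamaWrapper, the exact response strings, and the model_loaded reset are assumptions about the intended behavior.

```cpp
#include <string>

// Hypothetical stand-in for the server's llama.cpp wrapper (not the actual type).
struct LlamaWrapper {
    bool loaded = true;
    void unloadModel() { loaded = false; }  // the real method frees model memory
};

// Sketch of the unload handler's intended control flow: unload only when a
// model is loaded, reset the flag so loadModel can be called again, and
// return a JSON-style message either way.
std::string handleUnloadModel(LlamaWrapper& llama, bool& model_loaded) {
    if (model_loaded) {
        llama.unloadModel();
        model_loaded = false;  // without this reset, reloading would misbehave
        return "{\"message\": \"Model unloaded successfully\"}";
    }
    return "{\"message\": \"No model loaded\"}";
}
```

The key point for the bug reported above is that the handler must both free the model and reset the loaded flag, so a subsequent loadModel request succeeds.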

Author


Oh, sorry, my mistake. The API endpoint was mapped to the wrong handler:
it should be METHOD_ADD(llamaCPP::unloadModel, "unloadmodel", Delete);
instead of METHOD_ADD(llamaCPP::loadModel, "unloadmodel", Delete);
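The fix is purely a routing change. As a hedged illustration of why the wrong registration made step 3 fail, here is a toy route table in plain C++; the registerRoute and dispatch helpers are invented for this sketch (the real server registers routes with Drogon's METHOD_ADD macro), and the return strings are placeholders.

```cpp
#include <functional>
#include <map>
#include <string>

// Toy stand-in for an HTTP route table, keyed by "METHOD path".
using Handler = std::function<std::string()>;
std::map<std::string, Handler> routes;

void registerRoute(const std::string& key, Handler h) { routes[key] = h; }

std::string dispatch(const std::string& method, const std::string& path) {
    return routes.at(method + " " + path)();  // invoke whatever handler is registered
}

// Placeholder handlers standing in for llamaCPP::loadModel / unloadModel.
std::string loadModel()   { return "model loaded"; }
std::string unloadModel() { return "model unloaded"; }
```

With the buggy registration, registerRoute("DELETE /unloadmodel", loadModel), the DELETE request silently invokes the load handler, which in the real server receives no model path and leaves the process in a broken state; pointing the route at unloadModel resolves it.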

@hiro-v hiro-v added P1: important Important feature / fix type: enhancement labels Nov 2, 2023
@hiro-v hiro-v added this to the Nitro v2.0.0 milestone Nov 2, 2023
@thunhuanh
Author

thunhuanh commented Nov 2, 2023

I have fixed the issue and tested it locally; the changes should work now. @vuonghoainam @tikikun

@tikikun
Contributor

tikikun commented Nov 7, 2023

Hi @thunhuanh, things have been quite intense; I need to refactor this PR into another PR a bit before merging. I will credit it back to this issue.

@tikikun
Contributor

tikikun commented Nov 13, 2023

Hi @thunhuanh, I have added your change to this PR in #122, with a few small modifications. Thank you very much for taking the time to contribute to the project.

@tikikun tikikun closed this Nov 13, 2023
Labels
P1: important Important feature / fix
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

feat: Add API to unload model
4 participants