Hi,
I am trying to serve two LLMs concurrently with the TensorRT-LLM backend, each from its own Triton model repository.
I am running the command:
tritonserver --model-repository=path_to_triton_models/gemma2 --model-repository=path_to_triton_models/llama3 --model-namespacing=true
All the models load correctly, as confirmed by the logs.
At this point I want to send a query to one of the models. In a single-model deployment, I would use the following curl command:
curl -X POST \
-s localhost:8000/v2/models/tensorrt_llm_bls/generate \
-d '{
"text_input": "What is machine learning?",
"max_tokens": 512
}'
However, if I use the same endpoint (localhost:8000/v2/models/tensorrt_llm_bls/generate) in the two-model deployment, I get, as expected, the following error:
{"error":"There are 2 identifiers of model 'tensorrt_llm_bls' in global map, model namespace must be provided to resolve ambiguity."}
The problem is that I don't know how I should change the target endpoint when --model-namespacing is enabled. I tried many things, but none of them worked, and there seems to be no documentation about this.
Can you help me out? Thanks in advance. Tagging @rmccorm4 for support.
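For anyone who wants to reproduce this from a script rather than curl, here is a minimal Python sketch of the same request. It does not answer the namespacing question. It only builds the request body with `json.dumps`, so the JSON is always well-formed, and recognizes the ambiguity error quoted above. The helper names (`build_generate_request`, `is_namespace_ambiguity`, `generate`) are made up for illustration, and the host, port, and model name are simply the ones used in this issue.

```python
import json
import urllib.error
import urllib.request

# Values taken from the issue; adjust them for your own deployment.
BASE_URL = "http://localhost:8000"
MODEL_NAME = "tensorrt_llm_bls"


def build_generate_request(prompt, max_tokens=512, model=MODEL_NAME, base_url=BASE_URL):
    """Return (url, body_bytes) for the generate endpoint used in the issue."""
    url = f"{base_url}/v2/models/{model}/generate"
    # json.dumps always emits valid JSON (for example, no trailing commas).
    body = json.dumps({"text_input": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return url, body


def is_namespace_ambiguity(response_text):
    """True if the response is the 'model namespace must be provided' error."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return "model namespace must be provided" in str(payload.get("error", ""))


def generate(prompt, **kwargs):
    """Hypothetical helper: POST the request and return the raw response text."""
    url, body = build_generate_request(prompt, **kwargs)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError; the body still holds the error JSON.
        return err.read().decode("utf-8")
```

With a server running, `is_namespace_ambiguity(generate("What is machine learning?"))` should return `True` in the two-repository setup described above, assuming the server answers with the error message quoted in this issue.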