Describe the issue
I have downloaded Qwen/Qwen2.5-7B-Instruct locally and placed it under gorilla/berkeley-function-call-leaderboard/Qwen/Qwen2.5-7B-Instruct.
Command executed: CUDA_VISIBLE_DEVICES=0,1 bfcl generate --model Qwen/Qwen2.5-7B-Instruct --backend vllm --num-gpus 2 --gpu-memory-utilization 0.9
The model deploys successfully and the server responds to curl (see the sketch after the logs below):
INFO 12-17 14:33:26 model_runner.py:1530] Graph capturing finished in 17 secs.
INFO 12-17 14:33:27 api_server.py:232] vLLM to use /tmp/tmp5f9y67qb as PROMETHEUS_MULTIPROC_DIR
WARNING 12-17 14:33:27 serving_embedding.py:199] embedding_mode is False. Embedding API will not work.
INFO 12-17 14:33:27 launcher.py:19] Available routes are:
INFO 12-17 14:33:27 launcher.py:27] Route: /openapi.json, Methods: HEAD, GET
INFO 12-17 14:33:27 launcher.py:27] Route: /docs, Methods: HEAD, GET
INFO 12-17 14:33:27 launcher.py:27] Route: /docs/oauth2-redirect, Methods: HEAD, GET
INFO 12-17 14:33:27 launcher.py:27] Route: /redoc, Methods: HEAD, GET
INFO 12-17 14:33:27 launcher.py:27] Route: /health, Methods: GET
INFO 12-17 14:33:27 launcher.py:27] Route: /tokenize, Methods: POST
INFO 12-17 14:33:27 launcher.py:27] Route: /detokenize, Methods: POST
INFO 12-17 14:33:27 launcher.py:27] Route: /v1/models, Methods: GET
INFO 12-17 14:33:27 launcher.py:27] Route: /version, Methods: GET
INFO 12-17 14:33:27 launcher.py:27] Route: /v1/chat/completions, Methods: POST
INFO 12-17 14:33:27 launcher.py:27] Route: /v1/completions, Methods: POST
INFO 12-17 14:33:27 launcher.py:27] Route: /v1/embeddings, Methods: POST
INFO: Started server process [54494]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on socket ('0.0.0.0', 1053) (Press CTRL+C to quit)
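(For reference, a minimal reachability check against this endpoint might look like the following. This is only a sketch: port 1053 is taken from the logs above, and api_key="EMPTY" is the conventional placeholder for vLLM's OpenAI-compatible server.)

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server from the logs above.
# "EMPTY" is the usual placeholder key for vLLM's OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:1053/v1", api_key="EMPTY")

# List the served models to confirm the endpoint is reachable.
for model in client.models.list():
    print(model.id)
```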
However, inference fails with:
Max context length: 32768
❗️❗️ Error occurred during inference for test case exec_parallel_37
Error type: InternalServerError
Error message: Internal Server Error
Traceback:
Traceback (most recent call last):
File "/public/zzy/tool_project/gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/oss_model/base_oss_handler.py", line 239, in _multi_threaded_inference
model_responses, metadata = self.inference_single_turn_prompting(test_case, include_input_log)
File "/public/zzy/tool_project/gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/base_handler.py", line 579, in inference_single_turn_prompting
api_response, query_latency = self._query_prompting(inference_data)
File "/public/zzy/tool_project/gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/oss_model/base_oss_handler.py", line 308, in _query_prompting
api_response = self.client.completions.create(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_utils/_utils.py", line 274, in wrapper
return func(*args, **kwargs)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/resources/completions.py", line 539, in create
return self._post(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1260, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 937, in request
return self._request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1026, in _request
return self._retry_request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1075, in _retry_request
return self._request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1026, in _request
return self._retry_request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1075, in _retry_request
return self._request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1041, in _request
raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Internal Server Error
Traceback (most recent call last):
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
yield
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_transports/default.py", line 236, in handle_request
resp = self._pool.handle_request(req)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
raise exc from None
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
response = connection.handle_request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
raise exc
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
) = self._receive_response_headers(**kwargs)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
event = self._receive_event(timeout=timeout)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 231, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 973, in _request
response = self._client.send(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_client.py", line 926, in send
response = self._send_handling_auth(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_client.py", line 954, in _send_handling_auth
response = self._send_handling_redirects(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_client.py", line 991, in _send_handling_redirects
response = self._send_single_request(request)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_client.py", line 1027, in _send_single_request
response = transport.handle_request(request)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_transports/default.py", line 235, in handle_request
with map_httpcore_exceptions():
File "/root/anaconda/envs/BFCL/lib/python3.10/contextlib.py", line 153, in exit
self.gen.throw(typ, value, traceback)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/httpx/_transports/default.py", line 89, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/public/zzy/tool_project/gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/oss_model/base_oss_handler.py", line 239, in _multi_threaded_inference
model_responses, metadata = self.inference_single_turn_prompting(test_case, include_input_log)
File "/public/zzy/tool_project/gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/base_handler.py", line 579, in inference_single_turn_prompting
api_response, query_latency = self._query_prompting(inference_data)
File "/public/zzy/tool_project/gorilla/berkeley-function-call-leaderboard/bfcl/model_handler/oss_model/base_oss_handler.py", line 308, in _query_prompting
api_response = self.client.completions.create(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_utils/_utils.py", line 274, in wrapper
return func(*args, **kwargs)
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/resources/completions.py", line 539, in create
return self._post(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1260, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 937, in request
return self._request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1026, in _request
return self._retry_request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1075, in _retry_request
return self._request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1026, in _request
return self._retry_request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1075, in _retry_request
return self._request(
File "/root/anaconda/envs/BFCL/lib/python3.10/site-packages/openai/_base_client.py", line 1007, in _request
raise APIConnectionError(request=request) from err
openai.APIConnectionError: Connection error.
HuanzhiMao changed the title to [BFCL] "Internal Server Error" when running Qwen2.5-7B-Instruct using vLLM on Dec 19, 2024
Hey @QingChengLineOne,
Thanks for the issue.
Could you try spinning up the vLLM server directly in your terminal and see if that works? (i.e., in the terminal: vllm serve Qwen/Qwen2.5-7B-Instruct --port 1053 --dtype bfloat16 --tensor-parallel-size 2 --gpu-memory-utilization 0.9)
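(If the standalone server comes up, a minimal client call along the request path that fails in base_oss_handler._query_prompting can help isolate whether the server or the BFCL harness is at fault. A sketch only: the prompt and sampling parameters below are illustrative, not the handler's actual values.)

```python
from openai import OpenAI

# Reproduce the same request path the BFCL handler uses
# (client.completions.create, per the traceback above).
client = OpenAI(base_url="http://localhost:1053/v1", api_key="EMPTY")

response = client.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    prompt="Hello, world!",  # illustrative prompt, not the handler's actual input
    max_tokens=64,
    temperature=0.0,
)
print(response.choices[0].text)
```

If this call also returns a 500 or a disconnect, the problem is on the vLLM side (check the server logs for the corresponding request); if it succeeds, the issue is more likely in how the harness launches or talks to the server.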