
opendatahub/vllm: include adapter smoke test for main #52662

Merged
@@ -14,7 +14,7 @@ images:
- name: nvcc_threads
value: "2"
context_dir: .
- dockerfile_path: Dockerfile
+ dockerfile_path: Dockerfile.ubi
to: vllm-build-main-cuda
promotion:
to:
@@ -56,13 +56,15 @@ tests:

# we will need to download test models off HF hub
unset HF_HUB_OFFLINE
# spin up the server and run it in the background, allowing time for image downloads

# spin up the OpenAI-compatible API server in the background
python -m vllm.entrypoints.openai.api_server &
server_pid=$!

# wait for the server to be up
sleep 60
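The fixed `sleep 60` above can either waste time or race on a slow runner. A minimal readiness-poll sketch (assuming the server exposes an HTTP endpoint that returns 200 once it is up, such as vLLM's `/health`) could look like:

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet; retry
        time.sleep(interval)
    return False
```

Polling returns as soon as the server is ready instead of always paying the full wait, and it fails fast with a clear signal when the server never comes up.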

# OpenAI API tests
curl -v --no-progress-meter --fail-with-body \
localhost:8000/v1/models | python -m json.tool || \
(kill -9 $server_pid && exit 1)
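Piping through `json.tool` only verifies that the endpoint returned valid JSON. A slightly stricter check (a sketch, assuming the OpenAI-compatible list shape `{"object": "list", "data": [...]}` that `/v1/models` returns) could validate the payload and extract the served model ids:

```python
import json


def model_ids(raw: str) -> list[str]:
    """Parse an OpenAI-style /v1/models response and return the model ids."""
    payload = json.loads(raw)
    if payload.get("object") != "list":
        raise ValueError("unexpected /v1/models payload shape")
    return [model["id"] for model in payload.get("data", [])]
```

This catches a well-formed error body that would still pass `json.tool`.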
@@ -76,7 +78,35 @@ tests:
localhost:8000/v1/completions | python -m json.tool || \
(kill -9 $server_pid && exit 1)

- echo "success"
+ echo "OpenAI API success"
kill -9 $server_pid

# spin up the grpc server in the background
python -m vllm_tgis_adapter &
server_pid=$!

# wait for the server to be up
sleep 60


# get grpcurl
curl --no-progress-meter --location --output grpcurl.tar.gz \
https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/grpcurl_1.9.1_linux_x86_64.tar.gz
tar -xf grpcurl.tar.gz

# get grpc proto
curl --no-progress-meter --location --remote-name \
https://github.com/opendatahub-io/text-generation-inference/raw/main/proto/generation.proto

# GRPC API test
./grpcurl -v \
-plaintext \
-proto generation.proto \
-d '{ "requests": [{"text": "A red fedora symbolizes "}]}' \
localhost:8033 \
fmaas.GenerationService/Generate || (kill -9 $server_pid && exit 1)

echo "GRPC API success"
kill -9 $server_pid
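For reference, the `-d` payload passed to `grpcurl` above can be built programmatically rather than hand-written; a small sketch (the `requests`/`text` field names are taken from the request body used in the test, which follows `generation.proto`):

```python
import json


def generate_body(prompts):
    """Build the JSON body passed via grpcurl -d to fmaas.GenerationService/Generate."""
    return json.dumps({"requests": [{"text": p} for p in prompts]})
```

Generating the body this way avoids quoting mistakes when prompts contain special characters, since `json.dumps` handles the escaping.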
container:
clone: false