
container-runtime: add nvidia-docker #15927

Merged: 3 commits into kubernetes:master on Sep 20, 2023
Conversation

@d4l3k (Contributor) commented Feb 26, 2023

This adds a new container-runtime that sets the correct configuration options for use with https://github.com/NVIDIA/k8s-device-plugin#nvidia-device-plugin-for-kubernetes

This requires a custom Dockerfile with nvidia-container-toolkit and a matching libnvidia-ml.so.1 file. The driver version on the host needs to match the NVML version in the container exactly.
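One way to check that the versions line up (a sketch; library paths vary by distro, this assumes /usr/lib as in the commands below):

# driver version reported by the host kernel driver
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# the NVML version is encoded in the real file name the SONAME symlink points at
ls -l /usr/lib/libnvidia-ml.so.1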

kicbase Dockerfile

FROM gcr.io/k8s-minikube/kicbase:v0.0.37

# Add NVIDIA's apt repository and install the container toolkit
# (the kicbase build runs as root, so sudo is unnecessary)
RUN curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | apt-key add -
RUN curl -s -L https://nvidia.github.io/libnvidia-container/$(. /etc/os-release; echo $ID$VERSION_ID)/libnvidia-container.list | tee /etc/apt/sources.list.d/libnvidia-container.list
RUN apt-get update && apt-get install -y nvidia-container-toolkit && rm -rf /var/lib/apt/lists/*

# Overlay the host's NVML library; its version must match the host driver exactly
ADD libnvidia-ml.so.1 /usr/lib/libnvidia-ml.so.1

Commands to run:

# Build the Docker image (libnvidia-ml.so.1 must be in the build context)
cp /usr/lib/libnvidia-ml.so.1 .
docker build -t nvidiakic .

# Example: start minikube with nvidia-docker
go run ./cmd/minikube start \
  --container-runtime nvidia-docker \
  --base-image='nvidiakic' \
  --iso-url='https://storage.googleapis.com/minikube/iso/minikube-v1.29.0-amd64.iso,https://github.com/kubernetes/minikube/releases/download/v1.29.0/minikube-v1.29.0-amd64.iso' \
  --driver docker --cpus=max --memory=max --nodes=2

# Install the upstream k8s-device-plugin (not the minikube addon!)
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.13.0/nvidia-device-plugin.yml
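To sanity-check that the device plugin registered the GPU and that a pod can actually claim it, something like the following should work (a sketch; the pod name and CUDA image tag are illustrative assumptions):

# GPUs each node advertises as allocatable
kubectl get nodes -o jsonpath='{.items[*].status.allocatable.nvidia\.com/gpu}'

# one-shot pod that requests a GPU and prints the driver table
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.0.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
kubectl logs gpu-test   # after the pod completes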

Related issue #10229

@linux-foundation-easycla (bot) commented Feb 26, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: d4l3k / name: Tristan Rice (2c3b698)
  • ✅ login: spowelljr / name: Steven Powell (262f8ce, 2a1f5b9)

@k8s-ci-robot (Contributor)

Welcome @d4l3k!

It looks like this is your first PR to kubernetes/minikube 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/minikube has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot added the needs-ok-to-test and cncf-cla: no labels on Feb 26, 2023
@k8s-ci-robot (Contributor)

Hi @d4l3k. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the size/M label on Feb 26, 2023
@minikube-bot (Collaborator)

Can one of the admins verify this patch?

@d4l3k (Contributor, Author) commented Feb 26, 2023

I just signed the CLA; it hasn't updated yet though.

@k8s-ci-robot added the cncf-cla: yes label and removed the cncf-cla: no label on Feb 26, 2023
@d4l3k (Contributor, Author) commented Feb 26, 2023

@sharifelgamal when you get a chance could you take a look at this PR?

I'm also wondering if you have any suggestions on how to handle the NVIDIA dependencies. I assume it doesn't make sense to add them to kicbase? Also, libnvidia-ml needs to match the host. We could try to grab it at runtime and overlay it in the container, but that's pretty hacky. For my use case, building a custom kicbase is an acceptable step right now, so this PR is sufficient.

Thanks!

@sazzy4o commented Feb 28, 2023

The NVIDIA OCI hook creates the following mounts (on my Ubuntu machine), so I don't think it's crazy to do overlays; that seems to be what NVIDIA does for its own integrations:

--mount=type=bind,source=/usr/bin/nvidia-smi,destination=/usr/bin/nvidia-smi,ro=true \
--mount=type=bind,source=/usr/bin/nvidia-debugdump,destination=/usr/bin/nvidia-debugdump,ro=true \
--mount=type=bind,source=/usr/bin/nvidia-persistenced,destination=/usr/bin/nvidia-persistenced,ro=true \
--mount=type=bind,source=/usr/bin/nvidia-cuda-mps-control,destination=/usr/bin/nvidia-cuda-mps-control,ro=true \
--mount=type=bind,source=/usr/bin/nvidia-cuda-mps-server,destination=/usr/bin/nvidia-cuda-mps-server,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-ml.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.525.78.01,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.525.78.01,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libcuda.so,destination=/usr/lib/x86_64-linux-gnu/libcuda.so.525.78.01,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.525.78.01,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.525.78.01,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-ml.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libcuda.so,destination=/usr/lib/x86_64-linux-gnu/libcuda.so.1,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-ml.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-ml.so,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libcuda.so,destination=/usr/lib/x86_64-linux-gnu/libcuda.so,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so,ro=true \
--mount=type=bind,source=/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so,destination=/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so,ro=true \
--mount=type=bind,source=/run/nvidia-persistenced/socket,destination=/run/nvidia-persistenced/socket \
--mount=type=bind,source=/dev/nvidiactl,destination=/dev/nvidiactl \
--mount=type=bind,source=/dev/nvidia-uvm,destination=/dev/nvidia-uvm \
--mount=type=bind,source=/dev/nvidia-uvm-tools,destination=/dev/nvidia-uvm-tools \
--mount=type=bind,source=/dev/nvidia0,destination=/dev/nvidia0 \
--mount=type=bind,source=/proc/driver/nvidia/gpus/0000:65:00.0,destination=/proc/driver/nvidia/gpus/0000:65:00.0 \
--mount=type=tmpfs,destination=/proc/driver/nvidia \

@k8s-ci-robot added the needs-rebase label on Mar 23, 2023
@k8s-ci-robot (Contributor)

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@anthonyalayo

What's left here? Merging this would be great!

@medyagh (Member) commented Sep 6, 2023

@d4l3k sorry for the long delay in PR review. I'd like to know how this PR differs from the nvidia addon, and whether the nvidia addon can be enabled with this PR as well:
https://minikube.sigs.k8s.io/docs/handbook/addons/nvidia/

I'd also like you to contribute the kicbase changes so we can test it too.

@d4l3k (Contributor, Author) commented Sep 6, 2023

KVM requires PCIe passthrough of the GPU to the underlying VM. This PR instead makes the GPU device available via the host GPU driver, so it can be shared between the host and the minikube workers.

@spowelljr (Member)

Hi @d4l3k, I tried your example and it seems to work. Is there a way I can confirm that the pods have access to the GPUs? I tried using TensorFlow but was getting:

2023-09-11 22:46:18.090397: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

@d4l3k (Contributor, Author) commented Sep 11, 2023

@spowelljr do you have access to the nvidia-smi command line within the container? That should tell you whether you can access the GPUs.
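A quick way to check from the host, assuming the docker driver and the default profile name minikube:

# run nvidia-smi inside the minikube node container
docker exec -it minikube nvidia-smi
# or via minikube itself
minikube ssh -- nvidia-smi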

I was testing this with TorchX (https://pytorch.org/torchx/latest/quickstart.html), but that requires some familiarity with PyTorch to get started.

@d4l3k (Contributor, Author) commented Sep 11, 2023

I wonder if there are other, better options here as well via some of the other runtimes, which might be easier to integrate: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuration
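For example, recent toolkit releases ship an nvidia-ctk helper that rewrites the runtime's config for you (a sketch, run on the host that owns the Docker daemon):

# register the NVIDIA runtime in /etc/docker/daemon.json and restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker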

@spowelljr (Member)

I tried getting nvidia-smi in the container but haven't been successful yet.

I see examples like this (https://jacobtomlinson.dev/posts/2022/how-to-check-your-nvidia-driver-and-cuda-version-in-kubernetes/)

But when I try, nvidia-smi is not present; maybe they re-pushed the image with changes?

I tried installing the NVIDIA driver in a pod but got Failed to initialize NVML: Unknown Error when running nvidia-smi.

I'm new to running GPUs and AI/ML workloads in Kubernetes, so I'm open to any tips you may have.

@d4l3k (Contributor, Author) commented Sep 12, 2023

@spowelljr have you tried getting nvidia-smi to work under just nvidia-docker as a first step? Once you have a good repro, you can then try it in minikube as well.
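NVIDIA's install guide uses essentially this as the host-level smoke test (the Ubuntu image tag is an illustrative choice); if it prints the GPU table, the host driver and toolkit are fine, and any remaining issue is inside minikube:

docker run --rm --gpus=all ubuntu:22.04 nvidia-smi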

@spowelljr (Member)

@d4l3k I was able to get nvidia-smi to run in a container and have TensorFlow pick up the GPU, but I needed a couple more steps.

  1. I had to install NVIDIA Container Toolkit on my host machine
  2. I had to add --gpus=all to the command that starts the minikube container

Then I was able to use your PR to start minikube (a sketch of those host-side steps is below).
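A sketch of the two host-side steps, assuming a Debian/Ubuntu host that already has NVIDIA's apt repository configured (as in the kicbase Dockerfile above):

# 1. install the toolkit on the host and restart Docker
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

# 2. the docker driver's node container then needs the GPU flag, i.e. the
#    underlying invocation becomes (illustrative only): docker run --gpus=all ... nvidiakic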

@d4l3k (Contributor, Author) commented Sep 15, 2023

Ah nice! That makes sense: it needs to be installed on the host to mount it, and a Docker flag needs to be set as well.

@spowelljr (Member)

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Sep 19, 2023
@spowelljr removed the needs-rebase label on Sep 19, 2023

@spowelljr (Member)

@d4l3k If the tests look good I'll merge this PR. I'm working on a follow-up PR that will make this work via a flag when starting minikube. I've discovered that I was able to get it to work without ADD libnvidia-ml.so.1 /usr/lib/libnvidia-ml.so.1, which could simplify the flow. Could you confirm whether you're able to get it to work without it, or explain why you needed it?

@d4l3k (Contributor, Author) commented Sep 19, 2023

@spowelljr thanks for pushing this through! Looks good to me :)

As for the ADD libnvidia-ml.so.1, it's been a while, but I believe this was because I had a different CUDA/driver version in the container than on my host machine. I was using Arch Linux, which had a newer CUDA and driver version than the one available in the container OS. If you were running the same distribution on the host and in the container, that may have avoided the issue.

It may also be that nvidia-container-toolkit takes care of it now and it's no longer necessary.


@spowelljr (Member)

/retest-this-please

@minikube-pr-bot

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 15927) |
+----------------+----------+---------------------+
| minikube start | 50.9s    | 50.4s               |
| enable ingress | 28.1s    | 28.5s               |
+----------------+----------+---------------------+

Times for minikube (PR 15927) start: 50.2s 50.6s 51.7s 49.5s 49.6s
Times for minikube start: 52.2s 50.9s 51.3s 49.9s 50.2s

Times for minikube ingress: 27.7s 28.1s 27.7s 28.6s 28.2s
Times for minikube (PR 15927) ingress: 27.2s 28.1s 29.1s 28.6s 29.6s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 15927) |
+----------------+----------+---------------------+
| minikube start | 22.5s    | 23.0s               |
| enable ingress | 21.0s    | 20.8s               |
+----------------+----------+---------------------+

Times for minikube start: 24.4s 23.3s 21.6s 21.1s 21.8s
Times for minikube (PR 15927) start: 23.9s 21.7s 22.0s 21.7s 25.8s

Times for minikube (PR 15927) ingress: 20.8s 20.8s 20.8s 20.8s 20.8s
Times for minikube ingress: 20.9s 20.8s 21.8s 20.8s 20.8s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 15927) |
+----------------+----------+---------------------+
| minikube start | 22.0s    | 21.5s               |
| enable ingress | 34.1s    | 34.7s               |
+----------------+----------+---------------------+

Times for minikube start: 20.9s 21.1s 20.6s 24.1s 23.2s
Times for minikube (PR 15927) start: 19.6s 20.5s 23.6s 20.4s 23.3s

Times for minikube ingress: 27.4s 49.4s 31.3s 31.3s 31.3s
Times for minikube (PR 15927) ingress: 47.3s 31.3s 32.3s 31.3s 31.3s

@minikube-pr-bot

These are the flake rates of all failed tests.

Environment           Failed Test                                                          Flake Rate (%)
KVM_Linux_containerd  TestErrorSpam/setup                                                   0.00
KVM_Linux             TestCertOptions                                                       0.56
KVM_Linux             TestStartStop/group/old-k8s-version/serial/VerifyKubernetesImages     0.56
Hyperkit_macOS        TestStartStop/group/old-k8s-version/serial/VerifyKubernetesImages     0.58
Hyperkit_macOS        TestAddons/Setup                                                      2.33
Hyperkit_macOS        TestJSONOutput/start/parallel/DistinctCurrentSteps                    2.33
Hyperkit_macOS        TestJSONOutput/start/parallel/IncreasingCurrentSteps                  2.33
Hyperkit_macOS        TestMinikubeProfile                                                  16.28


@spowelljr (Member) left a comment


Thanks for the PR!

@spowelljr merged commit 075f1a1 into kubernetes:master on Sep 20, 2023 (13 of 15 checks passed)
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: d4l3k, spowelljr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Sep 20, 2023