
AMD GPU subset selection does not work #21454

Closed
jsevillaamd opened this issue Jan 31, 2024 · 5 comments
Labels
kind/bug
locked - please file new issue/PR

Comments

@jsevillaamd

Issue Description

Selecting a subset of AMD GPUs in Podman should work by passing the desired devices from /dev/dri/, but the container always gets every GPU in the node.

Steps to reproduce the issue


  1. podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 docker.io/rocm/dev-ubuntu-22.04:latest rocm-smi

Describe the results you received

rocm-smi reports every GPU in the node instead of the selected subset.

Describe the results you expected

It should only expose the GPUs passed in the device list, as Docker does.
In the following image we see the output from both Podman and Docker when mounting a single device (/dev/dri/renderD128) into the container.
(screenshot: rocm-smi output under Podman vs. Docker with only /dev/dri/renderD128 passed to the container)
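
One way to gather the same information without a screenshot is to list the render nodes that actually show up inside the container (same image and device path as above) and compare that with what rocm-smi reports:

podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 \
    docker.io/rocm/dev-ubuntu-22.04:latest ls -l /dev/dri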

podman info output

podman info
host:
  arch: amd64
  buildahVersion: 1.32.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon_2.0.25+ds1-1.1_amd64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpuUtilization:
    idlePercent: 99.85
    systemPercent: 0.04
    userPercent: 0.12
  cpus: 128
  databaseBackend: boltdb
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: file
  freeLocks: 2048
  hostname: smc-r08-03
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 10012
      size: 1
    - container_id: 1
      host_id: 2918048
      size: 65536
    - container_id: 65537
      host_id: 3704480
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 10012
      size: 1
    - container_id: 1
      host_id: 2918048
      size: 65536
    - container_id: 65537
      host_id: 3704480
      size: 65536
  kernel: 6.2.0-39-generic
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 531311235072
  memTotal: 540844507136
  networkBackend: cni
  networkBackendInfo:
    backend: cni
    dns: {}
    package: kubernetes-cni_1.2.0-00_amd64
    path: /opt/cni/bin
  ociRuntime:
    name: runc
    package: runc_1.1.7-0ubuntu1~22.04.1_amd64
    path: /usr/sbin/runc
    version: |-
      runc version 1.1.7-0ubuntu1~22.04.1
      spec: 1.0.2-dev
      go: go1.18.1
      libseccomp: 2.5.3
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: false
    path: /run/user/10012/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.0.1-2_amd64
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 0
  swapTotal: 0
  uptime: 82h 60m 31.00s (Approximately 3.42 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /shared/devtest/home/aacplexusd/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/tmp/aacplexusd/share/containers/storage
  graphRootAllocated: 729371230208
  graphRootUsed: 379214508032
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/10012/containers
  transientStore: false
  volumePath: /var/tmp/aacplexusd/share/containers/storage/volumes
version:
  APIVersion: 4.7.1
  Built: 1706343696
  BuiltTime: Sat Jan 27 02:21:36 2024
  GitCommit: ef83eeb9c7482826672f3efa12db3d61c88df6c4
  GoVersion: go1.21.0
  Os: linux
  OsArch: linux/amd64
  Version: 4.7.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

No response

Additional information

Additional information, such as whether the issue happens only occasionally, or only with a particular architecture or setting.

@jsevillaamd added the kind/bug label on Jan 31, 2024
@rhatdan (Member) commented Jan 31, 2024

@giuseppe PTAL

@giuseppe (Member):

I wonder if the different behavior is because we are not configuring the devices cgroup, since an unprivileged user cannot use it.

Do you see a different output if you run Podman as the root user (i.e. sudo podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 docker.io/rocm/dev-ubuntu-22.04:latest rocm-smi)?

If running Podman as root still behaves the same, please share the output of the following command for both Docker and Podman:

$RUNTIME run --rm --device=/dev/kfd --device=/dev/dri/renderD128 --rm fedora find /dev -exec stat -t \{\} \;
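
For instance, one way to capture the output from both runtimes for comparison (the file names are arbitrary):

docker run --rm --device=/dev/kfd --device=/dev/dri/renderD128 fedora \
    find /dev -exec stat -t \{\} \; > dev-docker.txt
podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 fedora \
    find /dev -exec stat -t \{\} \; > dev-podman.txt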

@jsevillaamd (Author) commented Jan 31, 2024

Hi @giuseppe, thanks for your quick answer.

You are right: with root it mounts just the one GPU when running sudo podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 docker.io/rocm/dev-ubuntu-22.04:latest rocm-smi

Is there any way to get that behavior with unprivileged users?

If it is not possible, is there a reason why? I would be happy to help implement it (but I would appreciate some guidance).

many thanks

@giuseppe (Member):

Hi, no, that is not possible because the kernel doesn't allow it. On cgroup v2, the devices cgroup requires eBPF, and that is not usable from a user namespace. On cgroup v1 there is a similar problem: delegation is not safe, so we do not use cgroups at all with unprivileged users.
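
For context, a rough way to observe this on a cgroup v2 host (assuming bpftool is installed; the container name and sleep duration below are arbitrary) is to start the container as root and look for the eBPF device-filter program that the OCI runtime attaches to the container's cgroup; that program is what enforces the --device list, and a rootless container gets no such program:

sudo podman run -d --name gputest --device=/dev/kfd --device=/dev/dri/renderD128 \
    docker.io/rocm/dev-ubuntu-22.04:latest sleep 300
sudo bpftool cgroup tree /sys/fs/cgroup | grep -i -B1 device
sudo podman rm -f gputest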

giuseppe closed this as not planned on Feb 1, 2024
@jsevillaamd (Author):

My tests are on Slurm + rootless Podman + AMD GPUs, and as far as I understand, rootless Podman does not fit Slurm jobs when the jobs are not node-exclusive (i.e. when they select only a subset of the GPUs).

The stale-locking-app bot added the locked - please file new issue/PR label, locked the issue as resolved, and limited conversation to collaborators on May 2, 2024.