
AMD GPU subset selection does not work #21454

Closed
jsevillaamd opened this issue Jan 31, 2024 · 5 comments
Labels
kind/bug
locked - please file new issue/PR

Comments

@jsevillaamd

Issue Description

Selecting a subset of AMD GPUs in Podman should work by passing the desired devices from /dev/dri/, but the container always gets every GPU in the node.

Steps to reproduce the issue


  1. podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 docker.io/rocm/dev-ubuntu-22.04:latest rocm-smi

Describe the results you received

rocm-smi reports every GPU in the node instead of the selected subset.

Describe the results you expected

It should only expose the GPUs passed in the device list, as Docker does.
In the following image we see the output from both Podman and Docker when mounting a single device (/dev/dri/renderD128) into the container.
(screenshot: rocm-smi output under Podman vs. Docker with only /dev/dri/renderD128 passed to the container)
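
One way to gather the same information without a screenshot is to list the render nodes that actually show up inside the container (same image and device path as above) and compare that with what rocm-smi reports:

podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 \
    docker.io/rocm/dev-ubuntu-22.04:latest ls -l /dev/dri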

podman info output

podman info
host:
  arch: amd64
  buildahVersion: 1.32.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon_2.0.25+ds1-1.1_amd64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: unknown'
  cpuUtilization:
    idlePercent: 99.85
    systemPercent: 0.04
    userPercent: 0.12
  cpus: 128
  databaseBackend: boltdb
  distribution:
    codename: jammy
    distribution: ubuntu
    version: "22.04"
  eventLogger: file
  freeLocks: 2048
  hostname: smc-r08-03
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 10012
      size: 1
    - container_id: 1
      host_id: 2918048
      size: 65536
    - container_id: 65537
      host_id: 3704480
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 10012
      size: 1
    - container_id: 1
      host_id: 2918048
      size: 65536
    - container_id: 65537
      host_id: 3704480
      size: 65536
  kernel: 6.2.0-39-generic
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 531311235072
  memTotal: 540844507136
  networkBackend: cni
  networkBackendInfo:
    backend: cni
    dns: {}
    package: kubernetes-cni_1.2.0-00_amd64
    path: /opt/cni/bin
  ociRuntime:
    name: runc
    package: runc_1.1.7-0ubuntu1~22.04.1_amd64
    path: /usr/sbin/runc
    version: |-
      runc version 1.1.7-0ubuntu1~22.04.1
      spec: 1.0.2-dev
      go: go1.18.1
      libseccomp: 2.5.3
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: false
    path: /run/user/10012/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.0.1-2_amd64
    version: |-
      slirp4netns version 1.0.1
      commit: 6a7b16babc95b6a3056b33fb45b74a6f62262dd4
      libslirp: 4.6.1
  swapFree: 0
  swapTotal: 0
  uptime: 82h 60m 31.00s (Approximately 3.42 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /shared/devtest/home/aacplexusd/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/tmp/aacplexusd/share/containers/storage
  graphRootAllocated: 729371230208
  graphRootUsed: 379214508032
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/10012/containers
  transientStore: false
  volumePath: /var/tmp/aacplexusd/share/containers/storage/volumes
version:
  APIVersion: 4.7.1
  Built: 1706343696
  BuiltTime: Sat Jan 27 02:21:36 2024
  GitCommit: ef83eeb9c7482826672f3efa12db3d61c88df6c4
  GoVersion: go1.21.0
  Os: linux
  OsArch: linux/amd64
  Version: 4.7.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

No response

Additional information

Additional information, such as whether the issue happens only occasionally, or only with a particular architecture or setting.

@jsevillaamd added the kind/bug label on Jan 31, 2024
@rhatdan (Member) commented Jan 31, 2024

@giuseppe PTAL

@giuseppe (Member):

I wonder if the different behavior is because we are not configuring the devices cgroup, since an unprivileged user cannot use it.

Do you see a different output if you run Podman as the root user (i.e. sudo podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 docker.io/rocm/dev-ubuntu-22.04:latest rocm-smi)?

If running Podman as root still behaves the same, please share the output of the following command for both Docker and Podman:

$RUNTIME run --rm --device=/dev/kfd --device=/dev/dri/renderD128 --rm fedora find /dev -exec stat -t \{\} \;
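
For instance, one way to capture the output from both runtimes for comparison (the file names are arbitrary):

docker run --rm --device=/dev/kfd --device=/dev/dri/renderD128 fedora \
    find /dev -exec stat -t \{\} \; > dev-docker.txt
podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 fedora \
    find /dev -exec stat -t \{\} \; > dev-podman.txt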

@jsevillaamd (Author) commented Jan 31, 2024

Hi @giuseppe, thanks for your quick answer.

You are right: with root it mounts just the one GPU when running sudo podman run --rm --device=/dev/kfd --device=/dev/dri/renderD128 docker.io/rocm/dev-ubuntu-22.04:latest rocm-smi

Is there any way to get that behavior with unprivileged users?

If it is not possible, is there a reason why? I would be happy to help implement it (but I would appreciate some guidance).

many thanks

@giuseppe (Member):

Hi, no, that is not possible because the kernel doesn't allow it. On cgroup v2, the devices cgroup requires eBPF, and that is not usable from a user namespace. On cgroup v1 there is a similar problem: delegation is not safe, so we do not use cgroups at all with unprivileged users.
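
For context, a rough way to observe this on a cgroup v2 host (assuming bpftool is installed; the container name and sleep duration below are arbitrary) is to start the container as root and look for the eBPF device-filter program that the OCI runtime attaches to the container's cgroup; that program is what enforces the --device list, and a rootless container gets no such program:

sudo podman run -d --name gputest --device=/dev/kfd --device=/dev/dri/renderD128 \
    docker.io/rocm/dev-ubuntu-22.04:latest sleep 300
sudo bpftool cgroup tree /sys/fs/cgroup | grep -i -B1 device
sudo podman rm -f gputest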

giuseppe closed this as not planned on Feb 1, 2024
@jsevillaamd (Author):

My tests are on Slurm + rootless Podman + AMD GPUs, and as far as I understand, rootless Podman does not fit Slurm jobs when the jobs are not node-exclusive (i.e. when they select only a subset of the GPUs).

The stale-locking-app bot added the locked - please file new issue/PR label, locked the issue as resolved, and limited conversation to collaborators on May 2, 2024.