Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iommu update for Rocky Linux 9 #21

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Iommu update for Rocky Linux 9 #21

wants to merge 2 commits into from

Conversation

technowhizz
Copy link

@technowhizz technowhizz commented Mar 1, 2024

When enabling iommu on RL9 hypervisors the check when: "'Intel' in ansible_facts.processor.0" kept failing as it returned 0. Not sure how the check was working before but now ansible_facts.processor.1 contains the line that has Intel in it as ansible combines the lists into one list. We now use a seach for Intel in the list to pass the check: when: ansible_facts.processor | select('search', 'Intel') | list | length > 0

Additionally, added support for specifying vfio-pci Id's in the cmdline.

@technowhizz technowhizz requested a review from jovial March 1, 2024 11:23
@technowhizz technowhizz self-assigned this Mar 1, 2024
@technowhizz technowhizz requested a review from a team as a code owner March 1, 2024 11:23
@@ -3,11 +3,11 @@
ansible.builtin.include_role:
name: stackhpc.linux.grubcmdline
vars:
kernel_cmdline: # noqa: var-naming[no-role-prefix]
- intel_iommu=on
kernel_cmdline: "{{ ['intel_iommu=on'] + (['vfio-pci.ids=' + vfio_pci_ids] if vfio_pci_ids is defined else []) }}" # noqa: var-naming[no-role-prefix]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the vfio-pci.ids parameter actually related to the IOMMU? You can use the grubcmdline role directly to add kernel args.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, vfio_pci_ids is not included in the README or defaults, and does not have the role name as a prefix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VFIO driver is an IOMMU/device agnostic framework for exposing direct device access to userspace, in a secure, IOMMU protected environment. In other words, this allows safe 2, non-privileged, userspace drivers.

https://docs.kernel.org/driver-api/vfio.html

I put it in here based on this. Do you think I shouldn't include it?

I've also mentioned it in the docs now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For GPU passthrough there are a few other changes we can make, such as blacklisting the nouveau driver. Should we create a gpu-passthrough role to add this configuration?

@@ -4,6 +4,10 @@

- `iommu_pt`: Default true, Configure passthrough mode, which doesn't require DMA translation.

## Host Vars

- `vfio_pci_ids`: Can optionally be set with the pci id of the device to pass-through.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth using the iommu prefix to stop variable collisions? Also I guess it can go in the "Variable" section of the README with the other variable.

kernel_cmdline_remove: # noqa: var-naming[no-role-prefix]
- ^intel_iommu=
when: "'Intel' in ansible_facts.processor.0"
- ^vfio-pci\.ids=
when: ansible_facts.processor | select('search', 'Intel') | list | length > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe vfio-pci is relevant for non-Intel processors too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I am not sure how the previous when clause even worked. Even on CentOS Stream 8, I see ansible_facts.processor looks like this:

  ansible_facts.processor:
  - '0'
  - GenuineIntel
  - Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz
  - '1'
  - GenuineIntel
  - Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz

So ansible_facts.processor.0 is always a string set to 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested with different Ansible versions for Yoga and for 2023.1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpu_facts come from /proc/cpuinfo - I've seen it returning 0 in the first line and GenuineIntel in the first line, maybe there was some bug in the past that caused lack of '0' in the first line

@technowhizz technowhizz force-pushed the iommu-rl9 branch 5 times, most recently from 5d6e3d9 to 92edcff Compare March 27, 2024 16:30
@technowhizz technowhizz force-pushed the iommu-rl9 branch 2 times, most recently from 874531e to 38ad1c1 Compare March 27, 2024 17:15
ansible.builtin.shell: |-
#!/bin/bash
set -eux
dracut -v -f /boot/initramfs-$(uname -r).img $(uname -r)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to be careful not to break ubuntu

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes the conditional to search for 'Intel' in the
ansible_facts.processor variable as the first item in the list is not
always consistent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants