nvidia-xrun not working at all #127

Open
Lacrymology opened this issue Jul 20, 2019 · 4 comments

@Lacrymology

I've noticed a few things. One is that regardless of what I put in /etc/X11/nvidia-xorg* (I got this from the Arch Wiki documentation, so this might not be the place to say it), /etc/default/nvidia-xrun is what actually holds the variables that the script uses.

Anyway, I've been fighting with the nvidia driver for a while. Now that I've finally been able to blacklist it properly, nvidia-xrun is still failing, but at least it's failing gracefully. One thing I found is this line in /etc/default/nvidia-xrun:

# Bus ID of the PCI express controller
CONTROLLER_BUS_ID=0000:00:01.0

That bus ID doesn't seem to exist on my box, and I'm not sure what to put there. Here's the output of lspci:

➜ lspci
00:00.0 Host bridge: Intel Corporation Broadwell-U Host Bridge -OPI (rev 09)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 5500 (rev 09)
00:03.0 Audio device: Intel Corporation Broadwell-U Audio Controller (rev 09)
00:14.0 USB controller: Intel Corporation Wildcat Point-LP USB xHCI Controller (rev 03)
00:16.0 Communication controller: Intel Corporation Wildcat Point-LP MEI Controller #1 (rev 03)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (3) I218-LM (rev 03)
00:1b.0 Audio device: Intel Corporation Wildcat Point-LP High Definition Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation Wildcat Point-LP PCI Express Root Port #6 (rev e3)
00:1c.1 PCI bridge: Intel Corporation Wildcat Point-LP PCI Express Root Port #3 (rev e3)
00:1c.4 PCI bridge: Intel Corporation Wildcat Point-LP PCI Express Root Port #5 (rev e3)
00:1d.0 USB controller: Intel Corporation Wildcat Point-LP USB EHCI Controller (rev 03)
00:1f.0 ISA bridge: Intel Corporation Wildcat Point-LP LPC Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation Wildcat Point-LP SATA Controller [AHCI Mode] (rev 03)
00:1f.3 SMBus: Intel Corporation Wildcat Point-LP SMBus Controller (rev 03)
00:1f.6 Signal processing controller: Intel Corporation Wildcat Point-LP Thermal Management Controller (rev 03)
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5227 PCI Express Card Reader (rev 01)
03:00.0 Network controller: Intel Corporation Wireless 7265 (rev 59)

The nvidia card is not listed because I'm using nvidia-xrun-pm.service, but its address is 04:00.0. I changed that in /etc/default/nvidia-xrun, but the controller bus ID is still raising an error: it complains that /sys/bus/pci/devices/0000:00:01.0/power/control doesn't exist.
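The error boils down to whether the sysfs node for the configured ID exists. Here is a minimal sketch of that check, assuming the usual sysfs layout /sys/bus/pci/devices/<id>/power/control. The helper names `power_control_path` and `has_power_control` and the `sysfs_root` parameter are made up for illustration and are not part of nvidia-xrun.

```python
from pathlib import Path


def power_control_path(bus_id, sysfs_root="/sys/bus/pci/devices"):
    """Build the power/control path for a PCI bus ID (hypothetical helper)."""
    return Path(sysfs_root) / bus_id / "power" / "control"


def has_power_control(bus_id, sysfs_root="/sys/bus/pci/devices"):
    """Return True if the node named in the error message exists."""
    return power_control_path(bus_id, sysfs_root).is_file()


if __name__ == "__main__":
    # 0000:00:01.0 is the default from /etc/default/nvidia-xrun;
    # the 0000:00:1c.* entries are the PCI bridges listed by lspci above.
    for candidate in ("0000:00:01.0", "0000:00:1c.0", "0000:00:1c.1", "0000:00:1c.4"):
        print(candidate, has_power_control(candidate))
```

An ID for which this prints False would presumably trigger the same complaint if used as CONTROLLER_BUS_ID. The check only says which IDs exist, not which bridge the card sits behind.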

Below is dmesg after I try to run it. In particular, I noticed the following lines:

[  251.484143] ACPI Warning: \_SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20190509/nsarguments-59)
[  251.791305] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[  251.791306] [drm] No driver support for vblank timestamp query.

Here's the full output:

=? res=success'
[  236.945689] audit: type=1130 audit(1563638171.923:158): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty3 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  236.984048] audit: type=1131 audit(1563638171.960:159): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  236.986936] audit: type=1130 audit(1563638171.963:160): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  236.989082] audit: type=1131 audit(1563638171.966:161): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  236.991888] audit: type=1130 audit(1563638171.970:162): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  243.092940] audit: type=1006 audit(1563638178.068:163): pid=2239 uid=0 old-auid=4294967295 auid=1000 tty=tty2 old-ses=4294967295 ses=4 res=1
[  249.919817] pci 0000:04:00.0: [10de:1347] type 00 class 0x030200
[  249.919862] pci 0000:04:00.0: reg 0x10: [mem 0xf1000000-0xf1ffffff]
[  249.919886] pci 0000:04:00.0: reg 0x14: [mem 0xc0000000-0xcfffffff 64bit pref]
[  249.919909] pci 0000:04:00.0: reg 0x1c: [mem 0xd0000000-0xd1ffffff 64bit pref]
[  249.919926] pci 0000:04:00.0: reg 0x24: [io  0x3000-0x307f]
[  249.919943] pci 0000:04:00.0: reg 0x30: [mem 0xfff80000-0xffffffff pref]
[  249.920120] pci 0000:04:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5 GT/s x4 link at 0000:00:1c.4 (capable of 31.504 Gb/s with 8 GT/s x4 link)
[  249.920966] pci 0000:04:00.0: BAR 1: assigned [mem 0xc0000000-0xcfffffff 64bit pref]
[  249.920986] pci 0000:04:00.0: BAR 3: assigned [mem 0xd0000000-0xd1ffffff 64bit pref]
[  249.921002] pci 0000:04:00.0: BAR 0: assigned [mem 0xf1000000-0xf1ffffff]
[  249.921012] pci 0000:04:00.0: BAR 6: no space for [mem size 0x00080000 pref]
[  249.921015] pci 0000:04:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
[  249.921020] pci 0000:04:00.0: BAR 5: assigned [io  0x3000-0x307f]
[  250.982821] IPMI message handler: version 39.2
[  250.998891] ipmi device interface
[  251.222618] nvidia: loading out-of-tree module taints kernel.
[  251.222627] nvidia: module license 'NVIDIA' taints kernel.
[  251.222628] Disabling lock debugging due to kernel taint
[  251.229309] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[  251.238248] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[  251.339070] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  430.34  Wed Jun 26 12:19:48 CDT 2019
[  251.388277] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 235
[  251.429938] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  430.34  Wed Jun 26 12:15:10 CDT 2019
[  251.455750] [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
[  251.484143] ACPI Warning: \_SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20190509/nsarguments-59)
[  251.791305] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[  251.791306] [drm] No driver support for vblank timestamp query.
[  251.791309] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:04:00.0 on minor 1
[  255.829134] [drm] [nvidia-drm] [GPU ID 0x00000400] Unloading driver
[  255.857069] nvidia-modeset: Unloading
[  256.063743] nvidia-uvm: Unloaded the UVM driver in 8 mode
[  256.081762] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[  292.513621] audit: type=1131 audit(1563638227.486:164): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  292.524410] audit: type=1130 audit(1563638227.496:165): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  292.524420] audit: type=1131 audit(1563638227.496:166): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  292.525516] audit: type=1130 audit(1563638227.499:167): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
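
The log above contains a line of the form "pci 0000:04:00.0: ... limited by 5 GT/s x4 link at 0000:00:1c.4", which names a port upstream of the card. A rough sketch for pulling that pair out of saved dmesg text follows. The function name `limiting_ports` and the regular expression are illustrative assumptions. The kernel reports the slowest link on the path, which is not necessarily the direct parent bridge, so treat the result as a hint only.

```python
import re

# Matches the kernel's PCIe bandwidth notice, for example:
# "pci 0000:04:00.0: 16.000 Gb/s available PCIe bandwidth,
#  limited by 5 GT/s x4 link at 0000:00:1c.4 (...)"
BUS_ID = r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]"
LINK_RE = re.compile(
    r"pci (?P<device>" + BUS_ID + r"): "
    r".*limited by .* link at (?P<port>" + BUS_ID + r")"
)


def limiting_ports(dmesg_text):
    """Map each device to the port named in its 'limited by ... link at' line."""
    found = {}
    for line in dmesg_text.splitlines():
        match = LINK_RE.search(line)
        if match:
            found[match.group("device")] = match.group("port")
    return found
```

Feeding it the output of dmesg (for example, saved to a file and read back as a string) gives a dictionary keyed by device bus ID.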
@michelesr
Contributor

This has been explained several times in response to similar issues, so I think the problem here is a lack of documentation (maybe the README should be more explanatory).

However, you should be able to find the controller bus ID in the output of lshw. For example, in my case it looks something like this:

     *-pci
          description: Host bridge
          product: 8th Gen Core Processor Host Bridge/DRAM Registers
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 07
          width: 32 bits
          clock: 33MHz
          configuration: driver=skl_uncore
          resources: irq:0
        *-pci:0
             description: PCI bridge
             product: Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16)
             vendor: Intel Corporation
             physical id: 1
             bus info: pci@0000:00:01.0
             version: 07
             width: 32 bits
             clock: 33MHz
             capabilities: pci pm msi pciexpress normal_decode bus_master cap_list
             configuration: driver=pcieport
             resources: irq:122 ioport:3000(size=4096) memory:ec000000-ed0fffff ioport:c0000000(size=301989888)
           *-display UNCLAIMED
                description: 3D controller
                product: GP107M [GeForce GTX 1050 Ti Mobile]
                vendor: NVIDIA Corporation
                physical id: 0
                bus info: pci@0000:01:00.0
                version: a1
                width: 64 bits
                clock: 33MHz
                capabilities: pm msi pciexpress bus_master cap_list
                configuration: latency=0
                resources: memory:ec000000-ecffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:3000(size=128) memory:ed000000-ed07ffff

The ID you're looking for is that of the PCI bridge that hosts the card, in my case 0000:00:01.0. Hope this helps.
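
The same parent/child relationship is also visible in sysfs, where each entry under /sys/bus/pci/devices is a symlink into a directory tree that nests a device inside the bridge it hangs off. Below is a small sketch under that assumption. The names `parent_bridge` and `controller_for` are hypothetical, and it only works while the card is present on the bus (not after it has been removed for power management).

```python
import os


def parent_bridge(resolved_path):
    """Return the name of the directory that contains a PCI device node.

    For a device directly on the root bus this is something like
    'pci0000:00', which is not a bridge bus ID.
    """
    return os.path.basename(os.path.dirname(resolved_path.rstrip("/")))


def controller_for(device_bus_id, sysfs_root="/sys/bus/pci/devices"):
    """Resolve the device symlink and return its parent's name, or None."""
    link = os.path.join(sysfs_root, device_bus_id)
    if not os.path.exists(link):
        return None  # device not currently on the bus
    return parent_bridge(os.path.realpath(link))


if __name__ == "__main__":
    # 0000:01:00.0 is the card's bus ID in the lshw output above.
    print(controller_for("0000:01:00.0"))
```

On a layout like the one in the lshw output above, the resolved path would end in .../0000:00:01.0/0000:01:00.0, so the function returns the bridge's ID.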

@Lacrymology
Author

Okay, thanks. Mine doesn't say "PCIe Controller" anywhere, but I'll try this *-pci:2 device; it looks about right.

Thank you

     *-pci
          description: Host bridge
          product: Broadwell-U Host Bridge -OPI
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 09
          width: 32 bits
          clock: 33MHz
          configuration: driver=bdw_uncore
          resources: irq:0
        *-display
             description: VGA compatible controller
             product: HD Graphics 5500
             vendor: Intel Corporation
             physical id: 2
             bus info: pci@0000:00:02.0
             version: 09
             width: 64 bits
             clock: 33MHz
             capabilities: vga_controller bus_master cap_list rom
             configuration: driver=i915 latency=0
             resources: irq:59 memory:f0000000-f0ffffff memory:e0000000-efffffff ioport:4000(size=64) memory:c0000-dffff
......
        *-pci:2
             description: PCI bridge
             product: Wildcat Point-LP PCI Express Root Port #5
             vendor: Intel Corporation
             physical id: 1c.4
             bus info: pci@0000:00:1c.4
             version: e3
             width: 32 bits
             clock: 33MHz
             capabilities: pci normal_decode bus_master cap_list
             configuration: driver=pcieport
             resources: irq:44 ioport:3000(size=4096) memory:f1000000-f1ffffff ioport:c0000000(size=301989888)
           *-display UNCLAIMED
                description: 3D controller
                product: GM108M [GeForce 940M]
                vendor: NVIDIA Corporation
                physical id: 0
                bus info: pci@0000:04:00.0
                version: a2
                width: 64 bits
                clock: 33MHz
                capabilities: bus_master cap_list
                configuration: latency=0
                resources: memory:f1000000-f1ffffff memory:c0000000-cfffffff memory:d0000000-d1ffffff ioport:3000(size=128)

@Lacrymology
Author

@michelesr that did help, but I'm still not able to run it, with the same dmesg errors:

[  120.810746] pci 0000:04:00.0: [10de:1347] type 00 class 0x030200
[  120.810775] pci 0000:04:00.0: reg 0x10: [mem 0xf1000000-0xf1ffffff]
[  120.810789] pci 0000:04:00.0: reg 0x14: [mem 0xc0000000-0xcfffffff 64bit pref]
[  120.810802] pci 0000:04:00.0: reg 0x1c: [mem 0xd0000000-0xd1ffffff 64bit pref]
[  120.810811] pci 0000:04:00.0: reg 0x24: [io  0x3000-0x307f]
[  120.810821] pci 0000:04:00.0: reg 0x30: [mem 0xfff80000-0xffffffff pref]
[  120.810936] pci 0000:04:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5 GT/s x4 link at 0000:00:1c.4 (capable of 31.504 Gb/s with 8 GT/s x4 link)
[  120.811376] pci 0000:04:00.0: BAR 1: assigned [mem 0xc0000000-0xcfffffff 64bit pref]
[  120.811387] pci 0000:04:00.0: BAR 3: assigned [mem 0xd0000000-0xd1ffffff 64bit pref]
[  120.811396] pci 0000:04:00.0: BAR 0: assigned [mem 0xf1000000-0xf1ffffff]
[  120.811401] pci 0000:04:00.0: BAR 6: no space for [mem size 0x00080000 pref]
[  120.811403] pci 0000:04:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
[  120.811405] pci 0000:04:00.0: BAR 5: assigned [io  0x3000-0x307f]
[  121.877981] IPMI message handler: version 39.2
[  121.885331] ipmi device interface
[  122.819521] nvidia: loading out-of-tree module taints kernel.
[  122.819534] nvidia: module license 'NVIDIA' taints kernel.
[  122.819535] Disabling lock debugging due to kernel taint
[  122.826941] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[  122.836162] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[  122.937445] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  430.34  Wed Jun 26 12:19:48 CDT 2019
[  123.081084] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 235
[  123.160158] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  430.34  Wed Jun 26 12:15:10 CDT 2019
[  123.207071] [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
[  123.231503] ACPI Warning: \_SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20190509/nsarguments-59)
[  123.540530] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[  123.540532] [drm] No driver support for vblank timestamp query.
[  123.540535] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:04:00.0 on minor 1
[  128.383771] [drm] [nvidia-drm] [GPU ID 0x00000400] Unloading driver
[  128.414612] nvidia-modeset: Unloading
[  128.724437] nvidia-uvm: Unloaded the UVM driver in 8 mode
[  128.749476] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237
[  185.676286] audit: type=1131 audit(1563819344.571:175): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  185.700457] audit: type=1130 audit(1563819344.594:176): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  185.700513] audit: type=1131 audit(1563819344.594:177): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  185.702775] audit: type=1130 audit(1563819344.597:178): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=getty@tty2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  189.057391] audit: type=1006 audit(1563819347.951:179): pid=1897 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=4 res=1
[  199.645017] audit: type=1131 audit(1563819358.541:180): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user@979 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  199.652062] audit: type=1131 audit(1563819358.547:181): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user-runtime-dir@979 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  204.266283] audit: type=1130 audit(1563819363.161:182): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=rtkit-daemon comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[  210.884106] audit: type=1130 audit(1563819369.777:183): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=udisks2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

Lacrymology reopened this Jul 22, 2019
@michelesr
Contributor

I'm not an expert on nvidia drivers, but TBH I don't see anything alarming in the kernel log (tainting the kernel is normal, as nvidia is not a module from the original kernel codebase).

What exactly is the problem you're getting? Can you post the output of the nvidia-xrun execution?
