KVM self-tests fail on AMD hardware #47

Open
Wenzel opened this issue Feb 18, 2021 · 0 comments

Wenzel commented Feb 18, 2021

I tried to set up the Vagrant environment on an AMD processor, and the KVM self-tests fail:

TASK [run kvm self-tests] ******************************************************
fatal: [kvmi]: FAILED! => changed=true 
  cmd:
  - ./tools/testing/selftests/kvm/x86_64/kvmi_test
  delta: '0:00:00.607688'
  end: '2021-02-18 15:32:08.759027'
  msg: non-zero return code
  rc: 254
  start: '2021-02-18 15:32:08.151339'
  stderr: |-
    ==== Test Assertion Failure ====
      x86_64/kvmi_test.c:829: get_ucall(ctx->vm, ctx->vcpu_id, &uc)
      pid=8492 tid=8522 - Success
         1  0x0000000000403337: vcpu_worker at kvmi_test.c:828
         2  0x00007fe482c65fa2: ?? ??:0
         3  0x00007fe482b964ce: ?? ??:0
      No guest request
  stderr_lines:
  - ==== Test Assertion Failure ====
  - '  x86_64/kvmi_test.c:829: get_ucall(ctx->vm, ctx->vcpu_id, &uc)'
  - '  pid=8492 tid=8522 - Success'
  - "     1\t0x0000000000403337: vcpu_worker at kvmi_test.c:828"
  - "     2\t0x00007fe482c65fa2: ?? ??:0"
  - "     3\t0x00007fe482b964ce: ?? ??:0"
  - '  No guest request'
  stdout: |-
    Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
    Guest physical address width detected: 40
    KVMI version: 1
            singlestep: 0
            vmfunc: 0
            eptp: 0
            ve: 0
            spp: 0
    vcpu count: 1
    tsc_speed: 0 HZ
    get_registers rip 0x40bc0f
    cpuid(0, 0) => eax 0x0000000d, ebx 0x68747541, ecx 0x444d4163, edx 0x69746e65
    Hypercall event, rip 0x402e5d
    Breakpoint event, rip 0x402ee0, len 1
    CR4, old 0x220, new 0x40220
    Exception event: vector 6, error_code 0x0, cr2 0x0
  stdout_lines: <omitted>

Any ideas, @adlazar?
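
For reference, the raw register values in the stdout above can be decoded with a short script. This is only a sketch to make the log easier to read: the helper names (`cpuid_vendor`, `cr4_changed_bits`) and the small lookup tables are hypothetical and not part of kvmi_test, and the tables list only the entries needed here. It relies on two architectural facts: CPUID leaf 0 returns the vendor string in EBX, EDX, ECX order, and exception vector 6 is #UD (invalid opcode). It does not identify the root cause of the failure.

```python
import struct

# Values copied from the kvmi_test stdout in this report.
EBX, ECX, EDX = 0x68747541, 0x444D4163, 0x69746E65
CR4_OLD, CR4_NEW = 0x220, 0x40220
EXCEPTION_VECTOR = 6

# Subset of x86 CR4 bit names (illustrative, not exhaustive).
CR4_BITS = {5: "PAE", 9: "OSFXSR", 18: "OSXSAVE"}

# Subset of x86 exception vector names (illustrative, not exhaustive).
VECTORS = {3: "#BP", 6: "#UD", 13: "#GP", 14: "#PF"}


def cpuid_vendor(ebx, ecx, edx):
    """Rebuild the CPUID leaf 0 vendor string (register order EBX, EDX, ECX)."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def cr4_changed_bits(old, new):
    """Return the names of the CR4 bits that differ between old and new."""
    diff = old ^ new
    return [CR4_BITS.get(i, f"bit {i}") for i in range(64) if (diff >> i) & 1]


if __name__ == "__main__":
    print(cpuid_vendor(EBX, ECX, EDX))         # AuthenticAMD
    print(cr4_changed_bits(CR4_OLD, CR4_NEW))  # ['OSXSAVE']
    print(VECTORS[EXCEPTION_VECTOR])           # #UD
```

Read this way, the log shows the guest running on an AMD vCPU, the CR4 event setting the OSXSAVE bit, and the last reported event being an invalid-opcode exception just before the assertion fails.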

Wenzel pushed a commit that referenced this issue Nov 3, 2021
Normally the zero fill would hide the missing initialization, but an
errant set to desc_size in reg_create() causes a crash:

  BUG: unable to handle page fault for address: 0000000800000000
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 5 PID: 890 Comm: ib_write_bw Not tainted 5.15.0-rc4+ #47
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:mlx5_ib_dereg_mr+0x14/0x3b0 [mlx5_ib]
  Code: 48 63 cd 4c 89 f7 48 89 0c 24 e8 37 30 03 e1 48 8b 0c 24 eb a0 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 89 fb 48 83 ec 30 <48> 8b 2f 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 8b 87 c8
  RSP: 0018:ffff88811afa3a60 EFLAGS: 00010286
  RAX: 000000000000001c RBX: 0000000800000000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000800000000
  RBP: 0000000800000000 R08: 0000000000000000 R09: c0000000fffff7ff
  R10: ffff88811afa38f8 R11: ffff88811afa38f0 R12: ffffffffa02c7ac0
  R13: 0000000000000000 R14: ffff88811afa3cd8 R15: ffff88810772fa00
  FS:  00007f47b9080740(0000) GS:ffff88852cd40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000800000000 CR3: 000000010761e003 CR4: 0000000000370ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mlx5_ib_free_odp_mr+0x95/0xc0 [mlx5_ib]
   mlx5_ib_dereg_mr+0x128/0x3b0 [mlx5_ib]
   ib_dereg_mr_user+0x45/0xb0 [ib_core]
   ? xas_load+0x8/0x80
   destroy_hw_idr_uobject+0x1a/0x50 [ib_uverbs]
   uverbs_destroy_uobject+0x2f/0x150 [ib_uverbs]
   uobj_destroy+0x3c/0x70 [ib_uverbs]
   ib_uverbs_cmd_verbs+0x467/0xb00 [ib_uverbs]
   ? uverbs_finalize_object+0x60/0x60 [ib_uverbs]
   ? ttwu_queue_wakelist+0xa9/0xe0
   ? pty_write+0x85/0x90
   ? file_tty_write.isra.33+0x214/0x330
   ? process_echoes+0x60/0x60
   ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
   __x64_sys_ioctl+0x10d/0x8e0
   ? vfs_write+0x17f/0x260
   do_syscall_64+0x3c/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Add the missing xarray initialization and remove the desc_size set.

Fixes: a639e66 ("RDMA/mlx5: Zero out ODP related items in the mlx5_ib_mr")
Link: https://lore.kernel.org/r/a4846a11c9de834663e521770da895007f9f0d30.1634642730.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <[email protected]>
Reviewed-by: Maor Gottlieb <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>