VMware vSphere/ESXi use #82

Closed
singlecheeze opened this issue Feb 26, 2022 · 57 comments

@singlecheeze

It would be wonderful 🎉 to have 512e block size as an option for rapiddisk devices.

They appear to only be 4k:

sudo blockdev --getbsz /dev/rd0
4096

This is based on the documentation found here, specifically that VMware ESXi requires 512e block size devices.

Here is an example config:

dave@ubuntu-base:~$ sudo rapiddisk -n
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

NVMe Target Exports

        1: NQN: nqn.2021-06.org.rapiddisk:ubuntu-base-rd0       Namespace: 1    Device: /dev/rd0        Enabled: True

Exported NVMe Ports

        1: Port: 1 - 172.16.1.230 (tcp) NQN: nqn.2021-06.org.rapiddisk:ubuntu-base-rd0
dave@ubuntu-base:~$ sudo rapiddisk -l
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

List of RapidDisk device(s):

 RapidDisk Device 1: rd0        Size (KB): 4194304

List of RapidDisk-Cache mapping(s):

 RapidDisk-Cache Target 1: rc-wa_md127  Cache: rd0  Target: md127 (WRITE AROUND)

And the error I am seeing in vSphere (most fields do not require an input as they will default to specific settings, and I've tried to change just about everything and get the same error, "Error during discover controllers"):
[screenshot]

My hunch is that it is successfully connecting to the NVMe/TCP target (details above), but is seeing the 4k versus 512 block and then throws an error. Though I have been unable to find a log that says specifically what the problem is so far.
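
One way to verify what block sizes the exported namespace actually advertises is nvme-cli's id-ns from any Linux initiator that can connect; a minimal sketch (the /dev/nvme1n1 name is just a placeholder for however the namespace shows up on the initiator):

sudo nvme id-ns /dev/nvme1n1 -H | grep "LBA Format"
# each supported LBA format is printed with its data size, e.g.
# "LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes ... (in use)"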

@singlecheeze singlecheeze changed the title Need 512e block size for Mware vSphere/ESXi use Need 512e block size for VMware vSphere/ESXi use Feb 26, 2022
@singlecheeze singlecheeze changed the title Need 512e block size for VMware vSphere/ESXi use 512e block size for VMware vSphere/ESXi use Feb 26, 2022

@singlecheeze
Author

ESXi supports only local 4Kn SAS and SATA HDDs.

The key word above is "local", additionally: "ESXi detects and registers the [local] 4Kn devices and automatically emulates them as 512e."

@pkoutoupis pkoutoupis self-assigned this Feb 26, 2022
@pkoutoupis
Owner

I have a similar ticket but in the opposite direction and for RapidDisk-Cache: #59.

@singlecheeze
Author

@pkoutoupis you've done a REALLY fantastic job with this project! The more I learn and read the wiki I see massive potential to check off the big three items for: A "white box" NVMe over RDMA/TCP SAN, with caching, and a REST API .

@singlecheeze
Author

I'm curious @pkoutoupis and happy to help in any way I can getting 512e block size working and whatever else is required to allow use with VMware ESXi.

@pkoutoupis
Owner

pkoutoupis commented Feb 27, 2022

Hmmm. I am actually kind of surprised that blockdev is returning 4K when only the physical block size is 4K and the logical is 512 bytes:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,15,0)
	blk_queue_logical_block_size(disk->queue, BYTES_PER_SECTOR);
	blk_queue_physical_block_size(disk->queue, PAGE_SIZE);
#else
	blk_queue_logical_block_size(rdsk->rdsk_queue, BYTES_PER_SECTOR);
	blk_queue_physical_block_size(rdsk->rdsk_queue, PAGE_SIZE);
#endif

BYTES_PER_SECTOR is 512 bytes. I didn't think I made it 4K and, as you can see from the code above, the logical size is not.

petros@dev-machine:~/rapiddisk/src$ cat /sys/block/rd0/queue/logical_block_size 
512
petros@dev-machine:~/rapiddisk/src$ cat /sys/block/rd0/queue/physical_block_size 
4096
petros@dev-machine:~/rapiddisk/src$ sudo blockdev --getbsz /dev/rd0
4096

Strangely, even when I change the physical block size to match that of the logical...

petros@dev-machine:~/rapiddisk/src$ sudo blockdev --getbsz /dev/rd0
4096
petros@dev-machine:~/rapiddisk/src$ cat /sys/block/rd0/queue/physical_block_size 
512
petros@dev-machine:~/rapiddisk/src$ cat /sys/block/rd0/queue/logical_block_size 
512

Hmmmm. Not sure what is going on here.

Even when I change the minimum I/O size from 4K to 512....

petros@dev-machine:~/rapiddisk/src$ cat /sys/block/rd0/queue/minimum_io_size 
512
petros@dev-machine:~/rapiddisk/src$ sudo blockdev --getbsz /dev/rd0
4096

EDIT: What I would recommend is to set up another Linux node to see if it detects/imports the NQN. Let us make sure that it is being exported correctly in the first place.

@pkoutoupis
Owner

What I find interesting is....

$ sudo blockdev --report /dev/rd0
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0       134217728   /dev/rd0

It reports the logical sector size (SSZ) as 512, but again, the "block size" (BSZ) is 4K. This util is supposed to send ioctls to the block device:

	},{
		IOCTL_ENTRY(BLKBSZGET),
		.name = "--getbsz",
		.argtype = ARG_INT,
		.argval = -1,
		.help = N_("get blocksize")
	},{

^ from the blockdev source code. What I find strange is that I default everything to 512 bytes:

        case BLKPBSZGET:
        case BLKBSZGET:
        case BLKSSZGET:
                size = BYTES_PER_SECTOR;
                return copy_to_user((void __user *)arg, &size,
                        sizeof(size)) ? -EFAULT : 0;
        }

AND when I call blockdev with --getbsz (or anything else), my module does not register the ioctl.

Oh, I just stumbled on this interesting piece of info which can explain this: https://bugzilla.redhat.com/show_bug.cgi?id=1684078

So, I do not think that our problem is 512e related. Again, I would see if we can import it in another Linux node. And I would compare the results from the rapiddisk utility to my article published a few years ago: https://www.linuxjournal.com/content/data-flash-part-iii-nvme-over-fabrics-using-tcp
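
For what it's worth, the three blockdev queries report different things, which is probably why --getbsz looks misleading here; a quick check (flags per util-linux blockdev, /dev/rd0 assumed):

# logical sector size (what an initiator will see), physical block size,
# and the kernel's soft block size (BLKBSZGET, tied to the page cache per the RH bug above)
sudo blockdev --getss --getpbsz --getbsz /dev/rd0
# on this ram drive that should print 512, 4096, 4096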

@pkoutoupis
Owner

Yeah, I just hacked up my code to see if the ram drive sees the ioctls and it doesn't. The kernel is returning those values. Strange.

@singlecheeze
Author

@pkoutoupis good inclination on testing from outside ESXi:

$ sudo nvme discover -t tcp -a 172.16.1.230 -s 4420 -w 172.16.1.241
Failed to write to /dev/nvme-fabrics: Connection refused

@singlecheeze
Author

singlecheeze commented Feb 27, 2022

@pkoutoupis nevermind, I forgot I had rebooted the target host and that wiped the rapiddisk rd0 cache and the mapping to the md array and the nvme over TCP target! BTW how do I persist these past a reboot?

$ sudo nvme discover -t tcp -a 172.16.1.230 -s 4420 -w 172.16.1.241

Discovery Log Number of Records 1, Generation counter 2
=====Discovery Log Entry 0======
trtype:  tcp
adrfam:  ipv4
subtype: nvme subsystem
treq:    not specified, sq flow control disable supported
portid:  1
trsvcid: 4420
subnqn:  nqn.2021-06.org.rapiddisk:ubuntu-base-rd0
traddr:  172.16.1.230
sectype: none
dave@ubuntu-base:~$ sudo nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     VMware NVME_0000     VMware Virtual NVMe Disk                 1          17.18  GB /  17.18  GB    512   B +  0 B   1.3

@singlecheeze
Author

singlecheeze commented Feb 27, 2022

For example, I have to do this on every boot on the target VM:

dave@ubuntu-base:~$ !80 && !81 && !84 && !86
sudo rapiddisk -i ens160 -P 1 -t tcp && sudo rapiddisk -a 4096 && sudo rapiddisk -m rd0 -b /dev/md127 -p wa && sudo rapiddisk -e -b rd0 -P 1
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Successfully created port 1 set to tcp for interface ens160 (with IP address 172.16.1.230).
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Attached device rd0 of size 4096 Mbytes
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Command to map rc-wa_md127 with rd0 and /dev/md127 has been sent.
Verify with "-l"

rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Block device rd0 has been mapped to all hosts through port 1 as nqn.2021-06.org.rapiddisk:ubuntu-base-rd0
dave@ubuntu-base:~$ sudo rapiddisk -n
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

NVMe Target Exports

	1: NQN: nqn.2021-06.org.rapiddisk:ubuntu-base-rd0 	Namespace: 1	Device: /dev/rd0 	Enabled: True

Exported NVMe Ports

	1: Port: 1 - 172.16.1.230 (tcp)	NQN: nqn.2021-06.org.rapiddisk:ubuntu-base-rd0

Soooo, Ubuntu 20.04 VM Target and Ubuntu 20.04 VM Initiator seem to work great, but ESXi is still not seeing any target as the initiator... must find a log to tell me what's going on...

@singlecheeze
Author

singlecheeze commented Feb 27, 2022

@pkoutoupis please see: https://communities.vmware.com/t5/ESXi-Discussions/NVMEof-Datastore-Issues/td-p/2301440

"Issue was solved by Storage vendor. Released a new firmware, downgrade 4k to 512 volume block size supported by VMware."

And I found the log:

2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)World: 12075: VC opID kze7o7uv-125036-auto-2oh9-h5:70059652-e9-1b-4e02 maps to vmkernel opID 65988a76
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMEDEV:1393 Ctlr 275 allocated, maximum queues 16
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:159 Controller 275, connecting using parameters:
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:161 kato:      0
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:162 subtype:   1
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:163 cdc:       0
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:166 target type:    NVMe
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:174 vmkParams.asqsize:   32
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:175 vmkParams.cntlid:    0xffff
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:177 vmkParams.hostid:    615ae605-1131-d5a2-0c71-246e961c89c0
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:178 vmkParams.hostnqn:   nqn.2014-08.localdomain:nvme:r730esxi3
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:179 vmkParams.subnqn:    nqn.2014-08.org.nvmexpress.discovery
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:196 vmkParams.trType:    TCP
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:198 vmkParams.trsvcid:   8009
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:200 vmkParams.traddr:    172.16.1.230
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFDEV:202 vmkParams.tsas.digest:    3
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)nvmetcp:nt_ConnectController:781 vmhba65, controller 275
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)nvmetcp:nt_ConnectCM:4357 [ctlr 275, queue 0]
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMFNET:151 Uplink: vmnic4, portset: vSwitch0.
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)nvmetcp:nt_ConnectCM:4404 [ctlr 275, queue 0] Failed to get source vmknic for NVMe/TCP traffic on vmnic4: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)nvmetcp:nt_ConnectQueueInt:4078 [ctlr 275, queue 0] failed to connect: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)nvmetcp:nt_FreeSubmissionResources:5132 [ctlr 275, queue 0]
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)nvmetcp:nt_ConnectController:860 Failed to connect admin queue: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)WARNING: NVMFDEV:880 Failed to transport connect controller 275, status: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMEDEV:1565 Ctlr 275 freeing
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)NVMEDEV:9060 Cancel requests of controller 275, 0 left.
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)WARNING: NVMFDEV:1423 Failed to connect to controller, status: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)WARNING: NVMFEVT:1764 Failed to discover controllers, status: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)WARNING: NVMFEVT:1447 Discover and connect controller failed: Not found
2022-02-27T19:52:41.338Z cpu46:2100801 opID=65988a76)WARNING: NVMFVSI:1300 Failed to discover controllers: Not found

Not exactly sure what "Failed to get source vmknic for NVMe/TCP traffic on vmnic4: Not found" means... also no idea whether vmkParams.subnqn and the other params are right... thoughts?

@pkoutoupis
Owner

pkoutoupis commented Feb 27, 2022

@singlecheeze

BTW how do I persist these past a reboot?

Yes, for now, it is a manual process on each reboot. I have not added an autoload feature. In fact, I am currently working on a "RapidDisk OS" that will incorporate that autoload functionality. For now, using the current RapidDisk suite, users are limited to either manually running it or by using an rc.local (or other) script to have the same routines run during boot time.
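
For example, a minimal /etc/rc.local sketch that just replays the same commands from this thread at boot (interface, size, and device names taken from your output above; it assumes md127 is already assembled by the time it runs, and see the note below about which device to export):

#!/bin/sh
# /etc/rc.local - recreate the RapidDisk port, ram drive, cache mapping, and NVMe/TCP export
rapiddisk -i ens160 -P 1 -t tcp
rapiddisk -a 4096
rapiddisk -m rd0 -b /dev/md127 -p wa
rapiddisk -e -b rd0 -P 1
exit 0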

I did notice something above which may be an error (I could be wrong)...
You are exporting rd0, but you then use rd0 as the cache mapped to md127.

sudo rapiddisk -m rd0 -b /dev/md127 -p wa && sudo rapiddisk -e -b rd0 -P 1

You should instead be exporting /dev/mapper/rc-wa_md127, which you pass as -b rc-wa_md127. That way, the initiator host will see the entire volume and not just the cache portion of the drive. You will likely get a return of 512 bytes with blockdev --getbsz when accessing the entire mapping /dev/mapper/rc-wa_md127.

Remember, RapidDisk volumes can be used as independent RAM drives (i.e. /dev/rd0, /dev/rd1, et al) or as a mapping of two volumes: the ram drive as the cache and the backing store (or original volume, in your case /dev/md127). To access the mapping, you need to do so from a virtual block device such as /dev/mapper/rc-wa_md127, which is why that is the node you will need to export via NVMe.
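
Concretely, the map/export pair would look something like this (same flags as above, just pointing the export at the mapping; a sketch, not tested here):

sudo rapiddisk -m rd0 -b /dev/md127 -p wa
sudo rapiddisk -e -b rc-wa_md127 -P 1
sudo blockdev --getss /dev/mapper/rc-wa_md127    # logical sector size of the exported mapping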

subnqn: nqn.2021-06.org.rapiddisk:ubuntu-base-rd0

But it is reassuring that your second VM is seeing the NQN.

Something like this will connect the drive (you may need to modify the params) to the local VM and it will then appear in the nvme list output:

$ sudo nvme connect -t tcp -n nqn.2021-06.org.rapiddisk:ubuntu-base-rd0 -a 172.16.1.230 -s 4420

Anyway, I am looking at the VMware log output. I am confused by the subnqn. Why are we seeing nqn.2021-06.org.rapiddisk:ubuntu-base-rd0? But before we continue to this step, let us make sure we export the correct volume (i.e. the cache mapping /dev/mapper/rc-wa_md127).

@singlecheeze
Author

I'd love to see an OS like ESOS for NVMe!!!!!!!!!!

@singlecheeze
Author

singlecheeze commented Feb 28, 2022

@pkoutoupis More hacking on this tonight:

dave@ubuntu-base:~$ !80
sudo rapiddisk -i ens160 -P 1 -t tcp
[sudo] password for dave: 
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Successfully created port 1 set to tcp for interface ens160 (with IP address 172.16.1.230).
dave@ubuntu-base:~$ !81
sudo rapiddisk -a 4096
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Attached device rd0 of size 4096 Mbytes
dave@ubuntu-base:~$ !84
sudo rapiddisk -m rd0 -b /dev/md127 -p wa
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Command to map rc-wa_md127 with rd0 and /dev/md127 has been sent.
Verify with "-l"

dave@ubuntu-base:~$ sudo rapiddisk -l
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

List of RapidDisk device(s):

 RapidDisk Device 1: rd0	Size (KB): 4194304

List of RapidDisk-Cache mapping(s):

 RapidDisk-Cache Target 1: rc-wa_md127	Cache: rd0  Target: md127 (WRITE AROUND)

dave@ubuntu-base:~$ sudo rapiddisk -e -b rc-wa_md127 -P 1
rapiddisk 8.1.0
Copyright 2011 - 2022 Petros Koutoupis

Block device rc-wa_md127 has been mapped to all hosts through port 1 as nqn.2021-06.org.rapiddisk:ubuntu-base-rc-wa_md127
dave@ubuntu-base:~$ sudo blockdev --getbsz /dev/mapper/rc-wa_md127
4096

ESXi Logs (Auto Discovery):

2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)World: 12075: VC opID kze7o7uv-125819-auto-2p30-h5:70059861-26-4b-8dc4 maps to vmkernel opID b7c49169
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMEDEV:1393 Ctlr 276 allocated, maximum queues 16
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:159 Controller 276, connecting using parameters:
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:161 kato:      0
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:162 subtype:   1
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:163 cdc:       0
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:166 target type:    NVMe
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:174 vmkParams.asqsize:   32
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:175 vmkParams.cntlid:    0xffff
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:177 vmkParams.hostid:    615ae605-1131-d5a2-0c71-246e961c89c0
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:178 vmkParams.hostnqn:   nqn.2014-08.localdomain:nvme:r730esxi3
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:179 vmkParams.subnqn:    nqn.2014-08.org.nvmexpress.discovery
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:196 vmkParams.trType:    TCP
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:198 vmkParams.trsvcid:   8009
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:200 vmkParams.traddr:    172.16.1.230
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFDEV:202 vmkParams.tsas.digest:    0
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)nvmetcp:nt_ConnectController:781 vmhba65, controller 276
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)nvmetcp:nt_ConnectCM:4357 [ctlr 276, queue 0]
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMFNET:151 Uplink: vmnic4, portset: vSwitch0.
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)nvmetcp:nt_ConnectCM:4404 [ctlr 276, queue 0] Failed to get source vmknic for NVMe/TCP traffic on vmnic4: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)nvmetcp:nt_ConnectQueueInt:4078 [ctlr 276, queue 0] failed to connect: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)nvmetcp:nt_FreeSubmissionResources:5132 [ctlr 276, queue 0]
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)nvmetcp:nt_ConnectController:860 Failed to connect admin queue: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)WARNING: NVMFDEV:880 Failed to transport connect controller 276, status: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMEDEV:1565 Ctlr 276 freeing
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)NVMEDEV:9060 Cancel requests of controller 276, 0 left.
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)WARNING: NVMFDEV:1423 Failed to connect to controller, status: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)WARNING: NVMFEVT:1764 Failed to discover controllers, status: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)WARNING: NVMFEVT:1447 Discover and connect controller failed: Not found
2022-02-28T02:20:55.969Z cpu40:2100804 opID=b7c49169)WARNING: NVMFVSI:1300 Failed to discover controllers: Not found

And manual input (don't be fooled by vmkParams.subnqn: nqn.2021-06.org.rapiddisk:ubuntu-base-rc-wa_md127; I manually input that):

2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)World: 12075: VC opID kze7o7uv-125846-auto-2p3r-h5:70059878-3a-81-8e49 maps to vmkernel opID b959e5e7
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMEDEV:1393 Ctlr 277 allocated, maximum queues 16
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:159 Controller 277, connecting using parameters:
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:161 kato:      30000
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:162 subtype:   2
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:163 cdc:       0
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:166 target type:    NVMe
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:174 vmkParams.asqsize:   32
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:175 vmkParams.cntlid:    0xffff
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:177 vmkParams.hostid:    615ae605-1131-d5a2-0c71-246e961c89c0
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:178 vmkParams.hostnqn:   nqn.2014-08.localdomain:nvme:r730esxi3
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:179 vmkParams.subnqn:    nqn.2021-06.org.rapiddisk:ubuntu-base-rc-wa_md127
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:196 vmkParams.trType:    TCP
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:198 vmkParams.trsvcid:   1
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:200 vmkParams.traddr:    172.16.1.230
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFDEV:202 vmkParams.tsas.digest:    0
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)nvmetcp:nt_ConnectController:781 vmhba65, controller 277
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)nvmetcp:nt_ConnectCM:4357 [ctlr 277, queue 0]
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMFNET:151 Uplink: vmnic4, portset: vSwitch0.
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)nvmetcp:nt_ConnectCM:4404 [ctlr 277, queue 0] Failed to get source vmknic for NVMe/TCP traffic on vmnic4: Not found
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)nvmetcp:nt_ConnectQueueInt:4078 [ctlr 277, queue 0] failed to connect: Not found
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)nvmetcp:nt_FreeSubmissionResources:5132 [ctlr 277, queue 0]
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)nvmetcp:nt_ConnectController:860 Failed to connect admin queue: Not found
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)WARNING: NVMFDEV:880 Failed to transport connect controller 277, status: Not found
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMEDEV:1565 Ctlr 277 freeing
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)NVMEDEV:9060 Cancel requests of controller 277, 0 left.
2022-02-28T02:23:28.866Z cpu14:2100806 opID=b959e5e7)WARNING: NVMFVSI:1498 Failed to connect to controller, status: Not found

I'm puzzled by Failed to get source vmknic for NVMe/TCP traffic on vmnic4: Not found -- I'd really be embarrassed if there is something I'm missing or forgot to do in ESXi.

What's strange is how instantly it errors out. I don't really have any experience with NVMe/TCP, but I surmise that it would potentially use the Keep Alive Timeout (KATO) of NVMFDEV:161 kato: 30000 as the connection timeout too?

@singlecheeze
Author

singlecheeze commented Feb 28, 2022

Hmmmm. Not sure what is going on here.

Even when I change the minimum I/O size from 4K to 512....

petros@dev-machine:~/rapiddisk/src$ cat /sys/block/rd0/queue/minimum_io_size 
512
petros@dev-machine:~/rapiddisk/src$ sudo blockdev --getbsz /dev/rd0
4096

See here: I wonder if we are seeing through 512e and seeing the 4Kn... it's hard to tell what is 512 native (512n), 4K emulated to 512e, or 4K native (4Kn)...

https://kb.vmware.com/s/article/2091600#:~:text=in%20that%20case.-,This%20table,-compares%20native%20512

@singlecheeze
Author

I worked on this a bit more tonight.

On previous trials, I was presenting virtual NVMe hard disks (in md raid 1) over RapidDisk NVMe/TCP back to the same ESXi that was running the VM I was presenting storage from.

Tonight I thought to try a NON-NVMe virtual disk from the VM and use a different disk controller (LSI Logic Parallel instead of NVMe), thinking that maybe the virtual NVMe controller or virtual drives were influencing the 4096 block size.

I knew this was a stretch though because I still have two virtual drives in an md raid 1 array and that is what I am fronting with RapidDisk.

No difference in ESXi, same discovery error. 😞


@singlecheeze
Author

I found this in regard to iSCSI with SPDK (an Intel-backed project for NVMe targets, I believe):

spdk/spdk#1414

@singlecheeze
Author

@pkoutoupis any github for your "RapidDisk OS"? I'd love to contribute anything I can.

There are virtually no projects out there that I've been able to find that combine the caching and NVMe over RDMA/TCP that you've accomplished with RapidDisk!

@pkoutoupis
Owner

@singlecheeze I appreciate your excitement. I am not ready to share that project publicly yet. I am hoping to do so within the next 6 months. There is a bit of polishing that needs to happen first. Trust me when I say, you will be one of the first I reach out to when I do. And when that does happen, I hope to have this issue with vSphere sorted out as it needs to be a non-issue when the OS does become more available.

@singlecheeze
Author

@pkoutoupis 🎉🎉🎉

@singlecheeze
Author

Made some progress here:
[screenshot]

Seems the issue was the need to manually configure a vmk for "NVMe over TCP"
[screenshot]

The error in the logs Failed to get source vmknic for NVMe/TCP traffic on vmnic4: Not found could have been a bit more descriptive...
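
For anyone else who hits this, the CLI equivalent appears to be enabling the software NVMe/TCP adapter on the uplink and then tagging a vmkernel port for NVMe/TCP traffic; roughly the following (commands per VMware's NVMe over TCP docs, vmk1 is just a placeholder for your storage vmk, so verify against your build):

esxcli nvme fabrics enable --protocol TCP --device vmnic4
esxcli network ip interface tag add -i vmk1 -t NVMeTCP
esxcli network ip interface tag get -i vmk1    # confirm the NVMeTCP tag is present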

@singlecheeze
Author

singlecheeze commented Apr 20, 2022

[root@r730esxi3:~] cat /var/log/vmkernel.log
2022-04-20T01:57:23.731Z cpu18:2097508)HPP: HppCreateDevice:3071: Created logical device 'uuid.664f7b17eb0b4b2d93ede1355f3e81ed'.
2022-04-20T01:57:23.731Z cpu18:2097508)WARNING: HPP: HppClaimPath:3956: Failed to claim path 'vmhba65:C0:T0:L0': Not supported
2022-04-20T01:57:23.731Z cpu18:2097508)HPP: HppUnclaimPath:4002: Unclaiming path vmhba65:C0:T0:L0
2022-04-20T01:57:23.731Z cpu18:2097508)ScsiPath: 8597: Plugin 'HPP' rejected path 'vmhba65:C0:T0:L0'
2022-04-20T01:57:23.731Z cpu18:2097508)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-20T01:57:23.731Z cpu18:2097508)WARNING: ScsiPath: 8496: NMP cannot claim a path to NVMeOF device vmhba65:C0:T0:L0
2022-04-20T01:57:23.731Z cpu18:2097508)ScsiClaimrule: 2039: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-20T01:57:23.731Z cpu18:2097508)ScsiClaimrule: 2518: Error claiming path vmhba65:C0:T0:L0. Not supported.

From a lead here: https://communities.vmware.com/t5/ESXi-Discussions/NVMEof-Datastore-Issues/td-p/2301440

@singlecheeze
Author

Rapiddisk nqn comes across as:

[root@r730esxi3:~] esxcli nvme controller list
Name                                                                    Controller Number  Adapter  Transport Type  Is Online
----------------------------------------------------------------------  -----------------  -------  --------------  ---------
nqn.2021-06.org.rapiddisk:fedora-rc-wt_md127#vmhba65#172.16.1.223:4420                268  vmhba65  TCP                  true

@singlecheeze
Author

[root@r730esxi3:~] esxcli nvme namespace list
Name                                   Controller Number  Namespace ID  Block Size  Capacity in MB
-------------------------------------  -----------------  ------------  ----------  --------------
uuid.664f7b17eb0b4b2d93ede1355f3e81ed                268             1         512           32749

@singlecheeze
Author

singlecheeze commented Apr 20, 2022

Rebooted ESXi Host and seems the claim rule still results in a dead path:
[screenshots]

2022-04-20T03:38:03.822Z cpu9:2097508)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 102 was not able to claim path vmhba65:C0:T0:L0: Not supported

We may want to check vmknvme_hostnqn_format too:

[root@r730esxi3:~] esxcli system module parameters list -m vmknvme
Name                                   Type  Value  Description
-------------------------------------  ----  -----  -----------
vmknvme_adapter_num_cmpl_queues        uint         Number of PSA completion queues for NVMe-oF adapter, min: 1, max: 16, default: 4
vmknvme_compl_world_type               uint         completion world type, PSA: 0, VMKNVME: 1
vmknvme_ctlr_recover_initial_attempts  uint         Number of initial controller recover attempts, MIN: 2, MAX: 30
vmknvme_ctlr_recover_method            uint         controller recover method after initial recover attempts, RETRY: 0, DELETE: 1
vmknvme_cw_rate                        uint         Number of completion worlds per IO queue (NVMe/PCIe only). Number is a power of 2. Applies when number of queues less than 4.
vmknvme_enable_noiob                   uint         If enabled, driver will split the commands based on NOIOB.
vmknvme_hostnqn_format                 uint         HostNQN format, UUID: 0, HostName: 1
vmknvme_io_queue_num                   uint         vmknvme IO queue number for NVMe/PCIe adapter: pow of 2 in [1, 16]
vmknvme_io_queue_size                  uint         IO queue size: [8, 1024]
vmknvme_iosplit_workaround             uint         If enabled, qdepth in PSA layer is half size of vmknvme settings.
vmknvme_log_level                      uint         log level: [0, 20]
vmknvme_max_prp_entries_num            uint         User defined maximum number of PRP entries per controller:default value is 0
vmknvme_total_io_queue_size            uint         Aggregated IO queue size of a controller, MIN: 64, MAX: 4096

Ref: https://www.dell.com/support/manuals/en-us/powerstore-9000t/pwrstr-hcg/setting-the-esxi-host-nvme-qualified-name?guid=guid-43b40269-77bd-421d-b7fd-968bd7acb2cc&lang=en-us

Unset, it looks to default to the hostname:

[root@r730esxi3:~] esxcli nvme info get
   Host NQN: nqn.2014-08.localdomain:nvme:r730esxi3

Versus example in Dell link above after changing parameter to UUID:

  • Connect to the host as root.
  • Run the following esxcli command for the Host NQN to be automatically generated according to spec during the ESXi host reboot:
    esxcli system module parameters set -m vmknvme -p vmknvme_hostnqn_format=0
  • Reboot the host.
  • Run the esxcli nvme info get command to confirm that the Host NQN was generated correctly.
# esxcli nvme info get
Host NQN:
nqn.2014-08.org.nvmexpress:uuid:5ffefdd7-0a51-3700-0b16-001e6792303a

@pkoutoupis
Owner

All wonderful investigative work. I am caught up but am unclear on what we are suspecting. Are we doubting the NQN format from the rapiddisk application?

@singlecheeze
Author

singlecheeze commented Apr 20, 2022

@pkoutoupis too hard to say at the moment... there is very little VMware documentation online, so I'm having to try a bunch of stuff. Going to work on it a bit more today.

@singlecheeze
Author

2022-04-20T14:26:29.644Z cpu0:2097510)HPP: HppCreateDevice:3071: Created logical device 'uuid.664f7b17eb0b4b2d93ede1355f3e81ed'.
2022-04-20T14:26:29.644Z cpu0:2097510)WARNING: HPP: HppClaimPath:3956: Failed to claim path 'vmhba65:C0:T0:L0': Not supported
2022-04-20T14:26:29.644Z cpu0:2097510)HPP: HppUnclaimPath:4002: Unclaiming path vmhba65:C0:T0:L0
2022-04-20T14:26:29.644Z cpu0:2097510)ScsiPath: 8597: Plugin 'HPP' rejected path 'vmhba65:C0:T0:L0'
2022-04-20T14:26:29.644Z cpu0:2097510)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-20T14:26:29.644Z cpu0:2097510)WARNING: ScsiPath: 8496: NMP cannot claim a path to NVMeOF device vmhba65:C0:T0:L0
2022-04-20T14:26:29.644Z cpu0:2097510)ScsiClaimrule: 2039: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-20T14:26:29.644Z cpu0:2097510)ScsiClaimrule: 2518: Error claiming path vmhba65:C0:T0:L0. Not supported.

@singlecheeze
Author

dave@fedora ~]$ sudo nvmetcli
/> ls
o- / ......................................................................................................................... [...]
  o- hosts ................................................................................................................... [...]
  o- ports ................................................................................................................... [...]
  | o- 4420 ................................................ [trtype=tcp, traddr=172.16.1.223, trsvcid=4420, inline_data_size=16384]
  |   o- ana_groups .......................................................................................................... [...]
  |   | o- 1 ..................................................................................................... [state=optimized]
  |   o- referrals ........................................................................................................... [...]
  |   o- subsystems .......................................................................................................... [...]
  |     o- nqn.2021-06.org.rapiddisk:fedora-rc-wt_md127 ...................................................................... [...]
  o- subsystems .............................................................................................................. [...]
    o- nqn.2021-06.org.rapiddisk:fedora-rc-wt_md127 ........................ [version=1.3, allow_any=1, serial=9a9a8e03f7740e3e812e]
      o- allowed_hosts ....................................................................................................... [...]
      o- namespaces .......................................................................................................... [...]
        o- 1 ........................... [path=/dev/mapper/rc-wt_md127, uuid=c2320ec9-f552-4a59-b25c-24cfe0dc8c56, grpid=1, enabled]

@pkoutoupis
Owner

Let me know if you need a patch for the NQN. Or anything else. Again, I appreciate you putting all this work into this. You have been extremely helpful.

@singlecheeze
Author

@pkoutoupis thank you very much. I'm using this issue to document some troubleshooting as there is virtually nothing online as per my google fu... hope you don't mind :)

@singlecheeze
Author

singlecheeze commented Apr 20, 2022

Tried this claim rule too, which didn't work:

[root@r730esxi3:~] esxcli storage core claimrule add --rule 102 -t vendor -P HPP -V NVMe -M "rapiddisk*" --config-string "pss=LB-Latency,latency-eval-time=180000"

[root@r730esxi3:~] esxcli storage core claimrule list
Rule Class   Rule  Class    Type       Plugin     Matches                                                                XCOPY Use Array Reported Values  XCOPY Use Multiple Segments  XCOPY Max Transfer Size KiB  Config String
----------  -----  -------  ---------  ---------  ---------------------------------------------------------------------  -------------------------------  ---------------------------  ---------------------------  -------------
MP             50  runtime  transport  NMP        transport=usb                                                                                    false                        false                            0
MP             51  runtime  transport  HPP        transport=sata                                                                                   false                        false                            0  pss=FIXED
MP             52  runtime  transport  NMP        transport=ide                                                                                    false                        false                            0
MP             53  runtime  transport  NMP        transport=block                                                                                  false                        false                            0
MP             54  runtime  transport  NMP        transport=unknown                                                                                false                        false                            0
MP            101  runtime  vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            101  file     vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            102  runtime  vendor     HPP        vendor=NVMe model=rapiddisk* pci_vendor_id=* pci_sub_vendor_id=*                                 false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP            102  file     vendor     HPP        vendor=NVMe model=rapiddisk* pci_vendor_id=* pci_sub_vendor_id=*                                 false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP          65531  runtime  transport  HPP        transport=sas                                                                                    false                        false                            0  pss=FIXED
MP          65532  runtime  transport  HPP        transport=parallel                                                                               false                        false                            0  pss=FIXED
MP          65533  runtime  transport  HPP        transport=pcie                                                                                   false                        false                            0  pss=FIXED
MP          65534  runtime  vendor     HPP        nvme_controller_model=* pci_vendor_id=* pci_sub_vendor_id=*                                      false                        false                            0
MP          65535  runtime  vendor     NMP        vendor=* model=* pci_vendor_id=* pci_sub_vendor_id=*                                             false                        false                            0

EUI (extended unique identifier):
A 64‐bit number that uniquely identifies every device in the world. The format consists of 24 bits that are unique to a given company, and 40 bits assigned by the company to each device it builds. (https://documentation.suse.com/sles/15-SP1/single-html/SLES-storage/index.html#sec-iscsi-terminology)

Looks like most storage vendors use SPDK and expose the EUI, which I'm wondering if VMware has to have versus a UUID (I have not seen how to set an EUI with the in-kernel nvme target, or if it is even available):

[root@angryesx5:~]  esxcli nvme namespace list
Name                                                                              Controller Number  Namespace ID  Block Size  Capacity in MB
--------------------------------------------------------------------------------  -----------------  ------------  ----------  --------------
t10.NVMeDell_Express_Flash_NVMe_SM1715_1.6TB_AIC__S1W6NYAG50009900000001                256             1         512         1526185
eui.d4a79e230000000064b94e1929554c0f                                                            270             1         512          106496
eui.d4a79e230000000064b94e1929554c0f                                                            271             1         512          106496
eui.d4a79e230000000064b94e1929554c0f                                                            272             1         512          106496
eui.d4a79e230000000064b94e1929554c0f                                                            273             1         512          106496

[root@angryesx5:~]  esxcli nvme controller list
Name                                                                                          Controller Number  Adapter  Transport Type  Is Online
--------------------------------------------------------------------------------------------  -----------------  -------  --------------  ---------
nqn.2014-08.org.nvmexpress_144d_Dell_Express_Flash_NVMe_SM1715_1.6TB_AIC__S1W6NYAG500099                256  vmhba1   PCIe                 true
nqn.1988-11.com.dell:powerflex:00:50300f1929554c0f#vmhba65#10.0.1.61:4420                                   270  vmhba65  TCP                  true
nqn.1988-11.com.dell:powerflex:00:50300f1929554c0f#vmhba65#10.0.1.62:4420                                   271  vmhba65  TCP                  true
nqn.1988-11.com.dell:powerflex:00:50300f1929554c0f#vmhba65#10.0.1.63:4420                                   272  vmhba65  TCP                  true
nqn.1988-11.com.dell:powerflex:00:50300f1929554c0f#vmhba65#10.0.1.64:4420                                   273  vmhba65  TCP                  true

[root@angryesx5:~] esxcli storage core device list
eui.d4a79e230000000064b94e1929554c0f
   Display Name: NVMe TCP Disk (eui.d4a79e230000000064b94e1929554c0f)
   Has Settable Display Name: false
   Size: 106496
   Device Type: Direct-Access
   Multipath Plugin: HPP
   Devfs Path: /vmfs/devices/disks/eui.d4a79e230000000064b94e1929554c0f
   Vendor: NVMe
   Model: powerflex
   Revision: 4.0.
   SCSI Level: 7
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: false
   Is Removable: false
   Is SSD: true
   Is VVOL PE: false
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: supported
   Other UIDs: vml.06d4a79e230000000064b94e1929554c0f
   Is Shared Clusterwide: true
   Is SAS: false
   Is USB: false
   Is Boot Device: false
   Device Max Queue Depth: 2032
   No of outstanding IOs with competing worlds: 32
   Drive Type: unknown
   RAID Level: unknown
   Number of Physical Drives: unknown
   Protection Enabled: false
   PI Activated: false
   PI Type: 0
   PI Protection Mask: NO PROTECTION
   Supported Guard Types: NO GUARD SUPPORT
   DIX Enabled: false
   DIX Guard Type: NO GUARD SUPPORT
   Emulated DIX/DIF Enabled: false
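
On the Linux in-kernel target side, the identifiers a namespace advertises live under nvmet's configfs tree; as far as I can tell a UUID/NGUID can be set per namespace, but I have not found an EUI-64 attribute there. A sketch assuming the standard nvmet configfs layout and the subsystem/namespace from my nvmetcli output above:

NS=/sys/kernel/config/nvmet/subsystems/nqn.2021-06.org.rapiddisk:fedora-rc-wt_md127/namespaces/1
cat $NS/device_uuid
cat $NS/device_nguid
# these appear to be writable only while the namespace is disabled (echo 0 > $NS/enable first)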

@singlecheeze
Author

singlecheeze commented Apr 20, 2022

Oddly, with the above presented NVMe/TCP disk visible, no custom claim rules are needed (these are all default claim rules):

MP             50  runtime  transport  NMP        transport=usb                                                                                    false                        false                            0
MP             51  runtime  transport  HPP        transport=sata                                                                                   false                        false                            0  pss=FIXED
MP             52  runtime  transport  NMP        transport=ide                                                                                    false                        false                            0
MP             53  runtime  transport  NMP        transport=block                                                                                  false                        false                            0
MP             54  runtime  transport  NMP        transport=unknown                                                                                false                        false                            0
MP            101  runtime  vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            101  file     vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP          65531  runtime  transport  HPP        transport=sas                                                                                    false                        false                            0  pss=FIXED
MP          65532  runtime  transport  HPP        transport=parallel                                                                               false                        false                            0  pss=FIXED
MP          65533  runtime  transport  HPP        transport=pcie                                                                                   false                        false                            0  pss=FIXED
MP          65534  runtime  vendor     HPP        nvme_controller_model=* pci_vendor_id=* pci_sub_vendor_id=*                                      false                        false                            0
MP          65535  runtime  vendor     NMP        vendor=* model=* pci_vendor_id=* pci_sub_vendor_id=*                                             false                        false                            0

A helpful blog post on nvme related claim rules:
https://www.codyhosterman.com/2021/03/nvme-of-multipath-configuration-for-pure-storage-datastores/

@singlecheeze
Author

@pkoutoupis if you haven't seen this, please take a look: https://open-cas.github.io/

The team at Intel is doing Open CAS (OCF), which looks potentially very close to rapiddisk and bundles SPDK.

@singlecheeze singlecheeze changed the title 512e block size for VMware vSphere/ESXi use VMware vSphere/ESXi use Apr 20, 2022
@pkoutoupis
Owner

Interesting. I had never heard of this project and you are correct, it is oddly similar. I am not sure how I feel about that. Anyway, I should probably add diagrams into my project README. Maybe that will avoid some of the questions I keep getting asked about the project.

@pkoutoupis
Owner

#82 (comment)
Where did you get the vendor? -V NVMe -M "Linux*"
I am not finding a vendor anywhere in sysfs or with the nvme cli utility. Even the vendor id (vid) and subsystem vendor id (ssvid) are 0x0. I do not even know what 0x0 is. I cannot find it documented anywhere.

$ sudo nvme id-ctrl /dev/nvme1n1
NVME Identify Controller:
vid       : 0
ssvid     : 0
sn        : cb1601dd7e1a7804    
mn        : Linux                                   
fr        : 5.4.0-10
rab       : 6
ieee      : 000000
cmic      : 0xb
[ ... ]

And what happens if you remove the wildcard from after Linux?

@pkoutoupis
Owner

What if you use a wildcard for the vendor and Linux for the model without a wildcard?

@singlecheeze
Author

singlecheeze commented Apr 21, 2022

My Google Fu must be on 🔥 this morning (though from 2 years ago): https://lore.kernel.org/all/[email protected]/T/

Re: NVMET Target with esxi 7 (linux-nvme mailing list, May 2020)

Grant Albitz:
I was trying to configure vmware to connect to nvmet based target. I did not have much luck. VMwares official stance is a very small vendor list. I did see the recent additions such as metadata info. I was curious if anyone developing nvmet was willing/able to test nvmet as a datastore target for vmware to possibly add this functionality. At the moment I am not sure if there are a large number of reasons its not working or a very small check.

Sagi Grimberg:
The linux nvme target does not support fused commands so there is no support for VMware currently.

[root@r730esxi3:~] esxcli storage core claimrule add --rule 102 -t vendor -P HPP -V NVMe -M "rapiddisk*" --config-string "pss=LB-Latency,latency-eval-time=180000"

[root@r730esxi3:~] esxcli storage core claimrule add --rule 103 -t vendor -P HPP -V "*" -M "Linux" --config-string "pss=LB-Latency,latency-eval-time=180000"

[root@r730esxi3:~] esxcli storage core claimrule list
Rule Class   Rule  Class    Type       Plugin     Matches                                                                XCOPY Use Array Reported Values  XCOPY Use Multiple Segments  XCOPY Max Transfer Size KiB  Config String
----------  -----  -------  ---------  ---------  ---------------------------------------------------------------------  -------------------------------  ---------------------------  ---------------------------  -------------
MP             50  runtime  transport  NMP        transport=usb                                                                                    false                        false                            0
MP             51  runtime  transport  HPP        transport=sata                                                                                   false                        false                            0  pss=FIXED
MP             52  runtime  transport  NMP        transport=ide                                                                                    false                        false                            0
MP             53  runtime  transport  NMP        transport=block                                                                                  false                        false                            0
MP             54  runtime  transport  NMP        transport=unknown                                                                                false                        false                            0
MP            101  runtime  vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            101  file     vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            102  file     vendor     HPP        vendor=NVMe model=rapiddisk* pci_vendor_id=* pci_sub_vendor_id=*                                 false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP            103  file     vendor     HPP        vendor=* model=Linux pci_vendor_id=* pci_sub_vendor_id=*                                         false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP          65531  runtime  transport  HPP        transport=sas                                                                                    false                        false                            0  pss=FIXED
MP          65532  runtime  transport  HPP        transport=parallel                                                                               false                        false                            0  pss=FIXED
MP          65533  runtime  transport  HPP        transport=pcie                                                                                   false                        false                            0  pss=FIXED
MP          65534  runtime  vendor     HPP        nvme_controller_model=* pci_vendor_id=* pci_sub_vendor_id=*                                      false                        false                            0
MP          65535  runtime  vendor     NMP        vendor=* model=* pci_vendor_id=* pci_sub_vendor_id=*                                             false                        false                            0

[root@r730esxi3:~] esxcli storage core claimrule load

[root@r730esxi3:~] esxcli storage core claimrule list
Rule Class   Rule  Class    Type       Plugin     Matches                                                                XCOPY Use Array Reported Values  XCOPY Use Multiple Segments  XCOPY Max Transfer Size KiB  Config String
----------  -----  -------  ---------  ---------  ---------------------------------------------------------------------  -------------------------------  ---------------------------  ---------------------------  -------------
MP             50  runtime  transport  NMP        transport=usb                                                                                    false                        false                            0
MP             51  runtime  transport  HPP        transport=sata                                                                                   false                        false                            0  pss=FIXED
MP             52  runtime  transport  NMP        transport=ide                                                                                    false                        false                            0
MP             53  runtime  transport  NMP        transport=block                                                                                  false                        false                            0
MP             54  runtime  transport  NMP        transport=unknown                                                                                false                        false                            0
MP            101  runtime  vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            101  file     vendor     MASK_PATH  vendor=DELL model=Universal Xport pci_vendor_id=* pci_sub_vendor_id=*                            false                        false                            0
MP            102  runtime  vendor     HPP        vendor=NVMe model=rapiddisk* pci_vendor_id=* pci_sub_vendor_id=*                                 false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP            102  file     vendor     HPP        vendor=NVMe model=rapiddisk* pci_vendor_id=* pci_sub_vendor_id=*                                 false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP            103  runtime  vendor     HPP        vendor=* model=Linux pci_vendor_id=* pci_sub_vendor_id=*                                         false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP            103  file     vendor     HPP        vendor=* model=Linux pci_vendor_id=* pci_sub_vendor_id=*                                         false                        false                            0  latency-eval-time=180000,pss=LB-Latency
MP          65531  runtime  transport  HPP        transport=sas                                                                                    false                        false                            0  pss=FIXED
MP          65532  runtime  transport  HPP        transport=parallel                                                                               false                        false                            0  pss=FIXED
MP          65533  runtime  transport  HPP        transport=pcie                                                                                   false                        false                            0  pss=FIXED
MP          65534  runtime  vendor     HPP        nvme_controller_model=* pci_vendor_id=* pci_sub_vendor_id=*                                      false                        false                            0
MP          65535  runtime  vendor     NMP        vendor=* model=* pci_vendor_id=* pci_sub_vendor_id=*                                             false                        false                            0

[screenshots]

Still same result:

2022-04-21T12:47:26.941Z cpu15:2097510)HPP: HppCreateDevice:3071: Created logical device 'uuid.54fc35af7f7842ce835e46eec97ef288'.
2022-04-21T12:47:26.941Z cpu15:2097510)WARNING: HPP: HppClaimPath:3956: Failed to claim path 'vmhba65:C0:T0:L0': Not supported
2022-04-21T12:47:26.941Z cpu15:2097510)HPP: HppUnclaimPath:4002: Unclaiming path vmhba65:C0:T0:L0
2022-04-21T12:47:26.941Z cpu15:2097510)ScsiPath: 8597: Plugin 'HPP' rejected path 'vmhba65:C0:T0:L0'
2022-04-21T12:47:26.941Z cpu15:2097510)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 103 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-21T12:47:26.941Z cpu15:2097510)HPP: HppCreateDevice:3071: Created logical device 'uuid.54fc35af7f7842ce835e46eec97ef288'.
2022-04-21T12:47:26.941Z cpu15:2097510)WARNING: HPP: HppClaimPath:3956: Failed to claim path 'vmhba65:C0:T0:L0': Not supported
2022-04-21T12:47:26.941Z cpu15:2097510)HPP: HppUnclaimPath:4002: Unclaiming path vmhba65:C0:T0:L0
2022-04-21T12:47:26.941Z cpu15:2097510)ScsiPath: 8597: Plugin 'HPP' rejected path 'vmhba65:C0:T0:L0'
2022-04-21T12:47:26.941Z cpu15:2097510)ScsiClaimrule: 2039: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-21T12:47:26.941Z cpu15:2097510)WARNING: ScsiPath: 8496: NMP cannot claim a path to NVMeOF device vmhba65:C0:T0:L0
2022-04-21T12:47:26.941Z cpu15:2097510)ScsiClaimrule: 2039: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba65:C0:T0:L0: Not supported
2022-04-21T12:47:26.941Z cpu15:2097510)ScsiClaimrule: 2518: Error claiming path vmhba65:C0:T0:L0. Not supported.

@singlecheeze
Author

singlecheeze commented Apr 21, 2022

The plot thickens with SPDK (just one RAM device/drive for this test, not an MD array; it's actually a VMDK sitting on an iSCSI datastore in a VM, and I'm re-presenting that storage back to the ESXi host the VM lives on):

[dave@fedora spdk]$ lsblk
NAME        MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
zram0       251:0    0   8G  0 disk [SWAP]
nvme0n1     259:0    0  16G  0 disk 
├─nvme0n1p1 259:1    0   1G  0 part /boot
└─nvme0n1p2 259:2    0  15G  0 part /home
                                    /
nvme0n2     259:3    0  32G  0 disk 
nvme0n3     259:4    0  32G  0 disk

[dave@fedora spdk]$ sudo ./scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -m 8 -c 8192

[dave@fedora spdk]$ sudo ./scripts/rpc.py bdev_malloc_create -b Malloc1 512 512
Malloc1

[dave@fedora spdk]$ sudo ./scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a -s SPDK00000000000001 -d SPDK_Controller1

[dave@fedora spdk]$ sudo ./scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 Malloc1

[dave@fedora spdk]$ sudo ./scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t tcp -a 172.16.1.223 -s 4420
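
Worth noting: bdev_malloc_create takes the total size in MB followed by the block size in bytes, so the command above creates a 512 MB bdev with 512-byte logical blocks, which is the 512e geometry ESXi is looking for. The bdev can be double-checked before exporting it (assuming spdk_tgt is already running on its default RPC socket):

# Dump the bdev's properties; block_size should read 512 here.
sudo ./scripts/rpc.py bdev_get_bdevs -b Malloc1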

Results in:

[dave@fedora spdk]$ sudo build/bin/spdk_tgt &
[2022-04-21 16:12:01.131390] tcp.c: 821:nvmf_tcp_listen: *NOTICE*: *** NVMe/TCP Target Listening on 172.16.1.223 port 4420 ***
[2022-04-21 16:12:14.804696] subsystem.c:1215:spdk_nvmf_subsystem_listener_allowed: *WARNING*: Allowing connection to discovery subsystem on TCP/172.16.1.223/4420, even though this listener was not added to the discovery subsystem.  This behavior is deprecated and will be removed in a future release.
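
As a quick sanity check before pointing ESXi at it, the same discovery step can be exercised from any Linux box with nvme-cli (adjust the address/port to the target above):

# Query the discovery controller the way ESXi's "discover controllers" dialog does;
# nqn.2016-06.io.spdk:cnode1 should appear in the returned discovery log page.
sudo nvme discover -t tcp -a 172.16.1.223 -s 4420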

ESXi Sees:

2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)World: 12075: VC opID l1pu47yh-54080-auto-15q9-h5:70021458-4b-f1-6343 maps to vmkernel opID af94e221
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMEDEV:1393 Ctlr 256 allocated, maximum queues 16
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:159 Controller 256, connecting using parameters:
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:161 kato:      0
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:162 subtype:   1
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:163 cdc:       0
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:166 target type:    NVMe
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:174 vmkParams.asqsize:   32
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:175 vmkParams.cntlid:    0xffff
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:177 vmkParams.hostid:    615ae605-1131-d5a2-0c71-246e961c89c0
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:178 vmkParams.hostnqn:   nqn.2014-08.localdomain:nvme:r730esxi3
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:179 vmkParams.subnqn:    nqn.2014-08.org.nvmexpress.discovery
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:196 vmkParams.trType:    TCP
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:198 vmkParams.trsvcid:   4420
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:200 vmkParams.traddr:    172.16.1.223
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFDEV:202 vmkParams.tsas.digest:    0
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectController:781 vmhba65, controller 256
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectCM:4408 [ctlr 256, queue 0]
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)NVMFNET:151 Uplink: vmnic4, portset: vSwitch0.
2022-04-21T20:12:14.753Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectCM:4460 [ctlr 256, queue 0] Using source vmknic vmk0 for socket binding
2022-04-21T20:12:14.803Z cpu0:2100851 opID=af94e221)nvmetcp:nt_SocketConnect:4315 [ctlr 256, queue 0] Socket connected
2022-04-21T20:12:14.803Z cpu33:2109077)nvmetcp:nt_TxWorldFunc:3844 [ctlr 256, queue 0]
2022-04-21T20:12:14.803Z cpu2:2109078)nvmetcp:nt_RxWorldFunc:2974 [ctlr 256, queue 0]
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)nvmetcp:nt_SendICReq:3975 [ctlr 256, queue 0]
2022-04-21T20:12:14.804Z cpu33:2109077)nvmetcp:nt_SendICReqPdu:3255 [ctlr 256, queue 0] txPdu 0x431029e04060, pduLength 128
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 256, queue 0] connected, qSize:32, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectController:873 [ctlr 256] connected successfully
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)NVMFDEV:2141 controller 256, queue 0
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)NVMFDEV:2218 Connected to queue 0, successfully
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)NVMFDEV:1904 Controller 0x431134608bc0, set ctlrID to 1
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)NVMFDEV:676 Adding new controller to target active list: nqn.2014-08.org.nvmexpress.discovery
2022-04-21T20:12:14.804Z cpu0:2100851 opID=af94e221)NVMEDEV:4703 disabling controller...
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:4712 enabling controller...
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:1218 Ctlr 256, queue 0, update queue size to 32
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:4720 reading version register...
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:4734 get controller identify data...
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:7426 Controller 256, name nqn.2014-08.org.nvmexpress.discovery#vmhba65#172.16.1.223:4420
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:9120 Succeeded to create recovery world for controller 256
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMFDEV:931 Controller 256 (nqn.2014-08.org.nvmexpress.discovery#vmhba65#172.16.1.223:4420) registered successfully.
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMFDEV:1420 Connected to controller successfully
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:8599 Got discovery log page at offset 0, tlen 2048 successfully
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMFDEV:1273 Controller 256 (nqn.2014-08.org.nvmexpress.discovery#vmhba65#172.16.1.223:4420)
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:6697 Deleting controller 256 (nqn.2014-08.org.nvmexpress.discovery#vmhba65#172.16.1.223:4420)
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)WARNING: NVMEDEV:9653 Controller 256 recovery is not active.
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:6366 Controller 256
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:7599 controller 256
2022-04-21T20:12:14.805Z cpu39:2109079)NVMEDEV:9443 recover controller 256 finish.
2022-04-21T20:12:14.805Z cpu39:2109079)NVMEDEV:9448 recovery world of controller 256 exit.
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:9167 controller 256 recovery world destroyed.
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:7874 Reset admin queue (controller 256)
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMEDEV:7770 controller 256, queue 0
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)NVMFDEV:2500 Controller 256
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ResetQueue:1237 [ctlr 256] reset queue 0
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ResetQueueInt:5057 [ctlr 256, queue 0]
2022-04-21T20:12:14.805Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectCM:4715 [ctlr 256, queue 0]
2022-04-21T20:12:14.805Z cpu2:2109078)nvmetcp:nt_ReceiveFromSocket:967 [ctlr 256, queue 0] failed to read socket, expected 8, received 0.: Broken pipe
2022-04-21T20:12:14.805Z cpu2:2109078)nvmetcp:nt_ReceivePduHeader:2777 [ctlr 256, queue 0] failed to receive common header: Broken pipe
2022-04-21T20:12:14.806Z cpu2:2109078)nvmetcp:nt_RxWorldFunc:3016 [ctlr 256, queue 0] queue disconnecting, will exit.
2022-04-21T20:12:14.806Z cpu2:2109078)nvmetcp:nt_RxWorldFunc:3076 [ctlr 256, queue 0] exiting RxWorld
2022-04-21T20:12:14.806Z cpu33:2109077)nvmetcp:nt_TxWorldFunc:3855 [ctlr 256, queue 0] tx world wakes up: World is marked for death
2022-04-21T20:12:14.806Z cpu33:2109077)nvmetcp:nt_TxWorldFunc:3859 [ctlr 256, queue 0] tx world dying, will exit.
2022-04-21T20:12:14.806Z cpu33:2109077)nvmetcp:nt_TxWorldFunc:3944 [ctlr 256, queue 0] exiting TxWorld
2022-04-21T20:12:14.806Z cpu0:2100851 opID=af94e221)nvmetcp:nt_CleanupPendingIOs:4615 [ctlr 256, queue 0]
2022-04-21T20:12:14.906Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectCM:4808 [ctlr 256, queue 0] exiting Success
2022-04-21T20:12:14.906Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectCM:4408 [ctlr 256, queue 0]
2022-04-21T20:12:14.906Z cpu0:2100851 opID=af94e221)NVMFNET:151 Uplink: vmnic4, portset: vSwitch0.
2022-04-21T20:12:14.906Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ConnectCM:4460 [ctlr 256, queue 0] Using source vmknic vmk0 for socket binding
2022-04-21T20:12:14.956Z cpu0:2100851 opID=af94e221)nvmetcp:nt_SocketConnect:4315 [ctlr 256, queue 0] Socket connected
2022-04-21T20:12:14.956Z cpu27:2109080)nvmetcp:nt_TxWorldFunc:3844 [ctlr 256, queue 0]
2022-04-21T20:12:14.956Z cpu9:2109081)nvmetcp:nt_RxWorldFunc:2974 [ctlr 256, queue 0]
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_SendICReq:3975 [ctlr 256, queue 0]
2022-04-21T20:12:14.957Z cpu27:2109080)nvmetcp:nt_SendICReqPdu:3255 [ctlr 256, queue 0] txPdu 0x431029e04f00, pduLength 128
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ResetQueueInt:5079 [ctlr 256, queue 0] reconnected
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_ResetQueue:1256 [ctlr 256, queue 0] reset succeeded
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)NVMFDEV:2141 controller 256, queue 0
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)WARNING: NVMEIO:3718 Ctlr 256, nvmeCmd 0x45b8c57ffe40 (opc 7f), admin queue not available,  nvmeStatus 805
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)NVMFDEV:2218 Connected to queue 0, successfully
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)NVMFDEV:1904 Controller 0x431134608bc0, set ctlrID to 2
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)NVMEDEV:7848 Controller 256, queue 0 reset complete. Status Success
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectController:918 [ctlr 256]
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectQueueInt:5004 [ctlr 256, queue 0]
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectCM:4715 [ctlr 256, queue 0]
2022-04-21T20:12:14.957Z cpu9:2109081)nvmetcp:nt_ReceiveFromSocket:967 [ctlr 256, queue 0] failed to read socket, expected 8, received 0.: Broken pipe
2022-04-21T20:12:14.957Z cpu9:2109081)nvmetcp:nt_ReceivePduHeader:2777 [ctlr 256, queue 0] failed to receive common header: Broken pipe
2022-04-21T20:12:14.957Z cpu9:2109081)nvmetcp:nt_RxWorldFunc:3016 [ctlr 256, queue 0] queue disconnecting, will exit.
2022-04-21T20:12:14.957Z cpu27:2109080)nvmetcp:nt_TxWorldFunc:3855 [ctlr 256, queue 0] tx world wakes up: World is marked for death
2022-04-21T20:12:14.957Z cpu27:2109080)nvmetcp:nt_TxWorldFunc:3859 [ctlr 256, queue 0] tx world dying, will exit.
2022-04-21T20:12:14.957Z cpu9:2109081)nvmetcp:nt_RxWorldFunc:3076 [ctlr 256, queue 0] exiting RxWorld
2022-04-21T20:12:14.957Z cpu27:2109080)nvmetcp:nt_TxWorldFunc:3944 [ctlr 256, queue 0] exiting TxWorld
2022-04-21T20:12:14.957Z cpu0:2100851 opID=af94e221)nvmetcp:nt_CleanupPendingIOs:4615 [ctlr 256, queue 0]
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectCM:4808 [ctlr 256, queue 0] exiting Success
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)nvmetcp:nt_FreeSubmissionResources:5189 [ctlr 256, queue 0]
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectQueueInt:5028 [ctlr 256, queue 0] disconnected
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)nvmetcp:nt_DisconnectController:953 controller 256 disconnected
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)NVMEDEV:1565 Ctlr 256 freeing
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)NVMEDEV:9060 Cancel requests of controller 256, 0 left.
2022-04-21T20:12:15.057Z cpu0:2100851 opID=af94e221)NVMFVSI:1304 Discovery executed successfully
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)World: 12075: VC opID l1pu47yh-54089-auto-15qi-h5:70021464-76-1c-6348 maps to vmkernel opID 59976c6a
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMEDEV:1393 Ctlr 257 allocated, maximum queues 16
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:159 Controller 257, connecting using parameters:
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:161 kato:      30000
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:162 subtype:   2
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:163 cdc:       0
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:166 target type:    NVMe
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:174 vmkParams.asqsize:   32
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:175 vmkParams.cntlid:    0xffff
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:177 vmkParams.hostid:    615ae605-1131-d5a2-0c71-246e961c89c0
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:178 vmkParams.hostnqn:   nqn.2014-08.localdomain:nvme:r730esxi3
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:179 vmkParams.subnqn:    nqn.2016-06.io.spdk:cnode1
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:196 vmkParams.trType:    TCP
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:198 vmkParams.trsvcid:   4420
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:200 vmkParams.traddr:    172.16.1.223
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFDEV:202 vmkParams.tsas.digest:    0
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectController:781 vmhba65, controller 257
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 0]
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)NVMFNET:151 Uplink: vmnic4, portset: vSwitch0.
2022-04-21T20:12:21.733Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4460 [ctlr 257, queue 0] Using source vmknic vmk0 for socket binding
2022-04-21T20:12:21.783Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 0] Socket connected
2022-04-21T20:12:21.783Z cpu32:2109082)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 0]
2022-04-21T20:12:21.783Z cpu41:2109083)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 0]
2022-04-21T20:12:21.784Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 0]
2022-04-21T20:12:21.784Z cpu32:2109082)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 0] txPdu 0x431029e04060, pduLength 128
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 0] connected, qSize:32, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectController:873 [ctlr 257] connected successfully
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 0
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 0, successfully
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMFDEV:1904 Controller 0x43113460b340, set ctlrID to 1
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMFDEV:348 Adding new controller nqn.2016-06.io.spdk:cnode1 to active list
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:4703 disabling controller...
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:4712 enabling controller...
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 0, update queue size to 32
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:4720 reading version register...
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:4734 get controller identify data...
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:7426 Controller 257, name nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:7222 Ctlr(257) nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420 got slot 0
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:7485 controller 257
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:7548 controller 257, set KATO to 30000 miliseconds.
2022-04-21T20:12:21.785Z cpu2:2100850 opID=59976c6a)NVMEDEV:4814 configuring IO queues...
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMEDEV:1918 Ctlr 257, set queue size to 128.
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMEDEV:1780 Ctlr 257, tqsz 2048, qsz 128, qn 2, minqn 1, maxqn 16, nc 24, nn 2
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMEDEV:2021 Ctlr 257, calling setNumberIOQueues (16 queues desired)
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMEDEV:2034 Ctlr 257, 8 queues allocated
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 1 (qsize 128)
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 1, update queue size to 128
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 1
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 1, size 128
2022-04-21T20:12:21.786Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 1]
2022-04-21T20:12:21.836Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 1] Socket connected
2022-04-21T20:12:21.836Z cpu43:2109084)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 1]
2022-04-21T20:12:21.836Z cpu39:2109085)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 1]
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 1]
2022-04-21T20:12:21.837Z cpu43:2109084)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 1] txPdu 0x431029e07060, pduLength 128
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 1] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 1] connected successfully
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 1
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 1 (expect 1) not available, nvmeStatus 805
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 1, successfully
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 1 (qsize 128) created successfully.
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 1 created
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 2 (qsize 128)
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 2, update queue size to 128
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 2
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 2, size 128
2022-04-21T20:12:21.837Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 2]
2022-04-21T20:12:21.887Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 2] Socket connected
2022-04-21T20:12:21.887Z cpu42:2109086)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 2]
2022-04-21T20:12:21.887Z cpu37:2109087)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 2]
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 2]
2022-04-21T20:12:21.888Z cpu42:2109086)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 2] txPdu 0x431029e12160, pduLength 128
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 2] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 2] connected successfully
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 2
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 2 (expect 2) not available, nvmeStatus 805
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 2, successfully
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 2 (qsize 128) created successfully.
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 2 created
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 3 (qsize 128)
2022-04-21T20:12:21.888Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 3, update queue size to 128
2022-04-21T20:12:21.889Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 3
2022-04-21T20:12:21.889Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 3, size 128
2022-04-21T20:12:21.889Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 3]
2022-04-21T20:12:21.939Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 3] Socket connected
2022-04-21T20:12:21.939Z cpu17:2109088)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 3]
2022-04-21T20:12:21.939Z cpu21:2109089)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 3]
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 3]
2022-04-21T20:12:21.940Z cpu17:2109088)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 3] txPdu 0x431029e1d260, pduLength 128
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 3] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 3] connected successfully
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 3
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 3 (expect 3) not available, nvmeStatus 805
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 3, successfully
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 3 (qsize 128) created successfully.
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 3 created
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 4 (qsize 128)
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 4, update queue size to 128
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 4
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 4, size 128
2022-04-21T20:12:21.940Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 4]
2022-04-21T20:12:21.990Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 4] Socket connected
2022-04-21T20:12:21.990Z cpu44:2109090)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 4]
2022-04-21T20:12:21.990Z cpu7:2109091)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 4]
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 4]
2022-04-21T20:12:21.991Z cpu44:2109090)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 4] txPdu 0x431029e28360, pduLength 128
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 4] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 4] connected successfully
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 4
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 4 (expect 4) not available, nvmeStatus 805
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 4, successfully
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 4 (qsize 128) created successfully.
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 4 created
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 5 (qsize 128)
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 5, update queue size to 128
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 5
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 5, size 128
2022-04-21T20:12:21.991Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 5]
2022-04-21T20:12:22.041Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 5] Socket connected
2022-04-21T20:12:22.041Z cpu6:2109092)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 5]
2022-04-21T20:12:22.041Z cpu13:2109093)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 5]
2022-04-21T20:12:22.042Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 5]
2022-04-21T20:12:22.042Z cpu6:2109092)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 5] txPdu 0x431029e33460, pduLength 128
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 5] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 5] connected successfully
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 5
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 5 (expect 5) not available, nvmeStatus 805
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 5, successfully
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 5 (qsize 128) created successfully.
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 5 created
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 6 (qsize 128)
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 6, update queue size to 128
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 6
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 6, size 128
2022-04-21T20:12:22.043Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 6]
2022-04-21T20:12:22.093Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 6] Socket connected
2022-04-21T20:12:22.093Z cpu39:2109094)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 6]
2022-04-21T20:12:22.093Z cpu7:2109095)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 6]
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 6]
2022-04-21T20:12:22.094Z cpu39:2109094)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 6] txPdu 0x431029e47010, pduLength 128
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 6] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 6] connected successfully
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 6
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 6 (expect 6) not available, nvmeStatus 805
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 6, successfully
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 6 (qsize 128) created successfully.
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 6 created
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 7 (qsize 128)
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 7, update queue size to 128
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 7
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 7, size 128
2022-04-21T20:12:22.094Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 7]
2022-04-21T20:12:22.144Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 7] Socket connected
2022-04-21T20:12:22.144Z cpu9:2109096)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 7]
2022-04-21T20:12:22.144Z cpu37:2109097)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 7]
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 7]
2022-04-21T20:12:22.145Z cpu9:2109096)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 7] txPdu 0x431029e56010, pduLength 128
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 7] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 7] connected successfully
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 7
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 7 (expect 7) not available, nvmeStatus 805
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 7, successfully
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 7 (qsize 128) created successfully.
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 7 created
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)NVMEDEV:2050 Ctlr 257, creating IO queue 8 (qsize 128)
2022-04-21T20:12:22.145Z cpu2:2100850 opID=59976c6a)NVMEDEV:1218 Ctlr 257, queue 8, update queue size to 128
2022-04-21T20:12:22.146Z cpu2:2100850 opID=59976c6a)NVMFDEV:2633 controller 257, queue 8
2022-04-21T20:12:22.146Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1072 [ctlr 257] queue 8, size 128
2022-04-21T20:12:22.146Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectCM:4408 [ctlr 257, queue 8]
2022-04-21T20:12:22.196Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SocketConnect:4315 [ctlr 257, queue 8] Socket connected
2022-04-21T20:12:22.196Z cpu42:2109098)nvmetcp:nt_TxWorldFunc:3844 [ctlr 257, queue 8]
2022-04-21T20:12:22.196Z cpu32:2109099)nvmetcp:nt_RxWorldFunc:2974 [ctlr 257, queue 8]
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_SendICReq:3975 [ctlr 257, queue 8]
2022-04-21T20:12:22.197Z cpu42:2109098)nvmetcp:nt_SendICReqPdu:3255 [ctlr 257, queue 8] txPdu 0x431029e69010, pduLength 128
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectQueueInt:4136 [ctlr 257, queue 8] connected, qSize:128, maxCapsuleSize:8256, maxH2CData:131072, hpda:0, cpda:0, hdgst:0, ddgst:0
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)nvmetcp:nt_ConnectIOQueue:1121 [ctlr 257, queue 8] connected successfully
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMFDEV:2141 controller 257, queue 8
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)WARNING: NVMEIO:3581 Ctlr 257, nvmeCmd 0x45b8c57ffac0 (opc 7f), queue 8 (expect 8) not available, nvmeStatus 805
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMFDEV:2218 Connected to queue 8, successfully
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMFDEV:2697 CTLR 257 queue 8 (qsize 128) created successfully.
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:2066 Ctlr 257, queue 8 created
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:2088 Ctlr 257, 8 IO queues created.
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:1956 Ctlr 257, numaNode 0, baseQueueID 1, queueMask 7
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:1956 Ctlr 257, numaNode 1, baseQueueID 1, queueMask 7
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 5259: Scan operation 0 received on adapter vmhba65
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 3757: controller = nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)WARNING: NVMEDEV:1529 Subscription of NS ATTR change event is ignored for non-vVol controller 257
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:8467 Configuring aec: 0x100 on controller 257.
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:8499 Async event configuration value is 0x100 on controller 257.
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:10248 Subscribed events for controller:257
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEPSA:2307 Event setting for controller:257 status:Success
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 3601: Controller nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420 fuseOp 1 oncs 2c cmic 3 nscnt 20
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 3970: controller Added = nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 5283: Scan operation 0 completed on adapter vmhba65
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:4897 discovering namespaces...
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:6187 Discover namespace on controller 257
2022-04-21T20:12:22.197Z cpu2:2100850 opID=59976c6a)NVMEDEV:5456 Controller 257, construct namespace 1
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NVMEDEV:5398 Namespace identifier type 0x2, length 0x10.
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NVMEDEV:5398 Namespace identifier type 0x3, length 0x10.
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NVMEDEV:5339 Set name of ns 1 to eui.5372cc0ac8344c669ad91ba7ac111aee
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 5259: Scan operation 3 received on adapter vmhba65
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 4753: controller = nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 670: controller = nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NVMEPSA:1645 adpater: vmhba65, action: 0
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 4538: Created path for NS (under nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420) 0x430660d9de40 = 0x431134608cc0 for ns = 1 for path vmhba65:C0:T0:L0
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 4563: ANA enabled = 0 on ns=1
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)ns = 1
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NvmeDiscover: 5283: Scan operation 3 completed on adapter vmhba65
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NVMEDEV:6327 Discover namespaces on controller 257 is complete
2022-04-21T20:12:22.198Z cpu2:2100850 opID=59976c6a)NVMEDEV:8467 Configuring aec: 0x100 on controller 257.
2022-04-21T20:12:22.198Z cpu4:2097507)HPP: HppCreateDevice:3071: Created logical device 'eui.5372cc0ac8344c669ad91ba7ac111aee'.
2022-04-21T20:12:22.198Z cpu4:2097507)HPP: HppClaimPath:3916: Non-ALUA target (vmhba65:C0:T0:L0)
2022-04-21T20:12:22.198Z cpu4:2097507)HPP: HppClaimPath:3946: Added path 'vmhba65:C0:T0:L0' to logical device 'eui.5372cc0ac8344c669ad91ba7ac111aee'. Total Path Count 1
2022-04-21T20:12:22.198Z cpu41:2097838)HPP: HppRegisterDevice:3360: Registering logical device with uid 'eui.5372cc0ac8344c669ad91ba7ac111aee'.
2022-04-21T20:12:22.198Z cpu41:2097838)StorageApdHandler: 966: APD Handle  Created with lock[StorageApd-0x430660d5de90]
2022-04-21T20:12:22.198Z cpu16:2100850 opID=59976c6a)NVMEDEV:8499 Async event configuration value is 0x100 on controller 257.
2022-04-21T20:12:22.198Z cpu16:2100850 opID=59976c6a)NVMEDEV:9120 Succeeded to create recovery world for controller 257
2022-04-21T20:12:22.198Z cpu16:2100850 opID=59976c6a)NVMFDEV:931 Controller 257 (nqn.2016-06.io.spdk:cnode1#vmhba65#172.16.1.223:4420) registered successfully.
2022-04-21T20:12:22.198Z cpu16:2100850 opID=59976c6a)NVMFVSI:1495 Connected to controller successfully
2022-04-21T20:12:22.198Z cpu16:2100850 opID=59976c6a)NVMFVSI:812 Masking out trType 1
2022-04-21T20:12:22.198Z cpu16:2100850 opID=59976c6a)NVMFVSI:812 Masking out trType 2
2022-04-21T20:12:22.239Z cpu41:2097838)HPP: HppPathGroupMovePath:644: Path "vmhba65:C0:T0:L0" state changed from "dead" to "active"
2022-04-21T20:12:22.239Z cpu41:2097838)PsaStorEvents: 676: Event Subsystem: Device Events, Created!
2022-04-21T20:12:22.239Z cpu41:2097838)PsaStorEvents: 676: Event Subsystem: Device Events - Internal, Created!
2022-04-21T20:12:22.239Z cpu41:2097838)ScsiDevice: 4417: Legacy VML ID : vml.065372cc0ac8344c669ad91ba7ac111aee
2022-04-21T20:12:22.239Z cpu41:2097838)ScsiDeviceIO: 9649: QErr is correctly set to 0x0 for device eui.5372cc0ac8344c669ad91ba7ac111aee.
2022-04-21T20:12:22.239Z cpu41:2097838)WARNING: NvmeScsi: 172: SCSI opcode 0x1a (0x45d8d33b95c8) on path vmhba65:C0:T0:L0 to namespace eui.5372cc0ac8344c669ad91ba7ac111aee failed with NVMe error status: 0x2 translating to SCSI error H:0x0
2022-04-21T20:12:22.239Z cpu41:2097838)WARNING: D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0 
2022-04-21T20:12:22.239Z cpu24:2097259)WARNING: ScsiDeviceIO: 4052: ModeSense 0x1a request failed - blocking page:0x1c subpage:0x2 eui.5372cc0ac8344c669ad91ba7ac111aee
2022-04-21T20:12:22.239Z cpu41:2097838)ScsiDeviceIO: 10156: Could not detect setting of sitpua for device eui.5372cc0ac8344c669ad91ba7ac111aee. Error Not supported.
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: 6, Parameter: 0x430660d6bbc0, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: 5, Parameter: 0x430660daa860, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: 9, Parameter: 0x430660d752d0, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: 8, Parameter: 0x430660d536d0, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: 7, Parameter: 0x430660da1b90, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: d, Parameter: 0x430660d755c0, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)PsaStorEvents: 407: EventSubsystem: Device Events, Event Mask: 1, Parameter: 0x430660d78b30, Registered!
2022-04-21T20:12:22.240Z cpu41:2097838)ScsiDevice: 4797: Successfully registered device "eui.5372cc0ac8344c669ad91ba7ac111aee" from plugin "HPP" of type 0

image

RAM Disk for testing:
image

image
image
image

@pkoutoupis
Owner

pkoutoupis commented Apr 21, 2022

Hmmmm. Remember your comment from earlier?
#82 (comment)

The linux nvme target does not support fused commands so there is no support for VMware currently.

Notice how the nvmet Linux in-kernel target, which I use, does not support fused commands, while in your latest update (#82 (comment)) using SPDK, it does (fuseOp 1 in the log above).

I wonder if THIS is the reason.

Anyway, there was an attempt to implement this in the kernel but it was met with a lot of resistance: http://lists.infradead.org/pipermail/linux-nvme/2021-January/021783.html
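
As an aside, whether a given target advertises fused compare-and-write can be checked from a Linux initiator via the FUSES field of Identify Controller (a hedged example; /dev/nvme1 is just whatever controller node the fabrics connection happened to get):

# FUSES bit 0 set (e.g. "fuses : 0x1") means the controller supports the fused
# Compare and Write pair that ESXi relies on for VMFS atomic test-and-set;
# the in-kernel nvmet target reports 0 here, while the SPDK target above reports 1.
sudo nvme id-ctrl /dev/nvme1 | grep -i fuses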

@pkoutoupis
Owner

What version of ESXi are you using? According to the Lightbits website, version 7.0U3 has been certified with their Lightbits OS using the same driver stack (they developed the NVMe TCP driver).

https://www.lightbitslabs.com/vmware

@singlecheeze
Author

7.0U3d (Latest)

I've talked to them recently and think they are using SPDK

@pkoutoupis
Owner

pkoutoupis commented Apr 22, 2022

If that is indeed the case, then the hypervisor won't be able to consume volumes exported by the in-kernel target. It would only be possible to export them directly to VMs whose NVMe/TCP initiators don't require fused commands.

We may need to rethink this whole strategy. Maybe for VMware, just limit things to writing a Python wrapper script that lives in the scripts subdirectory and wraps around SPDK, while leaving the current implementation in place.
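
One possible shape for that (a sketch, assuming the transport has already been created with nvmf_create_transport as above; the bdev name, NQN, serial, and model strings are placeholders): hand the existing RapidDisk device to SPDK as an AIO bdev with 512-byte blocks and export that, so the RAM disk and any cache mapping stay managed by rapiddisk while SPDK only handles the NVMe/TCP export.

# Export an existing rapiddisk device (/dev/rd0) through SPDK's NVMe/TCP target
# with 512-byte logical blocks (names and the listener address are placeholders).
sudo ./scripts/rpc.py bdev_aio_create /dev/rd0 rd0_aio 512
sudo ./scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:rd0 -a -s RAPIDDISK000000001 -d RapidDisk_rd0
sudo ./scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:rd0 rd0_aio
sudo ./scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:rd0 -t tcp -a 172.16.1.223 -s 4420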

@pkoutoupis
Owner

So, I asked the question and one of the co-founders answered: https://koutoupis.com/2022/04/22/vmware-lightbits-labs-and-nvme-over-tcp/#comment-56

@singlecheeze
Author

Well done!!! More adventures in SPDK land @pkoutoupis
https://gist.github.com/singlecheeze/0bbc2c29a5b6670887127b93f7b71e3f

@pkoutoupis
Owner

So, how did we want to resolve this ticket? Some thoughts came to my mind:

@singlecheeze
Author

Let me experiment with using SPDK with RapidDisk; happy to create a pull request if I can get something pulled together.

@pkoutoupis
Owner

@singlecheeze Hello. I am going through the open tickets and wanted to follow up on this.

@singlecheeze
Author

@pkoutoupis I tried to do some benchmarking with SPDK but it was very unstable: https://gist.github.com/singlecheeze/0bbc2c29a5b6670887127b93f7b71e3f

Additionally, with the Broadcom/VMware acquisition announcement, I'm moving away from VMware and looking at other hypervisors, namely XCP-ng, which I believe is Debian-based, so maybe we could do an NVMe over TCP plug-in for that.

In any case, I think this one is good to close.

@pkoutoupis
Owner

Thank you very much for the update.
