
viostor Reset to device, \Device\RaidPort3, was issued. VM is frozen. #756

Closed
mherlitzius opened this issue Apr 5, 2022 · 160 comments
Labels: Bug

@mherlitzius

mherlitzius commented Apr 5, 2022

Unfortunately, we are experiencing the same issue as described in #623.

The error message "viostor Reset to device, \Device\RaidPort3, was issued." appears in the log and the VM is frozen / unresponsive. Only a cold reboot helps. This has already happened several times.
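One way to pull just these entries out of the System event log for sharing is with wevtutil (a sketch only; the provider name - viostor here - and the event ID, typically 129 for storport reset events, should be adjusted to whatever Event Viewer actually shows):

  wevtutil qe System /q:"*[System[Provider[@Name='viostor'] and (EventID=129)]]" /f:text /c:50 /rd:true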

Environment:

  • KVM / libvirt 3.0.0-4+deb9u5
  • Storage: Ceph Nautilus 14.2.21, SSD pool
  • Problem occurs on both Intel (Xeon E5-2643) and AMD (EPYC 7F72) compute nodes

Affected VMs:
Windows server version: Windows Server 2016

  • Virtio driver version: build 215

Windows server version: Windows Server 2016

  • Virtio driver version: 100.77.104.17100

Windows server version: Windows Server 2012 R2

  • Virtio driver version: 62.74.104.14100

It seems primarily servers running MS SQL Server are affected.

The error message means that the storage did not respond for 30 seconds. In Ceph we see nothing close to that; the latency of the affected volume is good. Disk performance and throughput on the Ceph SSD storage pool are excellent when measured with CrystalDiskMark or the SQL Server-internal benchmark.

It seems the issue may occur at arbitrary times, i.e. it is not related to the overall load on the storage cluster. The VM often runs for a few minutes after the first "viostor Reset to device" entries appear in the log. The first application-level problems can appear even if the entire VM is not yet frozen.

On a volume on which this error occurred once, we can reproduce it as follows:

  • run CrystalDiskMark on that volume (or just copy files from that volume), while the volume is attached to the original server. The error appears in the log and the VM freezes within seconds.

When we cannot reproduce this issue:

  • If the affected volume is attached to another VM, we cannot reproduce this problem. Running the same operation (e.g. CrystalDiskMark or copying files) on the same volume while it is attached to another VM does not trigger the error.
  • Running the same operation on a (newly created) volume on the initial VM does not reproduce the problem.

This means that apparently "only" this combination of volume and VM causes the problem.

We have been able to resolve this issue, at least for some time, by copying the data from the affected volume to a new volume (while attached to another VM) and then attaching the newly created volume to the initial VM.

It works for a while before the error reoccurs.

It also seems that after the problem has occurred for the very first time on a volume, it then reoccurs more often (it is not always possible to immediately replace the affected volume, as these are production servers).

Do you have any idea what could be causing this problem and how we can work around it? Thank you very much.

@vrozenfe
Collaborator

vrozenfe commented Apr 6, 2022

@mherlitzius
The problem is supposed to be fixed for both vioscsi and viostor with commits #684 and #683 respectively.

It is quite strange that you can reproduce the problem on WS2016 with build 215, since it was tested and verified by our QE: https://bugzilla.redhat.com/show_bug.cgi?id=2013976 and https://bugzilla.redhat.com/show_bug.cgi?id=2028000
Would it be possible to share the system event log file from the WS2016 system?

Thanks,
Vadim.

@mherlitzius
Author

@vrozenfe Thank you for your timely reply. I will ask for the event log from the WS2016 machine running build 215.
If this is fixed with build 215, is it enough to update the driver or should the drives or the whole VM be rebuilt?

@vrozenfe
Collaborator

vrozenfe commented Apr 6, 2022

@mherlitzius
Update should be enough. However, you might be asked to reboot the VM after that.

@mherlitzius
Author

@vrozenfe Please see attached the event log. The error just appeared this morning.
srv-t-jdbib1.zip

@vrozenfe
Collaborator

vrozenfe commented Apr 6, 2022

@mherlitzius
Thanks a lot.
By any chance, did you change either of the two parameters "TimeoutValue" or "IoTimeoutValue" mentioned in #623?
I'm asking because the reset event is coming every 4 minutes instead of the default 30 seconds. Another interesting thing is the bunch of Event ID 153 entries. Can you please export the system event log file in .evtx format instead of .csv? That will make it possible to check and analyse the SRB status returned by Ceph.
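For reference, the values and the export can be checked/produced from an elevated prompt inside the guest roughly like this (a sketch; the viostor service applies to the virtio-blk driver, use vioscsi for virtio-scsi, and the target folder for the .evtx file must already exist):

  reg query "HKLM\SYSTEM\CurrentControlSet\Services\Disk" /v TimeoutValue
  reg query "HKLM\SYSTEM\CurrentControlSet\Services\viostor\Parameters" /v IoTimeoutValue
  wevtutil epl System C:\Temp\system.evtx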

Best,
Vadim.

@mherlitzius
Author

@vrozenfe
There are no entries for the Storport Miniport registry keys.
There is an entry for SCSI miniport:

(screenshot: image003)
Can I provide you the event log via another channel, as it contains customer-specific information? Thanks.

Best,
Matthias

@thinkingcap

Did you get any resolution to this? I have the same issue on Windows Server 2019 / SQL Server 2019 using VirtIO 0.1.229.

@melfacion

I am also interested to know whether there was any solution, or a planned new release of the virtio drivers.
We have approx. 90 Windows VMs, and this issue only happens to 3 of them for some reason (2 of them are Windows Server 2019, and the last one is Windows Server 2022).

Let me know if I can provide any logs or configuration that can be useful to narrow down the issue.
The freeze and error happen sporadically: sometimes several times a week, and sometimes it can go weeks between occurrences. But only on these 3 VMs, for some reason.

@vrozenfe
Collaborator

@melfacion @thinkingcap
build 229 is completely up to date.
If you didn't adjust "IoTimeoutValue" (https://learn.microsoft.com/en-us/windows-hardware/drivers/storage/registry-entries-for-storport-miniport-drivers), please do so, increasing it gradually (60/120/180) and checking whether it makes any difference. Please note that the VM needs to be restarted every time this parameter is changed.
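For example, the first step (60 seconds) for the vioscsi driver would look roughly like this from an elevated prompt inside the guest (a sketch; use the viostor service instead for virtio-blk disks, and reboot the VM afterwards):

  reg add "HKLM\SYSTEM\CurrentControlSet\Services\vioscsi\Parameters" /v IoTimeoutValue /t REG_DWORD /d 60 /f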
Those three VMs - are they running on the same or different hosts? Do they all use local or network storage? It would be nice if you could share the Windows event log files with me at vrozenfe_at_redhat_dot_com.

Thank you,
Vadim.

@melfacion

Those three VMs - are they running on the same or different hosts? Do they all use local or network storage? It would be nice if you could share the Windows event log files with me at vrozenfe_at_redhat_dot_com.

Thank you, Vadim.

@vrozenfe : different hosts, different datacenters, different storage pools. They all use Ceph for network storage. The hypervisor is Proxmox (ver. 7.3-6 in production currently and 7.4-3 in the test datacenter).
(2 of the VMs are in our production datacenter and are running customer workloads: one is a Qlik Sense application, the other is a WSUS update server. The third one, in a different datacenter, is one of our internal VMs used as a jump server / Windows Terminal Services host for accessing our customers.) So the WSUS update server and our internal server can easily be restarted for troubleshooting whenever needed. The last one is a customer-facing solution and should be up and running at all times.

I exported the events from all sources for some time before the last crash and some time after. Will send it to you as requested.

@thinkingcap

thinkingcap commented Jun 20, 2023

When I get the error "Reset to device, \Device\RaidPort2, was issued.", I get the following in the host Proxmox syslog:

QEMU[71107]: kvm: virtio: zero sized buffers are not allowed

Once I also got the following, with the same impact on the guest:
QEMU[2198]: kvm: Desc next is 3

Hypervisor is Proxmox

proxmox-ve: 7.4-1 (running kernel: 6.2.11-2-pve)
pve-manager: 7.4-13 (running version: 7.4-13/46c37d9c)
pve-kernel-6.2: 7.4-3
pve-kernel-5.15: 7.4-3
pve-kernel-6.1: 7.3-6
pve-kernel-6.2.11-2-pve: 6.2.11-2
pve-kernel-6.1.15-1-pve: 6.1.15-1
pve-kernel-5.15.107-2-pve: 5.15.107-2
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 15.2.17-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.2-1
proxmox-backup-file-restore: 2.4.2-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-4
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-4
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve

qm config

agent: 1
bios: ovmf
boot: order=scsi0;net0;ide0
cores: 12
cpu: Skylake-Server
description: #### Windows Server 2016%0A#### SQL Server 2019
efidisk0: SSD-R10:vm-201-disk-0,efitype=4m,format=raw,pre-enrolled-keys=1,size=528K
ide0: none,media=cdrom
machine: pc-q35-7.1
memory: 61440
meta: creation-qemu=7.1.0,ctime=1676181242
name: IRIS
net0: virtio=86:DB:0D:46:C1:58,bridge=vmbr0,firewall=1,tag=25
numa: 1
onboot: 1
ostype: win10
scsi0: SSD-R10:vm-201-disk-1,discard=on,format=raw,iothread=1,size=96G
scsi1: SSD-R10:vm-201-disk-2,discard=on,iothread=1,size=320G
scsi2: SSD-R10:vm-201-disk-3,discard=on,iothread=1,size=160G
scsi3: SSD-R10:vm-201-disk-4,discard=on,iothread=1,size=64G
scsi4: SSD-R10:vm-201-disk-5,discard=on,iothread=1,size=448G
scsihw: virtio-scsi-single
smbios1: uuid=973a474f-45aa-4caa-ab0e-bace1c0aa76e
sockets: 1
tags: ims;windows
vmgenid: 62178394-12fe-4df7-ae25-839471658f30

Workload is SQL Server / Analysis Server 2019 running ETL jobs and processing cubes.
I'll take a look at IoTimeoutValue - currently the reset message comes through every 60 seconds once I hit the problem.

I actually get this on 2 hosts, this one more than the other.
Both hosts are Dell servers, PERC hardware RAID 10, local storage using Intel DC SATA SSDs.

@vrozenfe
Collaborator

Since it is SQL Server, it might be useful to try reducing the maximum transfer size by specifying the "PhysicalBreaks" key in the Registry:
https://access.redhat.com/solutions/6963779
This key works for the virtio-scsi device only.
(If I'm not mistaken, WSUS can also use the SQL engine.)

Vadim.

@thinkingcap

It's protected content, unfortunately.

@vrozenfe
Collaborator

@thinkingcap
No problem. There is some open information in the driver code itself:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/vioscsi/vioscsi.c#L485
The "PhysicalBreaks" Registry key allows redefining the maximum transfer size supported by the vioscsi miniport driver. Without this key the default value is 512, which makes the maximum transfer size equal to 2MB:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/vioscsi/vioscsi.c#L501
IIRC on SQL Server the maximum transfer size can go up to 4MB, which means that SQL Server can generate 1MB/2MB loads. 2MB block transfers work really well for direct-LUN SSD backends but can be suboptimal for some old or network-connected storage. I would suggest setting "PhysicalBreaks" to 0x1f or 0x3f to see if it helps to fix the problem.
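As an illustration, setting it to 0x3f from an elevated prompt in the guest could look like the following (a sketch; the key path matches what is listed later in this thread and should be double-checked against the linked article, and the disk needs a disable/enable cycle, or the VM a reboot, to pick the value up):

  reg add "HKLM\SYSTEM\CurrentControlSet\Services\vioscsi\Parameters\Device" /v PhysicalBreaks /t REG_DWORD /d 63 /f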

@thinkingcap

@vrozenfe I assume it requires a reboot when changing?
Storage is local SATA SSD (Intel/Solidgm S4620) in RAID10 (Dell PERC)

@vrozenfe
Collaborator

@thinkingcap
The system disk requires a reboot; for a data disk, a disable/enable cycle of the device is enough.

@MaxXor

MaxXor commented Jun 26, 2023

Hello, I'm also haunted by this problem. Running Proxmox 7.2 in production, one Windows Server 2022 VM is affected, with MSSQL 2019 CU21 and virtio drivers 0.1.225. The storage backend for all VMs in the cluster is Ceph 16.2.7 (using 10Gbit SFP+, jumbo frames, 9000 MTU). Sometimes the VM runs for weeks without issues, sometimes there are multiple lockups in a single day, which require a hard reset.

Same symptoms:

  • Windows Event Viewer shows multiple "Reset to device, \Device\RaidPort2, was issued." warnings, one each minute
  • After a few entries of the previous error, the Proxmox host shows: kvm: virtio: zero sized buffers are not allowed
  • Then the VM is locked up and requires a hard reset

(screenshot: virtioscsi)

@vrozenfe According to your suggestions, I should first try setting these registry values, right? As you can see in the screenshot, it happens exactly every 60 secs, which means my default IO timeout is 60 secs, right? So I would adjust it to 90 secs for the vioscsi driver first?

  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\vioscsi\Parameters

    • IoTimeoutValue = 0x5a (90)
  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\vioscsi\Parameters\Device

    • PhysicalBreaks = 0x3f (63)

Does 0x3f make sense for CEPH with 9000 MTU?

@vrozenfe
Collaborator

@MaxXor
I would try to reduce the transfer size ("PhysicalBreaks") first to see if it helps to solve the problem.
Btw, there is a different option to adjust the transfer size without touching the Windows Registry.
Upstream and RHEL QEMU have a "max_sectors" parameter for that.
"-device virtio-scsi-pci,id=scsi-vioscsi1,max_sectors=63" will have the same effect as "PhysicalBreaks = 0x3f (63)".
Unfortunately, I don't know whether it works for Proxmox or not. You should probably clarify that with the Proxmox support team.

I don't recall ever seeing "kvm: virtio: zero sized buffers are not allowed" before. But since you and @thinkingcap both mentioned this message, can you guys by any chance post the QEMU version and the QEMU command line?

0x3f makes the maximum transfer size equal to 256K. In my understanding, a 30 sec (default timeout) period should be enough for jumbo-frame network storage to complete such a transfer, even for write operations, I think. Unfortunately I have very limited knowledge of Ceph, even though we use it as one of the backends for our internal storage testing.
Ideally, I would run some storage performance tests inside the VM to compare latency and standard deviation for different transfer sizes (4K/16K/64K/256K).
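For example, such a comparison can be run inside the guest with Microsoft's diskspd (a sketch only; the drive letter, test file size, duration and queue depth below are arbitrary - repeat with -b4K, -b16K, -b64K and -b256K and compare the reported AvgLat and LatStdDev columns):

  diskspd.exe -b256K -d60 -t4 -o8 -r -w0 -L -c4G D:\iotest.dat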

Best,
Vadim.

@MaxXor

MaxXor commented Jun 27, 2023

Thanks for your answer. Unfortunately, neither IoTimeoutValue (0x5a) nor PhysicalBreaks (0x3f) has helped yet. I really don't understand it. Large file transfers, e.g. copying a file, work fine, which should definitely put load on Ceph and cause higher IO latencies. It copies at 500 MB/s, no errors. But if I keep MSSQL running, monitoring the disk IO, it just locks up under low load (max. 30 MB/s) again after 1-2 hours.

I'm using the Proxmox QEMU package at version 7.2.0-8, which relates to this git commit: https://github.com/proxmox/pve-qemu/tree/93d558c1eef8f3ec76983cbe6848b0dc606ea5f1 It's the latest available for Proxmox 7.x.

Another thing I noticed, which might be worth mentioning, is that only one of the vioscsi disks locks up. During the reset events in Windows I can still read/write fine to all other disks, except the one that is locked up. The VM I'm using has 4 vioscsi disks.

I will post the QEMU command line later.

@thinkingcap

Did you reboot after applying those registry changes?

@MaxXor

MaxXor commented Jun 27, 2023

Yes. It's not a critical machine. I can reboot it at any time if it helps us figure out the root cause. 🙂

@thinkingcap

What does your VM config look like? Can you post a qm config <VMID>?

@MaxXor

MaxXor commented Jun 27, 2023

agent: 1
balloon: 0
boot: order=scsi0
cores: 6
cpu: host
machine: pc-q35-6.2
memory: 32768
meta: creation-qemu=6.2.0,ctime=1657105335
net0: virtio=76:80:CE:B7:21:69,bridge=vmbr4
numa: 0
onboot: 1
ostype: win11
scsi0: ceph_storage:vm-104-disk-0,iothread=1,discard=on,size=100G
scsi1: ceph_storage:vm-104-disk-1,iothread=1,cache=writeback,discard=on,size=500G
scsi2: ceph_storage:vm-104-disk-2,iothread=1,cache=writeback,discard=on,size=500G
scsi3: ceph_storage:vm-104-disk-3,iothread=1,discard=on,size=500G
scsihw: virtio-scsi-single
smbios1: uuid=5839521f-0c23-431c-9806-c7ee8aab6104
sockets: 1
vmgenid: 1c6d5ca3-862b-49c5-848d-453cbe070164

Only scsi1 and scsi2 are affected (I already tried without cache=writeback, but it didn't help).

@thinkingcap

I would also change the cache mode to Default (no cache).
I'm also not familiar with Ceph, so I'm unsure of the performance impact.

@thinkingcap

Back to square 1 here, just got kvm: virtio: zero sized buffers are not allowed again

@mherlitzius
Author

To mitigate this problem, we switched to local storage. Unfortunately, we have not found any other solution either. Thanks for your help anyway, Vadim!

@MaxXor

MaxXor commented Jun 30, 2023

@thinkingcap I just noticed that the automatic TRIM operations for SSDs were running at the time of the last lockup. I disabled them for now to see if the problem disappears.
(screenshot)

@thinkingcap

@mherlitzius I'm already using local SSD-backed storage (from the start).
@MaxXor Our issues occur during times of load when running SQL / AS jobs.

Proxmox 8.0.3 ran without issue for 5 days, but I got the less frequent error today:
QEMU[4927]: kvm: Desc next is 3
The error in the VM is the same.
So a different error message on the host, but the result is the same.

@benyamin-codez
Contributor

I think #735 is worth a revisit...

@JonKohler
Contributor

Author of #735 / #736 here - thanks for all of the commentary on this thread. All that change does is ensure that when you have large IO, it is properly aligned, so if your max transfer size config is 1MB, you're actually getting 1MB IOs cleanly up and down the stack.

I suspect that what ends up happening here is something else down the stack is binding up, and perhaps just making sure it has nicely aligned IO makes it angrier.

That said, if you've got a very reliable reproducer, you could git bisect this repro, build some demo drivers, and see if you can find a smoking gun for your setup.
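In case anyone attempts that, the bisect loop would look roughly like this (a sketch; the known-good ref and the driver build/install steps depend on your environment):

  git bisect start
  git bisect bad HEAD                # a build known to show the resets
  git bisect good <known-good-ref>   # e.g. the last driver release without the issue
  # at each step: build vioscsi/viostor, install it in the test VM, run the reproducer,
  # then mark the commit with "git bisect good" or "git bisect bad"
  git bisect reset                   # when finished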

@JonKohler
Contributor

Also, to be clear, I'm not saying that it couldn't be the problem (perhaps it contributes), but this type of lockup has been going on since before that commit was even in the repo (see #623). It very well may be in the same class of issues.

I'll defer to the guru @vrozenfe though. Happy to help if this ends up being part of the issue.

As a data point, we've been running that commit across our fleet for a long time and haven't seen a single customer support case with this signature (albeit in a completely different topology RE qemu/backend storage config).

@benyamin-codez
Contributor

As a data point, we've been running that commit across our fleet for a long time and haven't seen a single customer support case with this signature (albeit in a completely different topology RE qemu/backend storage config).

I think this is part of the problem with performing root cause analysis for this issue. The lack of fault reproduction in Nutanix and RHEL environments points to other root causes. I didn't see this issue myself running virtio-blk dataplane until I went over 1GiBps throughput. If someone is able to run my reproducer above on a VM hosted on RHEL rather than a Debian-based hypervisor, that would go a long way to determining the scope of the problem. I don't know enough about your "data path plumbing" to know if it would be reproducible in your kit.

That said, if you've got a very reliable reproducer, you could git bisect this repro, build some demo drivers, and see if you can find a smoking gun for your setup.

I'll probably wait for Vadim to have a look before I cast my very fresh eyes over it. 8^d

@vrozenfe
Collaborator

Guys, you are all right. In my understanding the root of the problem is in the maximum size of the IO transfer. Luckily for Nutanix and us (RH), we don't see any problem if the IO transfer exceeds the virtual queue size; others might have a problem with that. I will try to address this issue very soon. I just need to fix some critical stuff related to the virtio-win installer first.

Cheers,
Vadim.

@benyamin-codez
Contributor

Guys, you are all right. In my understanding the root of the problem is in the maximum size of the IO transfer. Luckily for Nutanix and us (RH), we don't see any problem if the IO transfer exceeds the virtual queue size; others might have a problem with that. I will try to address this issue very soon. I just need to fix some critical stuff related to the virtio-win installer first.

Cheers, Vadim.

Thanks Vadim, that would be great.

If I was to hazard a guess, it looks like the train really kicked off with #502 --> #596 --> 2f1a58a

...but don't let me lead you up the garden path...!

@vrozenfe
Collaborator

@benyamin-codez

Right, I believe that the problem is that we create an SG list longer than the virtual queue size.
Technically, we probably need to reduce the size or split a large DMA transfer into a set of smaller chunks.

Best,
Vadim.

@benyamin-codez
Contributor

@vrozenfe

So I've had a bit of a peek around the vioscsi code and thought I'd drop a quick post here as to what I've found in the hope that this is more a hint than a distraction.

The issue starts after 7dc052d, with the rework on SRB. The issue is apparent following fdf56dd.

The nature of the problem, throughput graph observations during fault reproduction, what appear to be resets on the bus and similar behaviour to issue #907 (and others) - when considered together - suggest to me that SRBs are being processed out of order, potentially with stray retries. It seems stray SRBs might be returning quite late and corrupting the queue. Given SRBs and the initial LUN reset logic were included in that commit, perhaps one is adversely affecting the other...?

So focusing on #684, I built each commit and tested for the fault. Then I tore down the code to see what effect the changes had on the fault condition. I started with the max_physical_breaks / NumberOfPhysicalBreaks / MaximumTransferLength artefacts to little (but observable) effect on the fault. I also reintroduced the USE_CPU_TO_VQ_MAP code in 6090638 to little effect, but note I didn't try reintroducing the USE_WORK_ITEM code in 28f16b0. I'm reasonably confident the root cause isn't in any of these elements that I checked.

It seems to me that the fault is coincident with the changes in request completion handling and the removal (or maybe a relocation?) of the spin lock in the SRB implementation. So the main two commits would be f1338bb and fdf56dd, but considered together.

The problem here is the SRB implementation has evolved quite a bit in the three years since then... With thanks - it looks great..!

I do suspect it might be something in the VioScsiVQLock and VioScsiVQUnlock implementation.

However, I also didn't notice any functions reminiscent of HwScsiDmaStarted or HwScsiStartIo, so I wasn't sure if SRB is adequately implemented. Is it possible there's a mash up of PIO and SRB modes...? As there is no guarantee of sync between DMA transfers and the SRB data buffer while the SRB is being processed, the data buffer shouldn't be touched until the DMA transfer has completed and the channel has been flushed. I didn't really see code for these mechanisms but it's become a bit of a version mish-mash for me looking up and down the tree... 8^d

I was sort of curious where the newer SRB scaffold appeared from, and whether there was a missing functional dependency on VirtIO or something else in the larger tree.

Anyway, I hope the above is of some help. I might have another look in a day or so.

Regards,
Ben

@benyamin-codez
Contributor

Also forgot to mention I checked changes to MaxNumberOfIO, with no effect.

@frwbr

frwbr commented Aug 14, 2024

Hi Vadim,

Thanks for looking into this again.

Right, I believe that the problem is that we create an SG list longer than the virtual queue size. Technically, we probably need to reduce the size or split a large DMA transfer into a set of smaller chunks.

Your remark about the queue size prompted me to look into virtio-scsi multiqueue. My current understanding of virtio is not very deep, but AFAICT current QEMU allocates 1 virtqueue per vCPU for virtio-scsi (plus 2 more virtqueues) [0] [1].

I've tried setting num_queues=1 to force QEMU to only allocate 1+2 virtqueues. On the running VM, I checked the number of virtqueues (num_vqs) via the QEMU monitor with a command like:

info virtio
info virtio-status /machine/peripheral/virtioscsi0/virtio-backend

e.g.:

/machine/peripheral/virtioscsi0/virtio-backend:
[...]
  num_vqs:                 3
  queue_sel:               2
  isr:                     1
  endianness:              little
[...]

With this setting, the reproducer from [2] has not produced a vioscsi Reset to device, \Device\Raidport1, was issued warning so far (see below for QEMU command line).

This could also be a coincidence, but on the other hand -- could there be some unfortunate interaction in presence of virtio-scsi multiqueue that may trigger the issue?

[0] https://github.com/qemu/qemu/blob/a733f37aef3b7d1d33bfe2716af88cdfd67ba64e/hw/virtio/virtio-pci.c#L2526
[1] https://github.com/qemu/qemu/blob/a733f37aef3b7d1d33bfe2716af88cdfd67ba64e/hw/scsi/virtio-scsi.c#L1232
[2] #756 (comment)


Host: Proxmox VE kernel 6.8.12-1-pve based on Ubuntu kernel 6.8
Guest: Windows 10 with virtio-win 0.1.262
Upstream QEMU 9.0.1 command line (the boot disk is the SATA disk, I run the reproducer on the VirtIO SCSI disk) -- mind the num_queues=1:

[...]/qemu-stable-9.0.1/qemu-system-x86_64 \
  -accel kvm \
  -name 'win10-vioscsi-versions,debug-threads=on' \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/151.qmp,server=on,wait=off' \
  -mon 'chardev=qmp,mode=control' \
  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
  -mon 'chardev=qmp-event,mode=control' \
  -pidfile /var/run/qemu-server/151.pid \
  -smbios 'type=1,uuid=32b5c31e-ec75-4d62-9d0b-a756f876e943' \
  -smp '4,sockets=1,cores=4,maxcpus=4' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vnc 'unix:/var/run/qemu-server/151.vnc,password=on' \
  -cpu 'qemu64,+aes,enforce,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+pni,+popcnt,+sse4.1,+sse4.2,+ssse3' \
  -m 8192 \
  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
  -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \
  -device 'vmgenid,guid=d033521b-0401-46b6-975b-f8b4c6d1a4a6' \
  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
  -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
  -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,num_queues=1' \
  -drive 'file=/dev/pve/vm-151-disk-1,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
  -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0' \
  -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
  -drive 'file=/dev/pve/vm-151-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
  -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap151i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' \
  -device 'e1000,mac=BA:FF:FF:84:10:E7,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=102' \
  -rtc 'driftfix=slew,base=localtime' \
  -machine 'hpet=off,type=pc-i440fx-9.0' \
  -global 'kvm-pit.lost_tick_policy=discard'

@benyamin-codez
Contributor

I've confirmed @frwbr's results per https://forum.proxmox.com/threads/redhat-virtio-developers-would-like-to-coordinate-with-proxmox-devs-re-vioscsi-reset-to-device-system-unresponsive.139160/post-693772

It does prevent bus resets and both kvm errors, but at a cost to performance.

I'm thinking I might need to revisit my checking of the legacy USE_CPU_TO_VQ_MAP...

@benyamin-codez
Contributor

@vrozenfe,

I found it in /vioscsi/helper.c

What I have so far. RFC

--- ./a/vioscsi/helper.c        2024-08-06 05:42:24.000000000 +1000
+++ ./b/vioscsi/helper.c        2024-08-15 08:41:32.366850208 +1000
@@ -51,6 +51,7 @@
     PVOID               va = NULL;
     ULONGLONG           pa = 0;
     ULONG               QueueNumber = VIRTIO_SCSI_REQUEST_QUEUE_0;
+    BOOLEAN             notify = FALSE;
     STOR_LOCK_HANDLE    LockHandle = { 0 };
     ULONG               status = STOR_STATUS_SUCCESS;
     UCHAR               ScsiStatus = SCSISTAT_GOOD;
@@ -108,18 +109,14 @@
         srbExt->psgl,
         srbExt->out, srbExt->in,
         &srbExt->cmd, va, pa);
+    element = &adaptExt->processing_srbs[index];
 
     if (res >= 0) {
-        element = &adaptExt->processing_srbs[index];
+        notify  = virtqueue_kick_prepare(adaptExt->vq[QueueNumber]) ? TRUE : notify;
         InsertTailList(&element->srb_list, &srbExt->list_entry);
         element->srb_cnt++;
-    }
-    VioScsiVQUnlock(DeviceExtension, MessageID, &LockHandle, FALSE);
-    if ( res >= 0){
-        if (virtqueue_kick_prepare(adaptExt->vq[QueueNumber])) {
-            virtqueue_notify(adaptExt->vq[QueueNumber]);
-        }
     } else {
+        RemoveTailList(&element->srb_list);
         virtqueue_notify(adaptExt->vq[QueueNumber]);
         ScsiStatus = SCSISTAT_QUEUE_FULL;
         SRB_SET_SRB_STATUS(Srb, SRB_STATUS_BUSY);
@@ -127,6 +124,17 @@
         StorPortBusy(DeviceExtension, 10);
         CompleteRequest(DeviceExtension, Srb);
         RhelDbgPrint(TRACE_LEVEL_FATAL, " Could not put an SRB into a VQ, so complete it with SRB_STATUS_BUSY. QueueNumber = %d, SRB = 0x%p, Lun = %d, TimeOut = %d.\n", QueueNumber, srbExt->Srb, SRB_LUN(Srb), Srb->TimeOutValue);
+        notify = TRUE;
+    }
+    
+    VioScsiVQUnlock(DeviceExtension, MessageID, &LockHandle, FALSE);
+    
+    if (notify){
+        virtqueue_notify(adaptExt->vq[QueueNumber]);
+        /*if (virtqueue_kick_prepare(adaptExt->vq[QueueNumber])) {
+            virtqueue_notify(adaptExt->vq[QueueNumber]);
+        }
+        */
     }
 
 EXIT_FN_SRB();

@benyamin-codez
Contributor

benyamin-codez commented Aug 15, 2024

@vrozenfe, I refactored the above to the root cause.

I should note that performance is odd. I get performance of 3.5GB/s, then 6.4GB/s alternating. Not consistently, but what appears to be according to a timer. Alternating slow and fast periods. This survives reboot. Is that normal?

--- ./a/vioscsi/helper.c        2024-08-06 05:42:24.000000000 +1000
+++ ./b/vioscsi/helper.c        2024-08-15 10:26:16.389005304 +1000
@@ -110,15 +110,14 @@
         &srbExt->cmd, va, pa);
 
     if (res >= 0) {
+        virtqueue_kick_prepare(adaptExt->vq[QueueNumber]);
         element = &adaptExt->processing_srbs[index];
         InsertTailList(&element->srb_list, &srbExt->list_entry);
         element->srb_cnt++;
     }
     VioScsiVQUnlock(DeviceExtension, MessageID, &LockHandle, FALSE);
     if ( res >= 0){
-        if (virtqueue_kick_prepare(adaptExt->vq[QueueNumber])) {
-            virtqueue_notify(adaptExt->vq[QueueNumber]);
-        }
+        virtqueue_notify(adaptExt->vq[QueueNumber]);
     } else {
         virtqueue_notify(adaptExt->vq[QueueNumber]);
         ScsiStatus = SCSISTAT_QUEUE_FULL;

@benyamin-codez
Contributor

I should note that performance is odd. I get performance of 3.5GB/s, then 6.4GB/s alternating. Not consistently, but what appears to be according to a timer. Alternating slow and fast periods. This survives reboot. Is that normal?

It looks like this is multiplexing of additional adapters. With one adapter, I get 6.4GB/s consistently. Is that by design?

@benyamin-codez
Contributor

benyamin-codez commented Aug 15, 2024

@vrozenfe, I refactored again to eliminate what I think are unnecessary VQ unlocks.

Performance is improved again, with 6.4GB/s on the boot+system disk and 9.2-9.5GB/s on additional disks. The adapter multiplexing appears only to affect disks on additional adapters. I have not tried booting on SCSI ID > 0.

--- ./a/vioscsi/helper.c        2024-08-06 05:42:24.000000000 +1000
+++ ./b/vioscsi/helper.c        2024-08-15 22:59:13.605562341 +1000
@@ -56,6 +56,7 @@
     UCHAR               ScsiStatus = SCSISTAT_GOOD;
     ULONG MessageID;
     int res = 0;
+    BOOLEAN             notify = FALSE;
     PREQUEST_LIST       element;
     ULONG               index;
 ENTER_FN_SRB();
@@ -110,17 +111,11 @@
         &srbExt->cmd, va, pa);
 
     if (res >= 0) {
+        notify = virtqueue_kick_prepare(adaptExt->vq[QueueNumber]);
         element = &adaptExt->processing_srbs[index];
         InsertTailList(&element->srb_list, &srbExt->list_entry);
         element->srb_cnt++;
-    }
-    VioScsiVQUnlock(DeviceExtension, MessageID, &LockHandle, FALSE);
-    if ( res >= 0){
-        if (virtqueue_kick_prepare(adaptExt->vq[QueueNumber])) {
-            virtqueue_notify(adaptExt->vq[QueueNumber]);
-        }
     } else {
-        virtqueue_notify(adaptExt->vq[QueueNumber]);
         ScsiStatus = SCSISTAT_QUEUE_FULL;
         SRB_SET_SRB_STATUS(Srb, SRB_STATUS_BUSY);
         SRB_SET_SCSI_STATUS(Srb, ScsiStatus);
@@ -128,7 +123,10 @@
         CompleteRequest(DeviceExtension, Srb);
         RhelDbgPrint(TRACE_LEVEL_FATAL, " Could not put an SRB into a VQ, so complete it with SRB_STATUS_BUSY. QueueNumber = %d, SRB = 0x%p, Lun = %d, TimeOut = %d.\n", QueueNumber, srbExt->Srb, SRB_LUN(Srb), Srb->TimeOutValue);
     }
-
+    VioScsiVQUnlock(DeviceExtension, MessageID, &LockHandle, FALSE);
+    if (notify){
+        virtqueue_notify(adaptExt->vq[QueueNumber]);
+    }
 EXIT_FN_SRB();
 }

I have backported this into viostor too. I will drop that at issue #907.

EDIT: Removed unnecessary (hopefully) notify in (res < 0)...

@vrozenfe
Collaborator

@frwbr
num_queues = 1 is not the way to fix the problem. We support this case mostly for testing and troubleshooting purposes. The entire idea behind multiqueue is to let the entire storage IO sequence
(UM thread-initiated file IO request -> vioscsi BuildIO/StartIO -> IRQ -> DPC -> resumption of the UM thread that issued the IO request) be executed on the same (v)CPU. In this case we can eliminate the IPIs and data copies which would be required to complete the request and resume the waiting UM thread if the request initiator and the DPC were executed on different CPUs.

In my opinion the simplest ways to work around the problem at the moment would be (an illustrative command-line fragment follows after the list):

  • try to disable indirect mode (something like "-device virtio-scsi-pci,id=scsi-vioscsi1,indirect_desc=off");
  • or reduce the max_sectors size (-device virtio-scsi-pci,id=scsi-vioscsi1,max_sectors=32 (or 64));
  • or limit the maximum transfer size by defining the "PhysicalBreaks" parameter in the Registry;
  • or try to enlarge the virtual queue size by adding a queue_size parameter to the QEMU command line.
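As a rough illustration only (the device id, bus address and other properties must match your actual VM, and whether these can be passed through the Proxmox configuration is a separate question), the QEMU-side options combined might look like:

  -device virtio-scsi-pci,id=scsi-vioscsi1,indirect_desc=off,max_sectors=64

The exact property name for enlarging the virtqueue varies between QEMU versions, so list what your build supports first:

  qemu-system-x86_64 -device virtio-scsi-pci,help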

Best,
Vadim.

@vrozenfe
Collaborator

@benyamin-codez

Thanks a lot.
If it improves the performance and works for a multi-LUN setup, then please submit your work as a commit.

Best,
Vadim.

@JonKohler
Contributor

Agreed, @benyamin-codez - thanks for diving into the trenches on this one. Let's raise it as a pull request when you finish up your batch of testing and we can comment in there. Please tag me and @sb-ntnx in addition to Vadim; I've got a few thoughts but don't want them to get buried in this very long discussion thread 😃

@frwbr

frwbr commented Aug 16, 2024

Hi all,

Thanks @benyamin-codez for jumping into the code! I don't know much about SCSI or Windows drivers so can't comment on the changes themselves. Happy to test a PR with the finished changes though!

If I understand the situation correctly, there does seem to be some bug that can be triggered by excessive IO, e.g. by the 8k+16k reads reproducer. While a large portion of the users affected by this issue are Proxmox VE users, it is not limited to Proxmox VE users, e.g. @GreyWolfBW is using Debian+libvirt. Still, there may be a factor that makes the issue more likely to trigger on Proxmox VE than e.g. on RH (if I understand correctly, Nutanix uses vhost-user-scsi, so that's quite a different setup). As I've done my tests with upstream QEMU stable, I'd rule out our downstream QEMU patches as a factor. Our kernels may be a factor -- next week I can see if I manage to reproduce the issue on a mainline kernel. Also, there may be some peculiarity hiding in my QEMU command line (I've posted them e.g. here) -- I don't see anything suspicious there but I'm obviously biased.

@frwbr num_queues = 1 is not the way to fix the problem. We support this case mostly for testing and troubleshooting purposes. The entire idea behind multiqueue is to let the entire storage IO sequence (UM thread-initiated file IO request -> vioscsi BuildIO/StartIO -> IRQ -> DPC -> resumption of the UM thread that issued the IO request) be executed on the same (v)CPU. In this case we can eliminate the IPIs and data copies which would be required to complete the request and resume the waiting UM thread if the request initiator and the DPC were executed on different CPUs.

Yes, I wouldn't consider num_queues=1 a fix either, and not even a good workaround. :) I mentioned it mainly in the hope it helps in finding the root cause of the issue.

In a similar vein, I built kvm-guest-drivers-windows today at c87ea56 and enabled debug prints (with this patch). I ran the reproducer (4 cores and no restriction on num_queues, QEMU command line below) again on upstream QEMU 9.0.1, and read out the debug prints.

In during-reproducer.txt, the reset is at 86s:

03811537	86.12216187	 <--> SRB_FUNCTION_RESET_LOGICAL_UNIT Target (0::0::0), SRB 0xFFFFD403766D9300

Interestingly, there are no vioscsi debug outputs at all for the ~65 seconds before that, the last one is:

03608265	22.71421814	HandleResponse <--> Operation SCSIOP_READ (0x28), Target (0::0::0), SRB 0xFFFFD403762285D0
03608266	22.71422005	 --> SendSRB Srb = 0xFFFFD4037693DC80.

In my opinion the simplest solutions to work around the problem at the moment would be:

Thanks for the suggestions!

* try to disable indirect mode (something like "-device virtio-scsi-pci,id=scsi-vioscsi1,indirect_desc=off";

That one I already tried in #756 (comment), it seems to get rid of the Desc next is 3, but the reproducer still produces the vioscsi warnings and lockups.

* or reduce the max_sectors size (-device virtio-scsi-pci,id=scsi-vioscsi1,max_sectors=32(64))

* or limit the maximum transfer by defining "PhysicalBreaks" parameter in the Registry,

I think I've looked into those already a couple of weeks ago and didn't see changed behavior when running the reproducer, but I'll try to revisit them.

* or you can also try enlarge the virtual queues size by adding queue_size parameter to the qemu command line.

I'll see if I can try that next week. (EDIT: Didn't have the time to test this, as I was looking into #1135 instead)

Best,

Friedrich


/[...]/qemu-stable-9.0.1/qemu-system-x86_64 \
  -accel kvm \
  -name 'win10-vioscsi-versions,debug-threads=on' \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/151.qmp,server=on,wait=off' \
  -mon 'chardev=qmp,mode=control' \
  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
  -mon 'chardev=qmp-event,mode=control' \
  -pidfile /var/run/qemu-server/151.pid \
  -smbios 'type=1,uuid=32b5c31e-ec75-4d62-9d0b-a756f876e943' \
  -smp '4,sockets=1,cores=4,maxcpus=4' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vnc 'unix:/var/run/qemu-server/151.vnc,password=on' \
  -cpu 'qemu64,+aes,enforce,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+pni,+popcnt,+sse4.1,+sse4.2,+ssse3' \
  -m 8192 \
  -object 'iothread,id=iothread-virtioscsi0' \
  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
  -device 'pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5' \
  -device 'vmgenid,guid=dd75a80d-45d3-4ec6-9f87-572bacbf1a7a' \
  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
  -chardev 'socket,id=serial0,path=/var/run/qemu-server/151.serial0,server=on,wait=off' \
  -device 'isa-serial,chardev=serial0' \
  -device 'VGA,id=vga,bus=pci.0,addr=0x2,edid=off' \
  -drive 'file=/var/lib/vz/template/iso/virtio-drivers-debug.iso,if=none,id=drive-ide2,media=cdrom,aio=io_uring' \
  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101' \
  -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0' \
  -drive 'file=/dev/pve/vm-151-disk-1,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
  -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0' \
  -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
  -drive 'file=/dev/pve/vm-151-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
  -device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap151i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' \
  -device 'e1000,mac=BA:FF:FF:84:10:E7,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=102' \
  -rtc 'driftfix=slew,base=localtime' \
  -machine 'hpet=off,type=pc-i440fx-9.0' \
  -global 'kvm-pit.lost_tick_policy=discard'

@benyamin-codez
Contributor

Agreed @benyamin-codez thanks for diving into the trenches on this one, let's raise it as a pull request when you finish up your batch of testing and we can comment in there. Please tag me and @sb-ntnx in addition to Vadim, I've got a few thoughts but don't want them to get buried in this very long discussion thread 😃

All good.

Mostly done with IO and SCSI failure testing.

Proxmox doesn't seem to work well with HMP based blkdebug options, so it has been a bit tedious.

Just working out some optimisation vs WPP tracing issues. It must have been "fun" when @vrozenfe implemented it.

The PR should be up in the next day.

@benyamin-codez
Contributor

Also, to be clear, not saying that it couldn't be the problem, perhaps it contributes, but this type of lockup has been going on before that commit was even in the repo (see #623). Very well may be in the same class of issues.

I suspect that what ends up happening here is something else down the stack is binding up, and perhaps just making sure it has nicely aligned IO makes it angrier.

@JonKohler

This is definitely the case. I'm having a look at it now...

I dropped an update here: https://forum.proxmox.com/threads/redhat-virtio-developers-would-like-to-coordinate-with-proxmox-devs-re-vioscsi-reset-to-device-system-unresponsive.139160/post-695547

@JonKohler
Contributor

The case about what? I flipped through the last few posts on the forums; can you elaborate?

@benyamin-codez
Contributor

The case about what? I flipped a thru the last few posts on the forums, can you elaborate?

There is an underlying problem that existed before the v208 release. I would say well before. As you put it:

...something else down the stack is binding up...
...perhaps it contributes, but this type of lockup has been going on before that commit was even in the repo...

Per my comments in the Proxmox post and in the draft PR, I'm focusing on that issue at the moment.

@benyamin-codez
Contributor

Also Vadim's comments:

In my understanding the root of the problem is in the maximum size of the IO transfer. Luckily for Nutanix and us (RH), we don't see any problem if the IO transfer exceeds the virtual queue size; others might have a problem with that.

and

Right, I believe that the problem is that we create an SG list longer than the virtual queue size.
Technically, we probably need to reduce the size or split a large DMA transfer into a set of smaller chunks.

...and that may very well be the case, but I'm also reviewing the NOTIFY implementation in [virtio].
I'm making some progress but do have some competing priorities the next few days.

@benyamin-codez
Contributor

benyamin-codez commented Sep 19, 2024

PR #1150 has been raised to resolve this issue in [vioscsi].

@benyamin-codez
Contributor

PR #1174 has been raised to resolve this issue in [viostor].
