insmod worked but dmesg shows (nvidia-fs:write IO failed :-512) #26

Closed
singhsaluja opened this issue Sep 10, 2023 · 1 comment


@singhsaluja

After a lot of struggle, I was able to build gds-nvidia-fs-2.17.0 on RHEL 9.2 (kernel 5.14.0-284.11.1.el9_2.x86_64) with NVIDIA driver 525.89.02. The make worked fine, and insmod nvidia-fs.ko didn't throw any errors.

[192745.286125] nvidia_fs: Initializing nvfs driver module
[192745.286136] nvidia_fs: registered correctly with major number 510

But when I write a file with the gdsio utility to VAST storage (with the rpcrdma driver installed), the throughput is not what I expected, and dmesg shows:

[Sat Sep  9 20:24:26 2023] nvidia-fs:write IO failed :-512
[Sat Sep  9 20:24:26 2023] nvidia-fs:write IO failed :-512
[Sat Sep  9 20:24:26 2023] nvidia-fs:write IO failed :-512
[Sat Sep  9 20:24:26 2023] nvidia-fs:write IO failed :-512
[Sat Sep  9 20:24:58 2023] nvidia-fs:write IO failed :-512
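
For anyone reading these lines: 512 matches ERESTARTSYS in the kernel's include/linux/errno.h, a kernel-internal code that normally means a wait was interrupted by a signal. Whether that is what happened here is not established. Below is a minimal Python sketch that counts and names such failures in dmesg output. It assumes nvidia-fs logs the negated kernel error code in the format shown above; the function name `summarize_failures` and the lookup table are illustrative and not part of any NVIDIA tool.

```python
import errno
import re
from collections import Counter

# Kernel-internal codes from include/linux/errno.h.
# They never reach userspace, so Python's errno module does not list them.
KERNEL_INTERNAL_ERRNO = {
    512: "ERESTARTSYS",
    513: "ERESTARTNOINTR",
    514: "ERESTARTNOHAND",
    515: "ENOIOCTLCMD",
    516: "ERESTART_RESTARTBLOCK",
}

# Assumed log format, taken from the dmesg lines above.
LINE_RE = re.compile(r"nvidia-fs:(\w+) IO failed :(-?\d+)")


def summarize_failures(dmesg_text):
    """Count nvidia-fs IO failures per (operation, code, symbolic name)."""
    counts = Counter()
    for match in LINE_RE.finditer(dmesg_text):
        op = match.group(1)
        code = int(match.group(2))
        number = abs(code)
        # Prefer the kernel-internal table, then fall back to userspace errno names.
        name = KERNEL_INTERNAL_ERRNO.get(number) or errno.errorcode.get(number, "unknown")
        counts[(op, code, name)] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[Sat Sep  9 20:24:26 2023] nvidia-fs:write IO failed :-512\n"
        "[Sat Sep  9 20:24:58 2023] nvidia-fs:write IO failed :-512\n"
    )
    for (op, code, name), n in summarize_failures(sample).items():
        print(f"{op}: {code} ({name}) x{n}")
```

Piping real output in (for example, reading `dmesg` from stdin and passing it to `summarize_failures`) would show whether every failure carries the same code or whether other errors are mixed in.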

FWIW, the gdscheck.py utility reports that NFS is supported:

./gdscheck.py -p
 GDS release version: 1.7.2.10
 nvidia_fs version:  2.17 libcufile version: 2.12
 Platform: x86_64

NFS                : Supported

Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Enabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
 --rdma_device_status  : Up: 0 Down: 0
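
As a rough aid for comparing setups, the report above can be turned into key/value pairs and scanned for lines that look worth a closer look. This is only a sketch: it assumes the `key : value` layout shown in this post, the helper names `parse_gdscheck` and `flag_rdma_fields` are made up for illustration, and whether the flagged fields relate to the -512 failures is not established.

```python
def parse_gdscheck(text):
    """Parse 'key : value' lines from gdscheck.py -p output into a dict.

    Splits on the first colon only, so a line that packs two fields
    (such as the nvidia_fs/libcufile version line) keeps the remainder
    as part of the value.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not value.strip():
            continue  # not a 'key : value' line
        # Drop indentation and the leading '--' used for sub-items.
        fields[key.strip().lstrip("-").strip()] = value.strip()
    return fields


def flag_rdma_fields(fields):
    """Return RDMA-related fields whose value suggests they are not set up."""
    suspicious = ("Unsupported", "Not Loaded", "Not configured")
    return {
        key: value
        for key, value in fields.items()
        if "rdma" in key.lower() and value.startswith(suspicious)
    }


if __name__ == "__main__":
    report = """\
NFS                : Supported
Userspace RDMA     : Unsupported
 --Mellanox PeerDirect : Enabled
 --rdma library        : Not Loaded (libcufile_rdma.so)
 --rdma devices        : Not configured
"""
    for key, value in flag_rdma_fields(parse_gdscheck(report)).items():
        print(f"{key}: {value}")
```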

I am unsure how to debug this. Any leads would be really appreciated. Thank you!

@wakaba-best

@singhsaluja Would this be helpful to you? Note that the setup there differs from yours: it is on Ubuntu with local NVMe.
#4 (comment)
