Error inserting cache 1 Error while writing to cache device #1557

Closed
dcyops opened this issue Oct 6, 2024 · 4 comments
Labels
question Further information is requested

Comments


dcyops commented Oct 6, 2024

Question

Motivation

I am having trouble configuring Open CAS. I am following http://open-cas.com/getting_started_open_cas_linux.html
and get stuck at stage 6, which returns the following error:

casadm -S -d /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_1TB_S467NX0M901204R
Error inserting cache 1
Error while writing to cache device

Any help would be greatly appreciated, thank you for providing such a cool project.

Your Environment

  • OpenCAS version (commit hash or tag):
╔═════════════════════════╤═════════════════════╗
║ Name                    │       Version       ║
╠═════════════════════════╪═════════════════════╣
║ CAS Cache Kernel Module │ 24.09.0.0898.master ║
║ CAS CLI Utility         │ 24.09.0.0898.master ║
╚═════════════════════════╧═════════════════════╝

(venv) user@laptop:~/.local/opt/open-cas-linux$ git tag
list
v19.3
v19.6
v19.9
v20.1
v20.12
v20.12.1
v20.12.2
v20.12.3
v20.3
v20.3.1
v20.3.2
v20.3.3
v20.3.4
v21.3
v21.3.1
v21.3.2
v21.3.3
v21.3.4
v21.6
v21.6.1
v21.6.2
v21.6.3
v21.6.4
v21.6.5
v22.3
v22.3.1
v22.3.2
v22.6
v22.6.1
v22.6.2
v22.6.3
(venv) user@laptop:~/.local/opt/open-cas-linux$ git branch
* master
  • Operating System:
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 24.04.1 LTS
Release:	24.04
Codename:	noble
  • Kernel version:
Linux laptop 6.8.0-45-generic #45-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 30 12:02:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Cache device type (NAND/Optane/other) & Core device type (HDD/SSD/other):
    NAND (I believe)
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.0-45-generic] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO 1TB
Serial Number:                      S467NX0M901204R
Firmware Version:                   2B2QEXE7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      4
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization:            15,800,717,312 [15.8 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 5991b00e46
Local Time is:                      Sun Oct  6 18:07:06 2024 BST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x03):         S/H_per_NS Cmd_Eff_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.20W       -        -    0  0  0  0        0       0
 1 +     4.30W       -        -    1  1  1  1        0       0
 2 +     2.10W       -        -    2  2  2  2        0       0
 3 -   0.0400W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        51 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    1%
Data Units Read:                    85,889,551 [43.9 TB]
Data Units Written:                 53,275,136 [27.2 TB]
Host Read Commands:                 750,330,700
Host Write Commands:                548,918,667
Controller Busy Time:               2,986
Power Cycles:                       1,878
Power On Hours:                     5,299
Unsafe Shutdowns:                   269
Media and Data Integrity Errors:    0
Error Information Log Entries:      4,620
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    18
Temperature Sensor 1:               51 Celsius
Temperature Sensor 2:               36 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0       4620     0  0x0008  0x4004      -            0     0     -  Invalid Field in Command

Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
No Self-tests Logged
  • Cache configuration: n/a
    • Cache mode: (default: wt)
    • Cache line size: (default: 4)
    • Promotion policy: (default: always)
    • Cleaning policy: (default: alru)
    • Sequential cutoff policy: (default: full)
  • Other (e.g. lsblk, casadm -P, casadm -L)
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0         7:0    0  38.8M  1 loop /snap/snapd/21759
loop1         7:1    0  74.2M  1 loop /snap/core22/1621
loop2         7:2    0     4K  1 loop /snap/bare/5
loop3         7:3    0  91.7M  1 loop /snap/gtk-common-themes/1535
loop4         7:4    0 271.6M  1 loop /snap/firefox/5014
loop5         7:5    0 505.1M  1 loop /snap/gnome-42-2204/176
nvme0n1     259:0    0   1.9T  0 disk 
├─nvme0n1p1 259:1    0   1.9G  0 part /boot
├─nvme0n1p2 259:2    0     1G  0 part /boot/efi
└─nvme0n1p3 259:3    0   1.9T  0 part /
nvme1n1     259:4    0 931.5G  0 disk 
dcyops added the question label Oct 6, 2024
Contributor

mmichal10 commented Oct 7, 2024

Hello @aGVsbG8sIHRoZXJl

could you please share the output of dmesg from starting cache?

Regarding your kernel: CAS is not supported on preemptive kernels, so before starting a cache instance please make sure that preemption is disabled on your machine. Otherwise you'll encounter issues like the one described in #1482. Your kernel reports PREEMPT_DYNAMIC:
Linux laptop 6.8.0-45-generic #45-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 30 12:02:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
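
A minimal sketch of how the PREEMPT_DYNAMIC marker in the uname line above could be checked from a script. The function name preempt_model_from_uname and the labels it prints are hypothetical, and the mapping assumes the kernel build advertises its preemption model in the version string, as the Ubuntu kernel quoted here does.

```shell
#!/bin/sh
# Hypothetical helper: classify a kernel version string (as printed by
# `uname -v` or `uname -a`) by the preemption marker it contains.
preempt_model_from_uname() {
    case "$1" in
        *PREEMPT_DYNAMIC*) echo "dynamic" ;;        # mode selectable at boot or runtime
        *PREEMPT_RT*)      echo "rt" ;;             # real-time preemption
        *PREEMPT*)         echo "preempt" ;;        # preemptible, fixed at build time
        *)                 echo "not-advertised" ;; # no marker in the version string
    esac
}

# Usage: inspect the running kernel.
echo "running kernel: $(preempt_model_from_uname "$(uname -v)")"
```

On a PREEMPT_DYNAMIC kernel the mode can typically be selected with the preempt=none boot parameter, or at runtime by writing none to /sys/kernel/debug/sched/preempt as root. Which of the two was used in this thread is not stated.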

Author

dcyops commented Oct 7, 2024

Hey @mmichal10,

Thanks for the quick response. I switched preemption to 'none' as seen below.

~/.local/opt/open-cas-linux $ sudo dmesg | grep Preempt
[    0.042649] Dynamic Preempt: none
[    0.109762] rcu: Preemptible hierarchical RCU implementation.
1 ~/.local/opt/open-cas-linux $ sudo cat /sys/kernel/debug/sched/preempt
(none) voluntary full 
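
In the debugfs output above, the active mode is the entry shown in parentheses. A minimal sketch for extracting it in a script follows. The function name active_preempt_mode is hypothetical, and reading the debugfs file assumes root access and a kernel built with PREEMPT_DYNAMIC.

```shell
#!/bin/sh
# Hypothetical helper: read text such as "(none) voluntary full" on stdin
# and print the entry in parentheses, i.e. the active preemption mode.
active_preempt_mode() {
    sed -n 's/.*(\([a-z]*\)).*/\1/p'
}

# Usage (needs root and a mounted debugfs):
#   sudo cat /sys/kernel/debug/sched/preempt | active_preempt_mode
```

A wrapper script could compare the printed value against none before calling casadm -S, so that the cache is not started while preemption is still enabled.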

I have attached the dmesg output as requested, captured after invoking the following:

sudo casadm -S -d /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_1TB_S467NX0M901204R
[ +0.000026] Thread cas_io_1_11 stopped

dmesg_output.txt

Contributor

mmichal10 commented Oct 8, 2024

Hi @aGVsbG8sIHRoZXJl

thanks for attaching the dmesg output. Your problem looks like an instance of #1558, which is fixed by #1561. The fix should be merged soon; we're just waiting for regression test results.

Author

dcyops commented Oct 9, 2024

Yup, that did it. Thanks a ton @mmichal10
