Writeback cache high memory usage halting OS #71

Closed
Augusto7743 opened this issue Dec 2, 2021 · 5 comments

@Augusto7743

Augusto7743 commented Dec 2, 2021

I tested several times in VirtualBox with Lubuntu 20.04.3, creating 4 writeback caches, and sometimes the OS hung to the point that I had to force a VM shutdown.
So I needed to test on a "real" computer.
I installed Lubuntu 20.04.3 on an empty 80 GB disk, in a machine with 4 GB RAM.
Disk partitions
sda1 1 MB unformatted boot grub
sda2 400 ext2 boot
sda3 btrfs root
sda4 btrfs opt
sda5 btrfs home
sda6 btrfs data

3 writeback caches
rd1 sdb3 128
rd2 sdb4 32
rd4 sdb5 256

During the tests no applications were loaded, and the OS used less than 1 GB of memory.
I created 3 writeback caches for 3 partitions, and when installing the initramfs scripts I pointed the default root wa (write-around) cache at an unformatted partition, to avoid problems from using wb and wa on the same root partition (in the VM tests, root file system damage was "fixed" by avoiding wa and wb on the same root).

Lubuntu has a "system statistics" widget configured to report OS memory usage.
When the OS has just started:
OS and applications 10 %
Buffers 0 %
Cache 14 %
Swap %

After a test that only installs software packages, memory usage is much higher, even with no application loaded:
OS and applications 14 %
Buffers 0 %
Cache 65 %
Swap usage 0

If the dmsetup flush is run, a few times the OS does not boot on the next restart: the startup process stops before the GUI login loads, and the OS cannot be started at all.
So I tried another command, "dmsetup suspend", on each cache, and within a few seconds the OS hung completely: only the mouse pointer still moves. The start menu does not open and no application loads. Even pressing the computer's power button does not shut down the OS. A few times this also ended with the same failed boot as above.

It is as if the OS hangs because the mapper device is still active.
The writecache is not moved to the swap file, so swap stays at 0 %.

I ran a test with the cache at 65 %, loading an application that uses more than 2 GB of memory, and the OS did not hang.
I repeated the test with the OS plus another application at 75 % total usage; the cache went from 65 % to 15 % without hanging the OS.
After closing the application, the cache started climbing to high memory usage again.

I also configured the vm settings below, without success:
vm.dirty_background_bytes=1024
vm.dirty_bytes=1024
vm.dirty_expire_centisecs=100
vm.dirtytime_expire_seconds=1
vm.dirty_writeback_centisecs=100
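
One possible reason a value such as 1024 has no visible effect: the kernel's sysctl documentation states that the minimum accepted value for vm.dirty_bytes is two pages, and that smaller values are ignored and the old configuration is kept. The sketch below checks a candidate value against that limit. The function names are illustrative and not part of rapiddisk or any existing tool, and the page size is taken from `getconf PAGESIZE` unless one is passed in.

```shell
#!/bin/sh
# Sketch: check a candidate vm.dirty_bytes value against the documented
# minimum of two pages. Function names are illustrative only.

# Prints the smallest vm.dirty_bytes value the documentation says is accepted.
# $1: page size in bytes (defaults to this machine's page size).
min_dirty_bytes() {
    page_size="${1:-$(getconf PAGESIZE)}"
    echo $((page_size * 2))
}

# Prints "ok" if the value is at least two pages, otherwise "too small".
# $1: candidate value in bytes, $2: optional page size in bytes.
check_dirty_bytes() {
    if [ "$1" -ge "$(min_dirty_bytes "${2:-}")" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}
```

For example, `check_dirty_bytes 1024` can be run before writing the value, and `sysctl vm.dirty_bytes` afterwards shows whether the kernel actually kept it.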

Now testing with the OS installed on ext4.

@pkoutoupis
Since rapiddisk is a wrapper around dm-writecache, it must run some command to set it up.
Could you please share the command used when creating the writecache caches?
It is only for information, to understand other details of the memory management.

I am only reporting all of the above for you and for other users.
Have a nice day, and thanks for creating the amazing rapiddisk utility.

@Augusto7743
Author

I tested another VM using only ext4 and it has the same issue.
On the good side, this is not a btrfs issue =)
Unfortunately, just running dmsetup message /dev/mapper/rrc-wb_sdaX 0 flush breaks OS startup, so perhaps the writecache device mapper needs to be stopped completely before an OS shutdown or restart?
I searched the internet several times and found no site about dmsetup hanging the OS, and I did not find any safe command in the dmsetup docs; that is where I saw dmsetup suspend and decided to try it.
If you have any information or command that might help with my VM tests, please share it, for your project and for other users.
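
To see whether a flush has actually drained a cache, the writecache status line can be inspected. The sketch below assumes the status layout described in the kernel's dm-writecache documentation (error indicator, number of blocks, number of free blocks, number of blocks under writeback, with newer kernels appending further counters). The function name is hypothetical, and "used" here simply means blocks that are not free.

```shell
#!/bin/sh
# Sketch: summarize one `dmsetup status <device>` line for a writecache
# target. Assumed field order: start, length, "writecache", error indicator,
# total blocks, free blocks, blocks under writeback.

# Reads the status line on stdin and prints
# "error=<n> used=<n> writeback=<n>"; prints nothing for other targets.
writecache_summary() {
    awk '$3 == "writecache" {
        printf "error=%d used=%d writeback=%d\n", $4, $5 - $6, $7
    }'
}
```

A possible use is `dmsetup status "$dev" | writecache_summary`, run as root before and after the flush message, where `$dev` holds the mapping name.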

@Augusto7743
Author

To avoid the problems above, simply do not run "dmsetup suspend", and create flush scripts that run at shutdown.

@pkoutoupis
Owner

Again, thank you for doing this research. These are very good details to know. I invite you to create a branch with a shutdown script, but it will need to be intelligent enough to find all writeback cache devices (dm_writecache).
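
A minimal sketch of the discovery step, assuming the usual `dmsetup table` output format of "name: start length target ...". It is not rapiddisk's own implementation, and the function names and the DMSETUP override variable are made up for illustration. The "flush" message itself is the one listed in the kernel's dm-writecache documentation.

```shell
#!/bin/sh
# Sketch: find every device-mapper mapping that uses the writecache target
# and send it the "flush" message.

# Reads `dmsetup table` output on stdin and prints the name of every mapping
# whose target type is writecache.
list_writecache_devices() {
    awk '$4 == "writecache" { sub(/:$/, "", $1); print $1 }'
}

# Sends the writecache "flush" message to each device name read from stdin.
# DMSETUP may be overridden (for example DMSETUP=echo) for a dry run.
flush_devices() {
    while IFS= read -r dev; do
        "${DMSETUP:-dmsetup}" message "$dev" 0 flush ||
            echo "flush failed: $dev" >&2
    done
}

# Intended use in the flush script (needs root):
#   dmsetup table | list_writecache_devices | flush_devices
```

Matching on the target type instead of on a name prefix means the script does not depend on how the mappings were named when they were created.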

@Augusto7743
Author

@pkoutoupis
All right ?

During OS startup this message is displayed:
"ERROR unsupported sector size 4096 on /dev/dm-1."
The OS is installed on a BTRFS partition.
The BTRFS default node (metadata block) size is 16 KB.
I do not know whether the message can be ignored or whether something is being done wrong.

It is worth enabling rapiddisk.service, because then the status can be queried from the command line.

Another detail: while writes go through the writeback cache, the OS cache usage is very high, although that is not exactly the problem. It is as if the OS caches every disk write in RAM, or as if the OS writeback cache runs in addition to the rapiddisk cache.
So I need to figure out how to configure the OS to flush writes to "disk" sooner and, if possible, disable the OS writeback cache or make it use less memory, because if rapiddisk already does the writeback, that memory is wasted.
I read the official Linux documentation. It has details about "dirty memory", but it does not say whether that means file data waiting to be written to disk or data in the page (file) cache.
Many sites look like "copy paste" of the same information: they do not explain the "dirty memory" settings, they only say these are related to the file cache, and when I apply the configuration they list I see no change in OS memory allocation or anything else.
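
On the "dirty memory" question, /proc/meminfo reports the two things separately: Cached is the page cache as a whole, while Dirty and Writeback count only data still waiting to be written or currently being written. The vm.dirty_* settings act on the dirty part, so a large Cached value made of clean pages would not shrink when they are changed. Clean cache is normally given back when applications need memory, which would be consistent with the drop from 65 % to 15 % seen earlier in this thread. The sketch below prints the three values; the function name is illustrative.

```shell
#!/bin/sh
# Sketch: separate total page cache from dirty data in /proc/meminfo.

# Reads /proc/meminfo-style text on stdin and prints the Cached, Dirty and
# Writeback values in kB, one "name=value" pair per line.
page_cache_breakdown() {
    awk '$1 == "Cached:" || $1 == "Dirty:" || $1 == "Writeback:" {
        sub(/:$/, "", $1)
        print $1 "=" $2
    }'
}

# Usage: page_cache_breakdown < /proc/meminfo
```

If Dirty stays small while Cached is large, the memory shown by the widget is mostly clean cache and not data pending a write.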

I created a writeback flush script that runs at "shutdown", but problems occurred.
Several times the OS root file system was damaged even though the writeback flush ran, requiring an OS reinstall or a restore from backup.
Running the script just before restart or shutdown is not the correct approach, because:

An OS shutdown or restart runs in this sequence:

  • Everything under local file system pre is unmounted (every partition mounted at OS startup other than root).
  • Other tasks are done and root is unmounted.
  • The shutdown or restart command is executed.

With the script configured to run at shutdown or restart, the writeback flush command runs before shutdown.target, when all file systems are already unmounted, and even so the disk LED shows write activity.
Randomly, one of those times the root file system gets damaged, and with BTRFS that is a problem.
BTRFS is a very good file system when it works correctly, much better than ntfs and ext4, but file system damage is a problem.
Some problems cannot be fixed even with btrfs check or other commands.
I have used computers for years and have never seen a file system as hard to repair as BTRFS.
On BTRFS, if the computer is powered off without an OS shutdown the file system is not damaged, but a partial write will randomly damage the file system.
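
For the root file system, which stays in use until the very end, one documented late hook is the directory /usr/lib/systemd/system-shutdown/. According to the systemd-halt.service manual page, systemd-shutdown runs every executable there immediately before the final halt, power-off, reboot or kexec, after it has tried to unmount the remaining file systems or remount them read-only, and passes the action as the first argument. The sketch below is a hypothetical hook built on that behavior and has not been tested with rapiddisk. The function names, the FLUSH_CMD variable and the default script path are assumptions.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /usr/lib/systemd/system-shutdown/rapiddisk-flush
# systemd-shutdown passes the action as $1: halt, poweroff, reboot or kexec.

# Succeeds when $1 is one of the four documented actions.
is_shutdown_action() {
    case "$1" in
        halt|poweroff|reboot|kexec) return 0 ;;
        *) return 1 ;;
    esac
}

# Runs the flush command only for a recognized action.
# FLUSH_CMD defaults to an assumed flush script location.
run_hook() {
    if is_shutdown_action "${1:-}"; then
        ${FLUSH_CMD:-/usr/local/bin/rapiddisk-flush.sh}
    fi
}

# An installed hook would end with:
#   run_hook "$@"
```

Because the hook runs after the unmount and remount attempts, nothing should be writing new data into the cache behind the flush, which is the situation a flush before unmounting cannot guarantee.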

So the correct approach is to configure the script to run before the OS stops local file system pre, if the user has created writeback caches for other partitions and also for root.
I am testing whether the file system stays undamaged when a flush command is run and the OS is shut down afterwards.

I need the exact name of the systemd unit that runs when local file system pre is stopped, to add it to the script. Afterwards I will post all the details here, but I am still trying to find information about it.
I looked at several web sites and found nothing. There is a lot of wrong information spread on the internet.

At the moment I am waiting for replies in a few forums about the correct target name to use.
The service is:


[Unit]
Description=RapidDisk Writeback Cache Flush
DefaultDependencies=no
Before=umount.target

[Service]
Type=oneshot
User=root
Group=root
ExecStart=/usr/local/bin/rapiddisk-flush.sh
TimeoutStartSec=0

[Install]
WantedBy=umount.target


At the moment I am using umount.target.
umount.target runs before shutdown, but after local file system pre has been stopped and root has been unmounted.
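
A commonly used alternative to hooking umount.target relies on systemd stopping units in the reverse of their start order. A service that is started at boot with After=local-fs.target is stopped before local-fs.target at shutdown, and therefore before the local mount units are unmounted, so its ExecStop= command runs while those file systems are still mounted. The sketch below prints such a unit. It is an untested variant of the unit above, not a confirmed fix, and it does not cover the root file system, which is not unmounted by a mount unit. The function name is an assumption.

```shell
#!/bin/sh
# Sketch: print a unit that runs the flush script from ExecStop=, so that it
# executes at shutdown before local file systems are unmounted.

# $1: path to the flush script (defaults to the path used in the unit above).
print_flush_unit() {
    script="${1:-/usr/local/bin/rapiddisk-flush.sh}"
    cat <<EOF
[Unit]
Description=RapidDisk Writeback Cache Flush
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=$script

[Install]
WantedBy=multi-user.target
EOF
}

# Possible use (needs root; the unit file name is hypothetical):
#   print_flush_unit > /etc/systemd/system/rapiddisk-flush.service
#   systemctl daemon-reload
#   systemctl enable --now rapiddisk-flush.service
```

RemainAfterExit=yes keeps the unit "active" after the no-op ExecStart=, which is what makes systemd run ExecStop= during shutdown.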

Have a nice week.

@pkoutoupis
Owner

The 4K error is covered here: #59
