Writeback enabled at OS boot, but data is not written at OS shutdown #69

Closed
Augusto7743 opened this issue Nov 7, 2021 · 10 comments

@Augusto7743

I edited rapiddisk_hook, adding at line 31:
manual_add_modules dm-writecache

In the file rapiddisk_sub.orig, I added at line 5:
modprobe -v dm-writecache

and at line 7:
rapiddisk -a 128
rapiddisk -a 256
rapiddisk -m rd0 -b /dev/sda2 -p wb
rapiddisk -m rd1 -b /dev/sda3 -p wb

After running the install_initrd.sh script, the rapiddisk cache and dm-writecache are loaded at OS startup.

The problem is that data is not written to disk when the OS shuts down.
For testing, I installed Lubuntu (Ubuntu) 20.04.3 in VirtualBox with this layout:
sda1 ext4 boot
sda2 btrfs root
sda3 btrfs home

BTRFS is not configured to compress, and I disabled its internal writeback (which does not really work).

The OS boot log displays:

"
ERROR unsupported sector size 4096 on /dev/dm-1.
ERROR unsupported sector size 4096 on /dev/dm-0.
"

I do not understand whether this is related to the data not being written, or whether the user can simply ignore the message.

Running the command below displays:

systemctl status rapiddiskd.service
Loaded: loaded (/etc/systemd/system/rapiddiskd.service; disabled; vendor preset: enabled)
Active: inactive (dead)

Strangely, even after running systemctl enable rapiddiskd.service, the status message above is still displayed.
For systemctl status to show the service as active, I need to run systemctl start rapiddiskd.service.
I do not understand whether the rapiddisk service is really running, or whether the user should ignore that message.

Running the command below displays:
rapiddisk -l

List of rapiddisk device
rd0 131072
rd1 262144

List of rapiddisk-cache mapping
dm-writecache Target 1: rc-wb_sda2 Cache: rd0 Target sda2 (WRITEBACK)
dm-writecache Target 2: rc-wb_sda3 Cache: rd1 Target sda3 (WRITEBACK)

As a test, I installed one application (Stacer).
Stacer shows that only 4 MB has been written to disk since the OS started.
After an OS shutdown or restart, Stacer is no longer listed as installed.
Files are not being written to disk.
In htop, RAM usage rises when I write data to disk, which means rapiddisk is working.

Is there a command or configuration option for rapiddisk to flush data to disk, or do I need to configure dm-writecache or run a dm-writecache command to flush the data?

Thanks for your reply.
Have a nice day.

@Augusto7743
Author

Another detail.
I only mapped the root device as a test.
I do not recommend writeback on the root device.

@pkoutoupis
Owner

That is because there is no procedure to flush the data to the backing store. I automatically do this in the "unmap" routine but that may not be possible in such a setup. You may need to write and configure a shutdown script such as an rc or systemd service file to run a command like this:

dmsetup message /dev/mapper/<device name> 0 flush
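A shutdown hook along those lines could be sketched as follows. This is only an illustration under assumptions: the function names and the DMSETUP override are hypothetical, and the script relies on `dmsetup ls --target writecache` to find the mappings instead of hard-coding their names.

```shell
#!/bin/sh
# Sketch: flush every dm-writecache mapping, e.g. from a shutdown hook.
# DMSETUP is an illustrative override so the logic can be dry-run with a stub.
DMSETUP="${DMSETUP:-dmsetup}"

# Print the name of each active mapping whose target type is "writecache".
list_writecache_maps() {
    # Each output line starts with the mapping name. With no mappings,
    # dmsetup prints "No devices found", which is filtered out here.
    "$DMSETUP" ls --target writecache | awk '$1 != "No" { print $1 }'
}

# Send the "flush" message to each mapping; return non-zero if any flush fails.
flush_all_writecache() {
    status=0
    for name in $(list_writecache_maps); do
        "$DMSETUP" message "/dev/mapper/$name" 0 flush || status=1
    done
    return "$status"
}
```

Calling `flush_all_writecache` as the last line of the script (run as root) would flush whatever writecache mappings exist at that moment.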

@Augusto7743
Author

Augusto7743 commented Nov 23, 2021

@pkoutoupis

How are you? I hope all is well.
I figured out how to load dm-writecache at OS startup, but not how to flush.
I have been using Linux for one year, so I do not have much knowledge.
Many sites on the internet share information about dm-writecache, but none say where to configure it.
Since dm-writecache is a kernel module, it looks to me as if it needs a kernel command at OS startup, or a dmsetup command in a terminal.

This is a long reply.
I need to report everything below to share the information.

The VM uses 1.28 GB of RAM, with Lubuntu 20.04.3 (kernel 5.11) installed on a 10 GB disk.
Because this is a test, the disk was partitioned so that 4 writecache buffers could be created.
BTRFS was configured with noautodefrag and nospace_cache.

sda1 8 MB bios-grub
sda2 500 MB ext2 /boot
sda3 6.57 GB btrfs /
sda4 1.46 GB btrfs /opt
sda5 1 GB btrfs /home
sda6 500 MB btrfs for Data Files

Lines added to the files below:

rapiddisk_hook

manual_add_modules dm-writecache

rapiddisk_sub.orig

#!/bin/sh

modprobe -v rapiddisk
modprobe -v rapiddisk-cache
modprobe -v dm-writecache

rapiddisk -a RAMDISKSIZE
rapiddisk -a 128
rapiddisk -a 32
rapiddisk -a 256
rapiddisk -a 1
rapiddisk -m rd0 -b BOOTDEVICE -p wa
rapiddisk -m rd1 -b /dev/sda3 -p wb
rapiddisk -m rd2 -b /dev/sda4 -p wb
rapiddisk -m rd3 -b /dev/sda5 -p wb
rapiddisk -m rd4 -b /dev/sda6 -p wb

exit 0


Rapiddisk was correctly installed and configured in the initramfs.
Everything above works correctly.

I created the script below to flush the data. All of its commands work in a terminal.

The script is named K99:

--
#!/bin/bash
sudo dmsetup message /dev/mapper/rc-wb_sda3 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda3
sudo dmsetup message /dev/mapper/rc-wb_sda4 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda4
sudo dmsetup message /dev/mapper/rc-wb_sda5 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda5
sudo dmsetup message /dev/mapper/rc-wb_sda6 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda6

exit 0

--

After creating the flush script, I made it executable and owned by root, and copied it to /etc/rc0.d/ and /etc/rc6.d/, expecting it to flush the cache on restart and shutdown.
On the next shutdown, the OS did not shut down; it hung on the GUI screen and the menus could not be selected.
I forced a "power off" of the VM.

I then restored a backup of the VM image.
First, I copied the script to /etc/rc0.d/ and /etc/rc6.d/.
Then I ran install_initrd.sh.
The shutdown completed without the issue above.
However, when starting the VM, the OS could not start and showed the message below.
This issue happens only sometimes after running install_initrd.sh, even when repeating exactly the same steps.
It is possibly caused by problems in the root filesystem.

[Screenshot: VirtualBox_startup]

Next, I did not copy the flush script to /etc/rc0.d/ and /etc/rc6.d/ and only ran install_initrd.sh.
The screenshot below shows messages about memory not being allocated and a mapping that could not be created.

[Screenshot: VirtualBox start]

After the OS started, running rapiddisk -l showed that all caches had been created at OS startup.

[Screenshot: output of rapiddisk -l]

Can the message from the previous screenshot be ignored?

Running dmsetup ls:

[Screenshot: VirtualBox_dmsetup ls2]

I tested installing some software and restarting the OS; the data was not flushed, which means the writecache is working.
Running systemctl status rapiddiskd.service:

[Screenshot: VirtualBox_systemctl2]

The Rapiddisk repository is not the place to ask how to write scripts, so I searched for how to create a script that runs on restart and shutdown. There is a lot of wrong information out there, with users posting how their scripts "work". Users are trying to help, but with wrong information.

Please, I need answers to the questions below so I can run more tests and report back.
If anyone replies, thank you very much.

1 ) The screenshot shows the message about unsupported sectors

[Screenshot: VirtualBox start]

From
https://manpages.ubuntu.com/manpages/eoan/man7/lvmcache.7.html

dm-writecache block size
The dm-writecache block size can be 4096 bytes (the default), or 512 bytes. The default
4096 has better performance and should be used except when 512 is necessary for
compatibility. The dm-writecache block size is specified with --writecacheblocksize
4096b|512b when caching is started.

   When  a  file system like xfs already exists on the main LV prior to caching, and the file
   system is using a block size of 512, then the writecache block size should be set to  512.
   (The  file  system  will  likely fail to mount if writecache block size of 4096 is used in
   this case.)

Can the "unsupported sectors" message be ignored? If not, does the dm-writecache block size need to be configured when creating the writecache buffer?

2 ) In the screenshot below, rapiddiskd.service is not running, but the writecache is working.

[Screenshot: VirtualBox_systemctl2]

Does the Rapiddisk writecache need rapiddiskd.service running to avoid problems?
Does enabling the service at OS startup cause problems?

3 ) What does rapiddisk do when the writeback cache is full? Is the data flushed? If so, is it flushed completely? If not, is the filesystem metadata updated even though the file has not been completely flushed to disk?

That information would help me understand why the OS sometimes cannot complete startup after a restart.

4 ) Is there anything wrong with the script below? The script is executable and owned by root.

Even a short reply would help me find a fix.

File name: K99

--

#!/bin/bash
sudo dmsetup message /dev/mapper/rc-wb_sda3 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda3
sudo dmsetup message /dev/mapper/rc-wb_sda4 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda4
sudo dmsetup message /dev/mapper/rc-wb_sda5 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda5
sudo dmsetup message /dev/mapper/rc-wb_sda6 0 flush
sudo dmsetup suspend /dev/mapper/rc-wb_sda6

exit 0

--

5 ) Looking at
https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/writecache.html
https://www.man7.org/linux/man-pages/man8/dmsetup.8.html
https://manpages.ubuntu.com/manpages/eoan/man7/lvmcache.7.html

there is information on how to configure the writecache, but nothing about which command line accepts each option.
I want to create one writecache buffer that flushes data every 6 minutes, and another that flushes a selected percentage when the buffer is full.
Does that configuration need to be done when creating the cache, or can it also be changed while the cache is running?

Thanks very much for your reply.
Have a nice week.

@pkoutoupis
Owner

@Augusto7743

Can the "unsupported sectors" message be ignored? If not, does the dm-writecache block size need to be configured when creating the writecache buffer?

To date, rapiddisk-cache does not support 4K block sizes. Dm-writecache may support it. I will check that out in my environment and if it does, I will adjust both the documentation and user space utility to reflect that. I look at the logical block size reported by the underlying block subsystem.
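To see what the block layer reports for a given device, something like the sketch below can help. It assumes the `blockdev` utility from util-linux; the function name and the BLOCKDEV override are hypothetical and exist only so the logic can be tried without real devices.

```shell
#!/bin/sh
# Sketch: print the logical and physical sector sizes the kernel reports.
# BLOCKDEV is an illustrative override for dry runs.
BLOCKDEV="${BLOCKDEV:-blockdev}"

# Print "<device> logical=<bytes> physical=<bytes>" for each argument.
report_sector_sizes() {
    for dev in "$@"; do
        logical=$("$BLOCKDEV" --getss "$dev") || return 1     # logical sector size
        physical=$("$BLOCKDEV" --getpbsz "$dev") || return 1  # physical block size
        printf '%s logical=%s physical=%s\n' "$dev" "$logical" "$physical"
    done
}
```

For example, `report_sector_sizes /dev/sda3 /dev/rd1` (as root) would show whether the backing partition and the RAM disk report the same logical sector size.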

Does the Rapiddisk writecache need rapiddiskd.service running to avoid problems?
Does enabling the service at OS startup cause problems?

No, rapiddiskd.service is not required to get the cache running. The daemon or service is only used for network management. It provides a REST API for external management of the rapiddisk suite. Now, I am not sure why it is not running in your environment (unless you never started/enabled it). But again, it is not a requirement.

What does rapiddisk do when the writeback cache is full? Is the data flushed? If so, is it flushed completely? If not, is the filesystem metadata updated even though the file has not been completely flushed to disk?

I am not sure when it does its flushing. I did not write dm-writecache. I only wrote the other modules. The module documentation provides a little bit more information: https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/writecache.html
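One way to observe the behaviour is to watch the target's status line. The kernel documentation linked above lists the writecache status fields as an error indicator, the number of blocks, the number of free blocks, and the number of blocks under writeback. A minimal sketch that summarises them, with a hypothetical function name and DMSETUP override, could look like this:

```shell
#!/bin/sh
# Sketch: summarise the dm-writecache status fields for one mapping.
# DMSETUP is an illustrative override for dry runs.
DMSETUP="${DMSETUP:-dmsetup}"

# `dmsetup status` prefixes the target fields with start, length and target
# type, so the writecache fields begin at column 4.
writecache_usage() {
    "$DMSETUP" status "$1" | awk '
        $3 == "writecache" {
            used = $5 - $6   # total blocks minus free blocks
            printf "error=%s total=%s free=%s used=%s writeback=%s\n", $4, $5, $6, used, $7
        }'
}
```

Running `writecache_usage rc-wb_sda3` repeatedly while copying files would show how the free-block count changes as the cache fills.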

Is there anything wrong with the script below? The script is executable and owned by root.

I am not sure why you are using suspend. A flush should be the only thing required.

Does that configuration need to be done when creating the cache, or can it also be changed while the cache is running?

Again, the module documentation provides more details: https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/writecache.html BUT I did not add any tunable functions to my implementation. That enhancement request is already captured in your ticket: #62
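For reference, the tunables in that kernel document are passed as optional arguments in the device-mapper table line when the target is created. The sketch below only prints such a line and does not create anything. The function name is hypothetical, cache type `s` is an assumption, and whether options such as `high_watermark`, `low_watermark` and `max_age` are available depends on the kernel version, so check the linked documentation first.

```shell
#!/bin/sh
# Sketch: compose a dm-writecache table line with optional tunables.
# Arguments: origin device, cache device, origin size in 512-byte sectors,
# cache block size, then any number of option words.
writecache_table() {
    origin=$1
    cache=$2
    sectors=$3
    block_size=$4
    shift 4
    # The optional-argument count is the number of words that follow it.
    printf '0 %s writecache s %s %s %s %s' "$sectors" "$origin" "$cache" "$block_size" "$#"
    for word in "$@"; do
        printf ' %s' "$word"
    done
    printf '\n'
}
```

As an illustration, `writecache_table /dev/sda3 /dev/rd1 13631488 4096 high_watermark 50 low_watermark 45 max_age 360000` prints a line that asks for writeback to start at 50% usage, stop at 45%, and age out blocks after 360000 ms (6 minutes). Such a line is what `dmsetup create <name> --table` would take, outside of the rapiddisk utility.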

@Augusto7743
Author

@pkoutoupis

Thanks very much for the reply.
I ran tests to try to report information to the project.

"To date, rapiddisk-cache does not support 4K block sizes. Dm-writecache may support it. I will check that out in my environment and if it does, I will adjust both the documentation and user space utility to reflect that. I look at logical block size reported by the underlying block subsystem."
I see that the BTRFS default block size is 16 KB.
So I take it this is an error message that can be ignored.

"The daemon or service is only used for network management. It provides a REST API for external management of the rapiddisk suite. Now, I am not sure why it is not running in your environment (unless you never started/enabled it)"
Exactly ... I had not started or enabled it for the tests.

"I am not sure when it does its flushing. I did not write dm-writecache. I only wrote the other modules. The module documentation provides a little bit more information: https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/writecache.html"
Unfortunately, neither the link above nor other sites explain what happens when the cache is full.
I want to understand what happens because, in one test copying several files without using a flush command, the root device was damaged.

"I am not sure why you are using suspend. A flush should be the only thing required."
I used suspend because new data might be written to the cache after flushing. I want to avoid problems at shutdown.

"BUT I did not add any tunable functions to my implementation. That enhancement request is already captured in your ticket:"
Being able to configure dm-writecache would be good for users.

I only want to run some tests in virtual machines to figure out how to load the writeback cache at OS startup and how to flush it, but issues came up, so I need to test how to avoid them and report here for other users.
For example:
When running the script that installs the rapiddisk module in the initramfs, you need to configure which device is root to enable the wa cache, in addition to the writecache enabled for root.
Sometimes when restarting the VM, the OS could not start and dropped to the initramfs prompt because the root device could not be loaded. The root file system was damaged even though no files had been copied and no flush had been done.
So I created a 1 MB unformatted partition, sda2, and installed the module script with the root setting pointing to sda2.
sda1 boot-grub
sda2 not formatted > for wa
sda3 boot
sda4 root > writeback
Now the root file system is no longer being damaged.
I have no information to explain it.

I see in #68 that the install script was updated, but I do not understand whether creating both a wa and a wb cache on the same root partition can damage the file system.

At the moment I am trying to figure out why the script is not running on shutdown and restart.
I have not yet written a script that reads how the rapiddisk caches were created and automatically generates scripts and installs systemd services to flush the writecache; that would be a good addition to the project.

I will do more tests and report any information here.

Thanks very much.
Your software is amazing.
Thanks for all your replies.
Have a nice week.

@Augusto7743
Author

Augusto7743 commented Dec 10, 2021

Tested in several VM images.
VM with 1.2 GB RAM, OS Lubuntu 20.04.3, with 3 caches for the root, home and opt partitions.
The initramfs scripts were installed with the command below to avoid having wa and wb on the same root.
sudo bash install_initrd.sh --install --root=/dev/sda99 --size=1 --kernel=5.11.0-27-generic

I created a script and a systemd unit to run on OS restart and shutdown.
Several times a strange issue occurred, with files not being written correctly to disk.
I tested copying a folder of less than 40 MB, containing several files and a few subfolders, into /home/user/documents.
The cache was flushed at OS shutdown, but when the OS was restarted the copied folder had problems.
All subfolders had been copied correctly, but every file in the top level of the folder had been created with a size of 0 KB.
The problem was fixed by adding "sleep 5s" to the script before the flush commands.

The script "flush.sh", saved in /home/user:


#!/bin/bash
sleep 5s
dmsetup message /dev/mapper/rc-wb_sda3 0 flush
dmsetup message /dev/mapper/rc-wb_sda4 0 flush
dmsetup message /dev/mapper/rc-wb_sda5 0 flush
sleep 5s

exit 0
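A possible alternative to the fixed sleep is to call `sync` first. This rests on an assumption that is not confirmed in this thread: that the 0 KB files came from data still sitting in the page cache, which had not yet reached the device-mapper layer when the flush message was sent. The function name and the SYNC and DMSETUP overrides below are hypothetical.

```shell
#!/bin/sh
# Sketch: write dirty page-cache data down first, then flush the writecache.
# SYNC and DMSETUP are illustrative overrides for dry runs.
SYNC="${SYNC:-sync}"
DMSETUP="${DMSETUP:-dmsetup}"

# Arguments: one or more /dev/mapper/... writecache devices.
sync_then_flush() {
    "$SYNC" || return 1                       # push file data to the block layer
    for dev in "$@"; do
        "$DMSETUP" message "$dev" 0 flush || return 1
    done
}
```

It would be called as `sync_then_flush /dev/mapper/rc-wb_sda3 /dev/mapper/rc-wb_sda4 /dev/mapper/rc-wb_sda5`.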


The systemd service that runs on shutdown and restart is saved in /etc/systemd/system/flush.service.
If you know of any good line to add, please share it.


[Unit]
Description=RapidDisk Writeback Cache Flush
DefaultDependencies=no
Before=shutdown.target reboot.target

[Service]
Type=oneshot
User=root
Group=root
ExecStart=/home/user/flush.sh
TimeoutStartSec=0

[Install]
WantedBy=shutdown.target reboot.target
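Another pattern often used for shutdown work is a unit that starts at boot, does nothing, and runs the script from ExecStop. Because systemd stops units in the reverse of their start order, a unit ordered After=local-fs.target is stopped before local file systems are unmounted. The sketch below writes such a unit; the unit name rapiddisk-flush.service, the function name and the paths are hypothetical, and it has not been tested with a cached root device.

```shell
#!/bin/sh
# Sketch: write a unit whose ExecStop runs the flush script at shutdown.
# Arguments: target directory for the unit file, path of the flush script.
write_flush_unit() {
    unit_dir=$1
    flush_script=$2
    cat > "$unit_dir/rapiddisk-flush.service" <<EOF
[Unit]
Description=Flush RapidDisk writeback caches at shutdown
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=$flush_script

[Install]
WantedBy=multi-user.target
EOF
}
```

After `write_flush_unit /etc/systemd/system /home/user/flush.sh`, the unit would still need `systemctl daemon-reload` and `systemctl enable --now rapiddisk-flush.service` so that it is active when shutdown begins.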


Afterwards, I ran the commands:
sudo systemctl enable flush
sudo systemctl daemon-reload
sudo bash /home/user/flush.sh

A few tests showed that all caches were flushed.
More testing on a real machine is needed.

Two problems are happening:

  • Randomly, after the OS loads and I select the user image on the login screen, the screen goes blank and then returns to the login screen, as if the session cannot start. The OS cannot start completely. The workaround is hit or miss: the first "fix" is to press the computer reset button twice and see whether the OS starts completely; the second "fix" is to press the restart button on the login screen. Only try the second "fix" if the first does not work.
  • After the OS has started, browsing /opt shows it as empty, although /opt contains folders. If the user runs the file manager with sudo and browses /opt, all folders are listed as if /opt were correctly "mounted".

It is as if /opt is not being mounted.
The file manager lists "dm-2". Clicking "dm-2" automatically redirects to /opt, and all files are then listed.

I am trying to find a solution to the 2 problems above; after that, I need to create a script that automatically reads the commands inside rapiddisk_sub.orig and generates the flush script and systemd service.
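A first step toward that generator could be a function that reads rapiddisk_sub.orig and prints the mapper device for every `-p wb` mapping. This is a sketch under one assumption: that mappings are always named rc-wb_<backing device name>, which is inferred from the `rapiddisk -l` output earlier in this thread and not from the rapiddisk documentation. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: list the /dev/mapper paths of the writeback mappings declared
# in a rapiddisk_sub.orig style file (path given as the first argument).
writeback_mappers() {
    awk '
        $1 == "rapiddisk" {
            dev = ""; policy = ""
            for (i = 2; i < NF; i++) {
                if ($i == "-b") dev = $(i + 1)      # backing device
                if ($i == "-p") policy = $(i + 1)   # cache policy
            }
            if (policy == "wb" && dev != "") {
                sub(/^.*\//, "", dev)               # keep only the basename
                print "/dev/mapper/rc-wb_" dev
            }
        }' "$1"
}
```

The output, one device per line, could then feed the flush commands in flush.sh.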

@Augusto7743
Author

The 2 issues from my previous reply do not happen on a real machine, but there is another problem.
I configured a script to run at shutdown.
After installing some software and restarting, the OS hangs at the boot command-line interface before switching to the GUI.
The OS hangs after all tasks have started.
There are no file system errors.
Something was broken in root: a configuration, a file, etc.
Everything is being done as a test, and the OS needs to be reinstalled.
Next I need to test running the script before the last few tasks (umount.target) are stopped in the shutdown process, and also creating a script to remove all caches before OS shutdown.

@pkoutoupis
Owner

Thank you for doing this research.

@Augusto7743
Author

@pkoutoupis
Thanks for the new release.

I tested for a few days with no issues using the flush script at shutdown, but a problem occurred when the machine entered standby power mode.
Before standby mode (sleep target), a systemctl command is issued to the nvidia suspend service.
The Nvidia forums have topics about issues when entering the nvidia suspend service.
So, randomly, the machine fails to enter standby and the screen stays blank with no way to return to the OS. If the writeback cache was only partially written at that point, the root file system and the other partitions using their own caches are damaged, to the point that at the next power-on the OS does not start and reports that the root partition cannot be mounted.

I need to remove the nvidia suspend service and add the flush to a script that runs before standby mode.
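One place for such a script is a systemd-sleep hook: executables in /usr/lib/systemd/system-sleep/ are called with "pre" or "post" as the first argument and the sleep type as the second. The sketch below flushes only on "pre". The MAPPINGS list, the DMSETUP override and the function name are hypothetical, and the device names must match the local setup.

```shell
#!/bin/sh
# Sketch: systemd-sleep hook that flushes writecache mappings before suspend.
# DMSETUP and MAPPINGS are illustrative overrides; adjust the device list.
DMSETUP="${DMSETUP:-dmsetup}"
MAPPINGS="${MAPPINGS:-/dev/mapper/rc-wb_sda3 /dev/mapper/rc-wb_sda4 /dev/mapper/rc-wb_sda5}"

# $1 is "pre" or "post"; $2 is the sleep type (suspend, hibernate, ...).
handle_sleep_event() {
    [ "${1:-}" = "pre" ] || return 0          # nothing to do on resume
    for dev in $MAPPINGS; do
        "$DMSETUP" message "$dev" 0 flush || return 1
    done
}

handle_sleep_event "$@"
```

Saved as an executable file in the hook directory, it runs as root each time the machine is about to sleep.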
I will add that and continue the tests.
When I find the correct target name to run before the OS unmounts all file systems, I will post all the information and, if possible, a script that creates the scripts and services needed when the writeback cache is used with your amazing cache software.
Have a nice week.

@matteotenca
Collaborator

@Augusto7743 I noticed that rapiddisk writeback mode used in conjunction with LVM does not seem to suffer from the missing "last flush" on shutdown. I am not sure, though; I only noticed this behaviour once on a virtual machine and did no further tests, so maybe I am totally off topic...
