Add a disk space check in pre-upgrade #1279

lucamosca1 · 2024-08-08T10:27:53Z

It's common practice to partition servers to optimize and safeguard the system. With cloud providers, disks are often kept very small to save costs. In my case the root partition is 10 GB on every server. On one of them, the upgrade failed because there wasn't enough space to download and install all the required packages.
Maybe pre-upgrade could check the root partition to see whether at least X GB are free for a standard system and raise a warning (I would avoid blocking the process; that choice should be up to the admin).
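
A minimal sketch of such a warning-only check, for illustration. The function name `check_free_space` and the 2 GiB default threshold are assumptions made up for this example; they are not part of leapp and do not reflect how leapp's own checks work.

```python
import shutil

GIB = 1024 ** 3


def check_free_space(path="/", min_free_gib=2.0):
    """Return a warning string if free space on `path` is below the
    threshold, otherwise None. It only warns and never blocks anything.

    The threshold is an arbitrary, hypothetical value (the "X GB" above).
    """
    free_gib = shutil.disk_usage(path).free / GIB
    if free_gib < min_free_gib:
        return (
            f"WARNING: only {free_gib:.1f} GiB free on {path}; "
            f"at least {min_free_gib:.1f} GiB is recommended"
        )
    return None


if __name__ == "__main__":
    message = check_free_space("/")
    if message:
        print(message)
```

The caller decides what to do with the message, so the decision to continue stays with the admin.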

pirat89 · 2024-08-13T11:12:56Z

Hi @lucamosca1, can you share the following data?

rpm -qa | grep leapp
OS used (and version)
cloud used?

It would also help to have the following df -h outputs:

before installing leapp-upgrade at all (so /var/lib/leapp is empty),
after running leapp preupgrade
after running leapp upgrade but before reboot
the actual error output you hit

and /var/log/leapp/leapp-upgrade.log

The checks you describe have actually been implemented for some time. We have tested them on various setups with various amounts of free space, including edge cases where the free space is very close to the real limit at which the upgrade starts to fail, and on typical cloud systems too. Note that with the implemented checks it's hard to test right up to the "real" limits, because part of the solution is a set of reserves, so the upgrade is usually inhibited before you get very close to them.

We would like to raise these limits much further to be really safe - that means by another several GBs on top of what is currently set for reserves. However, we know that would be problematic: a number of systems that can upgrade today would be blocked, and such an increase would be received negatively, as basically the majority of cloud systems would be prevented from upgrading when that's not necessary. So we do not have much room to raise the required reserves significantly. Note that the required free disk space is very dynamic. Since we implemented the current checks, reports about this issue have been extremely rare. We are aware that a 100% safe solution is basically not feasible, so we implemented these best-effort checks.

To improve the solution further, we would need output from rpm with the calculated required disk space for each partition. As rpm does not provide such output, we cannot make our estimates more precise. We have discussed this with the RPM developers and they consider such a feature problematic due to RPM's design. There are some tricky hacks we could use to obtain this information, but right now that is not considered a good trade-off because it's tricky (basically we would have to make rpm think that all partitions are too small for anything and parse the information from the printed errors). So consider this something we do not want in the code at all right now.

Then there is another problem: the disk space we check can still be consumed by various logs, apps, etc. before the reboot is executed. We have set some reserves to cover that, but in some corner cases the reserve may again not be enough.

I hope I have provided enough detail to explain the actual problems with required free disk space, so it's understood that this will never be completely error-proof, as a number of heuristics are involved (even rpm uses a number of heuristics to work out how much space the transaction will need, and that is not 100% safe either). We can only improve the heuristics, that's all. If people want to be safe with such an operation, having at least 5 times more free space than the installed software needs per partition can be considered kind of safe - but again, not in all cases.
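
The 5x rule of thumb can be sketched as follows. This is only an illustration: the function name is made up, and the per-partition "needed" figures are assumed inputs that an admin would have to estimate by hand, since rpm does not report them per partition.

```python
def partitions_below_margin(free_bytes, needed_bytes, factor=5):
    """Return the mount points whose free space is below factor * needed.

    free_bytes and needed_bytes map mount point -> bytes. The values in
    needed_bytes are hypothetical estimates, not something rpm provides.
    """
    return sorted(
        mount
        for mount, needed in needed_bytes.items()
        if free_bytes.get(mount, 0) < factor * needed
    )


if __name__ == "__main__":
    GIB = 1024 ** 3
    free = {"/": 4 * GIB, "/var": 20 * GIB}    # example figures only
    needed = {"/": 1 * GIB, "/var": 2 * GIB}   # example figures only
    print(partitions_below_margin(free, needed))
```

With these example figures, `/` has 4 GiB free against a 5 GiB margin and is reported, while `/var` has 20 GiB free against a 10 GiB margin and is not. As said above, passing this check is still not a guarantee.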

lucamosca1 · 2024-08-13T12:33:35Z

OK, that's perfectly understandable. Unfortunately I no longer have the server involved (it was CentOS 7 installed on an AWS EC2 instance), so I cannot share additional info.
My suggestion was to pick an arbitrary X GB of free disk space on the root partition below which a warning is raised before the upgrade proceeds. I didn't take into consideration the total and exact amount of space required by all the packages that have to be downloaded.

pirat89 · 2024-08-13T13:56:46Z

Thanks for the info. In that case I think you used the ELevate project from AlmaLinux, which uses its own fork of leapp-repository. I am aware that for a long time they used an older leapp-repository version which was missing a number of fixes and features (this one included). It seems that they have recently updated the fork to a version that should contain the changes - that is, upstream version 0.19.0 and higher.

The original solution was very buggy (and also produced misleading error messages) due to hacks we had to add to cover older XFS filesystems without the ftype (d_type) attribute. The solution was completely redesigned in these 2 PRs:

So I consider this problem resolved nowadays.

lucamosca1 added the bug label Aug 8, 2024

pirat89 transferred this issue from oamg/leapp Aug 13, 2024

pirat89 closed this as completed Aug 13, 2024
