Introduce experimental live mode [squashfs] #1248
I reviewed the whole PR yesterday and it looks great. The approach looks good and I didn't find anything that looks like it will introduce a bug. Everything I noted is a small thing and can be deferred until a later PR. I did find that my spotting of little things that could be changed went down in the second half of the PR, though, so it's possible I've missed things. I've asked @MichalHe to point out the places that I should take a harder look at (because those places will run when we are not in livemode) and I will take a second look at those.
[ -f "$NEWROOT$LEAPP_FAILED_FLAG_FILE" ] && {
    echo >&2 "Found file $NEWROOT$LEAPP_FAILED_FLAG_FILE"
    echo >&2 "Error: Leapp previously failed and cannot continue, returning back to emergency shell"
    echo >&2 "Please file a support case with $NEWROOT/var/log/leapp/leapp-upgrade.log attached"
    echo >&2 "To rerun the upgrade upon exiting the dracut shell remove the $NEWROOT$LEAPP_FAILED_FLAG_FILE file"
    exit 1
}
The shell script does not reflect some recent upstream changes requested by support; e.g., this check should no longer block the upgrade (even though continuing will most likely lead to a crash later). As this is experimental only, it's not a problem, but we should reflect the changes at least in a follow-up PR.
/packit copr-build
The experimental "live mode" feature, which allows booting into a squashfs image of the target userspace and running leapp via a service, builds on several models added by this commit. The most important of them is the LiveModeConfig model, which stores the user-defined configuration of the feature. Jira ref: RHEL-45280
/packit copr-build
Great job! Once the last failing "unittest" is fixed we are ready to go.
f6de260 to aaf7528
e5e045c to 86742d1
Modify core actors to support upgrades with "live mode". Whereas live mode implies a new, separate code path for generating the live image initramfs, the changes introduced in the add_upgrade_boot_entry actor interfere deeply with the old implementation. Kernel cmdline arguments for the created boot entry are now manipulated uniformly, avoiding ad hoc string formatting. It is also possible to remove kernel cmdline args from the entry. Addition of arguments precedes removal, i.e., if arg=value should be both added and removed, it will be removed. The root cmdline parameter is modified separately due to a bug in grubby. Jira ref: RHEL-45280
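The "addition precedes removal" rule from the commit message above can be illustrated with a minimal sketch. This is not the actor's actual code: the function names and the representation of arguments as (key, value) tuples are assumptions made for illustration only, and the simple whitespace split does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel cmdline into (key, value) tuples; flags get value None.

    Assumption: arguments contain no quoted whitespace.
    """
    args = []
    for token in cmdline.split():
        key, sep, value = token.partition('=')
        args.append((key, value if sep else None))
    return args


def format_cmdline(args):
    """Render (key, value) tuples back into a single cmdline string."""
    return ' '.join(
        key if value is None else '{}={}'.format(key, value)
        for key, value in args
    )


def apply_changes(current, to_add=(), to_remove=()):
    """Apply additions first, then removals.

    An argument present in both to_add and to_remove therefore ends up removed.
    """
    result = list(current)
    for arg in to_add:
        if arg not in result:
            result.append(arg)
    removed = set(to_remove)
    return [arg for arg in result if arg not in removed]
```

Keeping the arguments as structured tuples until the last moment is what avoids the ad hoc string formatting; only format_cmdline turns them back into text.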
Add actors that scan the new configuration file devel-livemode.ini, informing the rest of the actor collective about the configuration. Based on this configuration, additional packages are requested to be installed into the target userspace. The target userspace is also modified to contain services that execute leapp based on kernel cmdline. For a full list of modifications, see models/livemode.py added in a previous commit. The feature can be enabled by setting LEAPP_UNSUPPORTED=1 together with LEAPP_DEVEL_ENABLE_LIVE_MODE=1. Note that the squashfs-tools package must be installed (otherwise an error will be raised). The live mode feature is currently tested only on x86_64; attempting to use it on a different architecture is therefore prohibited by the implementation. Jira ref: RHEL-45280
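The enablement conditions listed in the commit message above can be sketched as a small standalone check. This is only an illustration, not the PR's implementation: the function names are hypothetical, and detecting squashfs-tools by looking for the mksquashfs binary on PATH is an assumption (the actors may well query the package database instead).

```python
import os
import platform
import shutil

SUPPORTED_ARCH = 'x86_64'


def is_livemode_requested(env):
    """Live mode is requested only when both variables are set to '1'."""
    return (env.get('LEAPP_UNSUPPORTED') == '1'
            and env.get('LEAPP_DEVEL_ENABLE_LIVE_MODE') == '1')


def livemode_problems(arch, has_mksquashfs):
    """Return human-readable reasons why live mode cannot be used."""
    problems = []
    if arch != SUPPORTED_ARCH:
        problems.append(
            'live mode is tested only on x86_64, found {}'.format(arch))
    if not has_mksquashfs:
        # Assumption: mksquashfs on PATH stands in for "squashfs-tools installed".
        problems.append('squashfs-tools is not installed (mksquashfs not found)')
    return problems


def check_host():
    """Check the current host; returns None when live mode is not requested."""
    if not is_livemode_requested(os.environ):
        return None
    return livemode_problems(
        platform.machine(), shutil.which('mksquashfs') is not None)
```

Separating the pure checks from check_host keeps them testable without touching the real environment.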
I am trying to get the tests fixed. There are 2 problems
I've already created a MR for the problematic test. Let's see if we will be able to proceed in a few hours.
/packit copr-build
summary = (
    'The Live Upgrade Mode requires at least 2 GB of additional space '
    'in the partition that hosts /var/lib/leapp in order to create '
    'the squashfs image. During the "reboot phase", the image will '
    'need more space into memory, in particular for booting over the '
    'network. The recommended memory for this mode is at least 4 GB.'
)
reporting.create_report([
    reporting.Title('Live Upgrade Mode enabled'),
    reporting.Summary(summary),
    reporting.Severity(reporting.Severity.HIGH),
    reporting.Groups([reporting.Groups.BOOT]),
    reporting.RelatedResource('file', '/etc/leapp/files/devel-livemode.ini')
])
In the future, we should check the available memory and raise this report only when we see that there is not enough memory available. Note that information about total memory is covered by the MemoryInfo msg, but I do not remember now whether it covers physical memory only or whether it includes swap as well. Note that in some cases we should not count swap.
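A minimal sketch of the memory check suggested above, under stated assumptions: it parses /proc/meminfo-style text directly rather than consuming the MemoryInfo message, the function names are hypothetical, and the 4 GiB threshold is just the report's "4 GB" recommendation read literally. In /proc/meminfo, MemTotal covers usable RAM only and swap is reported separately as SwapTotal, so counting swap is an explicit choice here.

```python
RECOMMENDED_KIB = 4 * 1024 * 1024  # 4 GiB expressed in kB, as /proc/meminfo reports


def parse_meminfo(text):
    """Parse '/proc/meminfo'-style text into a {field: integer value} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(':')
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def has_recommended_memory(meminfo, include_swap=False):
    """Compare total memory against the threshold, optionally counting swap."""
    total = meminfo.get('MemTotal', 0)
    if include_swap:
        total += meminfo.get('SwapTotal', 0)
    return total >= RECOMMENDED_KIB


def read_host_meminfo(path='/proc/meminfo'):
    """Read and parse the memory information of the running host."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Because the kernel reserves some memory, a machine sold as "4 GB" typically reports a MemTotal slightly below 4 GiB, so a real check would probably want some tolerance in the threshold.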
Passed! Reviewed most of the code (I haven't deeply reviewed some of the experimental actors). It seems the solution works as expected. Additional changes are expected to be delivered in other PRs in the future.
Great job guys!!! 🚀
## Packaging
- .. names of packages, dependencies, changes in provided capabilities....
## Upgrade handling
### Fixes
- Add missing RHUI GCP config info for RHEL for SAP (oamg#1253)
- Fix creation of the post upgrade report about changes in states of systemd services (oamg#1210)
- Fix detection of valid sshd config with internal-sftp subsystem in Leapp (oamg#1212)
- Fix evaluation of PES data (oamg#1194)
- Fix failing "update-ca-trust" command caused by missing util-linux package (oamg#1169)
- Fix handling of versions in RHUI configuration for ELS and SAP upgrades (oamg#1240)
- Fix the parsing of the lscpu output (oamg#1184, oamg#1208)
- Fix the upgrade of systems using RHUI on AWS after changes in RHUI client package (oamg#1178)
- Fix upgrade on aarch64 via RHUI on AWS (oamg#1240)
- Handle a false positive GPG check error when TargetUserSpaceInfo is missing (oamg#1269)
- Target by default always "GA" channel repositories unless a different channel is specified for the leapp execution (oamg#1205)
- Update the default kernel cmdline (oamg#1193, oamg#1216)
- Update the device driver deprecation data, fixing invalid fields for some AMD CPUs (oamg#1211)
- Wait for the storage initialization when /usr is on separate file system - covering SAN (oamg#1218, oamg#1219)
- [IPU 7 -> 8] Drop enforced tomcat removal for satellite when upgrading to RHEL 8.10 (oamg#1243)
- [IPU 7 -> 8] Fix detection of bootable device on RAID (oamg#1260)
- [IPU 8 -> 9] Inhibit the upgrade to RHEL 9.5 on ARM architecture due to incompatibility of the RHEL 8 bootloader and RHEL 9.5 kernel (oamg#1270)
### Enhancements
- [IPU 8 -> 9] Introduce upgrade path 8.10 -> 9.5 (oamg#1245, oamg#1246)
- Apply solutions for leftover rpms for all major upgrade paths - including experimental actors (oamg#1199)
- Do not terminate the upgrade dracut module execution anymore if /sysroot/root/tmp_leapp_py3/.leapp_upgrade_failed exists (oamg#1197)
- Improve set_systemd_services_states logging (oamg#1213)
- Include leapp command execution and defined leapp envars inside leapp.db - (oamg#1152)
- Introduce experimental upgrades in 'live' mode for the testing (oamg#1248)
- Load obsoleted GPG keys from gpg-signatures.json file instead of hardcoding them (oamg#1241)
- Several minor improvements in messages printed in console output (oamg#1173, oamg#1214, oamg#1274)
- Several minor improvements in report and error messages (oamg#1207, oamg#1217, oamg#1234, oamg#1235, oamg#1242)
- Sort lists in dnf-plugin-data for easier overview (oamg#1231)
- [IPU 7 -> 8] Allow upgrade of content from ELS repositories (oamg#1198)
- [IPU 7 -> 8] Inhibit the upgrade when Legacy GRUB is detected (oamg#1206)
- [IPU 7 -> 8] Inhibit the upgrade when embedding area is small to prevent failed bootloader update (oamg#1195)
- [IPU 8 -> 9] Enable EL 8 > 9 upgrades on Alibaba cloud (oamg#1249)
- [IPU 8 -> 9] Enable EL 8 to 9 upgrade of Satellite/Foreman server (oamg#1181)
- [IPU 9 -> 10] Introduced number of changes to enable experimental IPU 9 -> 10 (oamg#1169)
- [IPU 9 -> 10] Prevent upgrading if NetworkManager is configured with dhcp=dhclient (oamg#1268)
- [IPU 9 -> 10] Update URLs in reports to reflect the next planned major upgrade path (oamg#1169, oamg#1273)
## Additional changes interesting for devels
- drop unused `packager` field from gpg-signatures.json (oamg#1233)
- [IPU 9 -> 10] make system_upgrade/common leapp repo Python 3.12 compatible
- [IPU 9 -> 10] introduced system_upgrade/el9toel10 leapp repo
## Packaging
- Start building for EL 9 in the upstream repository on COPR (#1169)
## Upgrade handling
### Fixes
- Add missing RHUI GCP config info for RHEL for SAP (#1253)
- Fix creation of the post upgrade report about changes in states of systemd services (#1210)
- Fix detection of valid sshd config with internal-sftp subsystem in Leapp (#1212)
- Fix evaluation of PES data (#1194)
- Fix failing "update-ca-trust" command caused by missing util-linux package (#1169)
- Fix handling of versions in RHUI configuration for ELS and SAP upgrades (#1240)
- Fix the parsing of the lscpu output (#1184, #1208)
- Fix the upgrade of systems using RHUI on AWS after changes in RHUI client package (#1178)
- Fix upgrade on aarch64 via RHUI on AWS (#1240)
- Handle a false positive GPG check error when TargetUserSpaceInfo is missing (#1269)
- Target by default always "GA" channel repositories unless a different channel is specified for the leapp execution (#1205)
- Update the default kernel cmdline (#1193, #1216)
- Update the device driver deprecation data, fixing invalid fields for some AMD CPUs (#1211)
- Wait for the storage initialization when /usr is on separate file system - covering SAN (#1218, #1219)
- [IPU 7 -> 8] Drop enforced tomcat removal for satellite when upgrading to RHEL 8.10 (#1243)
- [IPU 7 -> 8] Fix detection of bootable device on RAID (#1260)
- [IPU 8 -> 9] Inhibit the upgrade to RHEL 9.5 on ARM architecture due to incompatibility of the RHEL 8 bootloader and RHEL 9.5 kernel (#1270)
### Enhancements
- [IPU 8 -> 9] Introduce upgrade path 8.10 -> 9.5 (#1245, #1246)
- Update leapp data files (#1280)
- Apply solutions for leftover rpms for all major upgrade paths - including experimental actors (#1199)
- Do not terminate the upgrade dracut module execution anymore if /sysroot/root/tmp_leapp_py3/.leapp_upgrade_failed exists (#1197)
- Improve set_systemd_services_states logging (#1213)
- Include leapp command execution and defined leapp envars inside leapp.db - (#1152)
- Introduce experimental upgrades in 'live' mode for the testing (#1248)
- Load obsoleted GPG keys from gpg-signatures.json file instead of hardcoding them (#1241)
- Several minor improvements in messages printed in console output (#1173, #1214, #1274)
- Several minor improvements in report and error messages (#1207, #1217, #1234, #1235, #1242)
- Sort lists in dnf-plugin-data for easier overview (#1231)
- [IPU 7 -> 8] Allow upgrade of content from ELS repositories (#1198)
- [IPU 7 -> 8] Inhibit the upgrade when Legacy GRUB is detected (#1206)
- [IPU 7 -> 8] Inhibit the upgrade when embedding area is small to prevent failed bootloader update (#1195)
- [IPU 8 -> 9] Enable EL 8 > 9 upgrades on Alibaba cloud (#1249)
- [IPU 8 -> 9] Enable EL 8 to 9 upgrade of Satellite/Foreman server (#1181)
- [IPU 9 -> 10] Introduced number of changes to enable IPU 9 -> 10 for testing (#1169)
- [IPU 9 -> 10] Prevent upgrading if NetworkManager is configured with dhcp=dhclient (#1268)
- [IPU 9 -> 10] Update URLs in reports to reflect the next planned major upgrade path (#1169, #1273)
## Additional changes interesting for devels
- drop unused `packager` field from gpg-signatures.json (#1233)
- [IPU 9 -> 10] make system_upgrade/common leapp repo Python 3.12 compatible
- [IPU 9 -> 10] introduced system_upgrade/el9toel10 leapp repo
## Packaging
- Start building for EL 9 in the upstream repository on COPR (oamg#1169)
## Upgrade handling
### Fixes
- Add missing RHUI GCP config info for RHEL for SAP (oamg#1253)
- Fix creation of the post upgrade report about changes in states of systemd services (oamg#1210)
- Fix detection of valid sshd config with internal-sftp subsystem in Leapp (oamg#1212)
- Fix evaluation of PES data (oamg#1194)
- Fix failing "update-ca-trust" command caused by missing util-linux package (oamg#1169)
- Fix handling of versions in RHUI configuration for ELS and SAP upgrades (oamg#1240)
- Fix the parsing of the lscpu output (oamg#1184, oamg#1208)
- Fix the upgrade of systems using RHUI on AWS after changes in RHUI client package (oamg#1178)
- Fix upgrade on aarch64 via RHUI on AWS (oamg#1240)
- Handle a false positive GPG check error when TargetUserSpaceInfo is missing (oamg#1269)
- Target by default always "GA" channel repositories unless a different channel is specified for the leapp execution (oamg#1205)
- Update the default kernel cmdline (oamg#1193, oamg#1216)
- Update the device driver deprecation data, fixing invalid fields for some AMD CPUs (oamg#1211)
- Wait for the storage initialization when /usr is on separate file system - covering SAN (oamg#1218, oamg#1219)
- [IPU 7 -> 8] Drop enforced tomcat removal for satellite when upgrading to RHEL 8.10 (oamg#1243)
- [IPU 7 -> 8] Fix detection of bootable device on RAID (oamg#1260)
- [IPU 8 -> 9] Inhibit the upgrade to RHEL 9.5 on ARM architecture due to incompatibility of the RHEL 8 bootloader and RHEL 9.5 kernel (oamg#1270)
### Enhancements
- [IPU 8 -> 9] Introduce upgrade path 8.10 -> 9.5 (oamg#1245, oamg#1246)
- Update leapp data files (oamg#1280)
- Apply solutions for leftover rpms for all major upgrade paths - including experimental actors (oamg#1199)
- Do not terminate the upgrade dracut module execution anymore if /sysroot/root/tmp_leapp_py3/.leapp_upgrade_failed exists (oamg#1197)
- Improve set_systemd_services_states logging (oamg#1213)
- Include leapp command execution and defined leapp envars inside leapp.db - (oamg#1152)
- Introduce experimental upgrades in 'live' mode for the testing (oamg#1248)
- Load obsoleted GPG keys from gpg-signatures.json file instead of hardcoding them (oamg#1241)
- Several minor improvements in messages printed in console output (oamg#1173, oamg#1214, oamg#1274)
- Several minor improvements in report and error messages (oamg#1207, oamg#1217, oamg#1234, oamg#1235, oamg#1242)
- Sort lists in dnf-plugin-data for easier overview (oamg#1231)
- [IPU 7 -> 8] Allow upgrade of content from ELS repositories (oamg#1198)
- [IPU 7 -> 8] Inhibit the upgrade when Legacy GRUB is detected (oamg#1206)
- [IPU 7 -> 8] Inhibit the upgrade when embedding area is small to prevent failed bootloader update (oamg#1195)
- [IPU 8 -> 9] Enable EL 8 > 9 upgrades on Alibaba cloud (oamg#1249)
- [IPU 8 -> 9] Enable EL 8 to 9 upgrade of Satellite/Foreman server (oamg#1181)
- [IPU 9 -> 10] Introduced number of changes to enable IPU 9 -> 10 for testing (oamg#1169)
- [IPU 9 -> 10] Prevent upgrading if NetworkManager is configured with dhcp=dhclient (oamg#1268)
- [IPU 9 -> 10] Update URLs in reports to reflect the next planned major upgrade path (oamg#1169, oamg#1273)
## Additional changes interesting for devels
- drop unused `packager` field from gpg-signatures.json (oamg#1233)
- [IPU 9 -> 10] make system_upgrade/common leapp repo Python 3.12 compatible
- [IPU 9 -> 10] introduced system_upgrade/el9toel10 leapp repo
(cherry picked from commit 03c257b)
This work is based on @bessonc's (cbesson) awesome livemode actors.
Description
This patch introduces a feature named live mode, which allows booting into a squashfs image of the target userspace and reaching the multi-user target. It is therefore possible to interact with the upgrade environment while the upgrade is running in the background, or to debug a crash of the service while having a fully booted (although relatively minimal) system. Reaching multi-user means that the environment can have networking available. The patch sets up network configuration for the target userspace; however, this is a best-effort solution at the moment. The patch also somewhat simplifies the development of such features: one can boot into the squashfs, log in, and check whether everything is as expected. If not, the developer can modify the system manually, recording their modifications, and later just reproduce those modifications in the code.
If the feature is not used/enabled, leapp's behavior remains unchanged, although this patch refactors some of leapp's core actors (e.g., add_upgrade_boot_entry).
How to run:
Note that squashfs-tools must be installed. Moreover, the feature is currently limited to x86_64; leapp will prevent attempts to run it on a different architecture.
Test coverage:
add_upgrade_boot_entry
live_image_generator
live_mode_config
live_mode_report
prepare_live_image
emit_livemode_userspace_requirements
Jira ref: RHEL-45280