fix: fail deployment when it's aborted #115
See my comment below. Otherwise looks good
I didn't know if we wanted to fail on everything except 204, so I added MENDER_ABORTED.
I think this is a good idea.
20e2f05 to 65f9994
65f9994 to 9703d03
9703d03 to bc3ce0d
A double free can occur in `mender_flash_abort_deployment`, so call `FREE_AND_NULL`

Changelog: None
Ticket: None
Signed-off-by: Daniel Skinstad Drabitzius <[email protected]>
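To illustrate why freeing and then nulling the pointer avoids the double free mentioned in this commit, here is a minimal sketch. The macro body is an assumed definition (the real `FREE_AND_NULL` in mender-mcu may differ), and `deployment_t`, `make_deployment`, and `abort_deployment` are hypothetical stand-ins, not the project's actual types or functions.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed definition; the real macro in mender-mcu may differ. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Hypothetical stand-in for data owned by the flash code. */
typedef struct {
    char *id;
} deployment_t;

/* Hypothetical helper: allocate a deployment holding a copy of `id`. */
static deployment_t
make_deployment(const char *id) {
    deployment_t d;
    d.id = malloc(strlen(id) + 1);
    if (NULL != d.id) {
        strcpy(d.id, id);
    }
    return d;
}

/* Hypothetical abort handler that is safe to call more than once. */
static void
abort_deployment(deployment_t *d) {
    /* After the first call d->id is NULL, and free(NULL) is a no-op,
     * so a second call cannot free the same block twice. */
    FREE_AND_NULL(d->id);
}
```

With a plain `free(d->id)` the second call to `abort_deployment` would pass an already freed pointer to `free`, which is undefined behavior.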
Changelog: Title
Ticket: MEN-7693
Signed-off-by: Daniel Skinstad Drabitzius <[email protected]>
bc3ce0d to d37607b
Merging these commits will result in the following changelog entries:

Changelogs

mender-mcu (abort-deployment)

New changes in mender-mcu since main:

Bug Fixes
@@ -1069,6 +1081,9 @@ mender_client_update_work_function(void) {
    if (!mender_update_module->supports_rollback) {
        mender_log_warning("Rollback not supported for artifacts of type '%s'", mender_update_module->artifact_type);
        ret = MENDER_FAIL;
    } else if (aborted_deployment) {
        /* Don't rollback if deployment is aborted */
Wouldn't you as a user expect a rollback if you abort a deployment that isn't committed yet?
Right, but the code doesn't support an abort after a reboot: nothing publishes the status between reboot and commit, and on commit we've said it's too late. My idea is that before we have rebooted, we should not roll back into the other partition, but just go to failure. Otherwise we'll end up in the wrong partition if we, e.g., abort the deployment after the install state has published its status and set the pending image.
I don't understand, sorry. The code is in a place where we are in the `ROLLBACK` state, i.e. we are supposed to roll back. And at that place we don't roll back if the deployment was aborted. To me that's wrong. We either should not get into the `ROLLBACK` state in case there's nothing to roll back, or we should do a rollback. So, if I understand correctly what you describe, the proper behavior is to not end up in this state in such a case. IOW, either there's something to roll back and we should do a rollback if we get here and the deployment is aborted, or there's nothing to roll back and we should not end up here. Or am I still missing something? If so, please describe the sequence of states that will result in a problematic case.
Assume we have `MENDER_UPDATE_STATE_DOWNLOAD` -> `MENDER_UPDATE_STATE_INSTALL` -> `MENDER_UPDATE_STATE_REBOOT`.

In each of these states we publish the status and check whether the deployment is aborted. If it's aborted when we enter `MENDER_UPDATE_STATE_REBOOT`, then what do we want to do? We haven't rebooted yet, as this is caught before the reboot callback is called, so there are two options. We can go directly to failure, which deviates from the normal flow but makes sense since we haven't actually rebooted; this is what I originally did. Or we can follow the normal state transitions and take the failure route, which transitions to rollback. But then we indeed have nothing to roll back, which is why I checked for an aborted deployment and set `ret = MENDER_FAIL` in `MENDER_UPDATE_STATE_ROLLBACK`.

If we checked whether the status is aborted in `MENDER_UPDATE_STATE_COMMIT` (which Lluis said was too late, but I don't quite know why it's too late), then it would make sense to do a proper rollback, as we would have rebooted into the new partition.
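The first option described in this comment (go straight to failure when the abort is seen before the reboot callback, and roll back only once the device has booted into the new partition) can be sketched as a small transition function. This is a simplified model for discussion, not the code in `mender_client_update_work_function`: the `STATE_*` names and `next_state` are hypothetical, the normal-flow transitions are assumptions, and the abort check in the commit state is the hypothetical case from the comment, which the thread does not settle.

```c
#include <stdbool.h>

/* Hypothetical, shortened state names for this sketch only. */
typedef enum {
    STATE_DOWNLOAD,
    STATE_INSTALL,
    STATE_REBOOT,
    STATE_COMMIT,
    STATE_ROLLBACK,
    STATE_FAILURE,
    STATE_END
} state_t;

/* Return the state to enter next, given that an abort was (or was not)
 * detected while entering `current`. */
static state_t
next_state(state_t current, bool aborted) {
    if (aborted) {
        switch (current) {
            case STATE_DOWNLOAD:
            case STATE_INSTALL:
            case STATE_REBOOT:
                /* Abort seen before the reboot callback ran:
                 * nothing to roll back, so skip ROLLBACK entirely. */
                return STATE_FAILURE;
            case STATE_COMMIT:
                /* Already running from the new partition: roll back. */
                return STATE_ROLLBACK;
            default:
                break;
        }
    }
    /* Assumed normal flow when no abort is detected. */
    switch (current) {
        case STATE_DOWNLOAD:
            return STATE_INSTALL;
        case STATE_INSTALL:
            return STATE_REBOOT;
        case STATE_REBOOT:
            return STATE_COMMIT;
        case STATE_ROLLBACK:
            return STATE_FAILURE;
        default:
            return STATE_END;
    }
}
```

In this model the `ROLLBACK` state is never entered when there is nothing to roll back, so the rollback handler itself would not need to know about the abort.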
I didn't know if we wanted to fail on everything except `204`, so I added `MENDER_ABORTED`. If we can go to failure on everything except a `204`, then we could just check for `MENDER_FAIL`.

I also added a check to reboot, just in case you abort right after the check in install.
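The two options around `204` can be compared side by side in a small sketch. The `RESULT_*` names and both functions are hypothetical, and treating HTTP `409` as the server's "deployment aborted" answer is an assumption made here for illustration; the thread itself only mentions `204`.

```c
/* Hypothetical result codes standing in for the project's error type. */
typedef enum {
    RESULT_OK,
    RESULT_FAIL,
    RESULT_ABORTED
} result_t;

#define HTTP_NO_CONTENT      204
#define HTTP_ABORTED_ASSUMED 409 /* assumption: status used to signal an abort */

/* Option 1: keep a distinct "aborted" result next to plain failure. */
static result_t
classify_with_aborted(int http_status) {
    if (HTTP_NO_CONTENT == http_status) {
        return RESULT_OK;
    }
    if (HTTP_ABORTED_ASSUMED == http_status) {
        return RESULT_ABORTED;
    }
    return RESULT_FAIL;
}

/* Option 2: treat everything except 204 as failure. */
static result_t
classify_fail_only(int http_status) {
    return (HTTP_NO_CONTENT == http_status) ? RESULT_OK : RESULT_FAIL;
}
```

Option 1 lets the caller tell an abort apart from, say, a transient server error; option 2 is simpler but sends both down the same failure path.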