Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Many improvements for failed downloads #3635

Merged
merged 2 commits into from
Aug 19, 2022

Conversation

HebaruSan
Copy link
Member

@HebaruSan HebaruSan commented Aug 17, 2022

Background

We have had a lot of recent user frustration with how CKAN reacts when downloads fail. Such errors are hard to reproduce with the full app running because they happen randomly according to users' network conditions. However, I recently hit upon a strategy: edit CKAN/registry.json and change the download properties of the latest versions of some mods so they're invalid (I added /BROKEN_LINK/ at the start of each URL's path). Note that this will only work for mods with non-redistributable licenses (generally restricted), because otherwise the downloader will fall back to archive.org and succeed. The ones in the screenshot below work well for this, if it saves reviewers the trouble of finding them; you can right click them to purge them from the cache if they're already downloaded. Also set license to restricted to suppress the archive.org fallback URLs.

Problems

The changeset is cleared after an unsuccessful install, which is bad because the user probably still wants to install those mods.

With my mildly corrupted registry in place, I was finally able to conduct a proper investigation with failures-on-demand, and I discovered several more issues, some of which may simply be root causes of mysterious unreproducible issues that were reported previously:

  • ckan gui --show-console gets confused and thinks --show-console is an identifier that you want it to highlight, resulting in a superfluous "Not found" message in the log 🤦
  • The async downloader has a potential race condition in its manipulation of queuedDownloads with FirstOrDefault: one thread can change the list while the other is searching it, which throws an exception
  • The async downloader considers failed downloads to still be active, so they block downloads from the same host from starting (see One concurrent download per host for all hosts #3557)
  • If you try to cancel a download, the async downloader halts the currently in progress downloads but then proceeds to start up any remaining queued downloads
  • The async modules downloader can't be re-used to retry a batch of downloads because it already has all the modules in this.modules and therefore thinks they're in progress
  • If you choose the not:cached filter and then download a mod to cache, it doesn't disappear even though it no longer matches the filter
  • If your changeset contains both removals and installs, and one of the installs fails, the removal step is not rolled back. This is bad if you were intending to switch from the removed mods to the installed mods, because now you need to fall back to your old mods and have to go find them and reinstall, if you can even remember what they were.
  • The code related to the Wait tab is messy and hard to change without breaking something else

Changes

Now if any downloads in your changeset fail, this popup appears over the install progress screen:

image

Clicking either checkbox in a row will toggle both of that row's checkboxes; rather than representing this with a single checkbox, I wanted to make it clear that both choices are actions, and you are sorting the mods into one group or the other.

If you click "Abort whole changeset", then any uninstallations are rolled back and you are returned to the main mod list with your changeset intact. If you click "Retry", then any mods with a check mark in the "Skip" column will be removed from your changeset, along with any mods depending on them, and CKAN will try again with the mods with check marks in the Retry column plus the ones that didn't fail.

  • Now --show-console isn't passed to GUI since it's handled before that
  • Now a lock mutex protects the queued downloads collection, so only one thread can change it at a time
  • Now if a download fails, the async downloader allows a queued download from the same host to replace it (since we will be reporting and handling download errors later in the popup)
  • Now cancelling downloads purges the queued downloads
  • Now the async module downloader clears this.modules after a batch finishes, successfully or otherwise, so it can be used to retry failed downloads
  • Now all of GUI's many BackgroundWorkers used for interacting with Wait are replaced by one that lives inside Wait, for better encapsulation. The functions formerly assigned to DoWork and RunWorkerCompleted are now parameters to StartWaiting. In addition, some of Wait's functions are changed from public to private and only manipulated from well defined, well known interfaces. This will make it easier to keep track of where these things are being used and ensure they don't end up in a weird state.
  • Wait.OnCancel is now used by other code to react to cancellation instead of Main.cancelCallback, and the Cancel button is only available if the cancelable parameter to StartWaiting is true (which it is for installation and downloading to cache)
  • The Retry button starts out hidden and is shown with Wait.RetryEnabled = true when the failure handler supports it
  • Now ModuleDownloadErrorsKraken's list of modules and their corresponding download errors is public, so we can use it in the popup
  • InstallMods now catches ModuleDownloadErrorsKraken and passes its module/exception info to the new popup, then either throws a cancellation kraken or removes the skipped mods and their depending mods from toInstall
  • Now we always show the selected mod's info on returning to ManageMods
  • Now if you cancel an install or a download to cache, we return you to ManageMods immediately since you don't benefit from reading text that says you cancelled
  • Now the changeset is not cleared after a failed install, so users will no longer have the experience of selecting dozens of mods and then trying to re-select them after a failure
  • Now we refresh the mod list after downloading to cache
  • Now one big transaction covers the entire remove / install / upgrade flow, so if the install fails, the whole changeset is reverted
  • Registry.FindReverseDependencies now has a satisfiedFilter parameter that accepts a function from RelationshipDescriptor to bool, which we use to avoid removing a depending mod from the changeset if the broken dependency is virtual; this way, the user can Skip a virtual dependency and then make a different choice

This should provide a much more stable and pleasant experience when downloads fail.

Fixes #873.
Fixes #1636.
Fixes #3588.
Fixes #3604.

@HebaruSan HebaruSan added Bug Something is not working as intended Enhancement New features or functionality GUI Issues affecting the interactive GUI Core (ckan.dll) Issues affecting the core part of CKAN Pull request Network Issues affecting internet connections of CKAN labels Aug 17, 2022
@HebaruSan
Copy link
Member Author

Hi @NathanPeake, @epower53, and @Ictiv, there is a test build for these changes under the Artifacts dropdown here:

https://github.com/KSP-CKAN/CKAN/pull/3635/checks

If you have a chance, please try it out and let us know if you still have problems with downloads.

@HebaruSan

This comment was marked as resolved.

Copy link
Member

@techman83 techman83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing stands out from my read over, should make downloads that barf halfway far less annoying 🎉

@HebaruSan

This comment was marked as resolved.

@HebaruSan
Copy link
Member Author

  1. Start installing a mod with a virtual dependency
  2. CKAN prompts you to pick a mod to satisfy the depends
  3. You pick a mod
  4. The dependency fails
  5. You Skip the dependency in the popup
    At this point, the current code would remove the depending mod from the changeset. But it would be nice to have special handling for virtual dependencies so the user could choose a different one instead. This would actually not be that hard; if we simply don't remove the depending mod, the prompt would appear when you Retry. So all we really need is a way to detect virtual dependencies.

Addressed this in latest commit; we now no longer remove mods from the changeset if their broken dependencies are virtual, so the user can go back and make another choice.

If you choose to install a mod with dependencies, and you Skip that mod, do the dependencies vanish from the changeset as well since they're no longer needed? I think I saw that happening but it would be good to be sure.
Investigating the virtual dependency idea, I discovered that failed non-virtual dependencies can cause the popup to appear but aren't listed in it; the changeset only includes the user's selections, not the dependencies that automatically get pulled in. I'll have to add some logic to handle that...

The dependencies aren't in the changeset, but the latest commit retrieves the full changeset where needed for the changeset pruning logic.

Haven't looked at enforcing relationships in the UI. This feels like something we can do without and return to if there's demand.

@HebaruSan HebaruSan merged commit 6bc1dbf into KSP-CKAN:master Aug 19, 2022
@HebaruSan HebaruSan deleted the fix/failed-download-handling branch August 19, 2022 07:10
@HebaruSan

This comment was marked as outdated.

@NathanPeake
Copy link
Contributor

Hi @NathanPeake, @epower53, and @Ictiv, there is a test build for these changes under the Artifacts dropdown here:

https://github.com/KSP-CKAN/CKAN/pull/3635/checks

If you have a chance, please try it out and let us know if you still have problems with downloads.

This is what I have been wanting for a long long time. thanks for adding this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something is not working as intended Core (ckan.dll) Issues affecting the core part of CKAN Enhancement New features or functionality GUI Issues affecting the interactive GUI In progress We're still working on this Network Issues affecting internet connections of CKAN
Projects
None yet
3 participants