Fix race condition for multiple very fast downloads #3907
Merged
Problem

See #3906 for details; to sum up, one of the new `NetAsyncDownloader` tests from #3904 was failing in GitHub workflows but nowhere else.

When downloading multiple URLs, the `DownloadTarget.size` values were 0 for some of the files when `NetAsyncDownloader.DownloadAndWait` completed.

Cause

This new test found a real, pre-existing problem for us! I'm really glad I didn't give in to the temptation to add `[Category("FlakyNetwork")]` and move on. 🎉

There's a race condition with multiple extremely fast downloads (the test is using `file://` URLs to avoid spurious failures caused by server glitches, so the files are already on disk and possibly cached in memory). At the start, the main thread loops over the requested URLs and starts a thread for each one. This sets up the `downloads` list with the currently executing downloads, and the `queuedDownloads` list with the downloads that are waiting their turn. When a thread completes, the sizes of the `downloads` and `queuedDownloads` lists are compared to `completed_downloads` to determine whether there are any other downloads in progress or waiting.

If at any time the active download threads manage to finish their work before all of the downloads are started, then they will be comparing the completed download count to incomplete `downloads` and `queuedDownloads` lists! This results in the completion signal being sent early, and when the main thread returns to `DownloadAndWait`, it will skip waiting and signal completion with whatever was finished up to that point.

Changes

- The main thread now holds the `dlMutex` lock (which the other threads acquire before they check for completion) while it starts all the downloads. This ensures that by the time any downloader thread is able to check for completion, the `downloads` and `queuedDownloads` lists will be fully populated, which will prevent the completion notification from being triggered prematurely.
- `file://` URLs and looking up the failed targets in the original input, which is now an `IList` instead of an `ICollection` because we need to call `IndexOf` on it).

Fixes #3906.
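
For anyone who wants to see the ordering problem in miniature, here is a rough sketch. It is not CKAN's actual code: `SketchDownloader`, `StartAll`, `StartOne`, `OnFinished`, `CompletedAtSignal`, and the `holdLockWhileStarting` flag are hypothetical names invented for illustration. Only `dlMutex`, `downloads`, and `completed_downloads` echo the names used above, and `queuedDownloads` is left out for brevity. Without the lock, the sketch joins each worker right after starting it, which forces the "download finishes before the next one is registered" interleaving on every run instead of leaving it to chance.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the downloader discussed above, NOT CKAN's real code.
public class SketchDownloader
{
    private readonly object dlMutex = new object();
    private readonly List<string> downloads = new List<string>();
    private int completed_downloads = 0;

    // How many downloads had finished when completion was first signalled (-1 = never).
    public int CompletedAtSignal { get; private set; } = -1;

    public void StartAll(IList<string> urls, bool holdLockWhileStarting)
    {
        var threads = new List<Thread>();
        if (holdLockWhileStarting)
        {
            // The approach described above: workers block in OnFinished
            // until every download has been registered.
            lock (dlMutex)
            {
                foreach (string url in urls)
                {
                    threads.Add(StartOne(url));
                }
            }
        }
        else
        {
            foreach (string url in urls)
            {
                Thread t = StartOne(url);
                // Assumption for the demo: force the worst case, where this
                // "download" finishes before the next one is registered.
                t.Join();
                threads.Add(t);
            }
        }
        // Wait for all workers so the result can be inspected safely.
        foreach (Thread t in threads)
        {
            t.Join();
        }
    }

    private Thread StartOne(string url)
    {
        downloads.Add(url);
        var t = new Thread(OnFinished);
        t.Start();
        return t;
    }

    // Simulates an instantly finishing download followed by the completion check.
    private void OnFinished()
    {
        lock (dlMutex)
        {
            completed_downloads++;
            if (CompletedAtSignal < 0 && completed_downloads == downloads.Count)
            {
                CompletedAtSignal = completed_downloads;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var urls = new List<string> { "file:///a", "file:///b", "file:///c" };

        var racy = new SketchDownloader();
        racy.StartAll(urls, holdLockWhileStarting: false);
        // prints "without lock: signalled after 1 of 3"
        Console.WriteLine($"without lock: signalled after {racy.CompletedAtSignal} of {urls.Count}");

        var guarded = new SketchDownloader();
        guarded.StartAll(urls, holdLockWhileStarting: true);
        // prints "with lock: signalled after 3 of 3"
        Console.WriteLine($"with lock: signalled after {guarded.CompletedAtSignal} of {urls.Count}");
    }
}
```

In the unguarded run the first worker sees one completed download against a list of one and signals completion, which mirrors the early signal described under Cause. With the lock held during startup, no worker can run its check until the list is complete.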