Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't crash due to closed socket on Windows #772

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

triplef
Copy link
Contributor

@triplef triplef commented Jan 17, 2023

Fixes possible crash on Windows due to WSAEventSelect() failing with WSAENOTSOCK ("Socket operation on nonsocket") when cancelling a dispatch source after or right before closing the socket.

While the documentation for dispatch_source_cancel() indicates that one should wait for the cancellation handler before closing the socket, this doesn’t seem to be a hard requirement, and trying the same on macOS does not cause any errors.

Additionally, graceful handling of this error is required when using libdispatch with some 3rd party libraries like curl, which notify the caller about a socket being closed only right before closing it, and providing no facility to delay closing the socket.

@triplef
Copy link
Contributor Author

triplef commented Jan 17, 2023

@adierking I’d appreciate your review on this, as you implemented the affected code in 82fefaf. 🙏

@MaxDesiatov MaxDesiatov requested a review from compnerd January 17, 2023 12:43
@lxbndr
Copy link

lxbndr commented Jan 18, 2023

We're using similar patch in our product for quite a long time, and I have a few things to add:

  • the change actually helps to reduce crash rate
  • this error is a consequence of improper socket use and there are more possible crash sites in Dispatch, but with less chances

I actually never thought this approach would be accepted, because I've a discussion with @compnerd about error suppression, and he supposed that we should find and fix the reason of the error instead of silencing it.

By 3rd party libraries like curl you probably mean curl in swift-corelibs-foundation. That's correct, the lack of proper Dispatch Source shutdown in URLSession allows curl close socket before Dispatch frees related resources. But silencing the error would not help in all cases. It appears that, surprisingly, the requirement to wait for Dispatch Source shutdown becomes mandatory for Windows. That's because on Windows sockets handles could be reused after close. That is, if you immediately start a new connection, its socket handle could be same as you just have closed. When Dispatch Source is still shutting down with that value, it leads to corruption of internal structures and crash.

Unfortunately, I had no much time to master the fix properly (I mean, to suggest it for community), but we have tested the solution in Foundation for a few months, and I can say it removes network-related crashes almost completely.

@adierking
Copy link
Contributor

@adierking I’d appreciate your review on this, as you implemented the affected code in 82fefaf. pray

Yeah, I think this should be fine.

cc @compnerd, any opinions on this?

@compnerd
Copy link
Member

I'm not so sure that this is safe. The reuse of the socket handle really worries me as @lxbndr stated. This isn't addressing the underlying issue of mishandling of SOCKET, which really is not dispatch's fault.

@adierking
Copy link
Contributor

I'm not so sure that this is safe. The reuse of the socket handle really worries me as @lxbndr stated. This isn't addressing the underlying issue of mishandling of SOCKET, which really is not dispatch's fault.

Ah, yeah, I see what you mean now. This problem doesn't happen on other platforms because closing a file descriptor automatically removes it from epoll/kqueue, but we don't have any similar mechanism on Windows that I'm aware of. So I agree that the proper fix here is probably along the lines of ensuring that clients never close sockets before canceling the source because this is just a quirk of the Windows platform. In fact this patch could make it harder to find those issues because this crash won't happen anymore.

@triplef
Copy link
Contributor Author

triplef commented Jan 20, 2023

Thank you all for your insightful feedback on this. I can see the possible issue with socket handle reuse.

Unfortunately at least in the case of curl I don’t think this can be done properly at the moment. I’m not very familiar with curl, but it seems to me that in some cases one could probably use CURLOPT_CLOSESOCKETFUNCTION to move closing the socket into the client, and then wait for the dispatch cancellation handler to fire before closing the socket. But I found at least one case where the close callback is not being used (unless one builds curl with CURL_DISABLE_SOCKETPAIR).

I guess this would then be task in swift-corelibs-foundation (or GNUstep’s NSURLSession implementation, which is what we are using) to figure out how to fix the URLSession implementation e.g. by first patching curl as needed. For now we will probably just keep this patch locally as an "almost" workaround.

Please feel free to close this, or keep it open as a reminder of this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants