Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error reporting difference between Sync{} and Async{}.wait #299

Open
codekitchen opened this issue Jan 4, 2024 · 6 comments
Open

Error reporting difference between Sync{} and Async{}.wait #299

codekitchen opened this issue Jan 4, 2024 · 6 comments

Comments

@codekitchen
Copy link

Hi, I recently tracked down a tricky silent failure related to switching from Async to Sync at the top level, as the best practices doc suggests. An exception in the top-level task was being silently swallowed and not reported, so it took a long time to track down why the service was sometimes not behaving. Here is a minimal reproduction:

require 'async'
Sync {
  Async { loop { sleep 1 } }
  raise "whoops"
}

This code just hangs indefinitely without logging the unhandled exception. This code is silly of course, but hopefully you can see how this could arise in practice -- in the actual application, the root task reads input and passes a queue to each long-lived async sub-task to communicate back to the root task. If the root task raises a StandardError due to a bug, it's dead, but the reactor is still waiting on its sub-tasks which are running in a loop so nothing is ever logged or otherwise reported.

But what's interesting is that if you change the top-level Sync{} to Async{}.wait, then the exception is logged in the normal Task may have ended with unhandled exception. way, though it still hangs indefinitely afterwards. But that logging would've been key here. It's not clear to me just looking at the code for the Async and Sync methods why this behavior might be different, I'd need to look at the Reactor impl more closely.

In hindsight, I likely should refactor to have a top-level supervisor task which is the parent of the current root task, since that one is doing a lot of actual I/O work, but this still seems worth probably addressing.

Marginally related, I really wish async didn't just log StandardError failures and continue on, I prefer hard failures on all unhandled exceptions. It'd be nice if the async library had a way to bubble up all unhandled exceptions or register a callback for unhandled errors or something. But that seems like a separate discussion that I should open at some point once I've thought it through further.

@codekitchen
Copy link
Author

Looking a bit deeper, this is because Sync when not already in a task sets up a finished Condition and calls reactor.run().wait vs just reactor.run(), which makes total sense. The current error handling behavior logs at the debug level when the task is being waited on, so if I extend my previous example to call Console.logger.debug! then the error does get logged out.

This seems like an unfortunate difference to me between Sync and Async at the top level, but I'm not sure what a reasonable fix would be. Possibly modify the error handling behavior to still log at warning level if the task is the root task? But that feels maybe too special-casey.

@codekitchen
Copy link
Author

I'm familiarizing myself with the implementation of async a bit more, and realizing that my earlier analysis is a little off. In top-level calls to both Async and Sync, the reactor.run call doesn't return to the caller until the reactor shuts down, so this isn't related to the subsequent call to #wait. The difference is caused by Sync passing an explicit finished: ::Async::Condition.new to the top-level task, which causes the internal task code to see somebody as waiting on the task result. I can trigger the same behavior with Async(finished: ::Async::Condition.new) {...}

I don't know the intent here in setting an explicit finished arg but it seems possibly related to letting the internal task code know that since this is a Sync call, conceptually there's somebody already waiting on the result.

Anyway, I'm glad I understand it more fully now. But I'm not sure that helped steer me closer to what should change here -- assuming the maintainers agree something should change. I do think it'd be beneficial to not have this logging gotcha, though.

@jscheid
Copy link

jscheid commented Nov 21, 2024

I'm also encountering this issue but the other way around: top-level Async { ... }.wait is logging any exception noisily, which seems unnecessary because the wait will re-throw it anyway.

I could be missing something but it feels to me that the exception should only be logged when nothing is awaiting on it. Wouldn't the code be able to detect that?

@ioquatix
Copy link
Member

ioquatix commented Nov 22, 2024

Sorry, I never followed up this discussion/issue.

I have experimented, mostly unsuccessfully, to try and improve this code/behaviour.

In the past, I experimented with removing it entirely. It was such a bad developer experience - stuff would fail and you'd have no idea.

When I first created this, I was strongly in favour of logging all failures - i.e. silent failures should never happen.

But you are absolutely right: Async{}.wait will propagate the error, but there is no way to know that wait will be called until AFTER Async{} is done. Therefore, sometimes we log an error even if it's propagated (which can be confusing).

When I thought about introducing Sync, I was considering the above usage in terms of error handling and error logging.

With that in mind:

require 'async'
Sync {
  Async { loop { sleep 1 } }
  raise "whoops"
}

I wonder if we could make this code immediately exit the event loop. In other words, a top level Sync block could exit if it fails, in order to preserve the semantics of Sync.

@codekitchen
Copy link
Author

I wonder if we could make this code immediately exit the event loop. In other words, a top level Sync block could exit if it fails, in order to preserve the semantics of Sync.

That sounds like a great solution. Preferably, the top level Sync would re-raise the exception as it exits?

@jscheid
Copy link

jscheid commented Nov 22, 2024

I also like that idea.

What do you think about top-level Async (not wrapped in Sync) to re-raise as well, even if not being awaited on?
I feel that doing that, instead of logging and otherwise ignoring the error, would both be safer and less surprising behavior, and it would solve the noise issue.

In other words, never log and always re-throw at the (blocking) top level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants