Ensure VM is running in signal handler #200
8ebeff1 to b9e762e
Since Is there any other public API that I can use instead to ensure the VM is running? Otherwise, I’ll plunge forward with detecting the various names of this symbol across Ruby versions.
Perhaps adding an
On Ruby 2.4 and before, it should be called
That may help, and is probably a good change in its own right. I would still like to add this guard, though, since it improves the async-signal safety of the handler, which is tricky to reason about.
I think stackprof should probably do this by default. Either that, or its callees should check for the existence of the VM. We could check for the VM in stackprof, but that seems kind of fraught: the global is private, which means it can be renamed at any time, and the compiler is free to do whatever it wants with the symbol name.
Yeah, I have similar feelings, and I'm very open to other ways of solving this. Maybe a null check in
No, I don't think that will solve the underlying problem, since the code in stackprof that is crashing will then instead crash a few lines down, in places like stackprof/ext/stackprof/stackprof.c, lines 745 to 764 in 52d1df6.
I think you are right. A good stopgap may be to register a new
b9e762e to 455d76a
ruby_current_vm_ptr is non-null in signal handler
@tenderlove @peterzhu2118 Thoughts on my updates?
@ianks lgtm! Thanks for the patch!
FWIW, I think the failing test may be flaky? I noticed it failing on
Ya, it's flaky. I'll merge this and ship it. I think we're trying to measure some GC stuff in the tests and it's not predictable.
As of a few weeks ago, there's been an increase in SIGABRTs in prod for us. After hunting the issue down a bit with @peterzhu2118, we homed in on a case where stackprof is signaled and ruby_current_vm_ptr is null. This seems to be happening after SIGQUIT'ing a unicorn worker during rb_during_gc.

Since signal safety is hard to reason about, let's add a sanity check and not assume the Ruby VM is active just because the handler is called.
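
For readers skimming the thread, here is a minimal, standalone sketch of the kind of guard described above. It is not the code from this patch: example_vm_ptr is a hypothetical stand-in for Ruby's private ruby_current_vm_ptr global, and example_samples_taken, example_profile_signal_handler, and example_install_handler are names made up for illustration. The sketch assumes a single pointer read is safe enough inside the handler.

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for Ruby's private ruby_current_vm_ptr global.
 * In this sketch it is NULL whenever the "VM" is not running. */
static void *volatile example_vm_ptr = NULL;

/* Number of samples recorded; sig_atomic_t may be written from a handler. */
static volatile sig_atomic_t example_samples_taken = 0;

/* Illustrative profiling signal handler. It does not assume the VM is
 * alive just because the signal arrived: if the pointer is NULL, it
 * returns before touching any VM state. */
static void example_profile_signal_handler(int sig)
{
    (void)sig;

    if (example_vm_ptr == NULL) {
        return; /* VM not running (for example, during shutdown): skip */
    }

    /* A real profiler would record a sample here; we only count it. */
    example_samples_taken++;
}

/* Installs the handler for SIGTERM (chosen arbitrarily for this sketch).
 * Returns 0 on success and -1 on failure. */
int example_install_handler(void)
{
    if (signal(SIGTERM, example_profile_signal_handler) == SIG_ERR) {
        return -1;
    }
    return 0;
}
```

With the pointer left NULL, raising the signal records nothing; once the pointer is set, each delivered signal records one sample. The guard only narrows the window for the crash; it does not make the rest of the handler async-signal-safe.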