Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in fallback path for catching logging errors #3012

Closed
danielwe opened this issue Aug 25, 2024 · 6 comments · Fixed by #3013
Closed

Error in fallback path for catching logging errors #3012

danielwe opened this issue Aug 25, 2024 · 6 comments · Fixed by #3013

Comments

@danielwe
Copy link
Contributor

GPUCompiler.jl implements slightly tweaked loggers for safe use in generated functions:

https://github.com/JuliaGPU/GPUCompiler.jl/blob/27d4b190243f9a4440e8c0bfda876af89dd1140e/src/utils.jl#L55-L73

Pluto.jl fails to display these logs, instead erroring with ArgumentError: Base.TTY(RawFD(4294967295) invalid status, 0 bytes waiting) is not initialized.

MWE notebook (this does not require a GPU, GPUCompiler is only used via Enzyme): https://gist.github.com/danielwe/c30299b9c11bb0cc5ce182913b640da3

In the REPL this emits a warning:

julia> include("pluto_gpucompiler_error.jl");
┌ Warning: Using fallback BLAS replacements for (["drot_64_"]), performance may be degraded
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
dx = [-0.8660254037844386]

In Pluto it errors:

Screencast.from.08-25-2024.11.33.19.AM.webm
@wsmoses
Copy link

wsmoses commented Aug 25, 2024

cc @vchuravy didn't michel or someone else hit this?

@danielwe
Copy link
Contributor Author

The displayed error comes from printing to stderr in the catch block of handle_message, here:

println(original_stderr, "Failed to relay log from PlutoRunner")

So I guess there's actually two issues here, one from failing to display the log, sending us to the catch block, and one from failing to print to stderr in the catch block

@danielwe
Copy link
Contributor Author

danielwe commented Aug 26, 2024

As an experiment, I inserted error() inside the try block in handle_message, such that every logging attempt ends up in the catch block. The result is that any attempt at logging inside a notebook, like @info "hello", produces the same error as above: ArgumentError: Base.TTY(RawFD(4294967295) invalid status, 0 bytes waiting) is not initialized.

In other words, the catch block is broken independently of GPUCompiler. The const original_stderr = stderr apparently isn't a valid, writable IO when handle_message is called.

Then there's the more fundamental issue of why GPUCompiler logs end up in the catch block in the first place. But fixing the catch block would make this a lot easier to debug.

@danielwe
Copy link
Contributor Author

The incompatibility with GPUCompiler logs is due to the yield() calls around the logger redirection, introduced in #2026. Removing these fixes the problem. The point of GPUCompiler's special logging macros is to avoid task switches, so explicitly yielding in handle_message defeats their purpose.

Not sure how to reconcile the need to avoid task switching here with the reason for introducing #2026 in the first place.

The issue with the catch block seems to go away if original_stderr is replaced with stderr. The error is then printed in the REPL where Pluto was started, which I assume was the intention. (This is if you land in the catch block from an ordinary log. If you get there from a GPUCompiler log you still get an error, because println contains yield points... but fixing this corner case seems like a low priority, this should occur rarely if ever once the main issue is fixed.)

@danielwe
Copy link
Contributor Author

Actually, that last parenthetical issue can be fixed by using Core.stderr instead of stderr. Prints to the former are apparently yield-free. To implement this with color if available, leave the catch block as-is and change the definitions of original_std*** to the following:

const original_stdout = IOContext(Core.stdout, :color => get(stdout, :color, false))
const original_stderr = IOContext(Core.stderr, :color => get(stderr, :color, false))

@danielwe danielwe changed the title Error when displaying log messages from GPUCompiler Error in fallback path for catching logging errors Sep 4, 2024
@danielwe
Copy link
Contributor Author

danielwe commented Sep 4, 2024

Since JuliaGPU/GPUCompiler.jl#622 was merged and released, GPUCompiler logs don't crash anymore, so I renamed this issue to refer specifically to the error in the logging fallback path which is fixed by #3013

@fonsp fonsp linked a pull request Sep 17, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants