
WIP - Reentrant concurrent snoopi_deep profiles. #309

Closed
wants to merge 8 commits

Conversation

NHDaly (Collaborator) commented Oct 20, 2022

NOTE: It turns out this PR isn't mergeable as-is, since it depends on assumptions in Julia that aren't valid.
Namely, we need JuliaLang/julia#47258 and JuliaLang/julia#47260 to avoid data races, but it turns out those don't make sense against julia master/1.9 because of recent changes to the inference lock.

At RAI, we have backported those two PRs against 1.8.2, and we're going to keep hacking on this branch to pull it into our internal build. We will follow up with actual changes to julia 1.9 and SnoopCompile to try to make this work for real.


Following option 3 from this doc:
https://docs.google.com/document/d/1JO43w3V1Cwtb2CxWf4sYfAB68xbkTrCMr973iaJOFwY/edit#

This is a WIP PR to support invoking @snoopi_deep concurrently and in parallel from multiple Tasks.
(It currently requires JuliaLang/julia#47258 to avoid data races, and JuliaLang/julia#47260 to allow reentrant inference timing.)

The idea is basically to share the global julia type-inference profile buffer, rather than clearing it at the start of every profile, keeping track of the start and stop index for each invocation of @snoopi_deep. Then we also trim the beginning of the profile buffer whenever the oldest invocation ends, to ensure we aren't leaking memory.
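
A minimal, hypothetical sketch of that bookkeeping, to make the scheme concrete (all of these names and helpers are invented for illustration; they are not the actual SnoopCompileCore internals):

```julia
# Hypothetical sketch only -- invented names, not the SnoopCompileCore internals.
const PROFILE_BUFFER = Any[]            # shared inference-timing buffer; never cleared per profile
const ACTIVE_STARTS  = Dict{Int,Int}()  # profile id => start index into PROFILE_BUFFER
const BUFFER_LOCK    = ReentrantLock()
const NEXT_ID        = Ref(0)

function start_profile!()
    lock(BUFFER_LOCK) do
        id = (NEXT_ID[] += 1)
        ACTIVE_STARTS[id] = length(PROFILE_BUFFER) + 1
        return id
    end
end

function stop_profile!(id::Int)
    lock(BUFFER_LOCK) do
        start = pop!(ACTIVE_STARTS, id)
        result = PROFILE_BUFFER[start:end]  # this invocation's slice of the shared buffer
        # Trim the prefix that no still-active invocation needs, so we don't leak memory.
        oldest = isempty(ACTIVE_STARTS) ? length(PROFILE_BUFFER) + 1 :
                 minimum(values(ACTIVE_STARTS))
        deleteat!(PROFILE_BUFFER, 1:oldest-1)
        # Re-base the surviving start indices after the trim.
        for (k, v) in ACTIVE_STARTS
            ACTIVE_STARTS[k] = v - (oldest - 1)
        end
        return result
    end
end
```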

NHDaly (Collaborator, Author) commented Oct 20, 2022

Note that of course one must be careful when interpreting the results of concurrently running profiles, since inference caused by invocation B will also show up in the profile for invocation A. But this is true of all of julia's profiling and timing techniques right now, so it's just something the developer must be aware of. (Compare it with @time from multiple tasks reporting allocations caused by other tasks, or reporting compilation time from other tasks.)
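
A hypothetical illustration of that caveat, assuming this PR's reentrancy support (stock SnoopCompile does not support concurrent @snoopi_deep invocations); `foo1`/`foo2` are throwaway example definitions:

```julia
using SnoopCompile

# Two overlapping profile windows on separate tasks. Inference triggered by
# either task can show up in any profile whose window is open at the time.
t1 = Threads.@spawn SnoopCompile.@snoopi_deep @eval begin
    foo1(x) = x + 1
    foo1(1)        # inference of foo1 lands in this profile...
    sleep(0.2)     # ...and t2's inference may land here too, since the windows overlap
end
t2 = Threads.@spawn SnoopCompile.@snoopi_deep @eval begin
    foo2(x) = x + 2
    foo2(1)        # may also appear in t1's profile
end
prof1, prof2 = fetch(t1), fetch(t2)
```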

codecov bot commented Oct 20, 2022

Codecov Report

Base: 85.47% // Head: 17.39% // Decreases project coverage by -68.08% ⚠️

Coverage data is based on head (f86ce5b) compared to base (ad792af).
Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #309       +/-   ##
===========================================
- Coverage   85.47%   17.39%   -68.09%     
===========================================
  Files          17       16        -1     
  Lines        2162     1903      -259     
===========================================
- Hits         1848      331     -1517     
- Misses        314     1572     +1258     
Impacted Files Coverage Δ
SnoopCompileCore/src/snoopi_deep.jl 0.00% <0.00%> (-92.00%) ⬇️
SnoopCompileCore/src/snoopl.jl 0.00% <0.00%> (-100.00%) ⬇️
src/parcel_snoopl.jl 0.00% <0.00%> (-90.00%) ⬇️
src/parcel_snoopi_deep.jl 0.00% <0.00%> (-89.91%) ⬇️
src/invalidation_and_inference.jl 0.00% <0.00%> (-89.66%) ⬇️
src/invalidations.jl 0.00% <0.00%> (-83.92%) ⬇️
src/deep_demos.jl 0.00% <0.00%> (-58.98%) ⬇️
src/SnoopCompile.jl 66.66% <0.00%> (-33.34%) ⬇️
SnoopPrecompile/src/SnoopPrecompile.jl 56.25% <0.00%> (-31.25%) ⬇️
... and 7 more


NHDaly (Collaborator, Author) commented Oct 21, 2022

Update: it turns out that JuliaLang/julia#47258 and JuliaLang/julia#47260 are no longer valid, now that the inference lock is being shrunk (which is 🎉 in general, of course!).

So this PR will have to wait, because we'll probably need to rethink the inference profiler entirely, given that it currently assumes single-threaded, locked access to the inference profile buffer. :/

NHDaly (Collaborator, Author) commented Oct 25, 2022

CC: @Sacha0 for review plz

@@ -863,6 +863,7 @@ end
# pgdsgui(axs[2], rit; bystr="Inclusive", consts=true, interactive=false)
end



Snuck in? :)

Suggested change

_name(frame::SnoopCompileCore.InferenceTiming) = frame.mi_info.mi.def.name

@testset "reentrant concurrent profiles 1 - overlap" begin
# Warmup

Perhaps something like the following? :)

Suggested change
# Warmup
# Warmup to prevent, e.g., `+` from appearing in the profiles; only `ROOT` and `foo*` should appear.


(Likewise on the warmup comments below if you agree.)

@eval foo1(x) = x+2
@eval foo1(2)

# Test:

That this works is thoroughly pleasing.

Comment on lines +1067 to +1070
sleep((i-1) / 10) # (Divide by 10 so the test isn't too slow)
snoop_time = @timed SnoopCompile.@snoopi_deep @eval begin
$(Symbol("foo$i"))(x) = x + 1
sleep(1.5 / 10)

I hope that these delays are large enough --- and think they should be by orders of magnitude --- to prevent nondeterministic failures due to jitter. But I've found the machines are just lying in wait for the moment I think something like that. Then 🦈 🩸.

# Test that the inclusive time (the total time reported by snoopi_deep) matches
# the actual time to do the snoopi_deep, as measured by `@time`.
# These should both be approximately ~0.15 seconds.
@info prof.mi_timing.inclusive_time

Debugging straggler? :)

@testset for i in eachindex(profs)
prof = profs[i]
# Test that the time for the inference is accounted for
@test 0.15 < prof.mi_timing.exclusive_time

I fear the definitions of these times have already become hazy to me 😅. This one? 🤔 How do we know that the exclusive rather than inclusive time should be over 0.15 seconds?

Comment on lines +127 to +128
accum_root_time = time_ns() - inference_root_timing.cur_start_time  # time accrued in the still-open ROOT span
current_root_time = inference_root_timing.time + accum_root_time    # total ROOT time so far, including the open span

I'm a bit hazy on these details by now but trust that it is correct 👍

NHDaly (Collaborator, Author) commented Dec 9, 2022

Thanks again for the review, Sacha. I'm closing this for now, since we ended up going with a different approach, via JuliaLang/julia#47615 and #313.

@NHDaly NHDaly closed this Dec 9, 2022
@NHDaly NHDaly deleted the nhd-snoopi_deep-reentrant branch December 9, 2022 01:21