Rewrite msvc backtrace support to be much faster on 64-bit platforms
Currently, capturing the stack backtrace is done on Windows by calling into `dbghelp!StackWalkEx` (or `dbghelp!StackWalk64` if the version of `dbghelp` we loaded is too old to contain that function). This is very convenient since `StackWalkEx` handles everything for us, but there are two issues with doing so:

1. `dbghelp` is not safe to use from multiple threads at the same time, so all calls into it must be serialized.
2. `StackWalkEx` returns inlined frames as if they were regular stack frames, which requires loading debug info just to walk the stack.

As a result, simply capturing a backtrace without resolving it is much more expensive on Windows than on *nix.

This change rewrites our Windows support to call `RtlVirtualUnwind` instead on platforms which support this API (`x86_64` and `aarch64`). This API walks the actual (i.e., not inlined) stack frames, so it does not require loading any debug info and is significantly faster. For platforms that do not support `RtlVirtualUnwind` (i.e., `i686`), we fall back to the current implementation, which calls into `dbghelp`.

To recover the inlined frame information when we are asked to resolve symbols, we use `SymAddrIncludeInlineTrace` to load debug info and detect inlined frames, and then `SymQueryInlineTrace` to get the appropriate inline context to resolve them.

The result is significant performance improvements to backtrace capture and symbolizing on Windows!

Before:

```
> cargo +nightly bench
     Running benches\benchmarks.rs

running 6 tests
test new                                 ... bench:     658,652 ns/iter (+/- 30,741)
test new_unresolved                      ... bench:     343,240 ns/iter (+/- 13,108)
test new_unresolved_and_resolve_separate ... bench:     648,890 ns/iter (+/- 31,651)
test trace                               ... bench:     304,815 ns/iter (+/- 19,633)
test trace_and_resolve_callback          ... bench:     463,645 ns/iter (+/- 12,893)
test trace_and_resolve_separate          ... bench:     474,290 ns/iter (+/- 73,858)

test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out; finished in 8.26s
```

After:

```
> cargo +nightly bench
     Running benches\benchmarks.rs

running 6 tests
test new                                 ... bench:     495,468 ns/iter (+/- 31,215)
test new_unresolved                      ... bench:       1,241 ns/iter (+/- 251)
test new_unresolved_and_resolve_separate ... bench:     436,730 ns/iter (+/- 32,482)
test trace                               ... bench:         850 ns/iter (+/- 162)
test trace_and_resolve_callback          ... bench:     410,790 ns/iter (+/- 19,424)
test trace_and_resolve_separate          ... bench:     408,090 ns/iter (+/- 29,324)

test result: ok. 0 passed; 0 failed; 0 ignored; 6 measured; 0 filtered out; finished in 7.02s
```

The changes to the symbolize step also allow us to report inlined frames when resolving from just an instruction address, which was not previously possible.
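The per-architecture choice described in this message can be sketched roughly as follows. This is only an illustration, not the code from this commit: the `StackWalker` enum and both helper functions are hypothetical names invented here, and the sketch assumes the selection depends solely on the target architecture. It does not call any Windows API, so it compiles on any platform.

```rust
/// Hypothetical enum naming the two stack-walking strategies from the
/// commit message. It is not a type from the actual crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StackWalker {
    /// Walk the actual (non-inlined) frames with `RtlVirtualUnwind`;
    /// no debug info has to be loaded just to capture a trace.
    RtlVirtualUnwind,
    /// Fall back to `dbghelp` (`StackWalkEx` / `StackWalk64`), whose
    /// calls must be serialized across threads.
    DbgHelp,
}

/// Hypothetical helper: choose a strategy from a Rust `target_arch`
/// string. Note that `i686` targets report `target_arch = "x86"`.
fn walker_for_arch(target_arch: &str) -> StackWalker {
    match target_arch {
        // Architectures the commit message lists as supporting the API.
        "x86_64" | "aarch64" => StackWalker::RtlVirtualUnwind,
        // Everything else (e.g. "x86") keeps the existing implementation.
        _ => StackWalker::DbgHelp,
    }
}

/// Hypothetical helper: the same decision made at compile time for the
/// architecture this code is being built for.
fn walker_for_current_target() -> StackWalker {
    if cfg!(any(target_arch = "x86_64", target_arch = "aarch64")) {
        StackWalker::RtlVirtualUnwind
    } else {
        StackWalker::DbgHelp
    }
}
```

For example, `walker_for_arch("x86")` yields `StackWalker::DbgHelp`, matching the `i686` fallback described above. In real code the decision would more likely be made with `#[cfg(...)]` attributes on separate modules, so that only one implementation is compiled at all.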
1 parent 99faef8, commit 2647b90
Showing 6 changed files with 275 additions and 210 deletions.