diff --git a/doc/yjit/yjit.md b/doc/yjit/yjit.md
index 3451c50220ddeb..e855137aab1163 100644
--- a/doc/yjit/yjit.md
+++ b/doc/yjit/yjit.md
@@ -4,22 +4,22 @@

-
 YJIT - Yet Another Ruby JIT
 ===========================
 YJIT is a lightweight, minimalistic Ruby JIT built inside CRuby.
 It lazily compiles code using a Basic Block Versioning (BBV) architecture.
-The target use case is that of servers running Ruby on Rails.
 YJIT is currently supported for macOS, Linux and BSD on x86-64 and arm64/aarch64 CPUs.
 This project is open source and falls under the same license as CRuby.

If you're using YJIT in production, please share your success stories with us! -

+

 If you wish to learn more about the approach taken, here are some conference talks and publications:
+- RubyKaigi 2023 keynote: [Optimizing YJIT’s Performance, from Inception to Production](https://www.youtube.com/watch?v=X0JRhh8w_4I)
+- RubyKaigi 2023 keynote: [Fitting Rust YJIT into CRuby](https://www.youtube.com/watch?v=GI7vvAgP_Qs)
 - RubyKaigi 2022 keynote: [Stories from developing YJIT](https://www.youtube.com/watch?v=EMchdR9C8XM)
 - RubyKaigi 2022 talk: [Building a Lightweight IR and Backend for YJIT](https://www.youtube.com/watch?v=BbLGqTxTRp0)
 - RubyKaigi 2021 talk: [YJIT: Building a New JIT Compiler Inside CRuby](https://www.youtube.com/watch?v=PBVLf3yfMs8)
@@ -55,8 +55,8 @@ series = {MPLR 2023}
 ## Current Limitations
-YJIT may not be suitable for certain applications. It currently only supports macOS and Linux on x86-64 and arm64/aarch64 CPUs. YJIT will use more memory than the Ruby interpreter because the JIT compiler needs to generate machine code in memory and maintain additional state information.
-You can change how much executable memory is allocated using [YJIT's command-line options](#command-line-options). There is a slight performance tradeoff because allocating less executable memory could result in the generated machine code being collected more often.
+YJIT may not be suitable for certain applications. It currently only supports macOS, Linux and BSD on x86-64 and arm64/aarch64 CPUs. YJIT will use more memory than the Ruby interpreter because the JIT compiler needs to generate machine code in memory and maintain additional state information.
+You can change how much executable memory is allocated using [YJIT's command-line options](#command-line-options).
 ## Installation
@@ -167,8 +167,9 @@ YJIT supports all command-line options supported by upstream CRuby, but also add
 - `--yjit`: enable YJIT (disabled by default)
 - `--yjit-call-threshold=N`: number of calls after which YJIT begins to compile a function (default 30)
 - `--yjit-cold-threshold=N`: number of global calls after which an ISEQ is considered cold and not
-compiled, lower values mean less code is compiled (default 200000)
+compiled, lower values mean less code is compiled (default 200K)
 - `--yjit-exec-mem-size=N`: size of the executable memory block to allocate, in MiB (default 64 MiB)
+- `--yjit-code-gc`: enable code GC (disabled by default as of Ruby 3.3)
 - `--yjit-stats`: print statistics after the execution of a program (incurs a run-time cost)
 - `--yjit-stats=quiet`: gather statistics while running a program but don't print them. Stats are accessible through `RubyVM::YJIT.runtime_stats`. (incurs a run-time cost)
 - `--yjit-trace-exits`: produce a Marshal dump of backtraces from specific exits. Automatically enables `--yjit-stats`
@@ -177,29 +178,43 @@ compiled, lower values mean less code is compiled (default 200000)
 Note that there is also an environment variable `RUBY_YJIT_ENABLE` which can be used to enable YJIT.
 This can be useful for some deployment scripts where specifying an extra command-line option to Ruby is not practical.
-You can verify that YJIT is enabled by checking that `ruby -v --yjit` includes the string `+YJIT`:
+You can also enable YJIT at run-time using `RubyVM::YJIT.enable`. This can allow you to enable YJIT after your application is done
+booting, which makes it possible to avoid compiling any initialization code.
+
+You can verify that YJIT is enabled using `RubyVM::YJIT.enabled?` or by checking that `ruby --yjit -v` includes the string `+YJIT`:
 ```sh
-ruby -v --yjit
+ruby --yjit -v
 ruby 3.3.0dev (2023-01-31T15:11:10Z master 2a0bf269c9) +YJIT dev [x86_64-darwin22]
+
+ruby --yjit -e "p RubyVM::YJIT.enabled?"
+true
+
+ruby -e "RubyVM::YJIT.enable; p RubyVM::YJIT.enabled?"
+true
 ```
 ### Benchmarking
-We have collected a set of benchmarks and implemented a simple benchmarking harness in the [yjit-bench](https://github.com/Shopify/yjit-bench) repository. This benchmarking harness is designed to disable CPU frequency scaling, set process affinity and disable address space randomization so that the variance between benchmarking runs will be as small as possible. Please kindly note that we are at an early stage in this project.
+We have collected a set of benchmarks and implemented a simple benchmarking harness in the [yjit-bench](https://github.com/Shopify/yjit-bench) repository. This benchmarking harness is designed to disable CPU frequency scaling, set process affinity and disable address space randomization so that the variance between benchmarking runs will be as small as possible.
 ## Performance Tips for Production Deployments
 While YJIT options default to what we think would work well for most workloads, they might not necessarily be the best configuration for your application.
-
 This section covers tips on improving YJIT performance in case YJIT does not speed up your application in production.
 ### Increasing --yjit-exec-mem-size
 When JIT code size (`RubyVM::YJIT.runtime_stats[:code_region_size]`) reaches this value,
-YJIT triggers "code GC" that frees all JIT code and starts recompiling everything.
+YJIT stops compiling new code. Increasing the executable memory size means more code
+can be optimized by YJIT, at the cost of more memory usage.
+
+Alternatively, you can enable `--yjit-code-gc`, which will cause all machine code to be
+discarded when the executable memory size limit is hit, meaning JIT compilation will
+then start over. This can allow you to use a lower executable memory size limit, but
+may cause a slight drop in performance when the limit is hit.
 Compiling code takes some time, so scheduling code GC too frequently slows down your application.
 Increasing `--yjit-exec-mem-size` may speed up your application if `RubyVM::YJIT.runtime_stats[:code_gc_count]` is not 0 or 1.
@@ -213,10 +228,9 @@ You should monitor the number of requests each process has served.
 If you're periodically killing worker processes, e.g. with `unicorn-worker-killer` or `puma_worker_killer`,
 you may want to reduce the killing frequency or increase the limit.
-## Saving YJIT Memory Usage
+## Reducing YJIT Memory Usage
 YJIT allocates memory for JIT code and metadata. Enabling YJIT generally results in more memory usage.
-
 This section goes over tips on minimizing YJIT memory usage in case it uses more than your capacity.
 ### Increasing --yjit-call-threshold
@@ -231,7 +245,7 @@ if each process only handles 1000 requests, `--yjit-call-threshold=1000` might b
 ### Decreasing --yjit-exec-mem-size
-`--yjit-exec-mem-size` specifies the JIT code size, but YJIT also uses memory for its metadata,
+The `--yjit-exec-mem-size` option specifies the JIT code size, but YJIT also uses memory for its metadata,
 which often consumes more memory than JIT code.
 Generally, YJIT adds memory overhead by roughly 3-4x of `--yjit-exec-mem-size` in production as of Ruby 3.2.
 You should multiply that by the number of worker processes to estimate the worst case memory overhead.
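
As a rough illustration of the estimate described in the hunk above, the arithmetic can be sketched in Ruby. The helper name `estimated_yjit_overhead_mib` and the worker count of 8 are hypothetical and chosen only for illustration. The 3-4x multiplier and the 64 MiB default are the figures quoted in the document, and actual overhead will vary by application.

```ruby
# Hypothetical helper (not a YJIT API): rough worst-case memory overhead,
# using the "roughly 3-4x of --yjit-exec-mem-size" rule of thumb from the doc.
def estimated_yjit_overhead_mib(exec_mem_size_mib:, workers:, multiplier: 3..4)
  low  = exec_mem_size_mib * multiplier.begin * workers
  high = exec_mem_size_mib * multiplier.end * workers
  low..high
end

# Assumed deployment: default 64 MiB executable memory, 8 worker processes.
range = estimated_yjit_overhead_mib(exec_mem_size_mib: 64, workers: 8)
puts "Estimated worst-case YJIT overhead: #{range.begin}-#{range.end} MiB"
# prints: Estimated worst-case YJIT overhead: 1536-2048 MiB
```

Halving `--yjit-exec-mem-size` halves both ends of this range, which is why the option is the first knob to try when memory is tight.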
@@ -248,7 +262,7 @@ This section contains tips on writing Ruby code that will run as fast as possibl
 - Avoid allocating objects in the hot parts of your code
 - Minimize layers of indirection
   - Avoid classes that wrap objects if you can
-  - Avoid methods that just call another method, trivial one liner methods
+  - Avoid methods that just call another method, trivial one-liner methods
 - Try to write code so that the same variables always have the same type
 - Use `while` loops if you can, instead of C methods like `Array#each`
   - This is not idiomatic Ruby, but could help in hot methods
@@ -258,10 +272,10 @@ You can also use the `--yjit-stats` command-line option to see which bytecodes c
 ### Other Statistics
-If you run `ruby` with `--yjit --yjit-stats`, YJIT will track and return performance statistics in `RubyVM::YJIT.runtime_stats`.
+If you run `ruby` with `--yjit-stats`, YJIT will track and return performance statistics in `RubyVM::YJIT.runtime_stats`.
 ```rb
-$ RUBYOPT="--yjit --yjit-stats" irb
+$ RUBYOPT="--yjit-stats" irb
 irb(main):001:0> RubyVM::YJIT.runtime_stats
 => {:inline_code_size=>340745,
@@ -288,25 +302,26 @@ Some of the counters include:
 * :total_exit_count - number of exits, including side exits, taken at runtime
 * :avg_len_in_yjit - avg. number of instructions in compiled blocks before exiting to interpreter
-Counters starting with "exit_" show reasons for YJIT code taking a side exit (return to the interpreter.) See yjit_hacking.md for more details.
+Counters starting with "exit_" show reasons for YJIT code taking a side exit (return to the interpreter.)
-Performance counter names are not guaranteed to remain the same between Ruby versions. If you're curious what one does, it's usually best to search the source code for it — but it may change in a later Ruby version.
+Performance counter names are not guaranteed to remain the same between Ruby versions. If you're curious what each counter means,
+it's usually best to search the source code for it — but it may change in a later Ruby version.
-The printed text after a --yjit-stats run includes other information that may be named differently than the information in runtime_stats.
+The printed text after a `--yjit-stats` run includes other information that may be named differently than the information in `RubyVM::YJIT.runtime_stats`.
 ## Contributing
-We welcome open source contributors. You should feel free to open new issues to report bugs or just to ask questions.
+We welcome open source contributions. You should feel free to open new issues to report bugs or just to ask questions.
 Suggestions on how to make this readme file more helpful for new contributors are most welcome.
 Bug fixes and bug reports are very valuable to us. If you find a bug in YJIT, it's very possible be that nobody has reported it before, or that we don't have a good reproduction for it, so please open an issue and provide as much information as you can about your configuration and a description of how you encountered the problem. List the commands you used to run YJIT so that we can easily reproduce the issue on our end and investigate it. If you are able to produce a small program reproducing the error to help us track it down, that is very much appreciated as well.
-If you would like to contribute a large patch to YJIT, we suggest opening an issue or a discussion on this repository so that
+If you would like to contribute a large patch to YJIT, we suggest opening an issue or a discussion on the [Shopify/ruby repository](https://github.com/Shopify/ruby/issues) so that
 we can have an active discussion. A common problem is that sometimes people submit large pull requests to open source projects without prior communication, and we have to reject them because the work they implemented does not fit within the design of the
-project. We want to save you time and frustration, so please reach out and we can have a productive discussion as to how
-you can contribute things we will want to merge into YJIT.
+project. We want to save you time and frustration, so please reach out so we can have a productive discussion as to how
+you can contribute patches we will want to merge into YJIT.
 ### Source Code Organization
@@ -319,8 +334,8 @@ The YJIT source code is divided between:
 - `yjit/src/core.rb`: basic block versioning logic, core structure of YJIT
 - `yjit/src/stats.rs`: gathering of run-time statistics
 - `yjit/src/options.rs`: handling of command-line options
-- `yjit/bindgen/src/main.rs`: C bindings exposed to the Rust codebase through bindgen
 - `yjit/src/cruby.rs`: C bindings manually exposed to the Rust codebase
+- `yjit/bindgen/src/main.rs`: C bindings exposed to the Rust codebase through bindgen
 The core of CRuby's interpreter logic is found in:
 - `insns.def`: defines Ruby's bytecode instructions (gets compiled into `vm.inc`)
diff --git a/yjit/src/stats.rs b/yjit/src/stats.rs
index 74443edea45f22..769e2d78e2dcf6 100644
--- a/yjit/src/stats.rs
+++ b/yjit/src/stats.rs
@@ -750,8 +750,6 @@ fn rb_yjit_gen_stats_dict(context: bool) -> VALUE {
             for (name, idx) in cfunc_name_to_idx {
                 let count = call_counts[*idx];
-                println!("{}: {}", name, count);
-
                 let key = rust_str_to_sym(name);
                 let value = VALUE::fixnum_from_usize(count as usize);
                 rb_hash_aset(calls_hash, key, value);
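
The run-time checks that the yjit.md changes above document (`RubyVM::YJIT.enabled?` and `RubyVM::YJIT.runtime_stats`) can be combined into a small status script. This is a minimal sketch rather than part of the patch: the helper name `yjit_status` and the shape of the returned hash are hypothetical, and it assumes only that `RubyVM::YJIT` is undefined on Ruby builds without YJIT.

```ruby
# Hypothetical helper (not part of YJIT): report whether YJIT is available
# and enabled in the current process, plus two counters named in the doc.
def yjit_status
  # RubyVM::YJIT is only defined on CRuby builds that include YJIT.
  return { available: false, enabled: false } unless defined?(RubyVM::YJIT)

  status = { available: true, enabled: RubyVM::YJIT.enabled? }

  if status[:enabled]
    stats = RubyVM::YJIT.runtime_stats
    # Counter names are not guaranteed to be stable across Ruby versions,
    # so a missing key simply yields nil here.
    if stats.is_a?(Hash)
      status[:code_region_size] = stats[:code_region_size]
      status[:code_gc_count]    = stats[:code_gc_count]
    end
  end

  status
end

p yjit_status
```

Run with `ruby --yjit`, the hash should report `enabled: true`. On Ruby 3.3 and later, calling `RubyVM::YJIT.enable` after boot and then calling `yjit_status` again is one way to confirm the run-time switch described in the documentation change.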