Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new, statically typed, JIT IR. #943

Merged
merged 1 commit into from
Jan 30, 2024
Merged

Conversation

ptersilie
Copy link
Contributor

This reworks the JIT IR, incorporating the wisdom we've absorbed through studying PyPy and LuaJIT's JIT IRs.

The JIT IR is now a vector of word-sized instructions that can be matched and destructured using Rust's type system. This differs slightly from PyPy and LuaJIT, which use bytestreams, but we want static type-safety.

Any information that can't fit inline in the instruction word (e.g. operands and their types) are referenced using indices into an auxiliary vector containing the overflow information. This idea was borrowed from LuaJIT.

Indices are all u16, and thus can only represent a limited number of elements. This is fine, as we don't expect traces to be so large that the indices grow beyond what a u16 can represent. If they do we can just abort the trace and remain sound.

A note on mutability: you may mutate the instruction stream so long as you don't cause any index skew. It's OK to replace/mutate an instruction, but you must NOT remove instructions.

The IR is not finished and has a lot of dead code, hence we mark the whole module #![allow(dead_code)]. This will eventually be deleted.

Probable follow up PRs:

  • propagate errors when indices overflow (and test).
  • implement missing operand and instruction types (get enough in to get a simple test loop to translate to JIT IR).
  • generate code :)

@ltratt
Copy link
Contributor

ltratt commented Jan 25, 2024

A note on mutability: you may mutate the instruction stream so long as you don't cause any index skew. It's OK to replace/mutate an instruction, but you must NOT remove instructions.

A thought. Maybe (but not for this PR at least!) we should wrap the Vec of instructions with a new type that doesn't expose instruction deletion? Obviously we'd still let trusted code get access to the underlying vec, but that would make it harder for our future selves to be careless and use remove or friends without thinking a bit about it.

@ptersilie
Copy link
Contributor Author

ptersilie commented Jan 25, 2024

A note on mutability: you may mutate the instruction stream so long as you don't cause any index skew. It's OK to replace/mutate an instruction, but you must NOT remove instructions.

A thought. Maybe (but not for this PR at least!) we should wrap the Vec of instructions with a new type that doesn't expose instruction deletion? Obviously we'd still let trusted code get access to the underlying vec, but that would make it harder for our future selves to be careless and use remove or friends without thinking a bit about it.

Yes, we've thought about that too. I believe we have this already, since the JIT Module fields are private, so the only access to the vec is via accessor functions. Obviously, the jit_ir.rs file can still access the underlying vec directly. So if you want to avoid that too, then your solution would fix that.

ykrt/src/compile/jitc_yk/aot_ir.rs Outdated Show resolved Hide resolved
/// For a call instruction, return the callee.
///
/// Panics if the instruction isn't a call.
pub(crate) fn get_callee(&self, m: &'a Module) -> &'a Function {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_ is not very Rust-like and we should avoid it when possible: callee would be fine here I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4b6a16b.

Other "gets" fixed in:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the gets but I'm not a fan of Deref (#943 (comment)) which isn't used for these sorts of purposes - we should have a to_u16 (or similar) method IMHO to be Rusty. https://doc.rust-lang.org/std/ops/trait.Deref.html's documentation is worth a read:

Warning: Deref coercion is a powerful language feature which has far-reaching implications for every type that implements Deref. The compiler will silently insert calls to Deref::deref. For this reason, one should be careful about implementing Deref and only do so when deref coercion is desirable. See below for advice on when this is typically desirable or undesirable.

Types that implement Deref or DerefMut are often called “smart pointers” and the mechanism of deref coercion has been specifically designed to facilitate the pointer-like behaviour that name suggests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right.

FWIW, I did test the coercion cases I cared about and they didn't seem to work (in a bad way).

Nonetheless, I'm OK with to_u16() and/or to_usize() depending on use case, so let's do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to get rid of all the Derefs and replace them with explicit methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've killed the Derefs that were added in this PR in d1ddec8.

More pre-existing ones to kill in a future PR (I now have a list of follow-up PRs in my notebook).

ykrt/src/compile/jitc_yk/aot_ir.rs Outdated Show resolved Hide resolved
ykrt/src/compile/jitc_yk/aot_ir.rs Outdated Show resolved Hide resolved
ykrt/src/compile/jitc_yk/jit_ir.rs Outdated Show resolved Hide resolved
ykrt/src/compile/jitc_yk/jit_ir.rs Outdated Show resolved Hide resolved
ykrt/src/compile/jitc_yk/jit_ir.rs Show resolved Hide resolved
ykrt/src/compile/jitc_yk/jit_ir.rs Show resolved Hide resolved
ykrt/src/compile/jitc_yk/jit_ir.rs Outdated Show resolved Hide resolved
ykrt/src/compile/jitc_yk/trace_builder.rs Outdated Show resolved Hide resolved
@vext01
Copy link
Contributor

vext01 commented Jan 25, 2024

Obviously, the jit_ir.rs file can still access the underlying vec directly. So if you want to avoid that too, then your solution would fix that.

The module can always see in innards of things in the same module, so I think what we have is the best you will get.

@ltratt
Copy link
Contributor

ltratt commented Jan 25, 2024

The module can always see in innards of things in the same module, so I think what we have is the best you will get.

This isn't about enforcing visibility, per se, but making it clearer to our future selves that we really really don't want to use remove and friends (even though they can bypass our wrapper if they want). Anyway, not a big deal, and not a worry for this PR, but maybe something to think about in the future.

@ltratt
Copy link
Contributor

ltratt commented Jan 25, 2024

Overall this looks like a real improvement over #930 -- excellent! My detailed comments have, I think, covered pretty much all the high level points I might pull out:

  • Using ty and (I think) tyidx consistently would help readability a lot.
  • I'm not a big fan of "lookup and insert if not found" functions, because (as we've seen) we end up not really knowing what to do in the "not found" case: clone or ... ? Something like HashMaps Entry API might provide inspiration, but even that might not be ideal.
  • get_ function names are generally not Rust-y (with some limited exceptions).

@vext01
Copy link
Contributor

vext01 commented Jan 29, 2024

Most of the commits are referenced in replies to your comments. From memory the other stuff you may wish to additionally review (that wasn't obsoleted by a later change):

Sorry this is a bit hard to review. It's been one of those where you make a small change and then other things that really "should be fixed now (tm)" come up.

Stuff for future PRs that I held back on for fear of overloading the reviewer further:

  • Add remaining index types to AOT IR where there are currently type-unsafe usizes.
  • Audit AOT IR for ty and ty_idx consistency.
  • error propagation.
  • Encapsulate instruction vector in a new type to communicate that you can't remove from it.
  • clippy

@ltratt
Copy link
Contributor

ltratt commented Jan 29, 2024

Broadly speaking, this is definitely going in the right direction! I have left some comments (including in commits), but mostly nothing major (the Deref comment aside).

@vext01
Copy link
Contributor

vext01 commented Jan 30, 2024

I think I've addressed everything now.

There will be follow up PRs.

@ltratt
Copy link
Contributor

ltratt commented Jan 30, 2024

Please squash.

This reworks the JIT IR, incorporating the wisdom we've absorbed through
studying PyPy and LuaJIT's JIT IRs.

The JIT IR is now a vector of word-sized instructions that can be
matched and destructured using Rust's type system. This differs slightly
from PyPy and LuaJIT, which use bytestreams, but we want static
type-safety.

Any information that can't fit inline in the instruction word (e.g.
operands and their types) are referenced using indices into an auxiliary
vector containing the overflow information. This idea was borrowed from
LuaJIT.

Some of the auxiliary vectors are stored in the AOT module, thus sharing
constructs with the AOT IR, for example, functions and types. Indices
for these vectors are 24-bit. Indices for the other auxiliary vectors
are 16-bit and for these there is (currently) no sharing with the AOT
module.

Obviously the reduced-size indices mean that we can only represent a
limited number of vector elements. This is fine, as we don't expect
traces to be so large that the indices overflow and if they do we can
just abort the trace and remain sound.

A note on mutability: you may mutate the instruction stream so long as
you don't cause any index skew. It's OK to replace/mutate an
instruction, but you must NOT remove instructions.

The AOT IR remains immutable.

The IR is not finished and has a lot of dead code, hence we mark the
whole module `#![allow(dead_code)]`. This will eventually be deleted.

Co-authored-by: Edd Barrett <[email protected]>
@vext01
Copy link
Contributor

vext01 commented Jan 30, 2024

squashed and updated the commit message to reflect the sharing changes.

@ltratt ltratt added this pull request to the merge queue Jan 30, 2024
Merged via the queue into ykjit:master with commit cdc8d24 Jan 30, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants