-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ssa): Generic ValueId to show resolved status #6487
base: master
Are you sure you want to change the base?
Conversation
compiler/noirc_evaluator/src/brillig/brillig_gen/brillig_block_variables.rs
Outdated
Show resolved
Hide resolved
compiler/noirc_evaluator/src/brillig/brillig_gen/constant_allocation.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! 👏
I left one question about unresolved values, and some suggestions to use ResolvedValueId
a bit more but that could be left as a follow-up PR. It's great that now we can see where we are comparing raw values so we can later more easily spot bugs (and prevent bugs) or tackle them before-hand before finding those bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we handle the case where we resolve a value id to get a ResolvedValueId
, with id 42
lets say, then we call dfg.set_value_from_id
to map value 42
to value 100
? I think we'd still have a value id that says it is resolved but actually needs to be resolved again.
My main worry is that this change makes ValueIds more cumbersome to use in the future while also not entirely preventing the issue - although I think it'd at least help prevent against forgetting to call resolve.
I suppose we could remove |
I don't think that solves the issue, e.g. if the ValueId in question is stored in a map to be used later |
You are right, that is a fundamental conundrum of storing value IDs in data structures. We had this discussion with @asterite as well: it seems like in It's not a silver bullet, but as you said it is a step in the right direction of preventing certain bugs, and as I tried to say, it at least gives us an option for tightening the screws. I have spent almost 2 hours resolving the recent merge conflicts though; perhaps we can improve the solution in a later PR? Or do you think it's more cumbersome than worth? I'll try to see what removing the |
Trying to remove Also checked that I wonder if there was a way to limit via lifetimes how long a |
Perhaps we can try something like this:
|
Later added them back in favour of the lifetimes. I figured there is no difference in using them as keys vs values in maps; the lifetime on the other hand makes you work hard to store it in any data structure not tied to the same reference. |
Added a lifetime to |
a203cf1
to
efc9a61
Compare
I am thinking that, unfortunately. It is more cumbersome to have to shuffle around the types like this and I don't think the lifetimes solve the root of the issue either. Ultimately it's still up to the dev to ensure they're handling them correctly but this is made a bit more difficult I think by most passes still claiming to store RawValueIds when they should be conceptually resolved (e.g. the normalize_value_ids pass), so I'm not sure the extra types help readability or understandability of the passes. It feels bad to reject such large work but I do think it was a valuable exploration into this issue. |
Agreed, we cannot stop a determined developer to store the IDs. I merely wanted to stop them to forget to do the right thing by accident.
I used Oh well, at least I can stop keeping it in sync 😌 |
Description
Problem*
Followup for #6448
Currently the
ValueId
is an alias forId<Value>
. Throughout the codebase we either calllet id = dfg.resolve(id);
before comparing them, or using them to look up values in dictionaries, or we don't. It's hard to tell when it's okay not to call it, and it's easy to forget to do so.Summary*
ValueId
wrapper ofId
instead of an alias, and adds a generic phantom type to it which is eitherResolved
orUnresolved
, so that we can only directly compareValueId<Resolved>
instances but notValueId<Unresolved>
ones.unresolved_eq
method is added toValueId<Unresolved>
to make it easy to find in the code where we are currently comparing raw values.DataFlowGraph
can at the moment be indexed byValueId<Unresolved>
because that's how the code did it, without resolution. Not sure if this is correct, but removing this indexer would bring to light all the cases for investigation, or we can add auto-resolve in the indexer.ValueId
is used in the data structures such asInstruction
andExpression
, with the defaultUnresolved
value, so that the IDs always have to be resolved before equality checks.Hash
andEq
are derived forValueId
if the generic parameter implements them.Instruction
andExpression
were also made generic, and with theirmap_values
methods they can map over theValueId
s in them to become e.g.Instruction<Resolved>
, which makes them implementHash
andEq
, so they can be directly compared to each other.Resolution
trait implemented forResolved
andUnresolved
so we can treat them generically and decide at runtime whether we need to resolve the ID or not. This is used when the same method is called both withValueId
andResolvedValueId
; this way we can save an extra resolution (although it would be a short one).In this PR I tried to preserve the current behaviour of the program, assuming that what it does is correct, but I hope that with the new ability to differentiate between these states we can spot more easily when we're doing something wrong, and we can gradually move towards more safety.
Additional Context
There are a lot of little changes to go from
ValueId
toRawValueId
(which is justId<Value>
, used in maps), which is annoying, but I hope there is some tangible benefits to this anyway. It's difficult to reason about what should be turned intoResolvedValueId
(sometimes resolutions are defensively done by the caller and the callee as well), so I left most APIs as-is.To help with the review, the meaningful changes are in value and dfg, with the rest just correcting compilation errors. The changes in instruction are mostly just moving methods around in a new generic impl to make them available for both statuses, not just the default one.
One pervasive pattern is accessing
dfg[*id]
with an unresolved ID and using the returnedValue
. Ostensibly this value could have already been replaced by a successor. It's hard to tell whether this is intentional or not, but we now have the option to weed out these and replace them either with a call todfg.resolve(id)
or forcing a promotion withid.resolved()
if we're sure it's the right thing.Alternatives
Removing
set_value_from_id
Another option would be to remove the ability to just replace a value with a successor and leaving a forward reference to the new value, which will later have to be followed recursively every time we encounter a potentially stale ID, and instead to visit every occurrence of the old ID and replace it with the new one.
Without generics
Earlier I tried a different approach where
ValueId
was a non-generic wrapper aroundId<Value>
without implementingHash
, and theResolvedValueId
was the alias for what it is today. Crucially the data structures carryValueId
, because they can be stale. I worked for hours to go through the compilation errors, then hit a spot where eitherInstruction
orExpression
is compared withassert_eq!
or used in maps as keys, which didn't compile because I had to remove theHash
,Eq
,PartialEq
as they containValueId
. Implementing them just for these types manually is possible, but cumbersome.Then I discovered the various
map_values
methods and thought it would make sense to add the generic parameter to everything that can carry aValueId
, and then they can be compared after resolving all IDs first, because Rust can still deriveHash
andEq
as long as the marker implements it, which is what happens in this PR.Documentation*
Check one:
PR Checklist*
cargo fmt
on default settings.