Frontend Holes #107

Open. Wants to merge 34 commits into base: main.


stephenverderame
Contributor

This is quite a large PR, and I've been working sporadically on this for quite some time, so I don't think I even know everything that changed off the top of my head right now. I do want to write a blog post or two going through some things in more detail.

Generally, I assumed that the explicator only explicates within a funclet, so the frontend still errors if it can't determine types for anything that might cross a funclet boundary. The previous type deduction algorithm is still in use, just with some new features.

The frontend also does a little bit of (polynomial-time) synthesis itself now to determine when to make a mutable variable usable if the user never explicitly writes to it. Unless the program is ill-formed, the solution will always be valid, but not necessarily globally optimal. However, the solution has a property that can be thought of as local optimality.

Otherwise, the frontend type deduction should only make unambiguous decisions that can't be wrong, requiring type annotations if something can't be deduced. In all cases, the user should be able to specify what they want with type annotations.

  • The `?` hole. It's not clear to me whether we'd want this in the long run or should just fold it into `???`. It corresponds to the assembly `?`, and its semantics are roughly that `?` is an expression that can take the value of any variable that reaches it. Here "reaches" accounts for how variables are consumed and for the fact that references can't cross funclet boundaries. So in Sketch, `?` is basically `{| v_1 | v_2 | ... | v_n |}`, where `v_1, ..., v_n` are the variables that reach `?`. Note that `?` cannot dereference references. Because shadowing is implemented by renaming, a hole can currently use a shadowed variable.
  • The `???` hole. This can be used as an expression or a statement and allows for arbitrary codegen by the explicator. Like `?`, it effectively uses every variable that reaches it.
  • Undefined variables are given definitions at the hole that dominates all uses with the shortest path to the CFG end block. It is assumed that variables a hole defines are also initialized (made usable) at that point. `???` as an expression will generate assembly corresponding to an allocation of a temporary, a big hole, then a use of the temporary. I'm not sure if this is better or worse for the explicator than just a big hole and a small hole.
  • A ReachingDefs pass that calculates reaching defs and ensures that all definitions reach their uses.
  • Uninitialized mutables that need to be initialized (made usable) are initialized at the `???` holes that correspond to the "optimized latest possible point". That is, we start by initializing mutables at the latest possible point and then hoist the initializations as high as possible, such that each new initialization set is strictly better than the previous one.
  • An UninitCheck pass that is similar to ReachingDefs but ensures that variables are initialized at every use that needs them to be usable.
  • A fix to type unification that allows multiple variables to be stored to in the branches of an if. Previously, it would break if anything other than the spec's result of the if was stored to in a branch of the if. In other words, it was assumed that frontend PHI nodes always correspond to an if in the spec, which isn't always true. For example, we could write the same value to a variable in both branches of an if.
  • Slightly better error messages and demangling of variable names when displaying them in errors. Something to maybe do in the future is to better display errors that occur on variables that the frontend generated. It might also be desirable to support displaying more than one error at a time.
  • A special `_` variable name. This is a regular variable, except that the name `_` can only be used in a variable definition. Due to renaming, the assembly will see this as `_%n` for some integer `n`.
  • Some notes on frontend name mangling (in case you need to debug something). In the following, `n` represents some integer.
      • `%n` is a suffix for AST-level SSA renaming to support things like scoping and shadowing.
      • `.n` is a suffix for IR-level SSA renaming, which treats each write as a new "definition".
      • `_v_ref` is the underlying reference of the mutable variable named `v` (as mutable variables in the frontend have value semantics).
      • `!` is prepended to meta-variable names in type unification to generate a new intermediate variable.
      • `$` is prepended to all spec variable names (class names) in type unification.
      • `_hn` is for the implicit definition(s) of a `???` hole when it's used as an expression.
      • `_fn` is for AST-level temporaries created by flattening.
      • `_tn` is for assembly-level temporaries created by lowering.
      • `_t_v1_v2_..._vn` is a tuple of variables `v1` to `vn`.
    So, for example, `_c%0_ref.0` is the original definition of the backing reference of some user mutable variable `c`.
  • A simple FileCheck-like script for checking test output.
  • All GPU variables now always have the flags storage, map_read, and copy_dst. (This is the type name suffix `::gds`, where g stands for GPU, s for source (copy_dst), and d for destination (map_read).) This isn't ideal, but deducing these flags might not be possible in cases with holes, where it becomes unknown in what ways a variable is used.
  • Various changes and improvements internal to the frontend
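
For debugging, the mangling notes above can be read as a small parsing recipe. The following is a rough Python sketch of that reading, not the frontend's actual demangler: `Demangled` and `demangle` are hypothetical names, and only the `%n`, `.n`, and `_v_ref` conventions are handled. It assumes the suffixes nest the way the `_c%0_ref.0` example suggests.

```python
import re
from typing import NamedTuple, Optional


class Demangled(NamedTuple):
    name: str                   # user-facing variable name
    ast_version: Optional[int]  # n from the AST-level %n suffix, if present
    ir_version: Optional[int]   # n from the IR-level .n suffix, if present
    is_backing_ref: bool        # True for the _v_ref form


def demangle(mangled: str) -> Demangled:
    """Hypothetical helper: peel the suffixes off from the outside in."""
    rest = mangled

    # Outermost: the IR-level SSA suffix ".n".
    ir_version = None
    match = re.search(r"\.(\d+)$", rest)
    if match:
        ir_version = int(match.group(1))
        rest = rest[: match.start()]

    # Next: the "_<v>_ref" wrapper for a mutable's backing reference.
    # Assumption: a leading "_" plus trailing "_ref" always means this form.
    is_ref = rest.startswith("_") and rest.endswith("_ref") and len(rest) > len("__ref")
    if is_ref:
        rest = rest[1 : -len("_ref")]

    # Innermost: the AST-level SSA suffix "%n".
    ast_version = None
    match = re.search(r"%(\d+)$", rest)
    if match:
        ast_version = int(match.group(1))
        rest = rest[: match.start()]

    return Demangled(rest, ast_version, ir_version, is_ref)
```

With this reading, `demangle("_c%0_ref.0")` yields the name `c` with AST version 0, IR version 0, and the backing-reference flag set, which matches the example in the notes. A user variable that itself starts with `_` and ends with `_ref` would be misclassified, so this is only a debugging aid.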

This refactors the code to prepare for the frontend's
initialization algorithm that determines where to make
uninitialized mutables usable.

The initialization algorithm has 2 parts. The first part
places initializations at the latest possible points. This
change begins work on this algorithm by introducing the
init_synth module which contains functions to generate
initializer sets for each uninitialized mutable variable.

An initializer set is a set of hole locations that collectively
dominate the uses of the mutable they initialize.
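
To make "collectively dominate" concrete, here is a minimal Python sketch. It assumes the CFG is a plain adjacency map from node names to successor names. The names `collectively_dominates` and `EXAMPLE_CFG` are made up for illustration and are not taken from init_synth. The check asks whether the use is still reachable from the entry once every hole in the set is treated as a wall.

```python
from typing import Dict, Iterable, List

# Hypothetical CFG encoding: each node name maps to its successor names.
Cfg = Dict[str, List[str]]


def collectively_dominates(cfg: Cfg, entry: str, holes: Iterable[str], use: str) -> bool:
    """Return True if every path from `entry` to `use` visits a node in `holes`."""
    blocked = set(holes)
    if entry in blocked:
        return True
    # Search from the entry while refusing to step onto any hole. If the use
    # is still reachable, some path avoids every hole in the set.
    stack, seen = [entry], {entry}
    while stack:
        node = stack.pop()
        if node == use:
            return False
        for succ in cfg.get(node, []):
            if succ not in blocked and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return True


# A diamond: the use sits at the join of an if with a hole in each branch.
EXAMPLE_CFG: Cfg = {
    "entry": ["then_hole", "else_hole"],
    "then_hole": ["join"],
    "else_hole": ["join"],
    "join": [],
}
```

In `EXAMPLE_CFG`, neither branch hole dominates `join` on its own, but the pair `{then_hole, else_hole}` does, which is why an initializer set is a set rather than a single location.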

Finished phase 1 of the initialization algorithm that places
initializations of uninitialized variables as late as possible.

To prepare for the second phase of the frontend initialization
algorithm, which hoists the initialization points to better locations,
if any exist, we first minimize the initialization set by removing
hole locations that are collectively strictly dominated by a subset
of the initialization set. That is to say, the mutable must already
be initialized at such a hole, and so the hole can be removed without
affecting correctness.
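
The minimization step can be sketched in a few lines of Python. This is an illustrative reading of the description, not the code in this PR: `minimize_init_set` and `_all_paths_hit` are hypothetical names, and the CFG is assumed to be an adjacency map from node names to successor names. A hole is dropped when every path from the entry to it already passes through one of the other remaining holes.

```python
from typing import Dict, Iterable, List, Set

Cfg = Dict[str, List[str]]  # hypothetical encoding: node -> successors


def _all_paths_hit(cfg: Cfg, entry: str, blockers: Set[str], target: str) -> bool:
    """True if `target` cannot be reached from `entry` without crossing `blockers`."""
    if entry in blockers:
        return True
    stack, seen = [entry], {entry}
    while stack:
        node = stack.pop()
        if node == target:
            return False
        for succ in cfg.get(node, []):
            if succ not in blockers and succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return True


def minimize_init_set(cfg: Cfg, entry: str, holes: Iterable[str]) -> Set[str]:
    """Drop holes that are strictly dominated by the rest of the set."""
    kept = set(holes)
    for hole in sorted(kept):  # sorted only to make the result deterministic
        others = kept - {hole}
        # The mutable is already initialized on every path reaching `hole`,
        # so initializing it again there is redundant.
        if others and _all_paths_hit(cfg, entry, others, hole):
            kept = others
    return kept
```

Removing a dominated hole is safe to do greedily: any path that went through the removed hole already went through one of the remaining holes first, so the remaining set dominates everything the original set did.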

This change enables writes to variables that don't have
select quotients to occur within the branches of an if.
This is done by introducing a more flexible SpliceTerm
type expression that can either match like a regular
term or replace itself with a child, provided all its
children can unify with each other.

Further, we allow phi nodes to unify with constraints
that aren't selects by making each phi node constraint
a SpliceTerm.

The coding pattern this change enables is a bit strange
to write manually, but can be easily generated by the
frontend by putting holes in both branches of an if.

Finally, this change also introduces some utilities needed
for phase 2 of the initialization algorithm.
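
As a rough illustration of the SpliceTerm idea, here is a toy unifier in Python. Everything in it is an assumption made for the sketch: the `Var`, `Term`, and `Splice` classes, the substitution-as-dict representation, and the missing occurs check. The real type expressions in the frontend are richer. The point is only the two behaviours described above: a splice first tries to match like a regular term, and otherwise replaces itself with a child if all of its children unify with each other.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Term:
    op: str
    args: Tuple["Ty", ...] = ()


@dataclass(frozen=True)
class Splice:  # toy stand-in for SpliceTerm
    op: str
    children: Tuple["Ty", ...]


Ty = Union[Var, Term, Splice]
Subst = Dict[str, Ty]


def resolve(t: Ty, s: Subst) -> Ty:
    """Follow variable bindings until reaching an unbound variable or a term."""
    while isinstance(t, Var) and t.name in s:
        t = s[t.name]
    return t


def unify_all(xs, ys, s: Subst) -> Optional[Subst]:
    if len(xs) != len(ys):
        return None
    for x, y in zip(xs, ys):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def unify(a: Ty, b: Ty, s: Subst) -> Optional[Subst]:
    """Return an extended substitution, or None on failure (no occurs check)."""
    a, b = resolve(a, s), resolve(b, s)
    if a == b:
        return s
    if isinstance(a, Var):
        return {**s, a.name: b}
    if isinstance(b, Var):
        return {**s, b.name: a}
    if isinstance(b, Splice) and not isinstance(a, Splice):
        a, b = b, a
    if isinstance(a, Splice):
        if not a.children:
            return None
        # Option 1: match like a regular term with the same operator.
        if a.op == b.op:
            other = b.args if isinstance(b, Term) else b.children
            matched = unify_all(a.children, other, s)
            if matched is not None:
                return matched
        # Option 2: all children agree, so the splice stands for its child.
        merged: Optional[Subst] = s
        for child in a.children[1:]:
            merged = unify(a.children[0], child, merged)
            if merged is None:
                return None
        return unify(a.children[0], b, merged)
    if a.op != b.op:
        return None
    return unify_all(a.args, b.args, s)
```

For example, `Splice("select", (Var("then_val"), Var("else_val")))` unifies with a `select` term by matching children pairwise, and also unifies with an unrelated term such as `Term("seven")` by forcing both branch values to be that term. It fails against `Term("c")` when its children are the distinct constants `Term("a")` and `Term("b")`.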

This implements a mostly working hoist algorithm to improve upon
the placement of initializations of variables. At a high level,
we hoist an initialization point to a set of collectively
dominating holes if we must initialize fewer dependencies at those
holes. To take into account sharing of dependencies with other
variables, a given variable is only considered to initialize `1/n`
of a node where `n` is the number of other initializations at the
hole that also depend on the same node.
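
The `1/n` weighting can be illustrated with a small Python sketch. The function name `hole_cost` and the dictionary layout are hypothetical. The sketch also assumes that `n` counts every initialization at the hole that depends on the node, including the variable itself, so that `n` is never zero. The actual hoist algorithm may count differently.

```python
from fractions import Fraction
from typing import Dict, Set


def hole_cost(var: str, deps_at_hole: Dict[str, Set[str]]) -> Fraction:
    """Cost charged to `var` for being initialized at one hole.

    `deps_at_hole` maps each variable initialized at the hole to the set of
    nodes its initialization depends on (hypothetical representation).
    """
    total = Fraction(0)
    for node in deps_at_hole[var]:
        # Assumption: n includes `var` itself, so it is always at least 1.
        n = sum(1 for deps in deps_at_hole.values() if node in deps)
        total += Fraction(1, n)
    return total
```

If `x` depends on nodes `a` and `b` while `y` depends only on `b`, then `x` is charged `1 + 1/2` and `y` is charged `1/2`. Under this reading, hoisting to a candidate set of holes is attractive when the summed cost over those holes is strictly lower than the cost at the current placement.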

This allows variable initialization to consider sharing
initializations with static definitions of a hole.

Adds tests and fixes the bugs they caught. Also catches undefined
node errors in specs.

This change introduces VarName and ClassName newtypes to better distinguish
between the different unification type names such as variable names, class
names without the class identifier, and class names with the class identifier.
Many places in the code still use Strings to represent variables; however,
every class name should now be represented with the ClassName newtype.

This change also fixes semantic version checking.