Refactoring
imbrem committed Oct 14, 2024
1 parent cdcdcf2 commit 88f3485
Showing 1 changed file with 86 additions and 95 deletions.
181 changes: 86 additions & 95 deletions paper.tex
@@ -269,36 +269,32 @@ \section{Introduction}

\section{Static Single Assignment Form}

In this section, we describe SSA form and the isomorphism between the standard $\phi$-node-based
presentation and the more functional \emph{basic blocks with arguments} format. We then discuss
standard dominance-based scoping and how it can be recast as lexical scoping to make it more
amenable to standard type-theoretic treatment. We further generalize this format to allow branching
to arbitrary code rather than only labels, obtaining \emph{A-normal form}
(ANF)~\cite{flanagan-93-anf}, analogously to the transformation described by
\citet{chakravarty-functional-ssa-2003}. A straightforward argument shows that this adds no
expressive power. Finally, to allow for substitution, we relax our syntax to permit arbitrary
expression nesting and \ms{let}-expressions, resulting in \emph{type-theoretic SSA}, or
\isotopessa{}, which will be the focus of the rest of this paper.

As a running example, consider the simple imperative program to compute $10!$ given in
Figure~\ref{fig:fact-program}.
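
The figure itself is collapsed in this diff; as a rough stand-in, the following OCaml sketch (ours,
inferred from the use of $a * (i + 1)$ below, rather than a copy of the figure) shows the shape of
such a program:

\begin{verbatim}
(* Compute 10! imperatively, with mutable variables and a while loop. *)
let () =
  let n = 10 in
  let i = ref 1 and a = ref 1 in
  while !i < n do
    a := !a * (!i + 1);
    i := !i + 1
  done;
  assert (!a = 3628800) (* 10! *)
\end{verbatim}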

Operating directly on an imperative language can be challenging, since having a uniform
representation of code friendly to mechanized optimization and analysis is often in tension with
features designed to improve readability and programmer productivity, such as syntax sugar. Early
work on compiler analysis and optimization, notably that of Frances Allen~\cite{allen-70-cfa},
therefore operated on \emph{three-address code}, also known as \emph{register transfer language}
(RTL), which normalizes programs into a more uniform shape. We can likewise normalize our code into
3-address code, as in Figure~\ref{fig:fact-3addr}, by:
\begin{itemize}
\item Converting structured control flow (e.g., \ms{while}) into unstructured jumps between basic
blocks labeled \ms{start}, \ms{loop}, and \ms{body}.
\item Replacing subexpressions like $i + 1$ in $a * (i + 1)$ with \ms{let}-bindings so that every
expression in our program is atomic.
\end{itemize}
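
As a rough illustration of the result (a sketch of ours, not Figure~\ref{fig:fact-3addr} itself),
3-address code can be mimicked in OCaml using mutable references for the still-mutable variables,
explicitly named atomic subexpressions, and one function per basic block:

\begin{verbatim}
(* 3-address style: variables remain mutable, every subexpression is
   atomic and explicitly named, and each basic block is a function
   that transfers control by tail call. *)
let n = ref 0 and i = ref 0 and a = ref 0

let rec start () =
  n := 10; i := 1; a := 1;
  loop ()
and loop () =
  if !i < !n then body () else !a
and body () =
  let t0 = !i + 1 in            (* the named subexpression i + 1 *)
  a := !a * t0;
  i := t0;
  loop ()

let () = assert (start () = 3628800)
\end{verbatim}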

\begin{figure}
@@ -341,37 +337,28 @@ \section{Static Single Assignment Form}
\label{fig:fact-program}
\end{figure}

Despite this normalization, many optimizations remain difficult to express in this format because a
variable's value may be set by multiple definitions throughout the program's execution. To improve
our ability to reason about programs, we introduce the \emph{static single assignment} restriction,
originally proposed by \citet{alpern-ssa-original-88} and popularized by
\citet{cytron-ssa-intro-91}, which states that every variable must be defined at exactly one point
in the program. We can intuitively represent this as every variable being given by an immutable
\ms{let}-binding.
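
For straight-line code this is immediate: in a toy example of ours, the mutable sequence
$x \leftarrow 1;\ x \leftarrow x + 2$ becomes

\begin{verbatim}
(* Each assignment receives a fresh, immutable name. *)
let result =
  let x0 = 1 in
  let x1 = x0 + 2 in
  x1 * x1

let () = assert (result = 9)
\end{verbatim}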

A key motivation for this restriction is that it brings us closer to being able to reason about
programs by substitution: since each variable has exactly one definition, we may hope to simply
replace uses of a variable with its defining expression. As we shall see, however, substitution is
not in general valid for SSA as-is; supporting it motivates the relaxed syntax of \isotopessa{}
introduced below.

However, expressing programs with loops in this format is challenging because the value of a
variable may change with each loop iteration. The classical solution is to introduce
\emph{$\phi$-nodes}, which select a value based on the predecessor block from which control arrived.
For example, in basic block \ms{loop} in Figure~\ref{fig:fact-ssa}, $i_0$ evaluates to 1 if we came
from \ms{start} and to $i_1$ if we came from \ms{body}. Similarly, $a_0$ evaluates to either 1 or
$a_1$ based on the predecessor block. This mechanism allows us to maintain the single-assignment
property while capturing control-flow-dependent variable updates.

Notably, $i_1$ and $a_1$ are defined \emph{later} in the program than the $\phi$-nodes that
reference them, which seems counterintuitive since variables are typically used after they are
defined. To reconcile this, we need to use a somewhat unintuitive scoping rule for expressions
appearing in $\phi$-node branches; namely, they are allowed to use any variable defined at the
\emph{end} of the source basic block for that branch.
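
Operationally, one can picture each block as receiving the label of the block it was entered from,
with every $\phi$-node selecting on that label. The following OCaml sketch (ours, not the paper's
semantics) makes this concrete:

\begin{verbatim}
type pred = Start | Body

let n = 10

(* The pair (i1, a1) carries the values defined at the end of body;
   the dummy (0, 0) passed on entry from start is never read. *)
let rec loop pred (i1, a1) =
  let i0 = match pred with Start -> 1 | Body -> i1 in (* phi for i *)
  let a0 = match pred with Start -> 1 | Body -> a1 in (* phi for a *)
  if i0 < n then body i0 a0 else a0
and body i0 a0 =
  let i1 = i0 + 1 in
  let a1 = a0 * i1 in
  loop Body (i1, a1)

let () = assert (loop Start (0, 0) = 3628800)
\end{verbatim}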

\begin{figure}
\begin{subfigure}[t]{.5\textwidth}
@@ -416,16 +403,19 @@
}
\Description{}
\end{figure}


While this rule can be quite confusing, and in particular difficult to assign an operational
semantics to, the fact that the scoping for $\phi$-node branches is based on the source block,
rather than the block in which the $\phi$-node itself appears, hints at a possible solution. By
\emph{moving} the expression in each branch to the \emph{call-site}, we can transition to an
isomorphic syntax, called basic blocks with arguments (BBA), as illustrated in Figure
\ref{fig:fact-bba}. In this approach, each $\phi$-node, which lacks side effects and whose scope
depends solely on the originating basic blocks rather than its position within its own block, can be
moved to the top of the block. This reorganization allows us to treat each $\phi$-node as equivalent
to an argument for the basic block, with the corresponding values passed at the jump site.
Converting a program from BBA format back to standard SSA form with $\phi$-nodes is straightforward:
introduce a $\phi$-node for each argument of a basic block, and for each branch corresponding to the
$\phi$-node, add an argument to the jump instruction from the appropriate source block.

\begin{figure}
\begin{subfigure}[t]{.5\textwidth}
@@ -478,44 +468,41 @@
\Description{}
\end{figure}

We can now use standard dominance-based scoping without any special cases for $\phi$-nodes. An
important insight provided by the BBA format, as discussed by \citet{appel-ssa} and
\citet{kelsey-95-cps}, is that a program in SSA form can in this way be interpreted as a collection
of tail-recursive functions, where each basic block and branch correspond to a function and tail
call, respectively. This interpretation offers a natural framework for defining the semantics of SSA
and reasoning about optimizations. However, there is a subtle difference between the scoping rules
in this format and the actual scoping used in traditional SSA, which requires careful consideration.
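
Concretely, the BBA program of Figure~\ref{fig:fact-bba} reads as follows (an illustrative OCaml
sketch of ours):

\begin{verbatim}
(* Each basic block becomes a function; each jump becomes a tail call
   passing the block arguments. Note that lexical scoping forces us
   to thread n through every block, whereas dominance-based scoping
   would let loop and body refer to the n defined in start directly. *)
let rec start () =
  let n = 10 in
  loop n 1 1
and loop n i0 a0 =
  if i0 < n then body n i0 a0 else a0
and body n i0 a0 =
  let i1 = i0 + 1 in
  let a1 = a0 * i1 in
  loop n i1 a1

let () = assert (start () = 3628800)
\end{verbatim}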

While functional languages typically rely on \emph{lexical scoping}, where the scope of a variable
is determined by its position within the code's nested structure, SSA introduces a different scoping
mechanism based on \emph{dominance}. In SSA, a variable is considered to be in scope at a specific
point $P$ if and only if all execution paths from the program's entry point to $P$ pass through the
variable's unique definition $D$. In this case, we say that $D$ \emph{strictly dominates} $P$.

When considering basic blocks, this translates to a variable being visible within the block $D$
where it is defined, starting from the point of its definition and continuing to be visible in all
subsequent blocks $P$ strictly dominated by $D$ in the control-flow graph (CFG). For example, in
Figure~\ref{fig:fact-bba}:
\begin{itemize}
\item \ms{start} strictly dominates \ms{loop} and \ms{body}; thus, the variable $n$ defined in
\ms{start} is visible in \ms{loop} and \ms{body}.
\item \ms{loop} strictly dominates \ms{body}; therefore, the parameters $i_0$, $a_0$ to \ms{loop}
are visible in \ms{body}, without needing to be passed as parameters again.
\item \ms{body} does \emph{not} strictly dominate \ms{loop}, since there is a path from \ms{start}
to \ms{loop} that does not pass through \ms{body}.
\end{itemize}
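
These facts can be checked mechanically. The following OCaml sketch (ours; a naive quadratic
fixpoint rather than the near-linear algorithms used in practice) computes the dominator sets of
the example CFG:

\begin{verbatim}
module S = Set.Make (String)

(* dom(entry) = {entry}; for every other block b, dom(b) is {b} plus
   the intersection of dom(p) over all predecessors p of b. This
   computes dominance; strict dominance excludes b itself. *)
let blocks = [ "start"; "loop"; "body" ]

let preds = function
  | "loop" -> [ "start"; "body" ]
  | "body" -> [ "loop" ]
  | _ -> []

let () =
  let all = S.of_list blocks in
  let dom = Hashtbl.create 3 in
  List.iter
    (fun b ->
      Hashtbl.replace dom b (if b = "start" then S.singleton b else all))
    blocks;
  let changed = ref true in
  while !changed do
    changed := false;
    List.iter
      (fun b ->
        if b <> "start" then
          let meet =
            List.fold_left
              (fun acc p -> S.inter acc (Hashtbl.find dom p))
              all (preds b)
          in
          let d = S.add b meet in
          if not (S.equal d (Hashtbl.find dom b)) then begin
            Hashtbl.replace dom b d;
            changed := true
          end)
      blocks
  done;
  assert (S.mem "start" (Hashtbl.find dom "loop"));
  assert (S.mem "loop" (Hashtbl.find dom "body"));
  assert (not (S.mem "body" (Hashtbl.find dom "loop")))
\end{verbatim}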

The relation "$A$ strictly dominates $B$" intersected with "$A$ is a \emph{direct predecessor} of
$B$" forms a tree called the \emph{dominance tree} of the CFG. This tree can be computed in nearly
linear time~\cite{cytron-ssa-intro-91}. By topologically sorting the basic blocks in the CFG
according to this partial order and inserting brackets based on the dominance tree, we can convert
dominance-based scoping to lexical scoping. In this arrangement, a variable is in lexical scope if
and only if it is in scope under dominance-based scoping, as shown in
Figure~\ref{fig:dominance-to-lexical}. This transformation is straightforward, and standard SSA can
be recovered by removing the inserted \ms{where}-blocks.
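
In our running example, this nesting looks as follows (again an OCaml sketch of ours): \ms{loop}
and \ms{body} are nested under their dominance-tree parent \ms{start}, so $n$ no longer needs to
be threaded through as an extra parameter:

\begin{verbatim}
(* Blocks nested according to the dominance tree: n is in lexical
   scope in loop and body exactly where dominance-based scoping
   permits it. *)
let start () =
  let n = 10 in
  let rec loop i0 a0 =
    if i0 < n then body i0 a0 else a0
  and body i0 a0 =
    let i1 = i0 + 1 in
    loop i1 (a0 * i1)
  in
  loop 1 1

let () = assert (start () = 3628800)
\end{verbatim}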

\begin{figure}
\centering
@@ -4392,7 +4379,7 @@ \subsection{SSA, FP and IRs}
\item Has operational semantics, from which equational theory is derived; refinement is via
observational equivalence
\item Different from Moggi's equational metalanguage~\citep{moggi-91-monad}, as we have values and
computations but no computations of computations, i.e. $\ms{T}(\ms{T}(A))$, much like a Kleisli
category vs. a Freyd category
\item Uses effect-refined types, where nontermination is also treated as an effect; unlike us,
who do not track the effects of regions, only terms. Some equivalences are effect dependent,
@@ -4576,7 +4563,11 @@ \subsection{Compositional (Relaxed) Concurrency}
to have a compositional semantics for relaxed concurrency, not only to be able to reason about the
correctness of programs running under modern architectures, but also to study how their relaxed
behaviour interacts with the aggressive optimizations performed by modern compilers, and therefore
to be able
to determine the soundness of the latter. We believe that our SSA semantics offers a useful
benchmark for weak memory semantics: if the weak memory semantics can be organized into a
distributive Elgot category, then we know that it is compatible with the standard control-flow
optimizations for SSA. It turns out that many of these models are close to supporting SSA, but often
fall short in surprising ways.

\paragraph{Pomsets.}

