Refactoring
imbrem committed Oct 14, 2024
1 parent cdcdcf2 commit 88f3485
Showing 1 changed file with 86 additions and 95 deletions.
181 changes: 86 additions & 95 deletions paper.tex
@@ -269,36 +269,32 @@ \section{Introduction}

\section{Static Single Assignment Form}

In this section, we describe SSA form and the isomorphism between the standard $\phi$-node-based
presentation and the more functional \emph{basic blocks with arguments} format. We then discuss
standard dominance-based scoping and how it can be recast as lexical scoping to make it more
amenable to standard type-theoretic treatment. We further generalize this format to allow branching
to arbitrary code rather than only labels, obtaining \emph{A-normal form}
(ANF)~\cite{flanagan-93-anf}, analogously to the transformation described by
\citet{chakravarty-functional-ssa-2003}. A straightforward argument shows that this adds no
expressive power. Finally, to allow for substitution, we relax our syntax to permit arbitrary
expression nesting and \ms{let}-expressions, resulting in \emph{type-theoretic SSA}, or
\isotopessa{}, which will be the focus of the rest of this paper.

As a running example, consider the simple imperative program to compute $10!$ given in
Figure~\ref{fig:fact-program}.
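
The figure itself is collapsed in this diff; as a rough stand-in, the following OCaml sketch (ours,
inferred from the use of $a * (i + 1)$ below, rather than a copy of the figure) shows the shape of
such a program:

\begin{verbatim}
(* Compute 10! imperatively, with mutable variables and a while loop. *)
let () =
  let n = 10 in
  let i = ref 1 and a = ref 1 in
  while !i < n do
    a := !a * (!i + 1);
    i := !i + 1
  done;
  assert (!a = 3628800) (* 10! *)
\end{verbatim}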

Operating directly on an imperative language can be challenging, since having a uniform
representation of code friendly to mechanized optimization and analysis is often in tension with
features designed to improve readability and programmer productivity, such as syntax sugar. Early
work on compiler analysis and optimization, notably that of Frances Allen~\cite{allen-70-cfa},
therefore operated on \emph{three-address code}, also known as \emph{register transfer language}
(RTL), which normalizes programs into a more uniform shape. We can likewise normalize our code into
3-address code, as in Figure~\ref{fig:fact-3addr}, by:
\begin{itemize}
\item Converting structured control flow (e.g., \ms{while}) into unstructured jumps between basic
blocks labeled \ms{start}, \ms{loop}, and \ms{body}.
\item Replacing subexpressions like $i + 1$ in $a * (i + 1)$ with \ms{let}-bindings so that every
expression in our program is atomic.
\end{itemize}
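
As a rough illustration of the result (a sketch of ours, not Figure~\ref{fig:fact-3addr} itself),
3-address code can be mimicked in OCaml using mutable references for the still-mutable variables,
explicitly named atomic subexpressions, and one function per basic block:

\begin{verbatim}
(* 3-address style: variables remain mutable, every subexpression is
   atomic and explicitly named, and each basic block is a function
   that transfers control by tail call. *)
let n = ref 0 and i = ref 0 and a = ref 0

let rec start () =
  n := 10; i := 1; a := 1;
  loop ()
and loop () =
  if !i < !n then body () else !a
and body () =
  let t0 = !i + 1 in            (* the named subexpression i + 1 *)
  a := !a * t0;
  i := t0;
  loop ()

let () = assert (start () = 3628800)
\end{verbatim}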

\begin{figure}
@@ -341,37 +337,28 @@ \section{Static Single Assignment Form}
\label{fig:fact-program}
\end{figure}

Despite this normalization, many optimizations remain difficult to express in this format because a
variable's value may be set by multiple definitions throughout the program's execution. To improve
our ability to reason about programs, we introduce the \emph{static single assignment} restriction,
originally proposed by \citet{alpern-ssa-original-88} and popularized by
\citet{cytron-ssa-intro-91}, which states that every variable must be defined at exactly one point
in the program. We can intuitively represent this as every variable being given by an immutable
\ms{let}-binding.
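
For straight-line code this is immediate: in a toy example of ours, the mutable sequence
$x \leftarrow 1;\ x \leftarrow x + 2$ becomes

\begin{verbatim}
(* Each assignment receives a fresh, immutable name. *)
let result =
  let x0 = 1 in
  let x1 = x0 + 2 in
  x1 * x1

let () = assert (result = 9)
\end{verbatim}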

A key motivation for this restriction is that it brings us closer to being able to reason about
programs by substitution: since each variable has exactly one definition, we may hope to simply
replace uses of a variable with its defining expression. As we shall see, however, substitution is
not in general valid for SSA as-is; supporting it motivates the relaxed syntax of \isotopessa{}
introduced below.

However, expressing programs with loops in this format is challenging because the value of a
variable may change with each loop iteration. The classical solution is to introduce
\emph{$\phi$-nodes}, which select a value based on the predecessor block from which control arrived.
For example, in basic block \ms{loop} in Figure~\ref{fig:fact-ssa}, $i_0$ evaluates to 1 if we came
from \ms{start} and to $i_1$ if we came from \ms{body}. Similarly, $a_0$ evaluates to either 1 or
$a_1$ based on the predecessor block. This mechanism allows us to maintain the single-assignment
property while capturing control-flow-dependent variable updates.

Notably, $i_1$ and $a_1$ are defined \emph{later} in the program than the $\phi$-nodes that
reference them, which seems counterintuitive since variables are typically used after they are
defined. To reconcile this, we need to use a somewhat unintuitive scoping rule for expressions
appearing in $\phi$-node branches; namely, they are allowed to use any variable defined at the
\emph{end} of the source basic block for that branch.
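
Operationally, one can picture each block as receiving the label of the block it was entered from,
with every $\phi$-node selecting on that label. The following OCaml sketch (ours, not the paper's
semantics) makes this concrete:

\begin{verbatim}
type pred = Start | Body

let n = 10

(* The pair (i1, a1) carries the values defined at the end of body;
   the dummy (0, 0) passed on entry from start is never read. *)
let rec loop pred (i1, a1) =
  let i0 = match pred with Start -> 1 | Body -> i1 in (* phi for i *)
  let a0 = match pred with Start -> 1 | Body -> a1 in (* phi for a *)
  if i0 < n then body i0 a0 else a0
and body i0 a0 =
  let i1 = i0 + 1 in
  let a1 = a0 * i1 in
  loop Body (i1, a1)

let () = assert (loop Start (0, 0) = 3628800)
\end{verbatim}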

\begin{figure}
\begin{subfigure}[t]{.5\textwidth}
@@ -416,16 +403,19 @@
}
\Description{}
\end{figure}


While this rule can be quite confusing, and in particular difficult to assign an operational
semantics to, the fact that the scoping for $\phi$-node branches is based on the source block,
rather than the block in which the $\phi$-node itself appears, hints at a possible solution. By
\emph{moving} the expression in each branch to the \emph{call-site}, we can transition to an
isomorphic syntax, called basic blocks with arguments (BBA), as illustrated in Figure
\ref{fig:fact-bba}. In this approach, each $\phi$-node, which lacks side effects and whose scope
depends solely on the originating basic blocks rather than its position within its own block, can be
moved to the top of the block. This reorganization allows us to treat each $\phi$-node as equivalent
to an argument for the basic block, with the corresponding values passed at the jump site.
Converting a program from BBA format back to standard SSA form with $\phi$-nodes is straightforward:
introduce a $\phi$-node for each argument of a basic block, and for each branch corresponding to the
$\phi$-node, add an argument to the jump instruction from the appropriate source block.

\begin{figure}
\begin{subfigure}[t]{.5\textwidth}
@@ -478,44 +468,41 @@
\Description{}
\end{figure}

We can now use standard dominance-based scoping without any special cases for $\phi$-nodes. An
important insight provided by the BBA format, as discussed by \citet{appel-ssa} and
\citet{kelsey-95-cps}, is that a program in SSA form can in this way be interpreted as a collection
of tail-recursive functions, where each basic block and branch correspond to a function and tail
call, respectively. This interpretation offers a natural framework for defining the semantics of SSA
and reasoning about optimizations. However, there is a subtle difference between the scoping rules
in this format and the actual scoping used in traditional SSA, which requires careful consideration.
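
Concretely, the BBA program of Figure~\ref{fig:fact-bba} reads as follows (an illustrative OCaml
sketch of ours):

\begin{verbatim}
(* Each basic block becomes a function; each jump becomes a tail call
   passing the block arguments. Note that lexical scoping forces us
   to thread n through every block, whereas dominance-based scoping
   would let loop and body refer to the n defined in start directly. *)
let rec start () =
  let n = 10 in
  loop n 1 1
and loop n i0 a0 =
  if i0 < n then body n i0 a0 else a0
and body n i0 a0 =
  let i1 = i0 + 1 in
  let a1 = a0 * i1 in
  loop n i1 a1

let () = assert (start () = 3628800)
\end{verbatim}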

While functional languages typically rely on \emph{lexical scoping}, where the scope of a variable
is determined by its position within the code's nested structure, SSA introduces a different scoping
mechanism based on \emph{dominance}. In SSA, a variable is considered to be in scope at a specific
point $P$ if and only if all execution paths from the program's entry point to $P$ pass through the
variable's unique definition $D$. In this case, we say that $D$ \emph{strictly dominates} $P$.

When considering basic blocks, this translates to a variable being visible within the block $D$
where it is defined, starting from the point of its definition and continuing to be visible in all
subsequent blocks $P$ strictly dominated by $D$ in the control-flow graph (CFG). For example, in
Figure~\ref{fig:fact-bba}:
\begin{itemize}
\item \ms{start} strictly dominates \ms{loop} and \ms{body}; thus, the variable $n$ defined in
\ms{start} is visible in \ms{loop} and \ms{body}.
\item \ms{loop} strictly dominates \ms{body}; therefore, the parameters $i_0$, $a_0$ to \ms{loop}
are visible in \ms{body}, without needing to be passed as parameters again.
\item \ms{body} does \emph{not} strictly dominate \ms{loop}, since there is a path from \ms{start}
to \ms{loop} that does not pass through \ms{body}.
\end{itemize}
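
These facts can be checked mechanically. The following OCaml sketch (ours; a naive quadratic
fixpoint rather than the near-linear algorithms used in practice) computes the dominator sets of
the example CFG:

\begin{verbatim}
module S = Set.Make (String)

(* dom(entry) = {entry}; for every other block b, dom(b) is {b} plus
   the intersection of dom(p) over all predecessors p of b. This
   computes dominance; strict dominance excludes b itself. *)
let blocks = [ "start"; "loop"; "body" ]

let preds = function
  | "loop" -> [ "start"; "body" ]
  | "body" -> [ "loop" ]
  | _ -> []

let () =
  let all = S.of_list blocks in
  let dom = Hashtbl.create 3 in
  List.iter
    (fun b ->
      Hashtbl.replace dom b (if b = "start" then S.singleton b else all))
    blocks;
  let changed = ref true in
  while !changed do
    changed := false;
    List.iter
      (fun b ->
        if b <> "start" then
          let meet =
            List.fold_left
              (fun acc p -> S.inter acc (Hashtbl.find dom p))
              all (preds b)
          in
          let d = S.add b meet in
          if not (S.equal d (Hashtbl.find dom b)) then begin
            Hashtbl.replace dom b d;
            changed := true
          end)
      blocks
  done;
  assert (S.mem "start" (Hashtbl.find dom "loop"));
  assert (S.mem "loop" (Hashtbl.find dom "body"));
  assert (not (S.mem "body" (Hashtbl.find dom "loop")))
\end{verbatim}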

The relation "$A$ strictly dominates $B$" intersected with "$A$ is a \emph{direct predecessor} of
$B$" forms a tree called the \emph{dominance tree} of the CFG. This tree can be computed in nearly
linear time~\cite{cytron-ssa-intro-91}. By topologically sorting the basic blocks in the CFG
according to this partial order and inserting brackets based on the dominance tree, we can convert
dominance-based scoping to lexical scoping. In this arrangement, a variable is in lexical scope if
and only if it is in scope under dominance-based scoping, as shown in
Figure~\ref{fig:dominance-to-lexical}. This transformation is straightforward, and standard SSA can
be recovered by removing the inserted \ms{where}-blocks.
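
In our running example, this nesting looks as follows (again an OCaml sketch of ours): \ms{loop}
and \ms{body} are nested under their dominance-tree parent \ms{start}, so $n$ no longer needs to
be threaded through as an extra parameter:

\begin{verbatim}
(* Blocks nested according to the dominance tree: n is in lexical
   scope in loop and body exactly where dominance-based scoping
   permits it. *)
let start () =
  let n = 10 in
  let rec loop i0 a0 =
    if i0 < n then body i0 a0 else a0
  and body i0 a0 =
    let i1 = i0 + 1 in
    loop i1 (a0 * i1)
  in
  loop 1 1

let () = assert (start () = 3628800)
\end{verbatim}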

\begin{figure}
\centering
@@ -4392,7 +4379,7 @@ \subsection{SSA, FP and IRs}
\item Has operational semantics, from which equational theory is derived; refinement is via
observational equivalence
\item Different from Moggi's equational metalanguage~\citep{moggi-91-monad}, as we have values and
computations but no computations of computations, i.e. $\ms{T}(\ms{T}(A))$, much like a Kleisli
category vs. a Freyd category
\item Uses effect-refined types, where nontermination is also treated as an effect; unlike us,
who do not track the effects of regions, only terms. Some equivalences are effect dependent,
@@ -4576,7 +4563,11 @@ \subsection{Compositional (Relaxed) Concurrency}
to have a compositional semantics for relaxed concurrency, not only to be able to reason about the
correctness of programs running under modern architectures, but also to study how their relaxed
behaviour interacts with the aggressive optimizations performed by modern compilers, and therefore
to be able
to determine the soundness of the latter. We believe that our SSA semantics offers a useful
benchmark for weak memory semantics: if the weak memory semantics can be organized into a
distributive Elgot category, then we know that it is compatible with the standard control-flow
optimizations for SSA. It turns out that many of these models are close to supporting SSA, but often
fall short in surprising ways.

\paragraph{Pomsets.}

