Skip to content

Latest commit

 

History

History
53 lines (34 loc) · 11.6 KB

README.md

File metadata and controls

53 lines (34 loc) · 11.6 KB

antireduce-graphs

The antireduce-graphs library specializes the antireduce library's distributional search and program serialization/deserialization capabilities to a domain of graph-constructing programs. It is used by a larger project hosted here. antireduce-graphs consists of four modules:

  • Ext: the top-level interface of the library
  • Eval: allows well-typed values of Antireduce.Program.t to be executed according to semantics given by OCaml functions; offers the evaluate function.
  • Graph: offers the core Graph.t data structure which is constructed by programs from this domain
  • State: offers the monadic type State.t which graph-constructing programs transform, and defines the semantics of the base primitives from which all invented primitives are derived.

Module Reference (Alphabetized)

Ext

The Ext module contains three elements that I believe should be standard in libraries that specialize antireduce's distributional program search capabilities:

  • The official name of the domain (string).
  • Specialized forms of the major functions of Antireduce.Parser: parse_program, parse_program_exn, parse_stitch_invention, parse_stitch_invention_exn. Versions of these functions must be written that incorporate knowledge of the base, unchanging primitives of the domain.
  • A specialized form of one of the Antireduce.Exploration.*_explore functions supplied with the arguments necessary to execute programs from this domain.

For an explanation of the parameters that must be provided to the Exploration.*_explore functions, see the documentation of antireduce.

Eval

The Eval module, or an analogous module, is obligated to offer the ability to evaluate a program from the domain. This requires proving to the OCaml compiler that a value of Antireduce.Program.t corresponds to a well-typed composition of OCaml functions. This can be accomplished through the use of Generalized Abstract DataTypes (GADTs). Other implementors are advised to copy the Eval source as closely as they would like. What follows is an explanation of its major elements for those who desire a more thorough understanding.

Reified_type.t is a GADT that represents polymorphic types which can be monomorphized only to a select set of base types relevant to the domain, allow proofs of equality, and can be constructed from values of Type.t. The GADT context defines a type-level list of type variables which are extended and unified as the typechecking function examines each subexpression of a program. The GADT expr describes a program which has not yet been typechecked; however, we require a characterization of all variables as values of Reified_type.t, as this ensures that even if a free variable was present in a value of expr, that the intermediate results of typecheck would still have a type. The GADT texpr serves to native OCaml values with types or type indices which may be used to look up their types in a corresponding context. The goal of the typecheck function is to demonstrate the existence of a value of the GADT exists_texpr whose type parameter is a type-level list, i.e. a type context, compatible with the type-level list constructed by traversing the expr translated from the given value of Antireduce.Program.t while accounting for the constraints imposed by its usages of known base primitives. In doing so, typecheck constructs a value of texpr which is proven to correspond to a well-typed OCaml value when evaluated in an initial (empty) type context. The eval function then produces this OCaml value by translating abstractions into OCaml functions and applying these values/functions in the structure given by the constructed value of texpr. We cannot write a function that directly produces this OCaml value as such a function must too be given a type, but this type cannot be statically inferred by the OCaml compiler as it depends on the specific program under evaluation. Nonetheless, we may exhibit a proof that such a type exists; in the process, we necessarily construct its value. If, in constructing its value, we trigger a side effect that saves the result—a value of State.t—which we intended to produce, this doesn't trouble the compiler. Therefore, the actual evaluate function—the main offering of the Eval module—simply produces a value of unit option, where the option type is used to indicate whether the program typechecked and executed in the allotted time. If the program did not typecheck, the typecheck function would raise an error which is not caught. This would indicate either a problem in the antireduce library, or an issue with the correspondence between the OCaml types of the functions used to implement base primitives and the values of Type.t used to define the value of Dsl.t passed to Antireduce.Exploration.*_explore functions. The latter mismatch usually manifests through typecheck, which raises an error if applications cannot be typed, or if a name of a base primitive lacks a corresponding case. Consequently, it is important that the correspondence between values of Type.t and names of base primitives given by initial_primitives*, all_primitives*, and initial_dsl is equivalent to the relationship between OCaml types, values of Reified_type.t, and operation names as defined by the typecheck function. Luckily, the more finicky part of the latter relationship, between OCaml types and values of Reified_type.t, is validated by the compiler thanks to the use of GADTs. Note that while initial_primitives* recognize only the primitives needed in initial_dsl, and therefore used by Antireduce.Exploration.*_explore to synthesize the unique parts of individual programs, all_primitives* recognizes a small number of additional primitives used in standard ways. In this domain, there are only two such additional primitives: initial, denoting the value of State.t to which all programs are ultimately applied; and save, the operation implementing the side effect that preserves the final output state.

Graph

Graph.t represents the relational structure that is nondestructively modified by programs from this domain. Through the semantics of the operations defined in the State module and their interaction with the semantics of add_node and add_edge, values of Graph.t are limited to being single-component graphs. Each element (node or edge) of a value of Graph.t is associated with an integer ID. This ID reflects the number of other elements that existed in the graph at the time the element was added, and is therefore 0-indexed. In order to support a procedural traversal and expansion API, edges of the graph are also associated with a port number. The port number of an edge ranges from 0 to max_conn, and is used to provide a standard labeling of the nodes in a neighborhood. Though self-edges are not allowed, the same nodes may be connected by multiple edges so long as each edge is placed at a different port. A single edge occupies the same port from the vantage point of both the nodes it connects.

In order to prevent a combinatorial explosion in the number of graph structures considered unique due only to differences in IDs and port numbers, the Graph module offers isomorphism testing via the VF2++ algorithm. To ease the implementation of this algorithm, the Graph module contains a type Graph.Frozen.t which represents nodes and their neighborhoods identically regardless of multiplicity and port number. This representation also eases operations such as accessing the neighborhood of a node, or measuring its degree. The type Graph.Key.t represents a natural primary key for describing a graph: a tuple of its size and a histogram of its degrees. This primary key cannot distinguish every non-isomorphic structure. Rather, given two values of Graph.Frozen.t, the presence of isomorphism can be detected by calling Graph.Morphism.has_iso. Therefore, in order to create both a primary and secondary key of a graph, one may use Graph.Frozen.of_graph to generate the secondary key, followed by Graph.Key.of_frozen to create a primary key. Within Ext, these two steps implement the function provided to Antireduce.Exploration.multikey_explore for the label keys_of_output.

State

The state of a domain program, given by a value of State.t is conceived as a cursor which points to a particular port of a particular node, and has access to a stack of nodes. When its pointed-to port is empty, and so is the same port of the node at the top of its stack, it may connect these two ports with a new edge. Upon adding a new node, the cursor is marooned on the isolated node it has created, and must make such a connection in order to traverse the graph once again. If it fails to do so, the resulting graph will be considered malformed.

A program may increment and decrement the number of the port it is pointing to; if the port it points to is occupied, it also has the option to move its cursor across that edge to a neighboring node. If the pointed-to port is unoccupied and it attempts to move anyway, the location of its cursor will not change. In conjunction with a conditional operation, this can be used to test what connections nodes have and execute contingent behaviors. Lastly, by wrapping a series of operations in a "function call" a program may change the graph but retain its state in all other respect upon their completion.

The above roughly characterizes what is possible using the functions in the submodule State.Operations; all functions not contained in this submodule are simply utilities helpful for writing the functions therein. Since their semantics are quite important, I also include a more exhaustive listing and set of descriptions:

  • identity: the identity function specialized to State.t. Required so that heap search can be complete without the use of lambdas beyond those introduced by invented primitives.
  • next_port: increments the port number of the cursor.
  • prev_port: decrements the port number of the cursor.
  • func: performs an subroutine then restores the previous state of the cursor and stack; has no effect if the graph created by the subroutine has more than one component (it fails to connect a node it has added to at least one neighbor).
  • if_positions_equal: for subroutines f, g, h and k, performs h if the cursor points to the same node after performing f as it does after performing g; otherwise, performs k.
  • move: moves the cursor to the node on the other end of an edge, if an edge exists at the port pointed-to in the current state.
  • add: adds an isolated node to the graph. Fails if the cursor already points to an isolated node or the stack is currently empty.
  • push: adds the currently pointed-to node to the top of the stack.
  • pop: if the cursor does not currently point to an isolated node, or the stack contains nodes other than the isolated node below the top node, removes the node at the top of the stack.
  • connect: provided the currently pointed-to port is not occupied and the stack is not empty, attempts to add an edge between the pointed-to node and the node at the top of the stack at the pointed-to port. Barring unexpected error, this attempt succeeds provided that the pointed-to port is unoccupied for the node at the top of the stack.

The pseudo-operation save, which is not used by Antireduce.Exploration.multikey_explore during enumeration, is also included in the State.Operations submodule, and simply assigns the state it receives to a dedicated reference before returning it.