diff --git a/Makefile b/Makefile index 767a3ec22..38f2562ca 100644 --- a/Makefile +++ b/Makefile @@ -18,6 +18,7 @@ clean: asciidoc: $(MAKE) -C doc/asciidoc + cp doc/asciidoc/manual.html manual.html docker: docker build --tag sail:0.1 . diff --git a/README.md b/README.md index 47f2dc44a..83e902317 100644 --- a/README.md +++ b/README.md @@ -38,7 +38,7 @@ descriptions of the current state. This repository contains the implementation of Sail, together with some Sail specifications and related tools. -* A manual, [manual.pdf](manual.pdf) with source (in [doc/](doc/)) +* A manual, [manual.html](manual.html) with source (in [doc/](doc/)) * The Sail source code (in [src/](src/)) diff --git a/doc/asciidoc/Makefile b/doc/asciidoc/Makefile index b207b47fa..449761a09 100644 --- a/doc/asciidoc/Makefile +++ b/doc/asciidoc/Makefile @@ -1,3 +1,5 @@ +SAIL_LOGO_PATH ?= etc/logo/sail_logo_square.png + SAIL_PLUGIN ?= asciidoctor-sail SAIL_DOCS = sail_doc/riscv_duopod.json @@ -64,7 +66,10 @@ parser.txt: ../../src/lib/parser.mly parser.sed manual.pdf: *.adoc $(SAIL_DOCS) $(INCS) $(TIKZ_FIGURES) parser.txt asciidoctor-pdf manual.adoc -r $(SAIL_PLUGIN) -r asciidoctor-bibtex -manual.html: *.adoc $(SAIL_DOCS) $(INCS) $(TIKZ_FIGURES) parser.txt manual.css sail_logo_square.png +docinfo.html: docinfo.template + sed 's/SAIL_LOGO_PATH/$(shell echo $(SAIL_LOGO_PATH) | sed 's/\//\\\//g')/' $< > $@ + +manual.html: *.adoc $(SAIL_DOCS) $(INCS) $(TIKZ_FIGURES) parser.txt manual.css docinfo.html asciidoctor manual.adoc -r $(SAIL_PLUGIN) -r asciidoctor-bibtex .PHONY: clean diff --git a/doc/asciidoc/docinfo.html b/doc/asciidoc/docinfo.template similarity index 89% rename from doc/asciidoc/docinfo.html rename to doc/asciidoc/docinfo.template index 59b242773..008eb166d 100644 --- a/doc/asciidoc/docinfo.html +++ b/doc/asciidoc/docinfo.template @@ -12,7 +12,7 @@ + + + +
+
+

Introduction

+
+
+

Sail is a language for expressing the instruction-set +architecture (ISA) semantics of processors.

+
+
+

Vendor architecture specification documents typically describe the +sequential behaviour of their ISA with a combination of prose, tables, +and pseudocode for each instruction.

+
+
+

They vary in how precise that pseudocode is: in some it is just +suggestive, while in others it is close to a complete description of +the envelope of architecturally allowed behaviour for sequential code.

+
+
+

For x86[1], the Intel pseudocode is just suggestive, +with embedded prose, while the AMD descriptions[2] +are prose alone. For IBM Power[3], there is +detailed pseudocode which has recently become +machine-processed[4]. For +Arm[5], there is detailed pseudocode, which has +recently become machine-processed[6]. For +MIPS[7, 8] there is also reasonably detailed +pseudocode.

+
+
+

Sail is intended:

+
+
+
    +
  • +

    To support precise definition of real-world ISA semantics;

    +
  • +
  • +

    To be accessible to engineers familiar with existing vendor +pseudocodes, with a similar style to the pseudocodes used by ARM and +IBM Power (modulo minor syntactic differences);

    +
  • +
  • +

    To expose the structure needed to combine the sequential ISA +semantics with the relaxed-memory concurrency models we have +developed;

    +
  • +
  • +

    To provide an expressive type system that can statically check the +bitvector length and indexing computation that arises in these +specifications, to detect errors and to support code generation, +with type inference to minimise the required type annotations;

    +
  • +
  • +

    To support execution, for architecturally complete emulation +automatically based on the definition;

    +
  • +
  • +

    To support automatic generation of theorem-prover definitions, for +mechanised reasoning about ISA specifications; and

    +
  • +
  • +

    To be as minimal as possible given the above, to ease the tasks of +code generation and theorem-prover definition generation.

    +
  • +
+
+
+

A Sail specification will typically define an abstract syntax type/tre +(AST) of machine instructions, a decode function that takes binary +values to AST values, and an execute function that describes how each +of those behaves at runtime, together with whatever auxiliary +functions and types are needed.

+
+
+

Given such a specification, the Sail implementation can typecheck it +and generate:

+
+
+
    +
  • +

    An internal representation of the fully type-annotated +definition (a deep embedding of the definition) in a form that can +be executed by the Sail interpreter. These are both expressed in +Lem[9, 10], a language of type, function, and +relation definitions that can be compiled into OCaml and various +theorem provers. The Sail interpreter can also be used to analyse +instruction definitions (or partially executed instructions) to +determine their potential register and memory footprints.

    +
  • +
  • +

    A shallow embedding of the definition, also in Lem, that can be +executed or converted to theorem-prover code more directly. +Currently this is aimed at Isabelle/HOL or HOL4, though the Sail +dependent types should support generation of idiomatic Coq definitions +(directly rather than via Lem).

    +
  • +
  • +

    A compiled version of the specification +directly into OCaml.

    +
  • +
  • +

    A more efficient executable version of the specification, compiled +into C code.

    +
  • +
+
+
+

Sail has been used to develop models of parts of several architectures:

+
+ +++++ + + + + + + + + + + + + + + + + + + + + + + +

Arm-v8 (ASL)

generated from Arm’s v8.5 public ASL spec

github.com/rems-project/sail-arm/tree/master/arm-v8.5-a

Arm-v9 (ASL)

generated from Arm’s v9.3 public ASL spec

github.com/rems-project/sail-arm/tree/master/arm-v9.3-a

RISC-V

hand-written

github.com/riscv/sail-riscv

CHERI-MIPS

hand-written

github.com/CTSRD-CHERI/sail-cheri-mips

+
+

The Arm-v8 (ASL) model is based on an automatic translation of Arm’s +machine-readable public v8.3 ASL specification [1]. It includes everything in +ARM’s specification.

+
+
+

The MIPS model is hand-written based on the MIPS64 manual version +2.5[7, 8], +but covering only the features in the BERI hardware +reference[11], +which in turn drew on MIPS4000 and MIPS32[12, 13].

+
+
+

The CHERI model is based on that and the CHERI ISA reference manual +version 5[14]. These two are both +principally by Norton-Wright; they cover all basic user +and kernel mode MIPS features sufficient to boot FreeBSD, including a +TLB, exceptions and a basic UART for console interaction. ISA +extensions such as floating point are not covered. The CHERI model +supports either 256-bit capabilities or 128-bit compressed +capabilities.

+
+
+
+
+

A tutorial RISC-V style example

+
+
+

We introduce the basic features of Sail via a small example from our +RISC-V model that includes just two instructions: add immediate and +load double. We start by defining the default order (see Vectors +for details), and including the Sail prelude:

+
+
+
+
default Order dec
+$include <prelude.sail>
+
+
+
+

The Sail prelude is very minimal, and it is expected that Sail +specifications will usually build upon it. Some Sail specifications +are derived from pre-existing pseudocode, which already use specific +idioms — our Sail Arm specification uses ZeroExtend and +SignExtend mirroring the ASL code, whereas our MIPS and RISC-V +specification use EXTZ and EXTS for the same functions. Therefore for +this example we define zero-extension and sign-extension functions as:

+
+
+
+
val EXTZ : forall 'n 'm, 'm >= 'n. (implicit('m), bits('n)) -> bits('m)
+function EXTS(m, v) = sail_sign_extend(v, m)
+
+val EXTS : forall 'n 'm, 'm >= 'n. (implicit('m), bits('n)) -> bits('m)
+function EXTS(m, v) = sail_sign_extend(v, m)
+
+
+
+

We now define an integer type synonym xlen, which for this example +will be equal to 64. Sail supports definitions which are generic over +both regular types, and integers (think const generics in C++, but +more expressive). We also create a type xlenbits for bitvectors of +length xlen.

+
+
+
+
type xlen       : Int = 64
+type xlen_bytes : Int = 8
+type xlenbits         = bits(xlen)
+
+
+
+

For the purpose of this example, we also introduce a type synonym for +bitvectors of length 5, which represent registers.

+
+
+
+
type regbits = bits(5)
+
+
+
+

We now set up some basic architectural state. First creating a +register of type xlenbits for both the program counter PC, and +the next program counter, nextPC. We define the general purpose +registers as a vector of 32 xlenbits bitvectors. The dec +keyword isn’t important in this example, but Sail supports two +different numbering schemes for (bit)vectors inc, for most +significant bit is zero, and dec for least significant bit is +zero. We then define a getter and setter for the registers, which +ensure that the zero register is treated specially (in +RISC-V register 0 is always hardcoded to be 0). Finally we overload +both the read (rX) and write (wX) functions as simply +X. This allows us to write registers as X(r) = value and +read registers as value = X(r). Sail supports flexible ad-hoc +overloading, and has an expressive l-value language in assignments, +with the aim of allowing pseudo-code like definitions.

+
+
+
+

+register PC : xlenbits
+register nextPC : xlenbits
+
+register Xs : vector(32, dec, xlenbits)
+
+val rX : regbits -> xlenbits
+function rX(r) =
+  match r {
+    0b00000 => EXTZ(0x0),
+    _ => Xs[unsigned(r)]
+  }
+
+val wX : (regbits, xlenbits) -> unit
+function wX(r, v) =
+  if r != 0b00000 then {
+     Xs[unsigned(r)] = v;
+  }
+
+
+
+

We also give a function MEMr for reading memory, this function just +points at a builtin we have defined elsewhere. The builtin is very +general (it allows addresses of multiple bitwidths), so we also derive +a simpler read_mem function.

+
+
+
+
val MEMr = monadic { lem: "MEMr", coq: "MEMr", _ : "read_ram" } : forall 'n 'm, 'n >= 0.
+   (int('m), int('n), bits('m), bits('m)) -> bits(8 * 'n)
+
+val read_mem : forall 'n, 'n >= 0. (xlenbits, int('n)) -> bits(8 * 'n)
+function read_mem(addr, width) =
+    MEMr(sizeof(xlen), width, EXTZ(0x0), addr)
+
+
+
+

It’s common when defining architecture specifications to break +instruction semantics down into separate functions that handle +decoding (possibly even in several stages) into custom intermediate +datatypes and executing the decoded instructions. However it’s often +desirable to group the relevant parts of these functions and datatypes +together in one place, as they would usually be found in an +architecture reference manual. To support this Sail supports +scattered definitions. First we give types for the execute and +decode functions, as well as the ast union.

+
+
+
+
enum iop = {RISCV_ADDI, RISCV_SLTI, RISCV_SLTIU, RISCV_XORI, RISCV_ORI, RISCV_ANDI}
+
+scattered union ast
+
+val decode : bits(32) -> option(ast)
+val execute : ast -> unit
+
+
+
+

Now we provide the clauses for the add-immediate ast type, as well +as its execute and decode clauses. We can define the decode function +by directly pattern matching on the bitvector representing the +instruction. Sail supports vector concatenation patterns (@ is the +vector concatenation operator), and uses the types provided +(e.g. bits(12) and regbits) to destructure the vector in the +correct way. We use the EXTS function that sign-extends its +argument.

+
+
+
+
union clause ast = ITYPE : (bits(12), regbits, regbits, iop)
+
+function clause decode imm : bits(12) @ rs1 : regbits @ 0b000 @ rd : regbits @ 0b0010011
+  = Some(ITYPE(imm, rs1, rd, RISCV_ADDI))
+
+function clause execute (ITYPE (imm, rs1, rd, RISCV_ADDI)) = {
+  let rs1_val = X(rs1);
+  let imm_ext : xlenbits = EXTS(imm);
+  let result = rs1_val + imm_ext;
+  X(rd) = result
+}
+
+
+
+

Now we do the same thing for the load-double instruction:

+
+
+
+
union clause ast = LOAD : (bits(12), regbits, regbits)
+
+function clause decode imm : bits(12) @ rs1 : regbits @ 0b011 @ rd : regbits @ 0b0000011
+  = Some(LOAD(imm, rs1, rd))
+
+function clause execute(LOAD(imm, rs1, rd)) = {
+    let addr : xlenbits = X(rs1) + EXTS(imm);
+    let result : xlenbits = read_mem(addr, sizeof(xlen_bytes));
+    X(rd) = result
+}
+
+
+
+

Finally we define the fallthrough case for the decode function. Note +that the clauses in a scattered function will be matched in the order +they appear in the file.

+
+
+
+
function clause decode _ = None()
+
+
+
+
+
+

Usage

+
+
+

In its most basic use-case Sail is a command-line tool, analogous to +a compiler: one gives it a list of input Sail files; it type-checks +them and provides translated output.

+
+
+

To simply typecheck Sail files, one can pass them on the command line +with no other options, for example:

+
+
+
+
sail prelude.sail riscv_types.sail riscv_mem.sail riscv_sys.sail riscv_vmem.sail riscv.sail
+
+
+
+

The sail files passed on the command line are simply treated as if +they are one large file concatenated together, although the parser +will keep track of locations on a per-file basis for +error-reporting. As can be seen, this specification is split into +several logical components. prelude.sail defines the initial +type environment and builtins, riscv_types.sail gives type +definitions used in the rest of the specification, riscv_mem.sail +and riscv_vmem.sail describe the physical and virtual memory +interaction, and then riscv_sys.sail and riscv.sail +implement most of the specification.

+
+
+

One can use also use $include directives in Sail source, for +example:

+
+
+
+
$include <library.sail>
+$include "file.sail"
+
+
+
+

Here, Sail will look for library.sail in the $SAIL_DIR/lib +directory, where $SAIL_DIR is usually the root of the sail +repository (or opam var sail:share when Sail is installed using +opam). It will search for file.sail relative to the location of the +file containing the $include. The space after the $include before +the filename is mandatory. Sail also supports $define, $ifdef, and +$ifndef for basic conditional compilation. These are things that are +understood by Sail itself, not a separate preprocessor, and are +handled after the AST is parsed.

+
+
+

For more complex projects, a module hierarchy can be defined. See the +Modules section for details.

+
+
+

Sail options

+
+

For command line arguments, by default Sail accepts arguments of +the form ‑long_opt, i.e. leading with a single - and words +separated by _. For those who find this departure from standard +convention distasteful, you can define the SAIL_NEW_CLI environment +variable, and all such options will be formatted as +‑‑long‑opt (short opts like ‑o will remain +unchanged). For backwards compatibility reasons, this default is hard +to change.

+
+
+

For a list of all options, one can call Sail as sail -help.

+
+
+
+
+

C compilation

+
+

To compile Sail into C, the -c option is used, like so:

+
+
+
+
sail -c FILES 1> out.c
+
+
+
+

The translated C is by default printed to stdout, but one can also use +the -o option to output to a file, so

+
+
+
+
sail -c FILES -o out
+
+
+
+

will generate a file called out.c. To produce an executable +this needs to be compiled and linked with the C files in the +$SAIL_DIR/lib directory:

+
+
+
+
gcc out.c $SAIL_DIR/lib/*.c -lgmp -lz -I $SAIL_DIR/lib/ -o out
+
+
+
+

The C output requires the GMP library for arbitrary precision +arithmetic, as well as zlib for working with compressed ELF binaries.

+
+
+

There are several Sail options that affect the C output:

+
+
+
    +
  • +

    -O turns on optimisations. The generated C code will be quite slow +unless this flag is set.

    +
  • +
  • +

    -Oconstant_fold apply constant folding optimisations.

    +
  • +
  • +

    -c_include Supply additional header files to be included in the +generated C.

    +
  • +
  • +

    -c_no_main Do not generate a main() function.

    +
  • +
  • +

    -static Mark generated C functions as static where possible. This +is useful for measuring code coverage.

    +
  • +
+
+
+
+
+

SystemVerilog compilation (Experimental)

+
+ + + + + +
+
Caution
+
+This feature is new and experimental, so it is not guaranteed +to provide working SystemVerilog. Furthermore, it is intended for +hardware model checking against a hand-written design. Sail is not a +hardware description language! +
+
+
+

To compile Sail into SystemVerilog, the -sv option is used. The -o +option provides a prefix that is used on the various generated files.

+
+
+

There are several options for the SystemVerilog output:

+
+
+
    +
  • +

    -sv_output_dir Generate all files in the specified directory

    +
  • +
  • +

    -sv_include Add an include directive to the generated SystemVerilog

    +
  • +
  • +

    -sv_verilate Can be used as either -sv_verilate run or +-sv_verilate compile. If used Sail will automatically invoke +verilator on the generated +SystemVerilog

    +
  • +
  • +

    -sv_lines Output SystemVerilog `line directives to aid debugging.

    +
  • +
  • +

    -sv_int_size Set the maximum integer size allowed in the specification.

    +
  • +
  • +

    -sv_bits_size Bound the maximum bitvector width on the generated SystemVerilog.

    +
  • +
  • +

    -sv_specialize The -sv_specialize n option will perform n +rounds of specialisation on the Sail code before generating +SystemVerilog. This will make bitvectors more monomorphic, but at +the cost of code duplication.

    +
  • +
+
+
+

The are various other options that control various minutae about the +generated SystemVerilog, see sail -help for more details.

+
+
+
+
+

Automatic formatting (Experimental)

+
+ + + + + +
+
Caution
+
+This feature is new and experimental, so be sure to inspect +changes to source made by the tool and use at your own risk! +
+
+
+

Sail supports automatic code formatting similar to tools like go fmt +or rustfmt. This is built into Sail itself, and can be used via the +-fmt flag. To format a file my_file.sail, we would use the command:

+
+
+
+
sail -fmt my_file.sail
+
+
+
+

Note that Sail does not attempt to type-check files when formatting +them, so in this case we do not necessarily have to pass the other +files that my_file.sail would otherwise require to +type-check. However, it is perfectly fine to pass multiple files like +so:

+
+
+
+
sail -fmt file1.sail file2.sail file3.sail
+
+
+
+

The one exception is if a file uses a custom infix operator, then the +file that declares that operator must be passed before any file that +uses it. So if my_file.sail uses an operator declared in +operator.sail (otherwise it would not be able to parenthesize infix +expressions correctly), we would be required to do:

+
+
+
+
sail -fmt operator.sail my_file.sail
+
+
+
+

This will format both files. If we want to avoid formatting +operator.sail, we could either use -fmt_skip, like so:

+
+
+
+
sail -fmt_skip operator.sail -fmt operator.sail my_file.sail
+
+
+
+

or the -fmt_only option, like so:

+
+
+
+
sail -fmt_only my_file.sail -fmt operator.sail my_file.sail
+
+
+
+

Both of these options can be passed multiple times if required.

+
+
+

Formatting configuration is done using a JSON configuration file: +as:

+
+
+
+
{
+    "fmt": {
+        "indent": 4,
+        "preserve_structure": false,
+        "line_width": 120,
+        "ribbon_width": 1.0
+    }
+}
+
+
+
+
+

which is passed to sail with the -config flag.

+
+
+

The various keys supported under the "fmt" key are as follows:

+
+
+
    +
  • +

    indent The default indentation level

    +
  • +
  • +

    preserve_structure Preserve the structure of the syntax tree as +much as possible. Note that the use of this operation is not +recommended as it inhibits many formatting options, such as +inserting blocks on if statements and loops.

    +
  • +
  • +

    line_width The desired maximum line-width. Note that this is a +soft limit, and the line-width can go beyond if there are no +possible line break options (e.g. if you have extremely long +identifiers).

    +
  • +
  • +

    ribbon_width A soft limit on what proportion (between 0.0 and 1.0) +of the line should be non-whitespace. A value of 1.0 indicates that the +entirity of line_width can be taken up by non-whitespace.

    +
  • +
+
+
+

If this file is not passed on the command line, Sail will check the +$SAIL_CONFIG environment variable, and if that is unset it will +search for a file named sail_config.json in the current working +directory, then recursively backwards through parent directories.

+
+
+
+
+

Interactive mode

+
+

Sail has a GHCi-style interactive interpreter. This can be used by +starting Sail with sail -i. Sail will still handle any other +command line arguments as per usual. To use Sail files within the +interpreter they must be passed on the command line as if they were +being compiled normally. One can also pass a list of commands to the +interpreter by using the -is flag, as

+
+
+
+
sail -is <file>
+
+
+
+

where <file> contains a list of commands. Once inside the +interactive mode, a list of available commands can be accessed by +typing :commands, while :help <command> can be used to provide +some documentation for each command.

+
+
+
+
+

Other options

+
+

Here we summarize most of the other options available for +Sail. Debugging options (usually for debugging Sail itself) are +indicated by starting with the letter d.

+
+
+
    +
  • +

    -v Print the Sail version.

    +
  • +
  • +

    -help Print a list of options.

    +
  • +
  • +

    -no_warn Turn off warnings.

    +
  • +
  • +

    -enum_casts Allow elements of enumerations to be +automatically cast to numbers.

    +
  • +
  • +

    -memo_z3 Memoize calls to the Z3 solver. This can greatly improve +typechecking times if you are repeatedly typechecking the same +specification while developing it.

    +
  • +
  • +

    -no_lexp_bounds_check Turn off bounds checking in the left hand +side of assignments.

    +
  • +
  • +

    -undefined_gen Generate functions that create undefined values of +user-defined types. Every type T will get a undefined_T function +created for it. This flag is set automatically by some backends that +want to re-write undefined.

    +
  • +
  • +

    -just_check Force Sail to terminate immediately after +typechecking.

    +
  • +
  • +

    -dtc_verbose <verbosity> Make the typechecker print a +trace of typing judgements. If the verbosity level is 1, then this +should only include fairly readable judgements about checking and +inference rules. If verbosity is 2 then it will include a large +amount of debugging information. This option can be useful to +diagnose tricky type-errors, especially if the error message isn’t +very good.

    +
  • +
  • +

    -ddump_tc_ast Write the typechecked AST to stdout after +typechecking

    +
  • +
  • +

    -ddump_rewrite_ast <prefix> Write the AST out after each +re-writing pass. The output from each pass is placed in a file +starting with prefix.

    +
  • +
  • +

    -dmagic_hash Allow the # symbol in identifiers. It’s +currently used as a magic symbol to separate generated identifiers +from those the user can write, so this option allows for the output +of the various other debugging options to be fed back into Sail. The +name comes from the GHC option with the same purpose: +ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/magic_hash.html

    +
  • +
+
+
+
+
+
+

The Sail Language

+
+
+

Functions

+
+

In Sail, we often define functions in two parts. First we can write +the type signature for the function using the val keyword, then +define the body of the function using the function keyword. In this +Subsection, we will write our own version of the replicate_bits +function from the Sail library. This function takes a number n and a +bitvector, and creates a new bitvector containing that bitvector +copied n times.

+
+
+
+
val my_replicate_bits : forall 'n 'm, 'm >= 1 & 'n >= 1.
+    (int('n), bits('m)) -> bits('n * 'm)
+
+
+
+

This signature shows how Sail can track the length of bitvectors and +the value of integer variables in type signatures, using type +variables. Type variables are written with a leading 'tick', so 'n +and 'm are the type variables in the above signature.

+
+
+ + + + + +
+
Note
+
+The leading tick is a convention derived from Standard ML, and +other functional languages derived from Standard ML, such as OCaml. +Readers who are familiar with Rust will also recognise this naming +convention from lifetime variables in Rust types. The val keyword to +declare functions is also taken from OCaml. +
+
+
+

The type bits('m) is a bitvector of length 'm, and int('n) is an +integer with the value 'n. The result of this function will +therefore be a bitvector of length 'n * 'm. We can also add +constraints on these types. Here we require that we are replicating +the input bitvector at least once with the 'n >= 1 constraint, and +that the input bitvector length is at least one with the 'm >= 1 +constraint. Sail will check that all callers of this function are +guaranteed to satisfy these constraints.

+
+
+

Sail will also ensure that the output of our function has precisely +the length bits('n * 'm) for all possible inputs (hence why the +keyword uses the mathematical forall quantifier).

+
+
+

A potential implementation of this function looks like:

+
+
+
+
function my_replicate_bits(n, xs) = {
+    var ys = zeros(n * length(xs));
+    foreach (i from 1 to n) {
+        ys = ys << length(xs);
+        ys = ys | zero_extend(xs, length(ys))
+    };
+    ys
+}
+
+
+
+

Functions may also have implicit parameters, e.g. we can implement a +zero extension function that implicitly picks up its result length +from the calling context as follows:

+
+
+
+
val extz : forall 'n 'm, 'm >= 'n. (implicit('m), bits('n)) -> bits('m)
+
+function extz(m, xs) = zero_extend(xs, m)
+
+
+
+

Implicit parameters are always integers, and they must appear first +before any other parameters in the function type signature. The first +argument can then just be omitted when calling the function, like so:

+
+
+
+
let xs: bits(32) = 0x0000_0000;
+let ys: bits(64) = extz(xs);
+
+
+
+
+

Blocks

+
+

You may have noticed that in the definition of my_replicate_bits +above, there was no return keyword. This is because unlike languages +such as C and C++, and more similar to languages like OCaml and Rust, +everything in Sail is an expression which evaluates to a value. A +block in Sail is simply a sequence of expressions surrounded by curly +braces { and }, and separated by semicolons. The value returned by +a block is the value returned by the last expressions, and likewise +the type of a block is determined by it’s final expressions, so +{ A; B; C }, will evaluate to the value of C +after evaluating A and B in order. The expressions other than the +final expression in the block must have type unit, which is +discussed in the following section. Within blocks we can declare +immutable variables using let, and mutable variables using var, +for example:

+
+
+
+
{
+    let x : int = 3;
+    var y : int = 2;
+    y = y + 1;
+    x + y
+}
+
+
+
+

The above block has type int and evaluates to the value 6.

+
+
+ + + + + +
+
Note
+
+For those familiar with Rust, a trailing semicolon in Sail does +not change the semantics of the block and is purely optional. +
+
+
+
+

The unit type

+
+

The simplest type in Sail is the unit type unit. It is a type with +a single member (). Rather than have functions that takes zero +arguments, we have functions that take a single unit argument. +Similarly, rather than having functions that return no results, a +function with no meaningful return value can return (). The () +notation reflects the fact that the unit type can be thought of as an +empty tuple (see Tuples).

+
+
+

In Sail unit plays a similar role to void in C and C++, except +unlike void it is an ordinary type and can appear anywhere and be used +in generic functions.

+
+
+

The wikipedia page for the unit type, +goes into further details on the difference between unit and void.

+
+
+
+

Numeric types and bits

+
+

Sail has three basic numeric types, int, nat, and range. The +type int which we have already seen above is an arbitrary precision +mathematical integer. Likewise, nat is an arbitrary precision +natural number. The type range('n, 'm) is an inclusive range +between type variables 'n and 'm. For both int and nat we can +specify a type variable that constrains elements of the type to be +equal to the value of that type variable. In other words, the values +of type int('n) are integers equal to 'n. So +5 : int(5) and similarly for any integer constant. These +types can often be used interchangeably provided certain constraints +are satisfied. For example, int('n) is equivalent to +range('n, 'n) and range('n, 'm) can be converted into +int('n) when 'n == 'm.

+
+
+ + + + + +
+
Note
+
+The colon operator : is used for type ascription, so x : Y +is read as x has type Y +
+
+
+

Note that bit isn’t a numeric type (i.e. it’s not +range(0, 1)). This is intentional, as otherwise it would be +possible to write expressions like (1 : bit) + 5 +which would end up being equivalent to +6 : range(5, 6). This kind of implicit casting from +bits to other numeric types would be highly undesirable. The bit +type itself is a two-element type with members bitzero and bitone.

+
+
+

In addition, we can write a numeric type that only contains a fixed +set of integers. The type {32, 64} is a type that can only +contain the values 32 and 64.

+
+
+ + + + + +
+
Note
+
+In older Sail versions the numeric set type would have been +denoted {|32, 64|}. +
+
+
+
+

Bitvector literals

+
+

Bitvector literals in Sail are written as either 0x followed by a +sequence of hexadecimal digits or 0b followed by a sequence of +binary digits, for example 0x12FE or 0b1010100. The length of a +hex literal is always four times the number of digits, and the length +of binary string is always the exact number of digits, so 0x12FE has +length 16, while 0b1010100 has length 7. To ensure bitvector logic +in specifications is precisely specified, we do not allow any kind of +implicit widening or truncation as might occur in C. To change the +length of a bitvector, explicit zero/sign extension and truncation +functions must be used. Underscores can be used in bitvector literals +to separate groups of bits (typically in groups of 16), for example:

+
+
+
+
let large_bitvector : bits(64) = 0xFFFF_0000_1234_0000
+
+
+
+

We can make bitvectors as large as we need:

+
+
+
+
let even_larger_bitvector : bits(192) =
+  0xFFFF_FFFF_FFFF_FFFF_0000_0000_0000_0000_ABCD_ABCD_ABCD_ABCD
+
+
+
+

We can also write bitvectors very verbosely using bitzero and +bitone, like:

+
+
+
+
let v : bits(2) = [bitzero, bitzero]
+
+
+
+

The @ operator is used to concatenate bitvectors, for example:

+
+
+
+
let x = 0xFFFF;
+let y = 0x0000;
+assert(x @ y == 0xFFFF_0000);
+
+
+
+

For historical reasons the bit type is not equal to bits(1), and +while this does simplify naively mapping the bits type into a (very +inefficient!) representation like bit list in Isabelle or OCaml, it +might be something we reconsider in the future.

+
+
+

Sail allows two different types of bitvector orderings---increasing +(inc) and decreasing (dec). These two orderings are shown for the +bitvector 0b10110000 below.

+
+
+
+ordering tikz +
+
+
+

For increasing (bit)vectors, the 0 index is the most significant bit +and the indexing increases towards the least significant bit. Whereas +for decreasing (bit)vectors the least significant bit is 0 indexed, +and the indexing decreases from the most significant to the least +significant bit. For this reason, increasing indexing is sometimes +called `most significant bit is zero' or MSB0, while decreasing +indexing is sometimes called `least significant bit is zero' or +LSB0. While this vector ordering makes most sense for bitvectors (it +is usually called bit-ordering), in Sail it applies to all +vectors. A default ordering can be set using

+
+
+
+
default Order dec
+
+
+
+

and this should usually be done right at the beginning of a +specification. This setting is global, and increasing and decreasing +bitvectors can therefore never be mixed within the same specification!

+
+
+

In practice decreasing order is the almost universal standard and only +POWER uses increasing order. All currently maintained Sail +specifications use decreasing. You may run into issues with increasing +bitvectors as the code to support these is effectively never exercised +as a result.

+
+
+
+

Vectors

+
+

Sail has the built-in type vector, which is a polymorphic type for +fixed-length vectors. For example, we could define a vector v of +three integers as follows:

+
+
+
+
let v : vector(3, int) = [3, 2, 1]
+
+
+
+

The first argument of the vector type is a numeric expression +representing the length of the vector, and the second is the type of +the vector’s elements. As mentioned in the bitvector section, the +ordering of bitvectors and vectors is always the same, so:

+
+
+
+
let a_generic_vector : vector(3, bit) = [bitzero, bitzero, bitone]
+let a_bit_vector : bits(3) = [bitzero, bitzero, bitone] // 0b001
+
+assert(a_generic_vector[0] == bitone)
+assert(a_bitvector_vector[0] == bitone)
+
+
+
+

Note that a generic vector of bits and a bitvector are not the same +type, despite us being able to write them using the same syntax. This +means you cannot write a function that is polymorphic over both +generic vectors and bitvectors). This is because we typically want +these types to have very different representations in our various Sail +backends, and we don’t want to be forced into a compilation strategy +that requires monomorphising such functions.

+
+
+
+

Accessing and updating vectors

+
+

A (bit)vector can be indexed by using the vector index notation. So, +in the following code:

+
+
+
+
let v : vector(4, int) = [1, 2, 3, 4]
+let a = v[0]
+let b = v[3]
+
+
+
+

a will be 4, and b will be 1 (we assume default Order dec +here). By default, Sail will statically check for out of bounds +errors, and will raise a type error if it cannot prove that all such +vector accesses are valid.

+
+
+

A vector v can be sliced using the v[n .. m] notation. The +indexes are always supplied with the index closest to the MSB being +given first, so we would take the bottom 32-bits of a decreasing +bitvector v as v[31 .. 0], and the upper 32-bits of an +increasing bitvector as v[0 .. 31], i.e. the indexing +order for decreasing vectors decreases, and the indexing order for +increasing vectors increases.

+
+
+

A vector v can have an index index using +[v with index = expression]. Similarly, a +sub-range of v can be updated using +[v with n .. m = expression] where the +order of the indexes is the same as described above for increasing and +decreasing vectors.

+
+
+
+

The list type

+
+

In addition to vectors, Sail also has list as a built-in type. For +example:

+
+
+
+
let l : list(int) = [|1, 2, 3|]
+
+
+
+

The cons operator is ::, so we could equally write:

+
+
+
+
let l : list(int) = 1 :: 2 :: 3 :: [||]
+
+
+
+

For those unfamiliar the word, 'cons' is derived from Lisp dialects, +and has become standard functional programming jargon for such an +operator — see en.wikipedia.org/wiki/Cons for more details.

+
+
+ + + + + +
+
Caution
+
+The list type (plus the recursive functions typically used to +manipulate lists) does not work well with certain Sail targets, such +as the SMT and SystemVerilog backends. The vector type is almost +always preferable to the list type. The inclusion of the list type +(where we otherwise forbid recursive types) was perhaps a design +mistake. +
+
+
+
+

Tuples

+
+

Sail has tuple types which represent heterogeneous sequences containing +values of different types. A tuple type (T1, T2, …​) has +values (x1, x2, …​) where x1 : T1, +x2 : T2 and so on. A tuple must have 2 or more elements. +Some examples of tuples would be:

+
+
+
+
let tuple1 : (string, int) = ("Hello, World!", 3)
+
+let tuple2 : (nat, nat, nat) = (1, 2, 3)
+
+
+
+

Note that while the function type (A, B) -> C might +look like a function taking a single tuple argument, it is in fact a +function taking two arguments. If we wanted to write a function taking +a single tuple argument, we would instead write:

+
+
+
+
val single_tuple_argument : ((int, int)) -> unit
+
+function single_tuple_argument(tuple) = {
+  let (x, y) = tuple;
+  print_int("x = ", x);
+  print_int("y = ", y);
+}
+
+
+
+

which would be called as

+
+
+
+
single_tuple_argument((1, 2))
+
+
+
+ + + + + +
+
Note
+
+This is because in Sail the function type is denoted +(A, B, …​) -> C, but we allow the brackets to +be elided when the function has a single non-tuple argument so we can +write A -> B rather than (A) -> B. +
+
+
+
+

Strings

+
+

Sail has a string type, which is primarily used for error reporting +and debugging.

+
+
+ + + + + +
+
Caution
+
+Sail is not a language designed for working with strings, and +the semantics of ISA specifications should not depend on any logic +involving strings. If you find yourself using strings for reasons +other than printing or logging errors in a Sail specification, you +should probably reconsider. +
+
+
+

A Sail string is any sequence of ASCII characters between double +quotes. Backslash is used to introduce escape sequences, and the +following escape sequences are supported:

+
+
+
    +
  • +

    \\ — Backslash

    +
  • +
  • +

    \n — Newline character

    +
  • +
  • +

    \t — Tab character

    +
  • +
  • +

    \b — Backspace character

    +
  • +
  • +

    \r — Carriage return

    +
  • +
  • +

    \' — Single quote (somewhat unnecessary, as single quotes are allowed in Sail strings)

    +
  • +
  • +

    \" — Double quote

    +
  • +
  • +

    \DDD — The character with decimal ASCII code DDD

    +
  • +
  • +

    \xHH — The character with hexadecimal ASCII code HH

    +
  • +
+
+
+

Multi-line strings can be written by escaping the newline character at the end of a line:

+
+
+
+
print_endline("Hello, \
+               World!");
+// Is equivalent to
+print_endline("Hello, World!")
+
+
+
+
+

Pattern matching

+
+

Like most functional languages, Sail supports pattern matching via the +match keyword. For example:

+
+
+
+
let n : int = f();
+match n {
+    1 => print_endline("1"),
+    2 => print_endline("2"),
+    3 => print_endline("3"),
+    _ => print_endline("wildcard"),
+}
+
+
+
+

The match keyword takes an expression and then branches by comparing +the matched value with a pattern. Each case in the match expression +takes the form <pattern> => <expression>, separated by commas (a +trailing comma is allowed). The cases are checked sequentially from +top to bottom, and when the first pattern matches its expression will +be evaluated.

+
+
+

The concrete match statement syntax in Sail is inspired by the syntax +used in Rust — but programmers coming from languages with no pattern +matching features may be unfamiliar with the concept. One can think of +the match statement like a super-powered switch statement in C. At its +most basic level a match statement can function like a switch +statement (except without any fall-through). As in the above example we +can use match to compare an expression against a sequence of values +like so:

+
+
+
+
match expression {
+    1 => print_endline("expression == 1"),
+    2 => {
+        // A match arm can have a single expression or a block
+        // (because blocks in Sail are also expressions!)
+        print_endline("expression == 2")
+        // Note that unlike in C, no 'break' is needed as there is no fallthrough
+    },
+    _ => print_endline("This wildcard pattern acts like default: in a C switch")
+}
+
+
+
+

However the pattern in a match statement can also bind variables. In +the following example we match on a numeric expression x + y, and if +it is equal to 1 we execute the first match arm. However, if that is +not the case the value of x + y is bound to a new immutable variable +z.

+
+
+
+
match x + y {
+    1 => print_endline("x + y == 1"),
+    z => {
+        // here z is a new variable defined as x + y.
+        print_int("x + y = ", z)
+        // z is only defined within the match arm
+    },
+}
+
+
+
+

Finally, we can use patterns to destructure values — breaking them +apart into their constituent parts. For example if we have a pair +expression we can break it apart into the first value in the pair and +the second, which can then be used as individual variables:

+
+
+
+
match pair : (int, int) {
+    (first, second) => {
+        // here we have split a pair into two variables first and second
+        print_int("first = ", first);
+        print_int("second = ", second)
+    }
+}
+
+
+
+

These features can be combined, so if we had a pattern (first, 3) in +the above example, the expression for that pattern would be executed +when the second element of the pair is equal to 3, and the first element +would be bound to the variable first.

+
+
+

Sail will check match statements for exhaustiveness (meaning that the +patterns in the match cover every possible value), and prints a +warning if the overall match statement is not exhaustive. There are +some limitations on the exhaustiveness checker which we will discuss +further below.

+
+
+

Guards

+
+

What if we need to switch based on more complex logic than just the +structure and values of the expressions we are matching on? For this +matches in Sail support guards. A guard allows us to combine the +behavior of a match expression and the boolean logic of an if +expression — and the syntax is reflective of this, as we can use the +if keyword to add extra logic to each match arm:

+
+
+
+
match x + y {
+    z if z > 0  => print_endline("y is greater than 0"),
+    z if z == 0 => print_endline("z is equal to 0"),
+    z           => print_endline("z is less than 0"),
+}
+
+
+
+

You may wonder — why not write z if z < 0 for the final case? Here +we run into one of the limitations of the exhaustiveness checker +mentioned above. Sail can only check the exhaustiveness of unguarded +clauses, meaning that the checker only considers cases without an if +guard. The z pattern by itself is exhaustive, so the checker is +happy, but if we added a if z < 0 guard the checker would complain +that there are no unguarded patterns for it to look at. This may seem +suprising for such a simple case (we can easily see the three guards +would cover all cases!), however each guard clause could contain +arbitrarily complex logic potentially abstracted behind arbitrary +function calls, which the completeness checker cannot reason about.

+
+
+

We now describe all the things that can be matched on in Sail

+
+
+
+

Matching on literals

+
+

First, and as we have already seen, we can match on literal +values. These literal values can be (), bitvectors, the boolean values +true and false, numbers, and strings.

+
+
+
+

Matching on enumerations

+
+

Match can be used to match on possible values of an enum, like so:

+
+
+
+
match x {
+    A => print_endline("A"),
+    B => print_endline("B"),
+    C => print_endline("C")
+}
+
+
+
+

Note that because Sail places no restrictions on the lexical structure +of enumeration elements to differentiate them from ordinary +identifiers, pattern matches on variables and enum elements can be +somewhat ambiguous. Issues with this are usually caught by the +exhaustiveness checker — it can warn you if removing an enum +constructor breaks a pattern match.

+
+
+
+

Matching on tuples

+
+

We use match to destructure tuple types, for example:

+
+
+
+
let x: (int, int) = (2, 3);
+match x {
+    (y, z) => print_endline("y = 2 and z = 3")
+}
+
+
+
+
+

Matching on unions

+
+

Match can also be used to destructure tagged union constructors, for example +using the option type from the Sail library.

+
+
+
+
$include <option.sail>
+
+val g : unit -> option(int)
+
+function match_union() -> unit = {
+    let x = g();
+    match x {
+        Some(n) => print_int("n = ", n),
+        None()  => print_endline("got None()!"),
+    }
+}
+
+
+
+

Note that like how calling a function with a unit argument can be done +as f() rather than f(()), matching on a constructor C with a +unit type can be achieved by using C() rather than C(()).

+
+
+
+

Matching on lists

+
+

Sail allows lists to be destructured using patterns. There are two +types of patterns for lists, cons patterns and list literal +patterns. The cons pattern destructures lists into the first element +(the head) and the rest of the list (the tail).

+
+
+
+
match ys {
+    x :: xs => print_endline("cons pattern"),
+    [||]    => print_endline("empty list")
+}
+
+
+
+

On the other hand, a list pattern matches on the entire list:

+
+
+
+
match ys {
+    [|1, 2, 3|] => print_endline("list pattern"),
+    _           => print_endline("wildcard")
+}
+
+
+
+
+

Matching on structs

+
+

Match can also be used for structures, for example:

+
+
+
+
struct my_struct = {
+    field1 : bits(16),
+    field2 : int,
+}
+
+function match_struct(value : my_struct) -> unit = {
+    match value {
+        // match against literals in the struct fields
+        struct { field1 = 0x0000, field2 = 3 } => (),
+
+        // bind the struct field values to immutable variables
+        struct { field1 = x, field2 = y } => {
+            print_bits("value.field1 = ", x);
+            print_int("value.field2 = ", y)
+        }
+    }
+}
+
+
+
+

We can omit fields from the match by using a wildcard _ in place of +some of the fields:

+
+
+
+
match value {
+    struct { field1 = x, _ } => {
+        print_bits("value.field1 = ", x)
+    }
+}
+
+
+
+

Finally, if we want to create a variable with the same name as a +field, rather than typing struct { field_name = field_name, _ }, we +can shorten this to just struct { field_name, _ }, So the above +example is equivalent to:

+
+
+
+
match value {
+    struct { field1, _ } => {
+        print_bits("value.field1 = ", field1)
+    }
+}
+
+
+
+
+

As patterns

+
+

Like OCaml, Sail also supports naming parts of patterns using the as +keyword. For example, in the above cons pattern example we could bind the +entire matched list to the zs variable:

+
+
+
+
match (1 :: 2 :: ys) : list(int) {
+    x :: xs as zs => print_endline("cons with as pattern"),
+    [||]          => print_endline("empty list"),
+}
+
+
+
+

The as pattern has lower precedence than any other keyword or +operator in a pattern, so in this example the pattern brackets as +(x :: xs) as zs

+
+
+
+

Automatic wildcard insertion

+
+

The various theorem provers that Sail can produce definitions for are +often strict, and enforce that pattern matches are exhaustive. +However, their pattern exhaustiveness checkers do not understand +bitvectors in the same way Sail does. For example, Sail can tell that +the following match is complete:

+
+
+
+
match (x, y) {
+    (0b1, _)   => print_endline("1"),
+    (_, 0b0)   => print_endline("2"),
+    (0b0, 0b1) => print_endline("3"),
+}
+
+
+
+

When translating to a target without bitvector literals, this would be +rewritten to use guards:

+
+
+
+
match (x, y) {
+    (t1, _) if t1 == 0b1 => print_endline("1"),
+    (_, t2) if t2 == 0b0 => print_endline("2"),
+    (t1, t2) if t1 == 0b0 & t2 == 0b1 => print_endline("3"),
+}
+
+
+
+

Most targets that check pattern exhaustiveness share the same limitation +as Sail — they only check match arms without guards, so they +would not see that this match is complete. To avoid this, Sail will +attempt to replace literal patterns with wildcards when possible, so +the above will be rewritten to:

+
+
+
+
match (x, y) {
+    (0b1, _) => print_endline("1"),
+    (_, 0b0) => print_endline("2"),
+    (_, _)   => print_endline("3"),
+}
+
+
+
+

which is equivalent.

+
+
+

One can find situations where such wildcards cannot be inserted. For example:

+
+
+
+
function cannot_wildcard(x: option(bits(1)), y: option(bits(1))) -> unit = {
+    match (x, y) {
+        (Some(0b0), Some(_))   => print_endline("1"),
+        (Some(0b1), Some(0b0)) => print_endline("2"),
+        (Some(0b1), _)         => print_endline("3"),
+        (Some(0b0), None())    => print_endline("4"),
+        (None(), _)            => print_endline("5"),
+    }
+}
+
+
+
+

Here the 0b1 literal in the (Some(0b1), _) case would need to be +replaced for an exhaustiveness checker without bitvector literals to +check the case where x and y are both Some, but this would +change the behavior when x is Some and y is None, hence a +wildcard cannot be inserted.

+
+
+

In this case Sail will print a warning explaining the problem:

+
+
+
+
Warning: Required literal ../examples/cannot_wildcard.sail:17.14-17:
+17 |        (Some(0b1), _)         => print_endline("3"),
+   |              ^-^
+   | 
+Sail cannot simplify the above pattern match:
+This bitvector pattern literal must be kept, as it is required for Sail to show that the surrounding pattern match is complete.
+When translated into prover targets (e.g. Lem, Coq) without native bitvector patterns, they may be unable to verify that the match covers all possible cases.
+
+
+
+

This warning should be heeded, and the match simplified otherwise the +generated theorem prover definitions produced by Sail may be rejected +by the prover.

+
+
+
+
+

Matching in let and function arguments

+
+

The match statement isn’t the only place we can use patterns. We can +also use patterns in function arguments and with let, for example:

+
+
+
+
struct S = {
+    field1 : int,
+    field2 : string,
+}
+
+function example1(s : S) -> unit = {
+    // Destructure 's' using let
+    let struct { field1, _ } = s;
+    print_int("field1 = ", field1)
+}
+
+// Destructure directly in the function argument
+function example2(struct { field1, _ } : S) -> unit = {
+    print_int("field1 = ", field1)
+}
+
+
+
+
+

Type patterns

+
+

In the previous section we saw as patterns, which allowed us bind +additional variables for subpatterns. However, as patterns can also be +used to bind type variables. For example:

+
+
+
+
// Some function that returns either 32 or 64 at runtime
+val get_current_xlen : unit -> {32, 64}
+
+register R : bits(32)
+
+val example : int -> unit
+
+function example() = {
+  let xlen as int('n) = get_current_xlen()
+
+  // Create a bitvector of length xlen
+  let bv : bits('n) = zero_extend(xlen, *R);
+
+  print_bits("bv = ", bv)
+}
+
+
+
+

You can think of the as int('n) as matching on the return type of +the get_current_xlen rather than the value, and binding it’s length +to a new type variable 'n, which we can subsequently use in types +later in our function. Note that even though we only know if xlen will +be 32 or 64 at runtime after the call to get_current_xlen, Sail is +still able to statically check all our bitvector accesses.

+
+
+

If a type only contains a single type variable (as int('n) does), +then we allow omitting the type name and just using a variable as the +type pattern, for example the following would be equivalent to the +first line of example above:

+
+
+
+
let xlen as 'n = get_current_xlen();
+
+
+
+

If we want to give the variable xlen and the type variable 'n the +same name, we could go further and simplify to:

+
+
+
+
function example() = {
+  let 'xlen = get_current_xlen()
+
+  // Create a bitvector of length xlen
+  let bv : bits('xlen) = zero_extend(xlen, *R);
+
+  print_bits("bv = ", bv)
+}
+
+
+
+

Here we can use xlen within the function as both a regular variable +xlen and as a type variable 'xlen.

+
+
+
+

Mutable variables

+
+

Bindings made using let are always immutable, but Sail also allows +mutable variables. Mutable variables are created by using the var +keyword within a block.

+
+
+
+
{
+  var x : int = 3; // Create a new mutable variable x initialised to 3
+  x = 2            // Rebind it to the value 2
+}
+
+
+
+

Like let-bound variables, mutable variables are lexically scoped, so +they only exist within the block that declared them.

+
+
+

Technically, unless the -strict_var option is used Sail also allows +mutable variables to be implicitly declared, like so:

+
+
+
+
{
+  x : int = 3 // Create a new mutable variable x initialised to 3
+  x = 2       // Rebind it to the value 2
+}
+
+
+
+

This functions identically, just without the keyword. We consider this +deprecated and strongly encourage the use of the var keyword going +forwards.

+
+
+

The assignment operator is the equality symbol, as in C and other +programming languages. Sail supports a rich language of l-value +forms, which can appear on the left of an assignment. These will be +described in the next section.

+
+
+

One important thing to note is that Sail always infers the most +specific type it can for variables, and in the presence of integer +types with constraints, these types can be very specific. This is +not a problem for immutable bindings, but can cause issues for mutable +variables when explicit types are omitted. The following will not +typecheck:

+
+
+
+
{
+  var x = 3;
+  x = 2;
+}
+
+
+
+

The reason is that Sail (correctly) infers that x has the type 'the +integer equal to 3', and therefore refuses to allow us to assign 2 +to it (as it well should), because two is not equal to three. To avoid +this we must give an annotation with a less specific type like int.

+
+
+
+

Assignment and l-values

+
+

It is common in ISA specifications to assign to complex l-values, +e.g. to a subvector or named field of a bitvector register, or to +an l-value computed with some auxiliary function, e.g. to select +the appropriate register for the current execution model.

+
+
+

Sail has l-values that allow writing to individual elements of a +(bit)vector:

+
+
+
+
var v : bits(8) = 0xFF;
+v[0] = bitzero;
+assert(v == 0xFE)
+
+
+
+

As well as sub-ranges of a (bit)vector:

+
+
+
+
var v : bits(8) = 0xFF;
+v[3 .. 0] = 0x0; // assume default Order dec
+assert(v == 0xF0)
+
+
+
+

We also have vector concatenation l-values, which work much like +vector concatenation patterns

+
+
+
+
var v1 : bits(4) = 0xF;
+var v2 : bits(4) = 0xF;
+v1 @ v2 = 0xAB;
+assert(v1 == 0xA & v2 == 0xB)
+
+
+
+

For structs we can write to an individual struct field as

+
+
+
+
// Assume S is a struct type with a single bits(8) field (called field)
+var s : S = struct { field = 0xFF };
+s.field = 0x00;
+assert(s.field == 0x00)
+
+
+
+

We can also do multiple assignment using tuples, for example:

+
+
+
+
var (x, y) = (2, 3);
+assert(x == 2 & x == 3)
+// swap values
+(y, x) = (x, y)
+assert(x == 3 & x == 2)
+
+
+
+

Finally, we allow functions to appear in l-values. This is a very +simple way to declare setter functions that look like custom +l-values, for example:

+
+
+
+
memory(addr) = 0x0F
+// is just syntactic sugar for
+memory(addr, 0x0F)
+
+
+
+

This feature is commonly used when setting registers or memory that +have some additional semantics when they are read or written. We +commonly use the ad-hoc overloading feature to declare what appear to +be getter/setter pairs, so for the above example we could implement a +read_memory function and a write_memory function and overload them +both as memory to allow us to write memory using +memory(addr) = data and read memory as +data = memory(addr), for example:

+
+
+
+
val read_memory : bits(64) -> bits(8)
+val write_memory : (bits(64), bits(8)) -> unit
+
+overload memory = {read_memory, write_memory}
+
+
+
+
+

Bitfields

+
+

The following example creates a bitfield type called cr_type and a +register CR of that type. The underlying bitvector type (in this +case bits(8)) must be specified as part of the bitfield declaration.

+
+
+ + + + + +
+
Note
+
+The underlying bitvector type can be specified using a type +synonym, like xlenbits in sail-riscv. +
+
+
+
+

+bitfield cr_type : bits(8) = {
+  CR0 : 7 .. 4,
+  LT  : 7,
+  GT  : 6,
+  CR1 : 3 .. 2,
+  CR3 : 1 .. 0
+}
+
+register CR : cr_type
+
+
+
+
+

If the bitvector is decreasing then indexes for the fields must also +be in decreasing order, and vice-versa for an increasing vector. The +field definitions can be overlapping and do not need to be contiguous.

+
+
+

A bitfield generates a type that is simply a struct wrapper around the +underlying bitvector, with a single field called bits containing +that bitvector. For the above example, this will be:

+
+
+
+
struct cr_type = {
+  bits : bits(8)
+}
+
+
+
+

This representation is guaranteed, so it is expected that Sail code +will use the bits field to access the underlying bits of the +bitfield as needed. The following function shows how the bits +contained in a bitfield can be accessed:

+
+
+
+
function bitfield_access_example() = {
+  // Rather than using numeric indices, the bitfield names are used
+  let cr0_field : bits(4) = CR[CR0];
+
+  // Bitfield accessors always return bitvector results
+  let lt_field : bits(1) = CR[LT]; // Note 'bits(1)' not 'bit'
+
+  // Can access the underlying bitvector using the bits field
+  let some_bits : bits(7) = CR.bits[6 .. 0];
+}
+
+
+
+

Similarly, bitfields can be updated much like regular vectors just +using the field names rather than numeric indices:

+
+
+
+
function bitfield_update_example() = {
+  // We can set fields on the CR register using their field names
+  CR[CR3] = 0b01;
+
+  // If we want to set the underlying bits
+  CR.bits[1 .. 0] = 0b01;
+
+  // We can even use vector update notation!
+  CR = [CR with CR3 = 0b01, LT = 0b1];
+
+  // Because of the representation, we could set the whole thing like:
+  CR = (struct { bits = 0x00 } : cr_type);
+}
+
+
+
+

Older versions of Sail did not guarantee the underlying representation +of the bitfield (because it tried to do clever things to optimise +them). This meant that bitfields had to be accessed using getter and +setter functions, like so:

+
+
+
+
function legacy_bitfields() = {
+  // Assigning to a field of a bitfield register
+  CR->CR3() = 0b01;
+  // '->' notation just means the setter takes the register by reference:
+  (ref CR).CR3() = 0b01;
+
+  // Assigning all the bits (now just 'CR.bits = 0x00')
+  CR->bits() = 0x00;
+
+  // Accessing a field
+  let cr0_field : bits(4) = CR.CR0();
+
+  // Updating a field
+  CR = update_CR3(CR, 0b01); // now '[ CR with CR3 = 0b01 ]'
+
+  // Construct a new CR bitfield
+  CR = Mk_cr_type(0x00);
+}
+
+
+
+

The method like accessor syntax was (overly sweet) syntactic sugar for +getter and setter functions following a specific naming convention +that was generated by the bitfield. These functions are still +generated for backwards compatibility, but we would advise against +using them.

+
+
+ + + + + +
+
Note
+
+Except perhaps for the Mk_cr_type function or equivalent for +other bitfields, which is still quite useful for creating bitfields. +
+
+
+
+

Operators

+
+

Valid operators in Sail are sequences of the following non +alpha-numeric characters: !%&*+-./:<>=@^|#. Additionally, any such +sequence may be suffixed by an underscore followed by any valid +identifier, so <=_u or even <=_unsigned are valid operator names. +Operators may be left, right, or non-associative, and there are 10 +different precedence levels, ranging from 0 to 9, with 9 binding the +tightest. To declare the precedence of an operator, we use a fixity +declaration like:

+
+
+
+
infix 4 <=_u
+
+
+
+

For left or right associative operators, we’d use the keywords +infixl or infixr respectively. An operator can be used anywhere a +normal identifier could be used via the operator keyword. As such, +the <=_u operator can be defined as:

+
+
+
+
val operator <=_u : forall 'n. (bits('n), bits('n)) -> bool
+function operator <=_u(x, y) = unsigned(x) <= unsigned(y)
+
+
+
+

Built-in precedences

+
+

The precedence of several common operators are built into Sail. These +include all the operators that are used in type-level numeric +expressions, as well as several common operations such as equality, +division, and modulus. The precedences for these operators are +summarised in the following table.

+
+ + ++++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 1. Default Sail operator precedences
PrecedenceLeft associativeNon-associativeRight associative

9

8

^

7

*, \, %

6

+, -

4

<, ⇐, >, >=, !=, =, ==

3

&

2

|

1

0

+
+
+
+

Ad-hoc overloading

+
+

Sail has a flexible overloading mechanism using the overload keyword. For example:

+
+
+
+
overload foo = {bar, baz}
+
+
+
+

It takes an identifier name, and a list of other identifier names to +overload that name with. When the overloaded name is seen in a Sail +definition, the type-checker will try each of the overloads (that are +in scope) in order from left to right until it finds one that causes +the resulting expression to type-check correctly.

+
+
+

Multiple overload declarations are permitted for the same +identifier, with each overload declaration after the first adding its +list of identifier names to the right of the overload list (so earlier +overload declarations take precedence over later ones). As such, we +could split every identifier from above example into its own +line like so:

+
+
+
+
overload foo = {bar}
+overload foo = {baz}
+
+
+
+

Note that if an overload is defined in module B using identifiers +provided by another module A, then a module C that requires only +B will not see any of the identifiers from A, unless it also +requires A. See the section on modules for details. Note that this +means an overload cannot be used to 're-export' definitions provided by +another module.

+
+
+

As an example for how overloaded functions can be used, consider the +following example, where we define a function print_int and a +function print_string for printing integers and strings +respectively. We overload print as either print_int or +print_string, so we can print either number such as 4, or strings +like "Hello, World!" in the following main function definition.

+
+
+
+
val print_int : int -> unit
+
+val print_string : string -> unit
+
+overload print = {print_int, print_string}
+
+function main() : unit -> unit = {
+  print("Hello, World!");
+  print(4)
+}
+
+
+
+

We can see that the overloading has had the desired effect by dumping +the type-checked AST to stdout using the following command +sail -ddump_tc_ast examples/overload.sail. This will print the +following, which shows how the overloading has been resolved

+
+
+
+
function main () : unit = {
+  print_string("Hello, World!");
+  print_int(4)
+}
+
+
+
+

This option can sometimes be quite useful for observing how +overloading has been resolved. Since the overloadings are done in the +order they are defined, it can be important to ensure that this order +is correct. A common idiom in the standard library is to have versions +of functions that guarantee more constraints about their output be +overloaded with functions that accept more inputs but guarantee less +about their results. For example, we might have two division +functions:

+
+
+
+
val div1 : forall 'm 'n, 'n >= 0 & 'm > 0. (int('n), int('m)) -> {'o, 'o >= 0. int('o)}
+
+val div2 : (int, int) -> option(int)
+
+
+
+

The first guarantees that if the first argument is greater than or +equal to zero, and the second argument is greater than zero, then the +result will be greater than or equal to zero. If we overload these +definitions as

+
+
+
+
overload operator / = {div1, div2}
+
+
+
+

Then the first will be applied when the constraints on its inputs can +be resolved, and therefore the guarantees on its output can be +guaranteed, but the second will be used when this is not the case, and +indeed, we will need to manually check for the division by zero case +due to the option type. Notice that the return type can be different +between different cases in the overload.

+
+
+
+

Mappings

+
+

Mappings are a feature of Sail that allow concise expression of +bidirectional relationships between values that are common in ISA +specifications: for example, bit-representations of an enum type, or +the encoding-decoding of instructions.

+
+
+

They are defined similarly to functions, with a val keyword to +specify the type and a definition, using a bidirectional arrow <-> +rather than a function arrow ->, and a separate mapping +definition.

+
+
+
+
val size_bits : word_width <-> bits(2)
+
+mapping size_bits = {
+  BYTE   <-> 0b00,
+  HALF   <-> 0b01,
+  WORD   <-> 0b10,
+  DOUBLE <-> 0b11
+}
+
+
+
+

As a shorthand, you can also specify a mapping and its type +simultaneously:

+
+
+
+
mapping size_bits2 : word_width <-> bits(2) = {
+  BYTE   <-> 0b00,
+  HALF   <-> 0b01,
+  WORD   <-> 0b10,
+  DOUBLE <-> 0b11
+}
+
+
+
+

Mappings are used simply by calling them as if they were functions: +type inference will determine in which direction the mapping is +applied. (This gives rise to the restriction that the types on either +side of a mapping must be different.)

+
+
+
+
let width : word_width = size_bits(0b00);
+let width : bits(2) = size_bits(BYTE);
+
+
+
+

Sometimes, because the subset of Sail allowed in bidirectional mapping +clauses is quite restrictive, it can be useful to specify the forwards +and backwards part of a mapping separately:

+
+
+
+
mapping size_bits3 : word_width <-> bits(2) = {
+  BYTE <-> 0b00,
+  HALF <-> 0b01,
+  WORD <-> 0b10,
+  forwards  DOUBLE => 0b11, // forwards is left-to-right
+  backwards 0b11 => DOUBLE, // backwards is right-to-left
+}
+
+
+
+
+

Sizeof and constraint

+
+

Sail allows for arbitrary type variables to be included within +expressions. However, we can go slightly further than this, and +include both arbitrary (type-level) numeric expressions in code, as +well as type constraints. For example, if we have a function that +takes two bitvectors as arguments, then there are several ways we +could compute the sum of their lengths.

+
+
+
+
val f : forall 'n 'm. (bits('n), bits('m)) -> unit
+
+function f(xs, ys) = {
+  let len = length(xs) + length(ys);
+  let len = 'n + 'm;
+  let len = sizeof('n + 'm);
+  ()
+}
+
+
+
+

Note that the second line is equivalent to

+
+
+
+
  let len = sizeof('n) + sizeof('n)
+
+
+
+

There is also the constraint keyword, which takes a type-level +constraint and allows it to be used as a boolean expression, so we +could write:

+
+
+
+
function f(xs, ys) = {
+  if constraint('n <= 'm) {
+    // Do something
+  }
+}
+
+
+
+

rather than the equivalent test length(xs) <= length(ys). +This way of writing expressions can be succinct, and can also make it +very explicit what constraints will be generated during flow typing. +However, all the constraint and sizeof definitions must be re-written +to produce executable code, which can result in the generated theorem +prover output diverging (in appearance) somewhat from the source +input. In general, it is probably best to use sizeof and +constraint sparingly on type variables.

+
+
+

One common use for sizeof however, is to lower type-level integers +down to the value level, for example:

+
+
+
+
// xlen is a type of kind 'Int'
+type xlen : Int = 64
+
+val f : int -> unit
+
+function xlen_example() = {
+  let v = sizeof(xlen);
+  f(v)
+}
+
+
+
+
+

Exceptions

+
+

Perhaps surprisingly for a specification language, Sail has exception +support. This is because exceptions as a language feature do sometimes +appear in vendor ISA pseudocode (they are a feature in Arm’s ASL +language), and such code would be very difficult to translate into +Sail if Sail did not itself support exceptions. We already translate +Sail to monadic theorem prover code, so working with a monad that +supports exceptions is fairly natural. In practice Sail language-level +exceptions end up being quite a nice tool for implementing processor +exceptions in ISA specifications.

+
+
+

For exceptions we have two language features: throw statements +and try--catch blocks. The throw keyword takes a value of +type exception as an argument, which can be any user defined type +with that name. There is no built-in exception type, so to use +exceptions one must be set up on a per-project basis. Usually the +exception type will be a union, often a scattered union, which allows +for the exceptions to be declared throughout the specification as they +would be in OCaml, for example:

+
+
+
+
scattered union exception
+
+union clause exception = Epair : (range(0, 255), range(0, 255))
+
+union clause exception = Eunknown : string
+
+function main() : unit -> unit = {
+    try {
+        throw(Eunknown("foo"))
+    } catch {
+        Eunknown(msg) => print_endline(msg),
+        _ => exit()
+    }
+}
+
+union clause exception = Eint : int
+
+end exception
+
+
+
+
+

Scattered definitions

+
+

In a Sail specification, sometimes it is desirable to collect together +the definitions relating to each machine instruction (or group +thereof), e.g.~grouping the clauses of an AST type with the associated +clauses of decode and execute functions, as in +the A tutorial RISC-V style example section. +Sail permits this with syntactic sugar for `scattered' definitions. +Functions, mappings, unions, and enums can be scattered.

+
+
+

One begins a scattered definition by declaring the name and kind +(either function or union) of the scattered definition, e.g.

+
+
+
+
// We can declare scattered types, for either unions or enums
+scattered union U
+
+scattered enum E
+
+// For scattered functions and mappings, we have to provide a type signature with val
+val foo : U -> bits(64)
+
+scattered function foo
+
+val bar : E <-> string
+
+scattered mapping bar
+
+
+
+

This is then followed by clauses for either, which can be freely +interleaved with other definitions. It is common to define both a +scattered type (either union or enum), with a scattered function that +operates on newly defined clauses of that type, as is shown below:

+
+
+
+
enum clause E = E_one
+
+union clause U = U_one : bits(64)
+
+function clause foo(U_one(x)) = x
+
+mapping clause bar = E_one <-> "first member of E"
+
+
+
+

Finally the scattered definitions are ended with the end keyword, like so:

+
+
+
+
end E
+
+end U
+
+end foo
+
+end bar
+
+
+
+

Technically the end keyword is not required, but it is good practice +to include it as it informs Sail that the clauses are now complete, +which forbids new clauses and means subsequent pattern completeness +checks no longer have to require extra wildcards to account for new +clauses being added.

+
+
+

Semantically, scattered definitions for types appear at the start of +their definition (note however, that this does not mean that a module +that requires just the start scattered union definition can access +any constructors of a union defined in modules it does not require). +Scattered definitions for functions and mappings appear at the +end. A scattered function definition can be recursive, but mutually +recursive scattered function definitions should be avoided.

+
+
+
+

Preludes and default environment

+
+

By default Sail has almost no built-in types or functions, except for +the primitive types described in this Chapter. This is because +different vendor-pseudocodes have varying naming conventions and +styles for even the most basic operators, so we aim to provide +flexibility and avoid committing to any particular naming convention or +set of built-ins. However, each Sail backend typically implements +specific external names, so for a PowerPC ISA description one might +have:

+
+
+
+
val EXTZ = "zero_extend" : ...
+
+
+
+

while for ARM, to mimic ASL, one would have

+
+
+
+
val ZeroExtend = "zero_extend" : ...
+
+
+
+

where each backend knows about the "zero_extend" external name, but +the actual Sail functions are named appropriately for each vendor’s +pseudocode. As such each ISA spec written in Sail tends to have its +own prelude.

+
+
+
+
+
+

Modular Sail Specifications

+
+
+

Modules

+
+

Sail provides support for organizing large specifications into +modules. Modules provide an access control mechanism, meaning a Sail +definition in one module cannot access or use definitions provided by +another module unless it explicitly requires the other module.

+
+
+

The module structure of a Sail project/specification is specified in a +separate .sail_project file.

+
+
+

For a simple example, let’s assume we have two Sail files amod.sail and +bmod.sail:

+
+
+
amod.sail
+
+
val alfa : unit -> int
+
+function alfa() = 3
+
+
+
+
bmod.sail
+
+
val bravo : unit -> unit
+
+function bravo() = {
+    let x = alfa();
+    print_int("alfa returned: ", x)
+}
+
+
+
+

We can use the following .sail_project file:

+
+
+
simple_mod.sail_project
+
+
A {
+  files amod.sail
+}
+
+B {
+  requires A
+  files bmod.sail
+}
+
+
+
+

This file defines two modules A and B, with module A containing +the file amod.sail and module B containing the file bmod.sail. +Module B requires module A, so it can use the alfa function +defined in A. What would happen if we removed the requires line? We +would get the following error:

+
+
+
+
Type error:
+../examples/bmod.sail:6.12-16:
+6 |    let x = alfa();
+  |            ^--^
+  | Not in scope
+  | 
+  | Try requiring module A to bring the following into scope for module B:
+  | ../examples/amod.sail:6.4-8:
+  | 6 |val alfa : unit -> int
+  |   |    ^--^ definition here in A
+  | ../examples/simple_mod_err.sail_project:5.0-1:
+  | 5 |B {
+  |   |^ add 'requires A' within B here
+
+
+
+

This error tells us that alfa is not in scope, but Sail knows it +exists as it has already checked module A, so it points us at the +definition and suggests how we could resolve the error by adding the +requires line we just removed.

+
+
+

When using a .sail_project file we do not have to pass all the files +on the command line, so we can invoke Sail simply as

+
+
+
+
sail simple_mod.sail_project
+
+
+
+

and it will know where to find amod.sail and bmod.sail relative to +the location of the project file.

+
+
+

A module can have more than one Sail file. These files are processed +sequentially in the order they are listed. This is exactly like what +happens when we pass multiple Sail files on the command line without a +.sail_project file to define the module structure. A module can +therefore be thought of as a sequence of Sail files that is treated as +a single logical unit. As an example, we could add a third module to +our above file, which is comprised of three Sail files and depends on +A and B.

+
+
+
+
C {
+  requires A, B
+  files
+    foo.sail,
+    bar.sail,
+    baz.sail
+}
+
+
+
+

Note that comments and trailing commas are allowed, and we could optionally delimit +the lists using [ and ], like so:

+
+
+
+
C {
+  // Require both our previous modules
+  requires [A, B]
+  /* Both types of Sail comments are allowed! */
+  files [
+    foo.sail,
+    bar.sail,
+    baz.sail,
+  ]
+}
+
+
+
+

If we wanted to we could define C in a separate file, rather than +adding it to our previous file, and pass multiple project files to +Sail like so:

+
+
+
+
sail simple_mod.sail_project new_file_with_C.sail_project
+
+
+
+

These will be treated together as a single large project file. A use +case might be if you were defining an out-of-tree extension Xfoo for +sail-riscv, you could do something like:

+
+
+
+
sail sail-riscv/riscv.sail_project src/Xfoo.sail_project
+
+
+
+

and the modules you define in Xfoo.sail_project can require modules +from riscv.sail_project, and also vice-versa, although it makes less +sense in this example.

+
+
+
+

Working with Makefiles

+
+

The -list_files option can be used to list all the files within a +project file, which allows them to be used in a Makefile prerequisite. +As an example, to build the module examples in this very manual, we +use the rule below to generate documentation indexes (which our +Asciidoctor plugin consumes) for every .sail file within a project +file.

+
+
+
+
.SECONDEXPANSION:
+
+module_sail_doc/%.json: ../examples/%.sail_project $$(shell sail ../examples/%.sail_project -list_files)
+	mkdir -p module_sail_doc
+	sail -doc $(addprefix -doc_file ,$(shell sail $< -list_files)) -doc_embed plain -doc_bundle $(notdir $@) -o module_sail_doc $<
+
+
+
+
+

Conditional compilation within modules

+
+

We can use variables in our project files to control either file +inclusion within a module or to control whether a module requires +another or not. A variable can even contain a sequence of modules, +that can then be used in a require statement, as shown in the +following example:

+
+
+
+
variable ARCH = A64
+variable ARCH_MODULES = if ARCH == A64 then arch64_only else []
+
+arch_prelude {
+  files
+    prelude.sail,
+    if $ARCH == A32 then arch_xlen32.sail
+    else if $ARCH == A64 then [
+      arch_xlen64.sail,
+      arch_xlen64_helpers.sail
+    ] else error("Invalid value for ARCH")
+}
+
+arch64_only {
+  requires arch_prelude
+  files
+    instructions64.sail
+}
+
+arch_instructions {
+  requires arch_prelude
+  files
+    instructions.sail
+}
+
+arch_end {
+  requires arch_instructions, $ARCH_MODULES
+  files
+    end.sail
+}
+
+
+
+
+

Optional and default modules

+
+

Modules can be marked as either optional or default. Default +modules are those that form the base part of a specification, whereas +optional modules are intended to implement extensions which may or may +not be present. Default modules cannot require optional modules.

+
+
+
+
+
+

The Sail Grammar

+
+
+

This appendix contains a grammar for the Sail language that is +automatically generated from the +Menhir parser definition.

+
+
+ + + + + +
+
Note
+
+This means that the grammar is stated in such a way that the +parser generator can see it is free of LR shift/reduce and +reduce/reduce conflicts, rather than being optimised for human +readability. +
+
+
+

First, we need some lexical conventions:

+
+
+
    +
  • +

    ID is any valid Sail identifier

    +
  • +
  • +

    OPERATOR is any valid Sail operator, as defined in Operators.

    +
  • +
  • +

    TYPE_VARIABLE is a valid Sail identifier with a leading single quote '.

    +
  • +
  • +

    NUMBER is a non-empty sequence of decimal digits [0-9]+.

    +
  • +
  • +

    HEXADECIMAL_LITERAL is 0x[A-Ba-f0-9_]+

    +
  • +
  • +

    BINARY_LITERAL is 0b[0-1_]+

    +
  • +
  • +

    STRING_LITERAL is a Sail string literal.

    +
  • +
+
+
+

$[ATTRIBUTE] and $LINE_DIRECTIVE represent attributes and single +line directives. Examples of line directives would be things like +$include, $ifdef and so on. These start with a leading $, are +followed by the directive name (which follows the same lexical rules +as a Sail identifier), is followed by one or more spaces, then +proceeds to the end of the line, with everything between the +identifier and the line ending being the line directive’s argument +(with leading and trailing whitespace removed). An attribute starts +with $[ and ends with ], and in between consists of an attribute +name, followed by at least one whitespace character, then any +arbitrary sequence of characters that does not contain ].

+
+
+
+
<id> ::= ID
+       | operator OPERATOR
+       | operator -
+       | operator |
+       | operator ^
+       | operator *
+
+<op_no_caret> ::= OPERATOR
+                | -
+                | |
+                | *
+                | in
+
+<op> ::= OPERATOR
+       | -
+       | |
+       | ^
+       | *
+       | in
+
+<exp_op> ::= OPERATOR
+           | -
+           | |
+           | @
+           | ::
+           | ^
+           | *
+
+<pat_op> ::= @
+           | ::
+           | ^
+
+<typ_var> ::= TYPE_VARIABLE
+
+<tyarg> ::= ( <typ_list> )
+
+
+<prefix_typ_op> ::= epsilon
+                  | 2^
+                  | -
+                  | *
+
+<postfix_typ> ::= <atomic_typ>
+
+<typ_no_caret> ::= <prefix_typ_op> <postfix_typ> (<op_no_caret> <prefix_typ_op> <postfix_typ>)*
+
+<typ> ::= <prefix_typ_op> <postfix_typ> (<op> <prefix_typ_op> <postfix_typ>)*
+
+<atomic_typ> ::= <id>
+               | _
+               | <typ_var>
+               | <lit>
+               | dec
+               | inc
+               | <id> <tyarg>
+               | register ( <typ> )
+               | ( <typ> )
+               | ( <typ> , <typ_list> )
+               | { NUMBER (, NUMBER)* }
+               | { <kopt> . <typ> }
+               | { <kopt> , <typ> . <typ> }
+
+<typ_list> ::= <typ> [,]
+             | <typ> , <typ_list>
+
+<kind> ::= Int
+         | Type
+         | Order
+         | Bool
+
+<kopt> ::= ( constant <typ_var> : <kind> )
+         | ( <typ_var> : <kind> )
+         | <typ_var>
+
+<quantifier> ::= <kopt> , <typ>
+             | <kopt>
+
+
+<effect> ::= <id>
+
+<effect_set> ::= { <effect> (, <effect>)* }
+               | pure
+
+<typschm> ::= <typ> -> <typ>
+            | forall <quantifier> . <typ> -> <typ>
+            | <typ> <-> <typ>
+            | forall <quantifier> . <typ> <-> <typ>
+
+
+<pat1> ::= <atomic_pat> (<pat_op> <atomic_pat>)*
+
+<pat> ::= <pat1>
+        | $[ATTRIBUTE] <pat>
+        | <pat1> as <typ>
+
+<pat_list> ::= <pat> [,]
+             | <pat> , <pat_list>
+
+<atomic_pat> ::= _
+               | <lit>
+               | <id>
+               | <typ_var>
+               | <id> ()
+               | <id> [ NUMBER ]
+               | <id> [ NUMBER .. NUMBER ]
+               | <id> ( <pat_list> )
+               | <atomic_pat> : <typ_no_caret>
+               | ( <pat> )
+               | ( <pat> , <pat_list> )
+               | [ <pat_list> ]
+               | [| |]
+               | [| <pat_list> |]
+               | struct { <fpat> (, <fpat>)* }
+
+<fpat> ::= <id> = <pat>
+         | <id>
+         | _
+
+<lit> ::= true
+        | false
+        | ()
+        | NUMBER
+        | undefined
+        | bitzero
+        | bitone
+        | BINARY_LITERAL
+        | HEXADECIMAL_LITERAL
+        | STRING_LITERAL
+
+
+<exp> ::= <exp0>
+        | $[ATTRIBUTE] <exp>
+        | <exp0> = <exp>
+        | let <letbind> in <exp>
+        | var <atomic_exp> = <exp> in <exp>
+        | { <block> }
+        | return <exp>
+        | throw <exp>
+        | if <exp> then <exp> else <exp>
+        | if <exp> then <exp>
+        | match <exp> { <case_list> }
+        | try <exp> catch { <case_list> }
+        | foreach ( <id> ID <atomic_exp> ID <atomic_exp> by <atomic_exp> in <typ> ) <exp>
+        | foreach ( <id> ID <atomic_exp> ID <atomic_exp> by <atomic_exp> ) <exp>
+        | foreach ( <id> ID <atomic_exp> ID <atomic_exp> ) <exp>
+        | repeat [termination_measure { <exp> }] <exp> until <exp>
+        | while [termination_measure { <exp> }] <exp> do <exp>
+
+<prefix_op> ::= epsilon
+              | 2^
+              | -
+              | *
+
+<exp0> ::= <prefix_op> <atomic_exp> (<exp_op> <prefix_op> <atomic_exp>)*
+
+<case> ::= <pat> => <exp>
+         | <pat> if <exp> => <exp>
+
+<case_list> ::= <case>
+              | <case> ,
+              | <case> , <case_list>
+
+<block> ::= <exp> [;]
+          | let <letbind> [;]
+          | let <letbind> ; <block>
+          | var <atomic_exp> = <exp> [;]
+          | var <atomic_exp> = <exp> ; <block>
+          | <exp> ; <block>
+
+<letbind> ::= <pat> = <exp>
+
+<atomic_exp> ::= <atomic_exp> : <atomic_typ>
+               | <lit>
+               | <id> -> <id> ()
+               | <id> -> <id> ( <exp_list> )
+               | <atomic_exp> . <id> ()
+               | <atomic_exp> . <id> ( <exp_list> )
+               | <atomic_exp> . <id>
+               | <id>
+               | <typ_var>
+               | ref <id>
+               | <id> ()
+               | <id> ( <exp_list> )
+               | sizeof ( <typ> )
+               | constraint ( <typ> )
+               | <atomic_exp> [ <exp> ]
+               | <atomic_exp> [ <exp> .. <exp> ]
+               | <atomic_exp> [ <exp> , <exp> ]
+               | struct { <fexp_exp_list> }
+               | { <exp> with <fexp_exp_list> }
+               | [ ]
+               | [ <exp_list> ]
+               | [ <exp> with <vector_update> (, <vector_update>)* ]
+               | [| |]
+               | [| <exp_list> |]
+               | ( <exp> )
+               | ( <exp> , <exp_list> )
+
+<fexp_exp> ::= <atomic_exp> = <exp>
+             | <id>
+
+<fexp_exp_list> ::= <fexp_exp>
+                  | <fexp_exp> ,
+                  | <fexp_exp> , <fexp_exp_list>
+
+<exp_list> ::= <exp> [,]
+             | <exp> , <exp_list>
+
+<vector_update> ::= <atomic_exp> = <exp>
+                  | <atomic_exp> .. <atomic_exp> = <exp>
+                  | <id>
+
+<funcl_annotation> ::= $[ATTRIBUTE]
+
+<funcl_patexp> ::= <pat> = <exp>
+                 | ( <pat> if <exp> ) = <exp>
+
+<funcl_patexp_typ> ::= <pat> = <exp>
+                     | <pat> -> <typ> = <exp>
+                     | forall <quantifier> . <pat> -> <typ> = <exp>
+                     | ( <pat> if <exp> ) = <exp>
+                     | ( <pat> if <exp> ) -> <typ> = <exp>
+                     | forall <quantifier> . ( <pat> if <exp> ) -> <typ> = <exp>
+
+<funcl> ::= <funcl_annotation> <id> <funcl_patexp>
+          | <id> <funcl_patexp>
+
+<funcls> ::= <funcl_annotation> <id> <funcl_patexp_typ>
+           | <id> <funcl_patexp_typ>
+           | <funcl_annotation> <id> <funcl_patexp> and <funcl> (and <funcl>)*
+           | <id> <funcl_patexp> and <funcl> (and <funcl>)*
+
+<funcl_typ> ::= forall <quantifier> . <typ>
+              | <typ>
+
+<paren_index_range> ::= ( <paren_index_range> @ <paren_index_range> (@ <paren_index_range>)* )
+                      | <atomic_index_range>
+
+<atomic_index_range> ::= <typ>
+                       | <typ> .. <typ>
+                       | ( <typ> .. <typ> )
+
+<r_id_def> ::= <id> : <paren_index_range> (@ <paren_index_range>)*
+
+<r_def_body> ::= <r_id_def>
+               | <r_id_def> ,
+               | <r_id_def> , <r_def_body>
+
+<param_kopt> ::= <typ_var> : <kind>
+               | <typ_var>
+
+<typaram> ::= ( <param_kopt> (, <param_kopt>)* ) , <typ>
+            | ( <param_kopt> (, <param_kopt>)* )
+
+<type_def> ::= type <id> <typaram> = <typ>
+             | type <id> = <typ>
+             | type <id> <typaram> -> <kind> = <typ>
+             | type <id> : <kind> = <typ>
+             | struct <id> = { <struct_fields> }
+             | struct <id> <typaram> = { <struct_fields> }
+             | enum <id> = <id> (| <id>)*
+             | enum <id> = { <enum_comma> }
+             | enum <id> with <enum_functions> = { <enum_comma> }
+             | union <id> = { <type_unions> }
+             | union <id> <typaram> = { <type_unions> }
+             | bitfield <id> : <typ> = { <r_def_body> }
+
+<enum_functions> ::= <id> -> <typ> , <enum_functions>
+                   | <id> -> <typ> ,
+                   | <id> -> <typ>
+
+<enum_comma> ::= <id> [,]
+         | <id> => <exp> [,]
+         | <id> , <enum_comma>
+         | <id> => <exp> , <enum_comma>
+
+<struct_field> ::= <id> : <typ>
+
+<struct_fields> ::= <struct_field>
+                  | <struct_field> ,
+                  | <struct_field> , <struct_fields>
+
+<type_union> ::= $[ATTRIBUTE] <type_union>
+               | <id> : <typ>
+               | <id> : { <struct_fields> }
+
+<type_unions> ::= <type_union>
+                | <type_union> ,
+                | <type_union> , <type_unions>
+
+<rec_measure> ::= { <pat> => <exp> }
+
+<fun_def> ::= function [<rec_measure>] <funcls>
+
+<mpat> ::= <atomic_mpat> (<pat_op> <atomic_mpat>)*
+         | <atomic_mpat> as <id>
+
+<atomic_mpat> ::= <lit>
+                | <id>
+                | <id> [ NUMBER ]
+                | <id> [ NUMBER .. NUMBER ]
+                | <id> ()
+                | <id> ( <mpat> (, <mpat>)* )
+                | ( <mpat> )
+                | ( <mpat> , <mpat> (, <mpat>)* )
+                | [ <mpat> (, <mpat>)* ]
+                | [| |]
+                | [| <mpat> (, <mpat>)* |]
+                | <atomic_mpat> : <typ_no_caret>
+                | struct { <fmpat> (, <fmpat>)* }
+
+<fmpat> ::= <id> = <mpat>
+          | <id>
+
+<mpexp> ::= <mpat>
+          | <mpat> if <exp>
+
+<mapcl> ::= $[ATTRIBUTE] <mapcl>
+          | <mpexp> <-> <mpexp>
+          | <mpexp> => <exp>
+          | forwards <mpexp> => <exp>
+          | backwards <mpexp> => <exp>
+
+<mapcl_list> ::= <mapcl> [,]
+               | <mapcl> , <mapcl_list>
+
+<map_def> ::= mapping <id> = { <mapcl_list> }
+            | mapping <id> : <typschm> = { <mapcl_list> }
+
+<let_def> ::= let <letbind>
+
+
+<pure_opt> ::= monadic
+             | pure
+
+<extern_binding> ::= <id> : STRING_LITERAL
+                   | _ : STRING_LITERAL
+
+<externs> ::= epsilon
+            | = STRING_LITERAL
+            | = { <extern_binding> (, <extern_binding>)* }
+            | = <pure_opt> STRING_LITERAL
+            | = <pure_opt> { <extern_binding> (, <extern_binding>)* }
+
+<val_spec_def> ::= val STRING_LITERAL : <typschm>
+                 | val <id> <externs> : <typschm>
+
+<register_def> ::= register <id> : <typ>
+                 | register <id> : <typ> = <exp>
+
+<default_def> ::= default <kind> inc
+                | default <kind> dec
+
+<scattered_def> ::= scattered enum <id>
+                  | scattered union <id> <typaram>
+                  | scattered union <id>
+                  | scattered function <id>
+                  | scattered mapping <id>
+                  | scattered mapping <id> : <funcl_typ>
+                  | enum clause <id> = <id>
+                  | function clause <funcl>
+                  | union clause <id> = <type_union>
+                  | mapping clause <id> = <mapcl>
+                  | end <id>
+
+<loop_measure> ::= until <exp>
+                 | repeat <exp>
+                 | while <exp>
+
+<subst> ::= <typ_var> = <typ>
+          | <id> = <id>
+
+<instantiation_def> ::= instantiation <id>
+                      | instantiation <id> with <subst> (, <subst>)*
+
+<overload_def> ::= overload <id> = { <id> (, <id>)* }
+                 | overload <id> = <id> (| <id>)*
+
+<def_aux> ::= <fun_def>
+            | <map_def>
+            | FIXITY_DEF
+            | <val_spec_def>
+            | <instantiation_def>
+            | <type_def>
+            | <let_def>
+            | <register_def>
+            | <overload_def>
+            | <scattered_def>
+            | <default_def>
+            | $LINE_DIRECTIVE
+            | termination_measure <id> <pat> = <exp>
+            | termination_measure <id> <loop_measure> (, <loop_measure>)*
+
+<def> ::= $[ATTRIBUTE] <def>
+        | <def_aux>
+
+
+
+
+
+
+
+
+
+

References

+
+
+

[1] Intel, “Intel 64 and IA-32 Architectures Software Developer’s Manual.” software.intel.com/en-us/articles/intel-sdm , Dec. 2016.

+
+
+

[2] AMD, “AMD64 Architecture Programmer’s Manual Volume 1: Application Programming.” support.amd.com/TechDocs/24592.pdf , Oct. 2013.

+
+
+

[3] Power ISA Version 3.0B. OpenPOWER Foundation, 2017.

+
+
+

[4] L. Leighton, “The Libre-SOC Project.” EuroPython Society, 2021.

+
+
+

[5] ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile). ARM Ltd., 2015.

+
+
+

[6] A. Reid, “Trustworthy Specifications of ARM v8-A and v8-M System Level Architecture,” 2016.

+
+
+

[7] MIPS Technologies, Inc., “MIPS64 Architecture For Programmers. Volume II: The MIPS64 Instruction Set.” Jul. 2005.

+
+
+

[8] MIPS Technologies, Inc., “MIPS64 Architecture For Programmers. Volume III: The MIPS64 Privileged Resource Architecture.” Jul. 2005.

+
+
+

[9] D. P. Mulligan, S. Owens, K. E. Gray, T. Ridge, and P. Sewell, “Lem: reusable engineering of real-world semantics,” in Proceedings of ICFP 2014: the 19th ACM SIGPLAN International Conference on Functional Programming, 2014, pp. 175–188, doi: 10.1145/2628136.2628143.

+
+
+

[10] “Lem implementation.” bitbucket.org/Peter_Sewell/lem/ , 2017.

+
+
+

[11] R. N. M. Watson et al., “Bluespec Extensible RISC Implementation: BERI Hardware reference,” University of Cambridge, Computer Laboratory, UCAM-CL-TR-868, Apr. 2015. [Online]. Available: www.cl.cam.ac.uk/techreports/UCAM-CL-TR-868.pdf.

+
+
+

[12] J. Heinrich, MIPS R4000 Microprocessor User’s Manual (Second Edition). MIPS Technologies, Inc., 1994.

+
+
+

[13] MIPS Technologies, Inc., “MIPS32 Architecture For Programmers. Volume I: Introduction to the MIPS32 Architecture.” Mar. 2001.

+
+
+

[14] R. N. M. Watson et al., “Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 5),” University of Cambridge, Computer Laboratory, UCAM-CL-TR-891, Jun. 2016. [Online]. Available: www.cl.cam.ac.uk/techreports/UCAM-CL-TR-891.pdf.

+
+
+
+
+
+
+
+1. ARM v8-A Architecture Specification: github.com/meriac/archex +
+
+ + + \ No newline at end of file diff --git a/manual.pdf b/manual.pdf deleted file mode 100644 index f0032e3c1..000000000 Binary files a/manual.pdf and /dev/null differ diff --git a/src/sail_doc_backend/docinfo.ml b/src/sail_doc_backend/docinfo.ml index 2f04b397a..871947a3c 100644 --- a/src/sail_doc_backend/docinfo.ml +++ b/src/sail_doc_backend/docinfo.ml @@ -401,7 +401,9 @@ module Generator (Converter : Markdown.CONVERTER) (Config : CONFIG) = struct let doc_loc l g f x = match Reporting.simp_loc l with - | Some (p1, p2) when p1.pos_fname = p2.pos_fname && Filename.is_relative p1.pos_fname -> doc_lexing_pos p1 p2 + | Some (p1, p2) + when p1.pos_fname = p2.pos_fname && Filename.is_relative p1.pos_fname && Sys.file_exists p1.pos_fname -> + doc_lexing_pos p1 p2 | _ -> Raw (g x |> f |> Pretty_print_sail.to_string |> encode) let get_doc_comment def_annot =