Skip to content

Commit

Permalink
Merge pull request #149 from nikomatsakis/reusable
Browse files Browse the repository at this point in the history
Make formality core reusable
  • Loading branch information
nikomatsakis authored Oct 31, 2023
2 parents c0e0691 + 75fd52f commit 276106b
Show file tree
Hide file tree
Showing 94 changed files with 2,252 additions and 1,468 deletions.
14 changes: 9 additions & 5 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 9 additions & 3 deletions book/src/SUMMARY.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,12 @@
# Summary

- [Intro](./intro.md)
- [Formality](./formality.md)
- [Terms](./terms.md)
- [Inference rules](./inference_rules.md)
- [`formality_core`: the Formality system](./formality_core.md)
- [Defining your lang](./formality_core/lang.md)
- [Defining terms with the `term` macro](./formality_core/terms.md)
- [Implementing `Fold` and `Visit` by hand](./formality_core/impl_fold_visit.md)
- [Implementing `Parse` by hand](./formality_core/impl_parse.md)
- [Variables](./formality_core/variables.md)
- [Collections](./formality_core/collections.md)
- [Judgment functions and inference rules](./formality_core/judgment_fn.md)
- [FAQ and troubleshooting](./formality_core/faq.md)
1 change: 1 addition & 0 deletions book/src/faq.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
# FAQ and troubleshooting
10 changes: 10 additions & 0 deletions book/src/formality_core.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# `formality_core`: the Formality system

`a-mir-formality` is build on the formality core system,
defined by the `formality_core` crate.
Formality core is a mildly opnionated series of macros, derives, and types
that let you write high-level Rust code
in a way that resembles standard type theory notation.
Its primary purpose is to be used by a-mir-formality
but it can be used for other projects.

1 change: 1 addition & 0 deletions book/src/formality_core/collections.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
# Collections
7 changes: 7 additions & 0 deletions book/src/formality_core/faq.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# FAQ and troubleshooting

## Why am I getting errors about undefined references to `crate::FormalityLang`?

The various derive macros need to know what language you are working in.
To figure this out, they reference `crate::FormalityLang`, which you must define.
See the [chapter on defining your language](./lang.md) for more details.
22 changes: 22 additions & 0 deletions book/src/formality_core/impl_fold_visit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Implementing `Fold` and `Visit` by hand

The `#[term]` macro auto-generates impls of `Fold`, `Visit`, and `Parse`.
But sometimes you don't want to use the macro.
Sometimes you want to write your own code.
One common reason is for substitution of a variable.
For example, in the Rust model,
the code that replaces type variables with types from a substitution
is defined by a manual impl of `Fold`.
Because `Fold` and `Visit` are trait aliases, you need to implement the underlying
trait (`CoreFold`) by hand.
Here is the custom impl of fold for `Ty` from `formality_types`:

```rust
{{#include ../../../crates/formality-types/src/grammar/ty/term_impls.rs:core_fold_ty}}
```

That same module contains other examples, such as impls of `CoreVisit`.

## Derives

You can also manually derive `Visit` and `Fold` instead of using `#[term]`.
8 changes: 8 additions & 0 deletions book/src/formality_core/impl_parse.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Implementing Parse by hand

The generic `#[term]` macro generates a simple parser, but sometimes you want more flexibility.
Here is the code that implements the parsing of Rust types:

```rust
{{#include ../../../crates/formality-types/src/grammar/ty/parse_impls.rs:ty_parse_impl}}
```
185 changes: 185 additions & 0 deletions book/src/formality_core/judgment_fn.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,185 @@
# Judgment functions and inference rules

The next thing is the `judgement_fn!` macro. This lets you write a *judgment* using *inference rules*. A "judgment" just means some kind of predicate that the computer can judge to hold or not hold. Inference rules are those rules you may have seen in papers and things:

```
premise1
premise2
premise3
------------------
conclusion
```

i.e., the conclusion is judged to be true if all the premises are true.

[`prove_wc`]: https://github.com/rust-lang/a-mir-formality/blob/bca36ecd069d6bdff77bffbb628ae3b2ef4f8ef7/crates/formality-prove/src/prove/prove_wc.rs#L21-L125

Judgments in type system papers can look all kinds of ways. For example, a common type system judgment would be the following:

```
Γ ⊢ E : T
```

This can be read as, in the environment Γ, the expression E has type T. You might have rule like these:

```
Γ[X] = ty // lookup variable in the environment
--------------- "variable"
Γ ⊢ X : ty
Γ ⊢ E1 : T // must have the same type
Γ ⊢ E2 : T
--------------- "add"
Γ ⊢ E1 + E2 : T
```

In a-mir-formality, you might write those rules like so:

```rust
judgment_fn! {
pub fn has_type(
env: Env,
expr: Expr,
) => Type {
(
(env.get(&name) => ty)
---------------
(has_type(env, name: Variable) => ty)
)

(
(has_type(env, left) => ty_left)
(has_type(env, right) => ty_right)
(if ty_left == ty_right)
---------------
(has_type(env, Expr::Add(left, right)) => ty_left)
)
}
}
```

Unlike mathematical papers, where judgments can look like whatever you want, judgments in a-mir-formality always have a fixed form that distinguish inputs and outputs:

```
judgment_name(input1, input2, input3) => output
```

In this case, `has_type(env, expr) => ty` is the equivalent of `Γ ⊢ E : T`. Note that, by convention, we tend to use more English style names, so `env` and not `Γ`, and `expr` and not `E`. Of course nothing is stop you from using single letters, it's just a bit harder to read.

When we write the `judgement_fn`, it is going to desugar into an actual Rust function that looks like this:

```rust
pub fn has_type(arg0: impl Upcast<Env>, arg1: impl Upcast<Expr>) -> Set<Type> {
let arg0: Env = arg0.upcast();
let arg1: Expr = arg1.upcast();

...
}
```

Some things to note. First, the function arguments (`arg0`, `arg1`) implicitly accept anything that "upcasts" (infallibly converts) into the desired types. `Upcast` is a trait defined within a-mir-formality and implemented by the `#[term]` macro automatically.

Second, the function always returns a `Set`. This is because there can be more rules, and they may match in any ways. The generated code is going to exhaustively search to find all the ways that the rules could match. At a high-level the code looks like this (at least if we ignore the possibility of cycles; we'll come back to that later):

```rust
pub fn has_type(arg0: impl Upcast<Env>, arg1: impl Upcast<Expr>) -> Set<Type> {
let arg0: Env = arg0.upcast();
let arg1: Expr = arg1.upcast();

let mut results = set![]; // we define this macro

if /* inference rule 1 matches */ {
results.push(/* result from inference rule 1 */);
}

if /* inference rule 2 matches */ {
results.push(/* result from inference rule 1 */);
}

// ...

if /* inference rule N matches */ {
results.push(/* result from inference rule N */);
}

results
}
```

So how do we test if a particular inference rule matches? Let's look more closely at the code we would generate for this inference rule:

```rust
(
(env.get(name) => ty)
---------------
(has_type(env, name: Variable) => ty)
)
```

The first part of the final line, `has_type(env, name: Variable)`, defines patterns that are matched against the arguments. These are matched against the arguments (`arg0`, `arg1`) from the judgment. Patterns can either be trivial bindings (like `env`) or more complex (like `name: Variable` or `Expr::Add(left, right)`). In the latter case, they don't have to match the type of the argument precisely. Instead, we use the `Downcast` trait combined with pattern matching. So this inference rule would compile to something like...

```rust
// Simple variable bindings just clone...
let env = arg0.clone();

// More complex patterns are downcasted and testing...
if let Some(name) = arg1.downcast::<Variable>() {
... // rule successfully matched! See below.
}
```

Once we've matched the arguments, we start trying to execute the inference rule conditions. We have one, `env.get(&name) => ty`. What does that do? A condition written like `$expr => $pat` basically becomes a for loop, so you get...

```rust
let env = arg0.clone();
if let Some(name) = arg1.downcast::<Variable>() {
for ty in env.get(&name) {
... // other conditions, if any
}
}
```

Once we run out of conditions, we can generate the final result, which comes from the `=> $expr` in the conclusion of the rule. In this case, something like this:

```rust
let env = arg0.clone();
if let Some(name) = arg1.downcast::<Variable>() {
for ty in env.get(&name) {
result.push(ty);
}
}
```

Thus each inference rule is converted into a little block of code that may push results onto the final set.

The second inference rule (for addition) looks like...

```rust
// given this...
// (
// (has_type(env, left) => ty_left)
// (has_type(env, right) => ty_right)
// (if ty_left == ty_right)
// ---------------
// (has_type(env, Expr::Add(left, right)) => ty_left)
// )
// we generate this...
let env = arg0.clone();
if let Some(Expr::Add(left, right)) = arg1.downcast() {
for ty_left in has_type(env, left) {
for ty_right in has_type(env, right) {
if ty_left == ty_right {
result.push(ty_left);
}
}
}
}
```

If you want to see a real judgement, take a look at the one for [proving where clauses][`prove_wc`].

[`prove_wc`]: https://github.com/rust-lang/a-mir-formality/blob/bca36ecd069d6bdff77bffbb628ae3b2ef4f8ef7/crates/formality-prove/src/prove/prove_wc.rs#L21-L125

### Handling cycles

Judgment functions must be **inductive**, which means that cycles are considered failures. We have a tabling implementation, which means we detect cycles and try to handle them intelligently. Basically we track a stack and, if a cycle is detected, we return an empty set of results. But we remember that the cycle happened. Then, once we are done, we'll have computed some intermediate set of results `R[0]`, and we execute again. This time, when we get the cycle, we return `R[0]` instead of an empty set. This will compute some new set of results, `R[1]`. So then we try again. We keep doing this until the new set of results `R[i]` is equal to the previous set of results `R[i-1]`. At that point, we have reached a fixed point, so we stop. Of course, it could be that you get an infinitely growing set of results, and execution never terminates. This means your rules are broken. Don't do that.
41 changes: 41 additions & 0 deletions book/src/formality_core/lang.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# Defining your language

The very first thing you do to define your own language is to use the `formality_core::declare_language!` macro.
You use it like so:

```rust
{{#include ../../../crates/formality-types/src/lib.rs:declare_rust_language}}
```

The `declare_language` macro will create a module with the name you specify (here, `rust`).
You have to tell it a few things:

* The `NAME` of your language, a string for debugging.
* An enum defining the *kinds* of variables in your language; see the [variables](./variables.md) chapter for more information. For Rust, the kinds are types, lifetimes, and constants.
* An enum defining a *parameter*, which is the terms that can be used to represent the *value* of a variable; see the [variables](./variables.md) chapter for more information.
* Two characters `BINDER_OPEN` and `BINDER_CLOSED` defining the opening and closing characters for binders, e.g., `<` and `>`, to use when parsing.

## Contents of the language module

The language module you create has various items in it:

* A `struct FormalityLang` that defines your language. Some of the contains of `formality_core` (notably the traits that involve bound variables) are

## Specifying the language for a crate

That module will contain a language struct named `FormalityLang`. It
Other parts of the formality system (e.g., autoderives and the like)
will need to know the current language you are defining,
and they expect to find it at `crate::FormalityLang`.
Best practice is to add a `use` at the top of your crate defining your language.
For example, the `formality_types` crate has:

```rust
{{#include ../../../crates/formality-types/src/lib.rs:use_rust_language}}
```

and other crates like `formality_rust` have:

```rust
{{#include ../../../crates/formality-rust/src/lib.rs:use_rust_language}}
```
Loading

0 comments on commit 276106b

Please sign in to comment.