- Abstract
- Problem
- Background
- Proposal
- Details
- Rewrite constraints
- Combining constraints with
&
- Combining constraints with
and
- Combining constraints with
extends
- Combining constraints with
impl as
andis
- Constraint resolution
- Precise rules and termination
- Same type constraints
- Observe declarations
- Implementing an interface
- Ergonomics
- Implementation experience
- Rationale
- Alternatives considered
- Status quo
- Restrict constraints to allow computable type equality
- Find a fully transitive approach to type equality
- Different syntax for rewrite constraint
- Different syntax for same-type constraint
- Required ordering for rewrites
- Multi-constraint
where
clauses - Rewrite constraints in
impl as
constraints
Split the where A == B
constraint in two: where .A = B
produces a new
constraint from an old one, where the value of .A
in the new constraint is
known and eagerly rewritten to B
, and where A == B
, which does not cause A
and B
to be considered as identical by language rules but does permit implicit
(no-op) conversion between them.
This aims to provide an efficiently-computable and human-understandable type equality rule, with type canonicalization and therefore transitive type equality, without sacrificing too much in the way of ergonomics and without sacrificing determinism, while still providing the full power of a general type constraint system in a less ergonomic form.
When an associated type is constrained to be a concrete type, it is desirable for the associated type to behave like that concrete type:
interface Hashable {
fn Hash[me: Self]() -> u128;
}
interface X {
let R:! Hashable;
fn Make() -> R;
}
fn F[T:! X where .R == i32](x: T) -> i32 {
return T.Make() + 1;
}
Here, we want to be able to use the result of F.Make()
in all regards like an
i32
. Assuming an i32.Abs
function exists, and i32
implements Hashable
externally, it is more desirable for F.Make().Abs()
to work than it is for
F.Make().Hash()
to work.
We are currently aiming to address this by saying that when two type expressions are constrained to be equal, a value whose type is either of those type expressions provides the (disjoint) union of their interfaces. This leads to a surprise:
fn G[T:! X where .R == i32](x: T) -> u128 {
let n: i32 = 1;
let m: i64 = 2;
return n.Hash() ^ m.Hash();
}
Here, the call to n.Hash()
is valid but the call to m.Hash()
is invalid,
because the type i32
of n
is constrained to be equal to T.R
which is of
type Hashable
, but the type i64
is not constrained in that way and does not
implement Hashable
internally.
This suggests that there might be two different behaviors that we want when adding a constraint: either we are changing the interface, or not. This is analogous to internal versus external implementation of an interface.
There are a variety of contexts in which types are compared for equality. In some of those contexts, it might be acceptable to perform a search and a computation, and it might be acceptable to have false negatives. For example, when analyzing an argument being passed to a function in a generic, it might be OK to say "we can't prove that the argument has a type that is equal to (or convertible to) the type of the parameter", so long as the developer has some way to demonstrate that the types are the same if that is indeed the case.
However, not all comparisons of generic types have that property. As an extreme and hopefully avoidable example, suppose a C++-like linkage model is used, with name mangling. We may need to determine whether two type expressions are equal at link time, by reducing them both to strings that are guaranteed to be identical if the types are equal and non-identical otherwise. This requires some form of type canonicalization: reducing all equal type expressions to some common representation where we can faithfully perform equality comparisons.
Even if the semantics of Carbon don't require canonicalization, it is still a very useful property for implementation purposes. For example, if type canonicalization is possible to perform efficiently, building a data structure mapping from a type to some value becomes much simpler, and if the representation of a type expression carries its canonical form then type equality comparisons can be done in constant time.
For ergonomic purposes, it is valuable for the notion of type expression equality to be transitive: if type A is known to be the same as type B, and type B is known to be the same as type C, then shouldn't we be able to use an A where a C is required? See #1369 for an explanation of why this is important.
However, if we permit arbitrary constraints and have a transitive equality rule, then type equality is isomorphic to the word problem for finitely-presented monoids, which is undecidable, and certainly not computable, in general.
Therefore we must impose a restriction somewhere: either not all constraints are permissible, or we do not have transitivity, or we impose some other constraint on the language to avoid the hard cases. Some languages merely impose some kind of depth limit on their search for a proof of equality.
Note that if we can perform type canonicalization, then type equality is necessarily transitive: if all equal pairs of types canonicalize to the same representation, then all equal types canonicalize to the same representation transitively.
In order to avoid surprises in the language behavior, we want Carbon to have the property that learning more about the type of a value should not cause the behavior of an expression to change from one valid behavior to another valid behavior.
It's important to distinguish here between learning more about a type, such as
learning that it implements a particular interface – for example, from an added
import – and changing the type, such as by writing a different expression after
a :
or :!
. Changing the type of a value is expected to be able to change the
behavior of an expression using that value.
There has been a lot of work in this space. Some previous relevant proposals:
- #731: generics details 2 introduced associated types, with the intent that a mechanism to constrain their value would be provided in a later proposal.
- #818: generics details 3
introduced
where
constraints for generics, with an open question for whether and how we might restrict constraints in order to give a decidable type equality rule.- #818 represents the status quo of approved proposals. It has both
=
and==
where constraints, which are similar to, but somewhat different from, those in this proposal. =
constraints require a concrete type on their right-hand side. The left-hand side is not restricted. The facet type of the left-hand side type is changed to match the right-hand side.==
constraints do not affect the facet type of either operand.- As an answer to the question of how to produce a decidable type equality
rule, both kinds of constraints are applied automatically, but only at a
depth of one constraint. Neither form is transitive but both can be
transitively extended using
observe
declarations.
- #818 represents the status quo of approved proposals. It has both
- #2070: always
==
not=
inwhere
clauses describes an attempt to unify the syntax and semantics of=
and==
constraints, under which we would always merge the facet types of the operands. That proposal has not yet been accepted, and this proposal intends to supersede it.
Replace the semantics of the existing where A == B
and where A = B
constraints and restrict the syntax of where A = B
as follows:
where .A = T
specifies a rewrite constraint. The left-hand side is required to name an associated constant. If that associated constant.A
is named as a member of a type declared with this constraint, it is rewritten toT
. This means that the interface of the type changes, and the behavior is in all respects as ifT
had been specified directly.where X == Y
specifies a same-type constraint. This is treated much likewhere X is SameAs(Y)
. Here,SameAs
is a hypothetical constraint that is reflexive (T is SameAs(T)
for allT
), commutative (T is SameAs(U)
if and only ifU is SameAs(T)
), and not implementable by the developer. Moreover, ifF
is any pure type function,T is SameAs(U)
impliesF(T) is SameAs(F(U)
).observe
statements can be used to form transitiveSameAs
relations. Same-type constraints are not used automatically by the language rules for any purpose, but a blanketImplicitAs
impl permits conversions between types that are known to be the same:impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U);
A type implements C where .A = T
if and only if it implements
C where .A == T
, assuming both constraints are valid – the inhabitants of
these two facet types are the same, but they are different facet types and
provide different interfaces for corresponding type values.
In a rewrite constraint, the left-hand operand of =
must be a .
followed by
the name of an associated constant. .Self
is not permitted.
interface RewriteSelf {
// ❌ Error: `.Self` is not the name of an associated constant.
let Me:! type where .Self = Self;
}
interface HasAssoc {
let Assoc:! type;
}
interface RewriteSingleLevel {
// ✅ Uses of `A.Assoc` will be rewritten to `i32`.
let A:! HasAssoc where .Assoc = i32;
}
interface RewriteMultiLevel {
// ❌ Error: Only one level of associated constant is permitted.
let B:! RewriteSingleLevel where .A.Assoc = i32;
}
This notation is permitted anywhere a constraint can be written, and results in a new constraint with a different interface: the named member effectively no longer names an associated constant of the constrained type, and is instead treated as a rewrite rule that expands to the right-hand side of the constraint, with any mentioned parameters substituted into that type.
interface Container {
let Element:! type;
let Slice:! Container where .Element = Element;
fn Add[addr me: Self*](x: Element);
}
// `T.Slice.Element` rewritten to `T.Element`
// because type of `T.Slice` says `.Element = Element`.
// `T.Element` rewritten to `i32`
// because type of `T` says `.Element = i32`.
fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) {
// ✅ Argument `y` has the same type `i32` as parameter `x` of
// `T.(Container.Add)`, which is also rewritten to `i32`.
p->Add(y);
}
Rewrites aren't performed on the left-hand side of such an =
, so
where .A = .B and .A = C
is not rewritten to where .A = .B and .B = C
.
Instead, such a where
clause is invalid when the constraint is
resolved unless each rule for .A
specifies the same
rewrite.
Note that T:! C where .R = i32
can result in a type T.R
whose behavior is
different from the behavior of T.R
given T:! C
. For example, member lookup
into T.R
can find different results and operations can therefore have
different behavior. However, this does not violate coherence because the facet
types C
and C where .R = i32
don't differ by merely having more type
information; rather, they are different facet types that have an isomorphic set
of values, somewhat like i32
and u32
. An =
constraint is not merely
learning a new fact about a type, it is requesting different behavior.
Suppose we have X = C where .R = A
and Y = C where .R = B
. What should
C & X
produce? What should X & Y
produce?
We could perhaps say that X & Y
results in a facet type where the type of R
has the union of the interface of A
and the interface of B
, and that C & X
similarly results in a facet type where the type of R
has the union of the
interface of A
and the interface originally specified by C
. However, this
proposal suggests a simpler rule:
- Combining two rewrite rules with different rewrite targets results in a
facet type where the associated constant is ambiguous. Given
T:! X & Y
, the type expressionT.R
is ambiguous between a rewrite toA
and a rewrite toB
. But givenT:! X & X
,T.R
is unambiguously rewritten toA
. - Combining a constraint with a rewrite rule with a constraint with no rewrite
rule preserves the rewrite rule. For example, supposing that
interface Container
extendsinterface Iterable
, andIterable
has an associated constantElement
, the constraintContainer & (Iterable where .Element = i32)
is the same as the constraint(Container & Iterable) where .Element = i32
which is the same as the constraintContainer where .Element = i32
.
If the rewrite for an associated constant is ambiguous, the facet type is rejected during constraint resolution.
It's possible for one =
constraint in a where
to refer to another. When this
happens, the facet type C where A and B
is interpreted as
(C where A) where B
, so rewrites in A
are applied immediately to names in
B
, but rewrites in B
are not applied to names in A
until the facet type is
resolved:
interface C {
let T:! type;
let U:! type;
let V:! type;
}
class M {
alias Me = Self;
}
// ✅ Same as `C where .T = M and .U = M.Me`, which is
// the same as `C where .T = M and .U = M`.
fn F[A:! C where .T = M and .U = .T.Me]() {}
// ❌ No member `Me` in `A.T:! type`.
fn F[A:! C where .U = .T.Me and .T = M]() {}
Within an interface or named constraint, extends
can be used to extend a
constraint that has rewrites.
interface A {
let T:! type;
let U:! type;
}
interface B {
extends A where .T = .U and .U = i32;
}
var n: i32;
// ✅ Resolved constraint on `T` is
// `B where .(A.T) = i32 and .(A.U) = i32`.
// `T.(A.T)` is rewritten to `i32`.
fn F(T:! B) -> T.(A.T) { return n; }
Within an interface or named constraint, the impl T as C
syntax does not
permit =
constraints to be specified directly. However, such constraints can
be specified indirectly, for example if C
is a named constraint that contains
rewrites. Because these constraints don't change the type of T
, rewrite
constraints in this context will never result in a rewrite being performed, and
instead are equivalent to ==
constraints.
interface A {
let T:! type;
let U:! type;
}
constraint C {
extends A where .T = .U and .U = i32;
}
constraint D {
extends A where .T == .U and .U == i32;
}
interface B {
// OK, equivalent to `impl as D;` or
// `impl as A where .T == .U and .U == i32;`.
impl as C;
}
var n: i32;
// ❌ No implicit conversion from `i32` to `T.(A.T)`.
// Resolved constraint on `T` is
// `B where T.(A.T) == T.(A.U) and T.(A.U) = i32`.
// `T.(A.T)` is single-step equal to `T.(A.U)`, and
// `T.(A.U)` is single-step equal to `i32`, but
// `T.(A.T)` is not single-step equal to `i32`.
fn F(T:! B) -> T.(A.T) { return n; }
A purely syntactic check is used to determine if an =
is specified directly in
an expression:
- An
=
constraint is specified directly in its enclosingwhere
expression. - If an
=
constraint is specified directly in an operand of an&
or(
...)
, then it is specified directly in that enclosing expression.
In an impl as C
or impl T as C
declaration in an interface or named
constraint, C
is not allowed to directly specify any =
constraints:
// Compile-time identity function.
fn Identity[T:! type](x:! T) -> T { return x; }
interface E {
// ❌ Rewrite constraint specified directly.
impl as A where .T = .U and .U = i32;
// ❌ Rewrite constraint specified directly.
impl as type & (A where .T = .U and .U = i32);
// ✅ Not specified directly, but does not result
// in any rewrites being performed.
impl as Identity(A where .T = .U and .U = i32);
}
The same rules apply to is
constraints. Note that .T == U
constraints are
also not allowed in this context, because the reference to .T
is rewritten to
.Self.T
, and .Self
is ambiguous.
// ❌ Rewrite constraint specified directly in `is`.
fn F[T:! A where .U is (A where .T = i32)]();
// ❌ Reference to `.T` in same-type constraint is ambiguous:
// does this mean the outer or inner `.Self.T`?
fn G[T:! A where .U is (A where .T == i32)]();
// ✅ Not specified directly, but does not result
// in any rewrites being performed. Return type
// is not rewritten to `i32`.
fn H[T:! type where .Self is C]() -> T.(A.U);
// ✅ Return type is rewritten to `i32`.
fn I[T:! C]() -> T.(A.U);
When a facet type is used as the declared type of a type T
, the constraints
that were specified within that facet type are resolved to determine the
constraints that apply to T
. This happens:
- When the constraint is used explicitly, when declaring a generic parameter
or associated constant of the form
T:! Constraint
. - When declaring that a type implements a constraint with an
impl
declaration, such asimpl T as Constraint
. Note that this does not includeimpl ... as
constraints appearing ininterface
orconstraint
declarations.
In each case, the following steps are performed to resolve the facet type's
abstract constraints into a set of constraints on T
:
- If multiple rewrites are specified for the same associated constant, they are required to be identical, and duplicates are discarded.
- Rewrites are performed on other rewrites in order to find a fixed point,
where no rewrite applies within any other rewrite. If no fixed point exists,
the generic parameter declaration or
impl
declaration is invalid. - Rewrites are performed throughout the other constraints in the facet type --
that is, in any
==
constraints andis
constraints -- and the type.Self
is replaced byT
throughout the constraint.
// ✅ `.T` in `.U = .T` is rewritten to `i32` when initially
// forming the facet type.
// Nothing to do during constraint resolution.
fn InOrder[A:! C where .T = i32 and .U = .T]() {}
// ✅ Facet type has `.T = .U` before constraint resolution.
// That rewrite is resolved to `.T = i32`.
fn Reordered[A:! C where .T = .U and .U = i32]() {}
// ✅ Facet type has `.U = .T` before constraint resolution.
// That rewrite is resolved to `.U = i32`.
fn ReorderedIndirect[A:! (C where .T = i32) & (C where .U = .T)]() {}
// ❌ Constraint resolution fails because
// no fixed point of rewrites exists.
fn Cycle[A:! C where .T = .U and .U = .T]() {}
To find a fixed point, we can perform rewrites on other rewrites, cycling between them until they don't change or until a rewrite would apply to itself. In the latter case, we have found a cycle and can reject the constraint. Note that it's not sufficient to expand only a single rewrite until we hit this condition:
// ❌ Constraint resolution fails because
// no fixed point of rewrites exists.
// If we only expand the right-hand side of `.T`,
// we find `.U`, then `.U*`, then `.U**`, and so on,
// and never detect a cycle.
// If we alternate between them, we find
// `.T = .U*`, then `.U = .U**`, then `.V = .U***`,
// then `.T = .U**`, then detect that the `.U` rewrite
// would apply to itself.
fn IndirectCycle[A:! C where .T = .U and .U = .V* and .V = .U*]();
After constraint resolution, no references to rewritten associated constants
remain in the constraints on T
.
If a facet type is never used to constrain a type, it is never subject to constraint resolution, and it is possible for a facet type to be formed for which constraint resolution would always fail. For example:
package Broken api;
interface X {
let T:! type;
let U:! type;
}
let Bad:! auto = (X where .T = .U) & (X where .U = .T);
// Bad is not used here.
In such cases, the facet type Broken.Bad
is not usable: any attempt to use
that facet type to constrain a type would perform constraint resolution, which
would always fail because it would discover a cycle between the rewrites for
.T
and for .U
. In order to ensure that such cases are diagnosed, a trial
constraint resolution is performed for all facet types. Note that this trial
resolution can be skipped for facet types that are actually used, which is the
common case.
This section explains the detailed rules used to implement rewrites. There are two properties we aim to satisfy:
- After type-checking, no symbolic references to associated constants that have an associated rewrite rule remain.
- Type-checking always terminates in a reasonable amount of time, ideally linear in the size of the problem.
Rewrites are applied in two different phases of program analysis.
- During qualified name lookup and type checking for qualified member access, if a rewritten member is looked up, the right-hand side's value and type are used for subsequent checks.
- During substitution, if the symbolic name of an associated constant is substituted into, and substitution into the left-hand side results in a value with a rewrite for that constant, that rewrite is applied.
In each case, we always rewrite to a value that satisfies property 1 above, and these two steps are the only places where we might form a symbolic reference to an associated cosntant, so property 1 is recursively satisfied. Moreover, we apply only one rewrite in each of the above cases, satisfying property 2.
Qualified name lookup into either a type parameter or into an expression whose
type is a symbolic type T
-- either a type parameter or an associated type --
considers names from the facet type C
of T
, that is, from T
’s declared
type.
interface C {
let M:! i32;
let U:! C;
}
fn F[T:! C](x: T) {
// Value is C.M in all four of these
let a: i32 = x.M;
let b: i32 = T.M;
let c: i32 = x.U.M;
let d: i32 = T.U.M;
}
When looking up the name N
, if C
is an interface I
and N
is the name of
an associated constant in that interface, the result is a symbolic value
representing "the member N
of I
". If C
is formed by combining interfaces
with &
, all such results are required to find the same associated constant,
otherwise we reject for ambiguity.
If a member of a class or interface is named in a qualified name lookup, the
type of the result is determined by performing a substitution. For an interface,
Self
is substituted for the self type, and any parameters for that class or
interface (and enclosing classes or interfaces, if any) are substituted into the
declared type.
interface SelfIface {
fn Get[me: Self]() -> Self;
}
class UsesSelf(T:! type) {
// Equivalent to `fn Make() -> UsesSelf(T)*;`
fn Make() -> Self*;
impl as SelfIface;
}
// ✅ `T = i32` is substituted into the type of `UsesSelf(T).Make`,
// so the type of `UsesSelf(i32).Make` is `fn () -> UsesSelf(i32)*`.
let x: UsesSelf(i32)* = UsesSelf(i32).Make();
// ✅ `Self = UsesSelf(i32)` is substituted into the type
// of `SelfIface.Get`, so the type of `UsesSelf(i32).(SelfIface.Get)`
// is `fn [me: UsesSelf(i32)]() -> UsesSelf(i32)`.
let y: UsesSelf(i32) = x->Get();
None of this is new in this proposal. This proposal adds the rule:
If a facet type C
into which lookup is performed includes a where
clause
saying .N = U
, and the result of qualified name lookup is the associated
constant N
, that result is replaced by U
, and the type of the result is the
type of U
. No substitution is performed in this step, not even a Self
substitution -- any necessary substitutions were already performed when forming
the facet type C
, and we don’t consider either the declared type or value of
the associated constant at all for this kind of constraint. Going through an
example in detail:
interface A {
let T:! type;
}
interface B {
let U:! type;
// More explicitly, this is of type `A where .(A.T) = Self.(B.U)`
let V:! A where .T = U;
}
// Type of T is B.
fn F[T:! B](x: T) {
// The type of the expression `T` is `B`.
// `T.V` finds `B.V` with type `A where .(A.T) = Self.(B.U)`.
// We substitute `Self` = `T` giving the type of `u` as
// `A where .(A.T) = T.(B.U)`.
let u:! auto = T.V;
// The type of `u` is `A where .(A.T) = T.(B.U)`.
// Lookup for `u.T` resolves it to `u.(A.T)`.
// So the result of the qualified member access is `T.(B.U)`,
// and the type of `v` is the type of `T.(B.U)`, namely `type`.
// No substitution is performed in this step.
let v:! auto = u.T;
}
The more complex case of
fn F2[T:! B where .U = i32](x: T);
is discussed later.
At various points during the type-checking of a Carbon program, we need to
substitute a set of (binding, value) pairs into a symbolic value. We saw an
example above: substituting Self = T
into the type A where .(A.T) = Self.U
to produce the value A where .(A.T) = T.U
. Another important case is the
substitution of inferred parameter values into the type of a function when
type-checking a function call:
fn F[T:! C](x: T) -> T;
fn G(n: i32) -> i32 {
// Deduces T = i32, which is substituted
// into the type `fn (x: T) -> T` to produce
// `fn (x: i32) -> i32`, giving `i32` as the type
// of the call expression.
return F(n);
}
Qualified name lookup is not re-done as a result of type substitution. For a template, we imagine there’s a completely separate step that happens before type substitution, where qualified name lookup is redone based on the actual value of template arguments; this proceeds as described in the previous section. Otherwise, we performed the qualified name lookup when type-checking the generic, and do not do it again:
interface IfaceHasX {
let X:! type;
}
class ClassHasX {
class X {}
}
interface HasAssoc {
let Assoc:! IfaceHasX;
}
// Qualified name lookup finds `T.(HasAssoc.Assoc).(IfaceHasX.X)`.
fn F(T:! HasAssoc) -> T.Assoc.X;
fn G(T:! HasAssoc where .Assoc = ClassHasX) {
// `T.Assoc` rewritten to `ClassHasX` by qualified name lookup.
// Names `ClassHasX.X`.
var a: T.Assoc.X = {};
// Substitution produces `ClassHasX.(IfaceHasX.X)`,
// not `ClassHasX.X`.
var b: auto = F(T);
}
During substitution, we might find a member access that named an opaque symbolic associated constant in the original value can now be resolved to some specific value. It’s important that we perform this resolution:
interface A {
let T:! type;
}
class K { fn Member(); }
fn H[U:! A](x: U) -> U.T;
fn J[V:! A where .T = K](y: V) {
// We need the interface of `H(y)` to include
// `K.Member` in order for this lookup to succeed.
H(y).Member();
}
The values being substituted into the symbolic value are themselves already fully substituted and resolved, and in particular, satisfy property 1 given above.
Substitution proceeds by recursively rebuilding each symbolic value, bottom-up,
replacing each substituted binding with its value. Doing this naively will
propagate values like i32
in the F
/G
case earlier in this section, but
will not propagate rewrite constants like in the H
/J
case. The reason is
that the .T = K
constraint is in the type of the substituted value, rather
than in the substituted value itself: deducing T = i32
and converting i32
to
the type C
of T
preserves the value i32
, but deducing U = V
and
converting V
to the type A
of U
discards the rewrite constraint.
In order to apply rewrites during substitution, we associate a set of rewrites with each value produced by the recursive rebuilding procedure. This is somewhat like having substitution track a refined facet type for the type of each value, but we don’t need -- or want -- for the type to change during this process, only for the rewrites to be properly applied. For a substituted binding, this set of rewrites is the rewrites found on the type of the corresponding value prior to conversion to the type of the binding. When rebuilding a member access expression, if we have a rewrite corresponding to the accessed member, then the resulting value is the target of the rewrite, and its set of rewrites is that found in the type of the target of the rewrite, if any. Because the target of the rewrite is fully resolved already, we can ask for its type without triggering additional work. In other cases, the rewrite set is empty; all necessary rewrites were performed when type-checking the value we're substituting into.
Continuing an example from qualified name lookup:
interface A {
let T:! type;
}
interface B {
let U:! type;
let V:! A where .T = U;
}
// Type of the expression `T` is `B where .(B.U) = i32`
fn F2[T:! B where .U = i32](x: T) {
// The type of the expression `T` is `B where .U = i32`.
// `T.V` is looked up and finds the associated type `(B.V)`.
// The declared type is `A where .(A.T) = Self.U`.
// We substitute `Self = T` with rewrite `.U = i32`.
// The resulting type is `A where .(A.T) = i32`.
// So `u` is `T.V` with type `A where .(A.T) = i32`.
let u:! auto = T.V;
// The type of `u` is `A where .(A.T) = i32`.
// Lookup for `u.T` resolves it to `u.(A.T)`.
// So the result of the qualified member access is `i32`,
// and the type of `v` is the type of `i32`, namely `type`.
// No substitution is performed in this step.
let v:! auto = u.T;
}
interface Container {
let Element:! type;
}
interface SliceableContainer {
extends Container;
let Slice:! Container where .Element = Self.(Container.Element);
}
// ❌ Qualified name lookup rewrites this facet type to
// `SliceableContainer where .(Container.Element) = .Self.(Container.Element)`.
// Constraint resolution rejects this because this rewrite forms a cycle.
fn Bad[T:! SliceableContainer where .Element = .Slice.Element](x: T.Element) {}
interface Helper {
let D:! type;
}
interface Example {
let B:! type;
let C:! Helper where .D = B;
}
// ✅ `where .D = ...` by itself is fine.
// `T.C.D` is rewritten to `T.B`.
fn Allowed(T:! Example, x: T.C.D);
// ❌ But combined with another rewrite, creates an infinite loop.
// `.C.D` is rewritten to `.B`, resulting in `where .B = .B`,
// which causes an error during constraint resolution.
// Using `==` instead of `=` would make this constraint redundant,
// rather than it being an error.
fn Error(T:! Example where .B = .C.D, x: T.C.D);
interface Allowed;
interface AllowedBase {
let A:! Allowed;
}
interface Allowed {
extends AllowedBase where .A = .Self;
}
// ✅ The final type of `x` is `T`. It is computed as follows:
// In `((T.A).A).A`, the inner `T.A` is rewritten to `T`,
// resulting in `((T).A).A`, which is then rewritten to
// `(T).A`, which is then rewritten to `T`.
fn F(T:! Allowed, x: ((T.A).A).A);
interface MoveYsRight;
constraint ForwardDeclaredConstraint(X:! MoveYsRight);
interface MoveYsRight {
let X:! MoveYsRight;
// Means `Y:! MoveYsRight where .X = X.Y`
let Y:! ForwardDeclaredConstraint(X);
}
constraint ForwardDeclaredConstraint(X:! MoveYsRight) {
extends MoveYsRight where .X = X.Y;
}
// ✅ The final type of `x` is `T.X.Y.Y`. It is computed as follows:
// The type of `T` is `MoveYsRight`.
// The type of `T.Y` is determined as follows:
// - Qualified name lookup finds `MoveYsRight.Y`.
// - The declared type is `ForwardDeclaredConstraint(Self.X)`.
// - That is a named constraint, for which we perform substitution.
// Substituting `X = Self.X` gives the type
// `MoveYsRight where .X = Self.X.Y`.
// - Substituting `Self = T` gives the type
// `MoveYsRight where .X = T.X.Y`.
// The type of `T.Y.Y` is determined as follows:
// - Qualified name lookup finds `MoveYsRight.Y`.
// - The declared type is `ForwardDeclaredConstraint(Self.X)`.
// - That is a named constraint, for which we perform substitution.
// Substituting `X = Self.X` gives the type
// `MoveYsRight where .X = Self.X.Y`.
// - Substituting `Self = T.Y` with
// rewrite `.X = T.X.Y` gives the type
// `MoveYsRight where .X = T.Y.X.Y`, but
// `T.Y.X` is replaced by `T.X.Y`, giving
// `MoveYsRight where .X = T.X.Y.Y`.
// The type of `T.Y.Y.X` is determined as follows:
// - Qualified name lookup finds `MoveYsRight.X`.
// - The type of `T.Y.Y` says to rewrite that to `T.X.Y.Y`.
// - The result is `T.X.Y.Y`, of type `MoveYsRight`.
fn F4(T:! MoveYsRight, x: T.Y.Y.X);
Each of the above steps performs at most one rewrite, and doesn't introduce any
new recursive type-checking steps, so should not introduce any new forms of
non-termination. Rewrite constraints thereby give us a deterministic,
terminating type canonicalization mechanism for associated constants: in A.B
,
if the type of A
specifies that .B = C
, then A.B
is replaced by C
.
Equality of types constrained in this way is transitive.
However, some existing forms of non-termination may remain, such as template instantiation triggering another template instantiation. Such cases will need to be detected and handled in some way, such as by a depth limit, but doing so doesn't compromise the soundness of the type system.
A same-type constraint describes that two type expressions are known to evaluate to the same value. Unlike a rewrite constraint, however, the two type expressions are treated as distinct types when type-checking a generic that refers to them.
Same-type constraints are brought into scope, looked up, and resolved exactly as
if there were a SameAs(U:! type)
interface and a T == U
impl corresponded to
T is SameAs(U)
, except that ==
is commutative. As such, it's not possible to
ask for a list of types that are the same as a given type, nor to ask whether
there exists a type that is the same as a given type and has some property. But
it is possible to ask whether two types are (non-transitively) the same.
In order for same-type constraints to be useful, they must allow the two types
to be treated as actually being the same in some context. This can be
accomplished by the use of ==
constraints in an impl
, such as in the
built-in implementation of ImplicitAs
:
final impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) {
fn Convert[me: Self](other: U) -> U { ... }
}
It superficially seems like it would be convenient if such implementations were
made available implicitly – for example, by writing
impl forall [T:! type] T as ImplicitAs(T)
– but in more complex examples that
turns out to be problematic. Consider:
interface CommonTypeWith(U:! type) {
let Result:! type;
}
final impl forall [T:! type] T as CommonTypeWith(T) where .Result = T {}
fn F[T:! Potato, U:! Hashable where .Self == T](x: T, y: U) -> auto {
// What is T.CommonTypeWith(U).Result? Is it T or U?
return (if cond then x else y).Hash();
}
With this proposal, impl
validation for T as CommonTypeWith(U)
fails: we
cannot pick a common type when given two distinct type expressions, even if we
know they evaluate to the same type, because we would not know which API the
result should have.
It is possible to implement the above impl
of ImplicitAs
directly in Carbon,
without a compiler builtin, by taking advantage of the built-in conversion
between C where .A = X
and C where .A == X
:
interface EqualConverter {
let T:! type;
fn Convert(t: T) -> Self;
}
fn EqualConvert[T:! type](t: T, U:! EqualConverter where .T = T) -> U {
return U.Convert(t);
}
impl forall [U:! type] U as EqualConverter where .T = U {
fn Convert(u: U) -> U { return u; }
}
impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) {
fn Convert[self: Self]() -> U { return EqualConvert(self, U); }
}
The transition from (T as ImplicitAs(U)).Convert
, where we know that U == T
,
to EqualConverter.Convert
, where we know that .T = U
, allows a same-type
constraint to be used to perform a rewrite.
This implementation is in use in Carbon Explorer.
Same-type constraints are non-transitive, just like other constraints such as
ImplicitAs
. The developer can use an observe
declaration to bring a new
same-type constraint into scope:
observe A == B == C;
notionally does much the same thing as
impl A as SameAs(C) { ... }
where the impl
makes use of A is SameAs(B)
and B is SameAs(C)
.
When implementing an interface, the =
syntax must be used, and each associated
constant without a default must be explicitly assigned-to:
impl IntVec as Container where .Element = i32 { ... }
not
impl IntVec as Container where .Element == i32 { ... }
The latter would be treated as
impl IntVec as Container where IntVec.Element is SameAs(i32) { ... }
... which only checks the value of Element
rather than specifying it.
As in other contexts, Self.Element
is rewritten to i32
within the context of
the impl
.
This proposal can be viewed as dividing type constraints into two categories:
- Ergonomic, but quite restricted, constraints using
=
. - Non-ergonomic, but fully general, constraints using
==
.
In order for this to be an effective and ergonomic strategy, we require both
that the difference between these two kinds of constraints are readily
understood by developers, and that the more-ergonomic =
constraints cover
sufficiently many scenarios that the developer seldom needs to write code to
request a conversion between types that are known to be the same.
Ideally, we hope that =
constraints will cover all real situations, and ==
constraints will never need to be written, outside of the one ImplicitAs
implementation described above. If this turns out to be true in practice, we
should consider removing ==
syntax from the language. However, this will not
remove ==
semantics, both because of the behavior of ImplicitAs
and because
impl T as C where .A = B
constraints in an interface or named constraint give
rise to ==
semantics.
This proposal has been mostly implemented in Carbon Explorer, and appears to behave as expected. However, Explorer does not yet support forward declarations of interfaces and constraints, which means that some of the more interesting cases can't be tested.
- Language tools and ecosystem
- This rule is easy to implement, and even a naive implementation should be efficient.
- Software and language evolution
- The status quo causes coincidentally equal types to have the same interface. Under this proposal, the interface of a type is not affected by coincidental equality, only by intentional assignment to an associated type.
- Code that is easy to read, understand, and write
- Using
=
rather than==
inimpl
s is easier to read and write. - The interface available on a type is more predictable in this proposal than in the status quo ante, making code easier to understand.
- Using
The Problem section describes some concerns with the status quo approach. This proposal aims to address those concerns.
We could somehow restrict which constraints can be specified, in order to ensure that our type equality rule is efficiently computable, perhaps even in a way that permits a deterministic computation of a canonical type. The exact details of how we would do this have not been worked out, but this might lead to a coherent rule that has transitive type equality. However, this would not solve the "equal types with different interfaces" problem.
This proposal makes type equality transitive, but still has non-transitive "same-type" constraints describing types that are known to always eventual be the same after substitution, but which are not treated as the same type value while type-checking a generic. This proposal also imposes a directionality on rewrites, a restriction to only rewrite at a depth of exactly one associated constant, and a restriction that only one value can be specified as equal to a given associated constant. We could seek a type equality rule that would avoid having two different kinds of equality and would avoid some or all of the restrictions placed on rewrite constraints.
One candidate approach is to accept that the full generality of the equality problem is hard, but attempt to solve it anyway. Because of the close correspondence between Turing machines and string rewriting systems, this results in a type system with no upper bound on its runtime. We consider this unacceptable for Carbon, but it is the approach taken by some other modern languages:
- Swift has an undecidable type system due to this, and handles the problem by placing a limit on the complexity of cases they will accept, but that seems unsuited to Carbon's goals, as the rule cannot be readily understood by a developer nor effectively worked around.
- Rust also has an undecidable type system (content warning: contains many instances of an obscene word as part of a programming language name) for the same reason, and similarly has a recursion limit.
See also Typing is Hard, which lists similar information for a variety of other languages. Note that most languages listed have an undecidable type system.
Another candidate approach is to find a way to reduce the inputs to the type-checking step as a set of non-quantified equalities. The resulting system of equalities can then be solved efficiently to determine whether two types are known to be the same. This approach appears sufficient to cope with locally declared constraints, but when constraints appear within interfaces, they can give rise to an infinite family of equalities, and picking any finite subset risks the result being non-transitive in general, if a different subset of equalities might be considered in a different type-checking context.
We could use a different syntax, such as ~>
, for rewrite constraints. This
would have the advantage of not using assignment for an operation that's not
always doing something that developers will think of as assignment. It also
avoids hard-to read code in cases where a binding has an initializer:
// This proposal.
let T:! Add where .Result = i32 = i32;
// Alternative syntax.
let T:! Add where .Result ~> i32 = i32;
However, it seems valuable to use only a single syntax for specifying these
constraints, and equality is the appropriate mental model most of the time.
Moreover, in an impl
declaration, we really are assigning to the associated
constant in the sense of setting a value, rather than merely constraining a
value.
This decision should be revisited if a superior syntax that avoids the above visual ambiguity is suggested.
We considered various options for the spelling of a same-type constraint:
where A == B
where A ~ B
- some other operator symbol
where A is SameType(B)
, or some other constraint name
The advantage of ==
is that it has a lot of desirable implications: it's
familiar, symmetric, and connotes the left and right operand being the same.
However, it also might be taken as suggesting that the ==
overloaded operator
is used to determine equality. A different spelling would also indicate that
this is an unusual constraint, and would be easier to search for. Concerns were
also raised that ==
might suggest transitivity if taken to mean "same value"
rather than something like the Carbon ==
binary operator, and the transitivity
of ==
constraints is not automatic.
The ~
operator is currently not in use as a binary operator. However, that
makes it quite valuable syntactic real estate, and it may see little use in this
context in practice. A longer operator could be used, but any choice other than
==
will present an unfamiliarity barrier, and longer operators will be harder
to remember.
A named constraint is appealing, but this operator does not behave like a named
constraint in that it is automatically symmetrized, not implementable, and
especially in its interactions with observe
declarations. It is unclear how
observe
declarations would be written with a named SameType
constraint:
// Maybe something like this special-case and verbose syntax?
observe A is SameType(B) and B is SameType(C) => A is SameType(C);
The fundamental connection between same-type constraints and observe
suggests
that dedicated syntax is justified.
For now, we use ==
, despite it having a different meaning in the context of a
constraint than in expression contexts. This is analogous to how =
and and
have different meanings in this context from in expressions, so the behavior
divergence is at least consistent. It's also not clear at this stage how much
usage same-type constraints will have, compared to rewrite constraints, so it's
hard to judge the other arguments in favor of and against other operators. It
would be reasonable to revisit this decision when we have more usage
information.
We considered a rule where a rewrite would need to be specified in a constraint before the associated constant is used. For example:
interface C {
let T:! type;
let U:! type;
}
// Could reject, `.U` referenced before rewrite is defined.
fn G[A:! C where .T = .U and .U = i32]() {}
This would remove the need to perform constraint resolution. Instead, we could apply rewrites as we form the constraint, and constraints formed in this way would always satisfy the property that they don't refer to associated constants that have a rewrite.
It may also be less surprising to use this rule. Because Carbon uses top-down processing for type-checking in general, it may be unexpected that a rewrite rule would rewrite an earlier occurrence of its rewrite target, even though this happens in practice at a later point in time, during constraint resolution.
However, such an approach would not give a satisfactory outcome for cases such as:
constraint X {
extends C where .T = .U;
}
constraint Y {
extends C where .U = i32;
}
// Desired behavior is that `.T` and `.U` both rewrite to `i32`.
fn G[A:! X & Y]() -> A.T { return 0; }
Performing earlier-appearing rewrites in later constraints and additionally performing a constraint resolution step to apply rewrites throughout the constraint seems to give better outcomes for the examples we considered.
Instead of treating A where B and C
as (A where B) where C
, we could model
it as (A where B) & (A where C)
.
Specifically, given
interface C {
let T:! type;
let U:! type;
}
this would treat C where .T = A and .U = B
as
(C where .T = A) & (C where .U = B)
, where the type of .Self
would be C
in
both clauses, with both constraints applied "simultaneously" to C
. This would
mean that the order of clauses doesn’t matter.
However, if we do this, then neither right-hand side is rewritten by name
lookup, and thus we would reject cases such as
C where .T = i32 and .U = .T.(Add.Result)
. This case seems reasonable, and we
would prefer to accept it.
Therefore, the rule we select is that, as soon as a rewrite is declared, it is
in scope, and later references to that associated constant refer to the
rewritten value with its rewritten type. We still reject cases like
C where .U = .T.(Add.Result) and .T = i32
, because we do not know that
T is Add
from the declared type of C.T
.
The same rule applies to is
constraints: we accept
C where .T is Add and .U = .T.(Add.Result)
, but reject
C where .U = .T.(Add.Result) and .T is Add
.
A rewrite constraint can appear indirectly in an impl as
constraint in an
interface or named constraint:
interface A {
let T:! type;
}
constraint AInt {
extends A where .T = i32;
}
interface B {
// Or, `impl as A where .T = i32;`
impl as AInt;
}
When this happens, the rewrite constraint does not result in rewrites being performed when the associated constant is mentioned in a member access into that interface or named constraint:
// Return type is not rewritten to `i32`,
// but there is still a same-type constraint.
fn F(T:! B) -> T.(A.T) {
// OK, `i32` implicitly converts to `T.(A.T)`.
return 0 as i32;
}
This may be surprising: B
says that its .(A.T)
has a rewrite to i32
, but
that rewrite is not performed. In contrast, rewrites in an extends
constraint
do become part of the enclosing interface or named constraint.
interface C {
extends AInt;
}
// Return type is rewritten to `i32`.
fn G(T:! C) -> T.T;
Something similar happens with impl T as
constraints. The following two
interfaces A1
and A2
are not equivalent:
interface A1 {
let X:! Add where .Result = i32;
}
constraint AddToi32 {
extends Add where .Result = i32;
}
interface A2 {
let X:! Add;
impl X as AddToi32;
}
In the former case, references to A.X.Result
are rewritten to i32
, and in
the latter case, they are not, because the rewrite is not part of the type of
X
.
The general principle here is that rewrites are part of the declared type of a
name, and can't be changed after the fact by another declaration. For the
impl T as
case, this behavior is also a necessary part of the termination
argument. If rewrites from impl T as
constraints were used, it would be
possible to form a rewrite cycle:
interface C { let X:! type; }
interface A {
let U:! C;
let T:! C;
}
interface B1 {
extends A;
impl T as C where .X = U.X;
}
interface B2 {
extends A;
impl U as C where .X = T.X;
}
// `V.T.X` ~> `V.U.X` ~> `V.T.X` ~> ...
fn F(V:! B1 & B2) -> V.T.X;
We could allow the specific case of impl as
(not impl T as
) to introduce
rewrites. The advantage of this change is that this behavior may be less
surprising in some cases. However, this would mean that an impl as
sometimes
changes the interface of the enclosing constraint, and behaves inconsistently
from an impl T as
constraint, so this proposal does not adopt that behavior.
We could allow impl T as
to introduce rewrites for T
in general, but we
would need to find some way to fix the resulting holes in the termination
argument.
To minimize the scope of confusion, we currently disallow rewrites from
appearing syntactically within an impl as
or impl T as
constraint. We
could allow these constructs. However, a .T = V
constraint would be equivalent
to a .T == V
constraint, and the latter more clearly expresses the resulting
semantics, so disallowing the misleading form seems preferable.