-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Idea: Generic data #245
Comments
I won’t shy away from starting with interface-only features that don’t affect the serialized format. It’s not more wasteful than what we would have now, where people have to write it out (or what happens if you currently export Candid from Motoko that has parametrized type definitions). And decoder simplicity is a useful goal… Not sure how much For polymorphism encoded as For that reason, “Polymorphism with separately serialised types” may actually be easier to implement! But still, since decoding/coercion requires an expected type, we need more detail here. Maybe the detail is “only decoding happens, but no coerction, given that thes values are only used in parametric ways anyways”? In terms of bang-for-the-complexity-buck, the |
idea 4 is what I often want, especially for this project (see
Specifically, I am using a Motoko datatype to encode the fully-parsed AST of the Candid value, from which I can recover the Candid type, when needed. I believe it would be more systematic and more efficient if this kind of thing were supported by Candid directly, somehow. In the meantime, I created a CLI tool in Rust that pipes the AST of the existing candid parser over the wire. The value of this AST approach (or The downside is that I need more tools to use it (The |
I'm a bit surprised by the key-value store interfaces above. The one I primarily want is a generic class that I can instantiate to specific
This would require generic services, but no generic methods per se, though we might want those too in future. Would that actually be simpler to implement? |
I wonder what happens when the canister using this key-value storage class as a client undergoes an upgrade, and its definition of I think that's a common situation for people building apps today, as everyone is starting from a clean page and things they build change quickly. In the meantime, only the last option (option 4) really gives a viable path for people that:
|
It's a bit more than just a semantic restriction. For one, it documents a representation difference for blobs themselves: serialising a blob as type blob has a different representation than serialising it as type Moreover, the knowledge about this type enables serialisers to handle these cases directly, i.e., without the user having to perform the nested serialisations separately. (You could also model this as an implicit conversion between blob and any other type, but that would be a hack that doesn't work in all cases, see first point.)
Good question. On the caller side you'll always have a concrete type instantiation to use for deserialisation (of results). So the deserialiser would just work recursively and thereby naturally check that each dyn's internal type match the expected target type at each occurrence (naturally consider subtyping). I don't think you need any extra checking there. The callee doesn't know the instantiation, and is assumed to behave parametrically. In that case the the serialiser should probably still check that it is a valid Candid blob, though that produces extra cost. I think it's still worth requiring? By the parametricity assumption, I don't think it should have to check anything about the relations of the types inside multiple dyns.
Maybe it's easier, thought that would be surprising. Honestly, I don't have a good intuition for that right now.
Yes, that would be my take exactly.
Agreed, and would potentially be useful independent of more explicit quantification. Btw, I just reused the name
I see what you mean. But this is effectively an alternative representation for the
Hm, not sure I follow. In what way would it show up in Candid definitions?
In an ideal world I totally agree with you, but in practice I assume that many use cases will want heterogeneous storage canisters in which one can stuff what ever. The problem with a homogeneous store on this level is that you'd need to spawn a separate canister for every distinct type. That seems cumbersome and I assume doesn't really scale for most apps (especially when types are compositional and their set thus infinite), so they will instead opt for defining some kind of universal type anyway. But I admit that this might not be the best example to demonstrate function polymorphism. I just picked it because it was simple and common. :) |
Absolutely. Also, what is happening now is not ideal because it consists of sending an inefficient representation over the wire.
What I mean is that Candid and Motoko should both permit sub-values that are self-describing, and permit pattern matching. (That is, I want the abilities that come from having the fully type-annotated AST of the candid value, but with some more efficiency than actually creating this AST and sending it over the wire, as I do now in Candid Spaces.) I've been wanting Candid and Motoko to share a type for this kind of data (call it I do not want, however, for the the service to fix these types at any point in the signature of the methods (unlike ideas 2 and 3, IIUC). By not doing so, the service API remains fixed as the services that use it "evolve" their types in and out of the generic key-value store. As you suggest above, I'd like to use I often imagine a key-value store that also permits some searching for lists of matches. In these cases, I'd want to use
|
We should all probably take another look at this before proceeding.... |
Yeah, that's what I based the name |
I don't think we are discussing the same ideas, though there seems to be a lot of overlap, and thus, a lot of confusion when I try to advocate for my own. Specifically, I am more than satisfied with a solution that reuses the existing (value-level) pattern-matching of Motoko, via Unlike the DEC TR with all of the famous PL people, I am advocating for (first order) candid values as monomorphic syntax trees, not a hybrid dynamic-static type system. In particular, I understand that the paper assumes that To be clear, there is nothing exotic about what I want: Imagine an SML compiler's orientation toward the abstract syntax of SML --- That's precisely the relationship that I want between my programs and the candid ASTs that I want to have language (or library) support for producing and consuming (and possibly, for efficiently representing on the wire). The type for this AST may just as well be an ordinary Motoko type (from an appropriate compiler-provided def, like our other Prims). We do not need to extend the Motoko language to support this, as Candid Spaces is not doing this and is accomplishing what it wants without special language support. What is missing for Candid Spaces and similar projects is a missing feature in the serialization/deserialization layer that connects Candid and Motoko. In particular, there should be a way of saying that, instead of deserializing a candid-typed sub-value into a specific Motoko type, it should go into the AST monotype within Motoko for representing "Candid as syntax data". In my proposal, there is no "type tag" here beyond perhaps, another Motoko value that says what Candid type the Candid AST would synthesize, if we were to walk over it and synthesize a type for it. For this to be possible, the Motoko value must already know the Candid type that it represents (to distinguish number types). But this Motoko value is not a Motoko type; it's yet another Candid (type) AST, represented in Motoko as a value, with ordinary pattern matching ( I considered adding this AST for candid types to Candid Spaces, but it wasn't useful (yet) for anything, so I haven't. I could though, and it wouldn't be a type tag in the sense of that paper that introduces In sum, I think the DEC paper is going off in direction A, and we want direction B, and there are resemblances in what we do along these paths, but they can be very different in terms of complexity for implementors and users. I think I'm advocating for something that is both simple and useful. They are scratching some academic itch that is neither simple, nor (arguably) useful in that it was never used. For instance, I see that this paper tries to go very far with the types that they make dynamic (including unrestricted arrows, quantifiers, etc). I'd even say that supporting higher order types within this feature is not obviously useful, and may just be a large headache for a low value. Most of what we want, 99% of the time, is just algebraic data. Hence, everything that paper says about higher order types could be made irrelevant, along with what it says about |
Candid 0.7.1 does not support decoding a Record into an IDLValue. We were using this in an unsupported way with Candid 0.6.x, but it doesn't work anymore. So, this commit specifies the exact type of the streaming http response token. Unfortunately, this means that we will no longer support arbitrary token types, but this is the best that we can do until Candid supports generic types: dfinity/candid#245
* fix: icx-proxy panic when decoding a response to http_request Candid 0.7.1 does not support decoding a Record into an IDLValue. We were using this in an unsupported way with Candid 0.6.x, but it doesn't work anymore. So, this commit specifies the exact type of the streaming http response token. Unfortunately, this means that we will no longer support arbitrary token types, but this is the best that we can do until Candid supports generic types: dfinity/candid#245
Going back to the top, the issue's original motivation is:
But as reviewers to #292 are saying, this idea has some unresolved issues around the second class nature of argument sequences and response sequences within the Candid type system. That is, at least in some cases, we want But how do we generalize a service's interface if we cannot name or abstract over an entire argument sequence, or response sequence, at the level of Candid types? It seems like we need an additional kind in the Candid type system, for argument sequences, before we can generalize that idea and abstract over it further. Likewise, Motoko needs some way to intro/elim these sequences, since otherwise, there would be no clear way to define the lifecycle of a value sequence of type |
Dragginz team would much appreciate this! https://gist.github.com/borovan/697dc7c58fccc5bc85f15499e732898b |
Generic data
Motivation
Candid cannot currently describe services or functions with a type-generic interface. That is a severe limitation that both we and users keep bumping into. For example, it is not possible to describe the interface of a generic key/value store, where the value could be any form of data (either homogeneously or heterogeneously).
Below I collect some thoughts from previous discussions for possible ways to lift this restriction.
Idea 0: Parameterised type definitions
Summary: Allow type parameters in type definitions.
Examples:
Since types are structural, this essentially is a simple macro mechanism that avoids having to repeat the same type many times with different instantiations. It's a prerequisite to make generic functions practical.
However, this requires additional checks to make sure that type recursion is well-founded and not diverging. If not restricted somehow, that is a significant extra burden on decoders (unless this is a text-only convenience, and types are duplicated in the serialised value, but that would be wasteful).
Idea 1: Dynamic data type
Summary: Extend Candid with a new type
dyn
, which can carry any type of data. It is represented as a nested, self-contained Candid blob, complete with its own type description. (Type-theoretically,dyn = exists X. X
)This probably is the simplest approach. It essentially assigns a special type to blobs that contain nested Candid encodings.
Example:
Pros: very simple
Cons:
fill
all share the same type, but this type is rather large; they cannot share any type defs with the outer blob either)Idea 2: Polymorphism encoded as
dyn
Summary: Add syntactic sugar on top of
dyn
for expressing polymorphic functions with universally quantifed parameters. A value of parameter type is represented likedyn
, but with the assumption that its internal type matches the instantiation of the type parameter.Example:
Parameter types can be referred to by other parameters.
They are not represented in the serialised argument tuple.
Pros: still relatively simple; conveys intention
Cons: maintains potential representation inefficiencies as
dyn
Idea 3: Polymorphism with separately serialised types
Summary: Introduce a type
type
that actually is backed by a serialised type. Values of parameter type are represented as serialised Candid blobs, but without repeating the type information -- that is taken from the separately serialised type.Example (as before):
Parameter types can be referred to by other parameters.
They are represented by a serialised type (without a value) as part of the serialised argument tuple.
Pros: conveys intention; avoids unnecessary type repetition in the wire format
Cons: somewhat more complicated to implement (not simply a nested decoder)
Idea 4: Add existential polymorphism
Both idea 2 and 3 can be combined with 1, making type
dyn
available for non-parametric use cases or to encode existentially quantified use cases.Instead (or additionally), existential types could be introduced more explicitly. Two possibilities are:
(a) Also allow 'type results' right of the array:
get : (key : text) -> (type t) (val : opt t) query
(b) Allow type fields in records:
get : (key : text) -> opt record {type t; val : opt t} query
In the latter case,
dyn
becomes expressible asThoughts?
The text was updated successfully, but these errors were encountered: