Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RFC for Recover blocks with receiver #182

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

jasoncarr0
Copy link

This RFC adds a new syntax to subsume automatic receiver recovery and enable it for more use cases than existing ref methods.

I'm very open to suggestions in syntax, and I could see many variations possible.

@SeanTAllen
Copy link
Member

@jasoncarr0 I don't understand this.

Can you give a couple of concrete usages of where this solves an existing problem.

At the moment, I'm not getting it.

@jasoncarr0
Copy link
Author

jasoncarr0 commented Aug 27, 2020

The example in the beginner help previously would have worked. E.g. the code:

use "collections"

class iso IsoClass
    let isodata : Array[USize] iso = recover iso Array[USize](10) end
    
    new iso create() =>
        for i in Range(0, 10) do
            isodata.push(i)
        end

actor Main
    let ic : IsoClass = IsoClass

    new create(env : Env) =>
        for elem in ic.isodata.values() do
            env.out.print(elem.string())
        end

does not work. With this feature we can do the following:

actor Main
    let ic : IsoClass = IsoClass

    new create(env : Env) =>
        ic.isodata.recover
            for elem in isodata.values() do
                env.out.print(elem.string())
            end
        end

without any special hacks.

Likewise

class Foo
    var x: U64 = 0
actor Main
    new create(env: Env) =>
        let arr = Array[Foo iso]
        arr.push(Foo)
        try
           arr(0)?.recover as elem =>
              this.do_lots_of_work(elem)
           end
        end
     fun tag do_lots_of_work(foo: Foo ref) =>
         foo.x = foo.x + 1

While the first case has workarounds, the latter is not possible at all in pony today (but would be possible if do_lots_of_work were a ref method on Foo).

@jasoncarr0
Copy link
Author

jasoncarr0 commented Aug 27, 2020

Er, point to the last bit. It is possible for arrays in Pony, we can do the same awkward shuffling, but only because Array update exists. Regardless, it is a very heavy burden on performance and readability and does not exist for all cases. In some cases the amount of syntax has to grow for each layer we add.

@jasoncarr0
Copy link
Author

jasoncarr0 commented Aug 27, 2020

@SeanTAllen besides the examples here, I've updated the motivation section to include two simple examples. It's hard to come up with perfect examples as there's often some workaround. It's just that the workarounds are cumbersome, inefficient, and/or anti-modular. Obviously we can come up with impossible examples easily they just might seem contrived.

@SeanTAllen
Copy link
Member

Thanks @jasoncarr0, I'll have a look within the next couple of days.


The body of the recover expression will be type-checked similarly to how recover blocks are checked today, with two exceptions. The block will have
a capability associated with it, and instead of restricting to sendable variable usage, they are restricted to capabilities which are safe-to-write.
In practice, the only special case here is writing `trn` to `trn. The result of the recover block will be adapted in the the viewpoint of the
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just leaving a note to discuss this (trn recovers) properly. I'm becoming less and less sure that this would be the best for Pony.

@rhagenson
Copy link
Member

@jasoncarr0 Is it be fair to call this "viewpoint extraction"?

My reading of this makes me think the purpose of this is to allow users to write code that would otherwise need to be implemented by each and every library developer. To leverage your same isolated data example, the move from non-working ic.isodata.values() to working viewpoint extracted isodata.values() is a transition from viewing the isolated data from outside the IsoClass starting from ic to viewing the isolated data from inside the IsoClass starting from isodate. Is this correct?

If so (I am going to make a cursed comment): I like it...but the syntax is confusing. If this operation is not changing refcaps, but rather changing viewpoint (even if thereby acquiring a new refcap) I would recommend a new viewas block. Overloading recover for both refcap changes AND viewpoint changes is unnecessarily confusing, in my opinion. I also worry about dependence on variable names within the viewpoint extraction -- what happens if isodata is changed to idata? -- and therefore wonder if there is some reasonable limit to how deep this could go?

@jasoncarr0
Copy link
Author

jasoncarr0 commented Aug 28, 2020

@rhagenson that's on the right track, but not quite there. The final result of this expression is adapted, as you've indicated, as in the existing use case of automatic receiver recovery*, but the inside of the block is just like a recover block. Inside the recover block, we may use the capability as though it were fully aliasable, we don't have to maintain uniqueness. And in exchange, we must view the outside world in an isolated fashion. This feature has been available with automatic receiver recovery, but only for authors of the original class, not for users.

In fact, a standard recover block can very nearly be desugared using recovers with receiver / automatic receiver recovery, as it turns out. The standard recover block comes out of the viewpoint adaptation iso^->T (under the updated chart from Steed).

If we have a recover

recover
   ... do standard recover block stuff
   result
end

then we can de-sugar it to:

class iso RecoverRegion[T]
   var result: (T | None) = None
   new iso create() => None
   
let region = RecoverRegion[T]
region.recover
   ... do standard recover block stuff
   region.result = result
end
(consume region).result as iso^->T
// this is why it's "almost" a de-sugaring, we know this won't error because we always assign the result

So perhaps this clarifies why viewpoint adaptation appears in the final result. And to clarify, this viewpoint adaptation also appears indirectly in automatic receiver recovery today*, which is something we refer to as recovery. This, if it was not clear, is simply allowing automatic receiver recovery for "anonymous" blocks

*Technically what we do is check what the final capability is, there's a bit of obfuscation here today, but we allow those types precisely if they're the result of "some" viewpoint adaptation, rather than performing it outright. I accidentally got a bit ahead of myself in explicitly asking for viewpoint adaptation instead of the checks we have today.

@jasoncarr0
Copy link
Author

jasoncarr0 commented Aug 28, 2020

I didn't quite answer your questions in the above.

Is it be fair to call this "viewpoint extraction"?

I would say that is reasonably fair in light of my explanation above. But note that the viewpoint is applied to the result, not the argument.

To leverage your same isolated data example, the move from non-working ic.isodata.values() to working viewpoint extracted isodata.values() is a transition from viewing the isolated data from outside the IsoClass starting from ic to viewing the isolated data from inside the IsoClass starting from isodate. Is this correct?

You are correct in that we are viewing from "inside" the IsoClass. This is not viewpoint adaptation.
In particular: ic.isodata.values() to working viewpoint extracted isodata.values() is also not viewpoint adaptation. Viewpoint adaptation applies the restrictions of the source to the capability of the target. For any capability except ephemeral capabilities and val, it will always make the target weaker.

The capability of the new binding will depend on the capability of the expression. If it is a unique capability, `iso` or `trn`, then the resulting capability
will be the strongest aliasable type: `ref`. If it is any self-aliasing capability `k`, then the resulting capability will be `k`.
Acknowledging that there may be better choices available, at this time `iso^` or `trn^` will take the capability `ref` and act identically to their
non-ephemeral counterparts. Any variables syntactically present in the receiver expression are considered in-use for the duration of the block and cannot be consumed or re-assigned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that strong enough of a restriction? Is it possible that some unsafe chicanery could be done with destructive assign to fields of an in-use variable, even if the in-use variable is not consumed?

Copy link
Author

@jasoncarr0 jasoncarr0 Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, you're right that this wasn't written nearly as restrictive as it should be. This condition should not deviate from the soundness guaranteed by normal function calls and by recover blocks. In order to not stretch beyond what would be available to a function call, we can require that the capability be iso^, val or tag.

After looking back at ponylang/ponyc#3596 I'm not convinced that the cases we gave were unsound (an iso temporary is compatible with an iso variable in the short-term, we just need to conflate them for consumes), but that this "potentially aliases" notion is a bit more complex than just the variables which originally existed. So unless there's a very clever solution here I think we should stick with the restrictions above of requiring iso^ and not iso expressions, as by its nature iso^ can never conflict with anything.

Copy link
Author

@jasoncarr0 jasoncarr0 Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not be clear what the restriction to iso^ does, it means that we cannot call any mutable methods, and hence cannot do anything else that could invalidate. We still should block consuming/assigning the variables themselves, as we would in the case of a function call (and if it is not the case that we do so, then we can log a soundness bug).

@SeanTAllen
Copy link
Member

I'm not a huge fan of syntax for this, but I do love providing a way for folks to be able to work with iso more easily as per the couple of examples that @jasoncarr0 provided.

@jemc has thought a lot more about recovery than I have so in many ways, I'm probably going to default to his feelings on this.

@SeanTAllen
Copy link
Member

We want to make sure that we discuss this first at the next sync. We made some progress on discussion today but it is a rather large topic and we covered a ton of ground. If you are interested in catching up on this, please see the sync recording for September 1, 2020.

https://sync-recordings.ponylang.io/r/

@jasoncarr0
Copy link
Author

jasoncarr0 commented Sep 4, 2020

To get it written out, and perhaps have the opportunity to read before sync, I'm going to leave a quick comment on the sync topic of referencing objects in different regions for functions. I'm happy to move / reproduce in another conversation.

For its relevance to this RFC, I think these are related, but complementary changes (not competing). What this RFC would be doing is allowing more ways to shift the "baseline" region as is available with automatic receiver recovery. This is nearly orthogonal to the task of referring to objects in two different regions.


As far as referring to objects in two different regions, this can more or less be done in Pony today (but our recover blocks are too strict; this can only be done with iso objects, not in-progress recovers).
We always have a "baseline" region for the current environment that we're working in.
and we have two references a and b which are isolated from each other. Well, that means either:

  • a is isolated from the environment, but b isn't
  • a is not isolated from the environment, but b is
  • both a and b are isolated from the environment

But that last case is the most general and we'd have to account for it, a and b are references which are isolated. Thus in short, they're just regular iso expressions (not iso^). So my short proposal is: we allow functions to take temporary iso arguments which cannot be consumed (cf. Sylvan's noconsume ponylang/ponyc#2043 (comment) and our non-ephemeral match discussion). Recover blocks are fixed to use iso adaptation, once we bring in Steed's viewpoint tables. Thus we can pass refs from two different recover blocks to said functions (as they get adapted to iso when not consuming). There's a point of discussion to allow refs or not and the implications.

That change would actually be benefited by this one as it would let you recover with said arguments without consuming

@jemc
Copy link
Member

jemc commented Sep 8, 2020

allow functions to take temporary iso arguments which cannot be consumed (cf. Sylvan's noconsume ponylang/ponyc#2043 (comment) and our non-ephemeral match discussion)

It seems like in Pony syntax, this would be a parameter binding of type iso! instead of iso. Or in current Mare syntax, iso'aliased instead of iso. Is this the right idea, or do we really need some new concept for the language user to understand like Sylvan's "noconsume" (I hope not)?

@jemc
Copy link
Member

jemc commented Sep 8, 2020

Regarding the example that was given of something that is not possible in Pony today:

class Foo
    var x: U64 = 0

actor Main
    new create(env: Env) =>
        let arr = Array[Foo iso]
        arr.push(Foo)
        try
           arr(0)?.recover as elem =>
              this.do_lots_of_work(elem)
           end
        end

     fun tag do_lots_of_work(foo: Foo ref) =>
         foo.x = foo.x + 1

not possible at all in pony today (but would be possible if do_lots_of_work were a ref method on Foo).

It's worth noting that this use case could also be made possible by improving call auto-recovery in the way specified in this ticket (ponylang/ponyc#2038)

That is, the aim of the above ponyc ticket is to resolve the issue that the receiver argument is treated in a special way by the compiler, allowing it to be auto-recovered, but it should be possible to auto-recover any argument as long as all other arguments are sendable.

So at least in this use case, the new syntax proposed in this RFC is not necessary if we do implement the lower-hanging fruit of improving auto-recovery as described in ponylang/ponyc#2038.

Are there other use cases we can discuss that would not be possible in such a system?

Comment on lines 96 to 98
e.recover as x =>
e
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There have been discussions in the past about syntax in which we have decided to try to avoid "overloading" a keyword or symbol with multiple meanings. In this case I think we should avoid using as for this syntax, since it is entirely different from how we use as for runtime type matching with the right side being a type name.

Not that I'm saying I like everything about Pony syntax as it is today (I don't), but I'm just raising this comment for consistency purposes.

In Pony syntax I think the most applicable symbol/keyword to use here would be | (as used in match blocks).

Suggested change
e.recover as x =>
e
end
e.recover | x =>
x
end

Also I think you meant to use x inside the block instead of e, so I included that change in this suggestion as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied this manually, but the use of e denotes an expression, hence why x is not used. These are grammatical definitions, not examples.

Copy link
Contributor

@malthe malthe Sep 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasoncarr0 that confused me as well; I think e.recover | x => f(x) end (or similar) would be more readable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about e[x]? would that make it clear, or ... x .... Of course at the end of the day this is a pretty meaningless subset of the RFC, and I can go to what makes it clear, but on a personal level those ones feel incorrect.

@jasoncarr0
Copy link
Author

jasoncarr0 commented Sep 23, 2020

In short, we need two mostly orthogonal conditions, one for sendability, and one for the "lifetime extending" behavior. These are implicitly included in the automatic receiver recovery conditions. We will likely want to use the lifetime extension checks for non-ephemeral match

First for the recover:

  • All variables from the outside environment must be used in a sendable way (iso^, iso, val, tag).
    We can do this via viewpoint adaptation (iso->) but this is more useful with Steed's theory.

Then for the lifetime extension, in that we're taking a cap which cannot normally be aliased, and creating a variable,
we want to enforce that we cannot invalidate the source of this reference. One simple condition is:

  • All variables from the outside environment besides consumed isos are read-only, i.e. box-> adapted. This is equivalent to the function automatic recovery when combined with sendability. The reason to special-case consumed iso is because of uniqueness; nothing it can write could alias with our receiver when we check consuming variables in use.

As an additional feature, we could also allow iso expressions that are statically known not to alias, for instance x or x.field, or x.method(y.g), where x and y are iso or trn variables, or the receiver is likewise z.field or z.method(). This is not significantly more complex than the existing checks for variable usage with consume, but will likely take some degree of effort to figure out.


Note that I haven't handled trn here. It could also receive trn^, not just iso^, but it makes the story for recover blocks more complex in the compiler and in docs for little gain in practice.

Also, the fact that the two pieces are iso-> and box-> should clarify why it would be nice to have a capability for iso->box (n.b. this case is not associative unless it's present)

Base automatically changed from master to main February 8, 2021 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants