Add async stack support to coroutines #632

ispeters · 2024-08-29T04:30:51Z

This change extends the work in #616 to support async stack frames in
task<> coroutines, including those that invoke at_coroutine_exit().

In task<>, when UNIFEX_NO_ASYNC_STACKS is falsey, the awaiter returned from
task<>'s customization of unifex::await_transform stores an
AsyncStackFrame. The awaiter pushes its frame onto the current async stack in
await_suspend() and pops it again in await_resume(); since
await_resume() is only invoked for value and error completions, this
arrangement leaves it up to the waiting task to pop the awaiter's frame
when the awaited task completes with done. This can be expressed as a
new rule:

when a coroutine completes with a value or an error, it is responsible
for popping its own AsyncStackFrame; but
when a coroutine completes with done, the caller is responsible for
popping the callee's AsyncStackFrame as a part of the caller's
unhandled_done() coroutine.

To support this new requirement of unhandled_done() (that it is
responsible for popping the callee's stack frame), this change
introduces popAsyncStackFrameFromCaller, which takes the caller's
stack frame by reference so that it can assert that, after popping the
current async frame (whatever it is), the new top frame is the caller's
frame.

A task<> promise has an AsyncStackFrame* that, when it's not
nullptr, points to the AsyncStackFrame in the awaiter waiting for
the task. This pointer exists even when UNIFEX_NO_ASYNC_STACKS is
truthy to help mitigate against ODR violations; linking together two TUs
with UNIFEX_NO_ASYNC_STACKS set differently is not explicitly
supported but, by ensuring this pointer always exists, some ODR problems
are avoided. When a task<> is awaited from a TU with async stack
support enabled, the awaited task's awaiter sets the promise's
AsyncStackFrame* to point to the awaiter's frame; when a task<> is
awaited from a TU with async stack support disabled, this assignment
never happens and the promise's pointer remains null.

The above description of task<>'s async stack maintenance only covers
the recursive case of on coroutine awaiting another. The base case is
handled in connect_awaitable(), where an AsyncStackRoot is set up
before starting the connected awaitable.

stop_if_requested used to model both sender and awaitable so that
co_await stop_if_requested(); could take advantage of symmetric
transfer. The stop_if_requested sender now customizes
await_transform to express its participation in async stack
management. This means of expressing async stack awareness is
unsatisfying but I don't have any better ideas right now.

Lastly, unifex::await_transform() now wraps naturally-awaitable
arguments in an awaiter_wrapper that ensures the coroutine_handle<>
passed to the wrapped awaitable is one that establishes an active
AsyncStackRoot before resuming the real waiting coroutine.

This PR continues what was started in #635 and adds more customizations of `get_return_address`, starting with `task<>`.

This change extends the work in #616 to support async stack frames in `task<>` coroutines, including those that invoke `at_coroutine_exit()`. In `task<>`, when `UNIFEX_NO_ASYNC_STACKS` is falsey, the awaiter returned from `task<>`'s customization of `unifex::await_transform` stores an `AsyncStackFrame`. The awaiter pushes its frame onto the current async stack in `await_suspend()` and pops it again in `await_resume()`; since `await_resume()` is only invoked for value and error completions, this arrangement leaves it up to the waiting task to pop the awaiter's frame when the awaited task completes with done. This can be expressed as a new rule: - when a coroutine completes with a value or an error, it is responsible for popping its own `AsyncStackFrame`; but - when a coroutine completes with done, the *caller* is responsible for popping the callee's `AsyncStackFrame` as a part of the caller's `unhandled_done()` coroutine. To support this new requirement of `unhandled_done()` (that it is responsible for popping the callee's stack frame), this change introduces `popAsyncStackFrameFromCaller`, which takes the caller's stack frame by reference so that it can assert that, after popping the current async frame (whatever it is), the new top frame is the caller's frame. A `task<>` promise has an `AsyncStackFrame*` that, when it's not `nullptr`, points to the `AsyncStackFrame` in the awaiter waiting for the task. This pointer exists even when `UNIFEX_NO_ASYNC_STACKS` is truthy to help mitigate against ODR violations; linking together two TUs with `UNIFEX_NO_ASYNC_STACKS` set differently is not explicitly supported but, by ensuring this pointer always exists, some ODR problems are avoided. When a `task<>` is awaited from a TU with async stack support enabled, the awaited task's awaiter sets the promise's `AsyncStackFrame*` to point to the awaiter's frame; when a `task<>` is awaited from a TU with async stack support disabled, this assignment never happens and the promise's pointer remains null. The above description of `task<>`'s async stack maintenance only covers the recursive case of on coroutine awaiting another. The base case is handled in `connect_awaitable()`, where an `AsyncStackRoot` is set up before starting the connected awaitable. `stop_if_requested` used to model both `sender` and `awaitable` so that `co_await stop_if_requested();` could take advantage of symmetric transfer. The `stop_if_requested` sender now customizes `await_transform` to express its participation in async stack management. This means of expressing async stack awareness is unsatisfying but I don't have any better ideas right now. Lastly, `unifex::await_transform()` now wraps naturally-awaitable arguments in an `awaiter_wrapper` that ensures the `coroutine_handle<>` passed to the wrapped awaitable is one that establishes an active `AsyncStackRoot` before resuming the real waiting coroutine.

jesswong · 2024-09-26T00:22:51Z

include/unifex/connect_awaitable.hpp

+      if constexpr (
+          WithAsyncStackSupport && same_as<CPO, tag_t<get_async_stack_frame>>) {
+        return &p.frame_;
+      } else {
+        return std::move(cpo)(std::as_const(p.receiver_));
+      }


this is a weird check. is it possible to separately define a get_async_stack_frame tag_invoke that will be called instead of this generic CPO function?

It is weird, but I ran into trouble (that I don't remember in detail now) expressing the conditions correctly on a separate tag_invoke overload.

jesswong · 2024-09-26T00:25:49Z

include/unifex/connect_awaitable.hpp

@@ -297,6 +365,7 @@ struct _fn {
      (requires detail::_awaitable<Awaitable>)  //
      _sender<remove_cvref_t<Awaitable>>
      operator()(Awaitable&& awaitable) const {
+    // TODO: this is going to generate an unfortunate return address


why's that?

Because it's inside the bowels of Unifex here; I think a better return address might be get_return_address(awaitable). We might have actually changed this in one of the branches that improves return address reporting for task<>.

jesswong

LGTM?

janondrusek · 2024-09-26T02:09:58Z

include/unifex/with_scheduler_affinity.hpp

@@ -144,7 +157,7 @@ struct _fn final {
    if constexpr (
        !same_as<blocking_kind, blocking_t> &&
        (blocking_kind::always_inline == blocking_t{})) {
-      return Awaitable{(Awaitable &&) awaitable};
+      return Awaitable{(Awaitable&&)awaitable};


nit: wanna rid of the C-style cast?

I do want to, but the clock has run out.

janondrusek · 2024-09-26T02:12:13Z

include/unifex/tracing/async_stack-inl.hpp

+  auto root = tryGetCurrentAsyncStackRoot();
+  assert(root != nullptr);


why call it try if it cannot be nullptr?

The current async stack root is allowed to be nullptr in general, but the stack is in a bad state if it's null in this case. The alternative way to read the current stack root is getCurrentAsyncStackRoot(), but that returns a reference; if the stack is corrupt at this point, we can't meaningfully assert that a reference is valid, but we can assert that a nullable pointer isn't null.

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 29, 2024

ispeters marked this pull request as draft August 29, 2024 04:33

Base automatically changed from add_return_address_to_as_sender to main August 29, 2024 19:39

ispeters force-pushed the coroutine_async_stack_support branch 3 times, most recently from 48f411c to e8a18b9 Compare September 1, 2024 01:18

ispeters requested review from janondrusek and jesswong September 1, 2024 04:13

ispeters marked this pull request as ready for review September 1, 2024 04:13

ispeters force-pushed the coroutine_async_stack_support branch from e8a18b9 to fe4ab6c Compare September 11, 2024 19:00

ispeters changed the base branch from main to tweak_any_sender_of_opstate September 11, 2024 19:01

ispeters force-pushed the tweak_any_sender_of_opstate branch from b63a902 to 4e4f5e9 Compare September 11, 2024 19:06

Base automatically changed from tweak_any_sender_of_opstate to main September 11, 2024 20:05

ispeters force-pushed the coroutine_async_stack_support branch from fe4ab6c to 45ca45b Compare September 11, 2024 20:06

ispeters added 4 commits September 11, 2024 14:16

Add more get_return_address customizations

9fbb36e

This PR continues what was started in #635 and adds more customizations of `get_return_address`, starting with `task<>`.

Customize get_return_address for _sr_thunk_task

14648df

Customize get_return_address for _wsa_sender_wrapper

8e2fce8

Customize get_return_address for finally

e0a75e3

ispeters force-pushed the coroutine_async_stack_support branch from 45ca45b to bdefd9f Compare September 11, 2024 21:47

ispeters added 3 commits September 12, 2024 11:11

Customize get_return_address for when_all_range

67b7e35

Try improving connect_awaitable's return address

bbca487

ispeters force-pushed the coroutine_async_stack_support branch from 987a18c to bbca487 Compare September 12, 2024 18:11

jesswong reviewed Sep 26, 2024

View reviewed changes

jesswong approved these changes Sep 26, 2024

View reviewed changes

janondrusek approved these changes Sep 26, 2024

View reviewed changes

ispeters merged commit 7af55cd into main Sep 26, 2024
105 checks passed

ispeters deleted the coroutine_async_stack_support branch September 26, 2024 16:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add async stack support to coroutines #632

Add async stack support to coroutines #632

ispeters commented Aug 29, 2024

jesswong Sep 26, 2024 •

edited

Loading

ispeters Sep 26, 2024

jesswong Sep 26, 2024

ispeters Sep 26, 2024

jesswong left a comment

janondrusek Sep 26, 2024

ispeters Sep 26, 2024

janondrusek Sep 26, 2024

ispeters Sep 26, 2024

		auto root = tryGetCurrentAsyncStackRoot();
		assert(root != nullptr);

Add async stack support to coroutines #632

Add async stack support to coroutines #632

Conversation

ispeters commented Aug 29, 2024

jesswong Sep 26, 2024 • edited Loading

Choose a reason for hiding this comment

ispeters Sep 26, 2024

Choose a reason for hiding this comment

jesswong Sep 26, 2024

Choose a reason for hiding this comment

ispeters Sep 26, 2024

Choose a reason for hiding this comment

jesswong left a comment

Choose a reason for hiding this comment

janondrusek Sep 26, 2024

Choose a reason for hiding this comment

ispeters Sep 26, 2024

Choose a reason for hiding this comment

janondrusek Sep 26, 2024

Choose a reason for hiding this comment

ispeters Sep 26, 2024

Choose a reason for hiding this comment

jesswong Sep 26, 2024 •

edited

Loading