Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ffigen] Blocking blocks #1796

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

[ffigen] Blocking blocks #1796

wants to merge 13 commits into from

Conversation

liamappelbe
Copy link
Contributor

@liamappelbe liamappelbe commented Dec 10, 2024

Add a new .blocking() constructor to all ObjC blocks that return void. This constructor returns a block that can be invoked from any thread, and blocks the caller until the Dart callback is complete.

Under the hood it has two modes, depending on whether it's invoked from the same thread as the isolate that created the block, or a different thread. If it's on the same thread, it uses a NativeCallable.isolateLocal, and doesn't need anything special to get blocking behavior. If it's on a different thread, it uses a NativeCallable.listener, and uses an ObjC condition variable, NSCondition, to wait for the callback to complete. This difference avoids deadlocks in the on-thread case.

Async callbacks aren't supported in either case. It would be trivial to support them in the off-thread case, but difficult/impossible in the on-thread case. So to avoid semantic differences I'm just not supporting them at all.

There are still edge cases where it's possible to deadlock. If the target isolate has shut down, and the block is invoked from a different thread, then it will send a message to a non-existent isolate and wait forever to be signaled. To mitigate this case we use a timeout when waiting on the condition variable. That timeout can be set by the user.

The different behavior on-thread vs off-thread means we need two Dart-side trampolines, and two blocks passed to the native block wrapper. You can see how it all works by looking at the new ObjC-side trampolines. For example:

_ListenerTrampoline _ObjectiveCBindings_wrapBlockingBlock_1j2nt86(
    _BlockingTrampoline block, _BlockingTrampoline listenerBlock, double timeoutSeconds,
    void* (*newWaiter)(), void (*awaitWaiter)(void*, double))  NS_RETURNS_RETAINED {
  NSThread *targetThread = [NSThread currentThread];
  return ^void(id arg0, id arg1, id arg2) {
    if ([NSThread currentThread] == targetThread) {
      objc_retainBlock(block);
      block(nil, objc_retainBlock(arg0), objc_retain(arg1), objc_retain(arg2));
    } else {
      void* waiter = newWaiter();
      objc_retainBlock(listenerBlock);
      listenerBlock(waiter, objc_retainBlock(arg0), objc_retain(arg1), objc_retain(arg2));
      awaitWaiter(waiter, timeoutSeconds);
    }
  };
}

Part of #1647. Non-void callbacks and blocking protocol methods are left as follow up work.

Details

  • The condition variable is created and managed by 3 new functions in package:objective_c: newWaiter, signalWaiter, and awaitWaiter.
    • Under the hood these manage a DOBJCWaiter, which just wraps an NSCondition and a bool flag. The flag is needed because whenever you wait on a condition variable, there can be spurious wake ups.
    • In the normal flow of the off-thread case, each of these functions should be called once per invocation.
    • newWaiter returns an object with a +2 ref count, and the other two each decrement the ref count by 1.
  • Due to linker restrictions in google3, I can't tell users to use the clang linker flags that would allow the ffigen generated .m file to dynamically link the newWaiter and awaitWaiter functions from package:objective_c. So instead I'm passing them down as function pointers. That's what package:objective_c's new wrapBlockingBlock function is for.

@liamappelbe liamappelbe changed the title WIP: [ffigen] Blocking blocks [ffigen] Blocking blocks Dec 12, 2024
@coveralls
Copy link

coveralls commented Dec 12, 2024

Coverage Status

coverage: 88.389% (+0.4%) from 88.032%
when pulling 4f49843 on blockblock
into 4d81ce6 on main.

@liamappelbe liamappelbe marked this pull request as ready for review December 12, 2024 06:22
@liamappelbe liamappelbe requested review from dcharkes and removed request for HosseinYousefi December 16, 2024 09:20
@brianquinlan
Copy link
Contributor

There is a Duration typedef in IOMacOSTypes.h:

typedef SInt32 Duration;

This gets added to the native bindings as:

typedef Duration = SInt32;

You probably need to add a new header to your generated code like:

import 'dart:core' as core;

and then generate:

core.Duration(seconds: 3)

@liamappelbe
Copy link
Contributor Author

You probably need to add a new header to your generated code like:

import 'dart:core' as core;

Apparently if you do this, you have to prefix all the built in types too: core.bool, core.int...

I'll just add Duration to the keyword list, so that the Apple one gets renamed.

@dcharkes
Copy link
Collaborator

Some high level questions:

Add a new .blocking() constructor to all ObjC blocks that return void. This constructor returns a block that can be invoked from any thread, and blocks the caller until the Dart callback is complete.

Under the hood it has two modes, depending on whether it's invoked from the same thread as the isolate that created the block, or a different thread. If it's on the same thread, it uses a NativeCallable.isolateLocal, and doesn't need anything special to get blocking behavior. If it's on a different thread, it uses a NativeCallable.listener, and uses an ObjC condition variable, NSCondition, to wait for the callback to complete. This difference avoids deadlocks in the on-thread case.

Could we implement this as NativeCallable.blocking in the Dart SDK so it can be reused by interop for other languages?

Async callbacks aren't supported in either case. It would be trivial to support them in the off-thread case, but difficult/impossible in the on-thread case. So to avoid semantic differences I'm just not supporting them at all.

There are still edge cases where it's possible to deadlock. If the target isolate has shut down, and the block is invoked from a different thread, then it will send a message to a non-existent isolate and wait forever to be signaled. To mitigate this case we use a timeout when waiting on the condition variable. That timeout can be set by the user.

Since this depends on NativeCallable.listener, isn't calling that callback after the isolate has been shut down undefined behavior? (Assuming that if the isolate is gone it has the same semantics as if close were called.)

This callback must be closed when it is no longer needed. The Isolate that created the callback will be kept alive until close is called. After NativeCallable.close is called, invoking the nativeFunction from native code will cause undefined behavior.

https://api.dart.dev/dart-ffi/NativeCallable/NativeCallable.listener.html

@liamappelbe
Copy link
Contributor Author

Could we implement this as NativeCallable.blocking in the Dart SDK so it can be reused by interop for other languages?

Yes, but that's a more complicated change to make. It would also mean waiting on a Dart release before I can make the corresponding ffigen changes and unblock cupertino_http. I'm also not aware of any other use cases that need it (other than jnigen, which already has a solution like this one).

Since this depends on NativeCallable.listener, isn't calling that callback after the isolate has been shut down undefined behavior? (Assuming that if the isolate is gone it has the same semantics as if close were called.)

Is your concern that it might do something unexpected, like crash? IIRC, that's technically possible, but in practice I think it pretty much always just does nothing.

Or is your point that we shouldn't worry about deadlocking in this case, since it's UB anyway, and the timeout is unnecessary?

@dcharkes
Copy link
Collaborator

Could we implement this as NativeCallable.blocking in the Dart SDK so it can be reused by interop for other languages?

Yes, but that's a more complicated change to make. It would also mean waiting on a Dart release before I can make the corresponding ffigen changes and unblock cupertino_http. I'm also not aware of any other use cases that need it (other than jnigen, which already has a solution like this one).

We had users upvote it, I'm fine with landing a custom solution here first. However, I'd like to avoid building the custom solution here in a way that would be incompatible with with dart-lang/sdk#54554.

Since this depends on NativeCallable.listener, isn't calling that callback after the isolate has been shut down undefined behavior? (Assuming that if the isolate is gone it has the same semantics as if close were called.)

Or is your point that we shouldn't worry about deadlocking in this case, since it's UB anyway, and the timeout is unnecessary?

Yes, it's UB. So providing an API gives the false security that it's a covered case. Also, I think it would make the API imcompatible with basing it on NativeCallable.blocking later.

Is your concern that it might do something unexpected, like crash? IIRC, that's technically possible, but in practice I think it pretty much always just does nothing.

If we have use cases where users have callbacks outliving isolates and we can guarantee that it's not UB, we should upgrade what we guarantee in the Dart SDK before adding guarantees in FFIgen/JNIgen.

@dcharkes
Copy link
Collaborator

dart-lang/sdk#54554 (comment)

@brianquinlan For cupertino_http, is it important which thread executes the Dart code? In the NativeCallable.blocking design discussions, we discuss two options: (1) the caller thread is blocked and whatever thread is the Dart mutator thread will execute the Dart code and possible callbacks into native code, (2) the caller thread executes the Dart code and possible callbacks into native code, which requires the Dart mutator thread ceeding control.

Option (2) cannot be implemented on top of NativeCallable.listener, so this PR has semantics (1).

@brianquinlan
Copy link
Contributor

dart-lang/sdk#54554 (comment)

@brianquinlan For cupertino_http, is it important which thread executes the Dart code? In the NativeCallable.blocking design discussions, we discuss two options: (1) the caller thread is blocked and whatever thread is the Dart mutator thread will execute the Dart code and possible callbacks into native code, (2) the caller thread executes the Dart code and possible callbacks into native code, which requires the Dart mutator thread ceeding control.

Option (2) cannot be implemented on top of NativeCallable.listener, so this PR has semantics (1).

@dcharkes I didn't understand the framing of your question. There are Apple APIs where it is important that the callback execute the Dart code before exiting. The one API that I'm currently using that his this property is for file downloads, where the API works like:

downloadFile(Uri uri, callback) {
  download the file to `path`
  callback(path)
  unlink(path)
}

So, if the logic for callback isn't completed before the return to the caller, then the file will be deleted and the callback is useless.

@liamappelbe
Copy link
Contributor Author

@brianquinlan I think the distinction Daco is making is between the approach in this PR, where the caller thread is blocked while the Dart code is executed on a VM managed thread, vs one of the other approaches from my old design docs, where the caller thread tries to enter the target isolate and execute the Dart code on the caller thread.

@brianquinlan
Copy link
Contributor

@brianquinlan I think the distinction Daco is making is between the approach in this PR, where the caller thread is blocked while the Dart code is executed on a VM managed thread, vs one of the other approaches from my old design docs, where the caller thread tries to enter the target isolate and execute the Dart code on the caller thread.

What is the semantic different from the user's POV?

@liamappelbe
Copy link
Contributor Author

What is the semantic different from the user's POV?

The only differences I can think of is that the current approach marshals args into a message (whereas the other approach could invoke the Dart function directly), and the fact that any thread-local storage would have different data (only relevant if your Dart code calls back into native code that uses TLS).

@liamappelbe
Copy link
Contributor Author

Yes, it's UB. So providing an API gives the false security that it's a covered case. Also, I think it would make the API imcompatible with basing it on NativeCallable.blocking later.

Ok, I'll remove the timeout.

@dcharkes
Copy link
Collaborator

What is the semantic different from the user's POV?

The only differences I can think of is that the current approach marshals args into a message (whereas the other approach could invoke the Dart function directly), and the fact that any thread-local storage would have different data (only relevant if your Dart code calls back into native code that uses TLS).

Yes, thread local storage. And also if the calling native thread has acquired a lock with reentrancy support, and the Dart code calls back into native code and tries to acquire that lock again.

However, we first need to solve thread pinning (dart-lang/sdk#46943 (comment)) for that to work. So, I don't think we can promise to semantics (2) any time soon.

We can probably switch from semantics (1) to semantics (2) later without breaking people. It would be very weird if people would rely what the Dart mutator thread is and use its TLS and thread-identity for locking. (And it would be broken, because the mutator thread can change.)

@mkustermann
Copy link
Member

The only differences I can think of is that the current approach marshals args into a message (whereas the other approach could invoke the Dart function directly), and the fact that any thread-local storage would have different data (only relevant if your Dart code calls back into native code that uses TLS).

@liamappelbe You can run the dart code on the same thread as the ObjC block was called and therefore make it similar to actual ObjC callbacks, make TLS work ...

Not too familiar with the bindings generator, but it may not be hard to do, you can see the pattern the main thread does e.g. here. You basically

  • give the dart isolate a ping and wait (so it cooperates)
  • the dart isolate will then go to C++, save the current isolate, exists the isolate Dart_ExitIsolate() and gives the blocking thread a ping
  • the blocking thread can then Dart_EnterIsolate() on that thread and execute any code
  • once 3rd party thread is done running dart code, it can do the reverse: Dart_ExitIsolate() and give the blocking main thread a ping
  • the blocking main thread will re-enter the isolate via Dart_EnterIsolate() and continue running.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants