
initial implementation #1

Merged
merged 21 commits into from
Dec 24, 2024

Conversation


@jlizen jlizen commented Dec 19, 2024

Sharing initial commit for design feedback. Please pick it to pieces. Glad to split into smaller commits if it'd be helpful. I figured it was short enough that seeing the whole picture might be useful.

Some specific questions:

Crate name

Is compute-heavy-future-executor too long? Most crates are like 6 letters long. I was trying to make it clear what it does, but it's pretty clunky. Open to alternatives.

Strategy-setting API

I don't love the current ergonomics of initializing a strategy. We have all these calls like initialize_block_in_place_strategy() which either initialize the oncelock or panic.

I was considering something like a builder or an enum, but it felt ridiculous given that there is only really one step for these and then we stash any output in the once lock. And then they are more verbose to call.

Other ideas welcome.

Custom executor

I wanted to put in an escape hatch to allow alternative async runtimes / customizing existing strategies with extra metrics or whatever / etc. But, the current form isn't particularly pleasant.

Ideally the caller would be able to implement the ComputeHeavyFutureExecutor trait themselves, but there were a few issues:

  • We can't just accept a generic since we need to store the executor in the once lock, and we can't have generics in the static context
  • We can't accept Box<dyn ComputeHeavyFutureExecutor> because the trait isn't object safe to its own generic bounds on its execute(fut: F) method

Instead I stored a closure inside of the CustomExecutor struct. That closure takes a future with Any output on the way in, and returns one with Any output. Because our execute call does have concrete typing, we can temporarily type erase the future's output before sending it into the closure, and then downcast it back on the flip side.

I think it's sound (please correct me, of course). Unfortunately, we are only validating that the closure we get doesn't mangle the type at initialization time, not at compile time.

In addition to complexity, this also adds quite a few extra allocations due to all the added vtables and such.

I'm open to alternative approaches. Or, if it's better just to drop this functionality from the library entirely, and force people to stay on the rails, I'm open to that too.

Using get_or_init() inside spawn_compute_heavy_future()

Feedback on some dummy code in rustls/tokio-rustls#94 (comment) was that get_or_init() was overkill to call from inside the spawn handling.

As best as I can tell, using get_or_init() rather than matching against the runtime flavor every time this function is called is strictly more efficient - since it starts off with OnceLock::get() anyway, and only runs the secondary initialization logic if the cell was uninitialized - ref https://doc.rust-lang.org/beta/src/std/sync/once_lock.rs.html#387

This means that with get_or_init(), in the case where we need to fall back to defaults, we have the single get branch on every call, plus the additional initialization logic on the first call only.

Meanwhile, with just a naked get().unwrap_or_else(), we still have the single get branch on every call, plus the additional match branch against the runtime flavor on every call.

Probably a minor point either way, just calling out my reasoning so that it can be corrected as needed :)

Use of log crate

It felt strange depending on tracing when tokio itself is optional. I was browsing other libraries and was seeing that many of them just use the simple log crate. Which has interop with tracing anyway.

Is that the right choice? Or are libraries just using log because they were written before tracing was widely used?

Testing

Unit tests + doc tests succeed locally, with and without the tokio cfg flag enabled. I didn't want to overload this PR even further by setting up GitHub Actions, but I'll get to it!


arielb1 commented Dec 19, 2024

> Is compute-heavy-future-executor too long? Most crates are like 6 letters long. I was trying to make it clear what it does, but it's pretty clunky. Open to alternatives

I don't think a long name now is a problem.

> I don't love the current ergonomics of initializing a strategy. We have all these calls like initialize_block_in_place_strategy() which either initialize the oncelock or panic.

I believe you should return a Result rather than panicking. A framework-like library might want to supply a strategy without panicking.

A builder might be nicer to find references for. For example, one that is used in the compute_heavy_future::global_strategy().initialize_block_in_place() form.

> I think it's sound (please correct me, of course).

If you don't use unsafe then it has to be sound. It does add a few extra allocations tho [I think 3 - one for the future, one for boxing the output, one for the waiter], but these should not be bad for futures that are actually "compute-heavy".

> Meanwhile with just a naked get().unwrap_or_else(), we still have the single get branch every call, and then the additional match branch against runtime flavor.

This should not affect performance significantly. In any interesting case both branches should be perfectly predictable.

…input rather than panic, add concurrency control + customizable channel size to secondary runtime
…s of custom executor closure input to include `Send + static`, abstract cancellation logic, change multithreaded tokio default strategy from block_in_place -> spawn_blocking, stash the default strategy in its own oncelock

jlizen commented Dec 20, 2024

@arielb1 - all comments should be addressed, thanks again for the close read

The main thing I want to call out is that I ended up adding a secondary OnceLock for the default strategy, which is lazily set and then subsequently ignored if a strategy is initialized afterwards.

I needed this because I started having lifetime issues on the Default impl for ExecutorStrategyImpl once I added in the possible Semaphore usage, even though it is always None when constructed as a default.

I figured it was probably fine since it still avoids the sharp edge of libraries calling spawn_compute_heavy_future() before the caller initializes. Open to alternatives.

Otherwise, I tried to break changes into commits, though some got a bit bigger. Here's the rundown:

Changes since last review

  • migrate from config flags to features
  • Add global_strategy_builder() method
  • Optional concurrency control for all executors
  • All executors are cancellable
    • I considered adding a knob to remove this behavior, but figured it was scope creep and cancellable is the right default behavior. I can open a github issue to see if anybody is interested in that later.
  • Moved the custom executor over to using a oneshot channel rather than having dyn Any
    • I considered keeping some support for Any in case the closure wants to know the future's output type for prioritization or other selective handling. But that's scope creep, I can cut a github issue to see if anybody cares about having support for that via a new executor / new knob.
  • Removed the default to BlockInPlace strategy for the multi-threaded executor, there are too many footguns with it. The caller can still choose it explicitly anyway.
  • Moved tests to integration tests rather than requiring nextest; it's awkward because it forces a flat directory structure, but it's still better than doctests, I feel
  • Uses COMPUTE_HEAVY_FUTURE_EXECUTOR_STRATEGY::get() rather than get_or_init() inside spawn_compute_heavy_future() (though there is now a secondary oncelock for the default strategy as mentioned above)
  • Various other small cleanup / tweaks

@jlizen jlizen requested a review from arielb1 December 20, 2024 20:38

jlizen commented Dec 23, 2024

Additional feedback from ariel - better to keep log messages on setting a strategy inside a shared function where the once lock is actually set. Similarly, only error there rather than sprinkling errors throughout the constructors.

src/lib.rs (outdated):

```rust
}
}

#[must_use = "doesn't do anything unless used"]#[derive(Default)]
```
cargo fmt missing

thanks


arielb1 commented Dec 23, 2024

LGTM but plz rustfmt


jlizen commented Dec 23, 2024

Appreciate the review @arielb1

I'll follow up with separate issues on:

  • avoiding holding permits across i/o sleeps (maybe impl a wrapper future)
  • support for propagating future type inside the custom executor, for prioritization purposes
  • support for more variability in handling based on caller-specified 'likeliness to block' / 'length of block', etc.

@jlizen jlizen merged commit 5a1d203 into main Dec 24, 2024

rcoh commented Dec 24, 2024

My main feedback here is that I'm not sure about the mental model / use cases.

It seems like the strategy to use isn't necessarily application-global (or at least, there isn't a sensible default). Instead, it may make more sense to pick it based on the individual future? The properties of the future (or the number of futures) might dictate that choice.

Or perhaps put another way: I am a customer. How should I pick between these different options? Is there one that is the best?


jlizen commented Dec 24, 2024

Thanks for that @rcoh . Well stated.

I do have a few issues open that touch on this:

I think we need some clearer recommendations around defaults. My knee jerk, without profiling, would be to guess that spawn_blocking + concurrency control is probably the right 'default behavior' to handle assorted calls to this by libraries. And then, if you know you have an issue with frequently blocking futures, the secondary tokio runtime might be better.

I want to validate those assumptions a bit more via profiling and then intend to flesh out the defaults as well as provide better explanations + recommendations in the module docs.

W/r/t selective handling per future, I agree with that as well. I see two parts to this:

  • Better 'turnkey' handling for libraries to more granularly define likeliness/length of blocking. I kept the initial PR scope narrower to avoid scope creep, but I want to probe this and probably publish with at least two modes of operation (likely 'sometimes/briefly blocking' and 'frequently/extended blocking'). Needs more validation first though.
  • I do think there is a place for exposing the specific shape of the input future into a custom closure-based executor as well (or perhaps a new turnkey strategy that allows injecting a selector between strategies based on input type, without requiring the full custom executor closure). Need to play around with that more.

Comment on lines +34 to +56

```rust
#[tokio::test(flavor = "multi_thread", worker_threads = 10)]
async fn block_in_place_concurrency() {
    initialize();

    let start = std::time::Instant::now();

    let mut futures = Vec::new();

    for _ in 0..5 {
        let future = async move { std::thread::sleep(Duration::from_millis(15)) };
        // we need to spawn here since otherwise block in place will cancel other futures inside the same task,
        // ref https://docs.rs/tokio/latest/tokio/task/fn.block_in_place.html
        let future =
            tokio::task::spawn(async move { execute_compute_heavy_future(future).await });
        futures.push(future);
    }

    join_all(futures).await;

    let elapsed_millis = start.elapsed().as_millis();
    assert!(elapsed_millis < 50, "futures did not run concurrently");

    assert!(elapsed_millis > 20, "futures exceeded max concurrency");
```

if there are 10 workers, I'm not sure if this actually tests that execute_compute_heavy_future is working? Seems like the number of futures would need to exceed the number of workers?


yeah, good point, let me update these tests


Specifically, I think we want 4 workers here, so that we can validate both that concurrency limit is firing, but also that the futures are playing nicely with worker threads blocking

Comment on lines +6 to +8

```rust
pub(crate) struct SpawnBlockingExecutor {
    concurrency_limit: ConcurrencyLimit,
}
```

I wonder if this should actually hold onto a handle—you could initialize it from Runtime::current at creation time.


The benefit of that being, it would ensure that the spawn_blocking happens in the originating runtime's context instead of the calling one, in case of multiple runtimes? Or something else?

Comment on lines +28 to +30

```rust
if let Err(err) = tokio::task::spawn_blocking(move || {
    tokio::runtime::Handle::current().block_on(wrapped_future)
})
```

does this actually work? I think once a bad future is in the runtime, it's going to potentially tie up a worker


Hmm, I think I found that approach from something alice posted somewhere, let me dig it up.

It's currently tested e2e with a thread::sleep, so it seems to work as expected?
https://github.com/jlizen/compute-heavy-future-executor/blob/main/tests/spawn_blocking_strategy.rs#L36

@jlizen jlizen Dec 24, 2024


I think it was from here, which isn't particularly authoritative
https://stackoverflow.com/questions/76965631/how-do-i-spawn-possibly-blocking-async-tasks-in-tokio

Going to poke around this more closely, thanks for the callout

@jlizen jlizen Dec 24, 2024

To summarize offline discussion - the behavior is kind of surprising after doing some tracing of polls vs threads. Calling Handle::block_on() from inside spawn_blocking() just spawns a task onto a regular worker thread for that runtime. Which makes it pretty useless for our purposes.

Which kind of makes sense - how else would tokio know how to schedule the sleeping futures without a runtime? And it doesn't want to lazily spin up a current thread runtime for every spawn_blocking thread. At that point you're probably better off managing your own threadpool full of current thread runtimes, or using a delegated secondary multithreaded runtime.

Anyway, PR coming to rip spawn_blocking out and default to a secondary tokio runtime in case the tokio feature is enabled. I think we probably need to tear out block_in_place as well, for similar reasons.

Then meanwhile we have our custom executor as an escape hatch for the 'pool of current thread runtimes' case, other async runtimes, etc.


@jlizen jlizen Dec 24, 2024


I'm also wondering if we should have a (still async) execute_sync() function that accepts a sync input function, for the 'definitely this is blocking and not containing futures' segments of a compute heavy workload.

Since, for those, there is a simpler use case that might just involve spawn_blocking along with concurrency control.

Today, library authors can pick between calling spawn_blocking() on them (pretty opinionated, also tokio-specific) or just intermingling them with futures (no caller ability to slice them apart from futures that will need to sleep). Whereas ideally they could configure a strategy for handling sync inside of async, which gives the consumer an escape hatch to avoid spawn_blocking() (perhaps preferring block_in_place(), perhaps sending to rayon, etc).


cut #8
