
loader.rs async fixes #15

Open · wants to merge 2 commits into main
Conversation

@mdegans commented Mar 1, 2023

This PR addresses a few bugs in loader.rs and should improve performance when loading many files, especially from multiple sources. It does add a dependency on tokio for non-wasm, though I'm assuming (perhaps incorrectly) that this is fine because three-d depends on it as well for non-wasm.

  • fix `fn load_from_disk` -> `async fn load_from_disk`

The old code was incompatible with async because this code here...

```rust
    for (path, handle) in handles.drain(..) {
        let bytes = handle
            .join()
```

... blocks the calling thread until all threads are joined (`join` is a blocking call). On a single-threaded runtime that prevents any other tasks/`async fn` from executing; on a multi-threaded runtime it blocks every task on that particular thread, which could be a lot. This might show up as lag in other tasks while loading is in progress.

The new non-wasm code uses tokio tasks (green threads) and lets the runtime decide how to schedule file IO by `await`ing instead of `join`ing, which frees the runtime to schedule other tasks while the loads are pending. The tokio runtime will likely still read from disk on a worker thread in series (because disk storage is organized sequentially); the difference is that the runtime knows what's going on and the calling thread is never blocked.

  • new non-wasm download code

The old code was async, but the downloads still happened sequentially: when you `await` in a loop, iteration pauses *within that function*, so each `send()` in the following code is made in sequence. Other async functions could still be interleaved and run, but based on the name `handles` I think the intent here was to await all paths concurrently.

```rust
        for path in paths {
            let url = reqwest::Url::parse(path.to_str().unwrap())
                .map_err(|_| Error::FailedParsingUrl(path.to_str().unwrap().to_string()))?;
            handles.push((path, client.get(url).send().await));
```

The new code uses tokio tasks, which are all awaited concurrently -- the only limit being 8 connections per host, to avoid hammering servers.

  • simplify `load_single`

`load_single` now just calls `load_async_single` from a new executor on the current thread. Async and sync code is shared this way, and performance will likely benefit provided it's not called in a tight loop. Tasks within the future may still be awaited concurrently provided they don't block.

  • ignore .vscode
  • enable parallel network and disk loading

Old behavior was to await network assets, then await disk assets.

New behavior is to await both disk and network IO concurrently.

https://rust-lang.github.io/async-book/06_multiple_futures/02_join.html

  • simplify `load` code
  • document `load_async_single`
  • add `RawAssets::with_capacity` - to allow pre-allocating space when the length is known ahead of time

@mdegans (Author) commented Mar 1, 2023

Hey, for some reason the workflow didn't run on my GH. I ran the tests manually with --features all and did a wasm_build, although without browser tests. I didn't change much in the wasm version of the code, however, and I haven't tested in three-d yet, but I think everything should work.

I plan on adding some tests and perhaps some benchmarks that can help us find regressions. I imagine this PR adds a one-time cost to launching load (because it creates a Tokio runtime and calls block_on once), but it should give us a performance benefit with many files from many different sources and come at no cost when the async API is already in use.

Please don't feel obligated to accept any of these changes. I'm doing this mostly to get used to async in Rust, and we should probably measure the performance benefits/drawbacks before merging into master in any case. I was thinking of creating some large png files of random data so we can see how well it fetches and decodes them all, unless you have another suggestion for testing this?

@asny (Owner) commented Mar 1, 2023

Super nice work 💪 I'm all for these changes as long as it works and is fast 👍 I just ran a few three-d examples and it works and seems to be faster, but it would be really nice with a test/benchmark!

My only concern is the tokio dependency; three-d-asset was not necessarily meant to be used together with three-d, and three-d doesn't depend on tokio, only the examples do. Could we make it optional? It could just be enabled with a load feature which enables the load and load_async functions?

And about the tests not running, I have to explicitly allow them to run, because you haven't contributed to this repo before 🤷

@mdegans (Author) commented Mar 1, 2023

only the examples do

My bad. I mis-remembered your top level Cargo.toml and should have double checked.

Could we make it optional?

Sure, that's doable, and I can see why you would want to do that. What do you think about making both io and async features? io would be required to load from or save to web/disk at all (users could still use bytes or data urls) and async would require any async dependencies? The current http would depend on io, but not necessarily async.

it would be really nice with a test/benchmark!

I definitely plan on it. All this wouldn't be much fun without measurable performance improvements :)

And about the tests not running

Oh, I meant they weren't running on my fork. I wanted to run the actions my own account (so I would know the tests pass ahead of submitting the PR here) but I think there's a setting I must change to allow actions to trigger on forks. I will figure it out. I think I found the setting. I just haven't tried pushing any changes.

@asny (Owner) commented Mar 4, 2023

My bad. I mis-remembered your top level Cargo.toml and should have double checked.

No problem at all, I'd have had to check myself 😆

Sure, that's doable, and I can see why you would want to do that. What do you think about making both io and async features? io would be required to load from or save to web/disk at all (users could still use bytes or data urls) and async would require any async dependencies? The current http would depend on io, but not necessarily async.

To me there are two solutions:

  • Either have an io feature flag that enables/disables io completely and enables/disables reqwest and tokio (we could then get rid of the http flag, I think). That's relatively easy to implement and we don't end up with cfg attributes all over.
  • The other solution is to again have an io feature flag that enables/disables both reqwest and tokio, but with a fallback in the code if reqwest or tokio is not enabled. This is what I did for reqwest, but if we want to do it for tokio too, it's a lot of cfg attributes and a lot of fallbacks. Maybe we can keep it simple? It doesn't have to be optimal in that case, I think. This solution also makes it possible to enable/disable reqwest and tokio individually. We could also have a feature flag for each of them (http for reqwest and async for tokio), but I don't know if that was a great design choice on my part 🤷

I definitely plan on it. All this wouldn't be much fun without measurable performance improvements :)

Oh, I know that feeling. I just did something at work that was only half finished, but I couldn't wait, I just had to measure the performance improvements even though it didn't make a lot of sense 😆 Anyway, sounds great 👍

Oh, I meant they weren't running on my fork. I wanted to run the actions my own account (so I would know the tests pass ahead of submitting the PR here) but I think there's a setting I must change to allow actions to trigger on forks. I will figure it out. I think I found the setting. I just haven't tried pushing any changes.

Ah, I misunderstood, sorry 🙂

@mdegans (Author) commented Mar 7, 2023

I think I like the idea of them all being feature flags, but we might be able to lose tokio entirely for some io features, including http, if we write it with the futures crate instead, since it has tasks and a suitable join_all. tokio depends on futures anyway, so it adds nothing to the build size. It shouldn't break any examples, and by doing this we might be able to unify our web and native codebases. Fewer #[cfg(...)] blocks that way.

I will try to make it all work with just futures. I don't think the code needs to change very much, and if it works, the file gets much smaller. Certainly worth a try.

@asny (Owner) commented Mar 8, 2023

Sounds like an awesome plan 👍

@mdegans (Author) commented Mar 17, 2023

Hey, getting closer with the futures code. Do you prefer I/O be optimized for SSD or HDD? Reading from multiple threads will likely help SSD performance, but it would hurt HDD performance. Alternatively, we could make it configurable at build time or runtime somehow. Any suggestions on doing that?

@asny (Owner) commented Mar 17, 2023

Hi again 👋 Sounds great 👍 I have to say I don't have a strong preference, but I guess SSD is pretty standard nowadays and will be even more so in the future? I think we should only make it configurable if someone really needs it; it's not like it's not going to work with HDD. Unless you think it's a fun task, of course. The only real requirement I have for these changes is that they're not way too complex, as I might have to support them in the future 😆 And of course that it works and is fast and so on, but I'm not really worried about that 🙂
