
Consider experimental support for shared memory #272

Open

ivan-aksamentov opened this issue Jun 28, 2020 · 5 comments

Comments


ivan-aksamentov commented Jun 28, 2020

Hi Andy @andywer ,

We are having great success using your library; it unlocks all kinds of new possibilities for client-side compute and, hopefully, will serve some COVID-19 researchers soon:
https://github.com/neherlab/webclades

One problem we faced is that the queue lives on the main thread, so the pool is blocked from retrieving new tasks whenever the main thread is busy (in our case with some heavy rendering). This leads to underutilization of resources.

I touched on this problem a bit in this issue:
nextstrain/nextclade#38

There is an experimental work happening in Mozilla that allows for shared memory
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer/Planned_changes

Do you think it could be used to put the queue in a shared segment of memory, share it between workers, and execute dequeuing in the context of the worker itself, so that workers don't have to tap the main thread?

I don't expect this to happen right now, of course, especially since this particular API is not even released yet, but it would be great to have it implemented one day.
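To illustrate the idea (a hypothetical sketch, not part of the library): a shared queue could be a ring buffer of task ids living in a SharedArrayBuffer, with Atomics guarding the head/tail indices so any worker can dequeue without touching the main thread. All names and the layout here are illustrative, and a real multi-consumer queue would need Atomics.compareExchange to avoid races between workers.

```javascript
// Hypothetical sketch: a fixed-size ring buffer of task ids in shared memory.
// Slots 0 and 1 hold the head/tail indices; task ids follow after them.
const CAPACITY = 1024
const HEAD = 0
const TAIL = 1
const buffer = new SharedArrayBuffer((2 + CAPACITY) * Int32Array.BYTES_PER_ELEMENT)
const queue = new Int32Array(buffer)

function enqueue(taskId) {
  const tail = Atomics.load(queue, TAIL)
  const next = (tail + 1) % CAPACITY
  if (next === Atomics.load(queue, HEAD)) return false // queue is full
  queue[2 + tail] = taskId
  Atomics.store(queue, TAIL, next)
  Atomics.notify(queue, TAIL) // wake a worker blocked in Atomics.wait on TAIL
  return true
}

function dequeue() {
  const head = Atomics.load(queue, HEAD)
  if (head === Atomics.load(queue, TAIL)) return null // queue is empty
  const taskId = queue[2 + head]
  Atomics.store(queue, HEAD, (head + 1) % CAPACITY)
  return taskId
}
```

A worker could then spin or `Atomics.wait` on the tail index and call `dequeue()` locally, entirely off the main thread.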

Thanks again for the great library!


andywer commented Jun 29, 2020

Hey @ivan-aksamentov, that's so nice to hear!

Always happy to help put technology to good use 🙂

Regarding the feature request… Good point. I didn't think the pool on the main thread would easily become a bottleneck, but now that I think about it, it makes sense.

I think we need to split the discussion into two here:
a) Moving the pool off the main thread
b) Shared memory blobs

I think (a) could be quite straightforward if we go for the "cheap solution", esp. now that #273 is about to land: one could manage the pool in a worker thread.

Not as elegant, and maybe not 100% as efficient as having a "decentralized pool" managed by the pool workers themselves, but it might suffice to end the resource fight between the UI and the pool scheduler. Running a pool on a worker might already be possible, in fact.

Now there's the question of (b), shared memory. The biggest issue with it is cross-platform support, though: browser support for shared memory buffers looks bad, and for Node.js we would need to get pretty creative… So that seems to be pretty much a deal-breaker at the moment.
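(For reference on the browser side: the "planned changes" MDN page linked above describes re-enabling SharedArrayBuffer only for cross-origin isolated pages, i.e. pages served with response headers along the lines of the following, which is a deployment constraint in itself:)

```http
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```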

With the callback support (#273) it might be worthwhile just giving the pool on a worker a shot and see how it performs.

@ivan-aksamentov

@andywer

a) Moving the pool off the main thread

Oh, I hadn't thought about that. A dedicated worker whose only job is to distribute tasks and then idle might indeed work. I will be waiting for news in this area.

Thanks again for the great library!


andywer commented Jun 30, 2020

@ivan-aksamentov I published the current state of the callback PR as [email protected] in case you want to try it. You can find some very basic documentation on how to use it in this comment.

@ivan-aksamentov

@andywer Thanks. I don't yet fully understand how exactly callbacks help with moving the pool off the main thread, but I may poke around sometime.

I assumed it would be an implementation detail of the pool itself. Or can we hack something together in userland?

Can you give me some pointers?


andywer commented Jun 30, 2020

Sure. I think it should be feasible in userland. Haven't tried it yet, but in my head it looks something like this:

```js
// worker.js
import { spawn, Pool, Worker } from "threads"
import { expose } from "threads/worker"

// Pool() takes a factory that spawns the actual task workers
// ("./fooWorker" is a placeholder path)
const pool = Pool(() => spawn(new Worker("./fooWorker")))
const tasks = new Map()

expose({
  completed() {
    return pool.completed()
  },
  queueFooTask(workDescription) {
    const task = pool.queue(async worker => {
      return worker.doFooTask(workDescription)
    })
    tasks.set(task.id, task)
    return {
      id: task.id
    }
  },
  awaitTaskCompletion(exposedTask) {
    // Pool tasks are `.then()`-able (resemble promises), so we might just be able
    // to simply return the task in order to return a completion promise
    // to the calling thread
    return tasks.get(exposedTask.id)
  },
  cancelTask(exposedTask) {
    tasks.get(exposedTask.id).cancel()
  }
})
```

Just realized that you might not even need callbacks. But callbacks by themselves are also not enough to provide a really nice API: you would need to be able to return objects with callable methods.

It might make sense to extend the PR with such a feature, as right now you can only pass callbacks directly; you cannot pass or return an object with callback methods.
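The task-handle pattern in that sketch can be demonstrated standalone in plain JS (all names here are illustrative, with a bare promise standing in for a pool task): the worker-side object keeps the real task objects in a Map and only ever hands out plain `{ id }` descriptors, since only structured-clone-safe data can cross the thread boundary.

```javascript
// Plain-JS sketch of the task-handle pattern: real task objects stay in a
// Map on the worker side; callers only ever see a cloneable { id } handle.
// Promise.resolve().then(fn) stands in for pool.queue(...) here.
let nextId = 0
const tasks = new Map()

function queueTask(fn) {
  const id = nextId++
  tasks.set(id, Promise.resolve().then(fn))
  return { id } // only structured-clone-safe data crosses postMessage
}

function awaitTaskCompletion(handle) {
  // Returning the stored promise yields a completion promise to the caller
  return tasks.get(handle.id)
}
```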
