Skip to content

Commit

Permalink
Remove webworker.js and update documentation
Browse files Browse the repository at this point in the history
  • Loading branch information
hoodmane committed Oct 14, 2024
1 parent 92d6e22 commit fd359ee
Show file tree
Hide file tree
Showing 5 changed files with 91 additions and 190 deletions.
14 changes: 0 additions & 14 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -28,9 +28,6 @@ all-but-packages: \
dist/python_stdlib.zip \
dist/test.html \
dist/module_test.html \
dist/webworker.js \
dist/webworker_dev.js \
dist/module_webworker_dev.js \


src/core/pyodide_pre.o: src/js/generated/_pyodide.out.js src/core/pre.js src/core/stack_switching/stack_switching.out.js
Expand Down Expand Up @@ -261,15 +258,6 @@ dist/console.html: src/templates/console.html
cp $< $@
sed -i -e 's#{{ PYODIDE_BASE_URL }}#$(PYODIDE_BASE_URL)#g' $@

dist/webworker.js: src/templates/webworker.js
cp $< $@

dist/module_webworker_dev.js: src/templates/module_webworker.js
cp $< $@

dist/webworker_dev.js: src/templates/webworker.js
cp $< $@


# Prepare the dist directory for the release by removing unneeded files
.PHONY: clean-dist-dir
Expand All @@ -278,8 +266,6 @@ clean-dist-dir:
rm dist/makesnap.mjs
rm dist/snapshot.bin
rm dist/module_test.html dist/test.html
# TODO: Remove webworker.js too? Would require updating the docs I think.
rm dist/module_webworker_dev.js dist/webworker_dev.js

# TODO: Source maps aren't useful outside of debug builds I don't think. But
# removing them adds "missing sourcemap" warnings to JS console. We should
Expand Down
1 change: 0 additions & 1 deletion docs/usage/downloading-and-deploying.md
Original file line number Diff line number Diff line change
Expand Up @@ -87,4 +87,3 @@ deployments](https://emscripten.org/docs/compiling/Deploying-Pages.html).
| pyodide.mjs.map | Source maps to improve tracebacks. Not really that useful to people outside of the project, probably should be only included in debug builds. |
| \*-tests.tar | Unvendored tests from wheels. If a wheel includes a test folder, we take them out and put them here. |
| console.html | The Pyodide repl. |
| webworker.js | Mentioned in the docs. Developers should probably implement their own Web Worker. |
214 changes: 91 additions & 123 deletions docs/usage/webworker.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,57 +2,47 @@

# Using Pyodide in a web worker

This document describes how to use Pyodide to execute Python scripts
asynchronously in a web worker.
This document includes an example demonstrating how to use Pyodide to execute
Python scripts asynchronously in a web worker.

## Setup
Let's start with [the definition][worker api].

Setup your project to serve `webworker.js`. You should also serve
`pyodide.js`, and all its associated `.asm.js`, `.json`, and `.wasm`
files as well, though this is not strictly required if `pyodide.js` is pointing
to a site serving current versions of these files.
The simplest way to serve the required files is to use a CDN,
such as `https://cdn.jsdelivr.net/pyodide`. This is the solution
presented here.
> A worker is an object created using a constructor (e.g. [Worker()][worker constructor])
> that runs a named JavaScript file — this file contains the code
> that will run in the worker thread; workers run in another global context that
> is different from the current window.
Update the `webworker.js` sample so that it has as valid URL for `pyodide.js`, and sets
{js:func}`indexURL <globalThis.loadPyodide>` to the location of the supporting files.

In your application code create a web worker `new Worker(...)`,
and attach listeners to it using its `.onerror` and `.onmessage`
methods (listeners).

Communication from the worker to the main thread is done via the `Worker.postMessage()`
method (and vice versa).

[worker onmessage]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers#Sending_messages_to_and_from_a_dedicated_worker
[worker onerror]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers#Handling_errors
A lot of Python code does long running synchronous computations. Running them in
the main thread will block the UI. Using a web worker is advantageous because
the Python code runs in a separate thread from your UI and does not impact your
application's responsiveness.

## Detailed example

In this example process we will have three parties involved:

- The **web worker** is responsible for running scripts in its own separate thread.
- The **worker API** exposes a consumer-to-provider communication interface.
- The **consumer**s want to run some scripts outside the main thread, so they don't block the main thread.
- The **consumer**s want to run some scripts outside the main thread, so they
don't block the main thread.

### Consumers

Our goal is to run some Python code in another thread, this other thread will
not have access to the main thread objects. Therefore, we will need an API that takes
as input not only the Python `script` we want to run, but also the `context` on which
it relies (some JavaScript variables that we would normally get access to if we
were running the Python script in the main thread). Let's first describe what API
we would like to have.
Our goal is to run some Python code in another thread. This other thread will
not have access to the main thread objects. Therefore, we will need an API that
takes as input both the Python `script` we want to run and the `context` on
which it relies (some JavaScript variables that our code needs access to). Let's
first describe what API we would like to have.

Here is an example of consumer that will exchange with the web worker, via the worker interface/API `py-worker.js`. It runs the following Python `script` using the provided `context` and a function called `asyncRun()`.
Here is an example of consumer that will exchange with the web worker, via
`workerApi.mjs`. It runs the following Python `script` using the provided
`context` and a function called `asyncRun()`.

```js
import { asyncRun } from "./py-worker";
import { asyncRun } from "./workerApi.js";

const script = `
import statistics
from js import A_rank
statistics.stdev(A_rank)
`;

Expand All @@ -61,72 +51,43 @@ const context = {
};

async function main() {
try {
const { results, error } = await asyncRun(script, context);
if (results) {
console.log("pyodideWorker return results: ", results);
} else if (error) {
console.log("pyodideWorker error: ", error);
}
} catch (e) {
console.log(
`Error in pyodideWorker at ${e.filename}, Line: ${e.lineno}, ${e.message}`,
);
const { result, error } = await asyncRun(script, context);
if (result) {
console.log("pyodideWorker result:", result);
} else if (error) {
console.log("pyodideWorker error:", error);
}
}

main();
```

Before writing the API, let's first have a look at how the worker operates.
How does our web worker run the `script` using a given `context`.
Before writing the API, let's first have a look at how the worker runs the
`script` using a given `context`.

### Web worker

Let's start with the definition. [A worker][worker api] is:

> A worker is an object created using a constructor (e.g. [Worker()][worker constructor]) that runs a named JavaScript file — this file contains the code that will run in the worker thread; workers run in another global context that is different from the current window. This context is represented by either a DedicatedWorkerGlobalScope object (in the case of dedicated workers - workers that are utilized by a single script), or a SharedWorkerGlobalScope (in the case of shared workers - workers that are shared between multiple scripts).
In our case we will use a single worker to execute Python code without interfering with
client side rendering (which is done by the main JavaScript thread). The worker does
two things:

1. Listen on new messages from the main thread
2. Respond back once it finished executing the Python script

These are the required tasks it should fulfill, but it can do other things. For
example, to always load packages `numpy` and `pytz`, you would insert the line
{js:func}`await pyodide.loadPackage(['numpy', 'pytz']); <pyodide.loadPackage>` as
shown below:
Here is the worker code:

```js
// webworker.js
// webworker.mjs
import { loadPyodide } from "https://cdn.jsdelivr.net/pyodide/v0.26.2/full/pyodide.mjs";

// Setup your project to serve `py-worker.js`. You should also serve
// `pyodide.js`, and all its associated `.asm.js`, `.json`,
// and `.wasm` files as well:
importScripts("{{PYODIDE_CDN_URL}}pyodide.js");

async function loadPyodideAndPackages() {
self.pyodide = await loadPyodide();
await self.pyodide.loadPackage(["numpy", "pytz"]);
}
let pyodideReadyPromise = loadPyodideAndPackages();
let pyodideReadyPromise = loadPyodide();

self.onmessage = async (event) => {
// make sure loading is done
await pyodideReadyPromise;
// Don't bother yet with this line, suppose our API is built in such a way:
const pyodide = await pyodideReadyPromise;
const { id, python, ...context } = event.data;
// The worker copies the context in its own "memory" (an object mapping name to values)
for (const key of Object.keys(context)) {
self[key] = context[key];
}
// Now is the easy part, the one that is similar to working in the main thread:
// Now load any packages we need, run the code, and send the result back.
await pyodide.loadPackagesFromImports(python);
// make a Python dictionary with the data from `context`
const dict = pyodide.globals.get("dict");
const globals = dict(Object.entries(context));
try {
await self.pyodide.loadPackagesFromImports(python);
let results = await self.pyodide.runPythonAsync(python);
self.postMessage({ results, id });
// Execute the python code in this context
const result = await pyodide.runPythonAsync(python, {globals});
self.postMessage({ result, id });
} catch (error) {
self.postMessage({ error: error.message, id });
}
Expand All @@ -135,57 +96,64 @@ self.onmessage = async (event) => {

### The worker API

Now that we established what the two sides need and how they operate,
let's connect them using this simple API (`py-worker.js`). This part is
optional and only a design choice, you could achieve similar results
by exchanging message directly between your main thread and the webworker.
You would just need to call `.postMessages()` with the right arguments as
this API does.
Now that we established what the two sides need and how they operate, let's
connect them using an API that wraps the message passing code into an
asynchronous function.

```js
const pyodideWorker = new Worker("./dist/webworker.js");
// workerApi.mjs
function getPromiseAndResolve() {
let resolve;
let promise = new Promise((res) => {
resolve = res;
});
return { promise, resolve };
}

const callbacks = {};
// Each message needs a unique id to identify the response. In a real example,
// we might use a real uuid package
let lastId = 1;
function getId() {
return lastId++;
}

pyodideWorker.onmessage = (event) => {
const { id, ...data } = event.data;
const onSuccess = callbacks[id];
delete callbacks[id];
onSuccess(data);
};
// Add an id to msg, send it to worker, then wait for a response with the same id.
// When we get such a response, use it to resolve the promise.
function requestResponse(worker, msg) {
const { promise, resolve } = getPromiseAndResolve();
const id = getId();
worker.addEventListener("message", function listener(event) {
if (event.data?.id !== id) {
return;
}
// This listener is done so remove it.
worker.removeEventListener("message", listener);
// Filter the id out of the result
const {id, ...result} = data;
resolve(result);
});
worker.postMessage({ id, ...msg });
return promise;
}

const asyncRun = (() => {
let id = 0; // identify a Promise
return (script, context) => {
// the id could be generated more carefully
id = (id + 1) % Number.MAX_SAFE_INTEGER;
return new Promise((onSuccess) => {
callbacks[id] = onSuccess;
pyodideWorker.postMessage({
...context,
python: script,
id,
});
});
};
})();

export { asyncRun };
const pyodideWorker = new Worker("./webworker.mjs", { type: "module" });

export function asyncRun(script, context) {
return requestResponse(pyodideWorker, {
...context,
python: script,
});
}
```
[worker api]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API
[worker constructor]: https://developer.mozilla.org/en-US/docs/Web/API/Worker/Worker
## Caveats
Using a web worker is advantageous because the Python code is run in a separate
thread from your main UI, and hence does not impact your application's
responsiveness.
There are some limitations, however.
At present, Pyodide does not support sharing the Python interpreter and
packages between multiple web workers or with your main thread.
Since web workers are each in their own virtual machine, you also cannot share
globals between a web worker and your main thread.
Finally, although the web worker is separate from your main thread,
the web worker is itself single threaded, so only one Python script will
execute at a time.
There are some limitations, however. Pyodide does not support sharing the Python
interpreter and packages between multiple workers or with your main thread.
Since workers run in a separate global context distinct from the main thread,
you cannot directly share globals between a worker and your main thread. In
particular, the worker cannot directly manipulate the DOM. Finally, although the
worker is separate from your main thread, the worker is itself single threaded.
26 changes: 0 additions & 26 deletions src/templates/module_webworker.js

This file was deleted.

26 changes: 0 additions & 26 deletions src/templates/webworker.js

This file was deleted.

0 comments on commit fd359ee

Please sign in to comment.