Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loader: add support for components #224

Merged
merged 1 commit into from
Apr 5, 2024

Conversation

whitequark
Copy link
Contributor

Fixes #220.
Closes #219.
Depends on #221.
Depends on #222.

Works like this:

$ ls
wavedrom.wasm    ...
$ python -c 'import wasmtime.loader, json, wavedrom; store = wasmtime.Store(); print(wavedrom.Root(store).render_json(store, json.dumps({"signal": [{ "name": "clk", "wave": "p..." }, { "name": "data", "wave": "0110" }]})))'
<svg xmlns="http://www.w3.org/2000/svg" ....

@whitequark whitequark force-pushed the loader-components branch 2 times, most recently from c55adb5 to 766811d Compare April 4, 2024 14:02
@whitequark
Copy link
Contributor Author

This PR now includes an example of using autogenerated bindings for components.

@whitequark
Copy link
Contributor Author

whitequark commented Apr 4, 2024

That's a lot of mypy failures. I'm unconvinced mypy is actually useful given the sheer amount of Any and # type: ignore annotations all this requires but oh well...

@whitequark whitequark force-pushed the loader-components branch 11 times, most recently from b742c30 to 317ff2f Compare April 4, 2024 15:03
@whitequark
Copy link
Contributor Author

whitequark commented Apr 4, 2024

That was a hour of my life I'll never get back but at least mypy is happy now. (Having to spend this much effort on busywork actually kind of turns me off contributing nontrivial code to wasmtime-py so I think next time I'll just go with # type: ignore until tests pass; every single change that I've made in the last hour was a false positive of some sort, in that it did not point to an error in my code.)

Aside from that, I think this is ready to merge now, with two caveats:

  • This is still not thread-safe. I could easily make it so, but it's not totally clear to me what the granularity should be.
    • Store per thread? Works for core modules but it gets really weird when two unrelated libraries both use wasmtime.loader and their stuff ends up in the same store now. If one of them hits an unrecoverable trap I think the other will stop working too, maybe?
    • Store per .wasm file [per thread; implied here and in the following items]? Works for components but I think that wouldn't really work with the existing core module import code?
    • Store per .wasm file, except if you import something you use the import's store? Works for the case where your entire dependency tree eventually depends on the same module, which might actually not be uncommon, but doesn't work in general.
    • Some way to indicate which store to use?
    • Store per .wasm file, and all imports are imported anew? Works universally but will break mutable aliasing.
  • The interface differs between importing core modules and components, with the latter requiring an explicit store argument.

Ultimately I feel like maybe we should break the interface and surface the store argument explicitly for all use cases, core and component, because it's not obvious to me that a general solution is possible, or even a solution general enough to make most downstream users happy. And there is clearly unhappiness related to the status quo, such as #223.

@whitequark
Copy link
Contributor Author

Rebased.

@whitequark
Copy link
Contributor Author

Note that I've added an example but I'm not totally sure if the .wat file I committed follows the best practices for that kind of thing. I did a wasm-tools print and then hand-edited it to be more legible and closer to existing .wat files but a lot of the Component Model internals are still somewhat opaque to me.

In any case I hope the example provides a reasonable amount of testing for this functionality and also shows how to use it.

I basically don't show how to handle imports at all (which is unlike the core module handling code) and this is mainly because I don't expect to need more than WASI imports in close future at least, possibly ever; I think maybe another contributor is better positioned to address that, though if it's not a high-effort change I could do something as well.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thank you for this!

Having to spend this much effort on busywork actually kind of turns me off contributing nontrivial code to wasmtime-py

Sorry about that! I originally thought this would be a good place to have type annotations but I'm not sure if I'm swimming upstream here. I don't want to make this library hard to use or contribute to at all, but I figured that if folks had their own type annotations they'd want the usage here checked as well. I do agree though that the more "dynamic" parts like the loader I think can probably skip type-checks entirely.

This is still not thread-safe

I agree this is pretty thorny, and I agree it's best to defer to later. I think your idea is the right way to go here of surface a different API for core-wasm-modules and always require a Store is passed in. There's basically no way to share an instance across threads, and if a normal Python module can be shared across threads then representing a wasm instance as a Python module isn't the right way to go.

I'm not totally sure if the .wat file I committed follows the best practices for that kind of thing

Looks good to me!

I basically don't show how to handle imports at all

I think deferring this to later is fine yeah. The loader-based pieces were more demo-quality when I first added them anyway so I think it's fine to figure out how to best manage this later.

rust/src/bindgen.rs Outdated Show resolved Hide resolved
@whitequark
Copy link
Contributor Author

Sorry about that!

Thanks! In general I do my best to accommodate whatever the upstream likes even if it's weird or not what I'd do but sometimes things get a little out of hand.

I originally thought this would be a good place to have type annotations

I actually quite like Python type annotations; the issue I have is with mypy.

To elaborate: you and I both work in Rust a lot, where type inference is function-local. This is practically a hard necessity in statically typed languages like Rust, because while HM-style inference works just fine globally it quickly becomes unmaintainable if you don't require annotations on at least some boundary, with spooky action at a distance, impossible to understand type errors, etc. (OCaml does it on module boundaries but the basic logic is the same.)

However in Python you can't really think of annotations the same way because typing is fundamentally non-local, and any typechecker that attempts to treat it as a local phenomenon is going to create this kind of busywork. I like and use Pyright/Pylance, and I add type annotations in my code where needed to give it the context clues it needs to infer the rest of the types. But it doesn't have the same issue as mypy (perhaps, mypy in the default configuration?) has: if e.g. it knows that self._foo: bool, then it will happily infer that @property def foo(self): return self._foo is also -> bool, without having the signature annotated. And Pyright requires very few strategically placed annotations to be useful; a subset of the constructor arguments, plus mutable globals, plus some of the external code makes it go very far.

but I'm not sure if I'm swimming upstream here. I don't want to make this library hard to use or contribute to at all, but I figured that if folks had their own type annotations they'd want the usage here checked as well. I do agree though that the more "dynamic" parts like the loader I think can probably skip type-checks entirely.

To reiterate perhaps, I think there's definitely value in ensuring downstream users of the library have types available by one means or the others (and I wouldn't mind at all even annotating every public signature to that end, though I'm not sure it's strictly required) but I don't think that in this particular context mypy brings more value than it creates obstructions.

@alexcrichton
Copy link
Member

Do you have a sense for if there's a more conventional type-checker to use? I picked mypy as it seemed like a reasonable default at the time, but I could very well have been wrong and/or times may have changed in the meantime. I've got no love of mypy myself as I've run into lots of issues historically related to ctypes and mypy, so if Pyright/Pylance are better options I'd be happy to switch to that.

I definitely agree that mypy has felt like a lot of busywork, especially in all the -> None functions that don't return anything. I'd just assumed it was The Way up until now!

@whitequark
Copy link
Contributor Author

So, full disclosure, I'm mostly using Pylance from within VS Code, so my user experience will not quite match that of contributors using command-line Pyright analysis. However, I just ran Pyright on the wasmtime/loader.py file that I've been editing, here's the output:

$ ./node_modules/.bin/pyright -v .venv/ wasmtime/loader.py 
.../wasmtime/loader.py
  .../wasmtime/loader.py:44:54 - error: Argument of type "str | None" cannot be assigned to parameter "path" of type "str | bytes | PathLike[Unknown]" in function "from_file" (reportArgumentType)
  .../wasmtime/loader.py:44:70 - error: "origin" is not a known member of "None" (reportOptionalMemberAccess)
  .../wasmtime/loader.py:52:20 - error: Argument of type "str | None" cannot be assigned to parameter "key" of type "str" in function "__getitem__"
    Type "str | None" cannot be assigned to type "str"
      "None" is incompatible with "str" (reportArgumentType)
  .../wasmtime/loader.py:54:36 - error: Argument of type "AsExternType" cannot be assigned to parameter "ty" of type "FuncType" in function "__init__"
    Type "AsExternType" cannot be assigned to type "FuncType"
      "GlobalType" is incompatible with "FuncType" (reportArgumentType)
  .../wasmtime/loader.py:55:47 - error: Argument of type "str | None" cannot be assigned to parameter "name" of type "str" in function "define"
    Type "str | None" cannot be assigned to type "str"
      "None" is incompatible with "str" (reportArgumentType)
  .../wasmtime/loader.py:74:23 - error: Argument of type "str | None" cannot be assigned to parameter "args" of type "StrPath" in function "__new__" (reportArgumentType)
  .../wasmtime/loader.py:74:39 - error: "origin" is not a known member of "None" (reportOptionalMemberAccess)
  .../wasmtime/loader.py:94:9 - error: Method "is_resource" overrides class "ResourceReader" in an incompatible manner
    Parameter 2 name mismatch: base parameter is named "path", override parameter is named "name" (reportIncompatibleMethodOverride)
8 errors, 0 warnings, 0 informations 

Not only are these messages much more readable, but some of them are actionable, too! Let's do the breakdown of these 8 messages:

  1. error: Argument of type "str | None" cannot be assigned to parameter "path" of type "str | bytes | PathLike[Unknown]" in function "from_file": false positive, this is dynamically impossible (but it doesn't know this and this could have easily been a bug)
  2. error: "origin" is not a known member of "None": continuation of the same
  3. error: Argument of type "str | None" cannot be assigned to parameter "key" of type "str" in function "__getitem__": seems like a bug: unnamed import types would crash the loader
  4. error: Argument of type "AsExternType" cannot be assigned to parameter "ty" of type "FuncType" in function "__init__": seems like a bug: insufficiently generic type annotation
  5. error: Argument of type "str | None" cannot be assigned to parameter "name" of type "str" in function "define": seems like a bug: unnamed export types would crash the loader
  6. error: Argument of type "str | None" cannot be assigned to parameter "args" of type "StrPath" in function "__new__": same as (1)
  7. error: "origin" is not a known member of "None": continuation of (6)
  8. error: Method "is_resource" overrides class "ResourceReader" in an incompatible manner Parameter 2 name mismatch: base parameter is named "path", override parameter is named "name": definitely a bug, if someone uses a kwarg to call this method (unlikely but possible), it'll crash

So, out of the 6 errors reported as 8 diagnostics, only 2 are false positives due to runtime invariants outside of the typechecker's ability to reason, 2 are definitely bugs that could be reasonably triggered through normal use, and 2 are bugs that could only be triggered through the use of the module linking proposal extensions. There aren't any false positives due to insufficiently powerful analyses or the like. Mypy, of course, found none of the 6.

To handle the false positives due to runtime invariants all I had to do was to add

assert module.__spec__ is not None and module.__spec__.origin is not None

and now they're gone. The check doesn't hurt readability either.

@whitequark whitequark force-pushed the loader-components branch 2 times, most recently from 3cc473e to ac48863 Compare April 4, 2024 22:25
@whitequark
Copy link
Contributor Author

@alexcrichton OK, it should be calling from_file now for the case where it's actually on the filesystem.

@alexcrichton alexcrichton merged commit 6932406 into bytecodealliance:main Apr 5, 2024
11 checks passed
@alexcrichton
Copy link
Member

Nice! I'll try to look into switching to pylance/pyright in the future

@whitequark whitequark deleted the loader-components branch May 7, 2024 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants