- Where: zoom.us
- When: November 12th, 9am-10am Pacific Standard Time (November 12th, 5pm-6pm UTC)
- Location: link on calendar invite
None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.
The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.
-
Opening, welcome and roll call
- Opening of the meeting
- Introduction of attendees
-
Find volunteers for note taking (acting chair to volunteer)
-
Adoption of the agenda
-
Proposals and discussions
- Review of action items from prior meeting.
- Discussion of atomics with unshared memories (WebAssembly/threads#144)
- POLL: move JS-BigInt-integration to phase 3
- POLL: move multiple memories to phase 2
- Move wasi-sdk et al into the WebAssembly organization?
- Rename reference-sysroot to wasi-libc and finish the move
- Move wasi-sdk to the WebAssembly organization too
- wasi-libc-test? (includes GPL-2.0 and non-commercial licenses)
-
Closure
None
None
- Derek Schuff
- Mitch Moore
- Sam Clegg
- Barbara Nichols
- Sven Sauleau
- Zhi An Ng
- Thomas Lively
- Heejin Ahn
- Keith Miler
- Sergey Rubanov
- Paul Dworzanski
- Minqiu Sun
- Lars Hansen
- Peter jensen
- Benjamin Bouvier
- Alex Crichton
- Dan Gohman
- Andreas Rossberg
- Alon Zakai
- Francis McCabe
- Luke Wagner
- Ben Titzer
- Petr Penzin
- Rich Winterton
- Till Schneidereit
- Adam Klein
- Jakob Kummerow
- Shravan Narayan
- Nathaniel McCallum
- Flaki
- Nick Fitzgerald
- Rick Battagline
- Bill Ticehurst
- Arun Purushan
Issue: WebAssembly/threads#144
Slides: https://docs.google.com/presentation/d/1TRje57rtVnX1mykYNbH8WlPvIQ6lAVK1-gFFwSNwSoQ/edit#slide=id.p
[TL presenting]
Presented different options for enabling this. Feedback?
FM: if you have a threadsafe library, won’t it be more different than just whether it invokes suspend/wait?
TL: potentially. E.g. if fast malloc impl has thread-local free list, uses compare+swap, etc. All of that stuff will just work in a single-threaded app.
FM but you might not want it.
TL: right, but a generally malloc might still want to use it. Current implementations just use the thread ops anyway and they just work. In wasm we could lower them to non-atomic equivalents at instantiation time and get better speed.
KM: you statically know when you instantiate whether the memory is shared?
TL: yes because the binary imports/exports a shared or unshared memory
AR: if the import has to be different, how can you avoid shipping the module twice?
TL: not the final linked module, but the library (e.g. static library binary) can be compiled just once
KM: can’t you just lower them into non-atomics at link time?
TL: yes but the problem is that static linking tools are such that this transformation would have to happen in the linker, which isn’t very smart. In theory you could have a relocation or something for the linker to rewrite.
LH: but now you’re asking the runtime to do that instead? That seems like a bigger change to ask.
AR: doesn’t the same argument about importing apply to libraries?
TL: no, wasm object files don’t really have memories (the memory import in the final binary is controlled by the linker)
… back to LH: you’re asking the runtime to do the linker’s job here, but at least the linker is optional. Also this is only applicable to a fairly small number of libraries, so it seems fairly special-purpose.
KM: it seems like libraries built this way would have to be specially designed if they will work with both threaded and non threaded apps
SC: I think most libraries are like that, they support both uses
TL: currently the linker doesn’t allow this (linking thread-using and non-thread-using libraries) at link time, and lots of users have run into this error already, meaning they are linking mixed shared/unshared code. It seems common. All you need a library that’s thread-safe (e.g. uses atomics) and doesn’t spawn its own threads, and it can be used this way.
PP: on native platforms the instructions will work either way, so I’d expect to see code that takes advantage of that. … is shared memory declaration done automatically or does the user have to ask for that?
TL: with emscripten, if you use the -pthread flag then you get a shared memory output.
PP: we probably don’t want to make the linker edit the instructions, that seems bad.
SC: there is some precedent on native, e.g. for linker relaxation.
KM: [... missed the comment]
BT: the JVM is inherently multithreaded, there’s no non-threading version.
SC: the work of the runtime is optional, you don’t have to lower away the atomics.
LH: it’s more about wait/notify. They don’t have nice semantics in the non-shared context. You can just wait and not expect to be woken (e.g. to implement sleep)
TL: you can do that, just not on the main thread.
LH: those semantic choices are useful, it would be cleaner to throw rather than speculating the wait.
TL: the second proposal is definitely much more involved. I’d be happy with just the first proposal. It’s still nice to not have to have to build everything twice, even with the guard code.
LH: which guard code?
TL: you have to check the value with a load before executing the wait.
PP: what if we just don’t validate the wait if there’s shared memory?
TL: then you still have to build the libraries twice.
PP: what user-facing construct results in a wait?
TL: pthread_mutex_wait should be implemented in terms of atomic wait.
PP: so if the code isn’t guarded right currently, they’d still have to fix it.
KM: a naive lock implementation would have this problem, if it only ever waits without checking, expecting the wait to return immediately. I’d lean toward the second solution because of that. I guess in theory the engine could optimize it away or something.
AR: would conditional compilation be an alternative?
TL: if you sequestered all of your calls to wait in a separate function you could use conditional compilation for that. You’d have 2 different versions of all those functions. it would be a big workaround for the problem.
PP: the non-wait part seems non-controversial. Not sure we want to hinge that on conditional sections.
LH: we mentioned we need the guard code. we could do that temporarily for now and then move away in the longer term with conditional compilation.
DS: we don’t seem to have a consensus here, we should clarify the options on the github discussion and continue it.
AR: presenting previous slides: [TODO: link] Current state is technically mostly ready for stage 3, but only proposing stage 2 here.
KM: do we know where we should start looking at performance characteristics?
AR: implementation is stage 3, so that would be the entry requirement for 4.
KM: I’d guess that there would be a big perf drop for using multiple memories (i.e. the secondary memories) since they don’t use the same VM tricks.
AR: you could continue to optimize memory 0 (that’s backward compatible) and the others would be slower, so at least that wouldn’t be a regression. But that’s something we should look into. Partly why i’m just proposing stage 2, so people can think about that.
KM: we want to avoid JS-style cliffs where the VM has to guess which memory you want to make fast.
BT: the drop might be less than you’d think. When we switched V8 the difference was less. So I think this won’t be too bad.
LH: we tried removing our pinned heap register, and saw a 3-5% hit, so not too bad. not great but not terrible.
FM: We looked at multiple memories in Interface Types. we looked at the case where we combined multiple modules into a single one, static linking style. It could result in having hundreds of memories. In that case there wouldn’t be just one special memory 0.
LH: we might want some way of designating memories as primary or secondary, as mentioned in the github issue.
KM: we’d definitely want that. Some platforms have a pretty small number of possible memories to use the big VM allocation.
LW: in that case you might even want them smaller than 64k. So maybe we might want memory to declare their own smaller page size. In the extreme case you could have a size of 1, and you’d take a bigger perf difference.
AR: the question is in what form we’d want this, and whether they’d be part of the multi memory MVP. As long as we don’t regress today, this should be fine. We’ve talked about optimization hints in the past, and don’t yet have a good way to do that. Not sure we want to add that to this proposal.
LH: we’d want to be careful since we could have a ton of memories accidentally.
BT: makes sense that we should separate the hints into a different section.
LH: we still need everything to be compatible, and will matter even if not technically semantically meaningful.
BT: we’d forseen the multiple memories, but it turned out that it was useful to leave room for extensions. Could we leave further room for extension even with this proposal?
AR: we could maybe do that with the extra flag field.
LW: we also have the further future goal of first-class memories where they are a dynamic reference.
AR: the current alignment field just has the one bit. It might be possible, might not be worth it.
DS: it sounds like we have a consensus that we should move forward with this generally. The details we are talking about now are about how best to do it.
Poll: Move the multiple memories proposal to stage 2
SF: 13 F: 6 N: 3 A: 0 SA: 0
[Proposal passes.]
SS: Link to slides AR: agreed that there’s nothing to do in the reference interpreter for this proposal since it’s only about JS integration.
TL: for the toolchain, emscripten currently does a transform that takes the i64s at the JS boundary and splits them into pairs of i32s. So for emscripten we’d just want a flag to not do that.
[No other comments.]
DS: it sounds like there are no concerns with this proposal. Let’s do a unanimous consent poll:
Poll: move BigInt to Stage 3: [No objections, proposal passes]
DG: this is something we started a while ago: reference-sysroot is currently in the wasm Github organization. it morphed into wasi-libc, and we have a usable implementation.
AZ: what are your thoughts on other implementations of libc for wasi?
DG: nothing stops people from creating others. It’s useful to have an example/reference which isn’t tied to any particular runtime. It’s not unlike other toolchains/tools being part of the wasm org, such as wabt/binaryen.
DS: So the first proposal is renaming reference-sysroot, what was the 2nd?
DG: second is to move wasi-sdk to wasm organization.
DS: what’s currently in the wasi-sdk repo?
PP: a wrapper for building clang for wasi. Technically everything that goes in there is basically part of LLVM. I tried to get the build to work out of the box but its tricky because you have to build libc before the compiler. Mostly because of the way the headers work. It would be good to improve that.
DG: right. Also it’s not actually a wrapper, we want everything to be just clang.
PP: yeah it’s really just a wrapper for building clang, not for running it. Might be best to even move it into clang proper.
SC: it’s basically just a build script. It’s really for a whole sysroot which includes all the libraries, headers, and compiler. We also publish binaries as releases.
DG: we are definitely interested in putting as much upstream as possible.
DS: any other comments/concerns?
[none]
DS: let’s do a unanimous consent poll for:
Renaming reference-sysroot to wasi-libc Creating wasi-sdk to contain build scripts and sysroot construction for wasi SDK
[no objections, proposal passes]