- Where: zoom.us
- When: January 18th, 5pm-6pm UTC (January 18th, 9am-10am Pacific Time)
- Location: link on calendar invite
None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.
The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.
- Opening, welcome and roll call
- Opening of the meeting
- Introduction of attendees
- Find volunteers for note taking (acting chair to volunteer)
- Adoption of the agenda
- Proposals and discussions
- Fine grained control of WebAssembly memory (Deepti Gandluri) [20 minutes]
- Poll for phase 1
- Closure
None
None
- Thomas Lively
- Pat Hickey
- Johnnie Birch
- Michael Knyszek
- Lars Hansen
- Francis McCabe
- Ben Titzer
- Yuri Iozzelli
- Yury Delendik
- Deepti Gandluri
- Richard Winterton
- Chris Fallin
- Andrew Brown
- Slava Kuzmich
- Conrad Watt
- Radu Matei
- Paolo Severini
- Manos Koukoutos
- Dan Gohman
- Adam Klein
- Sabine Schmaltz
- Arun Purushan
- Petr Penzin
- Alon Zakai
- Alex Crichton
- Sergey Rubanov
- Jacob Abraham
- Jakob Kummerow
- Sean Jensen-Grey
- Emanuel Zeigler
- Sam Clegg
- Luke Wagner
- Mingqiu Sun
- Charles Vaughn
- Derek Schuff
- Zalim (Bashorov?)
- Shravan Narayan
- Peter Huene
- Heejin Ahn
- Jay (Phelps?)
- Nick Fitzgerald
- bhayes
- Is
- Oran
Fine grained control of WebAssembly memory (Deepti Gandluri)
DG presenting slides
CW: With reference to the idea of extending the memory type to include the protection status, could a memory start out protected then become unprotected at runtime? Similar to the “min” part of the memory type.
DG: With the memory type, the main option I wanted was to ensure that if you have just a static read-only memory, you'd have that. If the protections can change, you'd still want to ensure that you're able to grow the memory.
CW: Sounds like there are lots of options, so don’t want to pin you down now.
DG: It's not clear exactly what all the use cases are yet, so there are still some open design questions.
CW: Could it be useful to have different kinds of memory bounds and protections be different types?
DG: I'm imagining that if you have memories other than the default, they should all enable external mappings. But if there are cases where you want to set permissions up front, that would be pretty easy to implement, if it's useful. Not sure what the use cases are, but it wouldn't be hard, I think.
CW: Makes sense. I would be interested to hear opinions on how we would want this to work concurrently, i.e. with multiple threads racing. I’m not sure what’s safe or not in allowing races. In native programs, races are undefined. Presumably we would want something safer. I looked at what guest OSes do and I saw one example where it had to take a global lock before forwarding the operation to the host OS. That seems rather brutal. Do we have a roadmap for consulting experts, etc. to figure this out?
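To make the concurrency concern concrete, here is a minimal native sketch (plain C with POSIX mprotect; the structure is only illustrative and says nothing about eventual Wasm semantics): one thread toggles a page's protection while another thread reads from it, and whether a given read succeeds or faults is purely a matter of timing.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *page;
static size_t page_size;

/* Repeatedly revoke and restore access to the page. */
static void *protector(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        mprotect(page, page_size, PROT_NONE);
        mprotect(page, page_size, PROT_READ | PROT_WRITE);
    }
    return NULL;
}

/* Concurrently load from the page; a load may fault (SIGSEGV) if it
 * races with the PROT_NONE window above. */
static void *reader(void *arg) {
    (void)arg;
    volatile unsigned char sink = 0;
    for (int i = 0; i < 100000; i++)
        sink ^= page[0];
    (void)sink;
    return NULL;
}

int main(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) { perror("mmap"); return 1; }

    pthread_t a, b;
    pthread_create(&a, NULL, protector, NULL);
    pthread_create(&b, NULL, reader, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    puts("no fault this run; the outcome is timing-dependent");
    return 0;
}
```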
LW: I'm also worried about the performance implications of protection changes happening anywhere at any time. But the idea of "below X bytes have no access", statically, is something I like and seems reasonable. Wholly read-only memories are another good use case and easy. It might be a good idea to take some of these known easy cases and go incrementally before trying something fully general.
YI (chat): A different type of memory, like Conrad proposed, would solve the issue of breaking the performance of existing applications: they would not opt in to using mappable memory and would not have extra bounds checks. Maybe that is also combinable with multiple memories: a program could have a default non-mappable memory and an extra mappable one (potentially slower to access due to mapping checks).
SJG (chat): An interesting paper on creative uses of the kinds of things normally handled by an MMU: http://dune.scs.stanford.edu/belay:dune.pdf. From the abstract: "Dune is a system that provides applications with direct but safe access to hardware features such as ring protection, page tables, and tagged TLBs, while preserving the existing OS interfaces for processes. Dune uses the virtualization hardware in modern processors to provide a process, rather than a machine abstraction. It consists of a small kernel module that initializes virtualization hardware and mediates interactions with the kernel, and a user-level library that helps applications manage privileged hardware features. We present the implementation of Dune for 64-bit x86 Linux. We use Dune to implement three user-level applications that can benefit from access to privileged hardware: a sandbox for untrusted code, a privilege separation facility, and a garbage collector. The use of Dune greatly simplifies the implementation of these applications and provides significant performance advantages."
DG: Aside from that (what Luke said), having separate memories with more static properties would probably be easier for integration into the web environment, JS APIs, etc. Also agree that there are things we don't know wrt concurrency, so we don't have good answers on that, and agreed that we should try to find more folks with experience there.
CW: My point is that concurrency is scary for normal applications, even more scary for us.
BT: Actually immutable memories could allow more optimizations, so that would be another useful distinction.
DG: That's an interesting point. From the application perspective, that seems like a useful thing to be able to have too, e.g. for a codec use case, having a static view/data view.
LH: Would like to see the tools story fleshed out early. Since we don’t currently have a good story for multiple memories and how mainstream languages would use it.
DG: Yes that is something I was hoping to look at very soon after we get started
TL: For LLVM, Igalia is currently working on exposing ref types, e.g. exposing user-defined tables. I would think that secondary memories would use similar mechanisms internally.
PP: Why would this be similar? The problem with most languages is that there's not really a concept of different memory spaces. So we'd need to figure out how this would map to and be expressed in those languages. Using fully dynamic memory references might even be easier for those.
TL: For the table support in C, we're looking at an attribute on a user-defined array of a builtin type (e.g. an array of funcref) that you can say is a table, and you can do loads and stores from it; those lower to table.get/table.set. A table is basically a memory full of funcref/anyref instead of bytes. Conversely, a second memory is just a table full of bytes instead of references. So it could be similar at the source language level.
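As an illustration of what TL is describing, a hypothetical C surface syntax might look like the sketch below; the `wasm_table` attribute name is invented here and is not necessarily the spelling the toolchain work uses. Element accesses on the annotated array would lower to `table.get`/`table.set` rather than to linear-memory loads and stores.

```c
typedef void (*handler_t)(int);

/* Hypothetical attribute: declare a user-defined Wasm table of funcref. */
__attribute__((wasm_table)) static handler_t handlers[16];

void install(int slot, handler_t h) {
    handlers[slot] = h;      /* would lower to table.set $handlers */
}

void dispatch(int slot, int arg) {
    handlers[slot](arg);     /* table access plus an indirect call */
}
```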
PP: But memory accesses are more freeform, can overlap and be different sizes, etc.
TL: true. The way you access the secondary memories might be less ergonomic than how you access the primary memory.
PP: There was a discussion of address spaces, e.g. what LLVM has; that could be the closest analogy.
SC: as an escape hatch you could compile the whole program to use the secondary memory instead of the primary one
DS: Could have a hybrid where you compile some of your code normally and compile other code to use a different memory, then arrange for them to call into each other somehow.
PP: …
LW: In this area there are two orthogonal approaches. One is to use multiple memories, which has hard problems in source languages, e.g. code needs to opt in with special syntax. The other approach is, within a single memory, to map stuff in. Mutable mapping and changing protections is where all of the difficult portability problems are. Maybe if we have immutable mappings (or e.g. mappings with copy semantics) we could optimize those. E.g. on the web there's a concept of an immutable blob. If we could map one of those into the memory, it might work well for some use cases: you can write into it, but the writes don't get reflected in the underlying blob.
DG: One more note on copy-on-write. I checked whether that made sense and some users thought that wouldn’t be problematic, so that does seem like a reasonable thing to investigate.
LH: Petr also mentioned my point, that there's a big space in the tools/languages where we don't know how it will work.
DG: [?]
CW: One other question: is it expected that if we have C code that makes mmap calls, we can compile that to these instructions?
DG: I would expect not. I'm not sure what happens now when C code calls mmap.
DS: It’s basically completely outside the model of the language. But the pointers are uniform, for example zlib doesn’t care whether you pass it a pointer to mmapped data or not. We would give up that property if mmap only works on secondary memories. Would have to find alternative solutions for developers that have different use cases traditionally satisfied by mmap.
SC: Yeah, the summary is "probably not". Currently we fake mmap; I expect we'll keep doing that.
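For context, "faking mmap" in a single linear memory usually means something with the rough shape of the sketch below (a simplified illustration, not the actual emscripten or WASI-libc implementation): anonymous mappings become zero-filled heap allocations and file-backed mappings are eagerly copied in with pread, so real mapping semantics such as sharing, lazy paging, and protection changes are lost.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

void *fake_mmap(void *addr, size_t len, int prot, int flags, int fd, off_t off) {
    (void)addr; (void)prot;          /* placement hints and protection ignored */
    void *p = calloc(1, len);        /* zero-filled backing storage on the heap */
    if (p == NULL) { errno = ENOMEM; return MAP_FAILED; }
    if (!(flags & MAP_ANONYMOUS)) {
        /* Eagerly copy the file contents instead of mapping them. */
        if (pread(fd, p, len, off) < 0) { free(p); return MAP_FAILED; }
    }
    return p;
}

int fake_munmap(void *p, size_t len) {
    (void)len;
    free(p);                         /* MAP_SHARED write-back is not emulated */
    return 0;
}
```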
CV: Packing the memory id into the pointer is something that people have tried, it does have a lot of overhead though.
TL: Yeah, fat pointers essentially turn every access into a switch over the index, which would probably be too slow for most use cases.
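A sketch of the fat-pointer cost being described, with invented names for illustration: if a pointer carries a memory index alongside an offset, every dereference has to branch on that index before doing the actual load.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for two linear memories managed by the toolchain. */
static uint8_t primary_memory[65536];
static uint8_t secondary_memory[65536];

typedef struct {
    uint32_t mem;     /* which memory the address lives in */
    uint32_t offset;  /* address within that memory        */
} fat_ptr;

static inline uint32_t load_u32(fat_ptr p) {
    uint32_t v;
    switch (p.mem) {                  /* extra branch on every access */
    case 0:  memcpy(&v, primary_memory + p.offset, sizeof v);   break;
    default: memcpy(&v, secondary_memory + p.offset, sizeof v); break;
    }
    return v;
}
```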
PP: From that point of view, first-class memory references might be easier.
DS: Not sure it's easier in C, since C doesn't have opaque references of any kind. It probably makes some problems easier and others harder.
PP: The likelier solution is that the memory reference would take the place of a pointer.
DG: No easy answers. I still have a lot of work to do to start and make progress on these discussions.
Poll: we can do a consensus poll, since the bar is low. Are there any objections to phase 1?
Unanimous consent achieved.
DG: Will take these discussions and split them into issues.
LW: It’s great to have someone actively working on this. Thanks.