Agenda for the November meeting of WebAssembly's Community Group

  • Host: Intel, Santa Clara, CA

  • Dates: Wednesday-Thursday November 1-2, 2017

  • Times:

    • Wednesday - 9:00am to 5:00pm (breakfast 8-9am)
    • Thursday - 9:00am to 5:00pm (breakfast 8-9am)
  • Location: Intel SC12 Santa Clara, CA, 95054

    • Wednesday - Room: CR-SC12-154
    • Thursday - Room: CR-SC12-538
  • Wifi: WebAsm Intel Guest Login

  • Dinner:

    • Pedro's Restaurant & Cantina
    • 3935 Freedom Cir, Santa Clara, CA 95054
    • (Restaurant is within 5-10min walking distance from SC-12)
  • Contact:

Registration

Registration form

Logistics

  • Where to Go

    • Location: Intel SC-12, 3600 Juliette Lane, Santa Clara, CA 95054 - (37.3863932, -121.9666314)
    • Where to Park:
    • SC 12 Parking Garage
    • Free parking is available outside the building in SC12 parking.
  • How to access the building

    • Access building from the main SC12 lobby entrance
  • Technical presentation requirements (adapters, google hangouts/other accounts required, etc.)

Hotels

A list of nearby hotels:

  • Biltmore Hotel & Suites Silicon Valley (408) 988-8411 (walking distance) 2151 Laurelwood Rd, Santa Clara
  • Marriott Santa Clara (408) 988-1500 (walking distance) 2700 Mission College Blvd, Santa Clara
  • Embassy Suites Hotel Santa Clara (408) 496-6400, 2885 Lakeside Dr, Santa Clara
  • La Quinta Inn & Suites San Jose Airport (408) 435-8800, 2585 Seaboard Ave, San Jose
  • Hyatt House Santa Clara (408) 486-0800, 3915 Rivermark Plaza, Santa Clara

Agenda items

  • Wednesday - November 1st
    1. Opening, welcome and roll call
      1. Opening of the meeting
      2. Introduction of attendees
      3. Host facilities, local logistics
    2. Find volunteers for note taking
    3. Adoption of the agenda
    4. Review of action items from last in-person meeting:
      1. WebAssembly specification: "Luke to start an email thread, loop in Domenic."
      2. Multiple return values and generalised block types pick operator: "Luke to follow up on the impact for emscripten."
      3. Tail call:
        1. "Microsoft: figure out constraints"
        2. "Google toolchain folks: Gather data on where tail call would be used, what it would look like."
      4. Threads
        1. Memory serialization: "Ben Smith to follow up with Domenic."
        2. Testing: "Lars to propose a common shell API. Others to discuss."
      5. SIMD (follow up on day 2 SIMD item)
      6. User engagement: "JF to create a moderated announcement list for all CG members, and a users list which users opt-into."
      7. Administration of GitHub:
        1. "When a repo for a proposal is created add the champion as admin"
        2. "Keep limited admins, folks asking for new repos should route through other companies"
        3. "JF / BN: Cleanup existing admins on projects"
    5. Proposals and discussions
      1. Update on proposals from July: Non-trapping float-to-int conversions (Dan Gohman)
        1. Proposal Repo
        2. POLL: Should this enter the Implementation Phase?
      2. limits.h (Derek Schuff + Ben Titzer)
        1. Discussion of how we approach limits.h
        2. Discussion of function size limits.
          • Bug
          • How tractable is function splitting in the tools?
          • How should we handle existing compatibility?
        3. Discussion of how to avoid mismatched commitments in the future
      3. Continue Web Platform Test discussion from where it left off on the last CG video call: do we want 1-way sync?
      4. CSP + WebAssembly
        1. Proposal
        2. Discussion: Which parts of this proposal have consensus?
        3. POLL: We should adopt the 'Proposed Homogenization of Existing Behavior'.
        4. POLL: We should adopt the 'Proposed 'wasm-eval' Directive'.
        5. POLL: We should adopt the 'Proposed Origin Bound Permission'.
      5. JavaScript Bindings for WebAssembly (Brad Nelson & Luke Wagner)
        1. Proposal
        2. Discussion: What do we like, what should change?
        3. POLL: We should start a jsdom fork of the spec.
      6. Specialization Proposal (Brad Nelson)
        1. Strawperson proposal
        2. Slides
        3. Discussion
          • Is this a good approach?
          • Are there better approaches?
          • Is this satisfying in terms of where we want to go with threads?
          • Exploration of how existing implementations confront this problem domain.
        4. POLL: A repo for this proposal should be created and this proposal made more formal.
        5. POLL: This proposal should attempt to solve both limited "fast" memories, and address space being a separate resource.
    6. Adjourn
  • Thursday - November 2nd
    1. Find volunteers for note taking
    2. Proposals and discussions
      1. Update on proposals from July: Multi-value (Andreas Rossberg)
        1. Proposal Repo
        2. POLL: Should this enter the Implementation Phase?
      2. Updates on SIMD proposal
        1. Proposal Repo
        2. Presentation on Android ARM results (Google)
        3. Presentation on iOS results (Apple?)
      3. Items from last time to make sure we follow up on:
        1. "JZ to measure on other Android CPU ISAs."
        2. "JZ / BN to try contacting MIPS / POWER folks to perform measurements."
        3. "JF to measure on Apple hardware."
        4. "JZ / BN to gather another similar integer benchmark."
        5. "JZ / BN to come back with concrete proposal of what the narrowing/widening, min/max operations should look like."
      4. Updates on Threads proposal
        1. Proposal Repo
        2. POLL: Rename ixx.wait/wake to ixx.atomic.wait/atomic.wake
        3. POLL: i64 for wake count
      5. Exception Handling proposal (Heejin Ahn)
        1. Proposal Repo
        2. Exception handling scheme in toolchain
        3. Slides
        4. Discussion on the status of this proposal. Open issues in the design space.
        5. Presentation of tooling results and preliminary measurements.
    3. Closure

Schedule constraints

Andreas Rossberg: only available on Thursday (via VC)

Dates and locations of future meetings

| Dates | Location | Host |
| --- | --- | --- |
| 2017-11-06 to 2017-11-07 | Burlingame, CA | TPAC |

Meeting notes

Wednesday - November 1st

Opening, welcome and roll call

Opening of the meeting

Introduction of attendees

  • Brad Nelson (BN)
  • Ben Titzer (BT)
  • Mark Miller (MM)
  • Jacob Gravelle (JG)
  • Dan Gohman (DG)
  • Tyler McMullen (TM)
  • Johann Schleier-Smith (JS)
  • Arun Purusan (AP)
  • Michael Holman (MH)
  • Ben Smith (BS)
  • Derek Schuff (DS)
  • Erik Holk (EH)
  • Deepti Gandluri (DG)
  • Peter Jensen (PJ)
  • Vincent Belliard (VB)
  • JF Bastien (JF)
  • Bill Maddox (BM)
  • Art Scott (remote)
  • Limin Zhu (remote)
  • Luke Wagner (LW)
  • Richard Winterton (RW)
  • Mircea Trofin (MT)
  • Karl Schimpf (KS)
  • James Zern (JZ)
  • Keith Miller (KM)
  • Bill Budge (BB)

Adoption of the agenda

Seconded by Mark Miller.

Review of action items

From last in-person meeting

WebAssembly specification: "Luke to start an email thread, loop in Domenic."

  • LW: We did this! Dan Ehrenberg is working on this.

Multiple return values and generalised block types pick operator: "Luke to follow up on the impact for emscripten."

  • LW: Talked to Alon, and he said that it would not be easy to perform measurements for this because it isn’t really AST.
  • DS: Not easy for wasm backend either.
  • BT: We have some amount of implementation in v8.
  • JF: We were trying to figure out whether it was useful for just functions or block types.
  • BT: We have both.
  • JF: This item is about producers. But DS, you can work on this?
  • DS: Yes.

Action item: Derek to come back on this, with data from the LLVM wasm backend. Andreas implemented it in V8, so will try it out there.

Tail call

"Microsoft: figure out constraints"

  • MH: talked a bit, hoping to get something in the spec to get bounded increase in stack. Mutually recursive functions.
  • JF: So you mean provably bounded at compile time?
  • MH: Promise to not keep growing… so if you have 1 arg then tail call with 3 args, you have to add a whole new frame for 3 args. When you make a tail call, sometimes you need to increase stack, if you have more args. We want to create a new frame in that case. The way the spec is written today, it doesn’t allow that.
  • JF: With tweaks, you’d be ok with the spec?
  • MH: As long as there is something to allow you to grow the stack.
  • JF: The spec needs to mention where you’re allowed to grow the stack
  • MM: If the tail calls are statically visible at call site, can’t you reserve that much space?
  • MH: Sometimes using indirect calls, so you can’t know. The tail call itself isn’t the problem, it’s the caller of the function that does the tail call.
  • BN: Can we bound the number of parameters for the tail call site.
  • BS: Didn’t we discuss this last time?
  • MH: We did, but we didn’t really come to a conclusion. There was an exponential growth of tail callable annotations.
  • MH: At worst we’ll end up not doing a tail call. In places where it matters, we should be able to do it. We’re not going to be able to change the ABI.
  • BN: We need some way to bound the parameters.
  • JF: At the last meeting, Gabby said that MSVC would…
  • MH: MSVC has the same problem.
  • MM: For operations that reflect on the stack, e.g., debugging, we would need a mechanism to hide the tail callers that are actually on the stack but “shouldn’t” be, to have a deterministic contract.
  • JF: The champion for tail call is Andreas, maybe we should chat about this tomorrow when he is available.
  • BN: If we have some constraint, then what constraints are tractable for producers?
  • JF: What about tail calls across modules?
  • MH: Calling into JS, probably won’t work. Wasm, probably will work? Conversation: does tail calls across boundaries come up?

Action item: Brad and Andreas to look at this and revisit tomorrow.
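MH’s constraint is easiest to see with a concrete shape of code; the following sketch is a hypothetical illustration (the names `f` and `g` are not from the meeting):

```javascript
// Hypothetical illustration of the constraint MH describes: f takes one
// argument but tail-calls g with three. An engine whose calling convention
// passes extra arguments on the stack cannot simply reuse f's frame for g,
// so the spec text needs to permit growing the stack at such a call site.
function g(a, b, c) { return a + b + c; }
function f(x) { return g(x, x + 1, x + 2); } // tail call with more args than f received

console.log(f(1)); // 6
```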

"Google toolchain folks: Gather data on where tail call would be used, what it would look like."

Not done.

Action item: Brad to make sure someone measures interesting corpus for C++ code, to see how tail call could be used and if there are more / fewer / same number of parameters.

  • BN: Will this be used in LLVM?
  • DS: LLVM has this and uses it for haskell, maybe others.
  • BN: Natural tail calls?
  • DS: Front end will stick an attribute on the function, then backend can choose what to do.
  • MH: If it’s for optimization, we don’t have to spec at all.
  • DG: Current spec you’re not allowed to do that at all.
  • JF: We want to design something that works well for C/C++, and the main holdback for MS is fewer params to more params, that’s a problem.
  • BT: The feature is to support proper tail calls, if there is a limitation then it should be one that is easy to lift.
  • JF: We want to have something simple and useful to try out, then see if we can do more. Without involvement with people from other languages, we are just guessing.
  • BT: My impression overall, the people in this room have driven the design. Only now people are showing up to take a look.
  • BN: We were talking about having an outreach event in SF for people interested in wasm from a user point of view.
  • JF: TPAC is specifically for web platform
  • BT: We have survivor bias, we see the things that are successful
  • LW: Once we have GC types, we’ll see more folks who are interested in tail calls.
  • MH: Talking with C# team, they can’t use params have to put everything in linear memory.
  • BM: People would like to get better performance than that. If you’re already using a shadow stack, might as well do params too.
  • JF: We have two action items, Brad to sync with Rossberg, Brad also to follow up with measurements on toolchain end
  • DS: Need data for: what are the instances of tail call optimizations, and what are the instances of required tail calls.
  • BN: Have folks who are potential users of the format, but won’t necessarily come to the CG.

Action item: JF to figure out how to reach out to potential producers of the format who don’t want to come to CG meetings but would be good to talk to.

Threads

Memory serialization: "Ben Smith to follow up with Domenic."

  • BS: question was whether disallowing postMessage of non-shared WebAssembly memory would be hard.
  • BN: Dan Ehrenberg said he’d fold it into his current work.
  • BS: Domenic was looped in.

Testing: "Lars to propose a common shell API (WebAssembly/threads#52). Others to discuss."

  • BS: Lars is working on it

Action item: Lars to follow up next time.

SIMD (follow up on day 2 SIMD item)

Will discuss tomorrow

User engagement: "JF to create a moderated announcement list for all CG members, and a users list which users opt-into."

  • JF: I did the changes we discussed. We have public webassembly announce. Only Brad and JF can post. Others can request to post. We have the public list too, it is not auto opt-in. Announce is opt-in, and it is meant to be low traffic. Migrated everyone to the public list for folks who regularly come to the meeting.

Administration of GitHub:

"When a repo for a proposal is created add the champion as admin"

  • JF: Brad did this -- made it easy. Has a process document, new proposal, create a new repo, etc.
  • BN: I think there was some cleanup in mind, but this is fine.

"Keep limited admins, folks asking for new repos should route through other companies"

"JF / BN: Cleanup existing admins on projects"

  • JF: Not done, but it’s fine.

Non-trapping float-to-int conversions

Dan Gohman

Proposal Repo

  • DG: current status, we have test suite coverage, we have spec implementation. As far as I’m aware, this means we’re ready for stage 3. This is the first proposal going through this process, so we have to figure out how this works.
  • BN: (reads from process doc)
  • BN: You’re asking for stage 3, it has all the requirements.
  • JF: We should formally approve these things.
  • BN: There should be continual agreement.
  • MH: Should we have a place with this info about current proposals?
  • JF: We have this on the spec repo. What about updating webassembly.org?
  • BN: I’ll assign action item to our PM for this
  • JF: FutureFeatures.md has a list of all of them, with links to issue
  • BN: Should use names or numbers for proposal stages? Both?
  • JF: Yeah, both.
  • JF: OK, poll?
  • DG: Does moving phases require a poll?
  • BN: Doc says we have to keep consensus, so we should poll.
  • JF: I don’t want unanimous consent to move a thing, we want to know how people feel about it.
  • JF: (describes how polling works)

POLL: Should this enter the Implementation Phase?

| SA | A | N | F | SF |
| --- | --- | --- | --- | --- |
| 0 | 0 | 1 | 12 | 4 |

Action item: Brad to sync with his PM to own WebAssembly.org

Action item: JF to update tracking issue.

Result: moves to implementation phase.
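For context, a minimal sketch (in JS, as an assumption about the proposal’s semantics rather than spec text) of what a saturating conversion like `i32.trunc_sat_f64_s` computes, where the existing `i32.trunc_f64_s` would instead trap on NaN or out-of-range inputs:

```javascript
// Hedged sketch of saturating float-to-int semantics: NaN becomes 0 and
// out-of-range inputs clamp to the i32 range, instead of trapping.
const INT32_MIN = -0x80000000;
const INT32_MAX = 0x7fffffff;

function truncSatF64S(x) {
  if (Number.isNaN(x)) return 0;
  if (x <= INT32_MIN) return INT32_MIN;
  if (x >= INT32_MAX) return INT32_MAX;
  return Math.trunc(x); // in range: ordinary truncation toward zero
}

console.log(truncSatF64S(3.7));   // 3
console.log(truncSatF64S(NaN));   // 0
console.log(truncSatF64S(1e30));  // 2147483647
console.log(truncSatF64S(-1e30)); // -2147483648
```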

limits.h

Derek Schuff + Ben Titzer

  1. Discussion of how we approach limits.h (https://github.com/v8/v8/blob/master/src/wasm/wasm-limits.h)
  2. Discussion of function size limits.
    • Bug (https://github.com/WebAssembly/design/issues/1138)
    • How tractable is function splitting in the tools?
    • How should we handle existing compatibility?
  3. Discussion of how to avoid mismatched commitments in the future
  • BN: We had some confusion about limits.h, should talk about it.
  • BN: Context: we had this set of limits that agreed for various implementations for interoperability. But it turns out that only MS was enforcing them. Chrome was enforcing for validation only (FF too). JSC implemented but removed since it broke a number of tests.
  • BN: How strictly should we enforce these limits?
  • BN: How tractable is this in the tools to chop apart long functions.
  • JF: Not just payload size. How do we standardize limits in general: that’s the question. When we initially started this, we informally chose these numbers but we didn’t standardize it. What we’ve learned is that it probably should be standardized and in the test suite. Should it be an official thing? Should it be just for web embeddings? Doesn’t make much sense for non web-embeddings. But it would be nice if it works the same everywhere.
  • BT: I agree we should make it part of the web embedding. We should also have a description of why we have this.
  • BN: We should have a description of why it is useful to have each of the limits.
  • MM: Sounds terrible to have these specific to the web embedding. On TC39, we have not had pleasant experiences with differences between JS in web and non-web embeddings. People will do tool support… bad game theory when enforced in some places and not others.
  • DS: It depends on why we have limits in the first place?
  • MM: Uniformity is the more important issue. Limits everywhere or nowhere.
  • BT: The web is special.
  • BN: Other places in the web, this isn’t specified.
  • SB: Why do we care about standardizing this?
  • MH: So it works in all browsers, not just folks testing in Chrome.
  • JF: We can’t standardize things like stack size. Some of the limits are static in the binary, some are runtime. Is there a maximum memory size? Is it runtime or static?
  • BT: Static, not runtime.
  • JF: We want to figure out if we want only static limits or some dynamic limits.
  • VB: User can decide.
  • BN: We’ll have some discussion at TPAC about this.
  • BT: We have a limitation already with varints limited to 32 bit.
  • DS: This is a web thing, so we can preserve the web experience. In a server environment, there’s no reason to do that. Maybe we should have different policies for different limits.
  • RW: We need to have better error reporting, so we have a reason why it couldn’t run.
  • JF: We don’t standardize that though, just that there’s an error.
  • BT: We should give better errors.
  • JF: Dan Ehrenberg brought this up w/ error messages in atomic accesses. We don’t specify which error you get if you throw multiple, we just say that you get one.
  • BT: We don’t have to go that far, we should try to do better though.
  • JF: For MS, you delay a lot of stuff until the function is run, right? How do you do it, does it fail at runtime?
  • MH: I think so.
  • BN: This is interesting, since the rationale was the quality of the experience for the user, but unlike 32-bit limit on varints, it’s not a natural boundary limit. We’re hitting this with real stuff. It’s an arbitrary limit. I’m nervous that if we have a lot of arbitrary limits, then we’ll have problem when we want to lift them.
  • MM: When one breaks a function into multiple functions, is there any observable semantic difference?
  • JF: Not right now, but if we allow stack inspection there would be, e.g. via Error.stack.
  • MM: There’s no other observable difference?
  • JF: Maybe in the future.
  • MM: If, other than that, there is no observable difference. Why have the producer do it rather than the VM? What is the benefit?
  • BT: We err on the side of pushing work to the producer
  • MM: Good in general, but it is making software irregular. What is the benefit of having this in the web embedding?
  • BT: Compilation is superlinear, so it makes a 7meg function hard to compile. It is an engine convenience.
  • MM: We’re expecting a compilation failure?
  • BN: Folks come up with huge payloads, but there isn’t a performant way in the tools to split them apart.
  • JF: Alon was working on the binaryen outliner for this.
  • BN: That has the property that it might work, but can fail
  • DS: Outlining in LLVM IR didn’t land. The benefits are different, this is to try to solve based on an arbitrary limit, which was not the reason for the current pass.
  • JF: Having tool support will be useful, but it’s not the same goal.
  • LW: There’s also a practical limit for ARM w/ 32Mb jump limits.
  • BN: A lot of the other limits are in terms of engine convenience.
  • BT: There’s a lot of function limits, and you will probably crash anyway
  • MM: For a given limit, do we expect that the limit surfaces in the user-level tools. If we expect that it is OK to surface the limit to the tools…
  • BN: What concerns me about this is that how he got to the big limit is calling a bunch of static initializers, for that particular case we can break it up in the tools
  • BT: There is some number that is big enough.
  • BN: We’re not sure what this limit is for a particular example. In this particular case it is allocating space and then filling it in, so we could potentially break the function up.
  • BT: People get frustrated with these limits. We hit the 2G jar limit at Google.
  • LW: Having the limit means that we have knowledge from users, then we can all update the limit together. Coordination is useful if the limit is hit for valid use cases.
  • BN: Is it useful outside of web embeddings?
  • BT: Should we poll, about whether this is for web embeddings?
  • BT: I think it changes the discussion if it is just for the web embedding.
  • BN: We have payloads in the wild that exceed the capacity. MS followed the limit, and now they can’t run these binaries.
  • MH: We are stuck with the limit until the next version of windows.
  • BN: We want to avoid this in the future. If we say we have some limit, we don’t want to have a situation where we have incompatibility. If we had this in the test suite…
  • JF: We could have an appendix, like C++ where we describe this.
  • BM: These are quality of implementation issues, they differ from VM not toolchain. They are not toolchain limitations, maybe we should provide these as query parameters for the implementation.
  • BT: We should try to all have the same limits.
  • BN: The reason to have the limits is to say that we all should be able to handle this many. Maybe it should be a minimum not a max?
  • BT: Same problem.
  • RW: If we have it different for web and non-web, then we’ll have two modules.
  • BN: We could have prevented this in the tools. If we had this in the test suite, would we have noticed?
  • LW: We should have this in the test suite. At least failure on limits. We’ll have a giant file… (BN: or generated on the fly).
  • BN: If we have the shared limits…
  • LW: Dan E would like to add this to the web spec
  • BN: Or is it just an agreement between browsers?
  • MM: The one case we know of, lots of initialization in one function. That’s easy for the tool to break up.
  • DS: Single purpose code for one piece of source code.
  • MM: Design the tool for static initializers. For someone who has a function that is that actually that large, then we can have the tool display an error.
  • DS: I don’t disagree with this, but I don’t agree with earlier assertion. Difference between web and non-web, there are differences between embeddings so we shouldn’t be prescriptive about that. What do we do about binaries that work in one place and not another? At some point we start to depend on emscripten linux syscalls vs. a non-web syscall layer, so in some sense we’ll never be completely free of this limitation. Is this is the kind of limitation that we’ll likely have to deal with no matter what we do?
  • JF: We are running in circles on this. We made a mistake, and it sounds like we don’t have enough experience/info w/ non-web to move forward here. Next meeting we should discuss this further.
  • BN/DS: We need a short term plan for the next 6 months.
  • JF: Pick better numbers, and discuss better tooling.
  • BN: Are we going to try to stay below the 128k limit in the next 6 months?
  • MH: I think it’s acceptable to not implement outlining.
  • DS: We have to do something -- it’s bad that the tools don’t expose limits. At the very minimum we should push the error to the user of the tool. By default we should not create binaries that don’t work for the minimum limit. Maybe we can do something better for static initializer functions.
  • BN: Is there anyone who had a concern about the limit other than the number?
  • MH: Nope, just because it was in limits.h
  • JF: Right now, 4 browser vendors need to get together, implement the numbers, revisit at the next meeting.
  • BT: And add tests.

Action item: Derek to work on diagnostics for tooling, warning on currently implemented lower limits.

Action item: 4 browser vendors to clean up the limits.h numbers, agree on them, write tests, and unofficially support them for now. Next meeting, we’ll discuss the following polls, which we didn’t take today for lack of information to reach consensus.
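As a sketch of what an agreed-upon number would look like once it is in the test suite, here is a hypothetical validation-time check; the 128 KiB body-size figure echoes the "128k limit" mentioned in the discussion and is illustrative, not normative:

```javascript
// Hedged sketch of a validation-time function-size check. The constant is
// an assumed illustrative value taken from the "128k limit" in the notes.
const MAX_FUNCTION_BODY_SIZE = 128 * 1024; // bytes; assumed, per discussion

function validateFunctionBodySize(bodySize) {
  return bodySize <= MAX_FUNCTION_BODY_SIZE;
}

console.log(validateFunctionBodySize(1000));            // true
console.log(validateFunctionBodySize(7 * 1024 * 1024)); // false (the ~7 MB function)
```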

POLL: do we want to discuss only static limits, or dynamic ones as well?

POLL: do we want to standardize this?

POLL: Should the limits be Web-embedding specific, or standardized for all of WebAssembly?

Skipping polls.

Break.

Web Platform Test

Ben Titzer

Continue from where we left off on the last CG video call: do we want 1-way sync?

  • BT: 1-way sync means: from wasm spec repo to WPT (web platform tests). Another proposal: 2-way sync, from directory in engine (e.g. v8) to wasm spec repo.
  • JF: 2-way sync means… are you working on infrastructure… between v8, WPT to spec repo?
  • BT: Proposed direction, sync into the Wasm spec repo - things can be automatically synced to the engines
  • JF: We said at the last meeting that we don’t want the source of truth to be WPT. If we do 2-way sync, we don’t get random stuff to break my engine, right?
  • BN: The way this works with WPT is that there are test expectations, so when they do break, the waterfall isn’t red
  • BT: The larger context is to reduce the friction of getting the tests into WPT, and to make it easy for people to add tests for bugs they find.
  • JF: Sounds like 1-way sync is spec syncs to WPT; 2-way is one engine to spec.
  • BN: There are some conversations going on about test262 having this property. The model that’s been bounced around is a contrib directory, so they have the ability to add things to the contrib directly, and then someone like bocoup will take from contrib and incorporate into test suite.
  • JF: What Andreas was mentioning - adding a regression from v8 to the set of tests - but that’s not what Test262 does, it takes a thing and tests it in isolation - we’re talking about different types of tests, maybe we want both?
  • MH: Those aren’t the type of tests that we would be syncing (test262 style tests)
  • BN: For those regression tests, is there value in sharing? Folks in web engines have found that there is a benefit of sharing even regression tests.
  • MH: Someone took all of our tests and, I don’t know if they found any bugs, but was there any value?
  • JF: Michael did that?
  • SB: Definitely found bugs, there’s a little twiddling for error messages to align, caught some crashes
  • LW: Also Chakra imported a bunch of spidermonkey bugs, so…
  • MH: Yeah, that helped.
  • BT: Running other engines tests is useful, but should it be automated? Every regression needs massaging, so it needs manual work, like in the js-api tests, there’s really nothing that tests functionality - so it would be nice to import the v8 tests we have, and be able to use them
  • JF: From our point of view it would be useful to get other tests. We use our console builtins, mostly the same but some might not work (GC stuff). I’m not going to want to figure out how to make our ES6 modules work w/ your runner, etc.
  • BN: What’s happening in WPT is that they have a subset of the browser specific methods of running tests, some amount of wrapping is needed to figure out which engine - if it doesn’t work they just fall back to not knowing - so there is potential to be able to run it in all engines. If people get into adding tests to a common place, then we don’t need to write these tests 5 times over
  • MH: I wonder if that is a broader issue than wasm. Do we need to pave the way for that sync.
  • BN: It already exists for blink and WPT and firefox, those have 2-way sync. Other than that I’m not sure about any new plans. We’re not pioneering this. And perhaps will happen w/ test 262 too.
  • BT: Before we get to automating, we need to audit - and have someone collect the tests to sync
  • BN: Bear in mind that we are not overnight sharing all our tests, it’s just that when you add new tests you’ll put them in the shared place instead of the private place.
  • JF: We’ll have different testing frameworks that are similar - i.e. different frameworks do similar things, the goal is to have more coverage, not just have one framework. It’s not useful for us to move all our tests into some format
  • BN: The intention of the 2-way sync is to only do this for new tests.
  • LW: I think that someone writes a sync from the shared WPT to the web spec repo.
  • BN: It has to go to the engines, since v8 is in a separate repo
  • LW: That already works between Blink, and FF, tests would just go to shared WPT and we need to sync back to Wasm
  • BN: But V8 is in a separate engine - in terms of the 2-way sync, there’s missing places between v8 and the common shared tests or other engines and the shared thing. We could sync to the shared WPT. One concern is the loose ACLs around who can contribute to
  • JF: I thought that’s what we voted against last time. Now there are 3 types of sync. 1-way sync from WPT to web spec. Two types of 2-way sync...
  • BN: There’s two variants of 2-way sync - should WPT be included, or should it be just the engines and the tests?
  • LW: Just the engines sounds like I would have to do a lot of work, so I’d like to hook into the 2-way sync we already do for WPT.
  • BN: We could get owners sorted out - if somebody else comes in and does additional work for the sharing bit, would there be interest in moving to the format?
  • BT: Should the tests be … we don’t have a reference interpreter for the JS API, so we may have them fail on one engine and fail on another.
  • LW: these things can be automatically marked as not expected to pass - by default that doesn’t make anyone red, it just adds a todo item. This must happen all the time.
  • BN: You may get partial breakage, but these tests are usually small. It’s more likely that you add a new test, it’s less likely that you’ll change an existing test and break some repo.
  • LW: We could just say nobody should change test expectations without filing an issue
  • BN: Adding a new test, and if you have wrong semantics, that’s something to figure out
  • JF: What Ben mentioned about 1-way sync sounds self-standing and straightforward, so we could move that forward. Let’s poll that.
  • (Others agree)
  • (Brad shows wpt.fyi)
  • MT: Failures on Chrome because IDB stuff is behind a flag
  • BN: We’ve kind of done the process backward, since normally folks write the spec and then go implement, and we’ve done it the other way.

POLL: there should be a 1-way sync from spec repo to WPT repo.

| SA | A | N | F | SF |
| --- | --- | --- | --- | --- |
| 0 | 0 | 6 | 13 | 0 |

Clarification, this isn’t engine specific, this is the Spec repo.

  • BN: There are already the tests that require the whole browser… some of the tests that are already there were landed because of the existing 2-way sync.
  • EH: Maybe we should import the tests into a subfolder into WPT..
  • BN: These tests fit the WPT model already.
  • BT: Part of the 1-way sync could also update the status file.
  • (discussion about the current wasm tests in WPT)

Action item: Brad to ask some Google folk to do a WPT 1-way sync, and / or come back to the group if they won’t.

  • JF: Should we discuss more about which 2-way sync variant we want?
  • BN: We can figure out the logistics of the two other types of 2-way sync. The fundamental property is that someone working on the engines can easily add tests to their repo. Maybe we’ll just start with 1-way sync and see how that goes.
  • LW: I don’t have a strong opinion; reviews rubber stamp them in some way - the only real difference seems to be: do I land them locally and in a shared place? Or land locally and wait for some automatic process to kick in?
  • BN: The idea is that you land locally and let the process propagate the tests.
  • BT: It does mean that you have to write the shared tests in the shared directory.
  • LW: We have shell tests that only run with SpiderMonkey, but if we had the two-way sync, I would be inclined to write them that way.
  • BN: It sounds like what you’re saying is that if we had the 2-way sync in place you’d use it.
  • JF: We’re only talking about JS and web tests that work this way right? Core things are different right? Aside from regressions.
  • LW: Core tests, I agree, are different - the wast tests should always go to the spec repo. I agree with the basic sentiment that if we have the two-way sync, I’d be more inclined to write tests in the shared folder.
  • BT: We already have work to do landing and migrating tests…
  • BN: Why do we have to convert the existing tests?
  • BT: We already have tests that could be there, but we’re not going to write them. In terms of where we are on compliance - we have all the parts, but no compliance suite. I don’t see why we need to solve this now.
  • LW: Part of Dan E’s work is to boot up a bunch of tests for the spec that he wrote.
  • JF: In that case what I’d like to ask is should we poll anything now? It’s something we should monitor how it goes, no need to make decisions right away
  • LW: We could kick the can...
  • LW: First step is boot up the WPT
  • LW: He would drop them in the spec repo, then...
  • (Discussion about landing in Spec vs. Shared WPT tests)
  • BN: For the tests that he’ll write, are they around the JS embedding?
  • LW: Not wast tests, just JS embedding.

Action item: Luke to work with Dan E on JS embedding tests, and report back in the next meeting.

Specialization Proposal

Brad Nelson

(Brad Presenting slides)

  • JF: What do you mean it’s visible to the user?
  • BN: I’ll mention in a future slide. Some users like Unity have mentioned being interested in dynamically linking modules.
  • JF: You mean when you have multiple modules, and you want to link them together?
  • BN: Yes.
  • (Presentation continues)
  • JF: The Chrome security team is worried about reserving too much memory?
  • LW: Even 100,000 should only be 1% of the space
  • EH: There’s 64k
  • LW: You’ll hit the virtual address cap from the OS
  • EH: Even 8G * 64,000…
  • LW: The total quantity of space allowed to the process.
  • EH: You only get 47, OS takes 1
  • (Address space discussion)
  • BN: 48 bits, OS takes 1, 33 bits for 8GiB, that’s 14 bits, so 16k memories is the total to fill the entire address space.
  • LW: I think the OS stops you before you get to that point
  • BN: You may need to collect since you run out of address space.
  • MT: How many do we expect to be used?
  • LW: 1000 is still a lot.
  • MT: As a non web embedding, we couldn’t use NaCl, because we need 84G, so it’s an actual production problem
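A quick back-of-the-envelope check of BN’s figures above (the 48-bit / 1-bit / 8 GiB numbers are the assumptions from the discussion, not spec’d values):

```javascript
// 48-bit virtual addresses, 1 bit taken by the OS -> 47 usable bits.
const usableBits = 48 - 1;
// An 8 GiB reservation per memory is 2^33 bytes.
const perMemoryBits = 33;
// Memories that fit before the whole address space is reserved:
const maxMemories = 2 ** (usableBits - perMemoryBits);
console.log(maxMemories); // → 16384, the ~16k figure from the discussion
```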
  • (presentation continues)
  • KM: We just defer until we know that we have the memory.
  • (Discussion about streaming other JS API)
  • MT: The motivation is… you can choose to do it in two steps, but when you have dynamic linking you’re hosed.
  • BN: When it’s one module it's fine, but with dynamic linking..
  • LW: This would be an alternative to promise on the import objects.
  • BN: What I wanted to figure out is which road we want to take
  • LW: (describing alternate proposal) When you import things, rather than getting an object you could get a promise instead.
  • KM: Is this what ES6 modules do?
  • LW: It’s a programmatic way of doing it
  • MM: The ES6 modules, the normal import -- there are no promises from one module to another. Promises from outside the loader... The import expression returns a promise. The things returned are not values but the locations themselves. You don’t run initialization code until you’ve loaded and linked the transitive closure...
  • SB: Modules do execute..
  • MM: Only after the transitive imports are linked. When you have cycles, if you touch the cycle before it’s initialized, you have the temporal deadzone issue…
  • BN: We either bring up all modules at once, or we separate the issue, of specialization, and decide when to bring the modules up, and disambiguate the two things
  • LW: I did like imports w/ promises, I think it solves a real problem. I don’t like specialization because we’re putting state on modules.
  • SB: Imports with promises? What does it mean? How does it know what kind of memory it has?
  • BN: (presentation continues)
  • JF: Does this integrate with ES6?
  • BN: It does not; the idea is that modules will have the ability to be specialized to one set of imports. (Continues talking about specialization.)
  • JF: Is it a subset of imports? I promise this set of imports - will they be the same, or will it be something that looks like it?
  • BN: I think it’s more interesting to require an exact match. It lets you burn in addresses, etc.
  • KM: What’s the difference between this? And instantiate?
  • BT: It is the same thing, but moves the thing one step earlier.
  • SB: It’s not the same thing..
  • BT: This information will be available to you earlier in the process
  • (Discussion about instantiation, imports)
  • BN: You’re committing yourself to a subset of the imports.
  • MT: The memory is the main thing, you can precompile
  • BT: The other imports could be intrinsic functions on another worker… (I just made this up!)
  • JF: Do they have to be exactly same or not?
  • BT: Don’t know
  • BN: If we know that it’s the exact memory, then we have more guarantees
  • SB: Why can’t we just have an instantiate module that can’t give you the module back?
  • BN: Then we have to solve the problem of how to hook these together.
  • LW: The only problem that’s being solved is how do we work out the cycles - either this or promised with imports
  • KM: I don’t see how this solves this problem -- I can have my own implicit contract that I give you a fast memory. This just makes it so you can throw later if the memory is different.
  • BN: What you can’t do now is if you have several shared memories, to do each of those memories in parallel..
  • (Brad explains on the board)
  • JF: What are we solving? I don’t understand what’s driving it?
  • BN: If Unity chops their engines into pieces..
  • JF: If we spec ES6 modules? Doesn’t that solve the problem?
  • BN: I’m not confident that ES6 modules provides us a way to have emscripten surface...
  • BT: Also doesn’t do anything for fast memory handling
  • JF: I’m worried that we’ll end up spec’ing two different ways… this way works for memory, but maybe not much else. The way we have now forces you to serialize.
  • KM: You can do it in parallel in JS... (explains)
  • (discussion about whether this is possible in JSC)
  • MT: Why is compiling in parallel going to be faster?
  • BN: Can’t even start downloading…
  • LW: We need something new here- ES6 module integration is a solution, but we need something more - I like imports with promises because it solves a more general problem
  • BN: One disconcerting thing is that… up front compilation was assumed… guess JSC and Chakra considered otherwise
  • LW: I was thinking this too
  • BM: This is a fairly general mechanism of early binding at compile time - that sounds a nice general mechanism
  • BT: It’s specialization, it gives more information to give to the module at compile time
  • BN: Concern with the promise-based one is that it’s forcing us to... We had the same problem when it came to IDB serialization - the problem is what do you actually store? What would it mean to serialize modules that have been specialized?
  • Serialize Everything!!
  • (More discussion around compile/instantiate)
  • BT: Wouldn't you have to do work to resolve these promises? It would be serialized? Promises express dependencies - these have to be resolved before we start compiling
  • (discussion designing import with promises)
  • (BN talking through on the white board, Discussion about promises, cycles..)
A = fetch('a.wasm')
B = fetch('b.wasm')
C = fetch('c.wasm')
One API function: instantiateStreaming({a: A, b: B, c: C}, {exports})

Returns a POJO with module/instance pairs?
  • (discussion about graph topology of all-in-one download/compile/instantiate function)
  • MT: We want to cache reusable libraries, do we always have knowledge of the graph?
  • LW: Unity will be a DAG
  • MT: DAG or not, but does this model not work if we don’t know the entire graph up front?
  • LW: The unit you’re caching is a single module?
  • BT: Are partial results from the function useful? If so, then knowing what the dependencies are can make the engine schedule usable results sooner.
  • (Discussion about value? What should we pursue?)
  • MT: Can we learn from Unity about what they’re trying to solve? What do we need to provide right now?
  • BN: Right now they could do something that would be fast in Chrome + FF but not Chakra+JSC, because they could do two-step compilation.
  • BS: Presenting the API like this makes it harder to understand; explaining actual usage and clarifying which things are parallel etc. makes it easier to understand.
  • BN: This is putting more work on the engine.. Good in the sense of flexibility
  • BT: Advantage of ES6 modules, is the engine sees the dependency graph
  • BN: With ES6, you have the dependency graph after you’ve fetched all of the modules.
  • MM: The existing system -- the dependency graph is visible to the host component (the loader). The high-level loader API, has a registry that it manages, within the engine… instead there’s a low-level API where the dependencies are visible. Right now, the only thing visible is -- if you take an import action from outside, somehow the transitive closure of the imports is resolved by some host loading service, resolves module names to modules, then when everything is loaded + linked, initialization code is run.
  • JF: We’re discussing implementation specific details.. Not productive right away
  • MM: A lot of the issues that we’ve wrestled with are the same issues wrestled with on the loader API of ES6.
  • BN: Imagine this scenario -- one module is the core of Photoshop, and the others are plugins, how would that work?
  • (BN & LW taking it offline)
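The dependency-graph discussion above can be made concrete with a small sketch: given each module’s imports, an engine (or loader) that sees the whole graph up front can order instantiation so every dependency is ready first. The function name and the module graph below are illustrative only, not a proposed API.

```javascript
// Topological ordering of a module dependency graph (DAG assumed, as in
// the Unity example from the discussion; cycles would need extra handling).
function instantiationOrder(deps) {
  const order = [];
  const visited = new Set();
  function visit(name) {
    if (visited.has(name)) return;
    visited.add(name);
    for (const dep of deps[name] || []) visit(dep); // dependencies first
    order.push(name);
  }
  Object.keys(deps).forEach(visit);
  return order;
}

// Example: c imports nothing, b imports c, a imports b and c.
console.log(instantiationOrder({ a: ["b", "c"], b: ["c"], c: [] }));
// → [ 'c', 'b', 'a' ]
```

With promise-valued imports instead, the same ordering falls out implicitly: each module’s instantiation promise resolves only after the promises for its imports do.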

Action item: Brad to explore instantiateStreaming + promise imports more, and talk to module loader API folks (Domenic?). Mark Miller to help Brad.

  1. Discussion
     * Is this a good approach?
     * Are there better approaches?
     * Is this satisfying in terms of where we want to go with threads?
     * Exploration of how existing implementations confront this problem domain.
  1. POLL: A repo for this proposal should be created and this proposal made more formal.
  1. POLL: This proposal should attempt to solve both limited "fast" memories, and address space being a separate resource.

No polls taken

CSP + WebAssembly

Brad Nelson

  1. [Proposal](https://github.com/flagxor/spec/blob/csp/proposals/CSP.md)
  1. Discussion: Which parts of this proposal have consensus?
  1. [Safari behavior](https://github.com/WebAssembly/design/issues/1092#issuecomment-311869143)
  • BN: (describing CSP)
  • BN: Chrome locks down some parts of wasm when using the most conservative CSP (default)
  • JF: There’s the fact that existing websites restrict by using CSP, and these websites might not change - maybe the right thing to do is to grandfather them in to not get wasm? Or allow new websites a knob to turn on wasm.
  • BN: When folks said they didn’t want eval, they didn’t say disallow eval. The default behavior is no eval, and you have to opt-in. Because of the different views of the browsers, we have differing API methods that are allowed by default.
  • (Presenting differences between browsers, and what APIs we block)
  • MT: Also missing is deserialization.
  • BN: The ambiguity is that they use script tags as the underlying mechanism - but it’s not clear what this should look like for wasm. It’s bad that we’re blocking different things based on browsers. We also behave differently when things are blocked - Chrome: eval error; FF: disallow scripts, and you don’t get a chance to catch errors.
  • MM: Where does execution continue after the error?
  • BN: No idea, I just dug into this last night. As a first step, we should homogenize behavior around it - we should be conservative about what we allow because people that have a CSP consciously opted in to more safety. Hope is that we can make wasm fit into this picture - I hope what we agree for the default makes sense in the future too. Things like memories/tables we have a strict policy - with origin bound policies..
  • LW: I was curious -- for compileStreaming, that would be origin bound, but only if it came from a fetch. Do we discriminate whether it came from a fetch?
  • BN: Under certain circumstances we would allow streaming - to give a very specific example
  • (Draws on board, discussion about what needs to be done for scripts/defaults)
  • LW: Same behavior that we have for script tags?
  • BN: Be able to choose for script tags / fetches...
  • MM: There’s a widespread notion that the thing that makes eval dangerous is that it runs code. What actually makes eval dangerous is what the code has access to, not just that it’s code.
  • BN: I’ve had conversations about this. In the limit, you could have a JS interpreter inside your website, so you could do anything with data.
  • JF: To that point, right now wasm only has access to what it gets through its imports
  • MM: How?
  • JF: Handwave...
  • BN: But it would probably be explicit.
  • JF: We haven’t finalized what it should look like, when we agree on CSP, someone could have a policy, where we change the security guarantee from underneath
  • MM: Even without CSP, that evolution of wasm can break their security assumptions. If you give wasm new features that aren’t in the explicit groups, then you can break their security assumptions.
  • JF: We’ve had discussion with the WebGPU team, they’re interested in having direct access, doesn’t matter as a part of imports or now
  • BN: The next topic brings some of this up.
  • JF: We don’t want to taint whatever we standardize on -- this must be the behavior. I updated the doc w/ info about what Safari does. Compile doesn’t actually compile anything.
  • BN: That’s a property of the engine..
  • MH: CSP doesn’t protect you against the engine
  • BN: We should discuss compile in/out, but w.r.t eval - folks that are using CSP, they are using it as an additional security - as a bag of tools, and not a brick wall
  • JF: Mark’s point is important -- it’s not important that it is turing complete, it’s that you are only allowed to access what was given explicitly.
  • LW: NewFunction is disallowed?
  • MM: new Function is just as dangerous.
  • BN: They’re not trying to mitigate just from JS, by being able to assert static dependencies - protects from server side vulnerabilities
  • BN: Imagine you have some server-side vulnerability from PHP, … you can have a PHP injection allowing JS injection. In any event, I agree with some reservations that CSP may not be the correct security layer. We still need to decide how does describing all of the subresources reconcile with wasm. Down the line, a bunch of libraries will start using wasm. It would be unfortunate if all users of CSP now have to take additional steps to use wasm.
  • LW: Comment from framework people: CSP is applied by whoever owns the site, so framework folks have to deal with all the policies from different owners. For wasm, are there good useful patterns? Layer-one compression, delta patching... If we rule these cases out, we’ve changed the characteristics of what we can run.
  • BN: I share your worry that if we rule out these cases that’s bad.
  • MM: I want to mention that, there is a proposal before TC39, frozen realms. It introduces eval etc. to not cause effects on the surrounding environment. This question that JF is raising -- do we want to eventually cause effects that are not from granted capabilities -- that is the correct decision to answer the same question.
  • BN: Problem with eval is the ability to do execution of generated code- the dynamism with jit is exactly what they’re trying to prevent
  • BN: It’s a different philosophy, if you just exclude clever JIT tricks, it makes sense for regular JS, but you can see how they get to disallowing it
  • BT: Luke made the point of doing your own layer 1 compression. You could ask the CSP to grant additional capabilities, and otherwise fall back.
  • JF: Frameworks have to do the lowest common denominator, or frameworks can do the best thing, and the next best etc..
  • BN: When you exclude the dynamism, it also excludes wasm - so we have to figure out what we should do.
  • BN: Why does Apple also use CSP?
  • JF: Here are our reasons. Compile doesn’t do anything, just validate. Instantiate does the real work.
  • MM: When you say a module? Do you mean a compiled module? Does the module contain bound values?
  • JF: Needs to be instantiated for anything to happen
  • BN: The reason this isn’t an issue. If you have a policy that restricts to an origin, then anything within that origin is OK. Workers have to have the same policy (* JF: are you 100% sure of that?) 90% :-)
  • MT: To strengthen the reason to block compile, compile might provide executable code that’s a landing pad for potential exploits
  • BM: One of the problem with eval: there’s a style of attack where after some information disclosure, that you could craft an attack against that platform. You have more capability...
  • MM: What buttons/knobs can you push if you don’t have access?
  • BN: If you have a partial exploit somewhere else…
  • MH: But that’s not what CSP is for, right? Otherwise you could restrict regex.
  • JF: Browser exploits or XSS?
  • MH: For compile I think it’s browser attacks
  • KM: If you could trick someone to compile wasm, knowing there is an exploit, you could take advantage of the system.
  • BM: (describing a JIT spray(?) exploit)
  • MM: You are worried about insecurities in the platform?
  • BM: I’m saying that the more control the attacking program has, on the contents of the address space of the browser, the more leverage there is to….
  • MM: I take that attack vector seriously, ex - rowhammer. If there’s any page in the same address space, visiting a site that can download code that can spray memory, then everything in that address space, or everything in that user’s account is vulnerable
  • BT: It basically is a mitigation
  • BN: We’re trying to get iframes out of process in chrome. That would make it more likely that you only have your code in your process.
  • BT: Knowing all the code that exists, if there was a wasm module with a vulnerability, and you allow eval that lets it leak into your code..
  • (Discussion about defaults, interop - BN, BT)
  • BN: Skipping ahead, wasm-eval turns on everything. The concern is that most folks don’t have wasm-eval.
  • BT: If you disallow memory on, we could define directives with granularities that make sense
  • (BN running through policy options)
  • LW: We’re bikeshedding attack vectors
  • BN: There’s an ambiguity.. You know all the JS that went into your site, if you call *.Memory, it should be from code on the site, which should be fine.
  • BN: Can we do an informal poll?
  • JF: Do we have enough information? Attack vectors are shifting
  • MM: There’s a different question first… can we take a poll on that? Guarantee that we don’t add features that allow you to access features outside of the module without explicit access.
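For reference alongside JF’s earlier point that in Safari compile only validates: the two-step split itself is part of the existing JS API, where compilation produces a `WebAssembly.Module` without running anything and instantiation is what binds imports. The 8 bytes below are the header of the smallest valid (empty) module.

```javascript
// "\0asm" magic + version 1, little-endian: an empty but valid module.
const bytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

console.log(WebAssembly.validate(bytes)); // → true

const module = new WebAssembly.Module(bytes);      // synchronous compile step
const instance = new WebAssembly.Instance(module); // synchronous instantiate step
console.log(instance instanceof WebAssembly.Instance); // → true
```

Which of these two steps a CSP policy blocks (and whether an engine defers real code generation to instantiation) is exactly the interop question under discussion.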

POLL: WebAssembly instances must never be able to cause effects other than by wielding explicitly granted access (e.g. the importObject in a JS embedding).

SA A N F SF
0 0 6 9 7

Action item: Mark clarify what this poll is getting at, add to design repo’s “security.md” document, etc.

  • MH: We should find out if the CSP folks have a working group? We should talk to them so it’s not just Chrome’s security team deciding things
  • LW: Talk to CSP people, and wait for Mark to talk to security at Chrome - then come back and poll at the next meeting.
  • BN: Polling JF, LW to see what FF & Apple’s security teams think
  • BN: Should we spin up new branches of spec repos to explore?
  • (Discussion about CSP origin policies, mime types)
  • JF: Put it in the spec repo? Branch off the branch?
  • BN: I want an issue tracker - move discussion to github

Action Item: Brad to discuss with his security folks as well as webappsec group to better understand the attack surface, use cases for frameworks, etc. Report at the next meeting.

Action item: JF to create a repo and tracking issue for CSP, with Brad championing.

  • BN trying to poll eval directives - pushback regarding more polls.

  • BN: We will probably implement this because folks want to implement this, and can’t (wasm-eval directive)
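As a sketch of how the proposed directive would compose with existing CSP source expressions (the `'wasm-eval'` keyword is from the proposal under discussion, not a shipped standard at the time of the meeting):

```
Content-Security-Policy: script-src 'self' 'wasm-eval'
```

A page served with this header would keep eval of JS disallowed while opting in to the wasm compilation APIs.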

  • (BN, MM clarifying discussion about clarifying goals of transitive properties)

  • POLL: We should adopt the 'Proposed Homogenization of Existing Behavior'.

  • POLL: We should adopt the 'Proposed 'wasm-eval' Directive'.

  • POLL: We should adopt the 'Proposed Origin Bound Permission'.

Polls not taken.

Updates on Threads proposal

Ben Smith

Proposal Repo

  • (BS presenting a threads update)

  • BS: Trying to figure out what information is relevant to people here

  • BN: Can you put a number on the phases?

  • BS: Somewhere between 2 & 4 right now, closer to 3+ later in the year

  • BN: Is there work on other browsers?

  • MH: We’re working on it - sometime next year?

  • BN: Apple?

  • JF & Others: (Silence)

  • DS: Punts out to wait & wake, tested the emscripten tests almost all pass on Chrome

  • LW: Jukka has a working patch - we’ll test it soon

  • LW: Are globals a part of threads MVP? If no one is pressing on it, should we care?

  • JF: we want to make sure that we’re not permanently stuck with a bad thing

  • DS: I believe we want globals, because workaround is bad

  • LW: It’s doable, but not trivial - so we just wanted to know

  • BS: Additional proposal that came up last time about initialization - turning that off and on

  • JF: Dynamic linking?

  • BS: Being able to say turn off initialization on this thread - not really about dynamic linking

  • BN: Is this something we want to poll now? Or wait?

  • Let’s put it in now and see if anyone objects

  • BS: Any other updates?

  • BS: What happens when we have more than 32 bits? If it’s left as a 32-bit value, what happens if the value is bigger than that? Also what about greater than 64 bits?

  • LW: Is that even possible? That’s larger than the word length?

  • MH: He wants to know what the overflow semantics should be?

  • BS: There is also a proposal for asyncWait

  • LW: we can trap on too many waiters

  • (poll happens: trapping on too many waits enqueued)

  • BS: we probably want to coordinate this with EcmaScript spec

  • BS: next poll: should we rename ixx.wait/wake to ixx.atomic.wait/atomic.wake?

  • BS: in JS it’s Atomics.wait and Atomics.wake
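For reference, the JS side named here is the real Atomics API. The sketch below uses the fact that `Atomics.wait` compares the expected value before suspending, so a mismatch returns immediately and no second thread is needed (note: blocking the main thread this way is allowed in Node/shells but not in browser main threads):

```javascript
// A 4-byte shared buffer; i32[0] starts at 0.
const i32 = new Int32Array(new SharedArrayBuffer(4));
// Expected value 1 does not match the actual 0, so this returns right away.
console.log(Atomics.wait(i32, 0, 1)); // → "not-equal"
```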

  • MM: We should make naming similar between Javascript and Wasm

  • BS: That should be a separate poll, we should change this because we can’t change JS

  • BT: The only place this occurs is in the text format, right?

  • JF: What is it in JS again?

  • LW: Do they all have the atomics prefix in the binary format?

  • BS: Yes.

  • LW: That seems like an argument for the prefix name

  • BN: Should it have been SAB.wait/wake?

  • BS: Makes sense in JS because it needs a namespace, in wasm, we’re not limited the same way

  • MM: What’s the downside in being more aligned with JavaScript names and saying Atomics?

  • BS: Saying Atomics. The only place this shows up is in the disassembly of the Wasm module, or when compiling from source

  • JF: Andreas’ point is that this is part of the atomics package. We get all the opcodes or none of them, at least until we get atomics2, but we aren’t going to call it atomics2.

  • BS: Atomic is not just to represent the prefix byte..

  • BT: Nuclear, instead of atomic

  • BS: The proposed poll is to rename from wait/wake to atomic.wait/atomic.wake

  • (poll commences)

  • BS: If we aren’t out of steam, let’s take a poll about aligning with javascript and calling them atomics.

  • BT: This is professional level bike shedding

  • DS: we’re even into trolling

  • (poll commences)

POLL: i64 for wake count and result (github discussion) instead of i32

SA A N F SF
2 2 11 4 0

Outcome is not really favoring one way or another. Slight preference for staying with i32. Could revisit.

POLL: trapping on too many waits enqueued

SA A N F SF
0 0 5 9 2

Want to coordinate with ECMAScript spec.

POLL: Rename ixx.wait/wake to ixx.atomic.wait/atomic.wake (github discussion)

SA A N F SF
0 4 7 6 0

Slight favor.

POLL: rename to “atomics” instead of “atomic” to match ECMAScript.

SA A N F SF
2 5 9 1 1

Two “for” votes are to increase consistency with ECMAScript.

Adjourn

Thursday

Find volunteers for note taking.

Updates on SIMD proposal

Brad Nelson

  1. [Proposal Repo](https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md)
  1. Presentation on Android ARM results (Google)
  1. Presentation on iOS results (Apple?)
  1. Items from last time to make sure we follow up on:
     1. "JZ to measure on other Android CPU ISAs."
     1. "JZ / BN to try contacting MIPS / POWER folks to perform measurements."
     1. "JF to measure on Apple hardware."
     1. "JZ / BN to gather another similar integer benchmark."
     1. "JZ / BN to come back with concrete proposal of what the narrowing/widening, min/max operations should look like."
  • JZ: Last time most everyone was here. We did a lossy version of WebP as a proof of concept.

To review: last time we had positive gains, more so on x86 than ARM. The follow-up items were to look at older systems. There was also a follow-up on why the 32-bit ARM intrinsics gain wasn't as large as expected.

Testing was the same. Kept everything in order.

S5 is interesting because the US version uses a Snapdragon, but the European one uses a Samsung.

There are some Chromebooks, they have a 5 year lifetime, these are some of the popular ones. They're Celeron and Atom. They have bad characteristics for video.

We took the best time we had on native intrinsics, and compared to the port (so not exact).

Our native intrinsics SSE2, we're getting 2x. And portable we're getting about 1.5x. The performance is still in range.

  • DG: Why is the sse4 version slower?
  • JZ: Likely the C code picked up some SSE4.
  • It's the same binary with assembly turned off and on.
  • That might be a little confusing.
  • JZ: This is the 2013 chipset (Toshiba Chromebook 2)
  • This is in the same realm. The native is better, but the portable intrinsics are the same as last time.
  • LW: 2x means it runs in half the time?
  • JZ: Yes
  • One other thing was why was armv7 so slow. (Pixel smartphone).
  • We should see a bigger gap, which we do.
  • This is an S5, one is the US version, the other is the European version with a Samsung.
  • JF: 5% is really bad.
  • JZ: Yeah it is.
  • The Qualcomm will throttle, so actually the Samsung ends up being faster for video.
  • KM: Is this a fundamental limitation?
  • JZ: This is a limit.
  • I'll go into more detail. When you add the saturating add and subtract it may have trouble pattern matching when expressions are deep.
  • JZ: Another candidate instruction is multiply high.
  • For those unfamiliar this multiplies two 16-bit values and keeps the high half.
  • Codecs exploit this to get more throughput.
  • Usually one vector is all constants.
  • This is tricky from an implementation perspective, as ARM's version does a doubling multiply.
  • So we'd have to limit it to 15 bits.
  • JF: So you'd have to standardize it multiplying by a 15-bit constant.
  • BN: Yes.
  • JF: Is it ever used without a constant?
  • JZ: I don't have anywhere where it is.
  • RW: This is a very specific workload; we'd have to go back and look.
  • JF: I wouldn't want to standardize a thing just for video codecs.
  • JF: I'd like to see a breakdown of what you think could be standardized.
  • Constant or whatever.
  • BN: This demos the constant version win.
  • JF: Yes, but I want to see how this would work in a clear proposal. And I want to see if Intel has traces showing if this is use in the variable way.
  • RW: We could take a look at that.
  • BN: So to be clear you'd look to see what percentage of the time the constant vs the variable data is used.
  • RW: I have some data, but the traces might not show it. If you want a wider set it'll be a static set.

Action item: James to map out the possible opcodes we could standardize for mulhi (constant or variable right-hand side, how to handle precision).

Action item: Intel folks to see in their traces how the instructions are used (variable or constants as inputs).
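The "multiply high" operation JZ describes (multiply 16-bit lanes, keep the high half of the 32-bit product) can be sketched per lane. The function name and two-lane example below are illustrative only, not proposed opcodes:

```javascript
// Per-lane multiply-high: multiply two signed 16-bit lanes and keep the
// high 16 bits of the exact 32-bit product. (A full SIMD op would do 8
// lanes; two lanes suffice to show the semantics.)
function mulHigh16(a, b) {
  return a.map((x, i) => ((x * b[i]) >> 16) & 0xffff);
}

// 0x4000 * 0x4000 = 0x1000_0000, whose high half is 0x1000 = 4096;
// 100 * 300 = 30000 fits entirely in the low half, so its high half is 0.
console.log(mulHigh16([0x4000, 100], [0x4000, 300])); // → [ 4096, 0 ]
```

The ARM complication mentioned above is that its native form (a doubling multiply-high) effectively shifts the product left by one first, which is why a portable op might be limited to 15-bit operands.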

  • KM: Can we tell what percentage will be dynamic?
  • RW: The problem is with loops we won’t know how many times it will be executed.
  • JF: The traces are usually used for stuff that was hot
  • RW: Right. If it’s in a loop, I’ll count it.
  • JZ: Yeah I was just presenting this to see what the gain is like. We see some gains.
  • JF: You used the portable intrinsics and hard coded the mulhi.
  • JZ: Yeah, we can just turn it on in here. In practice mulhi is a useful instruction for video codecs.
  • JF: The one switch to use wasn't in the README
  • JZ: I can put it in there.

Action item: James to post slides and update README with instructions on how to enable mulhi.

  • Slides

  • README

  • JZ: I can round out a few of these.

  • RP: The newer toolchain improves ARM.

  • JZ: Yes that seems to be true.

  • JF: Could you post those numbers?

  • JZ: No, we had discussed this offline. I can add it to the appendix.

  • PJ: For portable intrinsics, do you have a C-style API? You assume this will be similar to what we’ll get with wasm?

  • JZ: I hope we can do better. Because we can extend it. Minimally I would hope we can match that.

  • JF: What about MIPS and power?

  • BN: Haven’t heard much from either yet.

Action item: James to post numbers from Vincent, measured with ARM A53 / A72.

TODO for ARM to clarify

Action item: (not done from last meeting) JZ / BN to try contacting MIPS / POWER folks to perform measurements.

  • JZ: There was another item on saturating ops.
  • JF: I was going to try this on our hardware. And I tried this on our hardware.
  • I ran it on the chips we make. The speedups were in the ballpark.
  • BN: Were you comfortable saying the numbers.
  • JF: Nope.
  • JZ: The devices were x86 or just arm?
  • JF: On my iMac and did it on phones.
  • JZ: The age of the devices?
  • JF: I measured the last 4 devices we launched.

Action item: (not done from last meeting) JZ / BN to gather another similar integer benchmark.

Action item: (not done from last meeting) JZ / BN to come back with concrete proposal of what the narrowing/widening, min/max operations should look like.

Action item: JZ Add slides to the notes.

Slides

Action item: Brad to make forward progress on the tools and come back with numbers in-browser. Microsoft and Mozilla to measure as well.

  • AR: Can you hear me?
  • JF: Yes, very well.
  • AR: I also have an idea on the tail call discussion.
  • I have a couple slides on that as well.

Tail-call follow-up

Slides

(Andreas presenting)

  • AR: I don't have anything on the cross module thing, but I want to solve Microsoft's problem. I realized this morning that the property we care about is that if you have a chain of tail calls, they don’t grow the stack in an unbounded fashion. That’s the only thing we care about. That does not imply that no tail call may ever grow the stack; it’s fine if individual tail calls grow it, as long as you can guarantee that the growth is bounded. The difference is not observable. If some tail calls turn into normal calls and grow the stack, you can amortize this number over the length of the call chain. So what that means is that we can use this as an implementation strategy that is simple. We can take a tail call and translate it to:
```
if |call-frame-on-stack| >= |call-frame($f)|
  tail_call $f  ;; reuse frame
else
  call $f       ;; new frame
```

If you do this dynamic test, you can optimize away in most cases. You only have to do that branch in the critical case where you might have to grow. The important idea is that you have to figure out how large the thing is on the stack. If it’s large enough, you just reuse it. Because the maximum number of parameters is bounded, then the growth is bounded.

(example with mutually recursive functions)
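
The bounded-growth argument above can be sketched as a small simulation (illustrative only; `TailCall`, `run`, and the frame sizes are invented for this sketch, not part of any proposal): a dispatcher reuses the current frame when the callee's frame fits, and allocates a new frame only when it does not. Because frame sizes are bounded, growth events are bounded too, so an arbitrarily long tail-call chain uses constant stack depth.

```python
from collections import namedtuple

# A function signals a tail call by returning this instead of recursing.
TailCall = namedtuple("TailCall", "fn args frame_size")

def run(fn, args, frame_size):
    """Drive a tail-call chain, reusing the top frame when the callee fits."""
    depth = 1            # number of live frames
    top = frame_size     # size of the reusable top frame
    r = fn(*args)
    while isinstance(r, TailCall):
        if r.frame_size > top:   # callee doesn't fit: grow (rare, bounded)
            depth += 1
            top = r.frame_size
        # else: reuse the existing frame, no growth
        r = r.fn(*r.args)
    return r, depth

# Mutually recursive even/odd with different frame sizes: the chain is
# ~10000 tail calls long, but depth stops growing after the first bump.
def even(n): return True if n == 0 else TailCall(odd, (n - 1,), 2)
def odd(n):  return False if n == 0 else TailCall(even, (n - 1,), 1)
```

Running `run(even, (10000,), 1)` performs ten thousand chained calls yet the frame count stays at 2, which is the property the proposal relies on: individual tail calls may grow the stack, but only a bounded number of times.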

  • JF: When you say this, it’s all in the VM implementation. It’s not imposed on the producers.
  • MM: question: one wasm instance, through opaque function table, call through to another wasm instance. What about tail calls there?
  • AR: Should be possible, but it’s potentially orthogonal.
  • BT: It is possible to do this, need an adapter frame to do this, but can still be constant space.
  • MM: On the normative issue: normative tailcalling has asymptotic property? (Yes) good.
  • AR: You have problem that you can’t break up arbitrary program into separate parts without changing program otherwise.
  • MM: I raise the issue because I would prefer it's non-observable for both. It is observable through the debugging interface. The error stack isn't normative, but...
  • AR: We have discussed that, and concluded that it isn’t an issue we care about. As a programmer, it’s not to trace call history but for call continuation
  • MM: What I'm suggesting is that we hide the non-tail call.
  • AR: Ideally the caller can mark it's metadata to track this. There should be ways to hide the call frame.
  • MH: I'm glad you provided this. We have these call flags that we can put on each call to capture which frames are real.
  • LW: What do you store on your call frame?
  • AR: Will this implementation strategy work for you guys?
  • MH: Yes, this is what we had been hoping.
  • AR: You do have to consider the dynamic size of the call frame.
  • This kind of assumes you have a maximum size. You might dynamically create more functions.
  • I would argue this is fine, because even if you did this with full tail call eventually you'd run out of space.
  • KM: You're gonna run out of stack space a lot faster.
  • MH: I think the max parameter count is 1000 in limits.h
  • BS: I have a question about this. Last time I seem to recall us discussing this exact scenario. Why wasn't this OK then?
  • MH: Yes, it did come up we wanted to go and see whether it would work. And we think it does. :-)
  • JF: To confirm, wasm to wasm calls are potentially orthogonal; are you proposing we allow tail calls wasm to wasm?
  • AR: Yes, given implementations can do it correctly
  • JF: I think I have a todo, make this a tail call.
  • AR: Host functions you can't tell the difference, but this is kind of outside the space of webassembly.
  • JF: Just want to make sure we’re not considering tail call to embedder. Whatever you put in your import object could be embedder or not, and you can’t know. It’s weird, we don’t want to support tail call to embedder… do we want it to be a failure?
  • AR: It would transparently turn into a normal call.
  • LW: Multiple modules could do a different thing.
  • MM: It would be unfortunate if wasm to wasm calls are fine, but calling out is not.
  • MH: You may need to marshal values.
  • (discussion about how tail calling to embedder might work in JSC)
  • AR: I don’t think you’d want to trap, and I don’t think it would break the semantics to change later. Providing tail call would just strengthen the semantics.
  • DG: Another option is to make it a validation error by putting the tailcall in the signature.
  • JF: That's a neat idea. I like that.
  • AR: You'd have to track it in the type. It's a slippery slope once you go down that path. I don't think this is important. And JavaScript doesn't have tail call.
  • JF & KM: Yes it does :-)
  • LW: Having this might allow a more efficient calling convention.
  • DG: In native calling conventions this frees up an extra register.
  • BS: Could we leave out tail call to the embedder?
  • LW: It might matter even for normal calls. It would be nice if this feature had no negative impact on normal calls. You could end up using the Windows tools, etc.
  • JF: What are next steps?
  • BN: Extra bit on imports would be useful?
  • MH: Yes, would be useful. Wouldn’t have to push around the frame size all the time.
  • BN: Andreas, what do you think about having it in the signature?
  • AR: I’d like to avoid it, if possible. Opens up corner cases. Is it in the proposal?
  • JF: What are you worried about?
  • AR: Adds complication. Not relevant to wasm semantics. Whatever information you add to type system, you have to propagate it everywhere. What happens when you have a mismatch? Where it is allowed?
  • BT: Can you just add it to the function + tables? E.g. things in this table can be tail called?
  • BN: We’d need multiple tables for that.
  • AR: What does that mean? What if you put a host function in there? Only tail call functions?
  • BT: That’s one possibility, or you can require implementations to use wrappers
  • JF: We don’t have to explore all right now.
  • DG: Already have issue on tailcall repo about this.
  • DS: Do we have a language we can try for this? LLVM has a must tailcall bit, we could try that.
  • BN: Now that we’ve agreed, should we be exploring implementations? We need a payload to try this on

Action item: Andreas to explore making “tailcall” part of function signatures, and document why he thinks this has bad implications. Michael / Luke to synchronize on why it would be useful to them.

Multi-value

Andreas Rossberg

1. [Slides](https://docs.google.com/presentation/d/1-c49oY5au_beHpQWAtOciABPWKUrKz3t8YIWFHZKzMU/)
        1. [Proposal Repo](https://github.com/WebAssembly/multi-value)
        1. POLL: Should this enter the [Implementation Phase](https://github.com/WebAssembly/meetings/blob/master/process/phases.md#3-implementation-phase-community--working-group)?
  • AR: In MVP we had arbitrary restriction on functions, instructions, blocks. Proposal is to make it general in all cases. The state of the proposal: spec is complete, text and formal. Reference interpreter and test suite. Passes in v8 implementation as well, mostly. For v8: multiple instructions + blocks works. Block parameters work; simplified the code in some cases, everything is more regular; touched ~140 lines of code. Multiple return values for functions, as long as in registers. Mostly already worked in v8. Had to implement 64-bit lowering for 32-bit architectures. Only implementation work left is multiple function returns on stack. More work since we have many architectures, and turbofan doesn’t currently handle this conveniently.
  • All the tests that don’t use too many returns already work in v8. I think this meets requirements for stage 3.
  • BN: (referencing phase 3 requirements)
  • AR: Did not implement, because we took a poll and decided not to: multiple returns for embedder API. There was no enthusiasm about this last time. We discussed adding pick last time, but we decided against so I didn’t implement.
  • JF: We had some action items from yesterday to investigate pick.
  • DG: what specifically changes?
  • AR: Just the type signature -- so for instructions, call, block (via br) can return multiple values.
  • (discussion about select)
  • AR: You can now polyfill instructions w/ multiple returns by turning into a function, e.g. swap or dup.
  • BT: They wouldn’t be polymorphic though. (* AR: right)
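
The polyfill idea mentioned above can be sketched in a few lines (an illustration, not the proposal's notation): with multi-value, a stack-manipulation instruction like `swap` becomes an ordinary function with two parameters and two results, something a single-result function could not express. As BT notes, such a function is monomorphic, unlike a true instruction.

```python
# A two-parameter, two-result function models a multi-value 'swap'
# instruction (for one fixed type; a real instruction would be polymorphic).
def swap(a, b):
    return b, a  # two results

stack = [10, 20]
stack[-2:] = swap(*stack[-2:])  # apply to the top two stack slots
```

After the call the top two slots are exchanged: `stack` is `[20, 10]`.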

POLL: Should multi-value enter the Implementation Phase, stage 3.

| SA | A | N | F | SF |
|----|---|---|----|----|
| 0 | 0 | 2 | 13 | 5 |

Action item: JF to update tracking issue and table.

Exception Handling proposal

Heejin Ahn

        1. [Proposal Repo](https://github.com/WebAssembly/exception-handling)
        1. [Exception handling scheme in toolchain](https://github.com/WebAssembly/tool-conventions/blob/master/EHScheme.md)
        1. Slides
        1. Discussion on the status of this proposal. Open issues in the design space.
        1. Presentation of tooling results and preliminary measurements.

  • HA: (presenting)
  • HA: Current status is asm.js style, future is zero-cost
  • (image of LLVM CFG)
  • BT: Are those indices declared in the type section?
  • KS: No, in the exception section. They include a list of types of values that are pushed on the stack when the exception is caught.
  • HA: Exception tag is opaque dynamic token
  • HA: try starts code region, where exception goes to corresponding catch block
  • JF: Why can’t you access it? (all): you don’t know what it is. How would you typecheck?
  • KS: point of catch_all is that you don’t know anything about the exception
  • JF: You can do catch(...) in C++ and then use std::current_exception
  • LW: You wouldn’t use catch_all for that
  • JF: Is it impossible to look at the exception for catch_all?
  • KS: We decided it’s opaque in this case
  • AR: This was intentional
  • JF: I’m asking -- if we want to access values in catch_all, is it not possible?
  • AR: They’re supposed to be opaque
  • KS: JS object is not a tuple, how would you access any data?
  • EH: We’d have to add reflection to do this
  • JF: How do you lower other languages? SEH in Windows for catch(...)?
  • MH: You need a special compile flag for that
  • MM: It’s more than opacity, it’s also not first class. You can’t operate on the value
  • AR: This is about language interop? If so, catch your own exceptions, single constructor for all your own exceptions. Identify your exceptions some other way, not via tag.
  • MH: What about your own exception’s values?
  • KS: They get put on the stack
  • HA: (continues)
  • BT: Is rethrow necessary only because it is opaque?
  • KS: Yes
  • EH: Also, rethrow gives original stack trace, throw has new stack
  • MH: Why tag per language? Isn’t that inefficient?
  • DS: C++ has to catch all anyway
  • EH: OCaml may be able to use tags as represented here directly
  • DS: If you don’t rethrow, you lose all your state. It preserves stack trace, not linear memory.
  • LW: Devtools can offer break on throw.
  • AR: (resumption…)
  • DS: You could do two unwinds, first time to determine exception is caught, second time to call destructors.
  • MM: does stack trace show up normatively?
  • KM: all engines show stack traces, generally don’t have names
  • LW:They include the names if you provide a names section
  • HA: (continues)
  • HA: Itanium ABI assumes 2 pass exception handling
  • HA: wasm doesn’t have 2-phase; only cleanup phase
  • BT: In this example, we never come back to the frame after we’ve unwound?
  • HA: Only if we find a matching handler we do a cleanup
  • BT: Are you able to access state in the frames that haven’t been cleaned up?
  • LW: This makes languages that have resume work (Ada)
  • BT: The frames aren’t cleaned up, so things in the frames are accessible?
  • DS: The code that runs is the personality function, not user code. User code has destructors, so once it executes it can’t be recovered
  • BT: The webassembly style will be semantically invisible to the user?
  • DG: User code in this example is C++, and can’t observe the non-cleaned up frames
  • DS: Observable in a debugger, since 2-phase doesn’t clean up when you’ve caught
  • KS: Was built into spec, but can’t do it since we use a single tag for C++
  • DS: Normal C++ code only has to worry about C++ exceptions; if JS throws then it needs to unwind through C++ frames… you want to have the property that even if you don’t throw exceptions, a JS exception still cleans up your frames
  • KS: VM has enough information
  • DS: I don’t think so, VM doesn’t know anything about linear memory state. Can’t use just catch, need catch_all in every frame
  • BT: this is what catch all is for; we’re basically using catch_all to implement finally
  • DG: Every frame that calls imports and has destructors will need catch_all
  • KS: 2-phase is difficult
  • LW: Maybe we should have a nothrow
  • (discussion about nothrow/noexcept/LLVM unwind)
  • DS: If you could make noexcept into something in the spec you could then trap
  • JF: It’s interesting w/ noexcept/nothrow part of type system, gives upside of call_indirect vs invoke (like in LLVM). Means you can see that certain calls will never throw and it doesn’t act as an effective barrier to code motion for the VM
  • DG: If you had a guarantee that a trap happens, you wouldn’t have to invoke
  • LW: Today we don’t trap, so if we introduce new behavior it would have to be opt-in
  • (multiple conversations)
  • MH: Why is two phase so difficult? I know it’s easier to do 1 pass
  • KS: Exceptions are happening in two languages. If we implement C++ exceptions using a single tag, then catch happens at every level. If it wasn’t for that -- there’s nothing inherent that prevents multiple phase. We don’t really have a choice with the way we’re trying to fit it in
  • DS: It’s because you need RTTI -- difference between itanium and wasm…
  • MH: Can we provide that in wasm? Provide personality function?
  • DS: We could do that maybe
  • LW: Maybe as v2 w/ backward compatibility?
  • MH: My concern: too targeted at C++
  • LW: Only observable w/ resumable exception language or SEH (description of how to support SEH w/ linked list in linear memory)
  • KS: Part of the problem is that we have importing/exporting functions, JS in the middle might catch
  • DG: Can JS catch C++ exceptions?
  • BN: JS needs to so it can catch at the bottom, right?
  • MH: Or unhandled exception
  • (discussion about whether JS needs to be 1 pass)
  • BT: personality function allows you to run code before unwinding? I think you’re out of zero-cost in that case… there’s precedence for this w/ wasm, there’s way to implement using linear memory so let’s not incorporate into wasm directly
  • BM: primitive exception mechanism that knows how to catch tags. Everything else happens in user code
  • DS: In C++ you can’t avoid having user code run during unwind
  • LW: We could do either from catch_all
  • HA: (continues)
  • HA: Personality function decides whether to stop at frame and sets IP/registers with info. Wasm can’t do this.
  • LW: Is the code (calling personality function, etc.) in every catch landing pad? (yes)
  • HA: in current implementation it just generates one catch block. Generating more is hard
  • (continues)
  • HA: Problem 1 is in catch_all we need to duplicate block from specific catch blocks, but we don’t know what each specific action is
  • KS: In LLVM IR, there is no catch statement anymore
  • HA: pattern matching doesn’t work to figure out what the original catch block code is, because the blocks might have been optimized
  • (continues)
  • BN: To be clear, clang windows ran into similar problems w/ optimization breaking the structure of the blocks
  • LW: Are you saying it’s not possible…
  • DG: You need to duplicate the blocks, but you don’t know which parts to duplicate
  • DS: What the clang windows folks did is create new IR to handle this
  • LW: Can we do that too?
  • DS: We can’t use it directly because it’s different
  • JF: We’re talking about one problem: we need catch(...) to handle non C++ exceptions?
  • DS: Yes, but more. Same problem w/ cleanup even without C++ exception handlers
  • BM: Same problem w/ try finally in …
  • DS: Need to duplicate code, but we don’t know what to duplicate.
  • JF: Is it because duplicating is icky or that it is hard
  • DS: The latter.
  • All: It’s not just C++, foreign exceptions too
  • JF: Core of difficulty is, making non C++ exceptions propagate through is hard
  • LW: Even if we ignore JS, we have this problem. Because we have unstructured IR, we don’t have structure to handle exceptions properly
  • HA: (continues)
  • HA: Possible solution is to only use catch_all. Within the catch_all, there should be some way to query whether this is c++ exception or not
  • HA: Problem 2 is that rethrow must happen inside the catch block, but sometimes shared code is factored out and can cause rethrow to happen outside of the block.
  • LW: Can you have try/catch_all around the outermost scope and then rethrow from there?
  • JF: No one who cares about perf uses exceptions
  • DG: On Windows people use SEH and care about performance
  • MH: At least w/ SEH, the unwind is slow. If you have to keep doing catch/unwind, it won’t be fast
  • HA: On clang for Windows, they separate each catch block w/ special IR to prevent optimization. They needed to do this because the LSDA format requires it. They have barriers for each catch clause. If we have to do it for each case, we’ll have the same troubles. Doing so will decrease performance and increase code size.
  • LW: What about the non-throwing path?
  • DS: Just code size
  • LW: What about finally?
  • HA: We can have finally, but I’m not sure we can do cleanups in finally. (Why?) Because finally runs cleanup unconditionally; if you catch something you don’t need to run cleanup, but with finally it runs anyway.
  • MH: I would like to have finally
  • DS: We need try/catch/else
  • BT: You could have each catch_all run after each catch
  • HA / KS: Hard because it’s unstructured
  • HA: To do that we need to figure out what the common code is
  • BN: Can’t do it because the shared part isn’t structured
  • LW: Could do it if we broke things up into the parts, like SEH
  • BN: How bad is it for us to be a third variant
  • DS: It’s doable
  • HA: What is the benefit of doing all this work
  • LW: What’s the alternative
  • BN: There’s additional ideas in this presentation
  • HA: (continues) The difference between SEH and wasm is that they need to separate all catch types (e.g. int/float/std::exception). We only want to separate catch C++ from catch_all.
  • MH: The windows one has stronger requirements than what we need
  • LW: As long as the non-throwing pass is fast, it’s probably OK
  • BM: Seems like there are other cases where you want optimization barriers
  • BT: Why does it matter if the code moves?
  • KS: Because we need the structure of the CFG
  • HA: The trouble is that we need to maintain what the block does, not just the structure of the CFG
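
BT's earlier observation that catch_all is essentially being used to implement finally can be sketched like this (a minimal illustration of the pattern, with invented helper names; real wasm code would use catch_all and rethrow instructions, not Python exceptions):

```python
# Sketch: emulating 'finally' with a catch-all handler plus rethrow.
def with_cleanup(body, cleanup):
    try:
        result = body()
    except BaseException:  # catch_all: the exception stays opaque to us
        cleanup()          # run destructors / cleanup code
        raise              # rethrow the unexamined exception
    cleanup()              # normal exit runs the same cleanup
    return result

log = []
with_cleanup(lambda: log.append("work"), lambda: log.append("cleanup"))
try:
    with_cleanup(lambda: 1 / 0, lambda: log.append("cleanup2"))
except ZeroDivisionError:
    log.append("caught")
```

This mirrors the point made in the discussion: every frame with destructors that calls out needs such a catch-all wrapper so that even foreign (e.g. JS) exceptions run its cleanup before propagating.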

(break)

  • HA: would be useful to rethrow outside of catch block
  • JF: I would like to know if this is a difficulty with translating LLVM IR vs. a fundamental difficulty with exceptions
  • HA: Haven’t worked with other compilers than LLVM, so not sure
  • BN: It sounds like gcc gave up and used a vm instead to target wasm
  • JF: What about msvc? Maybe should talk to them
  • HA: I’m thinking it’s probably going to be an inherent problem. As long as the compiler uses control flow graphs and doesn’t carry a bunch of metadata, it will be difficult to reconstruct the catch blocks
  • JF: I think MSVC is an older style compiler, SSA is newer -- they have a switch for this, but they may have different ways of doing this.
  • MH: Pretty sure they use CFGs
  • LW: Since SEH is baked into windows, perhaps they have a more structured way to handle exception handling :-)
  • AR: How does binaryen do this?
  • DS: For asm.js? They wrap every invoke in a thunk out to JS
  • LW: Does that catch bottle up the exception and do it dynamically? (yes)
  • DS: It’s the nuclear option -- costs 10x. It’s not zero cost
  • BM: We’ve already had this problem w/ structured control flow and irreducible graph. Isn’t this a similar problem? To get back to a semantically equivalent program
  • HA: I agree, it’s a code gen problem. I’m trying to preserve the semantics, not original try/catch structure
  • BM: Trying to make relooper for exceptions?
  • DS: No
  • BN: Sort of
  • DS: Maybe
  • BT: Why don’t we change the spec more?
  • DS: That’s the third proposal
  • HA: (continues)
  • BN: Part of the assumption here is that there are limits to what is acceptable to add to wasm
  • AR: Keep in mind that some would be hard to keep resumption
  • BN: Do we want to keep it?
  • AR: It’s interesting because it is a much more powerful feature, allows for coroutines and lightweight threads and generators. Stack switching and such can be explained via exceptions w/ resumption
  • BN: So perhaps we should scope in the ability to resume (for future)
  • HA: No concrete proposal, maybe bring up in further meetings. We are having discussions in EH repo
  • LW: Other proposals in the deck?
  • HA: Problem 3 is that try and catch may not match, because the optimizer moved code around. A possible solution is to add more surrounding blocks and break to them. Add a new instruction to rethrow to outer labels to reduce code size
  • DS: This new instruction also allows you to rethrow from the outer block
  • JF: Not sure that this is a useful discussion for this CG
  • DG: Would goto make this problem easier?
  • DS: You need goto plus the ability to capture an exception object
  • DS: Probably should table goto discussion for now, maybe revisit in future meeting
  • JF: Do we need to add resumption to the proposal?
  • BN: No, but we should not spec something that would preclude it
  • JF: Any size/perf info about this?
  • HA: Some info, but mostly not ready yet. We don’t have a good application that uses C++ exceptions to test this. Tested w/ Bananabread but it doesn’t use exceptions, just cleanup code.
  • HA: In Bananabread, increased code size of functions with exceptions by 23%, overall 7%.
  • BN: Doesn’t seem that bad
  • KM: Before or after compression?
  • HA: rough estimate by counting instructions only
  • DS: In future, we can say -- if we implement this proposal we can save N%
  • JF: What to explore for next meeting?
  • BN: Hopefully have a proposal variant we can try
  • MT: No allergic reactions to proposals as shown?
  • MH: Don’t like exception object one…
  • KS: Our job to see if there is a clean way to do that
  • DG: Can we separate creation of exception object from throw?
  • KS: Thought about it, but nothing clear yet. Want something clean

Action item: Heejin to come back with a full implementation of the toolchain and VM, and report on resulting size and performance. Which implementation approach is taken is left up to Heejin. Derek mentions that they may explore a few alternatives and quantify their cost.

JavaScript Bindings for WebAssembly

Brad Nelson + Luke Wagner

        1. [Proposal](https://github.com/flagxor/spec/blob/jsdom/proposals/jsdom/Overview.md)
        1. Discussion: What do we like, what should change?
        1. POLL: We should start a jsdom fork of the spec.
  • BN: (presenting)
  • BN: General idea is to increase ergonomics of interacting w/ JS + DOM Also structured so we can optimize direct calls into web APIs. Incrementalism, we want a partially polyfillable strategy, so some can be implemented in JS
  • JF: bindings for non-web hosts is relevant?
  • BN: we will probably need to do some outreach
  • BN: allow WebIDL integration, not trying to express things that you can’t express in js
  • BN: Basic idea is to extend table to allow other types, not just anyfunc. If elements are placed in the table of the inappropriate type, then throw. Table 0 is still indirect functions. Multiple tables can be imported. Then add JS bindings section to map JS types into wasm types.
  • BN: from perspective of wasm program nothing changes
  • JF: What is specific to JS in the JS bindings section?
  • BN / LW: The name should be changed to host bindings
  • BN: Some may be specific to JS, we should take a closer look.
  • Two sections, one for imports, one for exports. Reference the import by function index. Select the style of out-call: function, new, or method. If someone passes in a JS function, you may call it bound with this, or w/ args, or as a constructor.
  • JF: This sounds super specific for JS
  • LW: May need to figure out which parts are JS-specific
  • KM: new seems very specific to JS
  • LW: Kind of similar to what the kernel gives you in some cases
  • BN: Haven’t been thinking much about generic host bindings yet
  • JF: Basically the VM will look at the binding section and use that as a template for how to generate the binding layer for wasm? Trying to understand how this works w/ C++
  • LW: Imagine that there is a tiny JS stub that takes the args and does this…
  • MT: Why does the embedder need to know this stuff?
  • BN: Makes it explicit so you know exactly what you have so you can optimize/validate etc.
  • BN: pass_thru/object_handle are necessary, others are nice to have
  • MM: JS strings are sequence of 16-bit values, sort of UTF16 allowed to be any arbitrary sequence, whether unicode or not.
  • BN: Need to be able to copy into linear memory w/ stability
  • LW: May be able to define w/ TextEncoder
  • MM: are the tables variable size?
  • LW: Yes, tables are growable.
  • BN: extra wrinkle w/ exports -- wasm wants to manage allocation, Luke suggested we can have a global that says here is a slot for the next parameter, store the param in the slot, then worry about allocation later.
  • JF: That would work across workers?
  • LW: Global would be thread local
  • BN: Return of exported objects, assumption is that return can be internal convention, use global to stash object to drop, then next time allocator is called you can handle
  • BN: It’s possible that you could do a polyfill, where you could do something approximately like the proposal to figure out its ergonomics
  • BN: need to have allocation function to be able to malloc space for values coming in
  • BN: Could have automatically generated headers w/ WebIDL
  • RW: How do these things get destroyed?
  • BN: You can imagine a way to clear elements
  • LW: I was imagining some functions to clear, copy and move values in tables
  • RW: So you grow, but you can’t shrink. But you can zero out.
  • BN: Yeah, similar to memory
  • JF: Table 0 is for call_indirect, than table 1?
  • BN: You would have multiple tables, one table per type. For example, JS engines could specialize if they know that a table contains all webgl types then they could skip over all the checks
  • JF: up to the embedder to decide which types make sense to support
  • LW: If you import currently, we don’t care if the JS function doesn’t match. We could have a special path for WebIDL so things can throw if they don’t match and provide a fast path if it matches
  • JF: tool chain now knows exactly what to generate because it knows all the types, but will there be other cases where we won’t?
  • LW: there can be more generic types that will have runtime checks to get the exact type
  • MM: From JS, if I’m calling into wasm function and I pass DOM object, the intention is that when it calls a method on the DOM object, that it calls the C++ and it ignores the JS properties… I’m worried about -- In JS, in my realm, all the different prototypes have wrappers, now any invocation invokes wrappers, not original methods. In JS, not C++. Expectation is that invoking goes to C++ or JS?
  • LW: The instantiator provides functions to the import object -- whatever objects you pass in are the functions we call.
  • BN: Even if you’ve changed them, we call the functions you gave
  • BN: There is a coordination question about which bindings we want to optimize. We can’t just skip over everything -- we may want to coordinate, since emscripten will have to provide a specific bindings section to optimize this.
  • KM: I talked to some web platform people, and they don’t want all functions to be callable from wasm -- some maybe should be deprecated and not callable from wasm.
  • JF: sounds like you want a subtype table, would like to understand how that works
  • LW: me too
  • MM: the existing mechanism for inter-wasm calls can’t pass opaque values (description of how host binding can allow opaque wasm / wasm calls)
  • JF: this is great, I like it, it can allow copy in / out of C++ objects between instances which don’t share a memory
  • BN: (continues)
  • LW: I saw that there was a custom section, but don’t think it should be custom
  • BN: I think I removed that part
  • MT: What about multithreading?
  • BN: Tables aren’t sharable, only talks to embedder
  • MT: What about pure wasm threads, with tables, and some of these tables have functions from other threads.
  • BN: Because these are tied to imports/exports they’re tied to a JS context
  • MT: The embedder has to validate that you’re not putting the incorrect values into the table
  • RW: This doesn’t have to be the main thread?
  • BN: Sure, it could be a worker
  • MM: The caller and callee have to be in the same thread?
  • MT: Yes, even in the pure threaded world the embedder can validate that
  • BN: The little we’ve talked about shared tables, if these are accessed from non wasm threads there would be actions that would cause them to throw
  • MT: With this proposal it’s binding time when we decide what is allowed
  • JF: Say I want to have a thread that only does GL -- don’t want a full worker, just a pure wasm thread. Having those bindings allows you to not have the full worker, right?
  • KM: They could have their own embedding which has its own thing
  • LW: You could imagine that it’s its own actor
  • JF: I have a shared memory, then I tell the GL thread to start doing work, so the GL thread only does GL stuff. It’s a GL embedding, kind of like worklet
  • MM: is worklet an existing term?
  • BN: Yes, it’s a lightweight worker. Audio worklet runs on mixer thread, and DOM worklet can run while rendering
  • JF: I think way KM phrased it is the right way, where you have a certain type of embedding and that way you don’t need to declare specific things that can and cannot be accessed
  • BS: This is mostly a matter of ergonomics, right? One issue I saw recently was a copy in/copy out problem. We have no way to directly write to an external ArrayBuffer; you have to make copies of everything. In some cases JS can do even better, because it doesn’t do copies. Is this moving in a direction that makes more copies?
  • LW: I was worried about the same thing. Was thinking maybe there are API changes that can allow us to avoid copies
  • KM: What about something like mmap, where you have multiple views with the same backing memory?
  • KM: Think that having multiple memories would be expensive because we do register pinning, would be way cheaper for VM to do mapping to the same memory
  • JF: I’m talking about multiple WebAssembly.Memory in a single instance. I’m ignoring that you have to teach C++ about this, which might be complicated
  • BN: I think where we could make the APIs better is if we don’t have to make copies, but in some cases the array buffer really is external, like the canvas
  • LW: Going multiple memory route would cause you to have to change a lot of your C++ source to handle it
  • BS: I’m skeptical that you are going to get the API changed. You’re now changing the canonical location
  • BN: No objections to creating a new repo for this and moving it forward?
  • JF: No objection
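
One concrete flavor of the binding stubs Luke describes can be sketched in plain JS (a hypothetical polyfill-style helper; `bindUtf8Import` and its shape are invented for illustration, not taken from the proposal). It wraps an imported JS function so that a `(ptr, len)` pair pointing into linear memory is decoded to a JS string with `TextDecoder`, which is the copy-out step the discussion worries about:

```javascript
// Hypothetical sketch of a host-bindings-style stub: decode a utf-8
// string from wasm linear memory before invoking the imported JS function.
function bindUtf8Import(memory, jsFn) {
  const decoder = new TextDecoder("utf-8");
  return (ptr, len) => {
    // View (no copy) into linear memory; decode() performs the copy-out.
    const bytes = new Uint8Array(memory.buffer, ptr, len);
    return jsFn(decoder.decode(bytes));
  };
}

// Usage: a wasm module would import the wrapped function and call it
// with a pointer and length instead of a JS string.
const mem = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(mem.buffer).set([104, 105], 8); // "hi" at offset 8
const imported = bindUtf8Import(mem, (s) => s.length);
```

With a bindings section, the engine could generate this stub itself (and potentially skip it entirely on an optimized path) instead of the toolchain shipping it as JS glue.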

Action item: JF to create Host Bindings repo and tracking issue.

(break)

Next meeting hosting

  • JF: I heard desire for a meeting in Europe, and I assume we won’t have a meeting before end of year. Ben suggested Munich. Dan Ehrenberg suggested maybe Barcelona
  • BN: Mozilla?
  • LW: We can fill the gap if necessary
  • BT: Probably don’t want to do Munich in winter
  • LW: Thinking next meeting in Feb or March?
  • BN: We can also do Google in MTV
  • JF: We’ll come back later with more details

TPAC

  • BN: General purpose of TPAC is outreach and increasing understanding from other groups
  • BT: Volunteers to present overview of wasm
  • LW: Host bindings is probably the most interesting topic to talk with other groups
  • BN: WebGPU people will be giving a talk
  • LW: Are WebAudio people there?
  • BN: Looks like there is an Audio WG

Closure

Derek moves that we adjourn. Ben seconds.

Adjourned