Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate using native streams #20

Open
MattiasBuelens opened this issue Mar 25, 2019 · 34 comments
Open

Investigate using native streams #20

MattiasBuelens opened this issue Mar 25, 2019 · 34 comments
Labels

Comments

@MattiasBuelens
Copy link
Owner

Inspired by this tweet from @surma:

@MattiasBuelens Is it possible to offer up the polyfills for Readable, Writable and Transform individually? Most browsers have Readable, so ideally I’d only load Writable and Transform.

I've thought about this previously. Back then, I decided that it was not feasible because readable byte streams are not supported by any browser. A full polyfill would always need to provide its own ReadableStream implementation that supports byte streams. By extension, it would also need to provide its own implementations for WritableStream (that works with its ReadableStream.pipeTo()) and TransformStream (that uses its readable and writable streams).

Looking at this again, I think we can do better. If you don't need readable byte streams, then the native ReadableStream should be good enough as a starting point for the polyfill. From there, the polyfill could add any missing methods (pipeTo, pipeThrough, getIterator,...) and implement them using the native reader from getReader().

This approach can never be fully spec-compliant though, since the spec explicitly forbids these methods to use the public API. For example, pipeTo() must use AcquireReadableStreamDefaultReader() instead of ReadableStream.getReader(), so it cannot be affected by user-land JavaScript code making modifications to ReadableStream.prototype. I don't think that has to be a problem though: we are already a user-land polyfill written in JavaScript that modifies those prototypes, it would be silly for the polyfill to try and guard itself against other JavaScript code making similar modifications.

Steps in the spec that require inspecting the internal state of the stream or call into internal methods will need to be replaced by something that emulates the behavior using solely the public API.

  • Often, this will be easy: e.g. ReadableStreamDefaultControllerEnqueue() becomes controller.enqueue().

  • Sometimes, we have to be a bit more lenient. ReadableStreamPipeTo()'s error propagation says:

    if source.[[state]] is or becomes "errored"

    We can check if it becomes errored by waiting for the source.closed promise to become rejected. However, we can't synchronously check if it is already errored.

  • In rare cases, this may turn out to be impossible. TransformStreamDefaultSinkWriteAlgorithm specifies:

    If state is "erroring", throw writable.[[storedError]].

    Usually, the writable stream starts erroring because the writable controller has errored, which the transform stream's implementation controls. However, it could also be triggered by WritableStream.abort(), which is out of the control of the transform stream implementation. In this case, the controller is only made aware of it after the writable stream finishes erroring (state becomes "errored") through its abort() algorithm, which is already too late.

Of course, we can't just flat-out remove byte stream support from the polyfill, just for the sake of using native streams more. The default should still be a full polyfill, but we might want to give users the option to select which features they want polyfilled (as @surma suggested in another tweet).

Anyway, I still want to give this a try. It might fail catastrophically, but then at least I'll have a better answer on why we use so little from the native streams implementation. 😅

@jimmywarting
Copy link

jimmywarting commented May 19, 2019

👍

It just so struck me with my StreamSaver lib where I try to transfer a ReadableStream to a Service Worker. Native streams work OK, But the polyfilled ReadableStream is not transferable with postMessage

For me native ReadableStream's are much more important than having a full stream specification. (just so that you can get byob).
Also the Response.body stream isn't compatible with the polyfill, so that's a problem too in my case

byob reader isn't available in any browser yet, So my guessing is that hardly anyone are using it today.
think byob is not quite ready yet to be used in the user-land. There is also no blob-read-into, WebSocket-read-into, datachannel-read-into or any other native implementation that utilize this byob mode today so it seems a little unnecessary atm to have byob support as you will already have a allocated buffer from fileReader, websocket datachannel etc so you can as well pass it along and use it. And let the garbage collector do it's job

So i'm guessing i'm in favor of removing byob mode and use native stream instead until browsers becomes more ready for it.


Side note: I also wondering if you can't extend the ReadableStream somehow to add support for byob but still have it being seen as a native stream and still be transferable and also work with native Response

new Response(readable).blob()
const klass = ReadableStream || NoopClass

window.ReadableStream = class Polyfill extends klass {
  constructor (args) {
    // check if byob mode and make magic happen
    super(...args)
  }

  someMethod () {
    super.someMethod()
  }
}

Maybe will be more troublesome then actually fixing the missing methods to native ReadableStream.prototype 🤔 but can maybe be the way to solve it?

@jimmywarting
Copy link

jimmywarting commented May 19, 2019

Just tried this:

class Foo extends ReadableStream {
  constructor (...args) {
    super(...args)
    console.log('Hi foo, i fix byob for ya')
  }
}

const rs = new Foo({
  start (ctrl) {
    ctrl.enqueue(new Uint8Array([97]))
    ctrl.close()
  }
})

new Response(rs).text().then(console.log)

const rs2 = new Foo()
postMessage(rs2, '*', [rs2])

Works.

(the current polyfill version can't do this - since it's not a native stream)

using native stream instead seems like it would be better then having some "convert to/from native stream" (#1)

@MattiasBuelens
Copy link
Owner Author

MattiasBuelens commented May 19, 2019

Thanks for your input! 😄

byob reader isn't available in any browser yet, So my guessing is that hardly anyone are using it today. [...]

So i'm guessing i'm in favor of removing byob mode and use native stream instead until browsers becomes more ready for it.

I know at least one person is using BYOB readers with this polyfill, because they found a bug with it: #3.

I want to keep BYOB support, but make it an optional feature rather than a default one. I see two options:

  • Include it by default, but give users a way to opt-out by using a separate variant. This is backwards-compatible, but it means that by default users will get a larger polyfill that can't leverage the native implementation.
  • Don't include it by default, and give users a way to opt-in. This means that the default polyfill can likely use the native implementation, but it would be a breaking change for users currently using BYOB support.

Side note: I also wondering if you can't extend the ReadableStream somehow to add support for byob but still have it being seen as a native stream and still be transferable and also work with native Response

I've had the same idea as well! But it won't be easy.

The PolyfillReadableStream constructor must call the super constructor (i.e. the native ReadableStream constructor). That super constructor only accepts a default underlying source. We cannot pass a "dummy" underlying source, since then super.getReader() wouldn't work (and by extension super.pipeTo() or super.tee() wouldn't work either). So we would need to somehow wrap our original byte source in a default underlying source, so the native implementation can use it. I'm not sure if this is at all possible.

If we can make the constructor work, the rest would be fairly straightforward:

  • getReader() will just return super.getReader(), since our underlying source wrapper will take care of everything. Similarly, pipeTo(), pipeThrough() and tee() will also "just work".
  • getReader({ mode: 'byob' }) must return a polyfilled BYOB reader. We can wrap the default reader from super.getReader() and use its cancel() and releaseLock() implementation. (Since this also locks the native stream, we won't have to do anything special for ReadableStream.locked.) Of course, we have to implement read(view) ourselves using the original byte source, to make it a proper BYOB read.

It'd be incredible if we could get this to work though! It would mean we could restructure the polyfill to "progressively enhance" native streams:

let PolyfillReadableStream;
if (supportsDefaultSource(ReadableStream)) {
  // native supports default source
  PolyfillReadableStream = class extends ReadableStream {};
} else {
  // no native support
  PolyfillReadableStream = PonyfillReadableStream;
}

if (!supportsByteSource(PolyfillReadableStream)) {
  // polyfill byte stream on top of default stream
  PolyfillReadableStream = class extends PolyfillReadableStream {
    /* ... */
  };
}

if (!PolyfillReadableStream.prototype.pipeTo) {
  // polyfill pipeTo and pipeThrough
}

if (!PolyfillReadableStream.prototype.tee) {
  // polyfill tee
}
// ...

We could put these in separate modules, and have users pick only the features they care about (like with core-js):

import {ReadableStream} from 'web-streams-polyfill';

// polyfill only pipeTo
// import `web-streams-polyfill/feature/readable-byte-stream`;
import `web-streams-polyfill/feature/pipe-to`;
// import `web-streams-polyfill/feature/tee`;

const readable = new ReadableStream();
readable.pipeTo(writable);

So yeah: would be cool, if we can get it to work. 😛

@MattiasBuelens
Copy link
Owner Author

I'm currently working on splitting the ReadableStream class into multiple modules (one for each feature), to get a better understanding on the dependencies between each feature.

After that, I have to figure out how these dependencies should be implemented so they can work with either native and polyfilled streams, without breaking any (or too many) tests. For example: ReadableStreamPipeTo uses AcquireReadableStreamDefaultReader. For polyfilled streams, this should call our implementation of this abstract operation so we can set forAuthorCode = false. However, for native streams, we only have stream.getReader() so we will always have forAuthorCode = true. This means that some tests will fail when implementing pipeTo() on top of native readers. I think it's fine in this case, but this is just one of many cases that will need to be considered.

I'm also worried that some of these dependencies on abstract operations cannot be implemented using only the public API of native streams. This would mean I'd have to approximate them, or leave them out entirely. That means more trade-offs about which test failures are acceptable and which aren't. For example: TransformStreamDefaultSinkWriteAlgorithm checks whether writable.[[state]] === "erroring", but a regular writable sink would only know about this after all pending write()s are completed and the sink's abort() method gets called. That means the write algorithm cannot know whether it should skip the PerformTransform() call and let the writable stream become errored, which is definitely going to break at least one test.

There's still a lot of questions, and I'm figuring them out as I go. I'm doing this in my spare time, so it's going to take a bit of time to get there! 😅

@jimmywarting
Copy link

Oh, sounds a bit complex 😅
I hope using native streams outweighs the trade-offs.
Thanks for the status update ❤️

@bt-88
Copy link

bt-88 commented Jul 27, 2020

Also the Response.body stream isn't compatible with the polyfill, so that's a problem too in my case

@jimmywarting
What does this mean? That writing response.Body is a no-go with the suggested polyfill?
If so, do you have a work-around that you implemented?

@MattiasBuelens
Copy link
Owner Author

@bt-88 I think he meant that you can't do:

import { ReadableStream } from 'web-streams-polyfill';
let stream = new ReadableStream({/* ... */});
let response = new Response(stream); // expects a native ReadableStream, but we're passing in a polyfilled stream

or

import { WritableStream } from 'web-streams-polyfill';
let stream = new Response(/* ... */).body;
await stream.pipeTo(new WritableStream({/* ... */})); // either pipeTo does not exist on the native ReadableStream, or the native pipeTo expects a native WritableStream but we're passing in a polyfilled stream

You can work around this by wrapping the native stream inside a polyfilled stream, or vice versa. I have a library that does exactly this: web-streams-adapter. However, you have to do this everywhere you send or receive a native ReadableStream. And you also lose some of the benefits of the native implementation, since the wrappers will route everything through the (slower?) polyfill implementation. And yes, it also doesn't play very nicely with transferable streams.

The goal of this issue is to make the ReadableStream exported by this polyfill be a proper native ReadableStream where possible, so you don't need any adapters.

@jimmywarting
Copy link

jimmywarting commented Jul 27, 2020

yea, what he said☝️

@guest271314

This comment has been minimized.

@MattiasBuelens

This comment has been minimized.

@guest271314

This comment has been minimized.

@surma

This comment has been minimized.

@jimmywarting

This comment has been minimized.

@guest271314

This comment has been minimized.

@surma

This comment has been minimized.

@guest271314

This comment has been minimized.

@MattiasBuelens

This comment has been minimized.

@guest271314

This comment has been minimized.

@guest271314

This comment has been minimized.

@guest271314

This comment has been minimized.

@MattiasBuelens
Copy link
Owner Author

While this all seems very interesting, it is not very relevant to this issue, or even to the goals of this project. So I'm gonna have to ask you to move all further discussion of this "streams using custom Chromium build" experiment to your own fork. You can drop a link to that issue here, in case others want to follow your progress. 🙂

@guest271314
Copy link

https://github.com/guest271314/web-streams-polyfill/projects/1 Good luck.

@guest271314
Copy link

@MattiasBuelens Marking the comments as "oof-topic" really makes no sense. This issue is titled

Investigate using native streams

which the comments directly focus on.

There are no restrictions on how that must be achieved at OP.

OP is akin to asking a golfing question without including restrictions, then adding restrictions in the comments.

It should not matter how the requirement is achieved since there are no restrictions on how the requirement can be achieved at OP.

@MattiasBuelens
Copy link
Owner Author

Marking the comments as "oof-topic" really makes no sense. This issue is titled

Investigate using native streams

which the comments directly focus on.

The goal for this issue is: investigate how the polyfill can leverage the native streams implementation as provided by the browser, such that it can be made compatible with browser-provided APIs that return or accept such native streams. This is explained in my previous comment. Maybe the title of this issue alone doesn't explain that that well enough, but there's more to an issue than just its title. I ask that you at least make an effort to read through the previous comments before commenting yourself, to better understand the problem we're trying to solve.

You propose to compile a C++ streams implementation (like the one in Chrome) to run inside a web page, either using WebAssembly or through some communication channel with a native application. As explained several times before, this proposal does not achieve the goals set out for this issue. Hence, while this might be an interesting challenge, it is not relevant for the discussion of this issue. Therefore, I marked the comments as off-topic.

There are no restrictions on how that must be achieved at OP.

In every project, there are many implicit restrictions defined just by the environment in which the solution must operate. I expect some common sense from contributors, so that I don't have to state these in every single issue.

For example, this is a polyfill that can be used on any web page running in any supported browser, as detailed in the README. That immediately imposes a bunch of restrictions: it must be written in a programming language that is supported by these platforms (i.e. JavaScript or WebAssembly, or something that compiles to these), and it must obey the rules set out by these platforms (i.e. only use browser-provided APIs, do not require additional extensions,...)

Switching to WebAssembly would mean we'd have to drop support for older browsers, which would be unfortunate but possibly defendable if the performance gains in newer browsers are significant. However, requiring users to install a separate native application would not work at all for usage in a web page. The browser's job is to protect the user from (potentially malicious) websites, and requiring untrusted native code just so a random website can use the streams API is very hard to sell to users. Thus, it's not an option for this polyfill.

@jimmywarting
Copy link

Just checking in to see how things have progress?

Have you started somewhere or is it still in the stage of "splitting the ReadableStream class into multiple modules"?
Maybe could start somewhere easier that isn't so hooked into the internals like forAuthorCode and start with something easier such as making native stream async iterable for example?

if (!ReadableStream.prototype[Symbol.asyncIterator]) {
  // polyfill iterator
}

@MattiasBuelens
Copy link
Owner Author

Sorry, not a whole lot of progress as of late. 😞 I've been busy working on the streams standard itself, e.g. adding ReadableStream.from(asyncIterable) and fixing some kinks with readable byte streams. Basically, giving myself even more work when I eventually have to re-implement the new stuff in the polyfill. 😛

Now that Chrome has shipped readable byte streams, the delta between Chrome's streams implementation and this polyfill has become very small. So I can definitely see the appeal of having a minimal polyfill to add the few missing bits (like @@asyncIterator). But it doesn't make it easier to actually build the thing. 😅

@jimmywarting
Copy link

note from fetch-blob: We could extend this to basically pull the streams from node:stream/web also

@jimmywarting
Copy link

I was thinking this conditional loading of node streams could be of any help to you:

let nodeStuff = {}

try {
  const load = new Function('x', 'return require(x)')
  const process = load('node:process')
  const { emitWarning } = process
  try {
    process.emitWarning = () => {}
    nodeStuff = load('node:stream/web')
    process.emitWarning = emitWarning
  } catch (err) {
    process.emitWarning = emitWarning
  }
} catch (err) {}

module.exports = nodeStuff

@MattiasBuelens
Copy link
Owner Author

MattiasBuelens commented Jan 21, 2022

What is process.load()? I can't find that in the docs. (Also, monkey-patching process.emitWarning() feels very wrong. 😛)

Either way, if we go this route (which we might, see #108 (comment)), I'd probably prefer something like:

try {
  module.exports = require("node:stream/web");
} catch {
  module.exports = require("web-streams-polyfill/es2018");
}

or for ESM:

let streams;
try {
  streams = await import("node:stream/web");
} catch {
  streams = await import("web-streams-polyfill/es2018");
}
const { ReadableStream, WritableStream, TransformStream } = streams;
export { ReadableStream, WritableStream, TransformStream };

But even then, it's not that simple. The polyfill's implementation may be ahead of Node's implementation, so in some cases we may still want to either monkey-patch Node's implementation or fallback to our own implementation anyway. 😕

@jimmywarting
Copy link

load = new Function('x', 'return require(x)') = load = require

Also i adapted it just for web-streams-polyfill in case you want to patch/extend nodes implementation, seeing that we now have some utilities that actually use the nodes stream like new buffer.Blob().stream() and that they are transferable too?

i was thinking maybe that if some webpack or rollup tries to polyfill stream/web and makes it load web-streams-polyfill... then it would be recursive loop back into itself or something - thats why i created load

the top level await would require node v14.8 i think?

quite many is using fetch which as a indirect dependency on fetch-blob -> node:stream/web and it cause a frustration among many developers who did not even use whatwg streams in any way
quite many wanted to get rid of that warning message.

@MattiasBuelens
Copy link
Owner Author

load = new Function('x', 'return require(x)') = load = require

Woops, I'm blind. Thanks. 😅

i was thinking maybe that if some webpack or rollup tries to polyfill stream/web and makes it load web-streams-polyfill... then it would be recursive loop back into itself or something - thats why i created load

I see, so it's trying to hide require() from bundlers... Interesting.

the top level await would require node v14.8 i think?

Indeed, so we'd have to wait a bit longer before we can safely drop Node 12 and below. 😢

quite many is using fetch which as a indirect dependency on fetch-blob -> node:stream/web and it cause a frustration among many developers who did not even use whatwg streams in any way
quite many wanted to get rid of that warning message.

I understand the frustration, but I don't feel like it's the polyfill's responsibility to hide this warning. The warning is there for a reason after all... 😕

@jimmywarting
Copy link

jimmywarting commented Jan 21, 2022

I see, so it's trying to hide require() from bundlers... Interesting.

correct

Indeed, so we'd have to wait a bit longer before we can safely drop Node 12 and below. 😢

when 12 goes EOL, node-fetch is going to drop it in v4 and only support something like 14.18 i think (where AbortController got introduced)
And we are looking into ways right now to conditional load some streams, i found https://www.npmjs.com/package/isomorphic-streams but that one requires node 16 and have no fallback to some ponyfill

@jimmywarting
Copy link

jimmywarting commented Jan 21, 2022

Somewhere back in my head I was perhaps also hoping that your library would also somehow extend/patch node:streams/web in [email protected] so that it can work with other built in streams - this is what this issue is all about, isn't it? investigate using native streams
That is what polyfill dose - handing you the native implementation if it exist if not, it gives you a polyfill

@MattiasBuelens
Copy link
Owner Author

MattiasBuelens commented Jan 22, 2022

The original intention was to extend/patch native streams in the browser. But yes, if at all feasible, we may want to do the same for Node. 🙂

Now that I think about it, it might be good to use subpath imports to import the native implementation (if any). That way, we can avoid importing from streams/web entirely when bundling for browsers, without weird new Function() hacks around require(). 😛

{
  "imports": {
    "#native-streams": {
      "node": "./node-stream.js",
      "default": "./global-stream.js"
    }
  },
}
// node-stream.js
export let streams;
try {
  streams = await import("node:stream/web");
} catch {
  streams = undefined;
}
// global-stream.js
export let streams = globalThis;

We could then build the polyfill like this:

// polyfill.js
let { streams } = await import("#native-streams");
if (streams) {
  streams = patchStreamsImplementation(streams);
} else {
  streams = await import("./ponyfill.js");
}
export const ReadableStream = streams.ReadableStream;
export const WritableStream = streams.WritableStream;
export const TransformStream = streams.TransformStream;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants