Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content-Length or arbitrary headers #69

Open
laverdet opened this issue Sep 27, 2023 · 2 comments
Open

Content-Length or arbitrary headers #69

laverdet opened this issue Sep 27, 2023 · 2 comments

Comments

@laverdet
Copy link

What do you think about adding a new sizes map which corresponds to the map implementation which includes information on the expected size of the uploaded payload. Or, overhaul map to contain arbitrary headers for each file. Off the top of my head I'm thinking of Last-Modified (advanced caching possibilities) Range (resumable uploads), Content-Encoding (compression for large text payloads) would be valuable.

Content-Length in particular in my case is required because I want to stream the request directly from the client right into S3 without needing to buffer the entire thing in memory or disk. S3 requires the length of the payload upfront, so right now I also need to send the payload size along, something like input UploadWithSize { file: Upload!, size: Int! }. It's useful in other cases as well, for example terminating the request early if the size will be too large, allocating an ArrayBuffer to avoid costly memcpys, or filesystem extents. Of course, the stream should error out if the content continues past the given Content-Length, or if it closes before.

It doesn't seem to be possible to set additional headers via multipart requests over FormData so implementing it in a blessed side-channel would be required. When uploading a single file via fetch or XHR you get the Content-Length for free, but once you go through FormData you lose the information.

@jaydenseric
Copy link
Owner

Sorry for the late reply; its been a busy few weeks. I've been thinking about the problems and ideas you're describing most days in my morning shower.

A good first step would be to confirm if headers like Content-Length are allowed in the the file field headers. At first I thought they are, because of this ancient spec:

https://datatracker.ietf.org/doc/html/rfc2046#section-5.1

A body part is an entity and hence is NOT to be interpreted as
actually being an RFC 822 message. To begin with, NO header fields
are actually required in body parts. A body part that starts with a
blank line, therefore, is allowed and is a body part for which all
default values are to be assumed. In such a case, the absence of a
Content-Type header usually indicates that the corresponding body has
a content-type of "text/plain; charset=US-ASCII".

The only header fields that have defined meaning for body parts are
those the names of which begin with "Content-". All other header
fields may be ignored in body parts. Although they should generally
be retained if at all possible, they may be discarded by gateways if
necessary. Such other fields are permitted to appear in body parts
but must not be depended on. "X-" fields may be created for
experimental or private purposes, with the recognition that the
information they contain may be lost at some gateways.

But then I came across this one:

https://datatracker.ietf.org/doc/html/rfc7578#section-4.8

4.8. Other "Content-" Header Fields

The multipart/form-data media type does not support any MIME header
fields in parts other than Content-Type, Content-Disposition, and (in
limited circumstances) Content-Transfer-Encoding. Other header
fields MUST NOT be included and MUST be ignored.

Had they been allowed, we could have looked at using the per file headers and figuring out a way to insert the ones we need using web standard browser APIs for encoding FormData.

Regardless, that approach would have had the limitation that you wouldn't get metadata about all the files being uploaded up front early in the multipart stream; you would have found out the content length of each file one at a time, at the time each file field is encountered in the stream.

Your idea about having a new structure for the map field that includes space for each file to have arbitrary medatadata fields makes sense; but that would be a big breaking change. Less breaking would be an optional complimentary field meta (name TBD) that follows immediately after the map field, and it could have a map structure where the keys are the same as the map field, but the values are an object with keys and values (i.e. MIME keys and values such as { "Content-Length": 12345).

@laverdet
Copy link
Author

laverdet commented Oct 24, 2023

The point about compatibility is fair. My personal tolerance for backwards compatibility is very low but in this case it's not a big deal to stick the headers in another field. I'll raise a counterpoint that if a client is sending a multipart file upload that the server can't fully understand then failing loudly is better than silently dropping the information contained in the side-channel. Content-Encoding and Range in particular should not be ignored.

Regardless of whether or not you decide in favor of backwards compatibility, I might suggest adding a marker in the top-level Content-Type header indicating specification version of the client. So the composed header might look like Content-Type: multipart/form-data; uploadSpec=20231024; boundary=--boundary. In this way a server could easily detect the client version and behave accordingly, or reject the request if the server doesn't understand the indicated version. This unlocks unlimited future changes to your specification. This is similar to what Apollo server/client do to support @defer since they shipped a pre-specified version of that directive.

Edit: Actually I don't think you'd be able to put it in Content-Type since the user agent needs to add the boundary. Setting it in another header like X-Upload-Spec would be a pitfall since people would need to worry about CORS. Maybe it should just be a field that comes before map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants