Multipart upload content streamed #1357
Comments
This is precisely my use-case, I'm uploading multiple files in one multipart form and I'd like to stream each one directly to
@gerryfletch as far as I remember http4s writes each part to disk if they are larger than a given threshold and then streams data from there. But this might have changed or it might be a special case. That said, a streaming multipart body would be nice to have. Probably not valid anymore, but @tg44 maybe you remember why a
If I remember well
@tg44 good points, thanks! But then with
Yep, for me PartBytes is a chunk of data, so the elements in the stream would be Probably we could add helpers to it like a The bigger problem I see is that, when I wrote the comment, not all server interpreters enabled an API like this.
True, we'd have to resort to documentation to specify where this is possible.
Implementing this would require:
The first option would require quite far-reaching changes in the whole sttp stack, while the second could be added at a later point without breaking binary compatibility.
Hey there 👋
@NavidJalali well, it can be done (using the second approach described above), but I'm afraid there's no progress on implementing this. If you don't know the part names upfront, you can create a
Hi!
Most clients treat multipart-form uploads as a convenient file upload mechanism. If I want to capture a file upload stream and write it to S3 on the fly, for example, then with the current implementations I either need to parse the raw byte stream myself to actually stream the data, or I need to write the file to the filesystem and upload it from there. The second method is an obvious attack surface, and it is also very slow for large files (multiple fast attackers -> big files -> full disk -> DoS).
We generally upload one file at a time, but with some tinkering it could work with real multipart too (I don't need this right now, but for the sake of completeness I'll write down my ideas). The API could look something like
which is almost the same as `streamBody`: it just seeks to the required part, sends it downstream, and when the downstream finishes or the part ends, seeks to the end of the request.
The extended version could be something like a stream of `(PartHeader | PartBytes)`, and the downstream could build a state machine or other logic to drop or skip the unneeded parts and process the `PartBytes` at the speed of the real downstream application logic. What we absolutely can't do is a `Source[(PartHeader, Source[ByteString])]`, because things like `mapAsyncUnordered(4)` or `buffer(10)` would ruin the whole streaming. We need something like a `Source[Either[PartHeader, PartBytes]]` or something similar (like a parent sealed trait).
Prebuilt stuff supported by the server interpreters:
`Part` stream, but I have never used http4s, so I have no idea whether that is really a stream or whether it uses filesystem caching
Also, we should document the accepted input multipart data format to some extent. At this point I would allow a custom Schema that is not checked at compile time.
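To make the flat-stream idea above more concrete, here is a minimal sketch of the "parent sealed trait" variant using only the Scala standard library. It is an illustration under assumptions, not an existing tapir or sttp API: `PartEvent`, the fields of `PartHeader`, `FlatMultipart`, and `bytesOfPart` are hypothetical names, and a plain `Iterator` stands in for a real streaming type such as an Akka `Source`.

```scala
// Hypothetical event model: these names are illustrative, not part of tapir or sttp.
sealed trait PartEvent
final case class PartHeader(name: String, fileName: Option[String]) extends PartEvent
final case class PartBytes(bytes: Array[Byte]) extends PartEvent

object FlatMultipart {
  // Walks a flat, single-pass stream of events and forwards only the chunks
  // that belong to the part called `wanted`. Chunks of other parts are dropped
  // as they arrive, so nothing is buffered or written to disk.
  // Returns the number of bytes handed to `sink`.
  def bytesOfPart(events: Iterator[PartEvent], wanted: String)(sink: Array[Byte] => Unit): Long = {
    var inWanted = false // state: are we currently inside the wanted part?
    var total = 0L
    events.foreach {
      case PartHeader(name, _) =>
        // A header marks the start of a new part and switches the state.
        inWanted = name == wanted
      case PartBytes(bytes) if inWanted =>
        sink(bytes)
        total += bytes.length
      case PartBytes(_) =>
        () // chunk of a part we are not interested in: skip it
    }
    total
  }
}
```

A caller would pass the sink that performs the actual upload, for example `FlatMultipart.bytesOfPart(events, "file")(chunk => out.write(chunk))` with `out` being any `java.io.OutputStream`. Because headers and chunks arrive in one ordered stream, the consumer decides per element what to do, which is the property the nested `Source[(PartHeader, Source[ByteString])]` shape loses once an operator buffers or reorders the outer stream.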