feat: AsyncMultipartStreamParser #80
Open
defnull wants to merge 7 commits into
If the parser is fed very small chunks, the internal buffer offset could become negative after the preamble scanning phase. A later check corrects this, so there are no consequences, but it is still wrong and is now fixed. Also, to account for partial delimiters we only need to keep `len(delimiter)+1` bytes for the next round.
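The tail-keeping idea from this commit can be sketched in isolation. This is a minimal illustration, not the parser's actual code: `find_delimiter` and its variable names are hypothetical, and the only details taken from the commit message are the `len(delimiter)+1` tail size and the need to keep the offset from going negative.

```python
def find_delimiter(chunks, delimiter: bytes) -> int:
    """Return the absolute stream offset of the first delimiter, or -1."""
    keep = len(delimiter) + 1  # tail size kept between rounds
    buffer = b""
    consumed = 0  # number of stream bytes already dropped from the buffer

    for chunk in chunks:
        buffer += chunk
        index = buffer.find(delimiter)
        if index >= 0:
            return consumed + index
        # With tiny chunks the buffer may be shorter than `keep`.
        # Clamp at zero so the drop offset never becomes negative.
        drop = max(0, len(buffer) - keep)
        consumed += drop
        buffer = buffer[drop:]
    return -1
```

Feeding the same data one byte at a time or as a single chunk should give the same offset, since a partially received delimiter always fits in the retained tail.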
Bump major version since we are planning a release with breaking changes.
Performance tests have shown that working with immutable byte strings instead of bytearrays for internal parser buffering is slightly faster in all tested scenarios, and up to 15% faster in specific scenarios (mostly small requests). This is an API change, but in return the caller can now assume that the returned chunk does not change. Chunks can be collected and later merged without an additional defensive copy.
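To illustrate why immutable chunks remove the need for a defensive copy, here is a small sketch. Both producers are hypothetical and are not the parser's code. The first reuses one `bytearray`, the kind of aliasing a caller would have to guard against, while the second yields fresh `bytes` objects.

```python
def mutable_chunks(data: bytes, size: int):
    """Hypothetical producer that reuses one bytearray for every chunk."""
    buf = bytearray()
    for i in range(0, len(data), size):
        buf[:] = data[i:i + size]
        yield buf  # same object every time; its contents keep changing


def immutable_chunks(data: bytes, size: int):
    """Hypothetical producer that yields a new bytes object per chunk."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


data = b"hello world!"
unsafe = b"".join(list(mutable_chunks(data, 4)))    # collected without copying
safe = b"".join(list(immutable_chunks(data, 4)))    # collected without copying
```

With the mutable producer, every collected element refers to the same buffer, so the joined result repeats the last chunk. With `bytes`, collecting and merging later gives back the original data.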
The buffer is large enough most of the time, so optimize for that case: replace an assert with an explicit check and remove the now-redundant pre-check.
When parsing segment body chunks, we have to deal with partial boundaries at chunk borders. Currently we always keep `len(boundary)-1` bytes in the buffer 'just in case' and merge them with the next chunk, which is quite expensive. We can avoid this overhead in most cases by checking for an actual partial match first. Benchmarks show a 10-37% throughput increase for typical file uploads and slight speedups in most of the other tested scenarios.
fix: Boundary delimiters inside segment bodies are now detected and reported as a fatal error (see RFC-7578 4.1).
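A check for an actual partial boundary match at the end of a chunk could look roughly like the following. This is a sketch under assumptions: `partial_match_len` and `split_safe` are hypothetical helpers, not the functions used in this PR, and they only cover the tail check. Finding a complete boundary inside the chunk is a separate step.

```python
def partial_match_len(chunk: bytes, boundary: bytes) -> int:
    """Length of the longest proper prefix of `boundary` that ends `chunk`."""
    max_len = min(len(chunk), len(boundary) - 1)
    for n in range(max_len, 0, -1):
        if chunk.endswith(boundary[:n]):
            return n
    return 0


def split_safe(chunk: bytes, boundary: bytes):
    """Split into (bytes safe to emit now, tail to hold back for the next chunk)."""
    n = partial_match_len(chunk, boundary)
    if n == 0:
        return chunk, b""  # common case: nothing held back, no merge needed
    return chunk[:-n], chunk[-n:]
```

In the common case the chunk does not end with the start of a boundary, so the whole chunk can be passed on and nothing has to be merged with the next one.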
An async-aware wrapper for PushMultipartParser that reads bytes from an awaitable read function on demand and returns AsyncMultipartPart instances that offer convenient async read methods for their payload.
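The general pattern of driving a push-style parser from an awaitable read function can be sketched as below. This is not the API added by this PR: `LineSplitter` is a toy stand-in for a push parser, and `iter_events` and `make_reader` are hypothetical names used only for illustration.

```python
import asyncio


class LineSplitter:
    """Toy stand-in for a push parser: feed bytes in, get complete lines out."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk: bytes):
        self._buffer += chunk
        # Everything before the last newline is complete; keep the rest buffered.
        *lines, self._buffer = self._buffer.split(b"\n")
        return lines


async def iter_events(read, parser, chunk_size: int = 4):
    """Pull bytes from an awaitable `read(n)` on demand and yield parser output."""
    while True:
        chunk = await read(chunk_size)
        if not chunk:  # empty read signals end of stream
            break
        for event in parser.feed(chunk):
            yield event


def make_reader(data: bytes):
    """Build an awaitable read(n) over in-memory data (for demonstration only)."""
    pos = 0

    async def read(n: int) -> bytes:
        nonlocal pos
        chunk = data[pos:pos + n]
        pos += n
        await asyncio.sleep(0)  # yield control, as a real network read would
        return chunk

    return read


async def collect(data: bytes):
    return [line async for line in iter_events(make_reader(data), LineSplitter())]
```

Because the generator only awaits `read` when the consumer asks for the next item, data is pulled from the source on demand and is never buffered in full.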
An async-aware wrapper for `PushMultipartParser` that reads bytes from an awaitable read function on demand and returns `AsyncMultipartPart` instances that offer convenient async read methods for their payload.
This is still a stream parser, which means that parts are not buffered to disk and can only be read once, and only in the order they appear in the multipart stream. It is still WAY more convenient than using `PushMultipartParser` directly.
Example usage:
Work in Progress