feat: AsyncMultipartStreamParser#80

Open
defnull wants to merge 7 commits into main from defnull-async-wrapper
@defnull defnull commented Jun 12, 2026

Owner

An async-aware wrapper for PushMultipartParser that reads bytes from an awaitable read function on demand and returns AsyncMultipartPart instances that offer convenient async read methods for their payload.

This is still a stream parser, which means that parts are not buffered to disk and can only be read once, and only in the order they appear in the multipart stream. It is still WAY more convenient than using PushMultipartParser directly.

Example usage:

import multipart

async def handle_request(headers: dict[str, str], body_reader: multipart.t_AsyncReader):
    ctype, options = multipart.parse_options_header(headers["content-type"])
    assert ctype == "multipart/form-data"
    parser = multipart.AsyncMultipartStreamParser(
        multipart.PushMultipartParser(boundary=options["boundary"]), body_reader
    )

    async for part in parser:
        print(f"Found: {part.name} ({part.filename or '-'})")
        async for chunk in part.iter_chunks():
            print(f"[{len(chunk)} bytes]")
        print(f"Total size: {part.size}")
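
To experiment with the "read on demand" idea without a web framework, the body reader can be faked. The sketch below assumes the reader is an async callable that takes a maximum size and returns bytes, with `b""` marking the end of the stream. That signature is an assumption and is not confirmed by this PR; `AsyncReader`, `make_reader` and `drain` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed reader shape: `await read(n)` returns up to n bytes, b"" at EOF.
AsyncReader = Callable[[int], Awaitable[bytes]]


def make_reader(data: bytes) -> AsyncReader:
    """Hypothetical helper: serve `data` through an awaitable read function."""
    offset = 0

    async def read(size: int) -> bytes:
        nonlocal offset
        chunk = data[offset:offset + size]
        offset += len(chunk)
        return chunk

    return read


async def drain(read: AsyncReader, chunk_size: int = 4) -> list[bytes]:
    """Pull chunks on demand until the reader signals end of stream."""
    chunks = []
    while chunk := await read(chunk_size):
        chunks.append(chunk)
    return chunks


chunks = asyncio.run(drain(make_reader(b"hello world")))
```

A consumer written this way never asks for more data than it is ready to process, which is the same pull-based pattern the async parser relies on.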

Work in Progress

  • Think about a nice(r) API for async applications
  • Implement stuff
  • Get test coverage back to 100%
  • Improve documentation
  • Benchmark this properly

defnull added 7 commits June 6, 2026 17:57
If the parser is fed very small chunks, the internal buffer offset could become negative after the preamble scanning phase. A later check corrects this, so there are no consequences, but it was still wrong and is now fixed.

Also, to account for partial delimiters we only need to keep len(delimiter)+1 bytes for the next round.
Bump major version since we are planning a release with breaking changes.
Performance tests have shown that working with immutable byte strings instead of bytearrays for internal parser buffering is slightly faster in all tested scenarios, and up to 15% faster in specific scenarios (mostly small requests).

This is an API change, but in return the caller can now assume that the returned chunk does not change. Chunks can be collected and later merged without an additional defensive copy.
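
The practical difference for callers can be illustrated in plain Python, independent of the parser. This is a minimal sketch of the general `bytearray` versus `bytes` behaviour, not code from this PR; all variable names are made up for the example.

```python
# Mutable buffer: a saved reference changes when the producer reuses it,
# so the consumer would need a defensive copy such as bytes(buf).
buf = bytearray(b"first")
saved_mutable = [buf]          # reference stored without copying
buf[:] = b"second"             # producer overwrites its buffer in place
mutable_result = bytes(saved_mutable[0])

# Immutable chunks: storing the reference is enough, nothing can change it.
chunk = b"first"
saved_immutable = [chunk]
chunk = b"second"              # rebinding the name leaves the old object intact
saved_immutable.append(chunk)
immutable_result = b"".join(saved_immutable)
```

With `bytes` chunks, collecting references and calling `b"".join(...)` once at the end is safe.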
The buffer is large enough most of the time, so optimize for that case: replace an assert with an explicit check and remove the now-redundant pre-check.
When parsing segment body chunks, we have to deal with partial boundaries at chunk borders.
Currently we always keep `len(boundary)-1` bytes in the buffer 'just in case' and merge them with the next chunk, which is quite expensive.
We can avoid this overhead in most cases by checking for an actual partial match first.

Benchmarks show a 10-37% throughput increase for typical file uploads and slight speedups in most of the other tested scenarios.
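
One way to check for an actual partial match is to look for the longest proper prefix of the delimiter that ends the current chunk. The sketch below shows that general idea under the assumption that a full delimiter match has already been ruled out for the chunk. It is not the implementation from this commit, and `partial_delimiter_len` and `split_safe` are hypothetical names.

```python
def partial_delimiter_len(chunk: bytes, delimiter: bytes) -> int:
    """Length of the longest proper prefix of `delimiter` that ends `chunk`."""
    max_len = min(len(chunk), len(delimiter) - 1)
    for size in range(max_len, 0, -1):
        if chunk.endswith(delimiter[:size]):
            return size
    return 0


def split_safe(chunk: bytes, delimiter: bytes) -> tuple[bytes, bytes]:
    """Return (bytes safe to emit now, tail to keep for the next round)."""
    keep = partial_delimiter_len(chunk, delimiter)
    cut = len(chunk) - keep
    return chunk[:cut], chunk[cut:]


delimiter = b"\r\n--boundary"
no_match = split_safe(b"payload", delimiter)
partial = split_safe(b"payload\r\n--bou", delimiter)
```

When the chunk does not end in a delimiter prefix, the whole chunk is emitted and nothing has to be merged with the next one, which is the common case for file uploads.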

fix: Boundary delimiters inside segment bodies are now detected and reported as a fatal error (see RFC 7578, Section 4.1).
An async-aware wrapper for PushMultipartParser that reads bytes from an awaitable read function on demand and returns AsyncMultipartPart instances that offer convenient async read methods for their payload.