Opened 2 years ago

Closed 19 months ago

Last modified 2 months ago

#33699 closed New feature (wontfix)

Read ASGI request body from asyncio queue on-demand

Reported by: Noxx Owned by: noneNote
Component: HTTP handling Version: dev
Severity: Normal Keywords: ASGI, async
Cc: Andrew Godwin, Carlton Gibson Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Carlton Gibson)

Was "Performance does not scale when increasing file upload sizes"

Story

Our application supports many file uploads.
In normal circumstances, those files are only a few megabytes to of around 100MB.
However, some uploads can be several GB in size (10-15GB and up).

I know, that file uploads with such a size can be handled differently,
e.g. using a seperate endpoint, allowing slow requests to be served seperately from the application etc.
and only to reference those files in a final request.
This solution would introduce some other problems regarding house-keeping, non-atomic requests etc, thus I want to ignore this workaround for now.

Performance impact:

The impact can be best observed when uploading a file which is bigger in it's size, e.g. 1GB+.
On my maschine it takes the client around 700ms to send the request to the application, but than waits around 1.5s for the final response.
Of course, those numbers are dramatically influenced additionally in production by storage speed, file size, webserver (uvicorn/daphne/..), load-balancers etc.
But the final take here is, that the server does some additional work after the client finished its request.
In a production-like environment the numbers peaks to 4s (send request) and 16s (waiting for response) for the same file size. Uploading a 3GB file, the numbers are 11s and 44s.
As you can see, the 44s are very near the default gateway timeout of Nginx and the hard-coded one of AWS load-balancers.
As soon as the server needs more than 60s to create the response, the client will get a gateway timeout error.

Workflow of file uploads:

I'm not a Django-dev, so please correct me if I'm wrong. As far as I understand, the uploaded file is processed as the following:

Given, that the application is running within an ASGI server, the whole request is received by the ASGIHandler.
It's method #read_body creates a SpooledTemporaryFile.
This temporary file contains the whole request body, including POST parameters and files.

As soon as the request has been received (this is the point the client finished uploading the file) it is than transformed into an ASGIRequest.
The ASGIRequest has #_load_post_and_files and #parse_file_upload as methods which parses the body into seperate files.
This is done by reading the body (the temporary file) and iterating over them seperated by the POST seperator done by MultiPartParser.
The generated chunks are than sent to upload handlers which process those files further on using #receive_data_chunk.
The default upload handlers provided by Django will write those files to disc again, depending on their size.

Problem

The problem here is, that the uploaded file(s) are transformed and written as well as read multiple times.
First the whole body is written into a SpooledTemporaryFile which is re-read using streams (LazyStream) just to be written once more by an upload handler.

The impact is low if the uploaded file is small, but increases dramatically if the size is increased, the file hits the disc and/or the storage is slow.

Optimization / Brainstorming

Would it be possible to reduce the workflow to a single write call?
E.g. if the ASGIHandler already splits the request body into seperate files, it would be possible to just forward the file pointers until the upload handlers needs to be called.
Those handlers would be able to either use those files as-is or to re-read them if pre-processing is needed.

In a best-case scenario, an user uploads a file whichis created as a temporary file in parallel.
As soon as the request has been finished, the file is than moved to its final location (as already implemented by upload handlers by providing #temporary_file_path)
The server would not need any time processing the request further on and would be able to sent the response within some milliseconds independent of the file size.
The roundtrip time would be reduced by 2/3 and also the gateway timeout would be fixed.

Environment

We're using Django 4.0.4 with Gunicorn 20.1.0 and Uvicorn 0.17.6.

Attachments

I've attached two flame graphs of a file upload which hopfully illustrates this issue.
One is using the internal runserver (wsgi) and one of our (stripped) application using gunicorn+uvicorn (asgi)

Attachments (2)

flame-runserver.svg (103.5 KB ) - added by Noxx 2 years ago.
flame-app.svg.tgz (41.7 KB ) - added by Noxx 2 years ago.

Download all attachments as: .zip

Change History (13)

by Noxx, 2 years ago

Attachment: flame-runserver.svg added

by Noxx, 2 years ago

Attachment: flame-app.svg.tgz added

comment:1 by Carlton Gibson, 2 years ago

Resolution: invalid
Status: newclosed

Hi. Interesting as this is, it's not really an addressable issue report at this stage. Can I ask you to use one of the support channels please. Thanks.

Likely the Forum https://forum.djangoproject.com or (if it's a feature suggestion) the DevelopersMailingList would be the best venues for a discussion.

TicketClosingReasons/UseSupportChannels

Last edited 2 years ago by Carlton Gibson (previous) (diff)

comment:2 by Carlton Gibson, 2 years ago

Cc: Andrew Godwin Carlton Gibson added
Component: File uploads/storageHTTP handling
Description: modified (diff)
Has patch: set
Keywords: ASGI async added
Resolution: invalid
Status: closednew
Summary: Performance does not scale when increasing file upload sizesRead ASGI request body from asyncio queue on-demand
Triage Stage: UnreviewedAccepted
Type: UncategorizedNew feature
Version: 4.0dev

Thanks Noxx, re-opening, and accepting for review after discussion on mailing list.

The basic idea here is to read the request body as-needed, rather than pulling it into the spooled temporary file before instantiating the ASGI request.
Perhaps there will be gotchas, but that feels like the right thing to do, so... 👍

comment:3 by Carlton Gibson, 2 years ago

Owner: changed from nobody to Noxx
Status: newassigned

comment:4 by Carlton Gibson, 2 years ago

Patch needs improvement: set

comment:5 by noneNote, 20 months ago

Owner: changed from Noxx to noneNote

comment:6 by Carlton Gibson, 19 months ago

Resolution: wontfix
Status: assignedclosed

Looking at this again, with an eye to #33738 — which is how we handle http.disconnect event as they come in — it's not clear that we can actually read the body file on demand (as nice as that might be in theory).

The ASGI spec is quite clear on the required behaviour:

  • "...the body message serves ... as a trigger to actually run request code (as you should not trigger on a connection opening alone)." (Elided to bring out the force.)
  • more_body (bool) – Signifies if there is additional content to come (as part of a Request message). If True, the consuming application should wait until it gets a chunk with this set to False. If False, the request is complete and should be processed.

i.e. The application is not permitted to begin processing the request until a more_body: False is received. You can't get that unless you read the body.

On top of that, ref #33738, if you want to look for an http.disconnect, however you might do that, you need to have got the body events, which come in on the same queue, out of the way in order to do that. A handler for a long-lived connection that wanted to check that the client hadn't disconnected before processing an expensive operation on the submitted body would be stuck.

Short of a PoC, and an unlikely change to the ASGI spec here, I think we have to close this as wontfix.

comment:7 by Natalia Bidart, 2 months ago

#35592 was a duplicate.

comment:8 by Klaas van Schelven, 2 months ago

TBH I'm not wholly convinced on your reading of the ASGI standard.

  1. The ellipsis in the quote from the first bullet read "the body message serves as a way to stream large incoming HTTP bodies in chunks", and it is not clear to me why there would be a "way to stream large incoming HTTP bodies" on the ASGI http protocol level if ASGI applications are forbidden to make use of this in a meaningful way.
  1. Regarding the second quote, first part "should wait" could be read as "should wait if the application needs the whole request body to make its response"
  1. Regarding the second quote, second part there is no need to read "should be processed" as "should be processed no earlier than"
  1. Your reading (correct or not) is in direct contradiction with the second paragraph of the ASGI spec which reads in full "ASGI attempts to preserve a simple application interface, while providing an abstraction that allows for data to be sent and received at any time, and from different application threads or processes." (emphasis mine)

You say the ASGI spec cannot be changed, but AFAIU the links between the ASGI Team and the Django team are quite close, so this is not entirely impossible.

As a final note: the 2 "other" big ASGI servers (not built by Andrew) actually have a "Hello World" example in their right on the homepage / in the quickstart that does in fact do no attempt to read the request (and are thus in violation of the spec as understood by Carlton):

Last edited 2 months ago by Klaas van Schelven (previous) (diff)

comment:9 by Carlton Gibson, 2 months ago

See this long discussion on the asgiref repo for more context.

Disconnect handling requires consuming the request body messages from the receive queue.

If the ASGI could pass an open file descriptor for the request body to the application, then in principle we could use that directly, rather than needing to spool the body file (to memory or disk, depending on size). That would loose the ability to have the server and application on different servers, but that might be OK in many cases. (IDK)

That would need first the PoC, and then the spec change (or clarification, if you prefer) to go with it.

Version 0, edited 2 months ago by Carlton Gibson (next)

comment:10 by Klaas van Schelven, 2 months ago

I have the feeling I'm missing something here but I really don't see it.

What does "disconnect handling" mean in this context? Disconnect handling by the ASGI server or by the ASGI application?

From my perspective as an application developer, there's not much to _handle_ I think? I just want to _trigger_ a disconnect. I fail to see why this could not simply be done by letting go of the constraint that the request body must be consumed before sending a response. i.e. for this particular issue I don't think changing the type of body to a file descriptor is actually required.

Responding to a quote in the mentioned thread:

The concerns with consuming the body (potentially unnecessarily, but then why send it?)

Because it's not always up to the client to determine what should be consumed by the server. Enforcement of quota / max message sizes is my personal use-case.

Another use-case (not concerning disconnects) is to reduce latency for big uploads, because processing (and possibly even sending back the result) can start earlier.

But my main point would be: given ASGI's stated goals, it's _very surprising_ that it's not possible to implement (for http) a streaming echo server.

Anyway, this is moving into "someone is wrong on the internet" territory for me; I ran into this while evaluating the streaming capacities of various server setups and have not tied myself to an ASGI setup yet, so for me the take-away is simply that ASGI is not a good fit for my use-case.

Background regarding the HTTP/1.1 spec, and the idea of send-before-full-receive (outside of the ASGI context): https://stackoverflow.com/questions/14250991/is-it-acceptable-for-a-server-to-send-a-http-response-before-the-entire-request

comment:11 by Carlton Gibson, 2 months ago

What does "disconnect handling" mean in this context?

It means responding appropriately to the the ASGI `http.disconnect` event that is sent on the receive queue in the case that the client disconnects, whilst (for example) a long running task is in process. (Correctly responding to this is needed to allow applications to cancel and clean up such long running tasks.)

Anyway, this is moving into...

OK, I wish you good luck with your search. If you wanted to explore options here then a Proof-of-Concept plus a discussion on the asgiref repo would be the appropriate way forward.

Note: See TracTickets for help on using tickets.
Back to Top