Opened 5 weeks ago
Last modified 3 weeks ago
#35838 new New feature
request.read() returns empty for Requests w/ Transfer-Encoding: Chunked (at Initial Version)
Reported by: | Klaas van Schelven | Owned by: | |
---|---|---|---|
Component: | HTTP handling | Version: | 5.0 |
Severity: | Normal | Keywords: | |
Cc: | Carlton Gibson, bcail, Natalia Bidart, Claude Paroz | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Django's request.read() returns 0 bytes when there's no Content-Length header.
i.e. it silently fails.
But not having a Content-Length header is perfectly fine for an HTTP/1.1 request that uses Transfer-Encoding: Chunked.
WSGI servers like Gunicorn and mod_wsgi handle this just fine: Gunicorn decodes the hexadecimally encoded chunk lengths and just passes you the right chunks, and I believe Apache's mod_wsgi does the same.
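For readers unfamiliar with the wire format, here is a minimal sketch of the de-chunking such a server performs before handing the body to the application. It is not Gunicorn's or mod_wsgi's actual code; `decode_chunked` is a hypothetical helper, and it assumes a well-formed body with no error handling for truncated input.

```python
import io


def decode_chunked(stream):
    """Decode an HTTP/1.1 chunked body from a binary file-like object."""
    body = bytearray()
    while True:
        # Each chunk starts with its size in hexadecimal, optionally followed
        # by ";extension" parameters, and the size line ends with CRLF.
        size_line = stream.readline()
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # a zero-sized chunk marks the end of the body
        body += stream.read(size)
        stream.readline()  # consume the CRLF that terminates the chunk data
    # Skip optional trailer fields up to the blank line that ends the message.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


raw = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
decoded = decode_chunked(io.BytesIO(raw))
```

The point is that the total length is only known once the zero-sized chunk arrives, which is why such a request carries no Content-Length header.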
Discussions/docs over at Gunicorn / mod_wsgi:
- https://github.com/benoitc/gunicorn/issues/1264
- https://github.com/benoitc/gunicorn/issues/605
- https://github.com/benoitc/gunicorn/issues/2947
- https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIChunkedRequest.html
The actual single line of code that's problematic is this one: https://github.com/django/django/blob/97c05a64ca87253e9789ebaab4b6d20a1b2370cf/django/core/handlers/wsgi.py#L77
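To illustrate the effect without needing Django installed, here is a simplified stand-in for what that line appears to do: a missing CONTENT_LENGTH is treated as 0, and the input is wrapped in a reader capped at that length. `SimpleLimitedStream` and `content_length_from` are hypothetical names for this sketch, not Django's implementation, and the environ is made up.

```python
import io


class SimpleLimitedStream:
    """Simplified length-limited reader (a stand-in, not Django's class)."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def read(self, size=-1):
        # Never read past the configured limit.
        if size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


def content_length_from(environ):
    # A missing or malformed CONTENT_LENGTH falls back to 0.
    try:
        return int(environ.get("CONTENT_LENGTH"))
    except (ValueError, TypeError):
        return 0


# Hypothetical environ, as a server that already de-chunked the body might pass it.
environ = {
    "HTTP_TRANSFER_ENCODING": "chunked",
    "wsgi.input": io.BytesIO(b"hello"),
}
limited = SimpleLimitedStream(environ["wsgi.input"], content_length_from(environ))
body = limited.read()  # empty, even though wsgi.input holds b"hello"
```

Under these assumptions the body is still sitting in `wsgi.input`, but the zero limit hides it and no error is raised.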
My personal reason I ran into this:
which is basically solved by using a lot of lines to undo the effects of that single line of code:

    import os

    import django
    from django.core.handlers.wsgi import WSGIHandler, WSGIRequest

    os.environ.setdefault('DJANGO_SETTINGS_MODULE', '....')


    class MyWSGIRequest(WSGIRequest):
        def __init__(self, environ):
            super().__init__(environ)
            if "CONTENT_LENGTH" not in environ and "HTTP_TRANSFER_ENCODING" in environ:
                # "unlimit" content length
                self._stream = self.environ["wsgi.input"]


    class MyWSGIHandler(WSGIHandler):
        request_class = MyWSGIRequest


    def my_get_wsgi_application():
        # Like get_wsgi_application, but returns a subclass of WSGIHandler that uses a custom request class.
        django.setup(set_prefix=False)
        return MyWSGIHandler()


    application = my_get_wsgi_application()

But I'd rather not solve this in my own application only, but have it be a Django thing. In the Gunicorn links, there's some allusion to "this won't happen b/c wsgi spec", but that seems like a bad reason from my perspective. At the very least request.read() should not just silently give a 0-length answer. And having tools available such that you don't need to make a "MyXXX" hierarchy would also be nice.
That's the bit for "actually getting request.read() to work when behind gunicorn".
There's also the part where this doesn't work with the development server (runserver). That's fine, given its limited scope, but an error would be better than silently returning nothing (again).
My solution for that one is the following middleware:

    class DisallowChunkedMiddleware:
        def __init__(self, get_response):
            self.get_response = get_response

        def __call__(self, request):
            # Default to "" so that requests without a Transfer-Encoding header don't fail on None.lower().
            transfer_encoding = request.META.get("HTTP_TRANSFER_ENCODING", "")
            if transfer_encoding.lower() == "chunked" and not request.META.get("wsgi.input_terminated"):
                # If we get here, it means that the request has a Transfer-Encoding header with a value of "chunked", but we
                # have no wsgi-layer handling for that. This probably means that we're running the Django development
                # server, and as such our fixes for the Gunicorn/Django mismatch that we put in wsgi.py are not in effect.
                raise ValueError("This server is not configured to support Chunked Transfer Encoding (for requests)")

            return self.get_response(request)

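The condition this middleware tests can be pulled out into a plain function and exercised with ordinary dicts. This is only a sketch: `is_unhandled_chunked` is a hypothetical helper name, and it defaults a missing header to an empty string so that requests without Transfer-Encoding never call `.lower()` on `None`.

```python
def is_unhandled_chunked(meta):
    """Return True for a chunked request that the server did not de-chunk."""
    encoding = meta.get("HTTP_TRANSFER_ENCODING", "")
    return encoding.lower() == "chunked" and not meta.get("wsgi.input_terminated")


# No Transfer-Encoding header at all: nothing to complain about.
no_header = is_unhandled_chunked({})
# Chunked request, but the server never set wsgi.input_terminated.
dev_server = is_unhandled_chunked({"HTTP_TRANSFER_ENCODING": "Chunked"})
# Chunked request behind a server that flags the input as terminated.
handled = is_unhandled_chunked(
    {"HTTP_TRANSFER_ENCODING": "chunked", "wsgi.input_terminated": True}
)
```

Only the middle case should lead to the ValueError; the other two pass through to the view.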
Some links:
- This one seemed related, but probably isn't (it's about forms): https://code.djangoproject.com/ticket/35289
- This seemed related, but is about uwsgi and uses special uwsgi features: https://github.com/btimby/uwsgi-chunked/blob/master/uwsgi_chunked/chunked.py
All of the above applies when using WSGI (I did not test or think about ASGI).