Opened 5 years ago

Closed 5 years ago

#31564 closed Uncategorized (duplicate)

Django fails to return HttpResponse message on early response with large uploads.

Reported by: Jacob Crabtree
Owned by: nobody
Component: File uploads/storage
Version: 3.0
Severity: Normal
Keywords: Memory Error, Large File Upload, HTTP 414
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Short Description:

I am trying to create a Django API that lets people upload large files (think >1GB) to the server. The API will sometimes need to stop the upload early and return a response to the client. I have discovered that Django fails to return the response message if the upload is sufficiently large. Instead, the client receives a generic HTML template stating the error code. On the server side, Django attempts to read *all* of the data left in the pipeline and interpret it as a URI (at least, as far as I can tell), and then throws a 414 Request-URI Too Long error.
This error does not occur with smaller files. I haven't found a lower limit, but 1GB triggers it fairly reliably. If your computer has a lot of RAM, or the error otherwise does not occur, you may need a larger file.
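
For anyone reproducing this, a test file of a chosen size can be generated with a short script. This is only a sketch: `make_test_file` is a hypothetical helper, not part of the attached scripts, and the file contents (repeated `a` bytes) are an arbitrary choice.

```python
import os


def make_test_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write exactly size_bytes of filler data to path and return the byte count.

    Hypothetical helper for illustration; the filler content is arbitrary.
    """
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)

    chunk = b"a" * chunk_size
    written = 0
    with open(path, "wb") as fh:
        # Write in bounded chunks so the script itself never holds the whole file in memory.
        while written < size_bytes:
            n = min(chunk_size, size_bytes - written)
            fh.write(chunk[:n])
            written += n
    return written


# Example call (1 GiB + 1 byte, the Content-Length seen in the curl log of this report):
# make_test_file("test_files/1_gigabyte_file.txt", 1024 ** 3 + 1)
```

Pass a larger `size_bytes` if 1GB does not trigger the error on a given machine.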

Versions: Django v3.0.4, PycURL v7.43.0.2, and Python 3.6.4

Example Django Server to reproduce

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST

@csrf_exempt
@require_POST
def upload(request):
    print('got request')
    read_size = 1024 * 1024 * 50

    data = request.read(read_size)
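    # Respond before the rest of the upload has been read.
    # HttpResponse takes the status code via the `status` keyword.
    return HttpResponse("Early Response!", status=500)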

This server has one endpoint at /upload, and just reads one chunk of data and then returns an early response. Also, in case it matters, in the Django settings file I have added the following line to allow for these large uploads:
DATA_UPLOAD_MAX_MEMORY_SIZE = None

Example upload code

import os
import pycurl
from io import BytesIO


def config_pycurl(url, file_handle, file_size, response_buffer):
    # Create the curl object, set the URL and HTTP method
    c = pycurl.Curl()
    c.setopt(pycurl.VERBOSE, True)
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.POST, 1)

    # Give curl a buffer to write the remote server's response to
    c.setopt(pycurl.WRITEDATA, response_buffer)

    c.setopt(pycurl.SSL_VERIFYPEER, 0)
    c.setopt(pycurl.SSL_VERIFYHOST, 0)

    c.setopt(pycurl.POSTFIELDSIZE, file_size)
    c.setopt(pycurl.READFUNCTION, file_handle.read)
    return c


def send_pycurl(url, file_path, file_size):
    resp_buffer = BytesIO()

    input_file = open(file_path, 'rb')
    c = config_pycurl(url, input_file, file_size, resp_buffer)
    c.perform()

    # Get HTTP response code, clean up handles
    resp_code = c.getinfo(c.RESPONSE_CODE)
    c.close()
    input_file.close()

    # Get response from the server - iso-8859-1 is the default encoding curl performs
    resp_body = resp_buffer.getvalue().decode('iso-8859-1')
    print(f'code: {resp_code}')
    print(f'response: {resp_body}')


if __name__ == '__main__':
    post_url = 'http://localhost:8000/upload'
    name = 'test_files/1_gigabyte_file.txt'
    send_pycurl(post_url, name, os.path.getsize(name))

Result

After the view returns the HttpResponse, Django prints the following stack trace:

Traceback (most recent call last):
  File "c:\python\3.6.4\Lib\wsgiref\handlers.py", line 138, in run
    self.finish_response()
  File "c:\python\3.6.4\Lib\wsgiref\handlers.py", line 183, in finish_response
    self.close()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\servers\basehttp.py", line 113, in close
    self.get_stdin()._read_limited()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\handlers\wsgi.py", line 28, in _read_limited
    result = self.stream.read(size)
MemoryError
[10/May/2020 19:32:23] code 414, message Request-URI Too Long
[10/May/2020 19:32:23] "" 414 -
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 63835)
Traceback (most recent call last):
  File "c:\python\3.6.4\Lib\socketserver.py", line 639, in process_request_thread
    self.finish_request(request, client_address)
  File "c:\python\3.6.4\Lib\socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "c:\python\3.6.4\Lib\socketserver.py", line 696, in __init__
    self.handle()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\servers\basehttp.py", line 174, in handle
    self.handle_one_request()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\servers\basehttp.py", line 187, in handle_one_request
    self.send_error(414)
  File "c:\python\3.6.4\Lib\http\server.py", line 473, in send_error
    self.wfile.write(body)
  File "c:\python\3.6.4\Lib\socketserver.py", line 775, in write
    self._sock.sendall(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
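
The 414 in this log is consistent with the unread upload body being parsed as the next request line on the same connection. The standard library's `http.server` reads the request line with a 65537-byte cap and answers 414 when more than 65536 bytes come back, and the `handle_one_request` frame in the trace calls `send_error(414)` in the same way. The sketch below mimics that length check on an in-memory stream; `classify_request_line` is a hypothetical helper for illustration, not Django or standard library code.

```python
import io

# Request-line limit applied by the standard library's http.server.
MAX_REQUEST_LINE = 65536


def classify_request_line(rfile):
    """Return 414 if the next "line" on rfile exceeds the limit, else None.

    rfile is any binary file-like object standing in for the socket.
    """
    raw = rfile.readline(MAX_REQUEST_LINE + 1)
    if len(raw) > MAX_REQUEST_LINE:
        return 414
    return None


# Leftover upload bytes with no newline look like an over-long request line.
leftover_body = io.BytesIO(b"x" * 100_000)
print(classify_request_line(leftover_body))

# A real request line is short and passes the check.
normal = io.BytesIO(b"POST /upload HTTP/1.1\r\n")
print(classify_request_line(normal))
```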

On the PycURL side, I see the following from its verbose output (the script's own printed output was interleaved with it in the console and is listed separately underneath):

*   Trying ::1...
* TCP_NODELAY set
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /upload HTTP/1.1
Host: localhost:8000
User-Agent: PycURL/7.43.0.2 libcurl/7.60.0 OpenSSL/1.1.0h zlib/1.2.11 c-ares/1.14.0 WinIDN libssh2/1.8.0 nghttp2/1.32.0
Accept: */*
Content-Length: 1073741825
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue

< HTTP/1.1 100 Continue
< HTTP/1.1 500 Internal Server Error
< Date: Sun, 10 May 2020 23:32:23 GMT
< Server: WSGIServer/0.2 CPython/3.6.4
< Content-Type: text/html
< X-Frame-Options: DENY
< Content-Length: 145
< Vary: Cookie
< X-Content-Type-Options: nosniff
* HTTP error before end of send, stop sending
< 
* Closing connection 0

Script output:

code: 500
response: 
<!doctype html>
<html lang="en">
<head>
  <title>Server Error (500)</title>
</head>
<body>
  <h1>Server Error (500)</h1><p></p>
</body>
</html>

Expected Result

I would expect Django to respond with a 500 Server Error, and there should be no memory errors. It appears from the stack trace that Django attempts to read as many bytes as Content-Length originally advertised, and then throws an error when the OS refuses to allocate that much memory. The 414 error is also not expected.
On PycURL's side, the response code variable should be 500, and the response body should be "Early Response!", the string passed to Django's HttpResponse.
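
To illustrate the memory concern: discarding the unread remainder of a request body does not require a single read of the full Content-Length. The sketch below drains a stream in fixed-size chunks so that peak memory stays bounded by the chunk size. `drain` is a hypothetical helper written for this illustration; it is not Django's implementation, and the 64 KiB chunk size is an arbitrary assumption.

```python
import io


def drain(stream, remaining, chunk_size=64 * 1024):
    """Discard up to `remaining` unread bytes from stream; return how many were discarded.

    Reads at most chunk_size bytes at a time, so memory use does not
    grow with the size of the upload.
    """
    discarded = 0
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            # The client stopped sending before Content-Length was reached.
            break
        discarded += len(chunk)
        remaining -= len(chunk)
    return discarded


# Simulate a 200,000-byte body of which the view has read only the first 50,000 bytes.
body = io.BytesIO(b"a" * 200_000)
body.read(50_000)
print(drain(body, remaining=150_000))
```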

Please let me know if any additional information is required; this is my first time submitting a bug here! I have also attached the above scripts as files so that it is hopefully easier to set up and reproduce.

Attachments (2)

Upload.py (1.3 KB) - added by Jacob Crabtree 5 years ago.
PycURL Script for Upload
views.py (361 bytes) - added by Jacob Crabtree 5 years ago.
Django Views


Change History (3)

by Jacob Crabtree, 5 years ago

Attachment: Upload.py added

PycURL Script for Upload

by Jacob Crabtree, 5 years ago

Attachment: views.py added

Django Views

comment:1 by Mariusz Felisiak, 5 years ago

Resolution: duplicate
Status: new → closed
Summary: Django fails to return HttpResponse message on early response with large uploads → Django fails to return HttpResponse message on early response with large uploads.

Thanks for this ticket; however, I cannot reproduce this behavior on Linux. It's probably a Windows issue that is not related to Django. You can try one of the support channels.

Duplicate of #30503.
