Opened 14 years ago

Closed 14 years ago

Last modified 13 years ago

#15785 closed Bug (fixed)

HttpRequest.read(NUM_BYTES) can read beyond the end of wsgi.input stream. (Violation of WSGI spec & under-defined behaviour)

Reported by: Tom Christie Owned by: Tom Christie
Component: HTTP handling Version: 1.3-rc
Severity: Normal Keywords: http, wsgi
Cc: Maniac@…, Tom Christie Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Okay, this is the underlying cause behind #15762 and #15763.
I've marked those as duplicates of this ticket.

Please see discussion on this bug here: https://groups.google.com/forum/#!topic/django-developers/VG1ueWTSs_g

The problem is now that HttpRequest exposes a read() method, user code can do something like:

request_content = json.load(request)

at the moment that will:

  • break the wsgi spec, as the client app is contracted not to attempt to read more than CONTENT-LENGTH bytes from wsgi.input
  • result in under-defined behaviour, although it appears to work right now.
  • break when used in the test client, as per #15762

I've attached a patch with tests for this issue, which:

  • Changes WSGIRequest._stream to be a property that is (always) instantiated as a LimitedStream when first accessed.
  • Removes some redundant code in HttpRequest and MultiPartParser.
  • Fixes some minor bugs in tests/regressiontests/requests/tests.py
  • Adds two tests for MultiPartParser to check graceful behaviour on truncated or empty multipart requests.
  • Adds a test for TestClient request.read(LARGE_BUFFER) behaviour.

Attachments (6)

limited_stream.diff (15.7 KB ) - added by Tom Christie 14 years ago.
limited_stream_lazy.diff (14.9 KB ) - added by Tom Christie 14 years ago.
limited_stream_lazy_plus_extra_tests.diff (17.3 KB ) - added by Tom Christie 14 years ago.
limited_stream_lazy_plus_extra_tests.2.diff (16.5 KB ) - added by Tom Christie 14 years ago.
limited_stream_lazy_plus_extra_tests.3.diff (16.5 KB ) - added by anonymous 14 years ago.
limited_stream_lazy_plus_extra_tests.4.diff (16.5 KB ) - added by Tom Christie 14 years ago.

Download all attachments as: .zip

Change History (24)

by Tom Christie, 14 years ago

Attachment: limited_stream.diff added

comment:1 by Ivan Sagalaev, 14 years ago

Cc: Maniac@… added

by Tom Christie, 14 years ago

Attachment: limited_stream_lazy.diff added

comment:2 by Tom Christie, 14 years ago

Updated patch (limited_stream_lazy.diff)
Thanks to Ivan's testing it turns out the defered creation of the LimitedStream isn't necessary.
This should do the job nicely.

comment:3 by Ivan Sagalaev, 14 years ago

Tom, since you're hacking on it so actively :-) how about adding another test somewhere near read_limited_stream that would test reading from a GET request that doesn't have a Content-length header? It would test the situation with some generic view code always calling request.read() irregardless of the request type. The sensible thing to do here is not to break and just give it an empty string.

comment:4 by Tom Christie, 14 years ago

Keywords: http wsgi added
Triage Stage: UnreviewedAccepted

Good plan, I'll get on to that.
I'm assuming it's cool to drop this ticket into 'Accepted' now.

comment:5 by Tom Christie, 14 years ago

Cc: Tom Christie added

by Tom Christie, 14 years ago

comment:6 by Tom Christie, 14 years ago

Added some extra tests and cleaned up some test code to also include the test for #14753, which has similar logic.

I've added this as limited_stream_lazy_plus_extra_tests.2.diff.

As a final sanity check I reverted, applied the tests patches only, these two fail:

  • test_read_numbytes_from_empty_request
  • test_read_numbytes_from_nonempty_request

applied the code patches, checked they passed.

Ivan, could you review, I'm guessing this should be able to go into 'Ready for checkin' now.

Ta! :)

comment:7 by Ivan Sagalaev, 14 years ago

Couple of comments, small nits, actually:

+        r = {
+            'CONTENT_LENGTH': len(payload),
+            'CONTENT_TYPE':   client.MULTIPART_CONTENT,
+            'PATH_INFO':      "/file_uploads/echo/",
+            'REQUEST_METHOD': 'POST',
+            'wsgi.input':     client.FakePayload(payload),
+        }

Django uses a different style in such cases: values should not be vertically aligned but separated from keys with a colon and a single space.

+        try:
+            content_length = int(self.environ.get('CONTENT_LENGTH', 0))
+        except (ValueError, TypeError):
+            content_length = 0

I don't see how TypeError can be raised here. Also, the default value is defined in two places. I'd do it like this instead:

try:
  content_length = int(self.environ.get('CONTENT_LENGTH'))
except ValueError:
  content_length = 0

comment:8 by Graham Dumpleton, 14 years ago

If you are going to do:

try:
  content_length = int(self.environ.get('CONTENT_LENGTH'))
except ValueError:
  content_length = 0

then you must catch TypeError as lack of CONTENT_LENGTH would return None which yields a TypeError when int() is applied to it.

So yes, TypeError shouldn't have been needed in what they originally had, but is for what you have suggested.

comment:9 by Ivan Sagalaev, 14 years ago

Oh, good catch, thanks!

comment:10 by anonymous, 14 years ago

Coolio, I've updated the patch to reflect those points.

It's my first contribution, so thanks for the feedback and helping this along...

comment:11 by Ivan Sagalaev, 14 years ago

Triage Stage: AcceptedReady for checkin

This now looks to me ready for checkin. Thanks, Tom!

comment:12 by vung, 14 years ago

Easy pickings: unset

This patch would also fix #15890

comment:13 by Jannis Leidel, 14 years ago

Patch needs improvement: set
UI/UX: unset

Not sure what's wrong, but one test doesn't pass anymore:

Creating test database for alias 'default'...
Creating test database for alias 'other'...
Destroying old test database 'other'...
....F...........
======================================================================
FAIL: test_empty_multipart_raises_error (regressiontests.file_uploads.tests.FileUploadTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jezdez/Code/git/django/tests/regressiontests/file_uploads/tests.py", line 217, in test_empty_multipart_raises_error
    self.assertRaises(MultiPartParserError, lambda r: self.client.request(**r), r)
AssertionError: MultiPartParserError not raised

----------------------------------------------------------------------
Ran 16 tests in 1.351s

FAILED (failures=1)
Destroying test database for alias 'default'...
Destroying test database for alias 'other'...

comment:14 by Tom Christie, 14 years ago

Patch needs improvement: unset

The expected behavior of MultiPartParser when given a Content-Length of 0 has changed slightly since this ticket/patch was submitted.
Please see ticket #16201 for details of how & why.

I've updated the test_empty_multipart_raises_error() test to instead be test_empty_multipart_handled_gracefully(), and test that it returns an empty QueryDict, rather than test that it throws a MultiPartParseError.

I've verified that the patch still applies okay to the latest revision, and that the tests are now all passing again.

Patch attached as limited_stream_lazy_plus_extra_tests.4.diff

comment:15 by Jannis Leidel, 14 years ago

Resolution: fixed
Status: newclosed

In [16479]:

Fixed #15785 -- Stopped HttpRequest.read() from reading beyond the end of a wsgi.input stream and removed some redundant code in the multipartparser. Thanks, tomchristie, grahamd and isagalaev.

comment:16 by anonymous, 14 years ago

Doesn't this revert the changes made in [14453]?

comment:17 by Jannis Leidel, 14 years ago

It reverts it, but it's not needed any more since we wrap the wsgi.input in a LimitedStream in any case, whether development server or not.

comment:18 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

Note: See TracTickets for help on using tickets.
Back to Top