Opened 14 years ago
Last modified 3 years ago
#14035 assigned Bug
Cannot access POST after request.encoding was set to a custom value
Reported by: | Piotr Czachur | Owned by: | Piotr Czachur |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | net147 | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Issue happens for multipart/form-data and WSGi handler.
#Django trunk r13353. class DetectEncodingMiddleware(object): def process_request(request): # Lets say we define data encoding in POST param called "enc" enc = request.POST['enc'] # request._load_post_and_files() request.encoding = enc # request._set_encoding(): del self._post request.POST # request._load_post_and_files() # raise AttributeError("You cannot set the upload handlers after the upload has been processed.");
This issue is related: #12522
Attachments (2)
Change History (22)
comment:1 by , 14 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 14 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
What you mean by modifying request here? I'm just setting request.encoding to encoding which I assume client will use. It's common issue if you obtain forms "from outside" your pages.
In fact Django HttpRequest is designed to handle encoding switching on fly:
def _set_encoding(self, val): """ Sets the encoding used for GET/POST accesses. If the GET or POST dictionary has already been created, it is removed and recreated on the next access (so that it is decoded correctly). """ self._encoding = val if hasattr(self, '_get'): del self._get if hasattr(self, '_post'): del self._post encoding = property(_get_encoding, _set_encoding)
In fact this encoding switching works fine when form data is encoded using "application/x-www-form-urlencoded". My point is that it doesn't work for "multipart/form-data".
comment:3 by , 14 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
You are taking the request object you are given, and you are changing it. That is, you are *modifying* the request. You shouldn't be doing that.
Django receives a HTTP request and parses it. If the content type specified in the HTTP request can be interpreted into multiple values, Django will parse the raw content into a POST dictionary.
You have a use case where the content encoding isn't being transmitted as part of the HTTP request. Therefore, it's up to you to parse the raw request data.
comment:4 by , 14 years ago
You say I shouldn't be doing that (modifying encoding of the HttpRequest), while Django docs says:
===
HttpRequest.encoding
A string representing the current encoding used to decode form submission data (or None, which means the DEFAULT_CHARSET setting is used). You can write to this attribute to change the encoding used when accessing the form data. Any subsequent attribute accesses (such as reading from GET or POST) will use the new encoding value. Useful if you know the form data is not in the DEFAULT_CHARSET encoding.
===
My case fits exactly in this scenario. So do we get each other ?
comment:5 by , 14 years ago
Triage Stage: | Unreviewed → Accepted |
---|
My apologies. You've obviously read (and paid closer attention to) the docs than I have :-)
In which case, accepting the ticket.
comment:6 by , 14 years ago
Nice to hear that finally...
It's in docs for quite a long time, it's really surprising you didn't read docs carefully while so easily closing such low-level tickets.
P.S.
You forgot to reopen the ticket.
comment:7 by , 14 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
I have triaged approx 40 tickets today. I need to triage another 120 before I can start work on fixing bugs in the 1.3 release. I hope you'll understand that occasionally, human beings make errors when they're doing labor intensive work. I hope you've also noted that if you explain yourself clearly, human beings are usually willing to admit their mistakes and move on.
On the other hand, human beings don't generally take too kindly to being snarked at when they are volunteering their time to help out others. :-)
comment:8 by , 14 years ago
Other human beings, after spending several hours on investigating a problem, are pretty unhappy when their work gets wontfix'ed in a single click without a simple question ;-)
Have a nice day!
comment:9 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:12 by , 12 years ago
Cc: | added |
---|
comment:13 by , 12 years ago
Taking a bit of a look at this - and the problem ends up not being trivial.
First and obvious approach to a fix would be to add the deletion of _files to the property setter for encoding - as is done for _post
But taking that step just uncovers a bit more of a yak
First off is that WSGIRequest and HttpRequest differ in the way the POST attribute works - the former will refresh it if _post is deleted, the latter will not. This is perhaps by design - but the differences between these classes is not documented at all.
Second - there are problems with re-parsing the multipart data, as the first time parsing involves reading the post data - which sets the _read_started flag
This prevents _load_post_and_files from being able to reparse with the new encoding - instead it shortcuts to _mark_post_parse_error.
Even if one changes that detail - by allowing the object to set the full _body attribute on the first parse, you still have at least this test fail:
test_body_after_POST_multipart
Which is there to specifically verify that accessing body is not allowed after parsing post - because we don't want to repeat the expensive operation of parsing the multipart data
basically there are aspects here that assume parts of this machinery will only run through once, in a certain order
comment:14 by , 12 years ago
Generally speaking, content encoding should be set with the charset attribute of the Content-Encoding header. So we speak of corner cases here. I'd be more enclined to fix the docs and not support changing encoding after parsing, unless we have a common use case showing a clear need for this.
comment:15 by , 12 years ago
I think I've stumbled on the same bug, or a related side affect. I'm receiving some data ("Content-Type: multipart/alternative;") in a POST from an external site (SendGrid) site as per their API ( http://sendgrid.com/docs/API_Reference/Webhooks/parse.html ) . The encoding type for one of the fields (text) in the POST varies, and is given in another POST field (charsets). If read the encoding type in charsets and then set request.encoding to this encoding type, the QueryDict vanishes when I try to get to request.POSTtext, as in
MultiValueDictKeyError: "Key 'text' not found in <QueryDict: {}>"
This sounds like an edge use case to me. I'm going to assume that this will not get fixed, and look for a work around, but I thought it's worth having a more helpful exception, telling that you can't change the encoding in this scenario, and updating the docs.
comment:16 by , 12 years ago
Status: | reopened → new |
---|
comment:17 by , 11 years ago
Consider following scenario:
1) read HttpRequest.POST (raw request body is consumed)
2) change HttpRequest.encoding (discards parsed POST entirely)
3) read HttpRequest.POST again
Current behaviour for 3) depends on content-type:
- for multipart/form-data: returns {}
- for application/x-www-form-urlencoded: returns brand new dictionary with values encoded in brand new encoding, just like docs say
For "multipart/form-data" behaviour is against documentation and gives false impression that everything went fine. Only way for user to notice such error, is checking HttpRequest._post_parse_error attribute, but I doubt it's a way to go, because it's not pulic nor documented.
My proposition is to always raise exception when we try to re-parse POST for multipart/form-data. Patch addtached.
comment:18 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
by , 11 years ago
Attachment: | raising_exception.diff added |
---|
Raise exception when try to re-parse multipart/form-data
by , 11 years ago
Attachment: | raising_exception.2.diff added |
---|
Raise exception when try to re-parse multipart/form-data
comment:19 by , 8 years ago
hi all, I just found this ticket after I created a related question on stackoverflow regarding the same sendgrid problem:
http://stackoverflow.com/questions/42862082/reencode-django-request-body-with-another-encoding
is there a way to get around this at the moment? thank you for your help!
comment:20 by , 3 years ago
Hello,
5 years later and 3h of debugging... I stumbled on this. There was a discrepancy between my production code and my unittests: works in prod, not in test. My code listens to Paypal IPNs, they do not necessarily encode their payload in utf-8, it's given in a field of the POST data. They do not document the content-type used. The unittests use multipart/form-data.
Using https://docs.djangoproject.com/en/3.2/ref/unicode/#form-submission lead me to believe I could change the encoding after reading the POST data -> cannot work in unittest without changing the content type (all good once the actual content type to use is 'guessed'). I believe a note must be added to https://docs.djangoproject.com/en/3.2/ref/unicode/#form-submission explaining that changing the encoding based on POST data content only works if the content-type is application/x-www-form-urlencoded.
Since you shouldn't be modifying the request anyway, I'm afraid I don't see the problem here. If you need some sort of custom payload handling, then you should be getting the raw POST data and parsing it yourself, not trying to munge Django's POST handling.