#20530 closed Bug (fixed)
Incorrect QUERY_STRING handling on Python 3
Reported by: | Armin Ronacher | Owned by: | Aymeric Augustin |
---|---|---|---|
Component: | Core (URLs) | Version: | 1.5 |
Severity: | Normal | Keywords: | |
Cc: | | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Certain browsers (IE, cough) will not fully encode the path in the request in all situations. As such, you will encounter non-ASCII letters in the request line. Currently the QueryDict does not handle that properly. It also means that the WSGI QUERY_STRING variable needs to be handled the same way as PATH_INFO and SCRIPT_NAME.
Here is what is necessary to handle the case properly:
- the environ['QUERY_STRING'] value needs to go through the PEP 3333 dance on Python 3 that creates a bytes object
- unquoting happens on the bytes
- finally everything is decoded using the intended encoding (UTF-8)
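The three steps above can be sketched roughly as follows. This is only an illustration of the order of operations, not the code Django uses: the helper names `get_query_bytes` and `parse_query_string` are hypothetical, and invalid UTF-8 is replaced rather than rejected purely to keep the example short.

```python
from urllib.parse import unquote_to_bytes


def get_query_bytes(environ):
    # Step 1: the "PEP 3333 dance". On Python 3 the server hands over a str
    # whose code points are really bytes decoded as ISO-8859-1, so encoding
    # it back with the same codec recovers the original bytes.
    return environ.get("QUERY_STRING", "").encode("iso-8859-1")


def parse_query_string(environ, encoding="utf-8"):
    # Hypothetical helper: returns a list of (name, value) text pairs.
    pairs = []
    for chunk in get_query_bytes(environ).split(b"&"):
        if not chunk:
            continue
        name, _, value = chunk.partition(b"=")
        # Step 2: unquote on bytes. '+' is turned into a space first so that
        # a percent-encoded plus (%2B) survives as a literal '+'.
        name = unquote_to_bytes(name.replace(b"+", b" "))
        value = unquote_to_bytes(value.replace(b"+", b" "))
        # Step 3: decode to text using the intended encoding.
        pairs.append(
            (name.decode(encoding, "replace"), value.decode(encoding, "replace"))
        )
    return pairs
```

With this ordering, a raw (unencoded) UTF-8 value and its percent-encoded form both come out as the same text.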
The logic currently employed by QueryDict in combination with the WSGIRequest object is doubly wrong:
- the WSGIRequest object is not properly doing the dance and passes a (potentially mangled) unicode string to QueryDict
- QueryDict decodes that incorrectly formatted unicode string (WSGI on 3.x intentionally incorrectly encodes information), causing invalid data to show up in request.GET
Independently of that, if bytes are passed to the QueryDict, it does not decode them properly unless they contain only ASCII.
Change History (8)
comment:1 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 11 years ago
Item 1 above was fixed by https://github.com/django/django/commit/8aaca651cf5732bbf395d24a7d9f2edfab00250c#L0L136
comment:4 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Thanks for the report. I'll take care of that.