Opened 15 years ago
Closed 13 years ago
#11903 closed Bug (invalid)
WSGIRequest.path not quoted properly
Reported by: | ianb | Owned by: | Fabián Ezequiel Gallina |
---|---|---|---|
Component: | HTTP handling | Version: | 1.1 |
Severity: | Normal | Keywords: | |
Cc: | ianb@… | Triage Stage: | Design decision needed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
WSGIRequest.__init__ contains the code:
    self.path = '%s%s' % (script_name, path_info)
Both script_name and path_info are url-decoded. That is, if you request /Foo%20bar then PATH_INFO will be '/Foo bar' -- to get the accurate path you have to re-encode both values.
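To illustrate the report, here is a minimal sketch of the difference between the concatenated path and a re-encoded one. It uses Python 3's standard `urllib.parse.quote` as a stand-in and is not Django's code; the `environ` dictionary and the helper name `build_quoted_path` are hypothetical and exist only for this example.

```python
from urllib.parse import quote


def build_quoted_path(script_name: str, path_info: str) -> str:
    """Re-encode the already-decoded SCRIPT_NAME and PATH_INFO values.

    Hypothetical helper for illustration; quote() leaves "/" unescaped
    by default, so path separators survive.
    """
    return quote(script_name) + quote(path_info)


# Assumed environ for a request to /Foo%20bar: the server has already
# url-decoded PATH_INFO before the application sees it.
environ = {"SCRIPT_NAME": "", "PATH_INFO": "/Foo bar"}

# What the code in the report produces: a decoded path.
unquoted = "%s%s" % (environ["SCRIPT_NAME"], environ["PATH_INFO"])

# What the report asks for: the path as it appeared on the wire.
quoted = build_quoted_path(environ["SCRIPT_NAME"], environ["PATH_INFO"])

print(unquoted)  # /Foo bar
print(quoted)    # /Foo%20bar
```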
Attachments (2)
Change History (16)
comment:1 by , 15 years ago
milestone: | → 1.2 |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 15 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:3 by , 15 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
by , 15 years ago
Attachment: | 11903.diff added |
---|
comment:4 by , 15 years ago
Has patch: | set |
---|---|
Needs tests: | set |
comment:5 by , 15 years ago
Component: | Uncategorized → HTTP handling |
---|
comment:7 by , 15 years ago
Owner: | changed from | to
---|
comment:8 by , 15 years ago
Patch needs improvement: | unset |
---|
The proposed approach was not correct: urlencode works with a sequence of two-element tuples or a dictionary. urlquote should be used instead, since script_name and path_info are strings.
The attached patch contains the correction for it and a test.
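The distinction can be sketched with the Python 3 standard library equivalents, `urllib.parse.urlencode` and `urllib.parse.quote`. This is an assumption-laden illustration: the attached patch itself is not reproduced here, and Django's `urlquote` is only approximated by `quote`.

```python
from urllib.parse import quote, urlencode

# urlencode builds a query string from key/value pairs.
pairs = urlencode({"q": "Foo bar"})

# quote percent-encodes a single string, which is what a path needs.
path = quote("/Foo bar")

# Passing a plain, non-empty string to urlencode is rejected.
try:
    urlencode("/Foo bar")
    raised = False
except TypeError:
    raised = True

print(pairs)   # q=Foo+bar
print(path)    # /Foo%20bar
print(raised)  # True
```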
comment:9 by , 15 years ago
milestone: | 1.2 |
---|---|
Triage Stage: | Accepted → Design decision needed |
It seems like this could introduce backwards-compatibility issues (even though, from a quick look at the docs, there's no specific mention of quoted/unquoted when referring to request.path).
Is there some standard which the proposal to quote request.path would follow? I couldn't find any reference in PEP 333.
This also creates a disparity between path and path_info. Applications may be using both, and to have one quoted and the other not seems odd. And since path_info is used by Django's URL resolution, quoting that may cause problems.
In any case, this isn't a regression and probably needs some discussion, so I'm bumping out of the already-late 1.2 phase.
comment:10 by , 15 years ago
The quoting of PATH_INFO is specified in the CGI specification, which PEP 333 refers to. This is also true for mod_python (and Apache generally).
comment:11 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:12 by , 14 years ago
Needs tests: | unset |
---|
comment:13 by , 13 years ago
Easy pickings: | unset |
---|---|
Resolution: | → invalid |
Status: | new → closed |
UI/UX: | unset |
I believe the current behavior is correct. Django handles the encoding / decoding wherever necessary and provides unicode objects to the programmer.
request.path is unicode and has no reason to be url-encoded. (In the code quoted in the original report, path_info is unicode, which guarantees that self.path is unicode.)
This is a custom API of Django, which means we aren't bound by the WSGI or CGI spec there (while we are for request.META['PATH_INFO']).
To sum up, if I'm typing "www.mysite.com/foo bar/" in my browser, the browser will issue a request for "/foo%20bar/", but Django will convert that back to u"/foo bar/".
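The round trip described in this comment can be sketched with Python 3's `urllib.parse.unquote`. This only models the decoding step under the assumption that the path is UTF-8 percent-encoded; it is not Django's request handling, and the variable names are made up for the example.

```python
from urllib.parse import unquote

# What the browser sends on the wire for "www.mysite.com/foo bar/".
requested = "/foo%20bar/"

# The decoded text form that application code would work with.
decoded = unquote(requested)

# Non-ASCII characters decode the same way (UTF-8 is unquote's default).
decoded_non_ascii = unquote("/caf%C3%A9/")

print(decoded)            # /foo bar/
print(decoded_non_ascii)  # /café/
```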
Patch for #11903