#26308 closed Bug (fixed)
Django 1.9.3/1.8.10: is_safe_url() crashes with a byestring URL on Python 2
Reported by: | John Eskew | Owned by: | nobody |
---|---|---|---|
Component: | HTTP handling | Version: | 1.8 |
Severity: | Normal | Keywords: | |
Cc: | ned@…, pauloswald@…, michaelvantellingen@…, felisiak.mariusz@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
utils/http.py:is_safe_url() used to always call url.replace() on every url. In Django 1.8.10, the method has been split into a public and private method and is called twice - once with the plain, passed-in url and once with a url that is the result of a url.replace() call.
The code is in utils/http.py here:
https://github.com/django/django/blob/382ab137312961ad62feb8109d70a5a581fe8350/django/utils/http.py#L282
If a non-unicode URL is passed into is_safe_url(), it now throws the following exception:
TypeError: must be unicode, not str
due to the unicodedata.category() call on this line:
Change History (21)
comment:1 by , 9 years ago
Cc: | added |
---|
comment:2 by , 9 years ago
comment:3 by , 9 years ago
The issue is on Python 2, where bytestrings may be as common as unicode strings, especially for strings like URLs which have seldom non-ASCII chars in them. I'd suggest adding a force_text
somewhere which could be removed once we drop Python 2.
comment:4 by , 9 years ago
Cc: | added |
---|
comment:5 by , 9 years ago
Cc: | added |
---|
follow-up: 9 comment:6 by , 9 years ago
Triage Stage: | Unreviewed → Accepted |
---|
This was also reported in IRC and on the security mailing list. In both cases the problematic code passes request.META.get('HTTP_REFERER')
(str
(bytes) on Python 2 and str
(unicode on Python 3) to is_safe_url()
).
Doing this crashes Python 2 even before the security patch if the URL contains any non-ASCII characters. I guess there isn't much disadvantage to a force_text()
call. Some projects are adapting by inserting it themselves.
comment:7 by , 9 years ago
In our application (edx-platform), we use the following Django module in our urls.py for language setting:
url(r'^i18n/', include('django.conf.urls.i18n')),
The application then performs a POST to http://example.com/i18n/setlang/, which flows through the following stack:
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py in get_response response = wrapped_callback(request, *callback_args, **callback_kwargs) /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/decorators.py in inner return func(*args, **kwargs) /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/decorators.py in inner return func(*args, **kwargs) /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/views/i18n.py in set_language if not is_safe_url(url=next, host=request.get_host()): /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/http.py in is_safe_url return _is_safe_url(url, host) and _is_safe_url(url.replace('\\', '/'), host) /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/http.py in _is_safe_url if unicodedata.category(url[0])[0] == 'C':
So the URL handling in its current form never flows through our code - only Django code.
comment:8 by , 9 years ago
For anyone blocked by this bug, below is a monkey patch that should function as a work-around.
NO IMPLIED GUARANTEES OR WARRANTIES!
from django.utils import http from django.utils.encoding import force_text def patch(): """ Monkey patch the django.utils.http.is_safe_url function to convert the incoming url and host parameters to unicode. """ def create_is_safe_url_wrapper(wrapped_func): def _wrap_is_safe_url(*args, **kwargs): def _conv_text(value): return None if value is None else force_text(value) return wrapped_func( # Converted *args. *tuple(map(_conv_text, args)), # Converted **kwargs. **{key: _conv_text(value) for key, value in kwargs.items()} ) return _wrap_is_safe_url http.is_safe_url = create_is_safe_url_wrapper(http.is_safe_url)
comment:9 by , 9 years ago
Replying to timgraham:
This was also reported in IRC and on the security mailing list. In both cases the problematic code passes
request.META.get('HTTP_REFERER')
(str
(bytes) on Python 2 andstr
(unicode on Python 3) tois_safe_url()
).
Doing this crashes Python 2 even before the security patch if the URL contains any non-ASCII characters. I guess there isn't much disadvantage to a
force_text()
call. Some projects are adapting by inserting it themselves.
As far As i'm concerned we should add following patch do i18n:
-
django/views/i18n.py
a b from django.utils._os import upath 13 13 from django.utils.encoding import smart_text 14 14 from django.utils.formats import get_format, get_format_modules 15 15 from django.utils.http import is_safe_url 16 from django.utils.text import force_text 16 17 from django.utils.translation import ( 17 18 LANGUAGE_SESSION_KEY, check_for_language, get_language, to_locale, 18 19 ) … … def set_language(request): 34 35 """ 35 36 next = request.POST.get('next', request.GET.get('next')) 36 37 if not is_safe_url(url=next, host=request.get_host()): 37 next = request.META.get('HTTP_REFERER')38 next = force_text(request.META.get('HTTP_REFERER'), strings_only=True, errors='replace') 38 39 if not is_safe_url(url=next, host=request.get_host()): 39 40 next = '/' 40 41 response = http.HttpResponseRedirect(next)
comment:10 by , 9 years ago
Why would you patch there instead of is_safe_url()
itself to prevent Python 2 users from having to add similar boilerplate in similar code? I was thinking something like (with errors='replace'
if that's needed):
-
django/utils/http.py
diff --git a/django/utils/http.py b/django/utils/http.py index 6e782bd..73e45fd 100644
a b def is_safe_url(url, host=None): 290 290 url = url.strip() 291 291 if not url: 292 292 return False 293 if six.PY2: 294 url = force_text(url) 293 295 # Chrome treats \ completely as / in paths but it could be part of some 294 296 # basic auth credentials so we need to check both URLs. 295 297 return _is_safe_url(url, host) and _is_safe_url(url.replace('\\', '/'), host)
comment:12 by , 9 years ago
Has patch: | set |
---|
comment:13 by , 9 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:14 by , 9 years ago
Summary: | Django 1.8.10: Calling is_safe_url() with a non-unicode url throws an exception. → Django 1.9.3/1.8.10: is_safe_url() crashes with a byestring URL on Python 2 |
---|
comment:15 by , 9 years ago
Cc: | added |
---|
comment:19 by , 9 years ago
In 552f03869ea7f3072b3fa19ffb6cb2d957fd8447:
Added safety to URL decoding in is_safe_url() on Python 2
The errors='replace' parameter to force_text altered the URL before checking
it, which wasn't considered sane. Refs [24fc935218] and [ada7a4aef].
comment:20 by , 9 years ago
In 9c195d45a64b0d2baee218e617ca3a762efc0bf5:
[1.9.x] Added safety to URL decoding in is_safe_url() on Python 2
The errors='replace' parameter to force_text altered the URL before checking
it, which wasn't considered sane. Refs 24fc935218 and ada7a4aef.
Backport of 552f03869e from master.
comment:21 by , 9 years ago
In beb392b85e71fdd41209d323126181d74090fecb:
[1.8.x] Added safety to URL decoding in is_safe_url() on Python 2
The errors='replace' parameter to force_text altered the URL before checking
it, which wasn't considered sane. Refs 24fc935218 and ada7a4aef.
Backport of 552f03869e from master.
The issue came up after we privately developed the patch, however, we didn't see a reason why an application would use a non-unicode value for a URL since Django handles all URLs as unicode. Using bytestring URLs already crashed
is_safe_url()
on Python 3.What's your use case? Could you fix your application instead?