Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#26308 closed Bug (fixed)

Django 1.9.3/1.8.10: is_safe_url() crashes with a byestring URL on Python 2

Reported by: John Eskew Owned by: nobody
Component: HTTP handling Version: 1.8
Severity: Normal Keywords:
Cc: ned@…, pauloswald@…, michaelvantellingen@…, felisiak.mariusz@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

utils/http.py:is_safe_url() used to always call url.replace() on every url. In Django 1.8.10, the method has been split into a public and private method and is called twice - once with the plain, passed-in url and once with a url that is the result of a url.replace() call.

The code is in utils/http.py here:
https://github.com/django/django/blob/382ab137312961ad62feb8109d70a5a581fe8350/django/utils/http.py#L282

If a non-unicode URL is passed into is_safe_url(), it now throws the following exception:

TypeError: must be unicode, not str

due to the unicodedata.category() call on this line:

https://github.com/django/django/blob/382ab137312961ad62feb8109d70a5a581fe8350/django/utils/http.py#L300

Change History (21)

comment:1 by Ned Batchelder, 9 years ago

Cc: ned@… added

comment:2 by Tim Graham, 9 years ago

The issue came up after we privately developed the patch, however, we didn't see a reason why an application would use a non-unicode value for a URL since Django handles all URLs as unicode. Using bytestring URLs already crashed is_safe_url() on Python 3.

What's your use case? Could you fix your application instead?

comment:3 by Claude Paroz, 9 years ago

The issue is on Python 2, where bytestrings may be as common as unicode strings, especially for strings like URLs which have seldom non-ASCII chars in them. I'd suggest adding a force_text somewhere which could be removed once we drop Python 2.

comment:4 by Paul Oswald, 9 years ago

Cc: pauloswald@… added

comment:5 by Michael, 9 years ago

Cc: michaelvantellingen@… added

comment:6 by Tim Graham, 9 years ago

Triage Stage: UnreviewedAccepted

This was also reported in IRC and on the security mailing list. In both cases the problematic code passes request.META.get('HTTP_REFERER') (str (bytes) on Python 2 and str (unicode on Python 3) to is_safe_url()).

Doing this crashes Python 2 even before the security patch if the URL contains any non-ASCII characters. I guess there isn't much disadvantage to a force_text() call. Some projects are adapting by inserting it themselves.

comment:7 by John Eskew, 9 years ago

In our application (edx-platform), we use the following Django module in our urls.py for language setting:

url(r'^i18n/', include('django.conf.urls.i18n')),

The application then performs a POST to http://example.com/i18n/setlang/, which flows through the following stack:

    /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py in get_response
                                        response = wrapped_callback(request, *callback_args, **callback_kwargs)
    /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/decorators.py in inner
                                        return func(*args, **kwargs)
    /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/decorators.py in inner
                                        return func(*args, **kwargs)
    /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/views/i18n.py in set_language
                            if not is_safe_url(url=next, host=request.get_host()):
    /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/http.py in is_safe_url
                        return _is_safe_url(url, host) and _is_safe_url(url.replace('\\', '/'), host)
    /edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/http.py in _is_safe_url
                        if unicodedata.category(url[0])[0] == 'C':

So the URL handling in its current form never flows through our code - only Django code.

comment:8 by John Eskew, 9 years ago

For anyone blocked by this bug, below is a monkey patch that should function as a work-around.
NO IMPLIED GUARANTEES OR WARRANTIES!

from django.utils import http
from django.utils.encoding import force_text


def patch():
    """
    Monkey patch the django.utils.http.is_safe_url function to convert the incoming
    url and host parameters to unicode.
    """
    def create_is_safe_url_wrapper(wrapped_func):
        def _wrap_is_safe_url(*args, **kwargs):
            def _conv_text(value):
                return None if value is None else force_text(value)
            return wrapped_func(
                # Converted *args.
                *tuple(map(_conv_text, args)),
                # Converted **kwargs.
                **{key: _conv_text(value) for key, value in kwargs.items()}
            )
        return _wrap_is_safe_url
    http.is_safe_url = create_is_safe_url_wrapper(http.is_safe_url)

in reply to:  6 comment:9 by Mariusz Felisiak, 9 years ago

Replying to timgraham:

This was also reported in IRC and on the security mailing list. In both cases the problematic code passes request.META.get('HTTP_REFERER') (str (bytes) on Python 2 and str (unicode on Python 3) to is_safe_url()).

Doing this crashes Python 2 even before the security patch if the URL contains any non-ASCII characters. I guess there isn't much disadvantage to a force_text() call. Some projects are adapting by inserting it themselves.

As far As i'm concerned we should add following patch do i18n:

  • django/views/i18n.py

    a b from django.utils._os import upath  
    1313from django.utils.encoding import smart_text
    1414from django.utils.formats import get_format, get_format_modules
    1515from django.utils.http import is_safe_url
     16from django.utils.text import force_text
    1617from django.utils.translation import (
    1718    LANGUAGE_SESSION_KEY, check_for_language, get_language, to_locale,
    1819)
    def set_language(request):  
    3435    """
    3536    next = request.POST.get('next', request.GET.get('next'))
    3637    if not is_safe_url(url=next, host=request.get_host()):
    37         next = request.META.get('HTTP_REFERER')
     38        next = force_text(request.META.get('HTTP_REFERER'), strings_only=True, errors='replace')
    3839        if not is_safe_url(url=next, host=request.get_host()):
    3940            next = '/'
    4041    response = http.HttpResponseRedirect(next)
Last edited 9 years ago by Mariusz Felisiak (previous) (diff)

comment:10 by Tim Graham, 9 years ago

Why would you patch there instead of is_safe_url() itself to prevent Python 2 users from having to add similar boilerplate in similar code? I was thinking something like (with errors='replace' if that's needed):

  • django/utils/http.py

    diff --git a/django/utils/http.py b/django/utils/http.py
    index 6e782bd..73e45fd 100644
    a b def is_safe_url(url, host=None):  
    290290        url = url.strip()
    291291    if not url:
    292292        return False
     293    if six.PY2:
     294        url = force_text(url)
    293295    # Chrome treats \ completely as / in paths but it could be part of some
    294296    # basic auth credentials so we need to check both URLs.
    295297    return _is_safe_url(url, host) and _is_safe_url(url.replace('\\', '/'), host)

comment:11 by Claude Paroz, 9 years ago

comment:12 by Claude Paroz, 9 years ago

Has patch: set

comment:13 by Tim Graham, 9 years ago

Triage Stage: AcceptedReady for checkin

comment:14 by Tim Graham, 9 years ago

Summary: Django 1.8.10: Calling is_safe_url() with a non-unicode url throws an exception.Django 1.9.3/1.8.10: is_safe_url() crashes with a byestring URL on Python 2

comment:15 by Mariusz Felisiak, 9 years ago

Cc: felisiak.mariusz@… added

comment:16 by Claude Paroz <claude@…>, 9 years ago

Resolution: fixed
Status: newclosed

In ada7a4ae:

Fixed #26308 -- Prevented crash with binary URLs in is_safe_url()

This fixes a regression introduced by c5544d28923.
Thanks John Eskew for the reporti and Tim Graham for the review.

comment:17 by Claude Paroz <claude@…>, 9 years ago

In 78f4830:

[1.9.x] Fixed #26308 -- Prevented crash with binary URLs in is_safe_url()

This fixes a regression introduced by c5544d28923.
Thanks John Eskew for the reporti and Tim Graham for the review.
Backport of ada7a4aef from master.

comment:18 by Claude Paroz <claude@…>, 9 years ago

In 28bed24:

[1.8.x] Fixed #26308 -- Prevented crash with binary URLs in is_safe_url()

This fixes a regression introduced by c5544d28923.
Thanks John Eskew for the reporti and Tim Graham for the review.
Backport of ada7a4aef from master.

comment:19 by Tim Graham, 9 years ago

In 552f03869ea7f3072b3fa19ffb6cb2d957fd8447:

Added safety to URL decoding in is_safe_url() on Python 2

The errors='replace' parameter to force_text altered the URL before checking
it, which wasn't considered sane. Refs [24fc935218] and [ada7a4aef].

comment:20 by Tim Graham, 9 years ago

In 9c195d45a64b0d2baee218e617ca3a762efc0bf5:

[1.9.x] Added safety to URL decoding in is_safe_url() on Python 2

The errors='replace' parameter to force_text altered the URL before checking
it, which wasn't considered sane. Refs 24fc935218 and ada7a4aef.
Backport of 552f03869e from master.

comment:21 by Tim Graham, 9 years ago

In beb392b85e71fdd41209d323126181d74090fecb:

[1.8.x] Added safety to URL decoding in is_safe_url() on Python 2

The errors='replace' parameter to force_text altered the URL before checking
it, which wasn't considered sane. Refs 24fc935218 and ada7a4aef.
Backport of 552f03869e from master.

Note: See TracTickets for help on using tickets.
Back to Top