Opened 13 years ago

Closed 9 years ago

Last modified 9 years ago

#17133 closed Cleanup/optimization (fixed)

get_script_name goofs when there is Apache URL rewriting

Reported by: gjanee@… Owned by: nobody
Component: HTTP handling Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Claude Paroz)

When running under Apache+mod_wsgi (I'm not sure the mod_wsgi is relevant here), using a WSGIScriptAlias of /foo, a request for URL /foo/bar//baz (note double slash) gets transformed by Apache to /foo/bar/baz and then by Django to /foo//bar/baz (note where double slashes are now). Furthermore, a callback function for urlpattern "^bar/.*" will be called on this latter path even though it doesn't match the pattern. That is, request.path will be /foo//bar/baz, which can be confusing for a callback function that is expecting request.path to match the WSGIScriptAlias plus the urlpattern.

I don't totally understand what Django is doing here, but I believe the problem is in django.core.handlers.base.get_script_name. In that function, to get the script name in this situation Django starts with the environment variable SCRIPT_URL, and strips off a number of characters at the end equal to the length of environment variable PATH_INFO. Because in this case Apache collapses the double slash in /foo/bar//baz to /foo/bar/baz, get_script_name strips off one less character than necessary, so that instead of the script name being /foo it is /foo/. This may account for the double slash appearing /foo, as in /foo//bar/baz.

Attachments (1)

django_script_path.patch (710 bytes ) - added by mwilck 12 years ago.
Suggested patch (for Django 1.3.1, but should apply on master as well)

Download all attachments as: .zip

Change History (13)

comment:2 by Aymeric Augustin, 13 years ago

Resolution: needsinfo
Status: newclosed

While it's clear that something's wrong in your setup, I'm unable to confirm that there's a bug in Django, and determine how we could fix it, sorry.

You might be exceeding the limits of how much mod_rewrite magic Django can compensate for; if so, you can work around the problem with the FORCE_SCRIPT_NAME setting.

If we wanted to debug this further, we'd need the values of the relevant environment variables, especially SCRIPT_URL, REDIRECT_URL, PATH_INFO, SCRIPT_NAME, as well as the relevant bits of your Apache configuration.

comment:3 by mwilck, 12 years ago

Has patch: set
Resolution: needsinfo
Status: closedreopened

This is definitely a Django bug, given that Apache/mod_wsgi is as it is.
Here are the variables for my case:

SCRIPT_URL = '/mst/milestonesaccounts/loginhelp'
PATH_INFO = u'/milestones/accounts/login/help'

SCRIPT_NAME = u'/mst/m'

The correct SCRIPT_NAME would be just u'/mst'.

Version 0, edited 12 years ago by mwilck (next)

by mwilck, 12 years ago

Attachment: django_script_path.patch added

Suggested patch (for Django 1.3.1, but should apply on master as well)

comment:4 by mwilck, 12 years ago

See e.g. http://trac.edgewall.org/demo-1.1/changeset/10609/branches/0.12-stable/trac/web/main.py for an independent assessment of the same problem.

comment:6 by Claude Paroz, 12 years ago

Description: modified (diff)

comment:7 by Łukasz Rekucki, 12 years ago

Triage Stage: UnreviewedAccepted

Here's an explanation from Graham Dumpleton (a bit old, but seems up-to-date): https://groups.google.com/d/msg/django-users/31oV1WhuAZ4/lpbt_3CCcXYJ

Nitpicking: The import and regexp definition should be on top of the file and a comment with a link to this issue would decrease the WTH level ;)

comment:8 by Aymeric Augustin, 12 years ago

Type: UncategorizedCleanup/optimization

comment:9 by Aymeric Augustin, 12 years ago

Status: reopenednew

comment:10 by Tim Graham, 10 years ago

Needs tests: set

Is there any way to add a test?

comment:11 by Claude Paroz, 9 years ago

Needs tests: unset
Version: 1.3master

comment:12 by Tim Graham, 9 years ago

Triage Stage: AcceptedReady for checkin

comment:13 by Claude Paroz <claude@…>, 9 years ago

Resolution: fixed
Status: newclosed

In 10ace52a:

Fixed #17133 -- Properly handled successive slashes in incoming requests

Thanks gjanee@… for the report and Tim Graham for the review.

comment:14 by Claude Paroz <claude@…>, 9 years ago

In ea2f48c:

Refs #17133 -- Optimized script_url handling in get_script_name

10ace52a added some regex processing for each request with SCRIPT_URL set.
In a speed critical section, conditionally apply of the regex will save some
resources.

Note: See TracTickets for help on using tickets.
Back to Top