Opened 16 years ago

Closed 12 years ago

#10809 closed New feature (fixed)

mod_wsgi authentication handler

Reported by: Peter Baumgartner Owned by: Preston Holmes
Component: contrib.auth Version: dev
Severity: Normal Keywords: mod_wsgi
Cc: Graham.Dumpleton@…, eschler@…, Unai Zalakain, alex@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

A mod_wsgi authentication handler that handles checking against users, staff or superusers.

Documentation on mod_wsgi access control mechanisms is here.

Attachments (4)

modwsgi_auth_handler.diff (1.4 KB ) - added by Peter Baumgartner 16 years ago.
modwsgi_auth_handler.2.diff (4.5 KB ) - added by David Fischer 15 years ago.
Contains the mod_wsgi auth handler and supporting documentation
modwsgi_auth_handler.3.diff (5.9 KB ) - added by David Fischer 15 years ago.
Contains the necessary methods to authenticate and authorize
modwsgi_auth_handler.4.diff (9.1 KB ) - added by David Fischer 13 years ago.
Updated docs and added tests

Download all attachments as: .zip

Change History (33)

by Peter Baumgartner, 16 years ago

Attachment: modwsgi_auth_handler.diff added

comment:1 by Graham Dumpleton, 16 years ago

It could be argued that checking whether one is staff or super user is not authentication but is authorisation. Thus perhaps shouldn't be getting handled in check_password. Apache has a separate authorisation phase for that sort of thing. :-)

I do accept though that to make it more efficient one would need to cache information for use by later phase to avoid database lookup again if doing that. How to cache that information is a bit of a problem as changes to WSGI environment don't get carried through to later phase at this point. Also, thread local data only works properly in main Python interpreter until changes in mod_wsgi 3.0 come along. So, hard to avoid a second database lookup at the moment if check was done through authorisation phase.

comment:2 by Malcolm Tredinnick, 16 years ago

Needs documentation: set
Patch needs improvement: set

Graham's point about authentication versus authorisation is correct here. The permissions somebody has is authorisation and the "auth" backends are abbreviation for authentication (yes, ambiguously named, but consistently implemented. And, yes, the HTTP header naming has to be looked at in just the right way not to appear to be incorrect, so that doesn't help).

A few things to consider for subsequent versions:

  1. Part of the patch would have to change the docs/ directory. Right now, there's no documentation.
  2. If you're adding some kind of authorisation helpers, why are they only appropriate for this pathway and not every other authentication backend? That's the question I'd be asking myself and it really comes down to whether you're proposing two features here (which should be split into separate tickets): an authentication handler for modwsgi and a way to also fold in permission checking for authentication handlers, which should be more holistic and integrate with all authentication backends.

comment:3 by Peter Baumgartner, 16 years ago

#1 - sure I can add that

#2 - I did my best to mirror the functionality provided in the mod_python handler which does both authentication and authorization. If you think it is better to have a separate file in here that handles authorization for all backends, let me know and I can adjust the existing mod_python handler accordingly.

comment:4 by Chris Beaven, 16 years ago

Triage Stage: UnreviewedDesign decision needed

\

comment:5 by David Fischer, 15 years ago

I am willing to work on this ticket, but I'm not sure it's going to work without some modification to mod_wsgi itself. Before the user can be authenticated, the DJANGO_SETTINGS_MODULE environment variable must be set. This cannot be set from inside the handler because mod_wsgi does not pass environment information set by SetEnv to the WSGIAuthUserScript. Please correct me if I am wrong.

comment:6 by Graham Dumpleton, 15 years ago

David. The mod_wsgi package does not need to be changed. You just need to have the value of DJANGO_SETTINGS_MODULE be specified in the authentication script file. Same with any Python module search path that needs to be set.

import sys
sys.path.append(...)

import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

def check_password(environ, user, password):
  ...

The SetEnv variables are not passed to the authentication scripts because they are executed during an earlier phase than when SetEnv variables are pushed into request environment by Apache.

comment:7 by David Fischer, 15 years ago

Graham. I understand that the DJANGO_SETTINGS_MODULE can be set from inside an external mod_wsgi auth script as it is in the mod_wsgi docs. However, this ticket is about putting a generic mod_wsgi auth handler into the Django code line. A generic handler would need to somehow set DJANGO_SETTINGS_MODULE and possibly the Python path. The mod_python handler, for example, must set DJANGO_SETTINGS_MODULE into os.environ before

from django.contrib.auth.models import User

A generic mod_wsgi handler would probably need to do something similar, but I'm not sure that's possible. Ideally, I think that the functionality and setup of a mod_wsgi auth handler should be very close to that of the mod_python auth handler. Ideally, using a mod_wsgi auth handler would be as easy as:

<Location "/">
    AuthType Basic
    AuthName "wsgi protected"
    Require valid-user
    # maybe set DJANGO_SETTINGS_MODULE here somehow
    AuthBasicProvider wsgi
    WSGIAuthUserScript  django.contrib.auth.handlers.modwsgi
</Location>

Maintaining an extra custom auth script works fine (I'm using it), but it seems like everybody who uses Django/mod_wsgi authentication is going to have the same auth script with only the DJANGO_SETTINGS_MODULE changed. Some abstraction might be nice.

By the way, mod_wsgi is great! I'm a big fan and I'm not trying to put it down.

comment:8 by Graham Dumpleton, 15 years ago

I wouldn't be copying how mod_python/Django works as it is technically broken, or at least is a very bad way of doing things because of the problems it can cause.

The interaction with mod_python for both authentication handler and content handler, ie., main Django application, relies on stuffing all mod_python request variables into os.environ on every request. Doing this is bad for a number of reasons.

The first is that it causes pollution of the set of process environment variables. This is because every time you set something in os.environ, it also under the covers calls C level putenv(). Because mod_python and mod_wsgi can have multiple sub interpreters with separate applications running in them, any change made will affect the process environment variables used by other applications running in same process.

The second is that if the first application to get a request has additional variables set beyond what is set for other applications in other sub interpreters, when that second application gets a request, os.environ will be created as a copy of process environment variables at that time, so variables set in configuration for the first effectively leak into the os.environ dictionary of the second application. If the point of the configuration was not to affect something needing process environment variables, but just in Python context via os.environ, you have then potentially affected the way that the second application works.

The third is that it is in the main pointless setting os.environ on every request because for variables like DJANGO_SETTINGS_MODULE only the first value seen is ever used as it is only read when django first initialises itself.

For some additional detail about process environment variable leakage between Python sub interpreters read:

http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Application_Environment_Variables

Overall, the use of a globally scoped variable set in os.environ for driving Django configuration, and all the subsequent caching of configuration in global data, is suboptimal design. It is therefore arguable that this really comes down to being a Django problem and not a mod_wsgi limitation.

I would also suggest that having the authentication handler be dependent on loading and initialising the full Django stack is also sub optimal. This is because authentication hooks in mod_wsgi can only run in the embedded mode processes. If the main Django site instance is running daemon mode you therefore potentially have a full second set of copies of Django loaded just to handle authentication. What should be aimed at is a way for the authentication handler only loading the absolute minimum of what is required from Django instance to do what it needs, otherwise you just risk bloating out your Apache if main application actually running in daemon mode. Maybe it does do, that, but since it requires settings file, presuming it is doing all initialisation.

That all said, the other reason that SetEnv variables aren't propagated into authentication handler is that the authentication handler can only be setup from main Apache configuration file. Thus only by the administrator. This is done specifically so that the administrator can fully control how authentication is handled, given than any issues with it are going to be a security issue. If you allow SetEnv variables to control what configuration the authentication handler sees, then if a normal user if given FileInfo override rights for .htaccess file, then they could override the configuration sent into the authentication handler and so override the configuration the administrator set up. This would allow a normal user to circumvent the authentication mechanism and so open up security holes.

A solution to this is to have a separate set of directives for setting WSGI environment variables. Do this though and you end up with silly mess with mod_python where you have choice of SetEnv and PythonOption. Because in mod_python PythonOption can also be set in .htaccess, it also is subject to the same security hole where a normal user could override a variable being passed into an authentication handler and circumvent the security.

One solution is for the separate directives be only for setting variables passed into the authentication handler and for them only to be usable in main Apache configuration files and not in .htaccess file. Do that though and you still end up with silly situation where you would need to set same variable twice, once for authentication and once for main application.

In summary I strongly believe it is the wrong approach and the better thing to do is just require os.environ be set once in the script file, albeit any use of os.environ should also be discouraged. Although mod_wsgi allows SetEnv to be used for adding WSGI environment variables it is actually discouraged in general and configuration should always be done from within the WSGI script file. The intent of it being in the WSGI script file was also so that could get other WSGI hosting mechanisms to also support the concept of a WSGI script file and so it would include everything needed and so avoid dependencies on information contained in the server specific configuration files.

What you should be doing therefore is ignoring what mod_python does. You should just assume in your precanned authentication handler that DJANGO_SETTINGS_MODULE has already been set in os.environ and leave it up to the user to ensure that is done via the appropriate means.

BTW, you can not say:

WSGIAuthUserScript django.contrib.auth.handlers.modwsgi

in mod_wsgi. That is, for mod_wsgi it MUST be a path to a script file. It doesn't allow a Python module to be specified as does mod_python, so there is also perhaps a misunderstanding on your part in thing that there isn't going to be an intermediate script file in which a user can set os.environ in the first place.

And no mod_wsgi will never allow Python modules to be specified like that. This was a conscious decision based on experience from seeing the mess that people caused with mod_python and the poor way that mod_python handles Python module search paths.

comment:9 by Graham Dumpleton, 15 years ago

Cc: Graham Dumpleton added

comment:10 by Graham Dumpleton, 15 years ago

Cc: Graham.Dumpleton@… added; Graham Dumpleton removed

comment:11 by David Fischer, 15 years ago

Owner: changed from nobody to David Fischer
Status: newassigned

It sounds like you've thought this through. Still, I think that a mod_wsgi auth handler is a good idea. Considering your suggestions, I think an auth script should look something like this:

import sys
sys.path.append(...)

import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

import django.contrib.auth.handlers.modwsgi

def check_password(environ, user, password):
    return django.contrib.auth.handlers.modwsgi.check_password(environ, user, password)

In this way, there will always be an external auth script but at least users won't be repeating themselves as much.

I like this solution a lot better than making sure that the system path is set properly and that the DJANGO_SETTINGS_MODULE environment variable is set in the OS level before starting Apache. That solution would also have problems with multiple Django projects authenticating against multiple auth databases running on the same Apache instance.

I'll start working on a patch.

comment:12 by Graham Dumpleton, 15 years ago

Presuming there is a corresponding Django application running in same process/sub interpreter, a better way is have all the following in single WSGI script file:

import sys
sys.path.append(...)

import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

import django.contrib.auth.handlers.modwsgi

def check_password(environ, user, password):
    return django.contrib.auth.handlers.modwsgi.check_password(environ, user, password)

import django.core.handlers.wsgi

application = django.core.handlers.wsgi.WSGIHandler()

That is, no need to actually have multiple script files. Apache configuration would then be:

AuthType Basic
AuthName "Top Secret"
AuthBasicProvider wsgi
Require valid-user

WSGIProcessGroup %{GLOBAL}
WSGIApplicationGroup django

WSGIAuthUserScript /usr/local/django/mysite/apache/django.wsgi
WSGIScriptAlias / /usr/local/django/mysite/apache/django.wsgi

The WSGIProcessGroup/WSGIApplicationGroup are to ensure both are executed in same sub interpreter as by default authentication handler will execute in %{GLOBAL} application group where as Django will be in group named based on virtual host ServerName and application mount point.

by David Fischer, 15 years ago

Attachment: modwsgi_auth_handler.2.diff added

Contains the mod_wsgi auth handler and supporting documentation

comment:13 by David Fischer, 15 years ago

Needs documentation: unset
Patch needs improvement: unset

I added a patch with supporting documentation and I removed the authorization functions from the old patch.

comment:14 by Eric Holscher, 15 years ago

Just set this up for myself and it's rather useful. Put me down for whatever is needed to get it into 1.3.

comment:15 by Graham Dumpleton, 15 years ago

Relevant to this is a recent discussion on mod_wsgi mailing list. See:

http://groups.google.com/group/modwsgi/browse_frm/thread/42e2d356d4eac415

This actually talks about authorization as distinct from authentication.

The original mod_python handler actually mixed the two when it shouldn't have.

In mod_wsgi one can properly separate the two and better integrate with Apache authentication mechanism and location based requirements on group membership.

comment:16 by David Fischer, 15 years ago

Implementing the groups_for_user() function in the same handler file is probably a good idea. However, instead of checking the permissions as on the mailing list, I think it's a good idea to return a list of Django groups for that particular user. Perhaps something along the line of:

from django.contrib.auth.models import User, AnonymousUser

def groups_for_user(environ, username): 
    try:  
        user = User.objects.get(username=username)  
    except User.DoesNotExist:  
        user = AnonymousUser()

    return [g.name for g in user.groups.all()]

by David Fischer, 15 years ago

Attachment: modwsgi_auth_handler.3.diff added

Contains the necessary methods to authenticate and authorize

comment:17 by David Fischer, 15 years ago

I added the check_for_user method and the extra documentation. I also did a little cleanup to simplify importing the methods. During testing, I noticed that mod_wsgi wants a regular string as opposed to the unicode string that Django returns for the group names. Encoding does the trick and I tested this against a group containing unicode in the name without any problems. I'm on board for anything needed to get this in 1.3.

comment:18 by Eric Holscher, 15 years ago

Good stuff!

Another interesting capability that might be useful is being able to allow only staff members to access pages as well. Perhaps letting super users in in some fashion would be good too. I like the ability of setting access by group, I don't know if we want to bend the meaning of staff to mean access to other things besides the admin, but it seems like a logical way to grant access that is already set up in a lot of pre-existing environments.

in reply to:  18 comment:19 by David Fischer, 15 years ago

Replying to ericholscher:

The admin already only allows staff members to access it. If a logged-in user without staff tries to access the admin it kicks them back to the login page. This happens even if the user is authenticated and already logged in by the RemoteUserMiddleware.

comment:20 by anonymous, 14 years ago

Cc: eschler@… added

comment:21 by Chris Beaven, 14 years ago

Severity: Normal
Type: New feature

comment:22 by David Fischer, 13 years ago

Easy pickings: unset
UI/UX: unset

I'm not sure if this ticket is on the 1.4 roadmap, but I just tested to make sure the patch still applies. Since mod_python is being deprecated, should the handler for mod_python be removed? Should the docs on mod_python auth be removed? That's probably a separate ticket.

comment:23 by Unai Zalakain, 13 years ago

Cc: Unai Zalakain added

comment:24 by Aymeric Augustin, 13 years ago

Needs tests: set
Triage Stage: Design decision neededAccepted

I'm not sure why this ticket is classified as DDN. The feature looks useful and there's been a significant amount of work on the patch.

It needs a solid review and tests.

by David Fischer, 13 years ago

Attachment: modwsgi_auth_handler.4.diff added

Updated docs and added tests

comment:25 by David Fischer, 13 years ago

Needs tests: unset

I updated the patch with tests and updated the docs because a new "WSGI" section was added since the last patch was generated. The patch applies cleanly to trunk. I'm still on board for whatever it takes to get this into 1.4 or just trunk if it is too late for that.

comment:26 by alex@…, 13 years ago

Cc: alex@… added

in reply to:  25 comment:27 by Preston Holmes, 12 years ago

Owner: changed from David Fischer to Preston Holmes
Status: assignednew
Triage Stage: AcceptedReady for checkin

I've just spent some time reviewing and updating this patch.

With mod_python being removed in 1.5 - it seems good to get this into a release

I made some minor revisions to the docs, updated to be py3 compatible, and caught a small error in the test.

I believe this to be RFC, but welcome final reviews.

https://github.com/django/django/pull/389

comment:28 by Claude Paroz, 12 years ago

I also think this is good to go, thanks!

comment:29 by Preston Holmes <preston@…>, 12 years ago

Resolution: fixed
Status: newclosed

In 373932fa6b9137a7e760d81dc66d49fc10ff2942:

fixed #10809 -- add a mod_wsgi authentication handler

Thanks to baumer1122 for the suggestion and initial
patch and David Fischer for the contributions and
long term patch maintenance and docs.

Note: See TracTickets for help on using tickets.
Back to Top