Opened 16 years ago
Closed 12 years ago
#10809 closed New feature (fixed)
mod_wsgi authentication handler
Reported by: | Peter Baumgartner | Owned by: | Preston Holmes |
---|---|---|---|
Component: | contrib.auth | Version: | dev |
Severity: | Normal | Keywords: | mod_wsgi |
Cc: | Graham.Dumpleton@…, eschler@…, Unai Zalakain, alex@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
A mod_wsgi authentication handler that handles checking against users, staff or superusers.
Documentation on mod_wsgi access control mechanisms is here.
Attachments (4)
Change History (33)
by , 16 years ago
Attachment: | modwsgi_auth_handler.diff added |
---|
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
Graham's point about authentication versus authorisation is correct here. The permissions somebody has is authorisation and the "auth" backends are abbreviation for authentication (yes, ambiguously named, but consistently implemented. And, yes, the HTTP header naming has to be looked at in just the right way not to appear to be incorrect, so that doesn't help).
A few things to consider for subsequent versions:
- Part of the patch would have to change the docs/ directory. Right now, there's no documentation.
- If you're adding some kind of authorisation helpers, why are they only appropriate for this pathway and not every other authentication backend? That's the question I'd be asking myself and it really comes down to whether you're proposing two features here (which should be split into separate tickets): an authentication handler for modwsgi and a way to also fold in permission checking for authentication handlers, which should be more holistic and integrate with all authentication backends.
comment:3 by , 16 years ago
#1 - sure I can add that
#2 - I did my best to mirror the functionality provided in the mod_python handler which does both authentication and authorization. If you think it is better to have a separate file in here that handles authorization for all backends, let me know and I can adjust the existing mod_python handler accordingly.
comment:5 by , 15 years ago
I am willing to work on this ticket, but I'm not sure it's going to work without some modification to mod_wsgi itself. Before the user can be authenticated, the DJANGO_SETTINGS_MODULE
environment variable must be set. This cannot be set from inside the handler because mod_wsgi does not pass environment information set by SetEnv
to the WSGIAuthUserScript
. Please correct me if I am wrong.
comment:6 by , 15 years ago
David. The mod_wsgi package does not need to be changed. You just need to have the value of DJANGO_SETTINGS_MODULE be specified in the authentication script file. Same with any Python module search path that needs to be set.
import sys sys.path.append(...) import os os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings' def check_password(environ, user, password): ...
The SetEnv variables are not passed to the authentication scripts because they are executed during an earlier phase than when SetEnv variables are pushed into request environment by Apache.
comment:7 by , 15 years ago
Graham. I understand that the DJANGO_SETTINGS_MODULE
can be set from inside an external mod_wsgi auth script as it is in the mod_wsgi docs. However, this ticket is about putting a generic mod_wsgi auth handler into the Django code line. A generic handler would need to somehow set DJANGO_SETTINGS_MODULE
and possibly the Python path. The mod_python handler, for example, must set DJANGO_SETTINGS_MODULE
into os.environ
before
from django.contrib.auth.models import User
A generic mod_wsgi handler would probably need to do something similar, but I'm not sure that's possible. Ideally, I think that the functionality and setup of a mod_wsgi auth handler should be very close to that of the mod_python auth handler. Ideally, using a mod_wsgi auth handler would be as easy as:
<Location "/"> AuthType Basic AuthName "wsgi protected" Require valid-user # maybe set DJANGO_SETTINGS_MODULE here somehow AuthBasicProvider wsgi WSGIAuthUserScript django.contrib.auth.handlers.modwsgi </Location>
Maintaining an extra custom auth script works fine (I'm using it), but it seems like everybody who uses Django/mod_wsgi authentication is going to have the same auth script with only the DJANGO_SETTINGS_MODULE
changed. Some abstraction might be nice.
By the way, mod_wsgi is great! I'm a big fan and I'm not trying to put it down.
comment:8 by , 15 years ago
I wouldn't be copying how mod_python/Django works as it is technically broken, or at least is a very bad way of doing things because of the problems it can cause.
The interaction with mod_python for both authentication handler and content handler, ie., main Django application, relies on stuffing all mod_python request variables into os.environ on every request. Doing this is bad for a number of reasons.
The first is that it causes pollution of the set of process environment variables. This is because every time you set something in os.environ, it also under the covers calls C level putenv(). Because mod_python and mod_wsgi can have multiple sub interpreters with separate applications running in them, any change made will affect the process environment variables used by other applications running in same process.
The second is that if the first application to get a request has additional variables set beyond what is set for other applications in other sub interpreters, when that second application gets a request, os.environ will be created as a copy of process environment variables at that time, so variables set in configuration for the first effectively leak into the os.environ dictionary of the second application. If the point of the configuration was not to affect something needing process environment variables, but just in Python context via os.environ, you have then potentially affected the way that the second application works.
The third is that it is in the main pointless setting os.environ on every request because for variables like DJANGO_SETTINGS_MODULE only the first value seen is ever used as it is only read when django first initialises itself.
For some additional detail about process environment variable leakage between Python sub interpreters read:
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Application_Environment_Variables
Overall, the use of a globally scoped variable set in os.environ for driving Django configuration, and all the subsequent caching of configuration in global data, is suboptimal design. It is therefore arguable that this really comes down to being a Django problem and not a mod_wsgi limitation.
I would also suggest that having the authentication handler be dependent on loading and initialising the full Django stack is also sub optimal. This is because authentication hooks in mod_wsgi can only run in the embedded mode processes. If the main Django site instance is running daemon mode you therefore potentially have a full second set of copies of Django loaded just to handle authentication. What should be aimed at is a way for the authentication handler only loading the absolute minimum of what is required from Django instance to do what it needs, otherwise you just risk bloating out your Apache if main application actually running in daemon mode. Maybe it does do, that, but since it requires settings file, presuming it is doing all initialisation.
That all said, the other reason that SetEnv variables aren't propagated into authentication handler is that the authentication handler can only be setup from main Apache configuration file. Thus only by the administrator. This is done specifically so that the administrator can fully control how authentication is handled, given than any issues with it are going to be a security issue. If you allow SetEnv variables to control what configuration the authentication handler sees, then if a normal user if given FileInfo override rights for .htaccess file, then they could override the configuration sent into the authentication handler and so override the configuration the administrator set up. This would allow a normal user to circumvent the authentication mechanism and so open up security holes.
A solution to this is to have a separate set of directives for setting WSGI environment variables. Do this though and you end up with silly mess with mod_python where you have choice of SetEnv and PythonOption. Because in mod_python PythonOption can also be set in .htaccess, it also is subject to the same security hole where a normal user could override a variable being passed into an authentication handler and circumvent the security.
One solution is for the separate directives be only for setting variables passed into the authentication handler and for them only to be usable in main Apache configuration files and not in .htaccess file. Do that though and you still end up with silly situation where you would need to set same variable twice, once for authentication and once for main application.
In summary I strongly believe it is the wrong approach and the better thing to do is just require os.environ be set once in the script file, albeit any use of os.environ should also be discouraged. Although mod_wsgi allows SetEnv to be used for adding WSGI environment variables it is actually discouraged in general and configuration should always be done from within the WSGI script file. The intent of it being in the WSGI script file was also so that could get other WSGI hosting mechanisms to also support the concept of a WSGI script file and so it would include everything needed and so avoid dependencies on information contained in the server specific configuration files.
What you should be doing therefore is ignoring what mod_python does. You should just assume in your precanned authentication handler that DJANGO_SETTINGS_MODULE has already been set in os.environ and leave it up to the user to ensure that is done via the appropriate means.
BTW, you can not say:
WSGIAuthUserScript django.contrib.auth.handlers.modwsgi
in mod_wsgi. That is, for mod_wsgi it MUST be a path to a script file. It doesn't allow a Python module to be specified as does mod_python, so there is also perhaps a misunderstanding on your part in thing that there isn't going to be an intermediate script file in which a user can set os.environ in the first place.
And no mod_wsgi will never allow Python modules to be specified like that. This was a conscious decision based on experience from seeing the mess that people caused with mod_python and the poor way that mod_python handles Python module search paths.
comment:9 by , 15 years ago
Cc: | added |
---|
comment:10 by , 15 years ago
Cc: | added; removed |
---|
comment:11 by , 15 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
It sounds like you've thought this through. Still, I think that a mod_wsgi auth handler is a good idea. Considering your suggestions, I think an auth script should look something like this:
import sys sys.path.append(...) import os os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings' import django.contrib.auth.handlers.modwsgi def check_password(environ, user, password): return django.contrib.auth.handlers.modwsgi.check_password(environ, user, password)
In this way, there will always be an external auth script but at least users won't be repeating themselves as much.
I like this solution a lot better than making sure that the system path is set properly and that the DJANGO_SETTINGS_MODULE
environment variable is set in the OS level before starting Apache. That solution would also have problems with multiple Django projects authenticating against multiple auth databases running on the same Apache instance.
I'll start working on a patch.
comment:12 by , 15 years ago
Presuming there is a corresponding Django application running in same process/sub interpreter, a better way is have all the following in single WSGI script file:
import sys sys.path.append(...) import os os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings' import django.contrib.auth.handlers.modwsgi def check_password(environ, user, password): return django.contrib.auth.handlers.modwsgi.check_password(environ, user, password) import django.core.handlers.wsgi application = django.core.handlers.wsgi.WSGIHandler()
That is, no need to actually have multiple script files. Apache configuration would then be:
AuthType Basic AuthName "Top Secret" AuthBasicProvider wsgi Require valid-user WSGIProcessGroup %{GLOBAL} WSGIApplicationGroup django WSGIAuthUserScript /usr/local/django/mysite/apache/django.wsgi WSGIScriptAlias / /usr/local/django/mysite/apache/django.wsgi
The WSGIProcessGroup/WSGIApplicationGroup are to ensure both are executed in same sub interpreter as by default authentication handler will execute in %{GLOBAL} application group where as Django will be in group named based on virtual host ServerName and application mount point.
by , 15 years ago
Attachment: | modwsgi_auth_handler.2.diff added |
---|
Contains the mod_wsgi auth handler and supporting documentation
comment:13 by , 15 years ago
Needs documentation: | unset |
---|---|
Patch needs improvement: | unset |
I added a patch with supporting documentation and I removed the authorization functions from the old patch.
comment:14 by , 15 years ago
Just set this up for myself and it's rather useful. Put me down for whatever is needed to get it into 1.3.
comment:15 by , 15 years ago
Relevant to this is a recent discussion on mod_wsgi mailing list. See:
http://groups.google.com/group/modwsgi/browse_frm/thread/42e2d356d4eac415
This actually talks about authorization as distinct from authentication.
The original mod_python handler actually mixed the two when it shouldn't have.
In mod_wsgi one can properly separate the two and better integrate with Apache authentication mechanism and location based requirements on group membership.
comment:16 by , 15 years ago
Implementing the groups_for_user()
function in the same handler file is probably a good idea. However, instead of checking the permissions as on the mailing list, I think it's a good idea to return a list of Django groups for that particular user. Perhaps something along the line of:
from django.contrib.auth.models import User, AnonymousUser def groups_for_user(environ, username): try: user = User.objects.get(username=username) except User.DoesNotExist: user = AnonymousUser() return [g.name for g in user.groups.all()]
by , 15 years ago
Attachment: | modwsgi_auth_handler.3.diff added |
---|
Contains the necessary methods to authenticate and authorize
comment:17 by , 15 years ago
I added the check_for_user
method and the extra documentation. I also did a little cleanup to simplify importing the methods. During testing, I noticed that mod_wsgi wants a regular string as opposed to the unicode string that Django returns for the group names. Encoding does the trick and I tested this against a group containing unicode in the name without any problems. I'm on board for anything needed to get this in 1.3.
follow-up: 19 comment:18 by , 15 years ago
Good stuff!
Another interesting capability that might be useful is being able to allow only staff members to access pages as well. Perhaps letting super users in in some fashion would be good too. I like the ability of setting access by group, I don't know if we want to bend the meaning of staff to mean access to other things besides the admin, but it seems like a logical way to grant access that is already set up in a lot of pre-existing environments.
comment:19 by , 15 years ago
Replying to ericholscher:
The admin already only allows staff members to access it. If a logged-in user without staff tries to access the admin it kicks them back to the login page. This happens even if the user is authenticated and already logged in by the RemoteUserMiddleware.
comment:20 by , 14 years ago
Cc: | added |
---|
comment:21 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → New feature |
comment:22 by , 13 years ago
Easy pickings: | unset |
---|---|
UI/UX: | unset |
I'm not sure if this ticket is on the 1.4 roadmap, but I just tested to make sure the patch still applies. Since mod_python is being deprecated, should the handler for mod_python be removed? Should the docs on mod_python auth be removed? That's probably a separate ticket.
comment:23 by , 13 years ago
Cc: | added |
---|
comment:24 by , 13 years ago
Needs tests: | set |
---|---|
Triage Stage: | Design decision needed → Accepted |
I'm not sure why this ticket is classified as DDN. The feature looks useful and there's been a significant amount of work on the patch.
It needs a solid review and tests.
follow-up: 27 comment:25 by , 13 years ago
Needs tests: | unset |
---|
I updated the patch with tests and updated the docs because a new "WSGI" section was added since the last patch was generated. The patch applies cleanly to trunk. I'm still on board for whatever it takes to get this into 1.4 or just trunk if it is too late for that.
comment:26 by , 13 years ago
Cc: | added |
---|
comment:27 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
Triage Stage: | Accepted → Ready for checkin |
I've just spent some time reviewing and updating this patch.
With mod_python being removed in 1.5 - it seems good to get this into a release
I made some minor revisions to the docs, updated to be py3 compatible, and caught a small error in the test.
I believe this to be RFC, but welcome final reviews.
comment:29 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
It could be argued that checking whether one is staff or super user is not authentication but is authorisation. Thus perhaps shouldn't be getting handled in check_password. Apache has a separate authorisation phase for that sort of thing. :-)
I do accept though that to make it more efficient one would need to cache information for use by later phase to avoid database lookup again if doing that. How to cache that information is a bit of a problem as changes to WSGI environment don't get carried through to later phase at this point. Also, thread local data only works properly in main Python interpreter until changes in mod_wsgi 3.0 come along. So, hard to avoid a second database lookup at the moment if check was done through authorisation phase.