Opened 14 years ago
Closed 13 years ago
#15367 closed New feature (fixed)
Improved Auth Password Hashing
Reported by: | Paul Oswald | Owned by: | Paul McMillan |
---|---|---|---|
Component: | contrib.auth | Version: | |
Severity: | Release blocker | Keywords: | password, hash, hashing, bcrypt, scrypt, pbkdf2, sha2, sha1 |
Cc: | jtunney@…, Paul Oswald, Donald Stufft | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | yes |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
As discussed on the django-developers mailing list, we aim to improve the default password hashing algorithm Django uses. This ticket will track the changes needed to upgrade the default algorithm to one more resistant to brute force attacks. This may include improvements ease allowing the developer to change the encryption used to a third party library wether that is by configuration or by decoupling the authentication code from the User model object.
If you would like to help, please be familiar with the summary and decisions made in the following discussion at least starting from this message:
Also be aware of the past issues surrounding this issue, specifically ticket:13969 which currently has existing code:
ticket:3316 (Adding `crypt' to list of password hashes for legacy apps. - closed: fixed)
ticket:5600 (Patch to enhance cryptography on django.contrib.auth - closed: wontfix)
ticket:5787 (BCrypt password hashing support in Django - closed: duplicate)
ticket:6028 (add compatibility with glibc2 MD5-based crypt passwords - new )
ticket:9101 (Improved salt generation for django.contrib.auth - closed: wontfix)
ticket:9194 (Allow additional hashing algorithms for passwords - closed: duplicate)
ticket:13969 (auth module should use longer salt for hashing - new)
The plan at this point is to follow this path:
- Django ships with PBKDF2 by default. This depends on SHA2 which should be python 2.5 compatible (due to hashlib being added in python 2.5) and PBKDF2 is short and simple enough that it could be included into the project. This satisfies NIST/US Gov requirements.
- SHA1 is maintained for backwards compatibility
- Salt size increased
- Configurable settings for the number of hashing rounds to future-proof for faster hardware
We can also Investigate simplified ways of allowing developers to upgrade the hashing library based on their requirements, however it is more important to get the default improved and that is where we should focus.
Attachments (5)
Change History (32)
comment:1 by , 14 years ago
Triage Stage: | Unreviewed → Design decision needed |
---|
comment:2 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → New feature |
comment:3 by , 13 years ago
Easy pickings: | unset |
---|---|
UI/UX: | unset |
cryptacular attempts to address the problem of supporting multiple hash functions. The delegating password manager accepts an array of supported hash functions, each hashing scheme implements a 'match()' method (does the given hash look like that hash function's hash?), and the 'check' method accepts a setter to upgrade to the most-preferred hash when someone logs in with an older hash. cryptacular includes PBKDF2 and bcrypt. https://bitbucket.org/dholth/cryptacular/src/tip/cryptacular/core/__init__.py
comment:4 by , 13 years ago
Cc: | added |
---|---|
Owner: | changed from | to
I'm going to start coding this in my github fork: https://github.com/jart/django/tree/auth-hashing
Here's a synopsis of the plan so far that a few of us came up with at the sprints:
- PBKDF2 will be the new default algorithm. It'll try to compile/install a C module or fall-back to pure python. This will not introduce a dependency
- bcrypt will also be supported through the py-bcrypt library. It can't be the default because there's no pure-python implementation
- An plugable password hashing backend API will be provided
- A setting will be added to specify module path of chosen algorithm, like: PBKDF2, bcrypt and SHA1
- We're still looking into ways to mitigate the possibility of DDOS attacks due to high CPU usage of these algorithms
- Rewriting of password hashes upon user login if needed
- Magic for people to painlessly migrate from python-bcrypt
Any help from the community would be very much welcome :)
comment:5 by , 13 years ago
Triage Stage: | Design decision needed → Accepted |
---|
This plan corresponds with what we worked out here at the sprints with Russell KM, Isaac Kelly, and myself. I'll also be writing a post to the django-dev list about this.
by , 13 years ago
Attachment: | django-passhash-2011-09-10.diff added |
---|
comment:6 by , 13 years ago
Has patch: | set |
---|---|
Needs documentation: | set |
Needs tests: | set |
comment:7 by , 13 years ago
Patch needs improvement: | set |
---|---|
Version: | 1.2 |
Hm, in the patch, why is this being put in django.utils
instead of staying in django.contrib.auth
? I'm asking cause I've specifically created django.contrib.auth.utils
a couple of weeks ago in trunk. What about having this in django.contrib.auth.hashers
Also, mind elaborating how the PASSWORD_HASHERS
setting is supposed to be used? It seems a bit cumbersome to specify the setting if it can't be used like the DATABASES
or CACHES
setting. From what I understand is you only want to be able to specify the default hasher (atm the first item in the tuple, yuck) and some options.
Switching the password hasher or changing an option of a password hasher isn't a common use case and therefor shouldn't be exposed like that as a multi-value setting. Instead subclassing and setting a single setting (named AUTH_PASSWORD_HASHER
) ought to be enough, e.g.::
AUTH_PASSWORD_HASHER = 'mysite.security.MyCustomBCryptPasswordHasher'
Also, please don't use backslashes but brackets for line continuation.
by , 13 years ago
Attachment: | django-passhash-2011-09-12.patch added |
---|
by , 13 years ago
Attachment: | django-passhash-2011-09-12.diff added |
---|
follow-up: 9 comment:8 by , 13 years ago
Sorry I didn't notice the incorrect references to the old utils file; I forgot to delete the pyc file :\
I uploaded a new patch with numerous changes including that backslash thing. The change log is here: https://github.com/jart/django/commits/auth-hashing
You might be right that people should overload classes to change the settings for a hasher so I included that in the new patch. We must however keep the list because if people use a third party password hasher, they need a way to migrate back to the standard ones. The only way to avoid this is to not allow people to write their own custom hashers.
I honestly think people should be discouraged from using the setting at all. We put a lot of time and effort into making sure the default behavior highly secure, stable, fast and portable without compromise. We still need more tests, review and research, but I believe this PBKDF2 implementation will end up being an excellent choice for everybody.
comment:9 by , 13 years ago
Replying to jart:
Sorry I didn't notice the incorrect references to the old utils file; I forgot to delete the pyc file :\
Hm, you seem to have put imports there now, please move the content of the passhash.py
file which was previously in the utils.py
file back there and the hasher classes to a file called hashers.py
in the auth contrib app's. The utils.py
*is not deprecated* as you claim. The reason is simple: there isn't any other password hashing happening outside the auth contrib app, so it should live there, too.
I uploaded a new patch with numerous changes including that backslash thing. The change log is here: https://github.com/jart/django/commits/auth-hashing
You might be right that people should overload classes to change the settings for a hasher so I included that in the new patch. We must however keep the list because if people use a third party password hasher, they need a way to migrate back to the standard ones. The only way to avoid this is to not allow people to write their own custom hashers.
You misunderstood me, the setting would *not* be a list of hashers, but just the module path of the activated hasher.
By default the global_settings.py
should contain::
AUTH_PASSWORD_HASHER = 'django.contrib.auth.hashers.PBKDF2PasswordHasher'
and users can override that in their own settings.py
accordingly (in case they have any options to modify, which is unlikely):
AUTH_PASSWORD_HASHER = 'mysite.security.MyCustomBCryptPasswordHasher'
I honestly think people should be discouraged from using the setting at all. We put a lot of time and effort into making sure the default behavior highly secure, stable, fast and portable without compromise. We still need more tests, review and research, but I believe this PBKDF2 implementation will end up being an excellent choice for everybody.
Well, even if it's not a common case, those users that need to switch the password algorithm (probably due to backwards compatibility or security requirements) need some kind of documentation of how to do it. That includes documenting the various hasher classes in that Django ships in the django.contrib.auth.hashers
.
comment:10 by , 13 years ago
After talking to Carl on IRC, I'm sorry to say that I misunderstood the reason why you used a list for the setting (to enforce an order of precedence).
So the global_settings.py
would indeed have this:
AUTH_PASSWORD_HASHERS = ( 'django.utils.passhash.PBKDF2PasswordHasher', 'django.utils.passhash.BCryptPasswordHasher', 'django.utils.passhash.SHA1PasswordHasher', 'django.utils.passhash.MD5PasswordHasher', 'django.utils.passhash.CryptPasswordHasher', )
and users could add their own implementation if needed:
AUTH_PASSWORD_HASHERS = ( 'mysite.security.MyCustomBCryptPasswordHasher', 'django.utils.passhash.PBKDF2PasswordHasher', # etc )
That said, I'm still convinced this should be part of the auth app and not of django.utils
.
comment:11 by , 13 years ago
Yeah, we're still discussing the location question :-)
I agree with jezdez that the PASSWORD_HASHERS setting, and the built-in hasher classes, do need to be documented, even if we don't expect most people to need to use them.
comment:12 by , 13 years ago
jezdez, could we please talk about this on IRC for a moment? I'm jart on freenode
comment:14 by , 13 years ago
So the question of where the hashing code goes is a minor issue compared to the bulk of the work here (which is awesome, btw), but since there is an unresolved disagreement, I'll briefly summarize for posterity's sake the points on either side from the conversation in IRC (jezdez, please correct if I mis-represent anything):
In favor of putting it in django.utils.passhash, as the current patch does (or django.core.passhash?):
- Contrib.auth is notorious for not being as flexible as people want (this was raised numerous times at DjangoCon), and some people build their own alternative auth solutions. It would be good for these people to use a standard, common, well-tested-and-reviewed password hashing implementation rather than rolling their own. Putting it in core encourages people to perceive it as a low-level standard utility rather than a part of contrib.auth, and doesn't require them to add a dependency to a contrib app they don't otherwise use.
- Contrib.auth is likely to get a major overhaul soonish; it'd be good for the password hashing implementation to not be subject to any upheavals or namespace deprecations or whatnot that might be involved in that.
In favor of putting it in contrib.auth.utils:
- Nothing else in Django besides contrib.auth uses it, and password-hashing isn't really useful outside of auth/auth, so it may as well be part of contrib.auth, which is our only built-in auth/auth solution.
- django.utils is kind of a miscellaneous dumping ground for stuff, and not everything in there is currently well-maintained (this was asserted, anyway; I haven't checked myself to look for examples).
- A django.contrib.auth.utils module was recently added, and is now the home of the current password-hashing code.
(In case it isn't clear, I favor putting it in core rather than contrib.)
comment:15 by , 13 years ago
I think the passhash stuff belongs in contrib.auth. It's specific enough to our use case that it would almost certainly require tweaking to work elsewhere. The upgrade mechanism, the specific string format - all of these are implementation details related to auth. It's a good system, but there's not a huge amount of stuff in it that reflects tricky (security-wise) implementation details.
The PBKDF2 stuff belongs in utils.crypto.
For better or worse the majority of the hashing code[1] IS crypto code. It's defined in PKCS#5v2, and the implementation is designed primarily for deriving a key suitable for use in encryption. Our verification use case is an expected application of the spec, but is only a small part of what the code is intended to do.
I'm not currently in favor of adding "encryption" type features to Django, but if we ever did, this code would be relevant for that use case. It would also be relevant for other signing-type operations, if we decide to extend the current functionality. Even if Django itself does not ever make use of this code for any purpose other than hashing user passwords, I would like it to be available to developers as an explicit crypto function, rather than requiring them to repurpose code that lives as part of auth.
[1] Technically, this is key stretching, the "hashing" part is just the sha256 function.
A more readable version of the relevant bits of spec can be found here: http://www.ietf.org/rfc/rfc2898.txt
comment:16 by , 13 years ago
That split works for me. I was under the impression that the hasher classes themselves would be useful in another auth implementation, since they provide a uniform API to a variety of hashing strategies (and that the string format we were using for hashes was a sort of semi-standard), but anything that wouldn't likely be used by others without tweaking makes more sense to keep in contrib.auth.
comment:17 by , 13 years ago
Yeah, that seems like a smart way to put it.
- the hashing functions ought to be part of the
django.utils
package since they fit the goal ofdjango.utils
of providing low-level implementations - the specific implementation (the "hashers") ought to live in the auth app and use the low-level pieces by importing from
django.utils
and provide a high-level API for users to implement their own hasher
comment:19 by , 13 years ago
Severity: | Normal → Release blocker |
---|
Marking this as a release blocker so it doesn't get overlooked for 1.4. The patch is just about ready, aside from the minor location stuff discussed above.
comment:20 by , 13 years ago
Would anyone mind picking up this ticket? I've been heavily involved in organizing Occupy Wall Street for the past month and my time is stretched pretty thin :(
Also yes, OccupyWallSt.org runs Django :)
https://github.com/jart/occupywallst
comment:21 by , 13 years ago
Also if you absolutely need me to make this into 1.4 on time, I'll make time in my schedule :)
comment:22 by , 13 years ago
Owner: | changed from | to
---|
No problem. Assigning this ticket to myself, though I would appreciate code review from others.
comment:24 by , 13 years ago
Cc: | added |
---|
There has been some focus on closing down 1.4 and releasing an alpha. As a user, I absolutely want this code to go into that release. Is this ready for testing? If so, I'm willing to apply it and give it a shot on upgrading some data dumps of my projects. I note one thing I think we are missing is a description of how to use this and information about backward compatibility for the release notes.
comment:25 by , 13 years ago
Cc: | added |
---|
by , 13 years ago
Attachment: | django-passhash-2011-12-3.diff added |
---|
comment:26 by , 13 years ago
Needs tests: | unset |
---|
I've attached my current version of the patch, merged up to trunk. There are a few things left to do, the most prominent being to improve the documentation. Overall this code is pretty close to what will be committed. Comments are encouraged, and I'd love it if people could give it a try. It can also be found on github.
by , 13 years ago
Attachment: | 15367.5.diff added |
---|
Minor nitpicks, cleanups and other improvments. More details can be found in the commits at https://github.com/jezdez/django/tree/feature/auth-hashing
comment:27 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Marking DDN -- there's something worth pursuing here, but there are still details to sort out.