#28415 closed Cleanup/optimization (fixed)
Clarify what characters ASCIIUsernameValidator and UnicodeUsernameValidator accept
Reported by: | Dan Collins | Owned by: | nobody |
---|---|---|---|
Component: | Documentation | Version: | 1.11 |
Severity: | Normal | Keywords: | |
Cc: | | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
Hello,
While investigating the current and historical validation rules for usernames in Django, I noticed an inconsistency in the documentation. The username field is described as:
username
Required. 150 characters or fewer. Usernames may contain alphanumeric, _, @, +, . and - characters.
but
class validators.ASCIIUsernameValidator
New in Django 1.10. A field validator allowing only ASCII letters, in addition to @, ., +, -, and _. The default validator for User.username on Python 2.
class validators.UnicodeUsernameValidator
New in Django 1.10. A field validator allowing Unicode letters, in addition to @, ., +, -, and _. The default validator for User.username on Python 3.
The documentation is inconsistent on whether ASCII numbers are allowed in usernames at all, and whether Unicode numbers are allowed in usernames when using the Unicode validator or Python 3.
Further, in light of this, the following paragraph isn't clear:
Usernames and Unicode
Django originally accepted only ASCII letters in usernames. Although it wasn’t a deliberate choice, Unicode characters have always been accepted when using Python 3. Django 1.10 officially added Unicode support in usernames, keeping the ASCII-only behavior on Python 2, with the option to customize the behavior using User.username_validator.
I believe that Django originally accepted only ASCII letters and numbers in usernames.
Note that there is such a thing as a Unicode number: I suspect the Unicode validator accepts all letter and number classes - not just 0-9, but also characters like ६ (Devanagari 6), ೬ (Kannada 6), ¹ (superscript), but reading the documentation makes me doubt this.
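One way to check this suspicion without relying on the documentation is to test the characters directly against a regular expression. The sketch below assumes both validators share a pattern of the form [\w.@+-]+ and differ only in whether re.ASCII is set. That pattern, and the names USERNAME_PATTERN and accepted_by, are illustrative assumptions, not code copied from Django.

```python
import re

# Assumed pattern: one character class for both validators, differing only
# in the regex flags. This is an illustration, not Django's source.
USERNAME_PATTERN = r"^[\w.@+-]+\Z"

ascii_username = re.compile(USERNAME_PATTERN, re.ASCII)  # \w is [a-zA-Z0-9_]
unicode_username = re.compile(USERNAME_PATTERN)          # \w follows Unicode rules


def accepted_by(username):
    """Return (accepted by ASCII pattern, accepted by Unicode pattern)."""
    return (
        bool(ascii_username.match(username)),
        bool(unicode_username.match(username)),
    )


samples = {
    "ASCII digits": "user123",
    "Devanagari 6": "user\u096c",
    "Kannada 6": "user\u0cec",
    "superscript 1": "user\u00b9",
    "space": "user name",
}

for label, name in samples.items():
    print(label, accepted_by(name))
```

Under these assumptions, usernames with the digits 0-9 pass both patterns, while ६, ೬ and ¹ pass only the Unicode one, so "letters" understates what either pattern accepts.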
Change History (3)
comment:1 by , 7 years ago
Has patch: | set |
---|---|
Summary: | Documentation for ASCIIUsernameValidator is not consistent → Clarify what characters ASCIIUsernameValidator and UnicodeUsernameValidator accept |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
According to the Python docs, \w matches "Unicode word characters; this includes most characters that can be part of a word in any language, as well as numbers and the underscore." For my PR, I used the term "Unicode characters." Let me know if you think that's not sufficiently precise.
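To judge whether "Unicode characters" is precise enough, it may help to see which Unicode general categories \w actually covers. The following is a rough sketch using only the standard library; the helper name word_char_categories and the sample string are made up for illustration.

```python
import re
import unicodedata

# A str pattern, so \w uses Unicode matching by default in Python 3.
WORD_CHAR = re.compile(r"\w")


def word_char_categories(chars):
    """Map each character matched by the word-character class to its
    Unicode general category; characters that do not match are omitted."""
    return {
        ch: unicodedata.category(ch)
        for ch in chars
        if WORD_CHAR.fullmatch(ch)
    }


# Letters, digits, a superscript, the underscore, and some punctuation.
sample = "a\u00e97\u096c\u00b9_@.+-"

for ch, category in word_char_categories(sample).items():
    print(repr(ch), category)
```

In this sample the matched characters fall into letter (Ll), number (Nd, No) and connector punctuation (Pc) categories, while @, ., + and - are not matched by \w at all, which is why the validators have to list them separately.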