Opened 13 years ago
Closed 6 months ago
#18119 closed New feature (fixed)
add DomainNameValidator to validate Internet Domain Names
Reported by: | michele | Owned by: | Nina Menezes |
---|---|---|---|
Component: | Core (Other) | Version: | dev |
Severity: | Normal | Keywords: | validators |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Internet Domain Names are very present throughout, e.g. in URLs and e-mail addresses.
This patch adds django.core.validators.DomainNameValidator to check Internet Domain Names.
DomainNameValidator supports IDNA (utf) domain names, but it can reject them if optional accept_idna=False is passed to the constructor (defaults to accept).
DomainNameValidator is implemented as follows:
- derives RegexValidator
- uses the same RegEx of URLValidator (with the additional constrain to accept only 127 labels / 255 chars, as per RFC)
- if plain validation fails, converts UTF -> ASCII with IDNA encoding and attempts to validate again (unless accept_idna=False)
Based on this Validator, I will submit separate patches to add DomainNameField support for models.
Attachments (2)
Change History (25)
by , 13 years ago
Attachment: | domainnamevalidator.txt added |
---|
comment:1 by , 13 years ago
Needs documentation: | set |
---|---|
Needs tests: | set |
Patch needs improvement: | set |
Triage Stage: | Unreviewed → Accepted |
by , 13 years ago
Attachment: | domainnamevalidator_2.txt added |
---|
DomainNameValidator patch #2 - remove validator-own attributes before super()
comment:2 by , 13 years ago
In attachment:domainnamevalidator_2.txt :
- fix the constructor argument as per claudep suggestion
- return a specific error message based on IDNA being accepted or not
comment:3 by , 10 years ago
Needs documentation: | unset |
---|---|
Needs tests: | unset |
Patch needs improvement: | unset |
Version: | 1.4 → master |
comment:4 by , 10 years ago
Patch needs improvement: | set |
---|
This is blocked on #20003. Patch will need to be updated after that change.
comment:7 by , 8 years ago
Patch needs improvement: | set |
---|
comment:8 by , 10 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:10 by , 8 months ago
Patch needs improvement: | unset |
---|
Thank you Nina for working on this! I have unset the patch needs improvement
flag so this is listed in the "needs review" list in the dashboard.
comment:11 by , 7 months ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
comment:12 by , 7 months ago
Needs documentation: | unset |
---|---|
Patch needs improvement: | unset |
Documentation now updated.
comment:13 by , 6 months ago
Patch needs improvement: | set |
---|
comment:14 by , 6 months ago
Patch needs improvement: | unset |
---|
comment:15 by , 6 months ago
Patch needs improvement: | set |
---|
follow-up: 18 comment:16 by , 6 months ago
Somewhat naive question: are there still use cases to exclude IDNA domain names from such a validator, 12 years after the initial report?
comment:17 by , 6 months ago
Patch needs improvement: | unset |
---|
comment:18 by , 6 months ago
Replying to Claude Paroz:
Somewhat naive question: are there still use cases to exclude IDNA domain names from such a validator, 12 years after the initial report?
Good question, perhaps it makes sense to remove the accept_idna
argument and not support excluding IDNA domain names for now. Then if someone has a use case this can be raised in a new ticket?
Nina - what do you think?
comment:19 by , 6 months ago
Patch needs improvement: | set |
---|
(open question as to whether we should drop accept_idna
)
follow-up: 21 comment:20 by , 6 months ago
The reason for adding accept_idna had nothing to do with how "modern" the format was. It's a way to allow the developer to enforce ascii names.
Although python beautifully supports UTF, the developer is often required to handle non-ascii characters specially. For example, when checking their length, or storing them somewhere, or serializing them to somewhere else. The exceeding rarity of IDNA causes many to assume the domain is ASCII, only to run into exception later in production.
If not else, the presence of the accept_idna option forces the developer to think whether that's a relevant case for them, and what to do about it.
I wouldn't necessarily consider adding it if the patch was without, but I certainly see value in keeping it now it's already there.
PS: I'm the original submitter. Trac wouldn't send me a reset-password link.
comment:21 by , 6 months ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Replying to michele:
I wouldn't necessarily consider adding it if the patch was without, but I certainly see value in keeping it now it's already there.
PS: I'm the original submitter. Trac wouldn't send me a reset-password link.
Thank you for your input. In which case, I am marking the patch as "Ready for checkin".
I think it is a good idea to centralize domain name validation. I detected at least three points where we could use this validator (django/core/mail/message.py, django/core/validators.py (URLValidator), django/utils/html.py).
For DRY reasons, the domain regex part should be shared with URLValidator. In
__init__
, just useself.accept_idna = kwargs.pop('accept_idna', True)