Opened 9 years ago

Last modified 8 months ago

#26423 new Cleanup/optimization

Make EmailValidator use HTML5 validation rather than more complicated regular expressions

Reported by: Tim Graham Owned by:
Component: Core (Other) Version: dev
Severity: Normal Keywords:
Cc: Ülgen Sarıkavak Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

As discussed on the django-developers mailing list, the regular expressions for validating email addresses are complicated for questionable benefit. We should simplify them to use HTML5 type="email" validation (possible candidate). A deprecation may be needed to give projects time to add back any more complex validation they might require.
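For illustration, here is a minimal sketch of what HTML5-style validation could look like in Python. The pattern is the one published in the WHATWG HTML specification for type="email" inputs, transcribed into Python syntax. The names HTML5_EMAIL_RE and is_html5_email are hypothetical and are not part of Django; this is not the patch proposed for this ticket.

```python
import re

# Pattern from the WHATWG HTML spec for type="email", transcribed to Python.
# It is ASCII-only and deliberately simpler than RFC 5322.
HTML5_EMAIL_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"                      # local part
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"        # first domain label
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"  # further labels
)


def is_html5_email(value: str) -> bool:
    """Hypothetical helper: True if value matches the HTML5 email pattern."""
    # fullmatch anchors both ends, so a trailing newline is rejected too.
    return HTML5_EMAIL_RE.fullmatch(value) is not None
```

Note that this pattern accepts single-label domains such as user@localhost and rejects any non-ASCII character, so projects relying on stricter or internationalized validation would need to layer their own checks on top.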

Change History (18)

comment:1 by Sergei Maertens, 9 years ago

We should also assert that non-ascii characters are allowed in the local part of the e-mail address, see #25986 for background on this issue.

comment:2 by Chris Butler, 8 years ago

Owner: changed from nobody to Chris Butler
Status: new → assigned

comment:3 by Chris Butler, 8 years ago

Taking ownership of this at the PyCon 2016 sprint. Discussed with Markus and have background on the consensus direction a solution should take.

comment:4 by Claude Paroz, 8 years ago

It would have been nice to know the "consensus direction" mentioned in the comment above!

comment:5 by Claude Paroz, 8 years ago

#27029 was closed as a duplicate.

comment:6 by Claude Paroz, 8 years ago

This PR only marginally simplifies the regex, but all current tests are still passing, in addition to the non-ASCII local part.
I'm not sure it solves all concerns of this report, to be discussed/confirmed.

comment:7 by Claude Paroz, 8 years ago

Has patch: set

comment:8 by Claude Paroz, 8 years ago

Has patch: unset

My patch was moved to #27029.

comment:9 by Jeff Willette, 8 years ago (in reply to: description)

Replying to timgraham:

As discussed on the django-developers mailing list, the regular expressions for validating email addresses are complicated for questionable benefit. We should simplify them to use HTML5 type="email" validation (possible candidate). A deprecation may be needed to give projects time to add back any more complex validation they might require.

I've found another possible fix here: https://html.spec.whatwg.org/multipage/forms.html#valid-e-mail-address. However, I have tried this regex and it does not match much outside the ASCII range. What do "HTML5"-like browsers match that this regex does not? If Django wants to accept a large range of Unicode characters, why can't it just match the whole Unicode range in the local part of the email?
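To make the question concrete, the sketch below compares the local-part character class from the WHATWG pattern with a hypothetical variant that also accepts every code point above U+007F. The names ASCII_LOCAL_RE, UNICODE_LOCAL_RE, and the two helper functions are assumptions made for illustration; neither pattern is taken from Django.

```python
import re

# Local-part character class from the WHATWG pattern (ASCII only).
ASCII_LOCAL_RE = re.compile(r"[-a-zA-Z0-9.!#$%&'*+/=?^_`{|}~]+")

# Hypothetical extension: the same class plus every code point above U+007F.
UNICODE_LOCAL_RE = re.compile(
    r"[-a-zA-Z0-9.!#$%&'*+/=?^_`{|}~\u0080-\U0010FFFF]+"
)


def ascii_local_ok(local: str) -> bool:
    """True if the local part uses only the WHATWG ASCII characters."""
    return ASCII_LOCAL_RE.fullmatch(local) is not None


def unicode_local_ok(local: str) -> bool:
    """True if the local part passes the hypothetical extended class."""
    return UNICODE_LOCAL_RE.fullmatch(local) is not None
```

With these definitions, ascii_local_ok("jörg") is False while unicode_local_ok("jörg") is True. One trade-off of matching the whole range is that it also admits characters such as U+00A0 (no-break space), which is presumably part of why the Unicode question is worth its own discussion.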


comment:10 by Claude Paroz, 8 years ago

I suggest that Unicode validation be discussed separately in #27029.

comment:11 by Tim Graham, 8 years ago

Owner: Chris Butler removed
Status: assigned → new

comment:12 by Tim Graham, 8 years ago

Has patch: set

PR from Collin Anderson.

comment:13 by Tim Graham, 8 years ago

Needs documentation: set

comment:14 by Haris Ibrahim K. V., 8 years ago

Owner: set to Haris Ibrahim K. V.
Status: new → assigned

I've added some documentation that I think will help keep the user expectations clear after digging into the conversation / discussion history of this ticket.

https://github.com/django/django/pull/8081

comment:15 by MisRob, 8 years ago

#27932 was closed as a duplicate

comment:16 by Max Nordlund, 8 years ago

I came across this today and realized that the current regexp, in 1.10.5, contains a typo. At least I think it does, since the opening bracket of the second range in user_regex is escaped, but it shouldn't be.

Anyway, I threw together a small PR to fix that, and I hope it can be merged to stable until this lands. I also added a bunch of new examples, and at the very least those should get slurped into the PR from Haris.

See https://github.com/django/django/pull/8187


comment:17 by Haris Ibrahim K. V., 8 years ago

Owner: Haris Ibrahim K. V. removed
Status: assigned → new

Not maintaining the PR any more. Please do feel free to fork and submit a new PR.

comment:18 by Ülgen Sarıkavak, 8 months ago

Cc: Ülgen Sarıkavak added