1 | | If this gets reconsidered in the future, it will need to address potential security issues in changing how django.core.mail encodes recipient domains. As of July 2024, using IDNA 2003 for sending email (''not'' IDNA 2008) still seems to be the correct choice, or at least it matches what Gmail and Microsoft's Outlook.com do. Details in https://github.com/django/django/pull/16276#issuecomment-2227512278. |
| 1 | As of July 2024, some major email platforms and apps were inconsistent on which version of IDNA to follow. Gmail and Microsoft's Outlook.com appear to still use IDNA 2003 (or maybe Unicode UTS #46 transitional) to encode recipient domains. Apple's Mail apps and Thunderbird use IDNA 2008 (or UTS #46 ''non''-transitional). |
| 2 | |
| 3 | Django's EmailValidator can incorrectly reject domains that are now valid (under IDNA 2008) but were disallowed by IDNA 2003. (Example: `editor@މިހާރު.example.mv`.) This is arguably a bug, but I haven't been able to find any real-world cases of such domains being used for email. |
| 4 | |
| 5 | On the URLValidator side, things seem clearer. Browsers follow [https://url.spec.whatwg.org/#idna WHATWG's URL standard], which specifies that: |
| 6 | |
| 7 | > This document and the web platform at large use Unicode IDNA Compatibility Processing and not IDNA2008. For instance, ☕.example becomes xn--53h.example and not failure. |
| 8 | |
| 9 | Django's URLValidator correctly allows domains that are valid under [https://www.unicode.org/reports/tr46/tr46-33.html UTS #46] ("Unicode IDNA"), and it has done so since #20003 was fixed in 2015. (The apparent use of IDNA 2003/punycode() in URLValidator is in dead code; I'll open a separate ticket to clean that up.) |
| 10 | |
| 11 | It's true that the regular expressions in URLValidator and DomainNameValidator could also allow some strings that are not technically valid domains. But there ''isn't'' currently any complete Python implementation of UTS #46. (The idna package provides a partial implementation, and it rejects domains that WHATWG would allow.) |
| 12 | |
| 13 | Given all that, it seems better for URLValidator to allow some invalid domains than to incorrectly reject some valid ones. Here are a couple of real-world URLs that both browsers and URLValidator (correctly) allow: |
| 14 | - `https://މިހާރު.com` - domain valid under IDNA 2008 but not IDNA 2003 (the name of a Maldivian newspaper in the local language and script, redirects to a Romanized version of their name) |
| 15 | - `http://👓.ws` - domain valid under UTS #46 but not IDNA 2008 (emoji domain owned by an eyeglasses retailer) |
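
For anyone who wants to poke at the IDNA 2003 side of this, here is a minimal sketch using only Python's standard library. The built-in `idna` codec implements RFC 3490 (IDNA 2003), so it gives a rough idea of what an IDNA 2003 sender would put on the wire. The helper name `to_ascii_idna2003` and the `bücher`/`straße` sample domains are made up for illustration; this is not Django's code, and other IDNA 2003 implementations may not match the codec exactly on edge cases.

```python
def to_ascii_idna2003(domain):
    """Hypothetical helper: encode a domain with Python's built-in
    ``idna`` codec (RFC 3490, i.e. IDNA 2003).

    Returns the ASCII form as a str, or None if the codec rejects
    the domain.
    """
    try:
        return domain.encode("idna").decode("ascii")
    except UnicodeError:
        # The codec raises UnicodeError (or a subclass) for labels it
        # considers invalid under its IDNA 2003 rules.
        return None


# Sample domains: two illustrative ones plus two taken from the
# discussion above. Inspect `results` to see how the stdlib codec
# treats each one; None means the codec refused to encode it.
samples = [
    "bücher.example",
    "straße.example",
    "މިހާރު.example.mv",
    "☕.example",
]
results = {domain: to_ascii_idna2003(domain) for domain in samples}
```

The `straße.example` case shows why switching IDNA versions is not a cosmetic change: IDNA 2003 maps `ß` to `ss`, so the codec produces `strasse.example`, whereas IDNA 2008 keeps `ß` as a distinct character and so yields a different ASCII domain. The same address could therefore be routed to different domains depending on which version the sender follows.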