1 | | I looked into this earlier today as part of ticket #35581, and was surprised to find that IDNA 2003 is probably still the correct choice for sending email. Documenting my findings here. |
2 | | |
3 | | The problem arises for domains containing one of the [https://www.unicode.org/reports/tr46/#Deviations "deviation characters"], where the two IDNA versions differ. For instance: |
4 | | |
5 | | IDNA 2003: otto@faß.example → otto@fass.example |
6 | | IDNA 2008: otto@faß.example → otto@xn--fa-hia.example |
7 | | |
8 | | If those two domains are owned by different people, and Django uses a different version of IDNA than Otto expects, Otto's email could go to the wrong person. Big problem. |
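For anyone who wants to reproduce the difference locally, here is a rough sketch using only the Python standard library. The built-in `idna` codec implements IDNA 2003. The `to_a_label()` helper is hypothetical and only for illustration: it punycode-encodes a label by hand to show what an IDNA 2008 style result looks like, and it skips all of the validation and mapping a real IDNA 2008 implementation would perform.

```python
domain = "faß.example"

# IDNA 2003: Python's built-in "idna" codec runs nameprep first,
# and nameprep maps "ß" to "ss" before any punycode encoding happens.
idna2003 = domain.encode("idna").decode("ascii")


def to_a_label(label: str) -> str:
    """Hypothetical helper: punycode a label as-is, with no IDNA validation.

    IDNA 2008 treats "ß" as a valid character and keeps it, so the label
    is punycode-encoded instead of being mapped to "ss".
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


idna2008_style = ".".join(to_a_label(label) for label in domain.split("."))

print(idna2003)        # fass.example
print(idna2008_style)  # xn--fa-hia.example
```

The two results are different hostnames, which is why the choice of IDNA version decides where the message ends up.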
9 | | |
10 | | So the question is, what version of IDNA does Otto expect? Browsers have all updated to IDNA 2008: if you enter http://faß.example, you will end up at http://xn--fa-hia.example, not http://fass.example. (You can try this with the .de equivalents to those domains, which are currently parked at different registrars.) |
11 | | |
12 | | I had assumed email should match the browsers, and be using IDNA 2008 by now. (And I was thinking that Django's ''not'' using it for email addresses was a serious security issue.) I was wrong. |
13 | | |
14 | | In testing earlier today, I found both Gmail and Outlook.com are still using IDNA 2003 for domains in address headers: both treat otto@faß.example as otto@fass.example. (They might be using IDNA 2008, but with UTS #46 "transitional processing" enabled, which retains the IDNA 2003 encoding for the deviation characters.) |
15 | | |
16 | | Bottom line: we wouldn't want to switch Django's sanitize_address() to use IDNA 2008 encoding (at least not without transitional processing), because ''that'' would actually introduce a security issue, by sending Otto's email to an unexpected domain. |
17 | | |
18 | | Also, if I'm understanding correctly, part of the request here is to be able to get Django's EmailMessage.message().as_string() to generate a message that hasn't had ''any'' encoding applied to the addresses, for use with SMTPUTF8. (That is, `To: jörg@faß.example` should stay just like that, not turn into `To: =?utf-8?q?j=C3=B6rg?=@fass.example`.) I'm hoping to address that as part of #35581, if [https://docs.python.org/3/library/email.policy.html#email.policy.SMTPUTF8 `email.policy.SMTPUTF8`] is used for EmailMessage.message(). |
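To illustrate the unencoded output being asked for, here is a small sketch using Python's own `email.message.EmailMessage` with `email.policy.SMTPUTF8`. This is plain stdlib behavior, not Django's `EmailMessage`, and it assumes nothing about how #35581 will end up wiring the policy in.

```python
from email import policy
from email.message import EmailMessage

# Python's stdlib EmailMessage (not django.core.mail.EmailMessage).
# The SMTPUTF8 policy sets utf8=True, so headers are serialized as UTF-8
# rather than being RFC 2047 encoded.
msg = EmailMessage(policy=policy.SMTPUTF8)
msg["To"] = "jörg@faß.example"
msg["Subject"] = "Hello"
msg.set_content("Hello")

text = msg.as_string()
print(text)
```

With this policy the `To` header should come out as `To: jörg@faß.example`, with neither the local part nor the domain rewritten. A message like this is only deliverable if the SMTP server advertises the SMTPUTF8 extension and the client requests it.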
19 | | |
20 | | Note that Django's SMTP EmailBackend doesn't currently support SMTPUTF8. That's probably best handled as a separate new feature request. (Or it could be implemented by a third-party custom EmailBackend.) |