Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#31784 closed Bug (fixed)

Emails name over 75 characters are incompatible with the latest versions of python.

Reported by: Nick Orr Owned by: Florian Apolloner
Component: Core (Mail) Version: 2.2
Severity: Normal Keywords:
Cc: Joachim Jablon, Florian Apolloner Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In the process of sending an email the addresses is sanatized:

django/core/mail/message.py:98 => def sanitize_address(addr, encoding)

The Name portion is encoded via the Header class email.header.Header.encode which will introduce newlines at 75 characters.

Unfortunately the most recent python security update no longer allows that to happen. So when Address(nm, addr_spec=addr) is called in sanitize_address a new error is raised from the Python Standard library.

The update to python can be found here: https://github.com/python/cpython/commit/f91a0b6df14d6c5133fe3d5889fad7d84fc0c046#diff-3c5a266cd05e7d4173bf110ee93edd16

Essentially Django can no longer send emails with names longer then 75 chracters.

Change History (22)

comment:1 by Mariusz Felisiak, 5 years ago

Resolution: invalid
Status: newclosed
Summary: When sending emails with a name over 75 characters long new lines are introduced which is incompatible with the latest minor versions of python 3.6/7/8Emails name over 75 characters are incompatible with the latest minor versions of python 3.6/7/8.

name-addr like other message headers has line length limit (see RFC 2822) on the other hand it cannot contain CR and LF newline characters because they will be vulnerable for header injection attacks (see issue39073). As far as I'm aware there is not much we can do to keep it safe and allowed for the name-addr longer than 78 chars.

comment:2 by Mariusz Felisiak, 5 years ago

Summary: Emails name over 75 characters are incompatible with the latest minor versions of python 3.6/7/8.Emails name over 75 characters are incompatible with the latest versions of python.

comment:3 by Nick Orr, 5 years ago

RFC 2822:

Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters

Instead of Django being ok with feeding an underlying Python library data that will cause an exception as with all things Python it seems like Django would be better off getting out of the way and letting the users shoot themselves in the foot.

I'd suggest that this call:

nm = Header(nm, encoding).encode()

Include the actual RFC line length limit of 998 in either the class instantiation or the encode method as either of those accept maxlinelen as a kwarg.

Or as a non-changing compromise it could be hived off into setting with the default being 75 as it stands now. That would allow developers to overwrite the setting if needed. Though it would also add yet another setting for people to know about.

Either way the current implementation is likely to cause a great deal of pain as developers find out that an email addressed to: "Django with a really long name for reasons that make sense in the context of a project <noreply@…>" are inexplicably crashing in core Python code they've never seen before.

Most of us would rather send a wonky, if technically correct, email versus just having that email vanish into the ether.

comment:4 by Carlton Gibson, 5 years ago

Hi Nick.

If this is an issue, surely it's an issue in the Python stdlib? Is there a discussion there that we can point to?

What do the Python core devs says when you quote that line of RFC 2822?

Since this is a security sensitive issue, we can't bypass the stdlib implementation without being 100% sure that the judgement is correct. (If we were able to make that case, surely it would be convincing for the stdlib implementation too?)

Maybe there's an issue for Django here, but we need to pursue the Python course first. (Perhaps folks are already raising it there...?)

I hope that makes sense.

comment:5 by Carlton Gibson, 5 years ago

Include the actual RFC line length limit of 998 in either the class instantiation or the encode method as either of those accept maxlinelen as a kwarg.

Maybe we could do this.

in reply to:  4 comment:6 by Nick Orr, 5 years ago

https://github.com/python/cpython/pull/19007/files

This issue is caused by a recently issued patch for Python as linked above. As far as I can tell the standard library does not really care about the length in this context. And you are correct we wouldn't want to bypass using the stdlib here we might just not want to insert newlines when they aren't necessarily required.

I haven't brought it up with the Python folks but from their point of view I'm sure they see their implementation as fine. More so since the headerregistry.py file in question has this in its preamble:

Eventually HeaderRegistry will be a public API, but it isn't yet, and will probably change some before that happens.

Just for clarity this is the line in question in Django. https://github.com/django/django/blob/master/django/core/mail/message.py#L99

comment:7 by Mariusz Felisiak, 5 years ago

Cc: Joachim Jablon added

Joachim, Can I ask for your opinion?

comment:8 by Florian Apolloner, 5 years ago

Resolution: invalid
Status: closednew

comment:9 by Florian Apolloner, 5 years ago

Cc: Florian Apolloner added
Triage Stage: UnreviewedAccepted

comment:10 by Florian Apolloner, 5 years ago

@Nick: How are you generating those emails in the first place currently? I nowhere see a possibility to specify an actual name in https://docs.djangoproject.com/en/3.0/topics/email/

I also cannot reproduce it with:

In [12]: e = EmailMessage('subject', 'content', 'from@example.com', ['to@example.com' * 99])                                                                                                                                                  

In [13]: e.message()                                                                                                                                                                                                                          
Out[13]: <django.core.mail.message.SafeMIMEText at 0x7fc900437050>

In [14]: e.message().as_bytes()                                                                                                                                                                                                               
Out[14]: b'Content-Type: text/plain; charset="utf-8"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\nSubject: subject\nFrom: from@example.com\nTo: \n to@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.comto@example.com\nDate: Tue, 14 Jul 2020 20:41:37 -0000\nMessage-ID: <159475929766.21330.4162456845040318158@apollo13>\n\ncontent'

I assume you are manually injecting to headers?

comment:11 by Florian Apolloner, 5 years ago

Oh, it needs a non-ascii character to trigger encoding in the first place:

from django.conf import settings
settings.configure()

from django.core.mail.message import EmailMessage
e = EmailMessage('subject', 'content', 'from@example.com', ['"TestUser ä%s" <to@example.com>' % ('0' * 100)])

print(e.message().as_string())

raises

Traceback (most recent call last):
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 62, in forbid_multi_line_headers
    val.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 10: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "testing.py", line 7, in <module>
    print(e.message().as_string())
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 242, in message
    self._set_list_header_if_not_empty(msg, 'To', self.to)
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 397, in _set_list_header_if_not_empty
    msg[header] = value
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 154, in __setitem__
    name, val = forbid_multi_line_headers(name, val, self.encoding)
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 65, in forbid_multi_line_headers
    val = ', '.join(sanitize_address(addr, encoding) for addr in getaddresses((val,)))
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 65, in <genexpr>
    val = ', '.join(sanitize_address(addr, encoding) for addr in getaddresses((val,)))
  File "/home/florian/sources/django.git/django/core/mail/message.py", line 107, in sanitize_address
    parsed_address = Address(nm, username=localpart, domain=domain)
  File "/usr/lib64/python3.7/email/headerregistry.py", line 37, in __init__
    raise ValueError("invalid arguments; address parts cannot contain CR or LF")
ValueError: invalid arguments; address parts cannot contain CR or LF

comment:12 by Florian Apolloner, 5 years ago

This patch:

diff --git a/django/core/mail/message.py b/django/core/mail/message.py
index 607eb4af0b..2a70528644 100644
--- a/django/core/mail/message.py
+++ b/django/core/mail/message.py
@@ -96,8 +96,8 @@ def sanitize_address(addr, encoding):
         nm, address = addr
         localpart, domain = address.rsplit('@', 1)
 
-    nm = Header(nm, encoding).encode()
     # Avoid UTF-8 encode, if it's possible.
+    # TODO: Is anything non-ascii even allowed in the local part?
     try:
         localpart.encode('ascii')
     except UnicodeEncodeError:
@@ -105,7 +105,11 @@ def sanitize_address(addr, encoding):
     domain = punycode(domain)
 
     parsed_address = Address(nm, username=localpart, domain=domain)
-    return str(parsed_address)
+    if nm:
+        display_name = Header(parsed_address.display_name, encoding).encode()
+        return f'{display_name} <{parsed_address.addr_spec}>'
+    else:
+        return '' if parsed_address.addr_spec=='<>' else parsed_address.addr_spec
 
 
 class MIMEMixin:

creates To headers as follows:

Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: subject
From: from@example.com
To: 
 =?utf-8?q?TestUser=2C_=C3=A4000000000000000000000000000000000000000000000000?=
 =?utf-8?q?0000000000000000000000000000000000000000000000000000?=
 <to@example.com>,
 =?utf-8?q?TestUser=2C_=C3=A4000000000000000000000000000000000000000000000000?=
 =?utf-8?q?0000000000000000000000000000000000000000000000000000?=
 <to@example.com>
Date: Tue, 14 Jul 2020 22:17:04 -0000
Message-ID: <159476502450.14868.2537973479953610602@apollo13>

content

which looks correct at a first glance.

comment:13 by Florian Apolloner, 5 years ago

I somewhat feel that the security fix in Python is weird. As far as I understand you are supposed to pass in properly encoded names; but if you don't encode them with linebreaks who does it later on? (noone I feel)

comment:14 by Florian Apolloner, 5 years ago

Btw, does using Address offer us much over just keeping our code and manually concatenating the display_name, local_part and domain? Our code already checks for header injections if any header where to contain a newline in the first place…

comment:15 by Mariusz Felisiak, 5 years ago

Has patch: set
Owner: changed from nobody to Mariusz Felisiak
Status: newassigned

comment:16 by Joachim Jablon, 5 years ago

@felixmm I appreciate the ping but I’m afraid I’m not familiar enough with the intricacies of that part of the email RFCs to be really of any help. Florian’s patch seems to make quite some sense given the bug, I’m not sure of a compelling reason not to go in this direction.

I’ll review the PR :)

comment:17 by Mariusz Felisiak, 5 years ago

Owner: changed from Mariusz Felisiak to Florian Apolloner
Triage Stage: AcceptedReady for checkin

comment:18 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In f405954:

Refs #31784 -- Added test for preventing header injection in display name of email addresses.

comment:19 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

Resolution: fixed
Status: assignedclosed

In 96a3ea39:

Fixed #31784 -- Fixed crash when sending emails on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Fixed sending emails crash on email addresses with display names longer
then 75 chars on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Wrapped display names were passed to email.headerregistry.Address()
what caused raising an exception because address parts cannot contain
CR or LF.

See https://bugs.python.org/issue39073

Co-Authored-By: Mariusz Felisiak <felisiak.mariusz@…>

comment:20 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In bfe404de:

[3.1.x] Fixed #31784 -- Fixed crash when sending emails on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Fixed sending emails crash on email addresses with display names longer
then 75 chars on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Wrapped display names were passed to email.headerregistry.Address()
what caused raising an exception because address parts cannot contain
CR or LF.

See https://bugs.python.org/issue39073

Co-Authored-By: Mariusz Felisiak <felisiak.mariusz@…>

Backport of 96a3ea39ef0790dbc413dde0a3e19f6a769356a2 from master

comment:21 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In ccc088f8:

[3.0.x] Fixed #31784 -- Fixed crash when sending emails on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Fixed sending emails crash on email addresses with display names longer
then 75 chars on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Wrapped display names were passed to email.headerregistry.Address()
what caused raising an exception because address parts cannot contain
CR or LF.

See https://bugs.python.org/issue39073

Co-Authored-By: Mariusz Felisiak <felisiak.mariusz@…>

Backport of 96a3ea39ef0790dbc413dde0a3e19f6a769356a2 from master

comment:22 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In 1a3835f:

[2.2.x] Fixed #31784 -- Fixed crash when sending emails on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Fixed sending emails crash on email addresses with display names longer
then 75 chars on Python 3.6.11+, 3.7.8+, and 3.8.4+.

Wrapped display names were passed to email.headerregistry.Address()
what caused raising an exception because address parts cannot contain
CR or LF.

See https://bugs.python.org/issue39073

Co-Authored-By: Mariusz Felisiak <felisiak.mariusz@…>

Backport of 96a3ea39ef0790dbc413dde0a3e19f6a769356a2 from master.

Note: See TracTickets for help on using tickets.
Back to Top