Opened 7 years ago

Closed 7 years ago

#29276 closed Bug (duplicate)

Compatibility of charset works not well when there are non-ascii characters in HTTPResponse headers

Reported by: sanjiely Owned by: nobody
Component: HTTP handling Version: 2.0
Severity: Normal Keywords: HTTPResponse header str bytes
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by sanjiely)

Python: 3.6.4
Django: 2.0.3
Browser: Chrome v65

To tell the browser to download a file, my code as below:

response = HttpResponse(my_data, content_type='text/plain')
response['Content-Disposition'] = 'attachment; filename="中中foo.xls"'

(Note: there are non-ascii characters in the filename.)
My code worked well in Python2.7 and Django1.8.

But when I transform my version to Django v2 and Python v3. The behavior of the browser changed: Chrome opened my file directly and wouldn't download the file as an attachment any more.
I catch the network packet, and found the response headers become as:

Content-Disposition: =?utf-8?b?YXR0YWNobWVudDsgZmlsZW5hbWU9IuS4reS4rWZvby54bHMi?=
Content-Length: 13
Content-Type: text/plain

Very strange header "Content-Disposition", isn't it?
I guess there is an unexpected code path in the Django lib file "response.py".
I tracked it, and found the code running into an exception statement, line 125, file "response.py"

       except UnicodeError as e:
            if mime_encode:
                print(value)
                value = Header(value, 'utf-8', maxlinelen=sys.maxsize).encode()
                print(value)

The output of these two print statement are:

attachment; filename="中中foo.xls"
=?utf-8?b?YXR0YWNobWVudDsgZmlsZW5hbWU9IuS4reS4rWZvby54bHMi?=

Thanks.

Change History (8)

comment:1 by sanjiely, 7 years ago

Summary: Advise to turn the headers of HTTPResponse from str to bytes automaticallySuggest to turn the headers of HTTPResponse from str to bytes automatically

comment:2 by sanjiely, 7 years ago

Summary: Suggest to turn the headers of HTTPResponse from str to bytes automaticallyTurn the headers of HTTPResponse from str to bytes automatically

comment:3 by Claude Paroz, 7 years ago

Description: modified (diff)

comment:4 by Claude Paroz, 7 years ago

Sorry, but you didn't show by your example why setting the header as a normal unicode string is a problem. You should develop the everthing will out of control part of your description.

comment:5 by Claude Paroz, 7 years ago

Resolution: duplicate
Status: newclosed

Mmmh, I guess this is in fact a duplicate of #16470.

comment:6 by sanjiely, 7 years ago

Description: modified (diff)
Resolution: duplicate
Status: closednew
Summary: Turn the headers of HTTPResponse from str to bytes automaticallyCompatibility of charset works not well when there are non-ascii characters in HTTPResponse headers
Type: Cleanup/optimizationBug

comment:7 by Simon Charette, 7 years ago

Shouldn't HTTP headers only contain ISO-8859-1 characters and RFC5987 be followed to include parameters containing characters outside of its range?

I think your header should be something along.

from urllib.parse import quote
response['Content-Type'] = "attachment; filename*=UTF-8''{}".format(quote(filename))

By the way I'm also pretty confident this is a duplicate of #16470.

Last edited 7 years ago by Simon Charette (previous) (diff)

comment:8 by Tim Graham, 7 years ago

Resolution: duplicate
Status: newclosed
Note: See TracTickets for help on using tickets.
Back to Top