Opened 12 years ago
Closed 6 years ago
#20147 closed New feature (fixed)
Provide an alternative to request.META for accessing HTTP headers
Reported by: | Luke Plant | Owned by: | Santiago Basulto |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | marc.tamlyn@…, tom@…, ben@…, Zach Borboa, Santiago Basulto | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
From the docs:
HttpRequest.META
A standard Python dictionary containing all available HTTP headers...
With the exception of CONTENT_LENGTH and CONTENT_TYPE, as given above, any HTTP headers in the request are converted to META keys by converting all characters to uppercase, replacing any hyphens with underscores and adding an HTTP_ prefix to the name. So, for example, a header called X-Bender would be mapped to the META key HTTP_X_BENDER.
The question is, why? Why do we have this ridiculous transform? It is pure silliness, whose only explanation is a quirk of CGI, which is now totally irrelevant.
You should be able to look up a header in the HTTP spec and do something very simple to get it from the HTTP request. How about this API:
request.HEADERS['Host']
(for consistency with GET/POST/FILES etc.), or even
request['Host']
Dictionary access should obey HTTP rules about case-sensitivity of the header names.
This would also have the advantage that repr(request) wouldn't have lots of junk you don't need, i.e. the entire content of os.environ, which, on a developer machine especially, can have a lot of noise (mine does).
It also future-proofs us for when WSGI is replaced with something more sensible, and the whole silly round trip to os.environ can be removed completely, or if we want to support something else parallel to WSGI and client code wants to access HTTP headers in the same way for both.
This leaves a few things in META that are not derived from an HTTP header, and do not have a way of accessing them from the request object. I think these are just:
- SCRIPT_NAME - this is a CGI leftover, that is only useful in constructing other things, AFAICS
- QUERY_STRING - this can be easily constructed from request.get_full_path() for the rare times that you need the raw query string rather than request.GET
- SERVER_NAME - should use get_host() instead
- SERVER_PORT - use get_host()
- SERVER_PROTOCOL - could use is_secure(), but perhaps it would be nice to have a convenience get_protocol() method.
(see http://wsgi.readthedocs.org/en/latest/definitions.html)
Change History (27)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
HTTP headers are case insensitive. You want to get rid of the transform, but what happens when someone sends "accept: " and you check for HEADERS['Accept']?
comment:3 by , 12 years ago
As stated above, "Dictionary access should obey HTTP rules about case-sensitivity of the header names."
I didn't say get rid of the transform - it should be done within the API, not by the user of the API. In terms of implementation, request.HEADERS['Accept'] will map straight to request._META['HTTP_ACCEPT'], at least for wsgi, or do something equivalent that will ensure case-insensitivity.
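A minimal sketch of the mapping described in this comment, assuming a WSGI-style META dict. The class name HeadersView and its helper _meta_key are hypothetical and are not part of Django; the special-casing of CONTENT_TYPE and CONTENT_LENGTH follows the documentation quoted in the description.

```python
from collections.abc import Mapping

# Headers that the META dict stores without the HTTP_ prefix.
UNPREFIXED = {"CONTENT_TYPE", "CONTENT_LENGTH"}


class HeadersView(Mapping):
    """Hypothetical read-only, case-insensitive view of the headers in META."""

    def __init__(self, meta):
        self._meta = meta

    @staticmethod
    def _meta_key(name):
        # 'Accept' / 'accept' / 'ACCEPT' all become 'HTTP_ACCEPT'.
        key = name.replace("-", "_").upper()
        return key if key in UNPREFIXED else "HTTP_" + key

    def __getitem__(self, name):
        return self._meta[self._meta_key(name)]

    def __iter__(self):
        # Only header-derived keys are exposed; SERVER_NAME etc. are skipped.
        for key in self._meta:
            if key.startswith("HTTP_"):
                yield key[len("HTTP_"):].replace("_", "-").title()
            elif key in UNPREFIXED:
                yield key.replace("_", "-").title()

    def __len__(self):
        return sum(1 for _ in self)
```

With this, HeadersView(request.META)['accept'] and HeadersView(request.META)['Accept'] would read the same META entry, so the user of the API never sees the transform.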
comment:4 by , 12 years ago
There are a few more things that need considering if this is to be done:
- RequestFactory and the test Client, and their APIs which pass directly to request.META.
- REMOTE_ADDRESS, REMOTE_USER
- SECURE_PROXY_SSL_HEADER
comment:5 by , 12 years ago
Minor bikeshed-type question: is there really value in making request.HEADERS all-caps? I realize the parallel to request.POST, request.GET, and request.META, but the former two are all-caps simply because HTTP methods are usually written that way. I guess I'd just like to see a bit of rationale spelled out for how we decide whether a given request attribute ought to be all-caps; I'd probably lean towards just request.headers for the new API.
More discussion of this proposal (in particular, whether to deprecate/change request.META) is here: https://groups.google.com/d/topic/django-developers/Jvs3F79cY4Y/discussion
comment:6 by , 12 years ago
Cc: | added |
---|
It would be consistent for request.headers to be lowercase to match up with request.body, for example.
comment:7 by , 12 years ago
Should we consider having request.headers return unicode values rather than byte values?
Correctly decoding HTTP headers is slightly fiddly - the default supported encoding is iso-8859-1, but utf-8 can also be supported as per RFC 2231, RFC 5987.
Getting the decoding right probably isn't something we want developers to have to think about.
Note: For real-world usage see this example of browser support for utf-8
in uploaded filenames: https://code.google.com/p/chromium/issues/detail?id=57830
comment:9 by , 12 years ago
Okay, noticed that the link to chrome's use of iso-8859-1 is actually for response headers, so disregard that.
The question regarding unicode vs byte values still stands, though.
comment:10 by , 12 years ago
I'm happy with request.headers instead of request.HEADERS - the parallel to request.body does make more sense than request.GET.
Regarding unicode/bytes, it's a very thorny issue, and the more I look into it the worse it gets. PEP 3333 might apply, if we are assuming a simple mapping to request.META, but that essentially leaves decoding issues to the user if I'm reading it correctly.
comment:11 by , 12 years ago
Okay, maybe it's not obvious if unicode values would be preferable or not.
I thought I'd take a look at what the requests library does, and found this similar ticket: https://github.com/kennethreitz/requests/pull/1181
If it is something that we decide to do, then the following looks like it ought to do the trick:
    from email.header import decode_header
    u''.join(header_bytes.decode(enc or 'iso-8859-1') for header_bytes, enc in decode_header(h))
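For reference, a hedged Python 3 sketch of the same idea. On Python 3, email.header.decode_header returns a str chunk when the value contains no encoded-words, so calling .decode() on every chunk would fail; the sketch handles both cases. The function name decode_header_value is hypothetical, and this covers only the RFC 2047 encoded-word syntax that decode_header parses, not the RFC 2231 / RFC 5987 parameter encoding mentioned earlier.

```python
from email.header import decode_header


def decode_header_value(raw, default="iso-8859-1"):
    """Decode RFC 2047 encoded-words in a header value to text (sketch)."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Encoded-word payload, or raw bytes next to one.
            parts.append(chunk.decode(charset or default))
        else:
            # No encoded-words present: decode_header hands back str as-is.
            parts.append(chunk)
    return "".join(parts)
```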
For further reference note that the httpbis spec is proposed to obsolete RFC2616, cleaning up & clarifying underspecified bits of the spec.
The relevant section on header value encoding is here: http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-19#section-3.2.2
comment:12 by , 12 years ago
Summary: | Replace and deprecate request.META for HTTP headers → Provide an alternative to request.META for accessing HTTP headers |
---|---|
Triage Stage: | Unreviewed → Accepted |
The mailing list discussion converged towards keeping META, but recommending a dict-like request.headers.
I'm updating the summary to reflect this.
comment:13 by , 12 years ago
Regarding the transformation of request headers, for example from X-Bender to the META key HTTP_X_BENDER:
From what I can see, this transformation is done not in Django but in the WSGI implementation.
I tested with Apache mod_wsgi and with Python's wsgiref, and it seems that they are doing this transformation, not Django.
I couldn't find it documented anywhere, but see this from Python's Lib/wsgiref/simple_server.py:
    for h in self.headers.headers:
        k,v = h.split(':',1)
        k=k.replace('-','_').upper(); v=v.strip()
        if k in env:
            continue                    # skip content length, type,etc.
        if 'HTTP_'+k in env:
            env['HTTP_'+k] += ','+v     # comma-separate multiple headers
        else:
            env['HTTP_'+k] = v
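The loop above is Python 2 era code tied to the server's handler object. A standalone Python 3 sketch of the same transformation, assuming the headers arrive as (name, value) pairs, might look like the following; the function name headers_to_environ is hypothetical and not part of wsgiref.

```python
def headers_to_environ(header_pairs, env=None):
    """Fold (name, value) header pairs into a CGI-style environ dict (sketch)."""
    env = dict(env or {})
    for name, value in header_pairs:
        key = name.replace("-", "_").upper()
        value = value.strip()
        if key in env:
            # Already present without a prefix, e.g. CONTENT_TYPE or CONTENT_LENGTH.
            continue
        http_key = "HTTP_" + key
        if http_key in env:
            env[http_key] += "," + value  # comma-join repeated headers
        else:
            env[http_key] = value
    return env
```

Because the name is upper-cased and hyphens become underscores before it reaches the application, the original spelling of the header is already lost by the time Django sees it.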
comment:14 by , 9 years ago
Cc: | added |
---|
comment:16 by , 9 years ago
Proof of concept: https://github.com/django/django/pull/6803
This makes request.headers have lowercase header names, and replaces underscores with hyphens. (Header names are lowercase in HTTP/2.)
Also, on python3 we already get unicode headers from WSGI, and we're dropping py2 in January, so I don't think it's worth making the values unicode on python2. The header names are already unicode on python2.
    {
        'accept-language': 'en-US,en;q=0.8',
        'accept-encoding': 'gzip, deflate, sdch',
        'host': 'localhost:8000',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'upgrade-insecure-requests': '1',
        'connection': 'keep-alive',
        'cache-control': 'max-age=0',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36',
    }
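A rough sketch of how such a dict could be derived from META, for illustration only. The function name headers_from_meta is hypothetical, and including CONTENT_TYPE and CONTENT_LENGTH is an assumption based on the documentation quoted in the description, not something taken from the linked pull request.

```python
def headers_from_meta(meta):
    """Build a lowercase, hyphenated header dict from a META-style dict (sketch)."""
    headers = {}
    for key, value in meta.items():
        if key.startswith("HTTP_"):
            name = key[len("HTTP_"):]
        elif key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            # Assumption: these two are headers stored without the prefix.
            name = key
        else:
            continue  # SERVER_NAME, QUERY_STRING, etc. are not headers
        headers[name.replace("_", "-").lower()] = value
    return headers
```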
comment:17 by , 8 years ago
Version: | 1.5 → master |
---|
comment:18 by , 7 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I will give it a try and submit a PR.
comment:20 by , 6 years ago
Patch needs improvement: | set |
---|
comment:21 by , 6 years ago
Cc: | added |
---|
comment:23 by , 6 years ago
Patch needs improvement: | unset |
---|
comment:24 by , 6 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:25 by , 6 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Ready for checkin → Accepted |
comment:26 by , 6 years ago
Cc: | added |
---|---|
Patch needs improvement: | unset |
A strong argument against the request['Referer'] API is the use of request in templates (e.g. if request.GET.some_flag), which conflates dictionary access and attribute access, probably making request.HEADERS['Referer'] a much safer API.
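To illustrate the concern, here is a simplified model of template variable resolution, which tries dictionary lookup before attribute lookup. The resolve function and the RequestWithGetitem class are hypothetical stand-ins, not Django code; they only show how a request that supported request['Name'] could let a client-supplied header shadow an attribute.

```python
def resolve(obj, name):
    """Simplified model: dictionary lookup first, then attribute lookup."""
    try:
        return obj[name]
    except (TypeError, KeyError, IndexError, AttributeError):
        return getattr(obj, name)


class RequestWithGetitem:
    """Hypothetical request whose __getitem__ returns header values."""

    def __init__(self, headers, GET):
        self._headers = {k.lower(): v for k, v in headers.items()}
        self.GET = GET

    def __getitem__(self, name):
        return self._headers[name.lower()]


# A client sends a header literally named "GET".
request = RequestWithGetitem(headers={"GET": "spoofed"}, GET={"some_flag": "1"})
shadowed = resolve(request, "GET")  # the header value, not the query dict
```

Keeping headers on a separate attribute avoids this, because request itself never answers dictionary lookups.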