#20187 closed Bug (fixed)
Django 1.5 using a cached HttpResponse with WSGI has an empty body
Reported by: | Owned by: | Aymeric Augustin | |
---|---|---|---|
Component: | HTTP handling | Version: | 1.5 |
Severity: | Normal | Keywords: | HttpResponse |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
With the change to HttpResponse made in Django 1.5, I'm finding that in my code, which caches a generated response, results in an empty body when that page is requested a second time. The first time the page is requested, it is not in the cache, and the page is generated normally and added to the cache. A subsequent request for the same page finds the response in the cache and that response is returned, but with a content length of zero.
The reason is that the HttpResponse in Django 1.5 does not reset the content iterator when the content is requested to be iterated over again (the next time the response content is required).
I note the comments made about the way an iterator should behave when requested to iterate again:
https://code.djangoproject.com/ticket/13222
and the code which was added to explicitly prevent a reiteration from resetting the iterator. However, that causes a problem when using cached responses.
The HttpResponse in my case was not created by passing an iterator to HttpResponse. It is just using a string.
The problem is that the __iter__
method of HttpResponse contains the line:
if not hasattr(self, '_iterator'): self._iterator = iter(self._container)
This prevents the iterator being reset the next time it is required to iterate over the content.
_container still has the original content, but __iter__
does not reset the iterator as _iterator exists as an attribute since the first time that response was returned. The cached response is returning a used iterator, which returns no content.
I suspect this is a bug.
What about a work-around in the meantime?
When I retrieve the response from the cache, I could do:
response._iterator = iter(response._container)
or:
del response._iterator
This works, but makes my code dependent on the internals of the HttpResponse class, which isn't great.
My suggestion is to change the __iter__
method for HttpResponse such that it creates an instance of a separate iterator class. The __iter__
method should not return self. It would be OK for the __iter__
method of the separate iterator class to return self. This should avoid the problem reported in https://code.djangoproject.com/ticket/13222.
I accept that StreamingHttpResponse instances can only be iterated once, and are therefore not suitable for caching.
Kind regards,
Steve
Change History (11)
comment:1 by , 12 years ago
comment:3 by , 12 years ago
Something like this should work in your opinion then?
def test_http_response_rewinds_if_explicit(self):
ht = HttpResponse(content='Something')
self.assertEqual(list(ht), Something)
self.assertEqual(list(ht), Something)
I think a possible solution (if it makes sense) would be to check if the content passed in is an iterator or not, and in case it's not avoid consuming it, what do you think?
comment:4 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:5 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:6 by , 11 years ago
Hi Andrea,
your test case should probably have:
self.assertEqual(list(ht), ['Something']) self.assertEqual(list(ht), ['Something'])
The present HttpResponse implementation will result in the second list(ht) being an empty list [].
I like your suggestion of checking if the content passed is an iterator, and the member _base_content_is_iter can probably be used in this test.
In the class HttpResponse __iter__
method:
we could replace:
if not hasattr(self, '_iterator'): self._iterator = iter(self._container)
with:
if not hasattr(self, '_iterator') or not self._base_content_is_iter: self._iterator = iter(self._container)
The documentation on caching should give some mention to this, and warn that HttpResponse instances using iterators, or StreamingHttpResponse instances should not be cached as subsequent use of these instances will return empty bodies.
comment:7 by , 11 years ago
Hi Andrea,
that last comment was by me, the creator of the ticket.
Thanks,
Steve
comment:8 by , 11 years ago
I've been looking at this again today. As explained in #13222 it's a bad practice to return self
in an iterable's __iter__
method because multiple calls to iter(response)
return the same iterator. I fixed most symptoms by preventing Django from rewinding that iterator but it's still a sub-optimal behavior when iteration is repeatable.
Until the deprecation path for old-style streaming responses completes, iteration isn't repeatable in general. Besides, this code is very fragile and sensitive overall. So I'm reluctant to make changes right now.
Once the deprecation path has completed, we'll be able to simplify drastically the implementation of HttpResponse
. The content will always be stored as a string internally, say, in self._content
. As a consequence __iter__
could be as simple as:
def __iter__(self): return iter([self._content])
This returns a new iterator every time it's called, as suggested in #13222.
This will break some of the tests added when fixing #13222. Those tests should probably be rewritten to make more precise assertion about the problems. Essentially, they should test that calling iter(response)
again doesn't rewind a previously obtained iterator, not that it returns an empty iterator.
This goes against my previous decision, but the context will have changed :)
Some documentation may be useful; I'm not sure. We'll be in a better position to decide that once we have a patch.
comment:9 by , 11 years ago
stable/1.6.x was forked and the deprecated code was removed since my last comment. I will now take care of this.
comment:10 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Discussion here : https://groups.google.com/d/msg/django-developers/cGw-VitKWIw/hp860m27KggJ