#4992 closed Uncategorized (fixed)
Alter cache key based on GET parameters
Reported by: | anonymous | Owned by: | nobody |
---|---|---|---|
Component: | Core (Cache system) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | miracle2k, jim@…, dane.springmeyer@…, sylvain.pasche@…, taavi@…, apasotti@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Why GET requests are not cached by CacheMiddleware?
For example paginator accepts page numer as a GET parametr and pages other than first were not cached in any way.
I changed this policy on my own in django source, but I think it should be merged in to django, maybe as an option.
Patch changing it is attached to this ticket.
Attachments (3)
Change History (30)
by , 17 years ago
comment:1 by , 17 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
GET requests are cached by the cache middleware. If you're having difficulties making it work, try asking on the django-users mailing list.
Oh, and you don't want to use this patch. It's gonna break your site pretty bad.
comment:2 by , 17 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Summary: | Cache GET request. → Alter cache key based on GET parameters |
Talking with the submitter on IRC, I think there was a miscommunication in his original summary. He's not saying that it won't cache GET requests at all, he wants the cache keys to vary based on the GET params, so you can cache a view and still use GET params.
For instance, with the current cache middleware, /?page=2, and /?page=3 would skip the cache entirely, while they could be cached based on the page=2 or page=3 bit as well.
The attached patch isn't a good way to do it, but I think the idea has merit :)
comment:3 by , 17 years ago
Probably the best way to get different cache keys from different GET
parameters is to use an md5 of the contents of request.GET
(so something like md5.new(str(request.GET)).hexdigest()
).
comment:4 by , 17 years ago
Triage Stage: | Unreviewed → Design decision needed |
---|
comment:5 by , 17 years ago
Component: | Uncategorized → Cache system |
---|
comment:6 by , 17 years ago
FWIW, a simple str(request.GET)
isn't good enough -- if this ticket gets accepted, I'd expect /?a=1&b=2
to have the same cache key as /?b=2&a=1
, so there's sorting business that needs to happen, too.
I'm currently -0 on this, though. For one, it increases the time to calculate caching in the most common case. On top of that, though GET with a query string is supposed to be cacheable, RFC 2616 notes that "some applications have traditionally used GETs [...] with query URLs [...] to perform operations with significant side effects" and suggests that caches treat such requests carefully (§13.9). I'm inclined to do the same.
comment:7 by , 16 years ago
What about just request.get_full_path()
?
HTTP caches don't consider /?a=1&b=2
to have the same cache key as /?b=2&a=1
; they treat the URL as opaque. We know different, but developers shouldn't be surprised about having to use the same key for internal caching that would be used externally.
comment:8 by , 16 years ago
milestone: | → post-1.0 |
---|
comment:9 by , 16 years ago
If I understand http://www.w3.org/DesignIssues/Axioms.html correctly you should treat URL:s as opaque and get three different cache objects for:
- example.com/list/
- example.com/list/?a=1&b=2
- example.com/list/?b=2&a=1
It is likely that there won't be many permutations for multiple URL parameters as they are typically constructed in a form controlled by the application developer.
Implementing this would likely give a decent performance boost for Django apps that filter large lists based on query parameters.
comment:10 by , 16 years ago
Design-wise it would be great if the developer could be in control of which parameters that require a new cache item. After all, it is the developer who knows which parameters are likely to influence the result returned to the client. The currenct view cache decorator could be complemented like this:
Vary by the entire URL (should be default behaviour):
@cache_page(60 * 15) @vary_by_param("*") def slashdot_this(request): ...
Only vary by values for parameter a and b (ignore everything else):
@cache_page(60 * 15) @vary_by_param(["a","b"]) def slashdot_this(request): ...
comment:11 by , 16 years ago
Restricting this to a subset of the GET parameters would only reduce the number of duplicate copies in the cache. As such I think it's more beneficial to provide transparent caching with a simple decorator API. The reason we need to use request.get_full_path()
is that the current behaviour allows Django to serve the wrong page for requests with a query string. Trying to reduce the number of duplicate entries in the cache to the bare minimum is more of a wishlist item.
Rather than building all of the possible ways you could generate cache keys into decorators, why not just allow cache_page
to take a callable?
eg.
@cache_page(key=lambda request: request.path) def slashdot_this(request): ...
This should still default to request.get_full_path()
so that the default behaviour works for all views.
comment:12 by , 16 years ago
get_full_path() seems to be the way to go. It returns the path and query, but ignores the fragment identifier. So, this would be a minor patch that improves performance for many sites.
by , 16 years ago
Attachment: | cache_by_request_full_path.diff added |
---|
Cache by request.get_full_path() instead of request.path
comment:14 by , 16 years ago
Cc: | added |
---|
comment:15 by , 16 years ago
Cc: | added |
---|
comment:16 by , 16 years ago
Cc: | added |
---|
comment:18 by , 15 years ago
I've implemented something like Daniel Pope 's suggestion and opened separate ticket for that: #11269.
comment:19 by , 15 years ago
Cc: | added |
---|
comment:20 by , 15 years ago
Cc: | added |
---|---|
Resolution: | → fixed |
Status: | reopened → closed |
comment:21 by , 15 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Sorry. This has happened to me before. Is it a bug in Trac?
comment:22 by , 14 years ago
Triage Stage: | Design decision needed → Accepted |
---|
Even squid recently (last year) changed their default caching behaviour to respect GET params, so they think the web has moved far enough to udnerstand side-effects methods. We should do this now.
comment:23 by , 14 years ago
Cc: | added |
---|
by , 14 years ago
Attachment: | cache_by_request_full_path_v2.diff added |
---|
Updated patch of PeterKz to apply cleanly to trunk. Added tests
comment:24 by , 14 years ago
Cc: | added |
---|
comment:25 by , 14 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:27 by , 13 years ago
Cc: | removed |
---|---|
Easy pickings: | unset |
Severity: | → Normal |
Type: | → Uncategorized |
UI/UX: | unset |
patch