#10870 closed Bug (fixed)
Aggregates with joins ignore extra filters provided by setup_joins
Reported by: | fas | Owned by: | fas |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | orm, aggregation, join, contenttypes, filter |
Cc: | mbencun@…, jdunck@…, michaelg@…, Robin, Noah Kantrowitz, Carl Meyer, Chris Chambers, joeri@…, anssi.kaariainen@…, ris, Alan Justino da Silva, daevaorn@…, fhahn, Hervé Cauwelier | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
(In case the description that follows is not understandable, please ask me for a better explanation).
An example where extra filters in joins are used are reverse generic relations (http://docs.djangoproject.com/en/dev/ref/contrib/contenttypes/#reverse-generic-relations).
class TaggedItem(models.Model): content_type = models.ForeignKey(ContentType) object_id = models.PositiveIntegerField() content_object = generic.GenericForeignKey('content_type', 'object_id') class Bookmark(models.Model): url = models.URLField() tags = generic.GenericRelation(TaggedItem)
By querying Bookmark.objects.filter(tags=some_tag)
the generic relation adds extra filters to ensure the correct content type. Those extra filters however are ignored when using aggregates, which leads to wrong results.
For example, the following query is not correct because it does not check the content_type.
Bookmark.objects.aggregate(Count('tags'))
What this aggregate actually calculates per bookmark is not the amount of tags for this bookmark but the amount of tags for any model referenced by TaggedItem with the same object id.
The error is isolated in db/models/sql/query.py in the add_aggregate method:
field, source, opts, join_list, last, _ = self.setup_joins( field_list, opts, self.get_initial_alias(), False) # Process the join chain to see if it can be trimmed col, _, join_list = self.trim_joins(source, join_list, last, False)
The last dummy element _ is normally named 'extra' in other uses of setup_joins and contains the extra filters needed to make it work. Adding self.add_filter(*extra)
after setup_joins fixes this problem.
Attachments (4)
Change History (36)
comment:1 by , 16 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Version: | 1.0 → SVN |
by , 16 years ago
Attachment: | aggregation-generic.diff added |
---|
I don't think the solution you've suggested works entirely as with this setup it generates incorrect results for me(it returns 9 instead of 3).
by , 16 years ago
Attachment: | aggregation-generic.2.diff added |
---|
And elbow grease solves all problems.
by , 16 years ago
Attachment: | aggregation-generic.3.diff added |
---|
Added a test to show it doesn't work with annotate, for some reason it only returns books that have been tagged, which doesn't match the behavior of any other annotation, it's an annotation not a filter, I assume this is the result of using the wrong join type
comment:2 by , 16 years ago
This patch has a couple of problems, from looking at it quickly.
- There's a random comment on line 1436 that doesn't refer to any code. No code immediately follows it. It's also of the "set i to i+1" variety if it was intended to go before line 1448, so can be removed.
- The if-test on line 1448 can be removed. The
extra
parameter is always a list, possibly empty. So just iterate over it and if it's empty, nothing will be done.
by , 16 years ago
Attachment: | aggregation-generic.4.diff added |
---|
comment:3 by , 16 years ago
1 and 2 are dealt with, still the strangle filtering issue, which I can't quite figure out, since a LEFT OUTER JOIN is being preformed(I've left the debugging statement in there).
comment:4 by , 16 years ago
Cc: | added |
---|
comment:5 by , 16 years ago
Is this going in for 1.1? I think it should as it would otherwise leave a new feature with a quite serious bug.
comment:6 by , 16 years ago
milestone: | → 1.1 |
---|
comment:7 by , 16 years ago
I don't know that solving this for 1.1 will be easy. The annotate stuff has reveiled a subtle bug that will require some fairly invasive changed to fix. The explanation is here: http://searchoracle.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid41_gci1148316,00.html
comment:8 by , 16 years ago
milestone: | 1.1 → 1.2 |
---|
Bumping from 1.1 because this will require significant changes to the Query class, because we need to add support for additional join conditions. Also, Jacob said to do this.
comment:9 by , 16 years ago
Cc: | added |
---|
comment:10 by , 15 years ago
Cc: | added |
---|
comment:12 by , 14 years ago
Cc: | added |
---|
comment:13 by , 14 years ago
Cc: | added |
---|
comment:14 by , 14 years ago
Cc: | added |
---|
comment:15 by , 14 years ago
So I've been thinking about this, and I'm wondering whether it perhaps makes sense to commit this patch, even with it excluding legitimate results. My reasoning: the current status is that aggregate() and annotate() both return bogus data, with the patch aggregate() returns correct data, and annotate() returns data that is valid but is *missing* results for items with no related data. This strikes me as a clear improvement, and even though the patch is obviously not completely correct, it is a first step to the more complicated completely correct implementation.
comment:16 by , 14 years ago
Cc: | added |
---|
comment:17 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:18 by , 13 years ago
Cc: | added |
---|---|
Easy pickings: | unset |
UI/UX: | unset |
comment:19 by , 13 years ago
Cc: | added |
---|---|
Has patch: | set |
Needs tests: | set |
Patch needs improvement: | set |
I did some work to allow the contenttype lookup in left joins. This is useful in and of itself (you can do something like Book.objects.filter(notes__note__isnull
=True), and it will correctly find all books without any notes (before it would not have found any books). The work allows also to generate correct results when annotating and aggregating.
There are some loose ends in the patch. First, the clone method of sql/query doesn't properly clone the extra_join_filters (the conditions need to be deep-copied if I am not mistaken). Second, there is some non-dryness in setup_joins. There is also general commenting and stuff like that that needs to be done. I don't know if the patch would break under more complex joins (the main problem is getting the extra_filter to the right join). So, more tests needed also.
Github branch available at: https://github.com/akaariai/django/tree/generic_relations_left_joins
comment:21 by , 13 years ago
Cc: | added |
---|
comment:22 by , 13 years ago
Cc: | added |
---|
comment:23 by , 13 years ago
Cc: | added |
---|
comment:24 by , 13 years ago
Cc: | added |
---|
comment:25 by , 12 years ago
Cc: | added |
---|
comment:26 by , 12 years ago
Could somebody write a test case against master? I am pretty sure this one is already fixed in the ORM.
comment:27 by , 12 years ago
Needs tests: | unset |
---|---|
Patch needs improvement: | unset |
It seems fixed in master.
I've adopted the test from the previous patch and opened a pull request: https://github.com/django/django/pull/722
comment:28 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:29 by , 12 years ago
Component: | ORM aggregation → Database layer (models, ORM) |
---|
I really doubt this is with 1.0 ;)