#34750 closed Bug (fixed)
Fixed QuerySet.count() on querysets grouped by unused multi-valued annotations.
Reported by: | Toan Vuong | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.2 |
Severity: | Release blocker | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I believe this is related to the work in https://code.djangoproject.com/ticket/28477 (and follow-up issues/changes due to that ticket).
Tested on:
Django 4.2.2
OS X 13.4.1
Python 3.9.16
For the Oracle backend:
cx-Oracle 8.3.0 with instantclient 19.8
For the Postgres backend:
psycopg/psycopg-binary 3.1.9
Attached is a sample project, and the relevant query is below:
def populate_data(): q1 = Question(question_text='t1') q1.save() for i in range(10): c = Choice(question=q1, choice_text='c1', votes=i) c.save() # Need to populate the data only once, so comment it out on subsequent runs populate_data() qs = ( Question.objects.values('id', 'question_text') .annotate(mysum=Sum('choice__votes')) .annotate(choice__votes_threshold=Case( When(choice__votes__gt=1, then=Value(1000)), default=Value(-1))) ) print(qs.count() == len(qs))
When issuing the count query (qs.count()
), Django generates this:
SELECT COUNT(*) FROM (SELECT "polls_question"."id" AS "col1", "polls_question"."question_text" AS "col2" FROM "polls_question" LEFT OUTER JOIN "polls_choice" ON ("polls_question"."id" = "polls_choice"."question_id") GROUP BY 1
I've chased it down to this optimization. This count would return 1 because it's just a simple join.
However, the actual query is this:
SELECT "polls_question"."id", "polls_question"."question_text", SUM("polls_choice"."votes") AS "mysum", CASE WHEN "polls_choice"."votes" > 1 THEN 1000 ELSE -1 END AS "choice__votes_threshold" FROM "polls_question" LEFT OUTER JOIN "polls_choice" ON ("polls_question"."id" = "polls_choice"."question_id") GROUP BY "polls_question"."id", 4
Due to the group by and the varying choice__votes_threshold
, there should be 2 rows in the result set.
This _did_ used to work in 3.2.18, and I didn't dig too much into it but I suspect the optimization was introduced after that version, hence why it worked.
Attachments (1)
Change History (9)
by , 18 months ago
Attachment: | mysite.tar.gz added |
---|
comment:1 by , 18 months ago
Cc: | added |
---|---|
Component: | Uncategorized → Database layer (models, ORM) |
Severity: | Normal → Release blocker |
Summary: | Group by removal optimization generates in correct count query → Fixed QuerySet.count() on querysets grouped by unused multi-valued annotations. |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
Thanks for the report!
Regression in 59bea9efd2768102fc9d3aedda469502c218e9b7.
comment:2 by , 18 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
follow-up: 4 comment:3 by , 18 months ago
I don't think I'll be able to make the cut for 4.2.4 but I have an idea of our to address this issue so I self assigned.
comment:4 by , 18 months ago
Replying to Simon Charette:
I don't think I'll be able to make the cut for 4.2.4 but I have an idea of our to address this issue so I self assigned.
We can postpone 4.2.4 for a few days.
comment:5 by , 18 months ago
I've prepared a regression test:
-
tests/aggregation/tests.py
diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py index 9f2a7c8841..8bdeaca97d 100644
a b class AggregateAnnotationPruningTests(TestCase): 2165 2165 self.assertEqual(sql.count("select"), 2, "Subquery wrapping required") 2166 2166 self.assertNotIn("authors_count", sql) 2167 2167 2168 def test_unused_aliased_aggregate_and_annotation_reverse_fk(self): 2169 Book.objects.create( 2170 name="b3", 2171 publisher=self.p2, 2172 pages=1000, 2173 rating=4.2, 2174 price=50, 2175 contact=self.a2, 2176 pubdate=datetime.date.today(), 2177 ) 2178 qs = Publisher.objects.annotate( 2179 total_pages=Sum("book__pages"), 2180 good_book=Case( 2181 When(book__rating__gt=4.0, then=Value(True)), 2182 default=Value(False), 2183 ), 2184 ) 2185 self.assertEqual(qs.count(), 3) 2186 2168 2187 def test_non_aggregate_annotation_pruned(self): 2169 2188 with CaptureQueriesContext(connection) as ctx: 2170 2189 Book.objects.annotate(
comment:6 by , 18 months ago
Has patch: | set |
---|
Sample project