Context Navigation

← Previous Ticket
Next Ticket →

#34750 closed Bug (fixed)

Fixed QuerySet.count() on querysets grouped by unused multi-valued annotations.

Reported by:	Toan Vuong	Owned by:	Simon Charette
Component:	Database layer (models, ORM)	Version:	4.2
Severity:	Release blocker	Keywords:
Cc:	Simon Charette	Triage Stage:	Accepted
Has patch:	yes	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description

I believe this is related to the work in https://code.djangoproject.com/ticket/28477 (and follow-up issues/changes due to that ticket).

Tested on:
Django 4.2.2
OS X 13.4.1
Python 3.9.16

For the Oracle backend:
cx-Oracle 8.3.0 with instantclient 19.8

For the Postgres backend:
psycopg/psycopg-binary 3.1.9

Attached is a sample project, and the relevant query is below:

def populate_data():
    q1 = Question(question_text='t1')
    q1.save()

    for i in range(10):
        c = Choice(question=q1, choice_text='c1', votes=i)
        c.save()

# Need to populate the data only once, so comment it out on subsequent runs
populate_data()

qs = (
        Question.objects.values('id', 'question_text')
            .annotate(mysum=Sum('choice__votes'))
            .annotate(choice__votes_threshold=Case(
                When(choice__votes__gt=1, then=Value(1000)),
                default=Value(-1)))
)
print(qs.count() == len(qs))

When issuing the count query (qs.count()), Django generates this:

SELECT COUNT(*) FROM (SELECT "polls_question"."id" AS "col1", "polls_question"."question_text" AS "col2" FROM "polls_question" LEFT OUTER JOIN "polls_choice" ON ("polls_question"."id" = "polls_choice"."question_id") GROUP BY 1

I've chased it down to this optimization. This count would return 1 because it's just a simple join.

However, the actual query is this:

SELECT "polls_question"."id", "polls_question"."question_text", SUM("polls_choice"."votes") AS "mysum", CASE WHEN "polls_choice"."votes" > 1 THEN 1000 ELSE  -1 END AS "choice__votes_threshold" FROM "polls_question" LEFT OUTER JOIN "polls_choice" ON ("polls_question"."id" = "polls_choice"."question_id") GROUP BY "polls_question"."id", 4

Due to the group by and the varying choice__votes_threshold, there should be 2 rows in the result set.

This _did_ used to work in 3.2.18, and I didn't dig too much into it but I suspect the optimization was introduced after that version, hence why it worked.

Attachments (1)

mysite.tar.gz (10.4 KB ) - added by Toan Vuong 19 months ago.: Sample project

Download all attachments as: .zip

Change History (9)

by Toan Vuong, 19 months ago

Attachment:	mysite.tar.gz added

Sample project

comment:1 by Mariusz Felisiak, 19 months ago

Cc:	Simon Charette added
Component:	Uncategorized → Database layer (models, ORM)
Severity:	Normal → Release blocker
Summary:	Group by removal optimization generates in correct count query → Fixed QuerySet.count() on querysets grouped by unused multi-valued annotations.
Triage Stage:	Unreviewed → Accepted
Type:	Uncategorized → Bug

Thanks for the report!

Regression in 59bea9efd2768102fc9d3aedda469502c218e9b7.

comment:2 by Simon Charette, 19 months ago

Owner:	changed from nobody to Simon Charette
Status:	new → assigned

follow-up: 4 comment:3 by Simon Charette, 19 months ago

I don't think I'll be able to make the cut for 4.2.4 but I have an idea of our to address this issue so I self assigned.

in reply to: 3 comment:4 by Mariusz Felisiak, 19 months ago

Replying to Simon Charette:

I don't think I'll be able to make the cut for 4.2.4 but I have an idea of our to address this issue so I self assigned.

We can postpone 4.2.4 for a few days.

comment:5 by Mariusz Felisiak, 19 months ago

I've prepared a regression test:

tests/aggregation/tests.py

diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py
index 9f2a7c8841..8bdeaca97d 100644

                class AggregateAnnotationPruningTests(TestCase):
         self.assertEqual(sql.count("select"), 2, "Subquery wrapping required")
         self.assertNotIn("authors_count", sql)
+    def test_unused_aliased_aggregate_and_annotation_reverse_fk(self):
+        Book.objects.create(
+            name="b3",
+            publisher=self.p2,
+            pages=1000,
+            rating=4.2,
+            price=50,
+            contact=self.a2,
+            pubdate=datetime.date.today(),
+        )
+        qs = Publisher.objects.annotate(
+            total_pages=Sum("book__pages"),
+            good_book=Case(
+                When(book__rating__gt=4.0, then=Value(True)),
+                default=Value(False),
+            ),
+        )
+        self.assertEqual(qs.count(), 3)
     def test_non_aggregate_annotation_pruned(self):
         with CaptureQueriesContext(connection) as ctx:
             Book.objects.annotate(

comment:6 by Mariusz Felisiak, 19 months ago

Has patch:	set

comment:7 by GitHub <noreply@…>, 19 months ago

Resolution:	→ fixed
Status:	assigned → closed

In c9b9a52:

Fixed #34750 -- Fixed QuerySet.count() when grouping by unused multi-valued annotations.

Thanks Toan Vuong for the report.
Thanks Simon Charette for the review.

Regression in 59bea9efd2768102fc9d3aedda469502c218e9b7.

comment:8 by Mariusz Felisiak <felisiak.mariusz@…>, 19 months ago

In 8808d9d:

[4.2.x] Fixed #34750 -- Fixed QuerySet.count() when grouping by unused multi-valued annotations.

Thanks Toan Vuong for the report.
Thanks Simon Charette for the review.

Regression in 59bea9efd2768102fc9d3aedda469502c218e9b7.
Backport of c9b9a52edc66be117c6e5b5214fa788a4d5db7a8 from main

Note: See TracTickets for help on using tickets.

Download in other formats:

Issues