#32478 closed Bug (fixed)
Queryset annotation mixing aggregate and subquery doesn't GROUP BY outer column references.
Reported by: | Igor Pejic | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.0 |
Severity: | Normal | Keywords: | outerref, subquery |
Cc: | Igor Pejic | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
After trying to migrate from 2.2. to 3.0, we experience a problem with the translation of queries in PostgreSQL which used to work in 2.2.
A breaking test is shown in: https://github.com/igorpejic/django/pull/1/files
From investigation this is related to: https://code.djangoproject.com/ticket/31094
and it seems the regression was introduced in: https://github.com/django/django/commit/fb3f034f1c63160c0ff13c609acd01c18be12f80
It seems that the combination of the "double OuterRef" with the "Case" logic is causing the problem.
Query looks like:
books_with_same_name_as_country = Book.objects.filter( id__in=Subquery( Book.objects.filter( name=OuterRef(OuterRef('country__name')), ).values('id') ) ).values('id')[:1] books_breakdown = Publisher.objects.annotate(total_books=Case( When( num_awards__gte=2, then=Subquery(books_with_same_name_as_country, IntegerField()) ), When( num_awards__lt=0, then=Count('country__publishers') ), ))
Stack trace:
====================================================================== ERROR: test_group_by_subquery_annotation_with_conditional (aggregation.tests.AggregateTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/igor/django-repo/django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) psycopg2.errors.GroupingError: subquery uses ungrouped column "aggregation_country.name" from outer query LINE 1: ..."id" FROM "aggregation_book" U0 WHERE U0."name" = "aggregati... ^ The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/igor/django-repo/django/test/testcases.py", line 1229, in skip_wrapper return test_func(*args, **kwargs) File "/home/igor/django-repo/tests/aggregation/tests.py", line 1299, in test_group_by_subquery_annotation_with_conditional self.assertEqual(books_breakdown.get(id=self.p1.id).total_books, 1) File "/home/igor/django-repo/django/db/models/query.py", line 411, in get num = len(clone) File "/home/igor/django-repo/django/db/models/query.py", line 258, in __len__ self._fetch_all() File "/home/igor/django-repo/django/db/models/query.py", line 1261, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/home/igor/django-repo/django/db/models/query.py", line 57, in __iter__ results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size) File "/home/igor/django-repo/django/db/models/sql/compiler.py", line 1154, in execute_sql cursor.execute(sql, params) File "/home/igor/django-repo/django/db/backends/utils.py", line 68, in execute return self._execute_with_wrappers(sql, params, many=False, executor=self._execute) File "/home/igor/django-repo/django/db/backends/utils.py", line 77, in _execute_with_wrappers return executor(sql, params, many, context) File "/home/igor/django-repo/django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) File "/home/igor/django-repo/django/db/utils.py", line 90, in __exit__ raise dj_exc_value.with_traceback(traceback) from exc_value File "/home/igor/django-repo/django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) django.db.utils.ProgrammingError: subquery uses ungrouped column "aggregation_country.name" from outer query LINE 1: ..."id" FROM "aggregation_book" U0 WHERE U0."name" = "aggregati... ^
Full query 2.2:
SELECT "aggregation_regress_publisher"."id", "aggregation_regress_publisher"."name", "aggregation_regress_publisher"."num_awards", "aggregation_regress_publisher"."country_id", CASE WHEN "aggregation_regress_publisher"."num_awards" >= 2 THEN (SELECT V0."id" FROM "aggregation_regress_book" V0 WHERE V0."id" IN (SELECT U0."id" FROM "aggregation_regress_book" U0 WHERE U0."name" = ("aggregation_regress_country"."name") ORDER BY U0."name" ASC) ORDER BY V0."name" ASC LIMIT 1) WHEN "aggregation_regress_publisher"."num_awards" < 0 THEN COUNT(T3."id") ELSE NULL END AS "total_books" FROM "aggregation_regress_publisher" INNER JOIN "aggregation_regress_country" ON ("aggregation_regress_publisher"."country_id" = "aggregation_regress_country"."id") LEFT OUTER JOIN "aggregation_regress_publisher" T3 ON ("aggregation_regress_country"."id" = T3."country_id") GROUP BY "aggregation_regress_publisher"."id", (SELECT V0."id" FROM "aggregation_regress_book" V0 WHERE V0."id" IN (SELECT U0."id" FROM "aggregation_regress_book" U0 WHERE U0."name" = ("aggregation_regress_country"."name") ORDER BY U0."name" ASC) ORDER BY V0."name" ASC LIMIT 1)
Full query 3.0:
SELECT "aggregation_publisher"."id", "aggregation_publisher"."name", "aggregation_publisher"."num_awards", "aggregation_publisher"."duration", "aggregation_publisher"."country_id", CASE WHEN "aggregation_publisher"."num_awards" >= 2 THEN (SELECT V0."id" FROM "aggregation_book" V0 WHERE V0."id" IN (SELECT U0."id" FROM "aggregation_book" U0 WHERE U0."name" = "aggregation_country"."name") LIMIT 1) WHEN "aggregation_publisher"."num_awards" < 0 THEN COUNT(T3."id") ELSE NULL END AS "total_books" FROM "aggregation_publisher" INNER JOIN "aggregation_country" ON ("aggregation_publisher"."country_id" = "aggregation_country"."id") LEFT OUTER JOIN "aggregation_publisher" T3 ON ("aggregation_country"."id" = T3."country_id") GROUP BY "aggregation_publisher"."id"
Change History (13)
comment:1 by , 4 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|
comment:2 by , 4 years ago
Cc: | added |
---|
comment:3 by , 4 years ago
Type: | Uncategorized → Bug |
---|
comment:4 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Summary: | Double OuterRef in Subquery with Case broken in 3.X → Queryset annotation mixing aggregate and subquery doesn't GROUP BY outer column references. |
Triage Stage: | Unreviewed → Accepted |
comment:5 by , 4 years ago
Description: | modified (diff) |
---|
comment:6 by , 4 years ago
Thanks. I edited the main description with the query in 3.0 and 2.2.
The main difference is in the last argument of the group by.
comment:7 by , 4 years ago
I confirmed that it's a regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.
comment:8 by , 4 years ago
Has patch: | set |
---|
Igor, could you confirm your issue is resolved by https://github.com/django/django/pull/14039.
The underlying problem was that nested external column references were not recursively collected so the problem detailed in #31094 could manifest itself when mixing aggregation with a nested subqueries because Query.get_external_cols()
would not account for possibly nested subqueries with external column references.
comment:9 by , 4 years ago
I can confirm the fix works for our repository when using it with 3.0.12
Thank you for this fix and all other work Simon.
comment:10 by , 4 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
That looks like another case of #31094 but generalized for expressions that mix aggregates and subqueries.
Case.get_group_by_cols()
defers toBaseExpression.get_group_by_cols()
when it hascases
and the latter immediately returnsself
when it contains aggregates.Igor, could you include the SQL that uses to be generated and what the new one looks like?