Opened 5 years ago
Closed 5 years ago
#31136 closed Bug (fixed)
Multiple annotations with Exists() should not group by aliases.
Reported by: | Sigurd Ljødal | Owned by: | Mariusz Felisiak |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.0 |
Severity: | Release blocker | Keywords: | |
Cc: | Simon Charette, jon.dufresne@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
There's a regression in Django 3.0 where an annotation that is not selected is included in the group by section of the query on PostgreSQL. I looks a bit like https://code.djangoproject.com/ticket/31109, but it is still present in Django 3.0.2 (I haven't tested earlier versions yet). I haven't had time to look into where the incorrect group by cause is generated yet.
I have created a test case to reproduce the issue here: https://github.com/django/django/compare/master...ljodal:regression/annotate-groupby-postgresql
It only crashes when running against PostgreSQL. Here's the test case query:
latest_book_pubdate_qs = Book.objects.filter( publisher=OuterRef('pk') ).values_list('id') publisher_qs = Publisher.objects.values('pk').annotate( has_published_books=Exists(latest_book_pubdate_qs), ).filter( has_published_books=True ).annotate( book_count=Count('book') ).values_list('id')
The generated query is this:
SELECT "aggregation_publisher"."id" FROM "aggregation_publisher" LEFT OUTER JOIN "aggregation_book" ON ("aggregation_publisher"."id" = "aggregation_book"."publisher_id") WHERE EXISTS(SELECT U0."id" FROM "aggregation_book" U0 WHERE U0."publisher_id" = "aggregation_publisher"."id") GROUP BY "aggregation_publisher"."id", "has_published_books"
Which crashes because "has_published_books" in the "GROUP BY" clause is not a column on the table:
Traceback (most recent call last): File "/Users/sigurdljodal/Code/django/tests/aggregation/tests.py", line 1166, in test_aggregation_annotation print(list(publisher_qs)) File "/Users/sigurdljodal/Code/django/django/db/models/query.py", line 270, in __len__ self._fetch_all() File "/Users/sigurdljodal/Code/django/django/db/models/query.py", line 1284, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/Users/sigurdljodal/Code/django/django/db/models/query.py", line 144, in __iter__ return compiler.results_iter(tuple_expected=True, chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size) File "/Users/sigurdljodal/Code/django/django/db/models/sql/compiler.py", line 1085, in results_iter results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size) File "/Users/sigurdljodal/Code/django/django/db/models/sql/compiler.py", line 1133, in execute_sql cursor.execute(sql, params) File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 66, in execute return self._execute_with_wrappers(sql, params, many=False, executor=self._execute) File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 75, in _execute_with_wrappers return executor(sql, params, many, context) File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 84, in _execute return self.cursor.execute(sql, params) File "/Users/sigurdljodal/Code/django/django/db/utils.py", line 90, in __exit__ raise dj_exc_value.with_traceback(traceback) from exc_value File "/Users/sigurdljodal/Code/django/django/db/backends/utils.py", line 84, in _execute return self.cursor.execute(sql, params) django.db.utils.ProgrammingError: column "has_published_books" does not exist LINE 1: ...her"."id") GROUP BY "aggregation_publisher"."id", "has_publi...
Change History (15)
comment:1 by , 5 years ago
Cc: | added |
---|---|
Severity: | Normal → Release blocker |
Summary: | Group by with annotation regression on PostgreSQL → Multiple annotations with Exists() should not group by aliases. |
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 5 years ago
Resolution: | fixed |
---|---|
Status: | closed → new |
I found a regression in this fix when I was trying to reproduce #31150, see
diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py index f523b3ca35..0209a9221b 100644 --- a/tests/aggregation/tests.py +++ b/tests/aggregation/tests.py @@ -1165,6 +1165,27 @@ class AggregateTestCase(TestCase): 'Sams', ]) + def test_aggregation_subquery_annotation_distinct(self): + books_qs = Book.objects.annotate( + first_author_the_same_age=Subquery( + Author.objects.filter( + age=OuterRef('contact__friends__age'), + ).order_by('age').values('id')[:1], + ) + ).filter( + publisher=self.p1, + first_author_the_same_age__isnull=False, + ).annotate( + min_age=Min('contact__friends__age'), + ).values('name', 'min_age').distinct() + self.assertEqual(list(books_qs),[ + { + 'name': 'The Definitive Guide to Django: Web Development Done Right', + 'min_age': 29, + }, + {'name': 'Practical Django Projects', 'min_age': 34}, + ]) + @skipUnlessDBFeature('supports_subqueries_in_group_by') def test_group_by_subquery_annotation(self): """
comment:8 by , 5 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:11 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:12 by , 5 years ago
Cc: | added |
---|
Commit a24686987f21fe6b5339cc13607c13fa906f81a8 causes a regression for my application. I'll try to pin down a minimal example but for now, here is the backtrace:
Traceback (most recent call last): <MY APP> File ".../django/db/models/query.py", line 276, in __iter__ self._fetch_all() File ".../django/db/models/query.py", line 1261, in _fetch_all self._result_cache = list(self._iterable_class(self)) File ".../django/db/models/query.py", line 184, in __iter__ for row in compiler.results_iter(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size): File ".../django/db/models/sql/compiler.py", line 1096, in results_iter results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size) File ".../django/db/models/sql/compiler.py", line 1144, in execute_sql cursor.execute(sql, params) File ".../django/db/backends/utils.py", line 68, in execute return self._execute_with_wrappers(sql, params, many=False, executor=self._execute) File ".../django/db/backends/utils.py", line 77, in _execute_with_wrappers return executor(sql, params, many, context) File ".../django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) File ".../django/db/utils.py", line 90, in __exit__ raise dj_exc_value.with_traceback(traceback) from exc_value File ".../django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) django.db.utils.ProgrammingError: column "myapp_mymodel.myfield" must appear in the GROUP BY clause or be used in an aggregate function
comment:13 by , 5 years ago
Resolution: | fixed |
---|---|
Status: | closed → new |
Here is a POC test to demonstrate the regression. Fails with the postgresql backend.
diff --git a/tests/aggregation/models.py b/tests/aggregation/models.py index fd441fe51d..211a96d97c 100644 --- a/tests/aggregation/models.py +++ b/tests/aggregation/models.py @@ -42,3 +42,17 @@ class Store(models.Model): def __str__(self): return self.name + + +class ThisOrThat(models.Model): + pass + + +class This(models.Model): + thisorthat = models.ForeignKey("ThisOrThat", models.CASCADE) + created_at = models.DateTimeField(auto_now_add=True) + + +class That(models.Model): + thisorthat = models.OneToOneField("ThisOrThat", models.CASCADE) + created_at = models.DateTimeField(auto_now_add=True) diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py index bef415abd4..508b9cf687 100644 --- a/tests/aggregation/tests.py +++ b/tests/aggregation/tests.py @@ -1239,3 +1239,20 @@ class AggregateTestCase(TestCase): contact_publisher__isnull=False, ).annotate(count=Count('authors')) self.assertSequenceEqual(books_qs, [book]) + + def test_regression(self): + from .models import ThisOrThat + from django.db.models.aggregates import Max + from django.db.models.functions import Coalesce + + qs = ( + ThisOrThat.objects.annotate( + Max("this__created_at"), + created_at=Coalesce( + "this__created_at__max", "that__created_at" + ), + ) + .order_by("created_at") + .values_list("pk", flat=True) + ) + list(qs)
Running this test results in the following error:
====================================================================== ERROR: test_regression (aggregation.tests.AggregateTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/jon/devel/django/django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) psycopg2.errors.GroupingError: column "aggregation_that.created_at" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: ...BY COALESCE(MAX("aggregation_this"."created_at"), "aggregati... ^ The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/home/jon/devel/django/tests/aggregation/tests.py", line 1258, in test_regression list(qs) File "/home/jon/devel/django/django/db/models/query.py", line 258, in __len__ self._fetch_all() File "/home/jon/devel/django/django/db/models/query.py", line 1261, in _fetch_all self._result_cache = list(self._iterable_class(self)) File "/home/jon/devel/django/django/db/models/query.py", line 184, in __iter__ for row in compiler.results_iter(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size): File "/home/jon/devel/django/django/db/models/sql/compiler.py", line 1096, in results_iter results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size) File "/home/jon/devel/django/django/db/models/sql/compiler.py", line 1144, in execute_sql cursor.execute(sql, params) File "/home/jon/devel/django/django/db/backends/utils.py", line 68, in execute return self._execute_with_wrappers(sql, params, many=False, executor=self._execute) File "/home/jon/devel/django/django/db/backends/utils.py", line 77, in _execute_with_wrappers return executor(sql, params, many, context) File "/home/jon/devel/django/django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) File "/home/jon/devel/django/django/db/utils.py", line 90, in __exit__ raise dj_exc_value.with_traceback(traceback) from exc_value File "/home/jon/devel/django/django/db/backends/utils.py", line 86, in _execute return self.cursor.execute(sql, params) django.db.utils.ProgrammingError: column "aggregation_that.created_at" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: ...BY COALESCE(MAX("aggregation_this"."created_at"), "aggregati...
comment:14 by , 5 years ago
Jon I think you should create another ticket referring this one instead of reopening this one.
Looks like something is off with contains_aggregate
handling in a24686987f21fe6b5339cc13607c13fa906f81a8.
Can you also reproduce the crash by chaining annotate
calls? Could you provide the query before and after a24686987f21fe6b5339cc13607c13fa906f81a8.
comment:15 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I've create a new ticket #31217 for this regression.
Thanks for this report. As a temporary workaround you can use
Exists()
directly inQuerySet.filter()
(without an annotation), i.e.Regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.
Reproduced at 495d7a1ddf3c8c83d49f099ee3599afbc28306fe.