Opened 3 years ago
Closed 3 years ago
#33282 closed Bug (fixed)
django.db.utils.ProgrammingError: more than one row returned by a subquery used as an expression
Reported by: | Antonio Terceiro | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Hi,
I'm working on porting lava to Django 3.2. When I run the test suite, I find 2 issues. One is #32690, for which I have the corresponding patch backported locally. The other one is this is a bunch of crashes that look like this:
E django.db.utils.ProgrammingError: more than one row returned by a subquery used as an expression
I wasn't able to produce a minimal reproducer app yet, but I will try to point to the relevant code.
in lava we have a few special model managers for handling access control. They are here: https://git.lavasoftware.org/lava/lava/-/blob/master/lava_scheduler_app/managers.py
This issue can easily be reproduced in a shell session:
$ ./manage.py shell Python 3.9.8 (main, Nov 7 2021, 15:47:09) Type 'copyright', 'credits' or 'license' for more information IPython 7.27.0 -- An enhanced Interactive Python. Type '?' for help. In [1]: from lava_scheduler_app.models import TestJob, User In [2]: TestJob.objects.visible_by_user(User.objects.last()) Out[2]: --------------------------------------------------------------------------- CardinalityViolation Traceback (most recent call last) /usr/lib/python3/dist-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args) 83 else: ---> 84 return self.cursor.execute(sql, params) 85 CardinalityViolation: more than one row returned by a subquery used as an expression The above exception was the direct cause of the following exception: ProgrammingError Traceback (most recent call last) /usr/lib/python3/dist-packages/IPython/core/formatters.py in __call__(self, obj) 700 type_pprinters=self.type_printers, 701 deferred_pprinters=self.deferred_printers) --> 702 printer.pretty(obj) 703 printer.flush() 704 return stream.getvalue() /usr/lib/python3/dist-packages/IPython/lib/pretty.py in pretty(self, obj) 392 if cls is not object \ 393 and callable(cls.__dict__.get('__repr__')): --> 394 return _repr_pprint(obj, self, cycle) 395 396 return _default_pprint(obj, self, cycle) /usr/lib/python3/dist-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle) 698 """A pprint that just redirects to the normal repr function.""" 699 # Find newlines and replace them with p.break_() --> 700 output = repr(obj) 701 lines = output.splitlines() 702 with p.group(): /usr/lib/python3/dist-packages/django/db/models/query.py in __repr__(self) 254 255 def __repr__(self): --> 256 data = list(self[:REPR_OUTPUT_SIZE + 1]) 257 if len(data) > REPR_OUTPUT_SIZE: 258 data[-1] = "...(remaining elements truncated)..." /usr/lib/python3/dist-packages/django/db/models/query.py in __len__(self) 260 261 def __len__(self): --> 262 self._fetch_all() 263 return len(self._result_cache) 264 /usr/lib/python3/dist-packages/django/db/models/query.py in _fetch_all(self) 1322 def _fetch_all(self): 1323 if self._result_cache is None: -> 1324 self._result_cache = list(self._iterable_class(self)) 1325 if self._prefetch_related_lookups and not self._prefetch_done: 1326 self._prefetch_related_objects() /usr/lib/python3/dist-packages/django/db/models/query.py in __iter__(self) 49 # Execute the query. This will also fill compiler.select, klass_info, 50 # and annotations. ---> 51 results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size) 52 select, klass_info, annotation_col_map = (compiler.select, compiler.klass_info, 53 compiler.annotation_col_map) /usr/lib/python3/dist-packages/django/db/models/sql/compiler.py in execute_sql(self, result_type, chunked_fetch, chunk_size) 1173 cursor = self.connection.cursor() 1174 try: -> 1175 cursor.execute(sql, params) 1176 except Exception: 1177 # Might fail for server-side cursors (e.g. connection closed) /usr/lib/python3/dist-packages/django/db/backends/utils.py in execute(self, sql, params) 96 def execute(self, sql, params=None): 97 with self.debug_sql(sql, params, use_last_executed_query=True): ---> 98 return super().execute(sql, params) 99 100 def executemany(self, sql, param_list): /usr/lib/python3/dist-packages/django/db/backends/utils.py in execute(self, sql, params) 64 65 def execute(self, sql, params=None): ---> 66 return self._execute_with_wrappers(sql, params, many=False, executor=self._execute) 67 68 def executemany(self, sql, param_list): /usr/lib/python3/dist-packages/django/db/backends/utils.py in _execute_with_wrappers(self, sql, params, many, executor) 73 for wrapper in reversed(self.db.execute_wrappers): 74 executor = functools.partial(wrapper, executor) ---> 75 return executor(sql, params, many, context) 76 77 def _execute(self, sql, params, *ignored_wrapper_args): /usr/lib/python3/dist-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args) 82 return self.cursor.execute(sql) 83 else: ---> 84 return self.cursor.execute(sql, params) 85 86 def _executemany(self, sql, param_list, *ignored_wrapper_args): /usr/lib/python3/dist-packages/django/db/utils.py in __exit__(self, exc_type, exc_value, traceback) 88 if dj_exc_type not in (DataError, IntegrityError): 89 self.wrapper.errors_occurred = True ---> 90 raise dj_exc_value.with_traceback(traceback) from exc_value 91 92 def __call__(self, func): /usr/lib/python3/dist-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args) 82 return self.cursor.execute(sql) 83 else: ---> 84 return self.cursor.execute(sql, params) 85 86 def _executemany(self, sql, param_list, *ignored_wrapper_args): ProgrammingError: more than one row returned by a subquery used as an expression
If I downgrade Django to 2.2, that just works. This is reproducible with Django 3.2 and with the current main branch from git.
Change History (14)
comment:1 by , 3 years ago
comment:2 by , 3 years ago
FWIW the generated query that causes this is:
SELECT COUNT (*) AS "__count" FROM "lava_scheduler_app_testjob" WHERE (((NOT "lava_scheduler_app_testjob". "is_public" AND "lava_scheduler_app_testjob"."submitter_id" = %s) OR ("lava_scheduler_app_testjob". "is_public" AND "lava_scheduler_app_testjob". "id" IN (SELECT U0. "id" FROM "lava_scheduler_app_testjob" U0 LEFT OUTER JOIN "lava_scheduler_app_testjob_viewing_groups" U1 ON (U0."id" = U1."testjob_id") WHERE U1. "group_id" IS NULL) AND (("lava_scheduler_app_testjob". "actual_device_id" IS NOT NULL AND "lava_scheduler_app_testjob". "actual_device_id" IN (SELECT V0. "hostname" FROM "lava_scheduler_app_device" V0 LEFT OUTER JOIN "lava_scheduler_app_groupdevicepermission" V1 ON (V0."hostname" = V1. "device_id") LEFT OUTER JOIN "auth_group" V2 ON (V1. "group_id" = V2. "id") LEFT OUTER JOIN "auth_user_groups" V3 ON (V2. "id" = V3. "group_id") LEFT OUTER JOIN "auth_permission" V5 ON (V1. "permission_id" = V5. "id") GROUP BY V0."hostname", (SELECT U0. "name" FROM "lava_scheduler_app_devicetype" U0 LEFT OUTER JOIN "lava_scheduler_app_groupdevicetypepermission" U1 ON (U0."name" = U1. "devicetype_id") LEFT OUTER JOIN "auth_group" U2 ON (U1."group_id" = U2. "id") LEFT OUTER JOIN "auth_user_groups" U3 ON (U2. "id" = U3. "group_id") LEFT OUTER JOIN "auth_permission" U5 ON (U1. "permission_id" = U5. "id") GROUP BY U0. "name" HAVING (SUM (CASE WHEN (U5."codename" = %s) THEN % s ELSE % s END) = %s OR NOT (SUM (CASE WHEN (U3. "user_id" = %s AND U5. "codename" IN (%s, %s, %s)) THEN % s ELSE % s END) = %s))) HAVING ((SUM (CASE WHEN (V5."codename" = %s) THEN % s ELSE % s END) = %s AND V0. "device_type_id" IN (SELECT U0. "name" FROM "lava_scheduler_app_devicetype" U0 LEFT OUTER JOIN "lava_scheduler_app_groupdevicetypepermission" U1 ON (U0."name" = U1. "devicetype_id") LEFT OUTER JOIN "auth_group" U2 ON (U1."group_id" = U2. "id") LEFT OUTER JOIN "auth_user_groups" U3 ON (U2."id" = U3. "group_id") LEFT OUTER JOIN "auth_permission" U5 ON (U1. "permission_id" = U5. "id") GROUP BY U0. "name" HAVING (SUM (CASE WHEN (U5. "codename" = %s) THEN % s ELSE % s END) = %s OR NOT (SUM (CASE WHEN (U3. "user_id" = %s AND U5. "codename" IN (%s, %s, %s)) THEN % s ELSE % s END) = %s)))) OR NOT (SUM (CASE WHEN (V3. "user_id" = %s AND V5. "codename" IN (%s, %s, %s)) THEN % s ELSE % s END) = %s)))) OR ("lava_scheduler_app_testjob". "actual_device_id" IS NULL AND "lava_scheduler_app_testjob". "requested_device_type_id" IN (SELECT U0. "name" FROM "lava_scheduler_app_devicetype" U0 LEFT OUTER JOIN "lava_scheduler_app_groupdevicetypepermission" U1 ON (U0."name" = U1. "devicetype_id") LEFT OUTER JOIN "auth_group" U2 ON (U1."group_id" = U2. "id") LEFT OUTER JOIN "auth_user_groups" U3 ON (U2."id" = U3. "group_id") LEFT OUTER JOIN "auth_permission" U5 ON (U1. "permission_id" = U5. "id") GROUP BY U0. "name" HAVING (SUM (CASE WHEN (U5. "codename" = %s) THEN % s ELSE % s END) = %s OR NOT (SUM (CASE WHEN (U3. "user_id" = %s AND U5. "codename" IN (%s, %s, %s)) THEN % s ELSE % s END) = %s)))))) OR (NOT ("lava_scheduler_app_testjob". "id" IN (SELECT U0. "id" FROM "lava_scheduler_app_testjob" U0 LEFT OUTER JOIN "lava_scheduler_app_testjob_viewing_groups" U1 ON (U0."id" = U1."testjob_id") WHERE U1. "group_id" IS NULL)) AND NOT (EXISTS (SELECT (1) AS "a" FROM "lava_scheduler_app_testjob_viewing_groups" V1 WHERE (V1. "group_id" IN (SELECT U0. "id" FROM "auth_group" U0 WHERE NOT (U0. "id" IN (%s))) AND V1. "testjob_id" = ("lava_scheduler_app_testjob". "id")) LIMIT 1)))) AND "lava_scheduler_app_testjob"."submitter_id" = %s AND "lava_scheduler_app_testjob"."state" IN (%s, %s))
comment:3 by , 3 years ago
Cc: | added |
---|---|
Resolution: | → needsinfo |
Status: | new → closed |
The issue seems related an attempt at doing a GROUP BY
by a subquery that returns more than one row. See the GROUP BY V0."hostname", (SELECT U0."name" FROM "lava_scheduler_app_devicetype" ...)
part.
The Q(actual_device__isnull=False, actual_device__in=accessible_devices)
lookup in RestrictedTestJobQuerySet.accessible_by_user
is causing that somehow, and I assume Device.objects.accessible_by_user
is the culprit as it must do a subquery annotation somehow that is not limited to one row or maybe your manual backport of 136ff592ad8aa8b7fa1e61435e5501cc98ce8573 is to blame? In call cases you'll want to debug get_group_by_cols
.
By giving a cursory look at your code base I couldn't identify the origin of the subquery annotation so I'll close with needsinfo for now but please re-open if you can create a minimal test case as I'll be monitoring this issue.
comment:4 by , 3 years ago
Hi, I spent some time debugging this today. This seems to be caused by get_group_by_cols()
returning an incorrect value for a Query
object. In a pdb session against the main branch of the git repository, I get this:
$ pdb3 ./manage.py runscript test > /home/terceiro/src/linaro/lava/manage.py(22)<module>() -> import argparse (Pdb) break /home/terceiro/src/django/django/db/models/lookups.py:434 Breakpoint 1 at /home/terceiro/src/django/django/db/models/lookups.py:434 (Pdb) c > /home/terceiro/src/django/django/db/models/lookups.py(434)get_group_by_cols() -> cols.extend(self.rhs.get_group_by_cols()) (Pdb) p self.rhs <django.db.models.sql.query.Query object at 0x7f6ddb7bd580> (Pdb) p self.rhs.get_group_by_cols() [<django.db.models.sql.query.Query object at 0x7f6ddb7bd580>]
In commit 35431298226165986ad07e91f9d3aca721ff38ec, which caused this, django.db.models.sql.query.Query
is made a subclass of django.db.models.expressions.BaseExpression
. BaseExpression.get_group_by_calls is implemented like this:
def get_group_by_cols(self, alias=None): if not self.contains_aggregate: return [self] cols = [] for source in self.get_source_expressions(): cols.extend(source.get_group_by_cols()) return cols
Since BaseExpression also hardcodes contains_aggregate
to False, that first if statement always applies. Providing an appropriate implementation of contains_aggregate to Query fixes things for me here:
diff --git a/django/db/models/sql/query.py b/django/db/models/sql/query.py index 2c5f11cbbf..f7c43ff622 100644 --- a/django/db/models/sql/query.py +++ b/django/db/models/sql/query.py @@ -414,6 +414,11 @@ class Query(BaseExpression): annotation.set_source_expressions(new_exprs) return annotation, col_cnt + @cached_property + def contains_aggregate(self): + annotations = self.annotations.values() + return any(getattr(ann, 'contains_aggregate', True) for ann in annotations) + def get_aggregation(self, using, added_aggregate_names): """ Return the dictionary with the values of the existing aggregations.
I wasn't able to write an unit tests for this so far, though.
comment:5 by , 3 years ago
Just noticed that this change causes a few errors for other tests so it's not the final fix.
comment:6 by , 3 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
Unfortunately I wasn't able yet to write an unit test that triggers this issue. I was, however, able to write a patch that both fixes my issue and does not cause any existing unit test to fail: https://github.com/django/django/pull/15129
comment:7 by , 3 years ago
Has patch: | set |
---|---|
Needs tests: | set |
Patch needs improvement: | set |
follow-up: 9 comment:8 by , 3 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Managed to reproduce with this tests
-
tests/aggregation/tests.py
diff --git a/tests/aggregation/tests.py b/tests/aggregation/tests.py index 9ba70cb35e..f3277d2b5f 100644
a b def test_aggregation_nested_subquery_outerref(self): 1335 1335 ).values_list('publisher_count', flat=True) 1336 1336 self.assertSequenceEqual(books_breakdown, [1] * 6) 1337 1337 1338 def test_filter_in_subquery_or_aggregation(self): 1339 authors = Author.objects.annotate( 1340 books_count=Count('book'), 1341 ).filter( 1342 Q(pk__in=Book.objects.values('authors')) | Q(books_count__gt=0) 1343 ) 1344 self.assertQuerysetEqual(authors, Author.objects.all(), ordered=False) 1345 1338 1346 def test_aggregation_random_ordering(self): 1339 1347 """Random() is not included in the GROUP BY when used for ordering.""" 1340 1348 authors = Author.objects.annotate(contact_count=Count('book')).order_by('?')
The gist of it is that the logic that was added to Subquery.get_group_by_cols
should have been added to sql.Query.get_group_by_cols
and the latter should have proxied it.
The issue is triggered when a lookup containing a subquery is combined using OR
to a lookup against an aggregation. When this happens the ORM has to choice but to push the filtering to the HAVING
clause and thus group by all inputs passed down to the lookups. Since Subquery
resolves to sql.Query
and the latter defaults to returning self
as inherited from BaseExpression
the ORM tries to group by a subquery that potentially returns multiple rows.
Antonio if don't mind if I assign the issue to me I have a patch set almost ready to address the problem and include extra regression tests for new uses cases such as lookup annotations which is a newly supported in 4.0.
comment:9 by , 3 years ago
Replying to Simon Charette:
Antonio if don't mind if I assign the issue to me I have a patch set almost ready to address the problem and include extra regression tests for new uses cases such as lookup annotations which is a newly supported in 4.0.
Sure, please go ahead. I just want this to get fixed.
It would be nice if it's possible to backport the fix to 3.2 as well.
follow-up: 11 comment:10 by , 3 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I should be able to submit a patch this week.
It would be nice if it's possible to backport the fix to 3.2 as well.
Unfortunately that won't be possible per our backporting policies, too much risk for regressions and 3.2 is only receiving security backports at this point.
comment:11 by , 3 years ago
Replying to Simon Charette:
I should be able to submit a patch this week.
It would be nice if it's possible to backport the fix to 3.2 as well.
Unfortunately that won't be possible per our backporting policies, too much risk for regressions and 3.2 is only receiving security backports at this point.
this means that 3.x will be left with two crash-causing regressions (this one and the one in #32690)
comment:12 by , 3 years ago
Needs tests: | unset |
---|---|
Patch needs improvement: | unset |
comment:13 by , 3 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Bisecting points to the same culprit as #32690: