#34016 closed Bug (fixed)
QuerySet.values_list() crash on simple ArrayAgg.
Reported by: | Alex Kerkum | Owned by: | Alex Kerkum |
---|---|---|---|
Component: | contrib.postgres | Version: | 4.1 |
Severity: | Release blocker | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Using ArrayAgg
in combination with values_list
results in 'TypeError: Complex expressions require an alias'.
For example:
from django.contrib.postgres.aggregates import ArrayAgg SampleUser.objects.values_list(ArrayAgg("post_id")) File django/db/models/aggregates.py:98, in Aggregate.default_alias(self) 96 if len(expressions) == 1 and hasattr(expressions[0], "name"): 97 return "%s__%s" % (expressions[0].name, self.name.lower()) ---> 98 raise TypeError("Complex expressions require an alias") TypeError: Complex expressions require an alias
The expressions
variable here seems to contain [F(post_id), OrderByList()]
causing the len(expressions)
check to fail. That's as far as I got.
To me this seems related to #33898 caused by a regression in https://github.com/django/django/commit/e06dc4571ea9fd5723c8029959b95808be9f8812
This still worked in 4.0.7.
Change History (13)
comment:1 by , 2 years ago
Description: | modified (diff) |
---|
comment:2 by , 2 years ago
Description: | modified (diff) |
---|
comment:3 by , 2 years ago
Cc: | added |
---|---|
Severity: | Normal → Release blocker |
Summary: | QuerySet.values_list crash when using ArrayAgg → QuerySet.values_list() crash on simple ArrayAgg. |
Triage Stage: | Unreviewed → Accepted |
comment:4 by , 2 years ago
We could skip an empty OrderByList
🤔:
-
django/contrib/postgres/aggregates/mixins.py
diff --git a/django/contrib/postgres/aggregates/mixins.py b/django/contrib/postgres/aggregates/mixins.py index b2f4097b8f..2bf236474a 100644
a b class OrderableAggMixin: 14 14 return super().resolve_expression(*args, **kwargs) 15 15 16 16 def get_source_expressions(self): 17 if not self.order_by.source_expressions: 18 return super().get_source_expressions() 17 19 return super().get_source_expressions() + [self.order_by] 18 20 19 21 def set_source_expressions(self, exprs): 20 *exprs, self.order_by = exprs 22 if isinstance(exprs[-1], OrderByList): 23 *exprs, self.order_by = exprs 21 24 return super().set_source_expressions(exprs) 22 25 23 26 def as_sql(self, compiler, connection):
Alex, would you like to prepare a patch? (regression test is required)
comment:5 by , 2 years ago
Thanks for the quick reply! Your proposed fix does indeed seem to fix the issue.
I would love to create a patch, but I have no experience with the django code.
I'll have a look at it this afternoon to see if I can come up with something.
comment:6 by , 2 years ago
This approach will work as long no OrderableAggMixin
subclass use this get_source_expression
entry removal trick as well which I believe is the case for our internal usages.
Somewhat related but the Window
expression is an example of a composite that has some components that could be missing (partition_by
, frame
, order_by
) and cannot use the len
of the expressions to determine which ones were provided back (e.g. does 3 source expressions mean (source_expression, partition_by, frame)
or (source_expression, frame, order_by)
?).
It solves this problem by somewhat violating the get_source_expressions
API by returning None
placeholders which makes use some special casing in a few locations
- https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L397-L402
- https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L384-L389
- https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L274-L281
But not in all of them!
- https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L410-L417
- https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L424-L425
- https://github.com/django/django/blob/bc33b047848759e7ae90bf85c94ff6cc5913e1ee/django/db/models/expressions.py#L236-L246
All that so say that we should probably settle on what the signature of get_source_expressions
is; list[Optional[Expression]]
or list[Expression]
?
If we go with list[Optional[Expression]]
then we don't have to do any len
based special casing when dealing with optional sub-expressions but it means that checks of the form len(source_expressions)
are not adequate; they must exclude None
values. I guess we could provide a property/method that excludes them if this is a common pattern.
If we go with list[Expression]
then we should adjust Window
other expressions violating this guarantee and drop special casing for None
value. I guess Window.get_source_expressions()
could do something along the lines of
def get_source_expressions(self): expressions = self.source_expression if self.partition_by: expressions.append(self.partition_by) ... return expressions def set_source_expressions(self, expressions): self.source_expression = expressions.pop(0) if self.partition_by: self.partition_by = expressions.pop(0) ...
Given we've had these special cases in place for a while I think a better long term solution, and one to happen to maintain backward compatibility, would be to fully embrace the list[Optional[Expression]]
approach by using None
placeholders to address this problem. It doesn't have to be in this patch but at least agreeing on which signature we should support in main
and going forward would be beneficial.
From having more experience playing with Python typing in the past years, this is the kind of issues mypy
and friends could help us properly enforce. Maybe it's time for us to re-visit adding typing at least to some parts of Django?
comment:7 by , 2 years ago
I've made a PR with Mariusz' suggestions here: https://github.com/django/django/pull/16064
Is that sufficient, or does it need some more work?
comment:8 by , 2 years ago
Thanks for taking the time to submit a PR Alex, greatly appreciated. It should be enough assuming all tests are passing.
The long blob of text above was mostly about this regression being a sign of a larger problem and possible solutions to address it in follow up tickets.
comment:9 by , 2 years ago
Thanks Simon! As all tests are passing I've marked it as 'Ready for review'.
Fixing the larger problem in follow up tickets sounds good, although the implications of it are over my head :)
comment:10 by , 2 years ago
If we go with list[Optional[Expression]] then we don't have to do any len based special casing when dealing with optional sub-expressions but it means that checks of the form len(source_expressions) are not adequate; they must exclude None values. I guess we could provide a property/method that excludes them if this is a common pattern.
Thanks for checking. I'd choose list[Optional[Expression]]
and None
for empty optional expressions. However, Alex's PR is more suitable for backporting. We can standardized the behavior for optional sub-expressions in a separate cleanup targeted only to the main
branch.
comment:11 by , 2 years ago
Has patch: | set |
---|---|
Owner: | set to |
Status: | new → assigned |
Triage Stage: | Accepted → Ready for checkin |
Thanks for the report.
Regression in e06dc4571ea9fd5723c8029959b95808be9f8812.