Opened 9 months ago
Last modified 7 weeks ago
#35235 assigned Bug
ArrayAgg() doesn't return default when filter contains __in=[].
Reported by: | Per Carlsen | Owned by: | Sharon Woo |
---|---|---|---|
Component: | contrib.postgres | Version: | 5.0 |
Severity: | Normal | Keywords: | ArrayAgg |
Cc: | Simon Charette, Nick Pope | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
ArrayAgg changed the default return value in Django 5.0 to None
instead of []
. The documentation recommends adding default=Value([])
to make it return an empty list instead of None
. My app requires empty lists to be returned, so I updated the code and noticed that the return type is inconsistent with the documentation when using the filter
functionality of ArrayAgg.
Consider the following query. filter_value
and default
are variables, and given combinations seem to give inconsistent return values.
MyModel.objects.annotate( annotated_ids=ArrayAgg( "my_other_model_set__id", filter=Q( my_other_model_set__id__in=filter_value, ), default=default, ) ).first().annotated_ids
MyOtherModel
has a fk relation to MyModel
, which we referenced via my_other_model_set
. Consider the following input values for filter_value
and default
:
Example 1:
filter_value = [-1] # Or any other ID that does not exist
default = Value([])
This returns an empty list ([]
), as expected because there are no results for ArrayAgg
Example 2:
filter_value = []
default = Value([])
BUG: This returns the string '{}'
instead of an empty list. This also happens if we don't specify a default
value. But if we give default=[]
(without the Value()
), the return value is consistent with the documentation.
Change History (21)
comment:2 by , 9 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:3 by , 9 months ago
Cc: | added |
---|---|
Summary: | ArrayAgg inconsistent return value when no results → ArrayAgg() doesn't return default when filter contains __in=[]. |
Triage Stage: | Unreviewed → Accepted |
I was able to reproduce the issue with a small models:
class MyModel(models.Model): pass class MyOtherModel(models.Model): mymodel = models.ForeignKey(MyModel, models.CASCADE) for filter_value in ([], [-1]): for default in ([], Value([])): result = MyModel.objects.annotate( annotated_ids=ArrayAgg( "myothermodel__id", filter=Q( myothermodel__id__in=filter_value, ), default=Value([]), ) ).first().annotated_ids print(result, filter_value, default)
results
{} [] [] {} [] Value([]) [] [-1] [] [] [-1] Value([])
comment:4 by , 9 months ago
If you change felix' example to add output_field=ArrayField(IntegerField())
to the default, it works 👍 seems related to #35149 where db_default
required output_field
set? 🤔
comment:5 by , 9 months ago
Looks like this patch fixes the issue:
--- a/django/db/models/aggregates.py +++ b/django/db/models/aggregates.py @@ -92,7 +92,7 @@ class Aggregate(Func): return c if hasattr(default, "resolve_expression"): default = default.resolve_expression(query, allow_joins, reuse, summarize) - if default._output_field_or_none is None: + if default._resolve_output_field() is None: default.output_field = c._output_field_or_none else: default = Value(default, c._output_field_or_none)
The problem is Aggregate.resolve_expression()
calls default._output_field_or_none
which then caches the None
because it's a @cached_property
.... which makes setting the output_field
essentially a no-op.
Edit: Sharon feel free to use this ^
as a starting point for your patch 👍 Simon Charette will have some feedback on whether this is "correct" or whether there's a larger problem at play here, but I'd wager this is definitely something that needs fixing.
follow-up: 7 comment:6 by , 9 months ago
Thanks for being patient with me. Let me get through everything and come back with an improved PR, I have some time this week.
comment:7 by , 9 months ago
Replying to Sharon Woo:
Thanks for being patient with me. Let me get through everything and come back with an improved PR, I have some time this week.
No worries, everyone is on Discord (chat) or the forum if you need help with anything 👍 Details here: https://www.djangoproject.com/community/
comment:8 by , 9 months ago
Sorry for the lack of progress, before I pushed more code into remote I wanted to get a debugger and the postgres tests running locally... and neither has led to much:
- I spent some time this morning setting up a simple app (branching off main at 561e16d6a7 and an empty postgres db, if it matters) to reproduce Felix's example before I make any further changes. However, instead of the mythical '{}', I always get None for
result
instead, no matter what I pass into default. I am going to step through a bit more here.
- I also spent some time looking into how to run the postgres tests locally (they are always skipped now using tox / runtests.py) -- better yet if I can step through the tests in a debugger, but haven't had any luck with resources. What I've done is searched in our favourite search engines and Discord, but lots of posts on general postgres/testing to sift through. Anyone?
Edit: I'll also post on Discord and cross reference this so the two threads are linked after dinner.
comment:9 by , 9 months ago
I may have misled you there… on GH I mentioned not needing test data but you do actually need at least one row present in the table to get the wrong result. (The reason why my simplified test worked is because there's data in setupTestData()
)
To explain:
The bug happens when the aggregate filter is testing for a "contradiction" (ie something that's guaranteed to be false), eg:
ArrayAgg(…, filter=Q(whatever__in=[]))
here it's guaranteed that the filter will always be false. Django has a chance to do some optimisations when this occurs.
Normally when you do
Foo.objects.filter(whatever__in=[])
Django will raise an EmptyResultSet
and the catching code will skip calling the db altogether so as not to run an unnecessary query & save on time.
However, when it's within an annotation, Django still needs to run the query. The optimisation that occurs in this case is that Django will simplify the expression instead.
If you observe the query (by doing print(queryset.query)
) you'll see that the annotation has been optimised to:
SELECT …, COALESCE(NULL, '{}') …
where Django has deliberately replaced the input to COALESCE
with NULL
.
This reveals the source of the problem: the expression COALESCE(NULL, '{}')
is of type string. It _should_ be something along the lines of COALESCE(NULL, '{}'::integer[])
to force the expression to be of type integer array.
Hope that helps? Sorry for the long-winded explanation but it took me yonks to realise what was happening under the hood with Django so thought you could use the primer 👍😊
Edit: Oh and I forgot to mention why you need at least 1 row, even though it doesn't matter what the data is: It's so that we can see the value of the COALESCE()
expression. If there are no rows then then the expression never gets evaluated.
follow-up: 11 comment:10 by , 9 months ago
Thanks for taking a shot at fixing the issue Sharon and the mentoring David.
Just dropping a message to say that I believe your solution in comment:5 and your assessment about similarity with #35149 is right David. I guess an alternative could be to wrap the default in ExpressionWrapper
instead to avoid the .output_field
assignment foot gun. Foot gun that we could probably disarm by making BaseExpression.output_field
a @property
that clears the _output_field_or_none
cache on assignment as I wouldn't be surprised other well intended .output_field
assignments are not working as expected.
We should definitely address this issue and I wonder if we should also adjust the documentation to denote that the Value
wrapping is not necessary anymore (in other words default=[]
should just work).
comment:11 by , 9 months ago
Replying to Simon Charette:
Foot gun that we could probably disarm by making
BaseExpression.output_field
a@property
that clears the_output_field_or_none
cache on assignment as I wouldn't be surprised other well intended.output_field
assignments are not working as expected.
Would there be any harm in simply changing @cached_property
on BaseExpression._output_field_or_none
to @property
? 🤔 It's body is simply:
@cached_property def _output_field_or_none(self): try: return self.output_field except FieldError: if not self._output_field_resolved_to_none: raise
and output_field
is already @cached_property
as you mentioned, so it seems redundant. These 2 methods were both introduced as cached props when originally added by Josh in 2013. I just wonder whether Josh was being extra safe around the FieldError
handling.
comment:12 by , 9 months ago
Would there be any harm in simply changing @cached_property on BaseExpression._output_field_or_none to @property?
I would definitely support that change! The cache invalidation challenges that the current implementation impose don't seem worth the mere caching benefits that caching _output_field_or_none
over the already existing output_field
caching provide.
comment:13 by , 9 months ago
❤️ awesome
@Sharon so that seems to be the solution if you'd like to prepare a patch 😊
ps: Many folks on Discord will happily answer your test setup q's if you still had them.
comment:14 by , 9 months ago
Hi David and Simon, thank you! This is extremely helpful. I've posted on Discord and contributed a cat pic in the pets channel for good measure.
I'm now able to reproduce exactly Felix's result locally, the little test app for reference (again definitely not best practice): https://github.com/sharonwoo/django/tree/test-app-for-pairing
Now that I have this up, to make changes and push to remote with slightly more confidence.
comment:15 by , 9 months ago
Test app validates
@property def _output_field_or_none(self):
fixes the issue. Now to write a test and commit to remote.
comment:16 by , 9 months ago
@Mariusz, I noticed you didn't categorise this as a release blocker but just wanted to check whether it should be since next month is approaching - technically it's not a regression from 5.0 but it was "uncovered" by the change in default behaviour 😅
comment:18 by , 7 months ago
Hi! Just wondering what the status of this ticket is and when you expect to have a fix ready? :-)
comment:19 by , 7 months ago
Hi Per, the current status of the ticket is reflected in the ticket history, where all conversations are documented openly.
We greatly rely on the community for contributing both fixes and new features. If you have the time and desire to help, there is an ongoing effort from Sharon Woo in this PR:
https://github.com/django/django/pull/17890
One option would be to reach out to Sharon Woo and confirm whether they have time to continue working in the PR and see if/where you could be of help. If they don't, perhaps pick up where they left of? From a quick read, it seems that the bulk of the work was done, only some extra tests are needed.
Thank you!
comment:21 by , 7 weeks ago
I'm working on it again as of today. Let's try to get it done before mid October.
New to django core and attempting to triage. I couldn't replicate this issue with some simple tests in a draft PR: https://github.com/django/django/pull/17890
I'd like to work on this, will set myself as the owner while I attempt a few more things.