#25307 closed Bug (fixed)
Cannot use .annotate with conditional expressions
Reported by: | Jared Proffitt | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | josh.smeaton@…, matt@…, Simon Charette, newport.travis@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
I am trying to create a query that first annotates a Count
, then aggregates some Sum
s together using Case
objects. If I do the query without the annotate, everything is fine. If I do the query without any Case
objects in the aggregate, everything is fine. But when they are both present I get this error:
'WhereNode' object has no attribute 'get_source_expressions'
My query is something like this:
counts = models.Package.objects.annotate( used_count=Count('projects') ).distinct().aggregate( no_parent=Sum(Case( When(parent_id__isnull=True, then=1), default=0, output_field=IntegerField() )) )
I would like to eventually add the used_count
value in the aggregate, but this itself won't even run.
Change History (23)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
I have the same issue, here is enough code to reproduce the issue.
# In models.py from django.db import models class DateModel(models.Model): date1 = models.DateTimeField() date2 = models.DateTimeField() # elsewhere from django.db.models import Q, Case, When from django.db.models.functions import Coalesce from datetime import date annotate = {'date': Coalesce('date1', 'date2')} dms = DateModel.objects.annotate(**annotation) aggregation = { 'count_recent': Count(Case(When(Q(date__gte=date(year=2015, month=1, day=1))))) } dms = dms.aggregate(**aggregation)
AttributeError: 'WhereNode' object has no attribute 'get_source_expressions'
comment:3 by , 9 years ago
Cc: | added |
---|---|
Triage Stage: | Unreviewed → Accepted |
Version: | 1.8 → master |
comment:5 by , 9 years ago
Please correct me if I'm wrong, but it seems as if the issue lies with the django.db.models.expressions.When
class, in that the condition
attribute is supposed to be a django.db.models.Q
object and yet the get_source_epxressions()
and set_source_expressions()
methods are treating the condition
attribute as an django.db.models.expressions.Expression
, when it most certainly is not.
This violates the API and hence client-code of the API (such as aggregation code in this case) will fail unexpectedly.
E.g.
In the example provided by @martsberger, the count_recent
expression gets down to django.db.models.sql.query.Query.rewrite_cols()
. That method recursively travels into the nested expressions until it hits the Q
object returned by the When
expression's get_source_expressions()
and that's where it fails.
The problem is only triggered (during aggregation) by column rewriting, when this line evaluates to True
in django.db.models.sql.query.Query.get_aggregation()
:
if (isinstance(self.group_by, list) or has_limit or has_existing_annotations or self.distinct):
As I said initially, I believe the root cause is the violation of the Expression
API by the When
subclass.
comment:6 by , 9 years ago
Cc: | added |
---|
comment:7 by , 9 years ago
I'm not sure when/how, but the condition
attribute of the When
Expression
gets converted to a django.db.models.sql.where.WhereNode
, so it's actually a WhereNode
object that is having .get_source_expressions()
called on it.
Regardless, django.db.models.expressions.When
's get_source_expressions()
and set_source_expressions()
methods need to bridge the gap between Expression
s and django.utils.tree.Node
s, in order to properly re-write the column names to the aliases from the subquery.
I can't see a quick/easy way to do this, but as I'm not experienced in the ORM code, I could be wrong.
comment:8 by , 9 years ago
We have a couple of things that *resolve* to an expression. For example, Q() and F() objects are not expressions, but will resolve to an expression when added to a query.
If WhereNode doesn't respect the expression API, then the fix is to make WhereNodes full expressions.
comment:9 by , 9 years ago
To clarify, I think it's the django.db.models.expressions.When
class (which is a subclass of Expression
) that is not respecting the API, with regards to its get_source_expressions()
and set_source_expressions()
methods.
comment:10 by , 9 years ago
Cc: | added |
---|
Also just hit this when upgrading an old project using django-aggregate-if on Django 1.7 to conditional aggregation on Django 1.8.
comment:11 by , 9 years ago
From a short investigation it looks like the issue is actually caused by a distinct()
and aggregate()
of conditional expression combination.
Can the people affected confirm that removing the distinct()
call from their query silence the exception?
comment:12 by , 9 years ago
That does make it break as well, but not simply just the use of distinct. If you use EITHER .distinct()
OR .annotate()
along with an .aggregate()
that contains a conditional expression. (not sure exactly what part of the condition expression, but it must be either because of the Case()
or the When()
.
I have made a completely useless example using the polls app in the django tutorial. I copied the polls models exactly, and created this query, which will cause the error to happen:
counts = models.Choice.objects.annotate( used_count=Count('question') ).distinct().aggregate( yes=Sum(Case( When(choice_text='yes', then=F('votes')), default=0, output_field=IntegerField() )), no=Sum(Case( When(choice_text='no', then=F('votes')), default=0, output_field=IntegerField() )), )
(Obviously, then used_count
will always be 1, and its pretty pointless in this example, but you get the idea).
So if you remove the either the .annotate()
OR the .distinct()
, the query still breaks the same way. The presence of either one causes it to break. But if you remove BOTH, the query runs fine.
I do not know anything about the internals of the Query api, or I would help fix the problem. But it seems there is a problem when combining certain queryset methods like annotate
or distinct
with aggregate
s that have condition expressions.
I know it has to do with condition expressions, because doing the query like the following works just fine:
counts = models.Choice.objects.annotate( used_count=Count('question') ).distinct().aggregate( yes=Sum('votes'), )
comment:14 by , 9 years ago
Cc: | added |
---|
As mentioned on the pull request, this issue seems to exist for any of the 4 cases here:
if (isinstance(self.group_by, list) or has_limit or has_existing_annotations or self.distinct):
I was able to write unit tests for all but the isinstance(self.group_by, list)
case, if someone is able to provide a test for that case I can add it to my changes.
Edit: Unit test updates on my pull request should now cover all cases, my pull request can be closed in favor of #6267
comment:16 by , 9 years ago
Patch needs improvement: | unset |
---|
comment:18 by , 8 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Any chance you could provide a regression test for Django's test suite? It's difficult to reproduce the issue without your models.