Opened 10 years ago
Closed 10 years ago
#24570 closed Bug (invalid)
`group_by` clause does not resolve keywords defined in `extra` clause.
Reported by: | user0007 | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.8 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Hello,
during migrating from Django 1.7.5 to 1.8 I found a bug(?)
truncate_date = connection.ops.date_trunc_sql('day', 'created_at') qs = MyModel.objects.filter( user_id=1 ).annotate( max_date=models.Max('created_at') ).extra( {'created_at': truncate_date} ).values( 'product_id', 'created_at', 'max_date' ) qs.query.group_by = ['product_id', truncate_date] qs.order_by('-max_date')[:10]
The result is:
FieldError: Cannot resolve keyword u"CAST(DATE_FORMAT(created_at, '%%Y-%%m-%%d 00:00:00') AS DATETIME)" into field.
This code works perfectly on Django 1.7.5
Traceback:
Traceback (most recent call last): File "<console>", line 1, in <module> File "/home/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 138, in __repr__ data = list(self[:REPR_OUTPUT_SIZE + 1]) File "/home/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 162, in __iter__ self._fetch_all() File "/home/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 965, in _fetch_all self._result_cache = list(self.iterator()) File "/home/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 1085, in iterator for row in self.query.get_compiler(self.db).results_iter(): File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 783, in results_iter results = self.execute_sql(MULTI) File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 818, in execute_sql sql, params = self.as_sql() File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 367, in as_sql extra_select, order_by, group_by = self.pre_sql_setup() File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 51, in pre_sql_setup group_by = self.get_group_by(self.select + extra_select, order_by) File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 102, in get_group_by expressions.append(self.query.resolve_ref(expr)) File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1522, in resolve_ref self.get_initial_alias(), reuse) File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1461, in setup_joins names, opts, allow_many, fail_on_missing=True) File "/home/env/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1386, in names_to_path "Choices are: %s" % (name, ", ".join(available))) FieldError: Cannot resolve keyword u"CAST(DATE_FORMAT(created_at, '%%Y-%%m-%%d 00:00:00') AS DATETIME)" into field.
Change History (13)
comment:1 by , 10 years ago
Description: | modified (diff) |
---|
comment:2 by , 10 years ago
Description: | modified (diff) |
---|
comment:3 by , 10 years ago
Summary: | Cannot resolve keyword "CAST" in group_by clause. → `group_by` clause does not resolve keywords defined in `extra` clause. |
---|
comment:4 by , 10 years ago
Description: | modified (diff) |
---|
comment:5 by , 10 years ago
Severity: | Normal → Release blocker |
---|---|
Triage Stage: | Unreviewed → Accepted |
follow-up: 7 comment:6 by , 10 years ago
is that extra(select= {'created_at': truncate_date})
?
(I'm not sure about the Release Blocker status -- I don't think setting group_by on a QSet's query like this is public API)
comment:7 by , 10 years ago
Replying to shaib:
is that
extra(select= {'created_at': truncate_date})
?
Yes. extra(select= {'created_at': truncate_date})
and extra({'created_at': truncate_date})
are equal.
comment:8 by , 10 years ago
Severity: | Release blocker → Normal |
---|---|
Triage Stage: | Accepted → Unreviewed |
I skimmed through the code too quickly and didn't spot the qs.query.group_by = ...
.
As mentionned by Shai, using the attribute like that is not documented so this doesn't technically count as a regression.
Is there no way to achieve what you're doing with the documented/supported queryset.group_by()
method?
Edit: I've been informed that such a method doesn't actually exist.
follow-up: 10 comment:9 by , 10 years ago
What you were doing with your original queryset seems suspect to me, it looks like there are better ways to accomplish your goals in a completely supported way. Let me highlight some issues with your original queryset (I'll remove pieces that aren't relevant for brevity):
MyModel.objects.annotate( max_date=models.Max('created_at') ).extra( {'created_at': truncate_date} # here you're overwriting the field called created_at, you should choose a new alias ).values( 'product_id', 'created_at', 'max_date' ) # this is private API, but you'd need to use the alias name of the extra clause ("created_at"), # which is ambiguous because you've used the same name. group_by=['product_id', 'created_extra'] qs.query.group_by = ['product_id', truncate_date] qs.order_by('-max_date')[:10]
I've written a comparable (but not exact) queryset with 1.8 that I'd like you to take a look at.
In [46]: truncate_date = connection.ops.date_trunc_sql('day', 'created') In [47]: qs = Stats.objects.extra({'created_day': truncate_date}).values('created_day').annotate(max_growth=Max('growth')) In [48]: print(qs.query) SELECT (DATE_TRUNC('day', created)) AS "created_day", MAX("scratch_stats"."growth") AS "max_growth" FROM "scratch_stats" GROUP BY (DATE_TRUNC('day', created)) In [49]: for q in qs: ....: print(q) ....: {'created_day': datetime.datetime(2015, 3, 29, 0, 0, tzinfo=<UTC>), 'max_growth': 16} {'created_day': datetime.datetime(2015, 3, 24, 0, 0, tzinfo=<UTC>), 'max_growth': 11} {'created_day': datetime.datetime(2015, 3, 31, 0, 0, tzinfo=<UTC>), 'max_growth': 17} {'created_day': datetime.datetime(2015, 3, 25, 0, 0, tzinfo=<UTC>), 'max_growth': 18} {'created_day': datetime.datetime(2015, 3, 23, 0, 0, tzinfo=<UTC>), 'max_growth': 17} ...
Note that the query groups by the extra created_day clause because we've not shadowed any model fields, and we've named it in our values call. The above code I've written is fully supported too, although I'd encourage you to avoid using extra
and to instead use Query Expressions. Django should provide proper db_functions for extracting date parts, but that's a work in progress (I would imagine Django 1.9 will have something).
What I think your query should look like:
truncate_date = connection.ops.date_trunc_sql('day', 'created_at') qs = MyModel.objects.filter( user_id=1 ).annotate( max_date=models.Max('created_at') ).extra( {'created_day': truncate_date} ).values( 'product_id', 'created_day', 'max_date' ).order_by('-max_date')[:10]
Can you give that a go and let us know the outcome please?
comment:10 by , 10 years ago
Thank You for the answer.
Replying to jarshwah:
What I think your query should look like:
truncate_date = connection.ops.date_trunc_sql('day', 'created_at') qs = MyModel.objects.filter( user_id=1 ).annotate( max_date=models.Max('created_at') ).extra( {'created_day': truncate_date} ).values( 'product_id', 'created_day', 'max_date' ).order_by('-max_date')[:10]Can you give that a go and let us know the outcome please?
I've tried in that way. This QuerySet generates incorrect SQL (it groups by id
which is wrong and gives duplicate results).
SQL output:
SELECT (CAST(DATE_FORMAT(created_at, '%Y-%m-%d 00:00:00') AS DATETIME)) AS `created_day`, `mymodel`.`product_id`, MAX(`mymodel`.`created_at`) AS `max_date` FROM `mymodel` WHERE (`mymodel`.`user_id` = 1) GROUP BY `mymodel`.`id` ORDER BY `max_date` DESC LIMIT 10
The query which works is:
SELECT (CAST(DATE_FORMAT(created_at, '%Y-%m-%d 00:00:00') AS DATETIME)), `mymodel`.`product_id`, MAX(`mymodel`.`created_at`) AS `max_date` FROM `mymodel` WHERE (`mymodel`.`user_id` = 1) GROUP BY `mymodel`.`product_id`, (CAST(DATE_FORMAT(created_at, '%%Y-%%m-%%d 00:00:00') AS DATETIME)) ORDER BY `max_date` DESC LIMIT 10
And the question is how to write it using QuerySet? :-) I was able to that in Django 1.7.5 using query.group by
, but now it throws errors.
comment:12 by , 10 years ago
comment:13 by , 10 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
It looks like this has turned into a support ticket about how to use private APIs; see TicketClosingReasons/UseSupportChannels.
Hi,
I can indeed reproduce this issue and I bisected it down to commit 0c7633178fa9410f102e4708cef979b873bccb76.
I'm bumping the severity to release blocker because of the regression.
Thanks!