Opened 16 years ago

Closed 9 years ago

#10302 closed New feature (fixed)

Add some date features to aggregation

Reported by: Adrian Ribao Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: aggregation dates
Cc: miracle2k, ben.welsh@…, gmayer, kmpm, Mikhail Korobov, Robin, John Paulett, sebastian.goll@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

It would be nice to use datetime fields as a group in aggregation.

class Log(models.Model):
    datetime = models.DateTimeField(auto_now_add=True)
    subject = models.CharField(max_length=255)

Log.objects.values('datetime').annotate(num_logs=Count('id'))

Grouping by date(not datetime) is not possible.

Attachments (1)

ticket_10302.diff (2.1 KB ) - added by Raphael Michel 9 years ago.
Regression test

Download all attachments as: .zip

Change History (18)

comment:1 by Erin Kelly, 16 years ago

Using the sqlite3 backend:

Log.objects.extra(select={'date': 'django_date_trunc("day", "testapp_log"."datetime")'}).values('date').annotate(num_logs=Count('id'))

Other backends will require replacing django_date_trunc() with the appropriate call. But I agree that there should be a single API for this.

comment:2 by (none), 16 years ago

milestone: post-1.0

Milestone post-1.0 deleted

comment:3 by Jacob, 16 years ago

milestone: 1.1 beta
Triage Stage: UnreviewedAccepted

comment:4 by Russell Keith-Magee, 16 years ago

milestone: 1.1 beta

Bumping this back to accepted, but not on v1.1 path, because this isn't something that is strictly related to aggregates. You can already do a GROUP BY on any date/datetime field that exists on the model. This ticket is asking for the ability to do a GROUP BY on a value computed from another field on the model. This will first require the ability to compute fields and annotate them.

While this is definitely a good idea, it isn't a simple extension or omission from the v1.1 aggregates implementation. In the interim, it can be done using the approach Ian mentioned in the first comment.

comment:5 by palewire, 15 years ago

Cc: ben.welsh@… added

I tried to replicate this on a model with a DateTimeField and ran into a problem I can't explain.
I'm working in Postgres and on the 1.1 official release.

The code was:

>>> from django.db.models import Count
>>> from app.models import Model
>>> Model.objects.extra(select={'year': "DATE_TRUNC('year', \"app_model\".\"datetime\")"}).values('year').annotate(total=Count('id'))

And the SQL I got back seemed to erroneously have the datefield included into the GROUP BY statement, like:

>>> from django.db import connection
>>> print connection.queries[-1]
{'time': '0.207', 'sql': 'SELECT (DATE_TRUNC(\'year\', "app_model"."datetime")) AS "year", COUNT("app_model"."id") AS total 
FROM "app_model" GROUP BY DATE_TRUNC(\'year\', "app_model"."datetime"), "app_model"."datetime" ORDER BY "app_model"."datetime" DESC LIMIT 21'}

comment:6 by Russell Keith-Magee, 15 years ago

Component: Database layer (models, ORM)ORM aggregation
Owner: nobody removed

comment:7 by miracle2k, 15 years ago

Cc: miracle2k added

comment:8 by gmayer, 15 years ago

Cc: gmayer added

I ran into the same problem as palewire above but after some digging realised that my ordering in the model's Meta class was causing django to insert extra GROUP BY fields for all those listed in ordering. After commenting it out the extra().values().annotate() did the right thing, alternatively appending an order_by('year') to his command above would have worked too.

I'm not sure if this is a bug or a feature (why are ordering fields forced to be included in the GROUP BY SQL???) but if it's a feature then this gotcha is poorly documented. The queryset docs only mention something order_by() possibly conflicting with a values() under the distinct() heading, it should be a prominent warning under the values() and/or extra() description.

comment:9 by kmpm, 15 years ago

Cc: kmpm added

comment:10 by kmpm, 15 years ago

I made a method something like this in a manager.
When I didn't have that last .order_by things went mad as for @gmayer but as soon as I added my own order I could put back the ordering in the model's Meta class.
Don't know if will help but it's a workaround if you need ordering in the Meta class.

def period_avrage(self, start_at=None, end_at=None, period_size='day'):
    extra_psql={'period': "DATE_TRUNC('%s', \"collector_taglog\".\"created_at\")" % period_size}
    qs = self.filter(data_good=True) #only good values
    if start_at: qs = qs.filter(created_at__gte=start_at)
    if end_at: qs = qs.filter(created_at__lte=end_at)    
    #do not remove the order by. Otherwise the ordering on the model will screw things up
    return qs.extra(select=extra_psql).values('period').annotate(avg=Avg('v_num')).order_by('period')
Last edited 9 years ago by Tim Graham (previous) (diff)

comment:11 by anonymous, 15 years ago

Cc: Mikhail Korobov added

comment:12 by Robin, 14 years ago

Cc: Robin added

comment:13 by John Paulett, 14 years ago

Cc: John Paulett added

comment:14 by Chris Beaven, 14 years ago

Severity: Normal
Type: New feature

comment:15 by Sebastian Goll, 13 years ago

Cc: sebastian.goll@… added
Easy pickings: unset
UI/UX: unset

comment:16 by Anssi Kääriäinen, 12 years ago

Component: ORM aggregationDatabase layer (models, ORM)

by Raphael Michel, 9 years ago

Attachment: ticket_10302.diff added

Regression test

comment:17 by Raphael Michel, 9 years ago

Resolution: fixed
Status: newclosed

This has been fixed by the Expressions API and is therefore obsolete.

Note: See TracTickets for help on using tickets.
Back to Top