Opened 13 years ago

Closed 8 years ago

Last modified 8 years ago

#16735 closed New feature (wontfix)

QuerySet.values() should be aliasable using strings

Reported by: alex.latchford@… Owned by: Ian Foote
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: queryset, alias, values
Cc: django@…, Ben Davis Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Student.objects.all().values('name', 'mother__name', 'class__teacher__name')
> QD {'name': 'Freddie', 'mother__name': 'Helen', 'class__teacher__name': 'Mr Williams'}

This sort of query doesn't quite demonstrate the problem fully, but it's the best example I can think of for now. Essentially I'd like to be able to alias these values in such a fashion that they are easily distinguishable or able to be shortened..

I envisage something like this..

Student.objects.all().values(student_name='name', mother_name='mother__name', teacher_name='class__teacher__name')
> QD {'student_name': 'Freddie', 'mother_name': 'Helen', 'teacher_name': 'Mr Williams'}

You should also be able to leave arguments if the standard name will suffice..

Many thanks,
Alex

Attachments (2)

column_alias.diff (33.0 KB ) - added by Nate Bragg 13 years ago.
16735-2.patch (33.0 KB ) - added by Nate Bragg 13 years ago.
Corrected an oversight when aliasing columns - previous patch would alias columns generically when it didn't need to.

Download all attachments as: .zip

Change History (30)

comment:1 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedAccepted

Interesting idea. From a quick inspection of the source (ValuesQuerySet) this looks doable.

by Nate Bragg, 13 years ago

Attachment: column_alias.diff added

comment:2 by Nate Bragg, 13 years ago

Has patch: set
Owner: changed from nobody to Nate Bragg
Status: newassigned

This patch is the most direct way I could see of adding this feature.

It passes all run tests in ./runtests.py --settings=test_sqlite

I tried to be careful, but if it is missing anything, I'd be happy to take another look at it.

comment:3 by Łukasz Rekucki, 13 years ago

Patch needs improvement: set

While adding this to values() is perfectly fine, doing the same with values_list() feels weird. Also, it's buggy:

# from ValuesListQuerySet in the patch
fields = list(self._fields) + self._aliased_fields.keys()

This makes the order of values in returned tuples depend on order of dictionary keys, which is _undefined_. Just try:

print Model.objects.values_list(a2="field", a3="other_field")
# on Python 2.7 and 3.2 values should be reversed.
Version 0, edited 13 years ago by Łukasz Rekucki (next)

by Nate Bragg, 13 years ago

Attachment: 16735-2.patch added

Corrected an oversight when aliasing columns - previous patch would alias columns generically when it didn't need to.

comment:4 by Nate Bragg, 13 years ago

Triage Stage: AcceptedDesign decision needed

Ostensibly, I agree with you - it's a bit odd, and I don't presume to know why one would use it. Initially, when I added that to values_list(), it was to "harmonize" it with the features added to values(). However, similar behavior results when you annotate or aggregate on a values_list() call - the aliased names are added in the order of the keys.

So, I would propose one of three solutions:

  1. Remove aliased names in the values_list() call entirely, replaced with a dummy empty dictionary.
  2. Force the sorting of the aliases, presumably alphabetically; this would also suggest repairing this defect with annotations and aggregations, which may not be quite so simple.
  3. Leave as is in my updated patch, with indeterminate order returned.

I will be happy to implement which ever one seems best to a core developer. As such, I'm setting this to DDN.

comment:5 by Anssi Kääriäinen, 13 years ago

What is the rationale of doing this for values_list()? If I am not mistaken, you will get a list back anyways. So the aliases are used for what?

The .annotate() + values_list() seems to be a bug. Basically, multiple annotates in one .annotate() call have indeterminate order, which results in indeterminate order in the values_list(). I bet some users will be hit if/when Python randomizes the hashing algorithm. So, indeterminate order is not good. Unfortunately I don't see a fix other than disallowing multiple annotates in one call. Which is backwards incompatible.

In addition, some benchmark that this doesn't slow down fetching large sets of objects from the DB is needed. .values() is mostly an optimization, so it should remain as fast as possible. Maybe django-bench contains some benchmark already?

comment:6 by Ulrich Petri, 12 years ago

Cc: django@… added

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

comment:7 by jrs_66@…, 12 years ago

I'm guessing that the seemingly trivial, and widely used, concept of aliasing values() lists is still not possible? Ever try a union of self joined tables? Is there any hope of this being put in place in the future?

in reply to:  6 ; comment:8 by Karthik, 12 years ago

Replying to UloPe:

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

I agree. I'm reading this thread because I have this exact same problem at the moment.

comment:9 by Aymeric Augustin, 12 years ago

Triage Stage: Design decision neededAccepted

Unless I missed something, there isn't any objection to adding this feature to values.

in reply to:  8 comment:10 by erik.telepovsky, 12 years ago

Replying to Karthik:

Replying to UloPe:

I would disagree with the observation that .values() is just an optimization. In combination with .annotate() it is vital if you need to annotate / group by multiple fields. This is also one of the use cases where the missing aliases are most painful.

Example:

>>> Region.objects.values(
    "name", 
    "orders__items__product__category__parent__name"
).annotate(quantity = Sum("orders__items__quantity"))
[
    {'quantity': 10, 'name': u'...', 'orders__items__product__category__parent__name': u'Something'},
    {'quantity': 20, 'name': u'...', 'orders__items__product__category__parent__name': u'Something else'},
]

I agree. I'm reading this thread because I have this exact same problem at the moment.

I agree as well. It is important to have alias functionality in values() method.

comment:11 by Anssi Kääriäinen, 12 years ago

Quick observation: it might be better to have the aliasing feature as separate queryset operation:

Region.objects.alias(
    orders__items__product__category__parent__name="parentname"
).values(
    "name", "parentname"
).annotate(quantity=Sum("orders__items__quantity"))

Why this way? There are a couple other places where aliasing could be useful. For example in multijoin situations you could explicitly define if you want to filter the same join or different join by using the same alias or two different aliases (currently the way is "if filtered in same .filter() condition, then same joins, else different joins). Also, this way you might be able to inject extra SQL directly into the query:

Region.objects.alias(
    somesql=RawSQL("case when somecol > 0 then 1 else -1 end")
).order_by('somesql')

Of course, this ticket should not try to do more than just the bare minimum to get the aliasing to work for .values(). The point is aliasing could be useful in other operations, too, so lets be prepared for that.

in reply to:  9 comment:12 by Wraithan, 11 years ago

Replying to aaugustin:

Unless I missed something, there isn't any objection to adding this feature to values.

So my understanding is that this ticket is being held back by touching values_list as well as values. So a patch only affecting values (as well as docs and tests of course) is what is needed to get accepted?

comment:13 by Russell Keith-Magee, 11 years ago

@Wraithan As far as I can make out, yes. I can't see any benefit to having API consistency between values() and values_list() - if only because values_list() already accepts a keyword argument, which introduces a whole world of pain in the API (what if you want an alias called flat?).

The "values() is an optimisation" argument doesn't hold water for me. Yes, it's an optimisation. The source of the optimisation is passing less information on the wire during the database query (i.e., only returning two of 15 fields). An alias is already being used for this operation - it just isn't user specified. Making it user specified is a single dictionary lookup in a couple of key locations is a minor change in implementation with huge usability benefits.

So - clean up the patch and we can get this into trunk.

comment:14 by Marc Tamlyn, 11 years ago

Agreed this should be possible.

As a side note Russ - I've found the main optimisation for using values was in fact avoiding calling Model.__init__ a few thousand times, not the lack of data on the wire.

comment:15 by django@…, 11 years ago

I've been wanting something like this for a while. Just another use case (when sending emails with pre-defined, admin-editable text.

text = Settings.objects.get(name='email_text').value  # text is something like "Hi {name}, you are {age} years old. You work at {job}"

email_texts = [text.format(**kw) for kw in Person.objects.all().values(name='name', age='age', job="job__name")]

comment:16 by Ben Davis, 11 years ago

Cc: Ben Davis added
Version: 1.3master

I've created a branch on github for a fix against the latest version of django: https://github.com/bendavis78/django/tree/issues/16735

The patch can be found here:
https://github.com/bendavis78/django/commit/cdff83bed850a631e7c6d3cb12359b9b1d3e9bc4

This patch only changes values() and not values_list().

comment:17 by paveluc.alexandr@…, 11 years ago

Will this patch be included in next Django release?

comment:18 by Simon Charette, 11 years ago

Unfortunately this feature didn't make it to the 1.7.x branch before it was feature frozen.

in reply to:  18 comment:19 by Abdulhaq Emhemmed, 10 years ago

Replying to charettes:

Unfortunately this feature didn't make it to the 1.7.x branch before it was feature frozen.

Since the feature freeze deadline for v1.8 is approaching, will this be added to 1.8??

I tried @bendavis78 patch and it works fine!

comment:20 by Tim Graham, 10 years ago

The patch likely needs to be updated to apply cleanly and then we need a pull request so the patch can be reviewed.

comment:21 by Josh Smeaton, 10 years ago

Just quickly, with the changes to annotate that have landed, it is now possible to create aliases yourself, and reference them from the values call:

Model.objects.annotate(my_alias=F('some__long__name__to__alias')).values('my_alias')

This is fairly similar to the suggestion Anssi made upthread of using a new method alias(). Not sure if this would be preferable though, as it is more verbose and requires an extra round of queryset copying.

comment:22 by Tim Graham, 9 years ago

Owner: Nate Bragg removed
Status: assignednew
Summary: Queryset values should be aliasableQuerySet.values() should be aliasable

comment:23 by Tim Graham, 9 years ago

Related to "Allow expressions in values() queryset method (#25871)" and django-developers discussion.

comment:24 by Ian Foote, 8 years ago

Owner: set to Ian Foote
Status: newassigned

comment:25 by Tim Graham, 8 years ago

PR with some comments for improvement.

comment:26 by Ian Foote, 8 years ago

Resolution: wontfix
Status: assignedclosed

Following the further discussion on django-developers, I think the consensus is not to support aliasing beyond an explicit .values(alias=F('field')) as supported since #25871 was fixed. I'm therefore updating this to wontfix.

comment:27 by Ionuț Ciocîrlan, 8 years ago

This being marked as wontfix is confusing. It's in fact been fixed by #25871, which allows for richer aliases.

For anyone landing here jumping to the conclusion the ticket was rejected, it wasn't: see Ian's note above -- it's available starting with 1.11.

comment:29 by Tim Graham, 8 years ago

Summary: QuerySet.values() should be aliasableQuerySet.values() should be aliasable using strings

I've clarified the ticket summary to reflect what was wontfixed.

Note: See TracTickets for help on using tickets.
Back to Top