Opened 17 years ago
Closed 14 years ago
#5768 closed (fixed)
Allow QuerySet.values() to return values spanning joins (for multi-valued relations)
Reported by: | Owned by: | ||
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Keywords: | values, values_list, one-to-many, many-to-many reverse | |
Cc: | trevor@…, miracle2k, Tai Lee | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I'd like to write:
class Tag(models.Model): name = models.CharField(maxlength=50) class Thing(models.Model): tags = models.ManyToManyKey(Tag) Thing.objects.filter(tags__name).values('tags.name').distinct()
My use case is slightly more complicated (I use django-tagging and the GenericRelation does not make things simpler).
Of course I could do it manually with a loop, but it is necessary to have a queryset for some things, such as using a generic view.
Attachments (2)
Change History (27)
comment:1 by , 17 years ago
comment:2 by , 17 years ago
Keywords: | qs-rf added |
---|---|
Triage Stage: | Unreviewed → Accepted |
Probably not a completely crazy idea and not too hard to do in queryset-refactor now.
comment:3 by , 17 years ago
Component: | Uncategorized → Database wrapper |
---|
comment:4 by , 17 years ago
Keywords: | qs-rf removed |
---|
I've marked #3358 as a duplicate of this. We can implement it eventually, but probably not as part of the queryset-refactor work, since it's not really a show-stopper at the moment.
A couple of thoughts on the implementation:
- we should use
foo__bar
as the format for specifying the names, so that it's consistent with filtering and the new cross-modelorder_by()
syntax. - If we allow many-to-many or reverse one-to-many relations to be specified, we should restrict it to a maximum of one such relation in each
values()
call. The problem here is one of efficiency: if relation 1 has n1 rows and relation 2 has n2 rows in their result sets, doing a single, non-union query will result in n1*n2 rows being returned. Not cool if n1 and n2 are something like 100 each. If we do a union query, it's not more efficient than doing two separate queries, except we'll have to pay some computation penalty inside Django every time to work out when this is needed. So we don't permit that and the few cases where it's needed can be done with separate queries and dictionary updates.
So I'm going to remove the qs-rf keyword, but any patch for this should wait until after that branch is merged, since it will touch stuff that bears no resemblance to the current trunk code.
comment:5 by , 17 years ago
Cc: | added |
---|
comment:6 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:7 by , 17 years ago
Has patch: | set |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
Support of related models in QuerySet.values() was implemented as part of ticket #5420. See following patch:
http://code.djangoproject.com/attachment/ticket/5420/queryset_fields_trunk.diff
Patch changes signature of this method from values(*fields)
to values(*fields, **related_fields)
but the change is backward compatible.
See more details about this patch at: http://www.mysoftparade.com/blog/django_orm_performance_patch/
comment:8 by , 17 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
This isn't fixed. A ticket is fixed when code for it has been committed to the appropriate branch in subversion.
Also, this isn't related to #5420 (your approach there is mixing multiple things). Reopening.
comment:9 by , 17 years ago
Also, for anybody working on this, please see the end of comment 4: patches should either be against queryset-refactor or wait until after that is merged into trunk. The current ValuesQuerySet implementation is quite different in the new code.
comment:10 by , 17 years ago
Owner: | removed |
---|---|
Status: | reopened → new |
comment:11 by , 17 years ago
(In [7230]) queryset-refactor: Refactored the way values() works so that it works properly
across inherited models.
Completely by accident, this also allows values() queries to include fields
from related models, providing it is crossing a single-valued relation
(one-to-one, many-to-one). Many-to-many values() fields still aren't supported,
since that requires actual thinking. So this refs #5768.
comment:12 by , 16 years ago
Summary: | Allow QuerySet.values() to return values spanning joins. → Allow QuerySet.values() to return values spanning joins (for multi-valued relations) |
---|
comment:13 by , 16 years ago
milestone: | → 1.2 |
---|
Tentatively pushing to 1.2 as this lines up with defer()
and only()
, covering the missing ground.
comment:14 by , 16 years ago
milestone: | 1.2 |
---|
Removing milestone, lest people become confused by thinking there's any commitment to this in any particular timeframe. It will be committed when it's done, whenever that might be.
comment:15 by , 16 years ago
Cc: | added |
---|
comment:16 by , 14 years ago
Cc: | added |
---|
Adding a milestone might also prompt someone to work on or review this ticket within the next 1.5 years or major release cycle.
by , 14 years ago
Attachment: | 5768-m2m-values-r14618.diff added |
---|
comment:17 by , 14 years ago
Keywords: | values values_list one-to-many many-to-many reverse added |
---|---|
Needs documentation: | set |
Version: | 0.96 → SVN |
The patch I've just uploaded doesn't add any new functionality, but simply exposes the existing functionality with an allow_m2m
argument on values()
and values_list()
. It should be completely backwards compatible, and I've added tests for forward relations (many-to-one) which were already possibly (but missing tests, unless I just couldn't find them) as well as reverse relations (one-to-many and many-to-many).
I ran the entire test suite with allow_m2m=True
by default without any breakage, but I've left it as allow_m2m=False
by default in the patch just to make sure that users aren't accidentally surprised when their call to values()
returns more results than there are in the original queryset due to the reverse relations, by requiring users to actively specify that they want to lookup values on these relations.
I still need to add docs to the patch, but would like to hear feedback before I do so about any concerns that anyone may have with the implementation.
comment:18 by , 14 years ago
This looks pretty good to me, though I'm not entirely sold on allowing for combinatorial explosions.
It seems to me that if we implement Malcolm's suggestion of limiting it to a single m2m relation per query, we could probably remove the new allow_m2m keyword arg altogether; it's not backwards-incompatible to have some additional fieldnames work that didn't before. Without the limit, I think the allow_m2m arg is necessary, just to force people to look at the docs and see the giant red warning that "you might be in for a combinatorial explosion if you use this."
I'd like to hear from some others on this point -- could you bring it up on the -developers mailing list?
Re the patch: the copy-pasted extra-kwargs TypeError in .values() still says "values_list" in its text. Also, there are already several tests for forward relations in .values(): some incidental, the one specifically testing it is in modeltests/many_to_one. However, I'm still not opposed to adding tests for that here; it seems best if test_values() actually tests the full range of .values() functionality.
comment:19 by , 14 years ago
Conclusion from IRC discussion with Russell and mrmachine is that combinatorial explosions are already possible in the ORM, and will only happen here under certain circumstances of dataset size and fields used in combination, so it's acceptable to eliminate the allow_m2m kwarg and just allow (any number of) m2ms in values/values_list, with a note in the docs about the possible explosion in returned dataset size.
comment:20 by , 14 years ago
Needs documentation: | unset |
---|
Updated the patch to allow m2m by default, with docs.
comment:21 by , 14 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:22 by , 14 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Ready for checkin → Accepted |
I'm getting failures in the new tests on Postgres; appears to be a unicode/str issue. Shouldn't be too hard to fix, don't have time to dig into it at the moment.
by , 14 years ago
Attachment: | 5768-m2m-values-r14639.diff added |
---|
comment:23 by , 14 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Apologies, I should have ran the tests for postgresql and mysql as well. The problem was that assertQuerysetEqual
is dependant on the order of results, and with no explicit ordering applied the results were different in postgresql. I've updated the tests (and also fixed another typo I missed earlier) and rerun the lookup tests on sqlite, postgresql and mysql.
comment:24 by , 14 years ago
Full test suite for sqlite and postgresql just passed, too. I didn't do mysql after I found out it was going to take 10+ hours or have a bunch of other failures.
comment:25 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Here is the use case with django-tagging: