#16855 closed Bug (fixed)
select_related doesn't chain as expected
Reported by: | Leo Shklovskii | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | jdunck@…, erik.wognsen@…, seldon, Trey Hunner, hrehfeld@…, tomek@…, marc.tamlyn@…, mike@…, i.virabyan@…, real.human@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
If I'm doing something like:
Foo.objects.select_related('fieldone').select_related('fieldtwo')
I would expect the select related to function on both fieldone
and fieldtwo
. What actually happens is that the select_related only works on fieldtwo
. This is a problem for compositing queries at runtime (for example in search filters) and is a departure from how I would expect queryset methods to work.
If the design decision is to keep it this way, there should be a note in the docs pointing this out.
Change History (39)
comment:1 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 13 years ago
Jeremy Dunck observes in this django-developers thread that in the current implementation the "depth" setting is apparently sticky, while fields are not, and gives a more detailed outline of the desired behavior of depth in combination with multiple calls to select_related.
comment:3 by , 13 years ago
Thanks Carl, that approach makes a lot of sense to me and would solve my use case. At the moment I have to go dig at the 'private' data structure that's holding the depth and list of fields to act on in the queryset object to get what I need. It sort of works but is a really fragile hack. If this behavior could get fixed in an upcoming 1.3.x release that would be fantastic.
comment:4 by , 13 years ago
Hi Leo - thanks for confirming that the proposed fix would meet your use case! Unfortunately it's quite unlikely that this fix will make it into a 1.3.X release. In our current documented release process, the last stable branch receives only critical bugfixes (security, data-loss, or crashing bugs). This doesn't fit into any of those categories.
This could certainly still make it into 1.4, but that would require someone to put together a patch in the near future.
comment:5 by , 13 years ago
Cc: | added |
---|
comment:6 by , 13 years ago
Cc: | added |
---|
I have a problem where using select_related with field arguments in the queryset method of my ModelAdmin doesn't work.
Could this bug also be the cause of that problem because select_related is applied again later by the admin, clearing my field choices?
comment:7 by , 13 years ago
Cc: | added |
---|
comment:8 by , 13 years ago
Cc: | added |
---|
comment:9 by , 13 years ago
By no means I am an experienced Django user, but this https://gist.github.com/2549147 seems to fix this issue on my single usecase. No further testing done. Can someone verify this?
comment:10 by , 13 years ago
Cc: | added |
---|
comment:11 by , 13 years ago
Cc: | added |
---|
comment:12 by , 13 years ago
Has patch: | set |
---|
I've created pull request https://github.com/django/django/pull/124
comment:13 by , 13 years ago
Anssi Kääriäinen left some good comments on my pull request. Now I think there's need to extend this bug to consider following scenario:
# should get all immediate relations AND the Order Species.objects.select_related(depth=1).select_related('genus__family__order')
comment:14 by , 13 years ago
Patch needs improvement: | set |
---|
comment:15 by , 13 years ago
I am afraid that dealing with the case of select_related(depth=1).select_related('genus__family__order')
will be a lot more complicated than the patch you have currently.
I would look into trying to turn the select related descents (in db/models/query.py:get_klass_info, db/models/sql/compiler.py:fill_related_selections) into two phase implementations. In the first phase, turn the depth into lookups, in the second phase just add the lookups, and remove the depth arguments from the current descent implementations. Of course, there are numerous ways forward here. If you spot some easier way, use it instead.
comment:16 by , 13 years ago
Cc: | added |
---|
comment:17 by , 12 years ago
Cc: | added |
---|
comment:18 by , 12 years ago
Cc: | added |
---|
comment:19 by , 12 years ago
One solution might be to deprecate the 'depth' argument to select_related()
. I never thought it was a good idea, and I don't think I've ever used it in practice, because it could so easily start doing the wrong thing if you change your schema. It also just seems wrong - if you are using select_related()
, you are making specific fine-tunings to the SQL produced. To do that using a broad "just add joins to a depth = n" instruction seems bizarre.
If depth
was deprecated, I don't think it would be too bad to have bugs related to it (since we've already got such bugs) until it is finally removed, and then we can avoid fixing the combination.
Also, while I'm here, we should make the API for clearing match prefetch_selected()
and defer()
i.e. pass None
as single argument to clear.
comment:22 by , 12 years ago
+1 from me, too. It is dangerous to use. In addition, the select_related code will be a little bit cleaner if depth was removed (not much, but even a little is something...)
comment:23 by , 12 years ago
Proposed deprecation of depth
to go into 1.5 release on django-dev list at https://groups.google.com/forum/?fromgroups=#!topic/django-developers/K3Fzqw9RRgg
comment:25 by , 12 years ago
Given the overwhelmingly positive response from core devs to deprecating depth
, I went ahead and made the PR: https://github.com/django/django/pull/488
comment:28 by , 12 years ago
If we are deprecating the depth argument, we need to remove our own usage of it, or deprecate APIs that use it. The admin uses it:
I think we should deprecate this entirely, and encourage people instead to implement ModelAdmin.get_queryset
in order to add select_related
or prefetch_related
calls.
We also should fix the admin so that it does the right thing when reacting to list_display
- currently it is pretty stupid:
We can either make it more intelligent, or remove this behaviour entirely, but the latter is more likely to give performance regressions.
comment:29 by , 12 years ago
I agree it would be nice to tidy up the use of select_related in the admin - though we do need to make sure it remains intelligent, I know a lot of my sites depend on that performance gain.
To clarify though, as far as I can tell none of Django is broken by the removal of the depth kwarg - it's still acceptable at present (if inadvisable) to pass no arguments to select_related()
and have Django auto-follow all of the FKs (i.e. depth=infinity). To remove that as well is more likely to be a major backwards incompatibility.
This does beg the question as to what the correct API should be for chaining and clearing - is it ok to have select_related()
and select_related(None)
do different things? (former would add all automatically, latter would clear list.)
comment:30 by , 12 years ago
Ah - I was actually thinking we should only have select_related(*fields)
, and remove the behaviour where no arguments means 'select everything'. I think that has the same problems as selecting to an arbitrary depth. I was actually thinking that the current implementation worked by having 'depth=5' as the default, so select_related()
was the same as select_related(depth=5)
, and the former would be automatically deprecated along with the latter.
In fact, the depth = 5
idea is implemented at a lower level - in django/db/models/sql/query.py. So it still doesn't do infinite traversal AFAICS.
comment:31 by , 12 years ago
How about changing list_select_related to accept a list of selections. That way you could pretty easily define the needed select related stuff.
... list_display = 'my_fk_field' list_select_related = ('my_fk_field', 'my_fk_field__more')
Raise deprecation warning for True in there.
get_queryset() override works, too, though the admin seems to prefer the declarative attribute style...
Intelligent select_related is somewhat hard to do correctly, although that depends on the definition of what intelligent select_related means...
I am in favour of getting rid of unlimited or depth=5 select_related() calls altogether.
comment:32 by , 12 years ago
"Intelligent" select_related
in the admin should just be a case of ensuring all the columns on the list view which are foreign keys are passed into a select_related
call. I haven't looked at the practicalities of this, but I know that the statements needed are the same as those needed for ordering by that column, so we should be able to reuse some code (and the admin_order_field
property for methods which create columns).
I like the new version of list_select_related
.
There are a few uses of a blank select_related
in the auth code, and a stack of them in the tests but these should just be a case of working out which fields need to change and updating them.
I've posted on the dev group again to clarify this change in deprecation - to me it's a bigger deal that just removing depth
now and I'm probably only +0 on it.
comment:33 by , 12 years ago
Implementing such an intelligent detection would be hard; think of the list_display
callbacks that may access foreign objects.
# models.py class City(models.Model): name = models.CharField(max_length=30) def __unicode__(self): return self.name class Street(models.Model): name = models.CharField(max_length=30) city = models.ForeignKey(City) def __unicode__(self): return "%s, %s" % (self.name, self.city) # admin.py class StreetAdmin(admin.ModelAdmin): list_display = ('__unicode__', 'display_city',) list_select_related = True def display_city(self, obj): city = obj.city return '<a href="%s">%s</a>' % (city.get_absolute_url(), city) display_city.allow_tags = True
If we replace the list_select_related = True
behavior to an automatic detection we should introduce another property (inspired by admin_order_field
) on those callbacks, say admin_list_select_related
, to expose which field the intelligent mechanism should select. Otherwise people will have to override get_queryset
to achieve what was automatically done by specifying list_select_related = True
prior to this change.
FWIW I prefer the list_select_related
change proposed by @akaariai (list of selection) since it's way more declarative and less error prone.
comment:34 by , 12 years ago
comment:35 by , 12 years ago
I was thinking about this when I committed the fix for #19080 - I think it should be reasonable (and a good compromise solution to this issue) to have effectively two reset-mechanisms on this function. With depth gone after 1.6, there will be three ways of calling select_related
:
select_related('foo', 'bar')
is the preferred usage, where you specify the fields. Subsequent calls of this form will append to each other.
select_related()
removes list and replaces with "all" (actually roughly the same as original depth=5).
select_related(None)
removes list completely.
I think this will be pretty consistent with the existing behaviour (given multiple calls with names are pretty weird now), and also more consistent with other similar functions (like prefetch_related
).
comment:36 by , 11 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
PR from Marc.
comment:37 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:38 by , 11 years ago
Please do not further reduce the functionality of select_related()
by removing the no-arguments variant (select all related, with a high hard coded depth limit to prevent recursive joins).
I use select_related()
with no arguments all the time, as I have found that in almost all my use cases, following too many joins and returning more data than may be necessary is still much more performant than returning too little data, and subsequently having template authors generate additional queries by accessing related fields that were not selected in the view.
It also saves the view programmer from tediously listing larger numbers of related fields when they legitimately want to select most or all related fields, and avoids the need for view programmers to know in advance which related fields a template author will want to access.
I much rather fallback to naming explicit fields in select_related()
only when there is an actual performance problem because of too many joins, instead of premature optimisation by trying to explicitly list all relations, and having to re-visit that code when a template or model changes.
comment:39 by , 11 years ago
Cc: | added |
---|
Agreed.