#35947 closed Cleanup/optimization (wontfix)
prefetch_related makes duplicate queries for the same records by different relations
Reported by: | Jake Douglas | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 5.1 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Prefetching the same records via different relations using prefetch_related
results in duplicated queries for the same set of records. This is a problem because many web applications traverse relations by many different paths, even in the same request. The smallest example I could reproduce follows:
from django.db import models # Create your models here. class House(models.Model): pass class Story(models.Model): house = models.ForeignKey(House, related_name = "stories", on_delete = models.CASCADE) class Room(models.Model): story = models.ForeignKey(Story, related_name = "rooms", on_delete = models.CASCADE) house = models.ForeignKey(House, related_name = "rooms", on_delete = models.CASCADE) class Window(models.Model): story = models.ForeignKey(Story, related_name = "windows", on_delete = models.CASCADE) room = models.ForeignKey(Room, related_name = "windows", on_delete = models.CASCADE) House.objects.prefetch_related( "rooms__windows", "stories__rooms__windows" ).all()
This results in something like this:
SELECT ••• FROM "houses_house" SELECT ••• FROM "houses_room" WHERE "houses_room"."house_id" IN (1) SELECT ••• FROM "houses_window" WHERE "houses_window"."room_id" IN (1) # DUPLICATE SELECT ••• FROM "houses_story" WHERE "houses_story"."house_id" IN (1) SELECT ••• FROM "houses_room" WHERE "houses_room"."story_id" IN (1) SELECT ••• FROM "houses_window" WHERE "houses_window"."room_id" IN (1) # DUPLICATE
The same behavior occurs when the last relation in the chain is one-to-one.
Change History (2)
comment:1 by , 72 minutes ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Type: | Uncategorized → Cleanup/optimization |
comment:2 by , 28 minutes ago
All that to support the likely rare cases where the same relationship is prefetched through different paths and happens to target the exact set of objects
For the record, here's an example request from our production application:
(313 queries including 279 similar and 242 duplicates)
I don't think we can fix this without a significant refactor to how prefetch related work given the Django ORM does not maintain a map of object identities by related fields so I'm going to won't fix this one.
In order to implement such a feature the prefetch related logic would have to pass a memoization object of the form
dict[tuple[type[Model], models.Field], dict[Any, Model]]
that maps model types and from field to another map of from field value to model instances toprefetch_related_objects
andprefetch_one_level
. All that to support the likely rare cases where the same relationship is prefetched through different paths and happens to target the exact set of objects (as any difference would require an extra query anyway).If you disagree with the resolution of this ticket feel free to open a thread on the forums about the benefits vs complexity tradeoff of this requested optimization.