Changes between Initial Version and Version 1 of Ticket #22757, comment 7
Timestamp: Nov 3, 2023, 8:52:54 PM (14 months ago)
Ticket #22757, comment 7
Thanks for having a shot at this Nolan!

I think that a lot of the complexity here stems from the fact the ORM doesn't support polymorphic querysets, that is, querysets that can return multiple models.

This poses a challenge from an implementation perspective for this issue because the (simplified here) return signature of `get_prefetch_querysets` must contain three members:

1. A collection of prefetching querysets
2. A callable that takes a model instance returned from the prefetching querysets and returns a tuple uniquely identifying the item (that's `lambda obj: (obj.pk, obj.__class__)`)
3. A callable that takes the model instances that objects are prefetched for and returns a tuple uniquely identifying the related items (that's `gfk_key`)

The `prefetch_related` logic then executes all the querysets from 1., builds a map from 2., and intersects it with 3. to do the proper `_prefetch_related_cache` assignments.

Since Django only allows a single model class to be used to construct objects returned from a queryset (remember the lack of polymorphism support), one might be tempted to annotate objects with a `_content_type_id` attribute (e.g. using `Case(When)` like we do in `bulk_update`, based on the id) so the logic in 2. can be changed to `lambda obj: (obj.pk, self.get_content_type(id=obj._content_type_id, using=obj._state.db).model_class())` and the proper mapping occurs. Doing so doesn't solve the problem of proper `__class__` assignment as you've brought up Nolan, nor does it address another edge case that would probably have been caught during review.
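Purely as an illustration of that annotation idea, here's a rough standalone sketch; the helper names are made up, and it assumes the instances being prefetched for use the default `content_type`/`object_id` field names:

{{{#!python
# Sketch only, not code that exists in Django.
from django.contrib.contenttypes.models import ContentType
from django.db.models import Case, IntegerField, Value, When


def annotate_content_type_id(queryset, instances):
    # Tag every prefetched row with the content type id of the instance that
    # points at it, mirroring the Case(When) trick used by bulk_update().
    whens = [
        When(pk=instance.object_id, then=Value(instance.content_type_id))
        for instance in instances
    ]
    return queryset.annotate(
        _content_type_id=Case(*whens, output_field=IntegerField())
    )


def instance_key(obj):
    # Stand-in for the `lambda obj: (obj.pk, obj.__class__)` callable from 2.,
    # resolving the model class from the annotation instead.
    content_type = ContentType.objects.db_manager(obj._state.db).get_for_id(
        obj._content_type_id
    )
    return (obj.pk, content_type.model_class())
}}}

Note that this simple mapping already falls apart in the proxy edge case below, since a single annotated row can only carry one `_content_type_id`.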
Even if we're able to return a specialized `QuerySet` that overrides the `_iterable_class` to return mixed models ([https://github.com/django-polymorphic/django-polymorphic/blob/5111fb070e4e0b0153e816d163871f2039ca2ae2/polymorphic/query.py#L25 it's the tactic used by packages like django-polymorphic] for example) we'd have to deal with the case where the same id needs to be prefetched both concretely and for its proxy.

In other words, given the reported scenario, what kind of SQL query would we generate for

{{{
…
}}}

In this case the same `foo_child` row needs to be returned twice, once as a `Child` and once as a `ProxiedChild`, which cannot be achieved with a simple `id IN (1, 1)` to ''span'' two rows.

The only way I can think of dealing with this edge case is by using a `UNION ALL` with a combined annotation so that the following SQL would be generated

{{{
…
}}}

And that must then be combined with the `_iterable_class` logic mentioned above to create model instances with the proper model class based on the `_content_type_id` annotation.

There's even more fun now that we've added `GenericPrefetch` in 5.0: it allows specifying a queryset for each of the models involved, which means querysets with `select_related` and annotations can be passed, resulting in varying `SELECT` statements that cannot use the `UNION ALL` approach described above, so the ''optimization'' would have to be disabled (see the sketch at the end of this comment).

After looking at this in more detail I also agree that the added complexity to avoid this query is likely not worth it, as it would either require a potentially expensive Python workaround or significant boilerplate to get working properly. I suspect we'll want to ''wontfix'' this one unfortunately, at least until someone can come up with a clever way of achieving the above efficiently or the ORM gains polymorphic capabilities, which is a long shot.
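To make that `GenericPrefetch` point concrete, here is the kind of call that rules out the `UNION ALL` trick; the model names and queryset arguments are illustrative only:

{{{#!python
# Bookmark/Animal/TaggedItem are hypothetical models, with a
# GenericForeignKey named content_object on TaggedItem.
from django.contrib.contenttypes.prefetch import GenericPrefetch
from django.db.models.functions import Upper

TaggedItem.objects.prefetch_related(
    GenericPrefetch(
        "content_object",
        [
            # select_related() joins extra tables into the SELECT clause...
            Bookmark.objects.select_related("owner"),
            # ...and annotations add computed columns, so the per-model
            # querysets no longer share a column layout that a UNION ALL
            # could combine.
            Animal.objects.annotate(upper_name=Upper("name")),
        ],
    )
)
}}}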