Opened 5 years ago

Closed 5 years ago

#30842 closed Cleanup/optimization (duplicate)

Prefetch_related spends considerable time constructing querysets.

Reported by: Alex Aktsipetrov
Owned by: nobody
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: prefetch_related
Cc: Simon Charette
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Alex Aktsipetrov)

As part of the bugfix, Django started constructing N+1 querysets during a typical prefetch_related call, even though only 2 SQL queries are executed.

This adds a noticeable slowdown. Attaching a flamegraph for a queryset fetching 100 objects: roughly 2/3 of the time is spent constructing querysets.
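
To make the description concrete, here is a minimal toy sketch in plain Python (no Django involved) of the shape of the problem: one bulk fetch, followed by one queryset object per parent that exists only to hold rows already fetched. `ToyQuerySet` and `toy_prefetch` are hypothetical names invented for illustration; they do not mirror Django's internals, and the real per-queryset cost is much higher than a counter increment.

```python
class ToyQuerySet:
    """Hypothetical stand-in for a queryset; counts how many are built."""

    constructed = 0  # class-level counter of constructed instances

    def __init__(self, filters):
        ToyQuerySet.constructed += 1
        self.filters = filters
        self.result_cache = None


def toy_prefetch(parent_ids, rows):
    """Mimic the shape of a prefetch: one bulk fetch, then one queryset per parent."""
    parent_ids = list(parent_ids)
    # The single bulk queryset corresponds to the one extra SQL query.
    ToyQuerySet({"parent_id__in": parent_ids})

    # Group the fetched (parent_id, child) rows by parent.
    by_parent = {}
    for parent_id, child in rows:
        by_parent.setdefault(parent_id, []).append(child)

    caches = {}
    for parent_id in parent_ids:
        # One more queryset per parent, built only to carry cached rows.
        qs = ToyQuerySet({"parent_id": parent_id})
        qs.result_cache = by_parent.get(parent_id, [])
        caches[parent_id] = qs
    return caches
```

With 100 parents this toy version builds 101 `ToyQuerySet` objects, which is the N+1 pattern described above, even though only the bulk fetch stands for a database round trip.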

Attachments (1)

prefetch.svg.gz (77.0 KB) - added by Alex Aktsipetrov 5 years ago.


Change History (8)

by Alex Aktsipetrov, 5 years ago

Attachment: prefetch.svg.gz added

comment:1 by Alex Aktsipetrov, 5 years ago

Description: modified (diff)

comment:2 by Alex Aktsipetrov, 5 years ago

Unfortunately, I haven't been able to produce a patch for this yet. I experimented a bit with making queryset construction lazy, but that seems like an excessively large change for such an issue.

comment:3 by Simon Charette, 5 years ago

Triage Stage: Unreviewed → Accepted

Thanks for the report, Alex. This was a concern raised during the implementation: https://code.djangoproject.com/ticket/26226#comment:2.

I'm tentatively accepting as we should definitely address this if possible.

Ideally, only proxies to the original queryset would be created, so that the querysets themselves are built only if needed. FWIW, N querysets were created when prefetching even before c92123cc1dceeb800b3b8900e2e530ed19d78863. It's true that this commit made matters worse, though, by performing an additional filter call.

I wonder if some form of local memoization per related manager class, calling manager._apply_rel_filters only once per manager type and relying on queryset cloning, could speed things up a bit here. Happy to give it a try if that can get you started, Alex.
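
The proxy idea in this comment could look roughly like the following sketch, written in plain Python without Django. Everything here is an assumption made for illustration: `ExpensiveQuerySet`, `LazyRelatedResult`, and the `queryset_factory` callable are hypothetical names, and a real proxy would need to cover far more of the QuerySet interface. The sketch only shows the deferral itself: prefetched rows are served directly, and the queryset is built the first time an operation actually needs it.

```python
class ExpensiveQuerySet:
    """Hypothetical stand-in for a queryset that is costly to construct."""

    built = 0  # class-level counter of constructed instances

    def __init__(self, rows):
        ExpensiveQuerySet.built += 1
        self.rows = list(rows)

    def filter(self, predicate):
        # Like a real queryset, filtering returns a new object.
        return ExpensiveQuerySet(row for row in self.rows if predicate(row))


class LazyRelatedResult:
    """Hypothetical proxy: serves cached rows, builds the queryset on demand."""

    def __init__(self, rows, queryset_factory):
        self._rows = rows
        self._factory = queryset_factory
        self._queryset = None

    def __iter__(self):
        # Iteration only touches the prefetched rows.
        return iter(self._rows)

    def __len__(self):
        return len(self._rows)

    def _materialize(self):
        # Build the underlying queryset at most once.
        if self._queryset is None:
            self._queryset = self._factory()
        return self._queryset

    def filter(self, predicate):
        # Anything beyond reading cached rows needs the real queryset.
        return self._materialize().filter(predicate)
```

In this sketch, code that only iterates over prefetched results never pays the construction cost, while a later `filter` call triggers it once and reuses the result.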

comment:4 by Simon Charette, 5 years ago

Cc: Simon Charette added

comment:5 by Alex Aktsipetrov, 5 years ago

Has patch: set

comment:6 by Alex Aktsipetrov, 5 years ago (in reply to: comment:3)

Replying to Simon Charette:

> Ideally, only proxies to the original queryset would be created, so that the querysets themselves are built only if needed.

I don't think we can really defer the creation, since such a proxy would have to share lots of features with QuerySet.
We could probably create a proxy that defers just the filter calls, though.

Instead, I've tried modifying QuerySet itself; see the PR. That seems simpler from an implementation perspective and also gives more control over copying.

> I wonder if some form of local memoization per related manager class, calling manager._apply_rel_filters only once per manager type and relying on queryset cloning, could speed things up a bit here. Happy to give it a try if that can get you started, Alex.

If you think this is a more promising approach, please do. I haven't found a clean way to do it.

Last edited 5 years ago by Alex Aktsipetrov

comment:7 by Simon Charette, 5 years ago

Resolution: duplicate
Status: new → closed

I discovered that #20577 pushes for the same idea of deferring the application of _apply_rel_filters. I think we should close this ticket as a duplicate and move the discussion there.
