Opened 11 years ago

Closed 7 years ago

#20880 closed Cleanup/optimization (fixed)

Remove non-cloning logic from Query.clone() and QuerySet._clone()

Reported by: Anssi Kääriäinen Owned by: Tim Graham
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Anssi Kääriäinen)

QuerySet and Query clone() methods are doing more than just cloning, they are also doing set up for next operation, changing the class of the clone if requested, and even throwing away some attribute values.

To make the code a bit more readable the clone() method should be splitted to clone() part and pre_next_op() part. This also allows creating inplace querysets that work precisely like full-clone querysets. A queryset is called inplace if it doesn't do cloning at all between operations. This again allows speeding up certain internal operations like model.save() and prefetch_related(), and also allows users to optimise heavy-to-create querysets by avoiding clone() overhead.

Patch available from https://github.com/akaariai/django/tree/splitted_clone. The inplace queryset used for Model.save_base() speeds up model.save() by about 20% for a simple m = Model.objects.get(); m.save() benchmark. There are other places where similar optimisations could be used. Prefetch speedup is tracked in ticket #20577.

Change History (8)

comment:1 by Anssi Kääriäinen, 11 years ago

Description: modified (diff)
Triage Stage: UnreviewedAccepted

comment:2 by Anssi Kääriäinen, 11 years ago

I updated (force-pushed) to https://github.com/akaariai/django/tree/splitted_clone. The patch is pretty good (cleanup needed), and shows promising results for model saving and some other queryset operations:

Running 'model_save_existing' benchmark ...
Min: 0.014001 -> 0.010278: 1.3622x faster
Avg: 0.014304 -> 0.010394: 1.3763x faster
Significant (t=37.226713)


Running 'model_save_new' benchmark ...
Min: 0.013848 -> 0.010196: 1.3582x faster
Avg: 0.014079 -> 0.010408: 1.3527x faster
Significant (t=20.503974)


Running 'query_get' benchmark ...
Min: 0.027801 -> 0.023326: 1.1918x faster
Avg: 0.028116 -> 0.023599: 1.1914x faster
Significant (t=31.307197)

For qs_filter_chaining (~5 .filter() calls) using inplace results in speedup of:

Running 'qs_filter_chaining_inplace' benchmark ...
Min: 0.000892 -> 0.000553: 1.6132x faster
Avg: 0.000937 -> 0.000571: 1.6424x faster
Significant (t=16.654697)

This is by no means complex query, yet it shows promising results.

So, it is worth considering if .inplace() should be public API. If so, then it would make sense to guard against using stale versions of the queryset. This is doable by cloning the outer QuerySet. Skipping cloning of the internal Query, should give most of the speedup.

Another consideration is that if the ._clone() should be renamed to ._copy() or something. The patched version ._clone() does subtly different thing than the old version, so this might cause some grey hairs for those who use that internal API.

comment:3 by Tim Graham, 8 years ago

Owner: changed from nobody to Tim Graham
Status: newassigned

I've been rebasing Anssi's patch and will post a PR when it's polished. If I understand correctly, Josh proposed a similar idea to inplace().

comment:4 by Tim Graham, 8 years ago

Has patch: set

Here's a PR without the "inplace" stuff. I think we can leave that to a separate ticket/PR.

comment:5 by Tim Graham, 7 years ago

Summary: Split clone() to clone() and pre_next_op()Remove non-cloning logic from Query.clone() and QuerySet._clone()

#28455 is the follow up ticket for adding "inplace" querysets.

comment:6 by Tim Graham <timograham@…>, 7 years ago

In 66933a66:

Refs #20880 -- Removed non-cloning logic from QuerySet._clone().

comment:7 by Tim Graham <timograham@…>, 7 years ago

In 6155bc4a:

Refs #20880 -- Removed non-cloning logic from Query.clone().

comment:8 by Tim Graham, 7 years ago

Resolution: fixed
Status: assignedclosed
Note: See TracTickets for help on using tickets.
Back to Top