Opened 4 years ago

Closed 3 years ago

Last modified 2 years ago

#32946 closed Cleanup/optimization (fixed)

Prefer non-kwargs construction of dynamically generated Q objects

Reported by: Keryn Knight Owned by: Keryn Knight
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

There are a number of places where Q objects are created dynamically to do things like construct a big OR expression, but they do so in the historically available API way; because once upon a time there was no way to pass the connector in the Q constructor, to make a OR node would've required doing:

x = Q(a=1, b=2)
x.connector = Q.OR

which understandably was not the ideal from the general public API standpoint, and so everyone re-uses the | pattern.
But that changed way back when, in 508b5debfb16843a8443ebac82c1fb91f15da687 for #11964, and now it can be done as Q(a=1, b=2, _connector=Q.OR)

This is important because it's a substantially faster way of constructing a single Q object which does the right thing, because it ignores the complexity of cloning an instance and combining the parts into it. Here follows demonstration data.

The smallest data point to consider is that sending in kwargs is (unfortunately) slower because they're sorted (which I don't think is correct, particularly, but that's a separate ticket tbh):

In [2]: %timeit Q(a=1, b=2, c=3, d=4)
        1.85 µs ± 14.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [3]: %timeit Q(('a', 1), ('b', 2), ('c', 3), ('d', 4))
        1.2 µs ± 10.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [4]: Q(a=1, b=2, c=3, d=4) == Q(('a', 1), ('b', 2), ('c', 3), ('d', 4))
True

Now, one of the common ways to build a dynamic Q is to use functools.reduce and operator.or_, like the below, which is a facsimile of the usage in Django and elsewhere:

In [1]: from functools import reduce
   ...: from operator import or_
   ...: from django.db.models.query_utils import Q
In [2]: values = (Q(a=1), Q(b=2), Q(c=3), Q(d=4), Q(e__in=[1,2,3,4]))
In [3]: %timeit reduce(or_, values)
        12.3 µs ± 91 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [4]: %timeit Q(a=1, b=2, c=3, d=4, e__in=[1,2,3,4], _connector=Q.OR)
        2.02 µs ± 23.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [5]: %timeit Q(('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e__in', [1, 2, 3, 4]), _connector=Q.OR)
        1.4 µs ± 2.31 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In the reduce given above, that's 5 Q objects created initially, and then 4 additional Q objects created, vs 1 for the latter. Which is basically borne out by the timings, 1.4 * 9 == 12.6.

Another usage which comes up is to do something like Q(**{field.name: xxx}) when Q((field.name, xxx)) would suffice.

Or doing something like:

x = models.Q()
if y:
    x |= models.Q(**{field: '!'})
if z:
    x |= models.Q(**{field: '?'})

when it could be:

x = []
if y:
    x.append((field, '!'))
if z:
    x.append((field, '?'))
q = Q(x, _connector=Q.OR)

and so on.

I have a patch which targets most/all? of the constructions I think it might make sense to change, and assuming the CI passes we can discuss from there.

I also think that Q(a=1, b=2, _connector=Q.OR) and/or the reduce(...) should be documented in the https://docs.djangoproject.com/en/3.2/topics/db/queries/#complex-lookups-with-q section. Dynamic Q construction is fairly common IME, doesn't appear to be covered there, and ultimately points to Q test cases which also don't show any form of dynamic variants.

Change History (7)

comment:1 by Mariusz Felisiak, 4 years ago

Triage Stage: UnreviewedAccepted

Agreed.

I also think that Q(a=1, b=2, _connector=Q.OR) and/or the reduce(...) should be documented in the https://docs.djangoproject.com/en/3.2/topics/db/queries/#complex-lookups-with-q section. Dynamic Q construction is fairly common IME, doesn't appear to be covered there, and ultimately points to Q test cases which also don't show any form of dynamic variants.

Using Q(a=1, b=2, _connector=...) internally is fine, but I don't think we would like to encourage users to use it. I'd add an example with reduce() which was mentioned in several tickets.

comment:2 by Keryn Knight, 3 years ago

Has patch: set
Patch needs improvement: set

PR is https://github.com/django/django/pull/14699
As I write this, CI is still running, but I've set patch needs improvement based on not having removed the imports for functools/operator ...

comment:3 by Mariusz Felisiak, 3 years ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:4 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In 5b8ef8aa:

Refs #32946 -- Changed Query.add_filter() to take two arguments.

comment:5 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In 9662193:

Refs #32946 -- Changed internal usage of dynamic Q() objects construction to use non-kwargs initialization.

This prefers non-kwargs construction of dynamically generated Q()
objects to create a single Q() object instead of many and then
combining them, where possible.

comment:6 by Mariusz Felisiak, 3 years ago

Resolution: fixed
Status: assignedclosed

comment:7 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

In 9dff316b:

Refs #32948, Refs #32946 -- Used Q.create() internally for dynamic Q() objects.

Node.create() which has a compatible signature with Node.init()
takes in a single children argument rather than relying in unpacking
*args in Q.init() which calls Node.init().

In addition, we were often needing to unpack iterables into *args and
can instead pass a list direct to Node.create().

Note: See TracTickets for help on using tickets.
Back to Top