Opened 3 years ago
Closed 3 years ago
#33474 closed Cleanup/optimization (fixed)
Introduce slots definition to Variable and FilterExpression
Reported by: | Keryn Knight | Owned by: | Keryn Knight |
---|---|---|---|
Component: | Template system | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Another attempt at a case-by-case of the idea in #12826, given #33465 was OK'd.
In the Pull-Request for #33465 there was some discussion about adding slots defs to Node
subclasses, which isn't as straight-forward as one might hope (though I don't think it's impossible, just a bit more work, and a bit more of a sell to make :)) because of class attributes and mutation during parsing.
However, internal to VariableNode
are 2 layers of "indirection" to actually resolve a value, FilterExpression
and Variable
. Neither of these is a Node
and neither inherits from anything nor is ordinarily inherited by anything. Additionally they don't make use of class attributes, assigning all of their values in __init__
This makes them (IMHO) a pretty decent candidate for removing the implicit __dict__
on the instance, because any user subclasses would still inherit and get a dict, and there's not any real complications (AFAIK - please say if you think of any) in tightening the attributes assignable.
Running the test suite (Ran 14963 tests in 438.336s
using --parallel=1
) locally, and patching VariableNode.__init__
to track the pympler.asizeof.asizeof(self)
at that point and increment a global int suggests that for me (using python3.10
), the baseline space taken up by VariableNode
instances is 39.3 MB
(using filesizeformat
on the byte count, remember every VariableNode
will contain a FilterExpression
and most will contain a Variable
). After patching in __slots__
definitions for each of FilterExpression
and Variable
, the cost across the same number of tests goes down to 21.4 MB
, "saving" roughly nearly 50% off the cost of storing them.
If we look at the individual pieces, before applying __slots__
to either class:
In [1]: from django.template.base import TVariable, FilterExpression, Parser In [2]: from pympler.asizeof import asizeof In [3]: p = Parser("test") In [4]: v = Variable("test") In [5]: f = FilterExpression("test", p) In [6]: asizeof(v) Out[6]: 592 In [7]: asizeof(f) Out[7]: 1000
and after applying to both:
... In [6]: asizeof(v) Out[6]: 216 In [7]: asizeof(f) Out[7]: 368
Note that the Parser
instance given to the FilterExpression
isn't changed and doesn't count towards the byte count, because it's not stored onto the FilterExpression
at any point (though individual filters are).
For good measure, a bit of the ole' timeit magic to check the runtime performance profile doesn't markedly change. AFAIK it ought to be every so slightly faster, but given the difference is probably in nanoseconds and the timings are in microseconds ...
Before patches:
In [3]: %timeit Variable("test") 3.47 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [4]: p = Parser("test") In [5]: %timeit FilterExpression("test", p) 6.12 µs ± 196 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [6]: %timeit Variable("test").resolve({'test': 1}) 4.02 µs ± 40.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [7]: %timeit FilterExpression("test", p).resolve({'test': 1}) 6.84 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [8]: v = Variable("test") In [9]: f = FilterExpression("test", p) In [10]: %timeit v.resolve({'test': 1}) 387 ns ± 2.24 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) In [11]: %timeit f.resolve({'test': 1}) 569 ns ± 7.82 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
After patches:
In [3]: %timeit Variable("test") 3.47 µs ± 52.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [4]: p = Parser("test") In [5]: %timeit FilterExpression("test", p) 6.01 µs ± 61.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [6]: %timeit Variable("test").resolve({'test': 1}) 4.09 µs ± 52.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [7]: %timeit FilterExpression("test", p).resolve({'test': 1}) 6.57 µs ± 83.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [8]: v = Variable("test") In [9]: f = FilterExpression("test", p) In [10]: %timeit v.resolve({'test': 1}) 365 ns ± 2.13 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) In [11]: %timeit f.resolve({'test': 1}) 521 ns ± 3.03 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
As one might expect, I have a branch locally with these changes that I can make into a PR, and all the tests pass.
As a minor aside and sanity check for "odd behaviours in the wild", given Malcolm's comment on #12826 about monkey-patching possibly presenting an issue, I've checked one of my monkeypatches on the builtin Variable
class and that still appears to all work fine, otherwise I would definitely not be suggesting this ;)
Change History (5)
comment:1 by , 3 years ago
Owner: | changed from | to
---|
comment:2 by , 3 years ago
Has patch: | set |
---|
comment:3 by , 3 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 3 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
PR is https://github.com/django/django/pull/15378