Opened 19 months ago
Closed 18 months ago
#34521 closed Cleanup/optimization (wontfix)
Use __slots__ for template Node classes
Reported by: | Adam Johnson | Owned by: | Adam Johnson |
---|---|---|---|
Component: | Template system | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Keryn Knight, Carlton Gibson | Triage Stage: | Unreviewed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Declaring slots reduces the memory footprint of a class, and can lead to performance gains due to less data fetching by the CPU.
#12826 proposed adding slots to many classes but was closed with the proposal to do so on a case-by-case basis.
#33474 added slots to Variable
and related classes, showing memory reductions and performance gains.
I propose adding slots to more template classes to further save memory (Template, Token, Lexer, and the Node classes). This change leads to 1-2% improvement in template rendering time, using this benchmark script:
import django from django.conf import settings from django.template import Context from django.template import Template settings.configure( TEMPLATES=[{"BACKEND": "django.template.backends.django.DjangoTemplates"}] ) django.setup() template = Template("{% if x %}{{ x }}{% endif %}" * 1_000) context = Context({"x": "abc"}) for i in range(1_000): template.render(context)
And invoking [hyperfine](https://github.com/sharkdp/hyperfine) like so, with Python 3.11.2:
$ hyperfine \ -N --warmup 1 \ --prepare 'git switch -d 7d0e566208' \ 'python benchmark.py' \ --prepare 'git switch ticket_34521_optimize_templates' \ 'python benchmark.py' Benchmark 1: python benchmark.py Time (mean ± σ): 2.028 s ± 0.042 s [User: 1.990 s, System: 0.035 s] Range (min … max): 1.997 s … 2.116 s 10 runs Benchmark 2: python benchmark.py Time (mean ± σ): 1.985 s ± 0.007 s [User: 1.950 s, System: 0.033 s] Range (min … max): 1.976 s … 1.997 s 10 runs Summary 'python benchmark.py' ran 1.02 ± 0.02 times faster than 'python benchmark.py'
(I decided not to use djangobench since it's not well maintained, and doesn't do such great stats as hyperfine.)
Change History (8)
comment:1 by , 19 months ago
Description: | modified (diff) |
---|
comment:2 by , 19 months ago
Description: | modified (diff) |
---|
comment:3 by , 19 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
comment:4 by , 19 months ago
Cc: | added |
---|
comment:5 by , 19 months ago
Cc: | added |
---|
comment:6 by , 18 months ago
Patch needs improvement: | set |
---|
Marking as "Needs improvement" while waiting for follow up.
comment:7 by , 18 months ago
I'd like to argue for wontfixing this, at least for a couple of release cycles...
Copying my comment from the PR:
Can I ask that we pause on this please?
I'm currently investigating possible implementations for _Template Fragments_ (or _Partials_), as per the [HTMX essay on that theme](https://htmx.org/essays/template-fragments/).
Whilst custom template tags have long been able to write to the
context
, this route isn't available for (an as yet in-development) _partial nodes_ that are not rendered, for example because they're declared outside ablock
, or are rendered after they're required for output, because of varying rendering order due to inheritance, say.
In order to work around this, and in order to allow partials to resolvable from a specific template, I'm currently assigning a mapping to the template Origin to serve as a lookup store.
This isn't particularly _nice™_, but it functions, and allows work on the prototype to continue.
My current implementation works on Django 4/2, but is broken by this PR.
- Can no longer create weak references to template objects. (This could be worked around but...)
- Can no longer assign a mapping to any object that's available at both parsing and rendering time, and so there's no (immediately) clear way to pass partial references out from the parser.
- A module global might work here, but it's difficult to maintain the correct per-template mapping, and (more) even using weakref for the storer making sure other references are dropped has proved non-trivial in my experiments thus far.
Beyond seriously putting in doubt my current work, I think forcing slots here makes experimenting with the DTL much harder, and I don't think that's paid for by the speed increase on offer. We're only now beginning to come back to _What's possible with templates?_, after a long period of JSON APIs being primary. I'd like to ask that we don't handicap that by overly restricting what's easy to do with template objects.
I also note that in researching this project, there is a long history of people trying all sorts of customisations (and I was only looking at output capture). Given that this PR breaks my WIP, I'd be surprised if it didn't lead to breakages more widely.
I appreciate that it's not part of the official contract for template object not to use slots, but I think it's a pretty major change.
tl;dr: With fresh attention, there are plenty of gains to be made to the DTL. Making everything use slots makes experimenting with that harder.
Thanks.
comment:8 by , 18 months ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
Triage Stage: | Accepted → Unreviewed |
Closing as "wontifx", as a deeper discussion is needed due to the backward incompatibility of this change.
Thanks y'all.
Thanks. Tentatively accepted.