#35518 closed Cleanup/optimization (needsinfo)
Avoid regex search for simple route patterns
Reported by: | Jake Howard | Owned by: | Jake Howard |
---|---|---|---|
Component: | Core (URLs) | Version: | 5.0 |
Severity: | Normal | Keywords: | |
Cc: | Adam Johnson | Triage Stage: | Unreviewed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The RoutePattern
assumes all routes provided include some form of converter, and so needs to change it into a regex for matching. However, if no converters are included in the string, the additional overhead of using a regex vs simpler string operations is unnecessary.
Replacing this with a simpler string comparison results in between a 50 and 75% reduction in match time, which stacks up quickly as an application generally has numerous URLs. This can be done by modifying the RoutePattern
internally, with no external breakages.
Before
In [2]: endpoint_pattern = RoutePattern("foo/", "name", is_endpoint=True) In [3]: pattern = RoutePattern("foo/", "name", is_endpoint=False) In [4]: %timeit pattern.match("foo/") 441 ns ± 2.68 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) In [5]: %timeit endpoint_pattern.match("foo/") 435 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
After
In [4]: %timeit pattern.match("foo/") 187 ns ± 1.84 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each) In [5]: %timeit endpoint_pattern.match("foo/") 103 ns ± 0.192 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Theoretically, these improvements get better based on the length of the route pattern (although at this scale, not notably).
This optimisation could be done by adding a different kind of pattern (eg LiteralPattern
), but the added complexity to a project probably isn't necessary, not to mention the migration effort to take advantage of this.
Change History (7)
comment:1 by , 7 months ago
Cc: | added |
---|---|
Resolution: | → needsinfo |
Status: | assigned → closed |
comment:2 by , 7 months ago
These benchmarks were run against main
, with those changes. Adam's optimisation specifically targets optimising converting a route into a regex, which isn't something changed as part of this. This is just about avoiding matching against said regexes (we do still need to run _route_to_regex
to determine whether the route has any converters, and thus whether it's "simple").
comment:3 by , 7 months ago
If I understand correctly, the idea is that _route_to_regex()
may return a “plain” string if no converters are found. Then in RoutePattern.match()
, any such “plain” strings will be matched with str.startswith
or ==
, depending on _is_endpoint
, avoiding the overhead of performing a regex search?
It seems like a legitimate optimization, but saving nanoseconds is maybe a little small for the added complexity. I suppose it would multiply across the number of routes.
Can you make a draft PR with your patch? Does it pass all existing tests? We’d need to be extra careful here.
comment:4 by , 7 months ago
You're exactly right. It's a 'relatively' simple change, although does add a little complexity.
~300ns doesn't sound like much I agree, but this saving is per route per request, as opposed to just once per request, so for large projects with lots of URLs it can stack up. I can source some benchmarks of resolve
from a sensibly-sized project if that would help with review?
PR for your perusal.
comment:5 by , 7 months ago
Thanks for the PR. The change is a little less invasive than I expected.
A whole-project benchmark would be good for posterity. There are also some benchmarks in djangobench, called url_resolve*, which we can check.
comment:6 by , 7 months ago
It looks like djangobench
only has benchmarks for re_path
and not path
(this change only affects path
). With some guidance, I'm happy to add some benchmarks in (somehow ignoring Django versions which didn't have path
), but I'll try and get some project benchmarks anyway.
comment:7 by , 7 months ago
I've run some benchmarks against https://github.com/wagtail/bakerydemo, comparing main
to my branch, intentionally choosing a path without a match to get a worst-case benchmark:
Before:
In [2]: %timeit -n 10000 resolve("/foo/") 44.8 µs ± 18.5 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
After:
In [3]: %timeit -n 10000 resolve("/foo/") 39.7 µs ± 4.24 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
That's much less noticeable than I was expecting, and realistically within the margin of error. bakerydemo
isn't the largest project, and I suspect that because the URL pattern tree is hierarchical, it'll only make a notable difference when a given node has a large number of children, rather than being useful in blanket. That's still likely to cover a non-zero number of projects.
This has similarities to #35252 hence cc-ing Adam here. I also think your benchmarks are testing against 5.0, is that correct? That won't have those changes. Can you benchmark against main or 5.1a1?
Closing as "needsinfo" for now 👍