Opened 11 months ago
Closed 11 months ago
#35252 closed Cleanup/optimization (fixed)
Optimize django.urls.resolvers._route_to_regex()
Reported by: | Adam Johnson | Owned by: | Adam Johnson |
---|---|---|---|
Component: | Core (URLs) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
_route_to_regex()
converts Django’s parameterized route syntax into a regular expression. Whilst working on #35250, I noticed several opportunities for optimizing this function:
- It has O(n2) runtime from slicing the route string per parameter, repeatedly copying the remainder. A single search for all parameters would have O(n) runtime instead.
- Within my 950 URL project, there are many repeat calls with the same route, such as the empty string or ModelAdmin URL suffixes like '<path:object_id>/history/'. I think this would be typical of many projects, so it makes sense to add caching.
match.start()
andmatch.end()
are called separately, whenmatch.span()
gives both values in one function call.get_converter()
is unnecessary, the function can fetch all converters once withget_converters()
and use the dictionary directly.- Casting each parameter string to a set for the whitespace check is a bit costly, it’s faster to use a whitespace set scanning the string.
- An f-string can be used for concatenation, which is a little bit faster.
Applying these optimizations makes the function significantly faster, especially for more parameters. Below are some benchmarks.
Before optimization stats (Python 3.12, macOS, M1 mac, Django main branch):
- Converting a seven parameter route:
In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True) 12.3 µs ± 68 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
- Profiling a project’s system checks: 950 calls take 9ms, ~1.5% of the total runtime.
After optimization:
- Converting that seven parameter route is ~2x faster (with caching removed):
In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True) 5.49 µs ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
- Adding caching makes repeat calls ~100x faster, from 5µs to 50ns.
- The profile shows those 950 calls now take 5ms, ~50% faster, ~0.8% of total runtime.
Change History (3)
comment:1 by , 11 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 11 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
Note:
See TracTickets
for help on using tickets.
Accepting on the basis of the provided benchmarks and the initial feedback in the PR. Will also review the PR soon, thanks!