Opened 9 months ago

Closed 8 months ago

#35252 closed Cleanup/optimization (fixed)

Optimize django.urls.resolvers._route_to_regex()

Reported by: Adam Johnson Owned by: Adam Johnson
Component: Core (URLs) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

_route_to_regex() converts Django’s parameterized route syntax into a regular expression. Whilst working on #35250, I noticed several opportunities for optimizing this function:

  1. It has O(n2) runtime from slicing the route string per parameter, repeatedly copying the remainder. A single search for all parameters would have O(n) runtime instead.
  2. Within my 950 URL project, there are many repeat calls with the same route, such as the empty string or ModelAdmin URL suffixes like '<path:object_id>/history/'. I think this would be typical of many projects, so it makes sense to add caching.
  3. match.start() and match.end() are called separately, when match.span() gives both values in one function call.
  4. get_converter() is unnecessary, the function can fetch all converters once with get_converters() and use the dictionary directly.
  5. Casting each parameter string to a set for the whitespace check is a bit costly, it’s faster to use a whitespace set scanning the string.
  6. An f-string can be used for concatenation, which is a little bit faster.

Applying these optimizations makes the function significantly faster, especially for more parameters. Below are some benchmarks.

Before optimization stats (Python 3.12, macOS, M1 mac, Django main branch):

  • Converting a seven parameter route:
In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
12.3 µs ± 68 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
  • Profiling a project’s system checks: 950 calls take 9ms, ~1.5% of the total runtime.

After optimization:

  • Converting that seven parameter route is ~2x faster (with caching removed):
In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
5.49 µs ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


  • Adding caching makes repeat calls ~100x faster, from 5µs to 50ns.
  • The profile shows those 950 calls now take 5ms, ~50% faster, ~0.8% of total runtime.

Change History (3)

comment:1 by Natalia Bidart, 9 months ago

Triage Stage: UnreviewedAccepted

Accepting on the basis of the provided benchmarks and the initial feedback in the PR. Will also review the PR soon, thanks!

comment:2 by Mariusz Felisiak, 8 months ago

Triage Stage: AcceptedReady for checkin

comment:3 by Mariusz Felisiak <felisiak.mariusz@…>, 8 months ago

Resolution: fixed
Status: assignedclosed

In eff21d8e:

Fixed #35252 -- Optimized _route_to_regex().

co-authored-by: Nick Pope <nick@…>

Note: See TracTickets for help on using tickets.
Back to Top