Opened 11 years ago
Last modified 8 months ago
#20757 new New feature
A more Object-Oriented URLResolver
Reported by: | Flavio Curella | Owned by: | |
---|---|---|---|
Component: | Core (URLs) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | flavio.curella@…, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Currently, the RegexURLResolver
is populated using tuples.
Because of that, code that tries to inspect the resolver will end up containg awkward indexing like this (semplified code for clarity's sake):
def get_urlformat(urlname): """ Given a URL name, returns the URL as a string suitable for string.format. Example:: urlpatterns = patterns('', url(r'^extra/(?P<extra>\w+)/$', empty_view, name="named-url2"), ) >>> get_urlformat('named-url2') 'extra/%(extra)s/' """ resolver = get_resolver() return resolver.reverse_dict[urlname][0][0][0]
My proposal is to replace tuples with a tuple-like object whose elements can be accessed by using attribute names. That way, the above method could become:
def get_urlformat(urlname): """ Given a URL name, returns the URL as a string suitable for string.format. Example:: urlpatterns = patterns('', url(r'^extra/(?P<extra>\w+)/$', empty_view, name="named-url2"), ) >>> get_urlformat('named-url2') 'extra/%(extra)s/' """ resolver = get_resolver() urlbit = resolver.reverse_dict[urlname].urlbits[0] return urlbit.format
I realize this is mostly aesthetic, and there could be performance implications since the URLResolver is probably the most hit codepath, so I appreciate every kind of opinion.
The attached patch is still a draft, it definitely needs more extensive test coverage, but it's a start to get the idea.
Attachments (4)
Change History (14)
by , 11 years ago
Attachment: | patch_commit_44e7837a1706.patch added |
---|
comment:1 by , 11 years ago
Cc: | added |
---|---|
Component: | Uncategorized → Core (URLs) |
Type: | Uncategorized → New feature |
comment:2 by , 11 years ago
Needs tests: | set |
---|
comment:3 by , 11 years ago
Patch needs improvement: | set |
---|
by , 11 years ago
Attachment: | patch_commit_8a8468f33b16.patch added |
---|
by , 11 years ago
Attachment: | patch_commit_86a42e712a0e.patch added |
---|
comment:4 by , 11 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Hi,
Python already has a namedtuple
class in its collections
module which is what you should be using for this case [1].
In theory, a namedtuple is a drop-in replacement for a regular tuple so backwards-compatibility should be assured.
However, I'm worried about the performance cost of such a replacement: as you noted, RegexURLResolver
is a pretty critical component of django.
From a quick search, there's nothing in our documentation that explains how to extend the URL resolver (or why you'd want to).
I've also never done it myself either so I don't see exactly what we'd gain with this change.
So, to sum up, I'm -0 on the idea of using namedtuples, provided there's no major performance hit.
On the other hand, I'd definitely be +1 on documenting the process of extending the URL resolver and its machinery.
[1] http://docs.python.org/2/library/collections.html?highlight=collections.namedtuple#collections.namedtuple
comment:5 by , 11 years ago
I've updated the patch to use namedtuples and rebased against 7523e784633b7757fbc82df58f80b197eeed988a.
Re performance: according to the Python docs 'Named tuple instances do not have per-instance dictionaries, so they are lightweight and require no more memory than regular tuples.'.
I'd still like to run some profiling to have a better idea of the impact of this change. Memory consumption might be the same, but instantiation could have an overhead.
comment:6 by , 10 years ago
I've done a quick benchmark. The code and raw results can be found here: https://gist.github.com/ulope/a63aff42f51b08181f72#file-results-txt
Here is a table of the results (n=1000000):
Python | operation | datastructure | time |
---|---|---|---|
CPy 2.6 | instantiate | tuple | 0.243867874146 |
instantiate | namedtuple | 0.791953086853 | |
access | tuple | 0.342396020889 | |
access | namedtuple | 1.09635186195 | |
CPy 2.7 | instantiate | tuple | 0.300357103348 |
instantiate | namedtuple | 0.817892074585 | |
access | tuple | 0.397746801376 | |
access | namedtuple | 1.12735199928 | |
CPy 3.3 | instantiate | tuple | 0.19853064499329776 |
instantiate | namedtuple | 0.8839332649949938 | |
access | tuple | 0.2663196460052859 | |
access | namedtuple | 1.1236915799963754 | |
CPy 3.4 | instantiate | tuple | 0.2100249359937152 |
instantiate | namedtuple | 0.707535210007336 | |
access | tuple | 0.2940493019996211 | |
access | namedtuple | 0.9200455860118382 | |
PyPy 2.3 | instantiate | tuple | 0.00525689125061 |
instantiate | namedtuple | 0.00653004646301 | |
access | tuple | 0.00449419021606 | |
access | namedtuple | 0.00840997695923 |
From this it seems that namedtuple indeed is noticeably more expensive than a plain tuple (except, unsurprisingly, on PyPy). However as the absolute times for namedtuple are still quite short even at one million iterations I think the impact on Django's URL resolvers should be negligible.
comment:7 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I'll address this as part of https://groups.google.com/forum/#!topic/django-developers/WQoJN_MGTOg.
comment:8 by , 9 years ago
Has patch: | unset |
---|---|
Needs tests: | unset |
Patch needs improvement: | unset |
comment:9 by , 8 months ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:10 by , 8 months ago
Cc: | added |
---|
handle IndexError correctly