Opened 7 years ago

Last modified 3 years ago

#28628 closed Cleanup/optimization

Audit for and abolish all use of '\d' in regexes — at Version 2

Reported by: James Bennett
Owned by: nobody
Component: Core (Other)
Version: dev
Severity: Normal
Keywords:
Cc: Ad Timmering, security@…
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by James Bennett)

Now that we're in the 2.0 release cycle and Python 3 only, any examples or code in Django using \d in a regex should be replaced with [0-9]. On Python 3, \d in a str pattern matches any character in Unicode category Nd (decimal digit), not just ASCII 0-9, which is almost certainly not what people expect. Changing to an explicit [0-9], perhaps with a note about why it should be preferred, would be better.
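A minimal sketch of the behaviour described above, using only the standard library re module. The sample strings and the helper name matches are illustrative and do not come from Django. The sketch also shows re.ASCII as another way to restrict \d, and that bytes patterns already match only ASCII digits.

```python
import re

# "\u0663\u0664" is ARABIC-INDIC DIGIT THREE + FOUR and "\uff14\uff12" is
# FULLWIDTH DIGIT FOUR + TWO; all four characters are in Unicode category Nd.
samples = ["42", "\u0663\u0664", "\uff14\uff12"]

unicode_digits = re.compile(r"^\d+$")          # str pattern: \d means "any Nd character"
ascii_digits = re.compile(r"^[0-9]+$")         # explicit ASCII range
ascii_flag = re.compile(r"^\d+$", re.ASCII)    # re.ASCII restricts \d to [0-9]


def matches(pattern, values):
    """Illustrative helper: report which values the pattern accepts."""
    return [bool(pattern.match(value)) for value in values]


print(matches(unicode_digits, samples))  # [True, True, True]
print(matches(ascii_digits, samples))    # [True, False, False]
print(matches(ascii_flag, samples))      # [True, False, False]

# Bytes patterns are unaffected: \d only ever matches the bytes for 0-9.
print(re.match(rb"\d+", "\u0663".encode("utf-8")))  # None
```

Whether [0-9] or re.ASCII is the better fit depends on the pattern: re.ASCII also changes \w, \s and \b, so the explicit range is the narrower change.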

Change History (2)

comment:1 by James Bennett, 7 years ago

Here's a tentative list of places that should change, from grepping on current master:

django/contrib/gis/db/backends/postgis/operations.py:344:        proj_regex = re.compile(r'(\d+)\.(\d+)\.(\d+)')
django/contrib/gis/gdal/libgdal.py:87:version_regex = re.compile(r'^(?P<major>\d+)\.(?P<minor>\d+)(\.(?P<subminor>\d+))?')
django/contrib/gis/geometry.py:7:wkt_regex = re.compile(r'^(SRID=(?P<srid>\-?\d+);)?'
django/contrib/gis/geometry.py:11:                       r'[ACEGIMLONPSRUTYZ\d,\.\-\+\(\) ]+)$',
django/contrib/gis/geos/geometry.py:116:            match = re.match(b'SRID=(?P<srid>\-?\d+)', srid_part)
django/contrib/gis/templates/gis/admin/openlayers.js:4:{{ module }}.map = null; {{ module }}.controls = null; {{ module }}.panel = null; {{ module }}.re = new RegExp("^SRID=\\d+;(.+)", "i"); {{ module }}.layers = {};
django/contrib/humanize/templatetags/humanize.py:48:    new = re.sub(r"^(-?\d+)(\d{3})", r'\g<1>,\g<2>', orig)
django/core/management/commands/makemessages.py:402:        m = re.search(r'(\d+)\.(\d+)\.?(\d+)?', out)
django/core/management/commands/runserver.py:18:    (?P<ipv4>\d{1,3}(?:\.\d{1,3}){3}) |         # IPv4 address
django/core/management/commands/runserver.py:21:):)?(?P<port>\d+)$""", re.X)
django/core/validators.py:78:    ipv4_re = r'(?:25[0-5]|2[0-4]\d|[0-1]?\d?\d)(?:\.(?:25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}'
django/core/validators.py:99:        r'(?::\d{2,5})?'  # port
django/core/validators.py:136:            host_match = re.search(r'^\[(.+)\](?::\d{2,5})?$', urlsplit(value).netloc)
django/core/validators.py:153:    _lazy_re_compile(r'^-?\d+\Z'),
django/core/validators.py:296:    regexp = _lazy_re_compile(r'^%(neg)s\d+(?:%(sep)s%(neg)s\d+)*\Z' % {
django/db/backends/mysql/base.py:49:server_version_re = re.compile(r'(\d{1,2})\.(\d{1,2})\.(\d{1,2})')
django/db/backends/sqlite3/introspection.py:8:field_size_re = re.compile(r'^\s*(?:var)?char\s*\(\s*(\d+)\s*\)\s*$')
django/db/migrations/autodetector.py:1233:        match = re.match(r'^\d+', name)
django/db/migrations/writer.py:170:            if re.match(r"^import (.*)\.\d+[^\s]*$", line):
django/forms/widgets.py:925:    date_re = re.compile(r'(\d{4}|0)-(\d\d?)-(\d\d?)$')
django/http/request.py:21:host_validation_re = re.compile(r"^([a-z0-9.-]+|\[[a-f0-9]*:[a-f0-9\.:]+\])(:\d+)?$")
django/template/base.py:606:    'num': r'[-+\.]?\d[\d\.e]*',
django/template/defaultfilters.py:238:    return re.sub(r"\d([A-Z])", lambda m: m.group(0).lower(), t)
django/test/client.py:34:CONTENT_TYPE_RE = re.compile(r'.*; charset=([\w\d-]+);?')
django/utils/dateparse.py:14:    r'(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})$'
django/utils/dateparse.py:18:    r'(?P<hour>\d{1,2}):(?P<minute>\d{1,2})'
django/utils/dateparse.py:19:    r'(?::(?P<second>\d{1,2})(?:\.(?P<microsecond>\d{1,6})\d{0,6})?)?'
django/utils/dateparse.py:23:    r'(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})'
django/utils/dateparse.py:24:    r'[T ](?P<hour>\d{1,2}):(?P<minute>\d{1,2})'
django/utils/dateparse.py:25:    r'(?::(?P<second>\d{1,2})(?:\.(?P<microsecond>\d{1,6})\d{0,6})?)?'
django/utils/dateparse.py:26:    r'(?P<tzinfo>Z|[+-]\d{2}(?::?\d{2})?)?$'
django/utils/dateparse.py:31:    r'(?:(?P<days>-?\d+) (days?, )?)?'
django/utils/dateparse.py:32:    r'((?:(?P<hours>-?\d+):)(?=\d+:\d+))?'
django/utils/dateparse.py:33:    r'(?:(?P<minutes>-?\d+):)?'
django/utils/dateparse.py:34:    r'(?P<seconds>-?\d+)'
django/utils/dateparse.py:35:    r'(?:\.(?P<microseconds>\d{1,6})\d{0,6})?'
django/utils/dateparse.py:44:    r'(?:(?P<days>\d+(.\d+)?)D)?'
django/utils/dateparse.py:46:    r'(?:(?P<hours>\d+(.\d+)?)H)?'
django/utils/dateparse.py:47:    r'(?:(?P<minutes>\d+(.\d+)?)M)?'
django/utils/dateparse.py:48:    r'(?:(?P<seconds>\d+(.\d+)?)S)?'
django/utils/dateparse.py:58:    r'(?:(?P<days>-?\d+) (days? ?))?'
django/utils/dateparse.py:60:    r'(?P<hours>\d+):'
django/utils/dateparse.py:61:    r'(?P<minutes>\d\d):'
django/utils/dateparse.py:62:    r'(?P<seconds>\d\d)'
django/utils/dateparse.py:63:    r'(?:\.(?P<microseconds>\d{1,6}))?'
django/utils/html.py:28:unencoded_ampersands_re = re.compile(r'&(?!(\w+|#\d+);)')
django/utils/http.py:30:__D = r'(?P<day>\d{2})'
django/utils/http.py:31:__D2 = r'(?P<day>[ \d]\d)'
django/utils/http.py:33:__Y = r'(?P<year>\d{4})'
django/utils/http.py:34:__Y2 = r'(?P<year>\d{2})'
django/utils/http.py:35:__T = r'(?P<hour>\d{2}):(?P<min>\d{2}):(?P<sec>\d{2})'
django/utils/translation/trans_real.py:35:        (?:\s*;\s*q=(0(?:\.\d{,3})?|1(?:\.0{,3})?))?  # Optional "q=1.00", "q=0.8"
django/views/i18n.py:249:        match = re.search(r'nplurals=\s*(\d+)', self._plural_string or '')
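For anyone repeating the audit without grep, a list like the one above can be approximated with a short script. This is only a sketch: find_backslash_d is a hypothetical helper, not the command that produced the list, and it flags any line containing a backslash followed by d, so it will also report false positives such as an escaped backslash before a literal d.

```python
import re

# Search for the two-character sequence backslash + "d" in source text.
BACKSLASH_D = re.compile(r"\\d")


def find_backslash_d(source, filename="<string>"):
    # Return grep-style "file:line:text" entries for every matching line.
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if BACKSLASH_D.search(line):
            hits.append(f"{filename}:{lineno}:{line}")
    return hits


example = (
    "import re\n"
    'old = re.compile(r"^\\d+$")\n'
    'new = re.compile(r"^[0-9]+$")\n'
)

for hit in find_backslash_d(example, "example.py"):
    print(hit)
```

Hits still need a manual look before replacing anything: inside a character class such as [\w\d-] the replacement is 0-9 rather than [0-9], and bytes patterns or JavaScript regexes (both appear in the list) do not have the Unicode behaviour in the first place.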

comment:2 by James Bennett, 7 years ago

Description: modified (diff)
Summary: Audit for and abolish all use of '\d' in URL patterns → Audit for and abolish all use of '\d' in regexes