Opened 4 years ago
Closed 3 years ago
#32930 closed Bug (fixed)
URLValidator should accept urls with port number < 10
Reported by: | Wu Haotian | Owned by: | Wu Haotian |
---|---|---|---|
Component: | Core (Other) | Version: | 3.2 |
Severity: | Normal | Keywords: | URLValidator |
Cc: | Collin Anderson, Florian Apolloner | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
>>> from django.core.validators import URLValidator >>> URLValidator()("https://code.djangoproject.com:1") Traceback (most recent call last): ..... raise ValidationError(self.message, code=self.code, params={'value': value}) django.core.exceptions.ValidationError: <unprintable ValidationError object>
The commit[1] for fixing Ticket #20003 added port length checks in URLValidator, but I didn't see why in the ticket comments or commit description.
Given that:
The URL Standard from WhatWG[2] defines URL Port as:
A URL’s port is either null or a 16-bit unsigned integer that identifies a networking port. It is initially null.
RFC 3986[3] defines "port" as "*DIGIT", and does not requires the length of port.
I believe URLs with port number < 10 should be valid URLs.
[1] https://github.com/django/django/commit/2e65d56156b622e2393dee1af66e9c799a51924f#diff-d9609d8dc8482b30eac30df16213cba134562949fd62c97573927b89e880f85bR85
[2] https://url.spec.whatwg.org/#concept-url-port
[3] https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.3
Change History (9)
comment:1 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Summary: | URLValidator does not accept urls with port number < 10 → URLValidator should accept urls with port number < 10 |
follow-up: 4 comment:2 by , 4 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:3 by , 4 years ago
Cc: | added |
---|
comment:4 by , 4 years ago
Replying to Nick Pope:
Or we can be a little more strict as the port should only be a 16-bit unsigned integer (0-65535) and use the following to disallow 65536 and higher:
(?:[0-9]|[1-9][0-9]{1,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])?
I would keep the regex simple and perform range validation on it afterwards… FWIW I do not think that 0 would be a valid port for a URL
comment:5 by , 4 years ago
It's hard to validating port using regex only.. Given that 0443
and 000080
are both valid ports in URL.
>>> import requests >>> requests.get("https://code.djangoproject.com:0443") <Response [200]> >>> requests.get("https://code.djangoproject.com:00000443") <Response [200]>
> curl -I http://example.com:0000000000080 HTTP/1.1 200 OK Content-Encoding: gzip Accept-Ranges: bytes Age: 591287 Cache-Control: max-age=604800 Content-Type: text/html; charset=UTF-8 Date: Thu, 15 Jul 2021 09:51:52 GMT Etag: "3147526947" Expires: Thu, 22 Jul 2021 09:51:52 GMT Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT Server: ECS (sab/5693) X-Cache: HIT Content-Length: 648
I'd prefer to use \d+
-- it's not perfect, but should produce fewer false negatives.
comment:7 by , 4 years ago
Patch needs improvement: | set |
---|
comment:8 by , 3 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
So I looked into why it was added in 2e65d56156b622e2393dee1af66e9c799a51924f but the best I could come up with is that is was copied from this gist. There are unaddressed comments, e.g. this one, about this issue with the port number and other things.
Note that this line is also affected and will need fixing.
We can change
(?::\d{2,5})?
to(?::\d{1,5})?
to fix this naively.Or we can be a little more strict as the port should only be a 16-bit unsigned integer (0-65535) and use the following to disallow 65536 and higher: