Opened 6 years ago
Closed 6 years ago
#29826 closed Bug (wontfix)
urlize does not handle HTML angle brackets correctly
Reported by: | Marc Dequènes (Duck) | Owned by: | nobody |
---|---|---|---|
Component: | Template system | Version: | 1.11 |
Severity: | Normal | Keywords: | |
Cc: | Vishvajit Pathak | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Quack,
It was supposedly fixed in #24471 but I can reproduce with 1.11.16:
>>> import django >>> django.VERSION (1, 11, 16, u'final', 0) >>> from django.template.defaultfilters import urlize >>> print urlize("coin <https://www.example.org/> coin", 76) coin &lt;<a href="https://www.example.org/%3E" rel="nofollow">https://www.example.org/&gt;</a> coin
Same with Python 3 and 2.1.2:
>>> django.VERSION (2, 1, 2, 'final', 0) >>> print(urlize("coin <https://www.example.org/> coin", 76)) coin &lt;<a href="https://www.example.org/%3E" rel="nofollow">https://www.example.org/&gt;</a> coin
I'll try to have a deeper look at the code but at the moment I have no fix to suggest.
This impact Mailman 3: https://gitlab.com/mailman/hyperkitty/issues/29
Hope we can find a workaround.
Regards.
\_o<
Change History (8)
comment:1 by , 6 years ago
Cc: | added |
---|
comment:2 by , 6 years ago
comment:3 by , 6 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:4 by , 6 years ago
Resolution: | invalid |
---|---|
Status: | closed → new |
Quack,
I did not see this note which was not present in older versions.
Anyway the code show signs of HTML handling (WRAPPING_PUNCTUATION, _html_escapes, unescape…) so clearly this was not the original intent.
I have created a patch which fixes this problem: https://github.com/django/django/pull/10495
So would you please reconsider and review this patch?
Regards
\_o<
comment:5 by , 6 years ago
Hi Marc.
The test case in the PR doesn't quite match the example URL in the description here (or the discussion of trailing >
chars on the Mailman issue linked). Is it redundant, or already covered by #24471?
On the basis of the test case, I'm inclined to Accept this. (Didn't yet have time to fully review the patch.)
comment:6 by , 6 years ago
The note that urlize should be used in plaintext is there since Django 1.0 (cfd5b184fba4f51a6756fecaf4ec4272f3e23bfc). '<'
and '>'
have been in WRAPPING_PUNCTUATION
(LEADING_PUNCTUATION
and TRAILING_PUNCTUATION
before 15d10a5210378bba88c7dfa1f45a4d3528ddfc3f) since Django's initial public commit, but I don't think that justifies changing the behavior.
I created a PR to remove unused characters in WRAPPING_PUNCTUATION
.
Maybe your patch has some merit for simplification even if it doesn't change behavior?
comment:8 by , 6 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
The test in the ticket description worked before 15d10a5210378bba88c7dfa1f45a4d3528ddfc3f (which is in Django 1.4). As no one complained in 6 years since Django 1.4 was released, I don't see the past behavior as a precedent to readd it. You can make your case on the DevelopersMailingList but I don't see what justification there is based on the documentation.
On the pull request, Marc suggested adding a new urlize_html()
function and template filter but I doubt that's something we would add to Django (if it's even feasible).
See this note in
urlize
docs:I guess you might want to write a custom filter for your use case.