Opened 3 months ago
Last modified 3 months ago
#35852 assigned Bug
intcomma not locale aware when given a decimal as string
Reported by: | Jonathan Ströbele | Owned by: | HEMANT MISHRA |
---|---|---|---|
Component: | contrib.humanize | Version: | 5.1 |
Severity: | Normal | Keywords: | |
Cc: | Claude Paroz | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The documentation regarding localization says that for thousand separators USE_THOUSAND_SEPARATOR = True
or intcomma
can be used. The documentation of intcomma
says either int/float or string representation works.
This seems to be true for a en
locale, but in de
locale this doesn't hold up if the input is a string:
en: 9999999999.11 -> 9,999,999,999.11 "9999999999.11" -> 9,999,999,999.11 de: 9999999999.11 -> 9.999.999.999,11 "9999999999.11" -> 9,999,999,999.11
In the case of a decimal value as string in the de
-locale the output is wrong, it should be 9.999.999.999,11
(dot (.) as separator).
This seems to be the case because intcomma
casts the value to int
if its not a float
/Decimal
(which raises a ValueError
for "9999999999.11"
and then just puts in commas as thousand separators without any awareness of the current locale.
Also the DocBlock of the intcomma
functions says "Convert an integer to a string containing commas every three digits." wich contradicts the int/float (or string thereof) in the documentation.
So either the function is not following the documentation or the documentation is wrong (?).
I think this is problematic as the function is communicated as an alternative/equivalent option to USE_THOUSAND_SEPARATOR=True
. When a float value is formatted with {{ value|floatformat:2|intcomma}}
the floatformat
will return a string and as such, intcomma
is no longer locale aware (also reported in #33771). This can lead to outputs like 9,999,999,999,11
in the de
locale:
9999999999.11 -> floatformat:2 -> "9999999999,11" -> intcomma -> "9,999,999,999,11"
A solution could be to integrate some sort of float casting into intcomma
? Or remove the recommendation of using it in favor of just the USE_THOUSAND_SEPARATOR = True
setting and clarifying the documentation of intcomma
?
Change History (4)
comment:1 by , 3 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:2 by , 3 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:3 by , 3 months ago
Hi Natalia,
to be honest I'm not quite sure what the best fix actually would be and I need a bit more guidance here. ;) As I mentioned I think there is a mis-match between the documentation and the actual code, and DocBlock. Also I'm not sure what intcomma
actually was intended to mean and how it should be used:
int[eger]comma
as in separate a integer into groups of 3 digits separated by a comma (only valid for anen
locale). This seems to be the original code and intention of the function when it was added to Django.int[ernational]comma
as in provide a manual way to print a grouped number according to the current locale, if the settingUSE_THOUSAND_SEPARATOR
isFalse
and so no automatic number grouping is happening.
The 2. option is what I figured from reading the documentation and also think that's the intended function from the current state of the source code. Though it has the aforementioned bug regarding the floats passed as string where it's using a fallback to the old logic that's correct in the en
locale.
Since intcomma
is passing down the given value to the django.utils.formats.number_format
(with enabled force_grouping=True
) if it's a float or Decimal already, and only uses the code that is en
locale specific code in case of a string, I would suggest to use intcomma
as a wrapper of number_format
with force_grouping=True
. Since the number_format
already handles the formatting/grouping of numbers, I think it's no longer needed to reproduce this logic in intcomma
?
Making this change seems to conform to nearly all unit tests in tests/humanize_tests/tests.py
, except for some unicode stuff introduced in #28628 (fe76944269c13d59f8bb3bc1cfff7ab6443777e4) and a long string without numbers (16a8fe18a3b81250f4fa57e3f93f0599dc4895bc).
This seems to be the case, because number_format
doesn't check if the string is containing digits, and just inserts the thousand seperator into the string. Which I think is Okay?
AssertionError: '1,234,567' != '1,234,567' : intcomma test failed, produced '1,234,567', should've produced '1,234,567' AssertionError: 'th,e q,uic,k b,row,n f,ox ,jum,ped, ov,er ,the, la,zy ,dog' != 'the quick brown fox jumped over the lazy dog' : intcomma test failed, produced 'th,e q,uic,k b,row,n f,ox ,jum,ped, ov,er ,the, la,zy ,dog', should've produced 'the quick brown fox jumped over the lazy dog'
I have created a suggestion for a patch in a fork here: https://github.com/stroebjo/django/commit/f5302d43c45208ebb92516ab2724e86a1e60d71d
comment:4 by , 3 months ago
Cc: | added |
---|
Claude, would you happen to have an opinion on the above? I can't invest time into this until November as I'm taking some holidays, but I thought you might have a quicker answer.
Hello Jonathan, thank you for taking the time to create this report. I have reproduced what you said, by setting the server LANGUAGE_CODE to
de
. I agree that the result9,999,999,999.11
feels unexpected for the string"9999999999.11"
whenintcomma
is applied.Would you like to prepare a patch?