Opened 9 years ago
Last modified 10 months ago
#25762 new Cleanup/optimization
Optimize numberformat.format
Reported by: | Jaap Roes | Owned by: | |
---|---|---|---|
Component: | Utilities | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Johannes Maron, Sarah Boyce, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
numberformat.format
is used a lot by Django and in most cases it's called with a number. Yet the current implementation is pretty much built for strings.
I've refactored the function to be up to 4x faster on my machine (and in the worst case it performs roughly the same). Attached is the benchmark script I've used.
Attachments (1)
Change History (17)
by , 9 years ago
Attachment: | bench_numberformat.py added |
---|
comment:1 by , 9 years ago
Triage Stage: | Unreviewed → Accepted |
---|
I'll mention djangobench in case you want to adapt any of your testing scripts so they can live on.
comment:3 by , 9 years ago
By the way, while refactoring this code I was wondering if supporting NUMBER_GROUPING other than 3 is even useful. None of the locales shipped with Django define anything other than 3, and the other related configuration options (THOUSAND_SEPARATOR and USE_THOUSAND_SEPARATOR) all refer to grouping by 3 anyway.
It would also be nice if number formatting on strings was deprecated. It seems that Django already tries to call it with numbers in the places it's used, and there's no real check if the strings that are passed into the function are even number like.
comment:4 by , 9 years ago
Patch needs improvement: | set |
---|
Left comments for improvement on the pull request. Maybe you'd like to raise the other questions on the DevelopersMailingList. I'm not sure about the answers myself.
comment:5 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 5 years ago
The approach used in the linked PR (https://github.com/django/django/pull/5668/files) actually has a problem. The following approach is used to avoid the rounding behaviour of the builtin format
function:
# If `number` is 10.888 and `decimal_pos` is 2 return 10.88 not 10.89. format_string = '.%sf' % (decimal_pos + 1)
and the string is then truncated to decimal_pos
decimals places. However, for a number such as 10.999, this will still be rounded 11.00. This is inconsistent with the behaviour of the rest of the function, which truncates and doesn't round.
We therefore can't use the bultin format
to do any kind of decimal formatting as there doesn't appear (as it currently stands) to be an option to truncate with it, however the speed advantages it offers can still be used for the common case of seperating a number into thousands.
Using the test cases in bench_number_format.py
linked above, an improved function performs anywhere between 2x and 5x faster for most cases, and again in the worst case is roughly the same (fluctuating between +2% and -2% compared with the original function).
I'll add a new PR.
comment:7 by , 5 years ago
Cc: | added |
---|
comment:8 by , 4 years ago
Patch needs improvement: | unset |
---|
comment:9 by , 4 years ago
Needs tests: | set |
---|---|
Patch needs improvement: | set |
comment:11 by , 4 years ago
Needs tests: | unset |
---|
comment:12 by , 4 years ago
Patch needs improvement: | unset |
---|
comment:13 by , 3 years ago
Patch needs improvement: | set |
---|
comment:14 by , 20 months ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:15 by , 18 months ago
Cc: | added |
---|
comment:16 by , 10 months ago
Cc: | added |
---|
Benchmark for numberformat.format