Opened 4 years ago
Last modified 10 months ago
#32312 new Cleanup/optimization
Performance decrease following implementation of ASGI handler and coroutine-safety.
Reported by: | David Smith | Owned by: | nobody |
---|---|---|---|
Component: | Core (Other) | Version: | 3.0 |
Severity: | Normal | Keywords: | async |
Cc: | Andrew Godwin, Adam Johnson, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
#30451 added the ASGI handler and coroutine-safety.
I've spotted that this resulted in a number of the djangobench benchmarks taking longer to run, in some cases there are quite large impacts. I'll attach the full log from my run, but I found djangobench does not give repeatable outputs. I found that cpu isolation with pyperf
helps to give repeatable outputs, and have been using the asv
tool as a runner for this.
I've not re-created all of the benchmarks yet, but I have run a few see here to be more certain about the performance impacts before opening this ticket. Comparing the commit in question to the previous commit, I get the table below. A range of benchmarks see a performance decrease of c20%-50%. url_reverse()
seems to be an outlier which is very materially impacted at c140% slower.
I'm not sure if anything can be done about this, but thought it worth flagging as I couldn't see any previous discussion on this topic.
. before after ratio [cce47ff6] [a415ce70] <3.0a1~249> <3.0a1~248> + 474±7μs 1.15±0.01ms 2.43 bench_urls.URLBenchmarks.time_url_reverse + 656±3μs 1.01±0.01ms 1.54 forms.FormBenchmarks.time_create_form + 17.9±0.6ms 27.1±0.2ms 1.52 bench_urls.URLBenchmarks.time_url_resolve + 464±20μs 698±20μs 1.50 query.QueryDelete.time_query_delete + 35.5±0.3ms 45.2±0.9ms 1.27 bench_models.ModelBenchmarks.time_save_new + 1.68±0.02ms 2.09±0.02ms 1.25 bench_models.ModelBenchmarks.time_model_creation + 4.83±0.06ms 5.92±0.3ms 1.23 query.QueryCount.time_query_count + 2.20±0.03ms 2.62±0.04ms 1.19 query.QueryGetOrCreate.time_query_get_or_create + 1.91±0.07ms 2.20±0.03ms 1.15 query.QueryGet.time_query_get + 192±2ms 218±1ms 1.14 bench_urls.URLBenchmarks.time_url_resolve_flat
Attachments (3)
Change History (10)
by , 4 years ago
Attachment: | results.txt added |
---|
comment:1 by , 4 years ago
Cc: | added |
---|
Hi David. Thanks for your work here (really good effort! 👍)
First of all I will CC Andrew.
Then, let's dig in and have a review, and work out what's going on.
I'd be inclined to accept as an issue if we confirm a significant slow-down, but we need to decide what that means.
Andrew: your input/guidance here much appreciated.
(I'm seeing various slowness reports on Daphne/Channels, so it's already on my list to look at this sort of after we branch 3.2)
comment:2 by , 4 years ago
The benchmark for url_reverse
is the outlier here and with it being one of the more simple functions it is easier to see what is going on here. I've attached a couple of images generated by profiling this benchmark.
It shows, an increase due to the replacement of threading.local
with asgiref.local.Local
. In the "old" image the get_urlconf
and get_script_prefix
are the first two unnamed bars on the 2nd row, the time to execute these two grow significantly in the "new" version. It's a similar story with form_create
but it only has one element impacted and so is less impacted overall by this change. I started to look at save()
but this function is more complex, although initial viewing suggests some elements are being impacted by this change.
I think this starts to help indicate what the issue could be.
Your note about what a significant slow down means is important, what does significant mean? I think it's much more than a particular function being slower on a single benchmark, for example, I wouldn't suggest that url_reverse
is a particularly good indicator of the overall performance of Django in general.
comment:3 by , 4 years ago
I would not be surprised that the ASGI worked slowed down Django a little bit - that was always going to be the compromise - and I perf-tested it before and after the 3.1 view/request changes to see a negligible (~5%) difference for sync runtimes.
However, I never _did_ performance test this older commit, and I agree that the new Local class might well be to blame - it was built for safety rather than speed (for obvious reasons), and there's likely some potential there for optimisation. Particularly, it could likely do with some quick-shortcuts out of its deeper codepaths if it knows there's nothing async going on.
comment:4 by , 4 years ago
Component: | Uncategorized → Core (Other) |
---|---|
Keywords: | async added |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
Hey David, sorry for the delay triaging this (3.2... 🙂) — It's a bit unspecific as to what we'll do but let's accept this as a tracking ticket for now, and we can either do sub-tickets or hang things on here.
You already did 622fcb81199864cad1ec9d0b25820a96ef3f586f which is super.
Can I just say, great work you're doing profiling and setting up the asv
runs, and the running django bench, and ... 🙂
Please do carry that on as you can, it's phenomenally helpful for you to break the ground there!
🏅
comment:5 by , 4 years ago
No need to apologise, the amount of patches yourself and Mariusz have got in this week is incredible 🤯
I was looking at the new Local class and there is a comment about using contextvars for storage, but this is dependent upon Python 3.7. I was wondering if that could help as contextvars are written in C and so should be faster? But I ran out of talent very quickly here 😃.
The other idea I had is, does Django know if there is any async going on? If we know this can we decide to use threading.local or the asgi version? (It seems logical to me but I'm well out of my depth here).
Finally, Yes, I'm happy to make progress with the benchmarking topic. It's already starting to identify a few items and help prove some other performance enhancements.
comment:6 by , 4 years ago
Cc: | added |
---|
comment:7 by , 10 months ago
Cc: | added |
---|
Djangobench_benchmarks