Opened 6 years ago
Closed 4 years ago
#30563 closed Cleanup/optimization (fixed)
Optimize django.forms.widgets.Media.__add__.
Reported by: | David Dorothy | Owned by: | David Smith |
---|---|---|---|
Component: | Forms | Version: | dev |
Severity: | Normal | Keywords: | media |
Cc: | Matthias Kestenholz | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
While working with another project that make extensive use of django.forms.widgets.Media
I discovered that the fix for ticket #30153 has unintended consequences on the performance of Media.__add__
. If the number of Media objects added grows beyond a certain point (not sure it may be machine specific) then the performance of all subsequent Media.__add__
calls becomes terrible.
This was causing page load times of several minutes on my development machine. I agree that you need to delay as long as possible as #30153 intends but it seems that there probably should be an upper bound on this so that performance does not suddenly decrease.
Here is some sample code that can reproduce the issue:
from django.forms import Media import datetime def create_media(MediaClass): '''Creates a simple Media object with only one or two items.''' return MediaClass(css={'all': ['main.css']}, js=['main.js']) start = datetime.datetime.now() media = create_media(Media) for i in range(100000): media = media + create_media(Media) print('100000 additions took: %s' % (datetime.datetime.now() - start))
On my machine several runs of this code result in times between 1:35 - 1:44 (eg. 1 minute 35 seconds to 1 minute 44 seconds). However, taking away one zero from the number of media objects runs in under a second. Near as I can tell this has to do with the memory used to store these arrays and when it gets past a certain point the performance is awful. Since I am not sure if it is machine specific, for reference my machine is a i7-8700 with 64 GB RAM.
Here is a sample that has a modified Media class that does not have theses issues:
from django.forms import Media import datetime def create_media(MediaClass): '''Creates a simple Media object with only one or two items.''' return MediaClass(css={'all': ['main.css']}, js=['main.js']) class CustomMedia(Media): def __add__(self, other): combined = CustomMedia() if len(self._css_lists) + len(other._css_lists) > 1000: combined._css_lists = [self._css, other._css] else: combined._css_lists = self._css_lists + other._css_lists if len(self._js_lists) + len(other._js_lists) > 1000: combined._js_lists = [self._js, other._js] else: combined._js_lists = self._js_lists + other._js_lists return combined start = datetime.datetime.now() media = create_media(CustomMedia) for i in range(100000): media = media + create_media(CustomMedia) print('100000 additions took: %s' % (datetime.datetime.now() - start))
With this change it again runs in under a second. If I increase the number of loops the performance seems to change at a much more expected level.
I set an upper limit on the length allowed before a merge occurs. I'm not sure if the number of additions allowed should be a setting that can be adjusted to meet specific needs or if something that is just "reasonably" high is sufficient. It does appear that limiting the total number of items in the list to about 1000 works best on my machine. I'm also not sure that this is the best solution.
Thanks for your consideration.
Change History (9)
comment:1 by , 6 years ago
Cc: | added |
---|---|
Keywords: | media added; performance unintended consequences removed |
Summary: | django.forms.widgets.Media.__add__ Performance Issue → Optimize django.forms.widgets.Media.__add__. |
Triage Stage: | Unreviewed → Accepted |
follow-up: 4 comment:2 by , 6 years ago
I wonder how many distinct lists of assets you have? In the example above you're merging the same list 100'000 times.
When calling merge
earlier than at the end it will always be (relatively) easy to construct a failing test case.
Maybe deduplicating the list of assets to merge would help? Something like (completely untested)
~/Projects/django$ git diff diff --git a/django/forms/widgets.py b/django/forms/widgets.py index c8ec3c35d5..248c29b4c1 100644 --- a/django/forms/widgets.py +++ b/django/forms/widgets.py @@ -124,7 +124,7 @@ class Media: """ dependency_graph = defaultdict(set) all_items = OrderedSet() - for list_ in filter(None, lists): + for list_ in OrderedSet(filter(None, lists)): head = list_[0] # The first items depend on nothing but have to be part of the # dependency graph to be included in the result. ~/Projects/django$
It might be better for the memory usage to deduplicate earlier (inside Media.__add__
maybe).
comment:4 by , 6 years ago
Replying to Matthias Kestenholz:
I wonder how many distinct lists of assets you have? In the example above you're merging the same list 100'000 times.
I encountered this issue when using the StreamField from Wagtail (http://wagtail.io) where we have a use case that dictates creating significant nesting of common forms to be able to edit the stream of output. To be fair, our use case of Wagtail is beyond the expected use of Wagtail as the performance is poor. I have created a customization that helps prevent the performance issue in Wagtail by declaring all stream objects as classes that only generate their forms once per object. This workaround resolves the wagtail performance issue (for our needs) but not the Media.__add__
performance issue in Django.
I may be able to share my code if that will help. I do not personally own that code though so I'd have to get permission.
We have nested classes of up to four or five levels deep and about 70 unique classes. But because of the way it works the number of forms generated grows exponentially. Since these are all the same classes though the number of distinct lists is limited that I think your de-duplication idea would probably work.
I'm not sure whether your code works or not because I actually short circuited in a subclass of a wagtail class to overcome the issue. I might be able to replace the Django Media class via monkey patching with a different version to test but I'm not sure. If I get some time to try this out I'll post a followup comment and try to include some sample code or possibly a patch with some matching tests. Not sure how much time I will be able to dedicate to this.
Thanks for looking into this issue.
comment:5 by , 5 years ago
I apologize for my delay in responding to this. I had to switch tasks for a little while to a different project that took a bit longer than intended.
I have done some further research into this issue and have some information based on the actual use. I have 194016 Media objects that are being added together. For the JavaScript references when the lists are de-duplicated in the way you describe above (in __add__
not in merge
) the list is trimmed to 13 records. CSS is handled differently and I'm not sure it is correct (see code later in this comment). This provides a performance on my machine of around 3-5 seconds. This is not as fast as my original optimization but is definitely close enough.
As far as sharing code, I cannot give you access to my repository. However, I can show you the code where the adding of Media objects is occurring at.
First, here is my attempt at improving the Media.__add__
method:
def combine_css(destination, css_list): for x in filter(None, css_list): for key in x.keys(): if key not in destination: destination[key] = OrderedSet() for item in x[key]: if item: destination[key].add(item) class ImprovedMedia(forms.Media): def __add__(self, other): combined = ImprovedMedia() combined._css_lists = list(filter(None, self._css_lists + other._css_lists)) css_lists = {} combine_css(css_lists, self._css_lists) combine_css(css_lists, other._css_lists) combined._css_lists = [css_lists] combined._js_lists = list(OrderedSet([tuple(x) for x in filter(None, self._js_lists + other._js_lists)])) return combined
I am concerned that my combine_css()
function may not be the best implementation because it does keep order but it always results in exactly one new self._css_lists
which may not be the correct solution. I don't think I have enough context to know for sure if this is right. Also, I have some reservations about how I use OrderedSet
for Media._css_lists
because in my initial attempts on the Media._js_lists
I had to convert the OrderedSet
back into a list
to make things work.
Second, here is the code where the __add__
method is called:
class StructuredStreamBlock(StreamBlock): js_classes = {} # ... def all_media(self): media = ImprovedMedia() all_blocks = self.all_blocks() for block in all_blocks: media += block. return media # ...
The original code can be found in wagtail's source code here: https://github.com/wagtail/wagtail/blob/e1d3390a1f62ae4b30c0ef38e7cfa5070d8565dc/wagtail/core/blocks/base.py#L74
Unfortunately I could not think of a way to monkey patch the Media class in Django without actually changing the Django code in my virtual environment.
Should I attempt at this point to create a fork and a branch (and hopefully eventually a pull request) and see if the change I make break any existing tests or should some others with more familiarity weigh in on this first?
comment:6 by , 4 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:7 by , 4 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
I confirmed that 959d0c078a1c903cd1e4850932be77c4f0d2294d caused a performance issue. We should be able to optimize this by calling merge more often.