Opened 2 years ago
Last modified 8 months ago
#34211 new Cleanup/optimization
Performance Regression After Upgrade to Django 3.2 — at Initial Version
Reported by: | polarmt | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | performance, regression |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Overview
We recently upgraded from Django 2.2 to Django 3.2. After the upgrade, several of our APIs slowed down significantly. For instance, our of our APIs increased by 50% from ~0.28 seconds to ~0.41 seconds.
Diagnosis
To diagnose the issue, I ran Flamegraphs on the API with Django 2.2 and Django 3.2. The results were nearly identical except for one part:
Code highlighting:
class ForeignKeyDeferredAttribute(DeferredAttribute): def __set__(self, instance, value): if instance.__dict__.get(self.field.attname) != value and self.field.is_cached(instance): self.field.delete_cached_value(instance) instance.__dict__[self.field.attname] = value
django/db/models/fields/related_descriptors.py in Django 3.2
This code might seem harmless, but it did not scale well when considering how it is called. From my understanding, this scales in relation to the # of rows returned by the query times the # of foreign key fields in the table.
The function below creates an object for every row:
Code highlighting:
class ModelIterable(BaseIterable): """Iterable that yields a model instance for each row.""" def __iter__(self): queryset = self.queryset db = queryset.db compiler = queryset.query.get_compiler(using=db) # Execute the query. This will also fill compiler.select, klass_info, # and annotations. ... for row in compiler.results_iter(results): obj = model_cls.from_db(db, init_list, row[model_fields_start:model_fields_end]) for rel_populator in related_populators: rel_populator.populate(row, obj) ...
Every object created will go through the following logic:
Code highlighting:
class Model(metaclass=ModelBase): def __init__(self, *args, **kwargs): # Alias some things as locals to avoid repeat global lookups cls = self.__class__ opts = self._meta _setattr = setattr _DEFERRED = DEFERRED ... if not kwargs: fields_iter = iter(opts.concrete_fields) # The ordering of the zip calls matter - zip throws StopIteration # when an iter throws it. So if the first iter throws it, the second # is *not* consumed. We rely on this, so don't change the order # without changing the logic. for val, field in zip(args, fields_iter): if val is _DEFERRED: continue _setattr(self, field.attname, val) ...
If the field is a foreign key, then it will call the overrided __set__
. For one API call, we had a filter
query on a table with three foreign keys that returns 11k rows. This small change did not scale well with a query.
Verification
We ran the same API call by disabling this overriding:
Code highlighting:
class ForeignKeyDeferredAttribute(DeferredAttribute): def set_fake(self, instance, value): if instance.__dict__.get(self.field.attname) != value and self.field.is_cached(instance): self.field.delete_cached_value(instance) instance.__dict__[self.field.attname] = value
The latency of that API improved from ~0.41 seconds to ~0.31 seconds.
Thoughts
This change does not seem scalable. Even with a moderately sized query and three foreign-key relationships, the query increased the API latency by 50%. We understand the motivation behind the change: Ticket #28147. We were wondering if there is a more performant fix to that problem.