Opened 5 weeks ago
Last modified 2 days ago
#35904 closed New feature
Speed up fixture loading by adding options bulk insert/create — at Version 3
Reported by: | JorisBenschop | Owned by: | |
---|---|---|---|
Component: | Testing framework | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | | Triage Stage: | Unreviewed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
As per this forum discussion, I have created a patch to improve load times for the loaddata command under some circumstances.
Currently the "loaddata" management command calls obj.save() for each deserialized object in a fixture. This method first tries an UPDATE statement and, if no row was updated, falls back to an INSERT statement. The proposed --force_insert option skips the UPDATE, which halves the number of queries when every object in the fixture is new.
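To make the query count concrete, here is a minimal sketch that imitates the two code paths with the standard-library sqlite3 module. It is not Django's implementation: the `item` table and the helpers `save_like` and `force_insert_like` are hypothetical names, and the sketch assumes every fixture row is new, so the UPDATE never matches.

```python
import sqlite3


def make_db():
    """Create an empty in-memory database with a hypothetical 'item' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def save_like(conn, rows):
    """Imitate UPDATE-then-INSERT; return the number of statements issued."""
    statements = 0
    for pk, name in rows:
        cur = conn.execute("UPDATE item SET name = ? WHERE id = ?", (name, pk))
        statements += 1
        if cur.rowcount == 0:
            # No existing row matched, so fall back to an INSERT.
            conn.execute("INSERT INTO item (id, name) VALUES (?, ?)", (pk, name))
            statements += 1
    return statements


def force_insert_like(conn, rows):
    """Imitate a forced insert: one INSERT per row and no UPDATE attempt."""
    statements = 0
    for pk, name in rows:
        conn.execute("INSERT INTO item (id, name) VALUES (?, ?)", (pk, name))
        statements += 1
    return statements


rows = [(i, f"item-{i}") for i in range(1, 101)]
update_then_insert = save_like(make_db(), rows)
insert_only = force_insert_like(make_db(), rows)
print(update_then_insert, insert_only)
```

With 100 new rows the first path issues 200 statements and the second 100. The saving shrinks if some rows already exist, because a matching UPDATE needs no follow-up INSERT, and a forced insert would then fail on the duplicate key.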
A second option is to use bulk_create to insert multiple records at once. This cuts the number of INSERT statements by (n-1)/n, or 99% when 100 records are inserted in a single batch.
These options are not meant to cover every use case and are therefore optional.
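The (n-1)/n figure can be checked with a small sketch that batches rows into multi-row INSERT statements, again using only sqlite3. This only illustrates the arithmetic and does not call Django's bulk_create: the `item` table, the `bulk_insert` helper, and its `batch_size` parameter are assumptions made for the example.

```python
import sqlite3


def bulk_insert(conn, rows, batch_size=100):
    """Insert rows with one multi-row INSERT per batch; return statements issued."""
    statements = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(?, ?)" group per row, all sent in a single statement.
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        params = [value for row in batch for value in row]
        conn.execute(
            f"INSERT INTO item (id, name) VALUES {placeholders}", params
        )
        statements += 1
    return statements


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, f"item-{i}") for i in range(1, 101)]
n = len(rows)
bulk_statements = bulk_insert(conn, rows)
saved_fraction = (n - bulk_statements) / n  # (n-1)/n when one batch suffices
stored = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
print(bulk_statements, saved_fraction, stored)
```

For 100 rows in one batch this issues a single statement, so 99 of the 100 per-row INSERTs are avoided. The figure counts statements, not wall-clock time, so the actual speedup would still need to be measured.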
Change History (3)
comment:1 by , 5 weeks ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:2 by , 13 days ago
Summary: | Speed up fixture loading by bulk insert → Speed up fixture loading by adding options bulk insert/create |
---|---|
Type: | Uncategorized → New feature |
#35975 was a duplicate
Forum discussion: https://forum.djangoproject.com/t/feature-proposal-faster-fixture-loading-via-loaddata-command/36972
PR: https://github.com/django/django/pull/18889
comment:3 by , 13 days ago
Description: | modified (diff) |
---|---|
Has patch: | set |
Resolution: | wontfix |
Status: | closed → new |
Hello Joris,
This sounds interesting, particularly given that features like test case serialized rollbacks (which are quite slow) are built on top of model serialization. It would have to be a distinct option, as bulk_create doesn't fire signals, which some setups might require.
Just like any new feature request, though, it should be discussed on the forum to reach a consensus before being accepted. Given this is a performance-related new feature, I suggest your proposal come equipped with some details about what kind of improvements users should expect (profiles and benchmarks, instead of solely claiming it's fairly inefficient), backed by steps to reproduce, as well as a PoC that properly deals with other features of the serde framework such as natural keys, and a plan for how to deal with backends that don't support ignore_conflicts. It might even be a good opportunity to augment our performance tracking system with serde benchmarks.
If that's the case, then sharing this code as a standalone package (e.g. django-fast-loaddata) might be a good way to get traction on the above.
Assuming there is interest in moving forward, we can then re-open this issue.