#23372 closed Cleanup/optimization (fixed)
Prevent loaddata from disabling and enabling constraints when no fixtures are found
Reported by: | Michael Manfre | Owned by: | nobody |
---|---|---|---|
Component: | Testing framework | Version: | 1.7 |
Severity: | Normal | Keywords: | |
Cc: | baryshev@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
On my primary development system (Intel i7 quad core with SSD drives), it takes nearly 20 minutes to get from executing runtests.py
to the start of the first test. Most of the time (~18 minutes) is spent on the "loaddata" call_command in the migrate command's sync_apps method.
https://github.com/django/django/blob/master/django/core/management/commands/migrate.py#L292
For comparison, using the same test environment and only changing the version of Django (to stable/1.6.x), it takes less than 2 minutes to get to the first test.
Change History (19)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
You didn't mention the database on which you encounter this problem, default SQLite?
It would be nice to track a bit further what is so slow inside loaddata
.
comment:3 by , 10 years ago
MS SQL Server I presume. I'd guess the slowness might be the transaction.atomic()
calls that happen in loaddata
(one for app app in Django's test suite) even if no fixtures exist.
comment:4 by , 10 years ago
Severity: | Release blocker → Normal |
---|
Not a release blocker, I'm afraid; at this point release blocker is reserved for bugs that either crash Django or remove features that previously existed. I'm not willing to delay the 1.7 release, which is literally on the verge of happening, for test slowdown, or we'll literally never get round to releasing.
Notably, the code in question has not been touched by migrations; we do a dumpdata as part of #22487, but that doesn't seem to be part of this. I'm not sure what the cause of the regression could be.
comment:6 by , 10 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:7 by , 10 years ago
I've seen loaddata
take eons on SQL Server, and even in the context of migrations (with South). As far as we could figure out, the problem was not transaction management per se, but the handling of constraints: For loaddata
to avoid complex and sometimes non-satisfiable ordering requirements, it needs FK constraints to be deferred while it works. On core databases, each for its own reasons, this is not an issue.
SQL Server does not support defering in the normal sense -- what you need to do is turn the constraints off in the beginning of worlk and back on in the end. I don't remember if this is an operation you can do within a transaction, but for sure it is not tied to a transaction -- you can turn the constraints off and then run several transactions before turning them back on. When you do turn them back on, however, you can do it WITH CHECK OPTION
-- which loaddata does, or else it might put constraint-violating data into the database.
SQL Server does not -- perhaps, given the semantics, cannot -- keep track of data changed while constraints were off. When you turn them on WITH CHECK OPTION
, it checks every single record in the database -- or, at least, takes long enough to do so.
We had a project with several dozen apps, and relatively simple sets of migrations on a full database (for upgrades) would take minutes on Oracle and hours on SQL Server. Then we told SQL Server users to use migrate --all --no-initial-data
(we didn't have any initial data anywhere), and sanity returned.
I'm not sure what this implies with regards to a solution. Some of the problem should go away if loaddata
only defered constraints when initial data actually existed. Perhaps there's a way to avoid the check in most cases (say, most data loaded in the Django test suite can be given a free ticket because it is verified by the other backends).
comment:8 by , 10 years ago
Version: | 1.7-rc-3 → 1.7 |
---|
comment:9 by , 10 years ago
I have an application in 1.7 that I have been building for a while. I have about 75 migrations. When I run
python manage.py test
with SQLite, it takes about 30 seconds to create the test database. In 1.6 this was almost instant (in fact, I just confirmed this. I ran my tests in an old environment at 1.6 and the test database was created instantly. I ran "pip install django --upgrade" and re-ran the tests. It took 30 seconds to create the test database).
I don't know if my issue is the same as this issue (being as I'm not using MSSQL Server), but I'd just like to add that there appears to be a significant slow down when going from 1.6 to 1.7, regardless of the database type.
I'm using "django.db.backends.sqlite3" on Windows 7 64-bit with a Core i7 processor and 8 GB ram.
follow-up: 12 comment:10 by , 10 years ago
Robert, when you run your tests in 1.7+ all migrations are applied, which takes several seconds per migration on 1.7 as reported in #23745. The performance of that will be improved in 1.8 as described in that ticket. You can check if you can use the --keepdb option in 1.8. Also, we might add an option to disable migrations when testing in the future. But ultimately, no, I don't think your issue is the one tracked by this ticket.
comment:11 by , 10 years ago
Cc: | added |
---|
comment:12 by , 10 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
If this is related to the performance of the migration operations - which doesn't seem to be completely clear at this point - the issue should be fixed or at leased reduced by an order of magnitude by the optimizations the migrate process received for 1.8.
I'm closing this with needsinfo for the reasons posted in comment 10
comment:13 by , 10 years ago
I wonder if this might be better on master (1.9) after commits like 67235fd4ef1b006fc9cdb2fa20e7bb93b0edff4b and 4aa089a9a9504c4a833eee8161be013206da5d15 -- the migrate
command should be more "lightweight" after them.
comment:14 by , 9 years ago
Has patch: | set |
---|---|
Resolution: | needsinfo |
Status: | closed → new |
The proposed PR for Django 1.8 reduces the time of the entire suite on MSSQL from about 7.75 hours to 5.25 hours (benchmarks from @manfre). I'm not particularly proud of that hack, but it seems worthwhile unless someone can spot a problem with it.
comment:16 by , 9 years ago
A yet new alternate approach is here: https://github.com/django/django/pull/5905. The idea is to skip disabling constraints when no fixture files are found in loaddata. The 1.8 test suite runs in 2500 seconds on my laptop after the patch (plus one line fix in django-mssql, see the pull request).
The previously proposed alternate approach takes off another 10 minutes from the runtime of the test suite on mssql, yielding a runtime of around 30 minutes. The changes needed there are a lot more invasive, as we will need to pre-walk the objects in the fixture files. This could lead to performance problems. The change might be worth it for master branch, but backpatching seems too risky.
comment:17 by , 9 years ago
Patch needs improvement: | unset |
---|---|
Summary: | Test suite initial database construction is too slow → Prevent loaddata from disabling and enabling constraints when no fixtures are found |
Triage Stage: | Accepted → Ready for checkin |
Set as release blocker because this is an order of magnitude performance regression added by new functionality.