Opened 5 years ago
Last modified 9 months ago
#31255 new Cleanup/optimization
Migrations create a redundant RemoveField operation when deleting 2 models with related fields.
Reported by: | Panagis Alisandratos | Owned by: | |
---|---|---|---|
Component: | Migrations | Version: | dev |
Severity: | Normal | Keywords: | migration, optimizer |
Cc: | Markus Holtermann, Andriy Sokolovskiy, Shai Berger, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
class ModelA(models.Model): field1 = models.BooleanField(default=False) class ModelB(models.Model): another_field1 = models.TextField(blank=True) field1 = models.BooleanField(default=False) related_field = models.ForeignKey(ModelA, on_delete=models.CASCADE) class ModelC(models.Model): another_field1 = models.TextField(blank=True) related_field = models.ForeignKey(ModelA, on_delete=models.CASCADE)
Removing ModelB
and ModelC
yields the following migration:
class Migration(migrations.Migration): dependencies = [ ('analyzer', '0149_modela_modelb_modelc'), ] operations = [ migrations.RemoveField( model_name='modelc', name='related_field', ), migrations.DeleteModel( name='ModelB', ), migrations.DeleteModel( name='ModelC', ), ]
It is unclear whether the RemoveField
operation is redundant or if a RemoveField
operation for ModelB
is missing.
I've confirmed that the RemoveField
operation for ModelB
is part of the unoptimized operations set.
Change History (22)
comment:1 by , 5 years ago
Summary: | Migration optimizer retains a 1 RemoveField for FK when deleting 2 models → Migration optimizer retains 1 RemoveField for FK when deleting 2 models |
---|
comment:2 by , 5 years ago
Summary: | Migration optimizer retains 1 RemoveField for FK when deleting 2 models → Migration optimizer removes 1 RemoveField for FK when deleting 2 models |
---|
comment:3 by , 5 years ago
Cc: | added |
---|---|
Summary: | Migration optimizer removes 1 RemoveField for FK when deleting 2 models → Migrations create a redundant RemoveField operation when deleting 2 models with related fields. |
Triage Stage: | Unreviewed → Accepted |
Version: | 2.2 → master |
comment:4 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:5 by , 5 years ago
Upon going through some of the source code, the issue clearly arises because we are calling RemoveField
in the generate_deleted_models
method in db/migrations/autodetector.py
https://github.com/django/django/blob/f283ffaa84ef0a558eb466b8fc3fae7e6fbb547c/django/db/migrations/autodetector.py#L707. I think calling this method here is redundant as DeleteModel
should drop the table from the database which automatically deletes the related fields as well. Can someone confirm my hypothesis or correct me if I am steering in the wrong direction?
comment:6 by , 5 years ago
Sankar it's bit more complicated than that.
If you remove two interelated models (or any model cycle) at the same time fields have to be removed to break the cycle before hand
Given
class Foo(models.Model): bar = models.ForeignKey('Bar') class Bar(models.Model): foo = models.ForeignKey(Foo)
Removing both models and running makemigrations
must generate a migration that either drop Foo.bar
or Bar.foo
before proceeding with the model removal. I'm pretty sure we have auto-detector tests for this case.
The optimizer likely need to be adjusted somehow but I doubt generate_deleted_models
's generation of RemoveField
is to blame.
comment:7 by , 5 years ago
Upon doing some more debugging, I found some peculiar behaviour. I tried to print generated_operations
at the end of the function generate_deleted_models's
. This is the output: {'some': [<RemoveField model_name='book', name='authors_book'>, <DeleteModel name='Book'>, <RemoveField model_name='pub', name='authors_pub'>, <DeleteModel name='Pub'>]}
.
The method is adding four operations to the generated_operations
, but only three of them show up in the actual migrations file. Am I missing something here?
comment:8 by , 5 years ago
The method is adding four operations to the generated_operations, but only three of them show up in the actual migrations file. Am I missing something here?
Yes. The optimizer is likely able to optimize one of the RemoveField
into its corresponding DeleteModel
but not the other one. The reduction operation takes place in RemoveField.reduce
.
See #26720 (https://github.com/django/django/pull/7999) and https://code.djangoproject.com/ticket/24424#comment:35 for more details about that.
comment:9 by , 5 years ago
Unfortunately, I don't think this is the case here. Here is the migration file that was generated: https://pastebin.com/nSN5HNW8
comment:10 by , 5 years ago
I'm not sure I understand, the migration file you pointed at clearly has 3 operations so the <RemoveField model_name='book', name='authors_book'>
got reduced into the <DeleteModel name='Book'>
.
comment:11 by , 5 years ago
Forgive me if I come off as a bit unaware, I have just started to work with the migrations component. I have been going through the code in the meanwhile, and I am confident that the reduce()
method or MigrationOptimizer
's optimize_inner()
method is causing the issue. Just to be on the same page, the issue here is that while one RemoveField
gets reduced into DeleteModel
, while the other does not, right?
comment:12 by , 5 years ago
This is explained in more details in the above links but the crux of the issue is that RemoveField
and DeleteModel
operations don't hold a reference to what they are removing so they must assume they could be referring to any model.
In other words, because the DeleteModel
operation doesn't hold a reference to the model state it's removing it cannot differentiate between the cases where the optimization can be performed (this ticket's example) and when it cannot because there's cycles between models (comment:6).
You likely can get the optimization working for this ticket by changing DeleteModel.references_model
to return False
but that I'll break when for the cases where model cycles are deleted in the same migration.
If DeleteModel
operations were created with the set of fields they have at removal time it would be possible for them to properly determine whether or not they references each other.
follow-up: 14 comment:13 by , 5 years ago
Here's a branch pushing the idea above further.
It currently has two failures related to many-to-many field removals optimizations in the auto-detector but the additional tests in test_optimizer
demonstrate that changes work as expected. We should likely add more tests for the RemoveField
and DeleteModel
without fields(s) tests but I think this should be a good basis if you want to familiarize yourself with the internals of the optimizer.
comment:14 by , 5 years ago
Replying to Simon Charette:
After going through this, I have built on top of this and further fixed the ManyToManyField removal test cases and added another test case for this ticket as well. How should I create the PR, given that I have built on top of your existing code?
comment:15 by , 5 years ago
After going through this, I have built on top of this and further fixed the ManyToManyField removal test cases and added another test case for this ticket as well
Nice, I think test_non_cyclical_models_removals
already covers this ticket's case though; no need to add another one at the auto-detector level IMO.
How should I create the PR, given that I have built on top of your existing code?
Assuming you pushed your work to a branch that includes the above commit it should be the same process as opening any other PR. I'd leave the two commits separated so they can be reorganized by the mergers as they deem the most appropriate.
comment:16 by , 5 years ago
While working on my branch, I manually changed all the files to reflect the changes you made on your branch, and then made some further changes on my own. I was wondering if there is a way, that the megers recognize that the PR contains your work too since if I simply create a PR, you wouldn't get any credit.
comment:17 by , 5 years ago
I wouldn't worry too much about attribution, mergers can use the Git Co-authored-by
feature to give attribution to multiple contributors.
comment:18 by , 5 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
Sanskar Jaiswal: Please feel free to pick this
comment:19 by , 4 years ago
Has patch: | set |
---|---|
Owner: | set to |
Status: | new → assigned |
Submitted a draft PR.
I'd like to try a shot at an alternative patch that changes Operation.reduce
signature to (operation: Operation, app_label: str, state: ProjectState)
to allow the optimization to be enabled even for legacy projects with migrations that haven't tracked removed field and models over time. It would also avoid a public API change for RemoveField
and DeleteModel
.
comment:20 by , 4 years ago
Has patch: | unset |
---|
comment:21 by , 3 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:22 by , 9 months ago
Cc: | added |
---|
IMO,
RemoveField
is redundant in this case (see #24061 with a related discussion).