Opened 14 years ago
Closed 11 years ago
#14226 closed Bug (fixed)
Bug in dumpdata dependency calculation involving ManyToManyFields
Reported by: | aneil | Owned by: | Rainer Koirikivi |
---|---|---|---|
Component: | Core (Serialization) | Version: | 1.2 |
Severity: | Normal | Keywords: | easy-pickings |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The manage.py dumpdata command incorrectly interprets ManyToMany relationships as dependencies of the model that declares them (rather than the other way around).
In the example below are 5 models - User, Tag and Posting, where both User and Posting have M2M relationships to Tag via UserTag and PostingTag, respectively. This should be serializable.
Here are the actual dependencies:
User: None Tag: None Posting: User PostingTag: Posting, Tag UserTag: User, Tag
However, dumpdata fails with this error:
Error: Can't resolve dependencies for main.Posting, main.PostingTag, main.Tag, main.User, main.UserTag in serialized app list.
from django.db.models import Model,CharField,ForeignKey,ManyToManyField,TextField,DateTimeField class User(Model): username = CharField(max_length=20) password = CharField(max_length=20) topics = ManyToManyField("Tag",through="UserTag") def natural_key(self): return (self.username,) class Posting(Model): user = ForeignKey(User) text = TextField() time = DateTimeField() def natural_key(self): return (self.user.username,self.time) natural_key.dependencies=['main.User'] class Tag(Model): name = CharField(max_length=20) postings = ManyToManyField(Posting,through="PostingTag") def natural_key(self): return (self.name,) class PostingTag(Model): tag = ForeignKey(Tag) posting = ForeignKey(Posting) def natural_key(self): return (self.tag.natural_key(),self.posting.natural_key()) class UserTag(Model): user = ForeignKey(User) tag = ForeignKey(Tag) def natural_key(self): return (self.tag.natural_key(),self.user.natural_key())
The reason this occurs is invalid logic in django/core/management/commands/dumpdata.py in lines 152-155. Here is the relevant code & context:
145 # Now add a dependency for any FK or M2M relation with 146 # a model that defines a natural key 147 for field in model._meta.fields: 148 if hasattr(field.rel, 'to'): 149 rel_model = field.rel.to 150 if hasattr(rel_model, 'natural_key'): 151 deps.append(rel_model) 152 for field in model._meta.many_to_many: 153 rel_model = field.rel.to 154 if hasattr(rel_model, 'natural_key'): 155 deps.append(rel_model) 156 model_dependencies.append((model, deps))
Lines 152-155 treat M2M relations like FK relations. This is incorrect. A Model named by an FK is a dependency, however, the model named by an M2M is not.
The fix requires adding the M2M *table* to the model_list, and processing its dependencies accordingly.
I've attached a simple test project that demonstrates the problem.
Attachments (3)
Change History (19)
by , 14 years ago
Attachment: | dumpdata_m2m_problem.tar.gz added |
---|
comment:1 by , 14 years ago
Cc: | added |
---|---|
Has patch: | set |
Keywords: | database postgres schema added |
Needs tests: | set |
Patch needs improvement: | set |
Triage Stage: | Unreviewed → Accepted |
This needs tests; fixtures_regress already has tests for the sort_dependencies utility method, which is what is being modified here.
Also - I'm not completely convinced the patch is correct. On first inspection, I'm fairly certain the "add the through model to the dependency chain" will result in objects being added to the fixture that aren't required for the simple case. The simple case (normal m2m) can be satisfied by simply removing m2m checks from the dependency chain. In the complex case (manually specified m2m model), checks aren't required either, because the manually specified m2m model will be processed as a standalone model.
Tests would help to validate this :-)
comment:2 by , 14 years ago
Cc: | removed |
---|---|
Keywords: | easy-pickings added; database postgres schema removed |
Not sure how the CC and keywords got modified. Sorry James and Andrew.
comment:3 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:4 by , 14 years ago
the patch to add tests will only be usefully applied after the genocide of doc tests has been merged into master.
comment:6 by , 14 years ago
The proposed code patch appeared to break fixtures_regress:test_dependency_sorting_dangling, but on closer inspection, should not be considered to be the case. That test explicitly expected the dangling, un-related model to have a position in the resultant dependencies, but such an expectation is invalid.
comment:7 by , 14 years ago
Needs tests: | unset |
---|
active development of this at http://github.com/outofculture/django
comment:8 by , 14 years ago
The test cases look good, but running the full test suite reveals some additional breakages -- most notably in the fixtures tests. Some of these breakages are output ordering problems, but some appear to be more than that.
comment:9 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:10 by , 14 years ago
Easy pickings: | set |
---|
comment:12 by , 11 years ago
Owner: | changed from | to
---|
comment:13 by , 11 years ago
I think the correct way to fix this is to remove models referenced by complex (with explicit intermediate models i.e. through=...) M2M relations from the dependency chain, but keep simple (with automatic intermediate models) in the dependency chain.
This is also the check done by django.core.serializers.python.Serializer.handle_m2m_field. In serialized data, simple M2M relations are shown inline with the model that defines them, which makes dependencies of the referenced models. For complex M2M relations, however, the intermediate models should be serialized along with the other models, and should be included as models in the serialized data. The model defining the M2M relation thus has no dependency to the other model, but the intermediate model will have a dependency to both of the M2M models.
Development branch at https://github.com/koirikivi/django/tree/ticket_14226
Pull request at https://github.com/django/django/pull/1495
All tests pass under sqlite and postgres.
I created quite a few tests to first learn about the issue and then to be sure that everything works. The test_dump_and_load_m2m_complex_* tests are most likely redundant with the other tests in the PR, and can be removed as seemed fit.
FWIW I also tried the patch posted by aneil, which broke a couple of tests, and removing the M2M checks altogether, which didn't break any tests but resulted in a regression that's caught in the tests in the PR (namely test_dependency_sorting_m2m_simple)
comment:14 by , 11 years ago
Patch needs improvement: | unset |
---|
comment:15 by , 11 years ago
Easy pickings: | unset |
---|
comment:16 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
test project demonstrating dumpdata problem