#14394 closed Bug (fixed)
Assigning bad data to an m2m attribute should not clear existing data
Reported by: | Carl Meyer | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | krzysiumed@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
If you assign something nonsensical to a many-to-many attribute on a model instance, you'll get a TypeError
; but not until after Django has cleared all the existing relationship data. Ideally Django would verify that you've passed in something reasonable before it clears the existing data.
(This issue was originally half of #14373).
Attached patch with test demonstrating the issue.
Attachments (3)
Change History (17)
by , 14 years ago
Attachment: | 14394_r13984_testonly.diff added |
---|
comment:1 by , 14 years ago
milestone: | → 1.3 |
---|---|
Triage Stage: | Unreviewed → Accepted |
by , 14 years ago
Attachment: | validate_bad_data_assignment_to_m2m.patch added |
---|
comment:2 by , 14 years ago
Has patch: | set |
---|
comment:3 by , 14 years ago
Triage Stage: | Accepted → Unreviewed |
---|
comment:4 by , 14 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Please don't change accepted tickets to unreviewed. These statuses have nothing to do with your patch.
comment:5 by , 14 years ago
The if not isinstance(value, (list, tuple))
check in the patch is not quite right. Any iterable (QuerySet
, set
, etc.) is also valid so long as it yields valid related model instances or primary keys.
by , 14 years ago
Attachment: | validate_bad_data_assignment_to_m2m_strictly.patch added |
---|
comment:6 by , 14 years ago
Following jonh_scott's suggestion, I've attached a patch that validates the value more carefully. It has to be an iterable with valid models instances or primary keys.
comment:7 by , 14 years ago
That is certainly much more strict validation. I will leave it to people much smarter than I to decide whether it is worth the queries to test if each int is a valid pk. There is validation much further down the stack that tests this before saving to the db, just not before calling clear()
.
Since the add()
method can handle an iterable of pks, perhaps just provide this rather than model instances since _add_items()
just grabs these anyway:
try: instance = manager.model.objects.get(pk=obj) except (manager.model.DoesNotExist, ValueError): raise TypeError(u"Cannot set value to this ManyToManyField. Make sure you pass valid models or primary keys.") else: model_instances.append(instance.pk)
comment:8 by , 14 years ago
[14602] removed ManyToManyField.isValidIDList which was unused and did something a little different from what's proposed here, but I'm not it for posterity (and because Carl asked me to).
comment:9 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:11 by , 13 years ago
Cc: | added |
---|---|
Easy pickings: | unset |
Patch needs improvement: | set |
UI/UX: | unset |
The last patch seems to be almost ready-for-checkin, but I think we should not check if primary keys are valid (lines 743-746 in last patch), because it requires large amount of sql queries and it could be slow. Instead, we should check if primary key is int
. Maybe we should add note in documentation that django does not check it?
In my opinion tests need small improvements:
- In
test_assigning_any_iterable_with_valid_models_to_m2m_works
we don't need to check all possible iterables: sets, lists, tuples. - As I wrote, we should not validate primary keys, so we don't need
test_assigning_iterable_with_invalid_pks_to_m2m_doesnt_clear_existing_relations
. - We can join
test_assigning_any_iterable_with_valid_models_to_m2m_works
andtest_assigning_any_iterable_with_valid_primary_keys_to_m2m_works
by testingc1.tags = [t1, t2.pk]
. - There are redundant lines of code (for example
t1 = Tag.objects.create(name='t1')
). We can move them tosetUp
.
comment:12 by , 13 years ago
The "lets iterate through the objects and fetch them one at a time" is a clear no-no. However, you can't assume a primary key is an int.
I really don't know what there is to do here. You are at a point where data validation should have prevented the errors. If you get an error, rollback your transaction. If you don't want to do that, use a savepoint.
This is not a unique situation in Django. For example, if you do a .save() on multitable inherited model, you can get it half saved if there is a data type mismatch in the topmost object in the inheritance chain. Something like:
class A: pass class B(A): f = models.IntegerField() B(f='not an integer').save()
do that and you got a save into A's table, but not into B's table.
Now, this is not a reason to avoid all sanity checks. But, aiming for anything like perfection (guaranteeing that the insert will succeed after the delete) isn't worth the effort in my opinion. Even the query for all the objects one by one will not be guaranteed to work under normal transactional rules. You would need something like select for update. Lets not go there.
comment:13 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I think this issue has been addressed in current code, notably by bc9be72bdc9bb4dfc7f967ac3856115f0a6166b8. I'll still commit a test for confirmation.
This seems like reasonable behavior.