Opened 14 years ago

Closed 10 years ago

Last modified 10 years ago

#14394 closed Bug (fixed)

Assigning bad data to an m2m attribute should not clear existing data

Reported by: Carl Meyer Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: krzysiumed@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

If you assign something nonsensical to a many-to-many attribute on a model instance, you'll get a TypeError; but not until after Django has cleared all the existing relationship data. Ideally Django would verify that you've passed in something reasonable before it clears the existing data.

(This issue was originally half of #14373).

Attached patch with test demonstrating the issue.

Attachments (3)

14394_r13984_testonly.diff (1.9 KB ) - added by Carl Meyer 14 years ago.
validate_bad_data_assignment_to_m2m.patch (1.7 KB ) - added by Igor Sobreira 14 years ago.
validate_bad_data_assignment_to_m2m_strictly.patch (6.3 KB ) - added by Igor Sobreira 14 years ago.

Download all attachments as: .zip

Change History (17)

by Carl Meyer, 14 years ago

Attachment: 14394_r13984_testonly.diff added

comment:1 by Paul McMillan, 14 years ago

milestone: 1.3
Triage Stage: UnreviewedAccepted

This seems like reasonable behavior.

by Igor Sobreira, 14 years ago

comment:2 by Igor Sobreira, 14 years ago

Has patch: set

comment:3 by Igor Sobreira, 14 years ago

Triage Stage: AcceptedUnreviewed

comment:4 by Paul McMillan, 14 years ago

Triage Stage: UnreviewedAccepted

Please don't change accepted tickets to unreviewed. These statuses have nothing to do with your patch.

comment:5 by John-Scott Atlakson, 14 years ago

The if not isinstance(value, (list, tuple)) check in the patch is not quite right. Any iterable (QuerySet, set, etc.) is also valid so long as it yields valid related model instances or primary keys.

comment:6 by Igor Sobreira, 14 years ago

Following jonh_scott's suggestion, I've attached a patch that validates the value more carefully. It has to be an iterable with valid models instances or primary keys.

comment:7 by John-Scott Atlakson, 14 years ago

That is certainly much more strict validation. I will leave it to people much smarter than I to decide whether it is worth the queries to test if each int is a valid pk. There is validation much further down the stack that tests this before saving to the db, just not before calling clear().

Since the add() method can handle an iterable of pks, perhaps just provide this rather than model instances since _add_items() just grabs these anyway:

try:
    instance = manager.model.objects.get(pk=obj)
except (manager.model.DoesNotExist, ValueError):
    raise TypeError(u"Cannot set value to this ManyToManyField. Make sure you pass valid models or primary keys.")
else:
    model_instances.append(instance.pk)

comment:8 by Alex Gaynor, 14 years ago

[14602] removed ManyToManyField.isValidIDList which was unused and did something a little different from what's proposed here, but I'm not it for posterity (and because Carl asked me to).

comment:9 by Julien Phalip, 13 years ago

Severity: Normal
Type: Bug

comment:10 by Jacob, 13 years ago

milestone: 1.3

Milestone 1.3 deleted

comment:11 by Christopher Medrela, 13 years ago

Cc: krzysiumed@… added
Easy pickings: unset
Patch needs improvement: set
UI/UX: unset

The last patch seems to be almost ready-for-checkin, but I think we should not check if primary keys are valid (lines 743-746 in last patch), because it requires large amount of sql queries and it could be slow. Instead, we should check if primary key is int. Maybe we should add note in documentation that django does not check it?

In my opinion tests need small improvements:

  • In test_assigning_any_iterable_with_valid_models_to_m2m_works we don't need to check all possible iterables: sets, lists, tuples.
  • As I wrote, we should not validate primary keys, so we don't need test_assigning_iterable_with_invalid_pks_to_m2m_doesnt_clear_existing_relations.
  • We can join test_assigning_any_iterable_with_valid_models_to_m2m_works and test_assigning_any_iterable_with_valid_primary_keys_to_m2m_works by testing c1.tags = [t1, t2.pk].
  • There are redundant lines of code (for example t1 = Tag.objects.create(name='t1')). We can move them to setUp.

comment:12 by Anssi Kääriäinen, 13 years ago

The "lets iterate through the objects and fetch them one at a time" is a clear no-no. However, you can't assume a primary key is an int.

I really don't know what there is to do here. You are at a point where data validation should have prevented the errors. If you get an error, rollback your transaction. If you don't want to do that, use a savepoint.

This is not a unique situation in Django. For example, if you do a .save() on multitable inherited model, you can get it half saved if there is a data type mismatch in the topmost object in the inheritance chain. Something like:

class A:
    pass
class B(A):
    f = models.IntegerField()
B(f='not an integer').save()

do that and you got a save into A's table, but not into B's table.

Now, this is not a reason to avoid all sanity checks. But, aiming for anything like perfection (guaranteeing that the insert will succeed after the delete) isn't worth the effort in my opinion. Even the query for all the objects one by one will not be guaranteed to work under normal transactional rules. You would need something like select for update. Lets not go there.

comment:13 by Claude Paroz, 10 years ago

Resolution: fixed
Status: newclosed

I think this issue has been addressed in current code, notably by bc9be72bdc9bb4dfc7f967ac3856115f0a6166b8. I'll still commit a test for confirmation.

comment:14 by Claude Paroz <claude@…>, 10 years ago

In 7ce9644d9347455ae6f9bd383788a65e4bcadda3:

Added a test to ensure bad assignation to M2M doesn't clear data

Refs #14394.

Note: See TracTickets for help on using tickets.
Back to Top