#32388 closed Cleanup/optimization (fixed)
bulk_update() doesn't necessarily ignore duplicates.
Reported by: | Tim McCurrach | Owned by: | Tim McCurrach |
---|---|---|---|
Component: | Documentation | Version: | 3.1 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
In the docs for bulk_update it says: "If objs contains duplicates, only the first one is updated." This is due to the way the query is constructed.
However, this is only true within each single SQL UPDATE statement. If duplicates fall into different batches, each occurrence is applied by its own batch's UPDATE, so later batches overwrite earlier ones.
To Reproduce Error
>>> m1 = MyModel.objects.get(id=1)
>>> m2 = MyModel.objects.get(id=1)
>>> m1.name = "a"
>>> m2.name = "b"
>>> MyModel.objects.bulk_update([m1, m2], ['name'], batch_size=1)
>>> MyModel.objects.get(id=1).name
'b'
Whilst the above is an extreme example, it demonstrates the point. If a large number of objects are being updated, you cannot currently rely on the behaviour that only the first instance of a duplicate will affect the update.
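The mechanism can be illustrated with a minimal pure-Python sketch (this is not Django's actual implementation; `simulate_bulk_update` and its pk/value tuples are hypothetical stand-ins). Each batch builds one CASE-based UPDATE keyed by pk, so within a batch the first occurrence of a pk wins, but each subsequent batch runs its own UPDATE and overwrites earlier batches:

```python
def simulate_bulk_update(objs, batch_size):
    """Return the final stored value per pk after batched updates.

    objs is a list of (pk, value) pairs standing in for model instances.
    """
    db = {}
    for start in range(0, len(objs), batch_size):
        batch = objs[start:start + batch_size]
        # Within one batch, only the first object per pk takes effect,
        # mirroring how the first matching WHEN wins in a CASE expression.
        seen = {}
        for pk, value in batch:
            seen.setdefault(pk, value)
        # Each batch's UPDATE overwrites the results of earlier batches.
        db.update(seen)
    return db

# Duplicates in the same batch: the first occurrence wins.
print(simulate_bulk_update([(1, "a"), (1, "b")], batch_size=2))  # {1: 'a'}
# Duplicates split across batches: the later batch wins.
print(simulate_bulk_update([(1, "a"), (1, "b")], batch_size=1))  # {1: 'b'}
```

This matches the reproduction above: with batch_size=1 the two duplicates land in separate batches and the second assignment ('b') is what ends up stored.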
Change History (5)
comment:1 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:2 by , 4 years ago
Component: | Database layer (models, ORM) → Documentation |
---|---|
Easy pickings: | set |
Summary: | bulk_update doesn't necessarily ignore duplicates → bulk_update() doesn't necessarily ignore duplicates. |
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → Cleanup/optimization |
I don't think it is worth additional complexity (see #29968), we can clarify this in docs, e.g.