#29499 closed Bug (fixed)
Race condition in QuerySet.update_or_create()
Reported by: | Michael Sanders | Owned by: | Michael Sanders |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | race-condition |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I believe that there is a potential race condition in QuerySet.update_or_create()
When initially trying to obtain the object to update using get()
, the row is locked using the select_for_update()
method - this would lock the object against changes by other processes until the end of the transaction. However, if the object does not already exist, update_or_create()
then calls the _create_object_from_params()
method - see code:
with transaction.atomic(using=self.db): try: obj = self.select_for_update().get(**lookup) except self.model.DoesNotExist: obj, created = self._create_object_from_params(lookup, params)
The _create_object_from_params()
method looks like this:
def _create_object_from_params(self, lookup, params): """ Try to create an object using passed params. Used by get_or_create() and update_or_create(). """ try: with transaction.atomic(using=self.db): params = {k: v() if callable(v) else v for k, v in params.items()} obj = self.create(**params) return obj, True except IntegrityError as e: try: return self.get(**lookup), False except self.model.DoesNotExist: pass raise e
Here, it initially tries to create the object. However (assuming that the object has a unique key constraint policed by the DB), if the object has already been created meanwhile (by a different process) the IntegrityError
is caught and the code uses the get()
method to attempt to obtain the newly created object. The object is then returned to the calling update_or_create()
method to update - however in this case, the row has not been locked against changes. So, at this point other processes are free to modify the DB row and those changes might then be overwritten by the save()
in update_or_create()
.
Suggested Fix
This could be fixed by changing the _create_object_from_params()
method to take a new parameter to specify whether the object should be locked on get, i.e.:
def _create_object_from_params(self, lookup, params, lock=False): """ Try to create an object using passed params. Used by get_or_create() and update_or_create(). """ try: with transaction.atomic(using=self.db): params = {k: v() if callable(v) else v for k, v in params.items()} obj = self.create(**params) return obj, True except IntegrityError as e: try: if lock: return self.select_for_update().get(**lookup), False else: return self.get(**lookup), False except self.model.DoesNotExist: pass raise e
The call to _create_object_from_params()
from get_or_create()
would remain the same, but from update_or_create()
it would change to:
obj, created = self._create_object_from_params(lookup, params, lock=True)
Change History (13)
comment:1 by , 7 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 7 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I am obtaining permission from my employer and then will create a PR.
comment:3 by , 6 years ago
Permission from my employer has been obtained, so I will now work on a PR.
comment:5 by , 6 years ago
Triage Stage: | Accepted → Ready for checkin |
---|---|
Version: | 2.0 → master |
comment:6 by , 6 years ago
Do you think we should backport to 1.11 based on your comment about data loss?
comment:7 by , 6 years ago
I think it should be pretty safe to backport for the rare cases when this happens. It's really an edge case but as Michael demonstrated in his test it can effectively lead to data-losses.
The issue is legitimate and the suggested fix makes sense; if
update_or_create
enforcesselect_for_update()
when a row exists it should always do so.Would you be able to submit a PR on Github with some tests. I guess this could qualify for backports as it's a possible data loss issue.