Opened 7 years ago

Closed 6 years ago

Last modified 4 days ago

#29499 closed Bug (fixed)

Race condition in QuerySet.update_or_create()

Reported by: Michael Sanders
Owned by: Michael Sanders
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords: race-condition
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

I believe that there is a potential race condition in QuerySet.update_or_create().

When update_or_create() first tries to fetch the object to update with get(), it locks the row using the select_for_update() method, which protects the object against changes by other processes until the end of the transaction. However, if the object does not already exist, update_or_create() then calls the _create_object_from_params() method:

      with transaction.atomic(using=self.db):
          try:
              obj = self.select_for_update().get(**lookup)
          except self.model.DoesNotExist:
              obj, created = self._create_object_from_params(lookup, params)

The _create_object_from_params() method looks like this:

 def _create_object_from_params(self, lookup, params):
     """
     Try to create an object using passed params. Used by get_or_create()
     and update_or_create().
     """
     try:
         with transaction.atomic(using=self.db):
             params = {k: v() if callable(v) else v for k, v in params.items()}
             obj = self.create(**params)
         return obj, True
     except IntegrityError as e:
         try:
             return self.get(**lookup), False
         except self.model.DoesNotExist:
             pass
         raise e

Here, it initially tries to create the object. However (assuming that the object has a unique key constraint enforced by the DB), if another process has created the object in the meantime, the IntegrityError is caught and get() is used to fetch the newly created object. That object is then returned to the calling update_or_create() method to be updated, but in this case the row has not been locked against changes. At this point other processes are free to modify the DB row, and those changes might then be overwritten by the save() in update_or_create().
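To make the interleaving concrete, here is a minimal single-process sketch of the lost update. It is not Django code: the dict standing in for the table, the "widget" key, and the field names are all hypothetical, and it assumes save() writes back every field (as Model.save() does when no update_fields is given).

```python
def simulate_unlocked_fallback():
    """Replay the problematic interleaving step by step (toy model, not Django)."""
    table = {}  # hypothetical stand-in for the DB table, keyed by the unique field

    # Process B wins the race and creates the row, so A's create() would
    # fail with IntegrityError.
    table["widget"] = {"name": "created by B", "counter": 0}

    # Process A falls back to a plain get(): it now holds an unlocked copy.
    stale_copy = dict(table["widget"])

    # Nothing stops process B from updating the row in the meantime.
    table["widget"]["counter"] = 5

    # Process A applies its defaults and saves all fields from its stale copy.
    stale_copy["name"] = "updated by A"
    table["widget"] = stale_copy

    return table["widget"]


print(simulate_unlocked_fallback())
```

In this toy model B's change to counter is silently overwritten by A's save, which is the data loss described above.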

Suggested Fix

This could be fixed by changing the _create_object_from_params() method to take a new parameter to specify whether the object should be locked on get, i.e.:

 def _create_object_from_params(self, lookup, params, lock=False):
     """
     Try to create an object using passed params. Used by get_or_create()
     and update_or_create().
     """
     try:
         with transaction.atomic(using=self.db):
             params = {k: v() if callable(v) else v for k, v in params.items()}
             obj = self.create(**params)
         return obj, True
     except IntegrityError as e:
         try:
             if lock:
                 return self.select_for_update().get(**lookup), False
             else:
                 return self.get(**lookup), False
         except self.model.DoesNotExist:
             pass
         raise e

The call to _create_object_from_params() from get_or_create() would remain the same, but from update_or_create() it would change to:

             obj, created = self._create_object_from_params(lookup, params, lock=True)
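The effect of the proposed lock flag can be sketched with plain threads. This is only an illustration under stated assumptions: a threading.Lock stands in for the database row lock that select_for_update() would take, and run_scenario, process_b, and the field names are hypothetical.

```python
import threading


def run_scenario(lock_on_fallback):
    """Toy model: process A does the fallback read, process B updates the row."""
    row = {"name": "created by B", "counter": 0}  # B already created the row
    row_lock = threading.Lock()       # stands in for the DB row lock
    a_has_read = threading.Event()    # A has fetched the row
    b_done = threading.Event()        # B has committed its update

    def process_b():
        a_has_read.wait()
        with row_lock:                # an UPDATE must acquire the row lock
            row["counter"] += 5
        b_done.set()

    worker = threading.Thread(target=process_b)
    worker.start()

    if lock_on_fallback:
        row_lock.acquire()            # models select_for_update().get(**lookup)
    try:
        snapshot = dict(row)          # A reads the row
        a_has_read.set()
        if not lock_on_fallback:
            b_done.wait()             # B's update lands between A's read and save
        snapshot["name"] = "updated by A"
        row.update(snapshot)          # A saves every field
    finally:
        if lock_on_fallback:
            row_lock.release()        # models the end of A's transaction
    worker.join()
    return row


print(run_scenario(lock_on_fallback=False))
print(run_scenario(lock_on_fallback=True))
```

Without the lock, B's increment is overwritten by A's save. With the lock held across the read and the save, B has to wait and its increment is applied afterwards, so both changes survive.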

Change History (13)

comment:1 by Simon Charette, 7 years ago

Triage Stage: Unreviewed → Accepted

The issue is legitimate and the suggested fix makes sense; if update_or_create enforces select_for_update() when a row exists it should always do so.

Would you be able to submit a PR on GitHub with some tests? I guess this could qualify for backports as it's a possible data loss issue.

comment:2 by Michael Sanders, 7 years ago

Owner: changed from nobody to Michael Sanders
Status: new → assigned

I am obtaining permission from my employer and then will create a PR.

comment:3 by Michael Sanders, 6 years ago

Permission from my employer has been obtained, so I will now work on a PR.

comment:5 by Simon Charette, 6 years ago

Triage Stage: Accepted → Ready for checkin
Version: 2.0 → master

comment:6 by Tim Graham, 6 years ago

Do you think we should backport to 1.11 based on your comment about data loss?

comment:7 by Simon Charette, 6 years ago

I think it should be pretty safe to backport for the rare cases when this happens. It's really an edge case, but as Michael demonstrated in his test, it can effectively lead to data loss.

comment:8 by Tim Graham <timograham@…>, 6 years ago

Resolution: fixed
Status: assigned → closed

In 271542da:

Fixed #29499 -- Fixed race condition in QuerySet.update_or_create().

A race condition happened when the object didn't already exist and
another process/thread created the object before update_or_create()
did and then attempted to update the object, also before update_or_create()
saved the object. The update by the other process/thread could be lost.

comment:9 by Tim Graham <timograham@…>, 6 years ago

In 221ef69a:

[2.1.x] Fixed #29499 -- Fixed race condition in QuerySet.update_or_create().

A race condition happened when the object didn't already exist and
another process/thread created the object before update_or_create()
did and then attempted to update the object, also before update_or_create()
saved the object. The update by the other process/thread could be lost.

Backport of 271542dad1686c438f658aa6220982495db09797 from master

comment:10 by Tim Graham <timograham@…>, 6 years ago

In 44418260:

[2.0.x] Fixed #29499 -- Fixed race condition in QuerySet.update_or_create().

A race condition happened when the object didn't already exist and
another process/thread created the object before update_or_create()
did and then attempted to update the object, also before update_or_create()
saved the object. The update by the other process/thread could be lost.

Backport of 271542dad1686c438f658aa6220982495db09797 from master

comment:11 by Tim Graham <timograham@…>, 6 years ago

In 2668418d:

[1.11.x] Fixed #29499 -- Fixed race condition in QuerySet.update_or_create().

A race condition happened when the object didn't already exist and
another process/thread created the object before update_or_create()
did and then attempted to update the object, also before update_or_create()
saved the object. The update by the other process/thread could be lost.

Backport of 271542dad1686c438f658aa6220982495db09797 from master

comment:12 by Tim Graham <timograham@…>, 6 years ago

In 8a0b9051:

[1.11.x] Refs #29499 -- Skipped QuerySet.update_or_create() test that fails on MySQL.

comment:13 by Sarah Boyce <42296566+sarahboyce@…>, 4 days ago

In 6cfe00e:

Refs #29499 -- Fixed race condition in update_or_create() test.

The usage of time.sleep() could result in the update_or_create() thread winning
the race to create the row if the backend takes a while to create a new
connection in the main thread.

Relying on threading.Event ensures that the flow of execution is systematically
yielded back and forth between the main thread and the thread in charge of
performing the background update_or_create().
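The handoff pattern described in this commit message can be sketched as follows. This is not the actual Django test: run_handoff, the event names, and the logged steps are hypothetical and only show how threading.Event makes the ordering deterministic where time.sleep() would merely make it likely.

```python
import threading


def run_handoff():
    """Pass control back and forth between two threads using events only."""
    order = []
    worker_at_checkpoint = threading.Event()  # set by the background thread
    main_step_finished = threading.Event()    # set by the main thread

    def background():
        order.append("background: reached checkpoint")
        worker_at_checkpoint.set()            # hand control to the main thread
        main_step_finished.wait()             # block until the main thread is done
        order.append("background: resumed")

    worker = threading.Thread(target=background)
    worker.start()

    worker_at_checkpoint.wait()               # no sleep: wait for the actual signal
    order.append("main: performed its step")
    main_step_finished.set()                  # hand control back
    worker.join()
    return order


print(run_handoff())
```

Because each side waits on an explicit signal, the recorded order does not depend on how long either thread takes, for example when opening a new database connection is slow.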
