#28704 closed Bug (needsinfo)
update_or_create() calls select_for_update(), which locks database row
Reported by: | Rafal Radulski | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.11 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
An issue arises when update_or_create() is executed during a long-running transaction. It's caused by update_or_create() calling select_for_update(), which locks a row until the end of the entire transaction.
def update_or_create(self, defaults=None, **kwargs): defaults = defaults or {} lookup, params = self._extract_model_params(defaults, **kwargs) self._for_write = True with transaction.atomic(using=self.db): try: obj = self.select_for_update().get(**lookup) except self.model.DoesNotExist: obj, created = self._create_object_from_params(lookup, params) if created: return obj, created for k, v in defaults.items(): setattr(obj, k, v() if callable(v) else v) obj.save(using=self.db) return obj, False
Let's say that we are using a PostgreSQL database with "read committed" isolation level. There are two processes. First process starts a transaction. It updates and retrieves an existing model using update_or_create(). select_for_update() is called, which locks the row until the end of the transaction. Then, second process starts. It retrieves the same row using get() method. At this point, database query in second process is blocked until the end of transaction in first process.
I believe this is a bug. update_or_create() should not call select_for_update(). Doing so can create a long-lived database lock, even when database transaction isolation level is relaxed.
I don't have a possible solution. I believe that QuerySet.update_or_create() should call QuerySet.update(). Unfortunately, QuerySet.update() does not support generic relationships. If QuerySet.update() did support generic relationships, a solution could work as follows:
def update_or_create(self, defaults=None, **kwargs): defaults = defaults or {} lookup, params = self._extract_model_params(defaults, **kwargs) self._for_write = True with transaction.atomic(using=self.db): try: obj = self.only('pk').get(**lookup) except self.model.DoesNotExist: obj, created = self._create_object_from_params(lookup, params) if created: return obj, created update_params = {k: v() if callable(v) else v for k, v in defaults.items()} self.filter(pk=obj.pk).update(**update_params) obj = self.get(pk=obj.pk) return obj, False
Related ticket:
https://code.djangoproject.com/ticket/26804
Change History (3)
comment:1 by , 7 years ago
comment:2 by , 7 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:3 by , 7 years ago
I thought that a call to select_for_update
was problematic, but I was wrong. Thank you for the explanation.
I'm assuming you're passing something in
defaults
, right?update_or_create
currently callsModel.save
and you are suggesting it should callQuerySet.update
instead. Both of them sendUPDATE
SQL statements to the database. That will create a row-level lock. In Postgres, the lock level is eitherFOR UPDATE
orFOR NO KEY UPDATE
, see https://www.postgresql.org/docs/9.6/static/explicit-locking.html#LOCKING-ROWS.But, neither of these locks will block an ordinary
SELECT
statement like the.get()
in the second transaction.