#28010 closed New feature (fixed)
Add support for SELECT...FOR UPDATE OF
Reported by: | Ran Benita | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | adam@…, Adam Chidlow | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Using select_for_update()
on a queryset appends FOR UPDATE
to the query. If the query joins other tables, the effect is that all tables are locked. I believe this is not what most people using select_for_update()
want, instead they only want to lock the row from the main table.
Example
Consider a scenario like this:
class Product(models.Model): name = models.CharField(max_length=100) price = models.PositiveIntegerField() class Order(models.Model): product = models.ForeignKey(Product, on_delete=models.PROTECT) status = models.CharField(max_length=100, choices=( ('NEW', 'New'), ('APPROVED', 'Approved'), ('CANCELED', 'Canceled'), ) ... @classmethod def approve(cls, id): with transaction.atomic(): order = Order.objects.select_related('product').select_for_update().get(pk=id) # ...Check order.product.price & stuff... order.status = 'APPROVED' order.save() @classmethod def cancel(cls, id): with transaction.atomic(): # Similar...
Here, locking is needed in order to serialize approve()
and cancel()
on the same Order, to avoid race conditions.
I have also used select_related
in approve()
, in order to avoid an extra query. Potentially, there are many related objects, but to keep the example simple I added just one.
The interaction between select_related
and select_for_update
has the unintended consequence (IMO) of also locking the related Product
row. Hence, if there is some concurrent task which updates product prices for example, it can cause conflicts, slowdowns, deadlocks, and other bad things.
The fix is to remove the select_related
. But then extra queries are needed for each related object access without a reasonable workaround (I think?).
Possible solutions
PostgreSQL (which I am using) has an option to specify exactly which tables are to be locked, instead of locking all tables in the query: https://www.postgresql.org/docs/9.6/static/sql-select.html#SQL-FOR-UPDATE-SHARE. The syntax is SELECT ... FOR UPDATE OF orders_order
.
Oracle also supports this with similar syntax, however it allows specifying which *columns* to lock: http://docs.oracle.com/javadb/10.8.3.0/ref/rrefsqlj31783.html. I guess there it is possible to lock specific columns of specific rows, while postgres only locks full rows.
I am not familiar with MySQL but it doesn't seem like it supports refining the FOR UPDATE
.
Here are some possible solutions I can think of:
- Add support for
select_for_update
to specify which models to lock, e.g.select_for_update('self', 'product')
. This will use theOF
syntax.
- Only support the
select_for_update('self')
behavior, and make it implicit, i.e.select_for_update
only ever locks the main model of the QuerySet.
- Add a way to do
select_related
on an already-existing object, something like a standaloneselect_related_objects
after the existingprefetch_related_objects
function: https://docs.djangoproject.com/en/1.10/ref/models/querysets/#prefetch-related-objects, and keepselect_for_update
as-is. Then it should at least be possible to do one extra query instead of one per related object.
Thanks for reading this far :)
Change History (14)
comment:1 by , 8 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Version: | 1.10 → master |
comment:2 by , 8 years ago
Thanks Simon. I will try to implement approach 1. I will start with just supporting self
(not implicit), and see if I can get it to work; that would already be useful.
Also, I need to correct my comment on Oracle a bit: while Oracle requires to specify a column, it does not "lock a column", it locks the entire row of the table containing the column. So it's equivalent to Postgres, only we should write SELECT ... FOR UPDATE OF "orders_order.id"
in Oracle instead of SELECT ... FOR UPDATE "orders_order"
in Postgres. Here's the doc: http://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_10002.htm#SQLRF55372
comment:3 by , 8 years ago
Has patch: | set |
---|
I submitted a patch: https://github.com/django/django/pull/8330
comment:4 by , 8 years ago
Cc: | added |
---|
comment:5 by , 8 years ago
Summary: | select_for_update() is too eager in presence of joins → Add support for SELECT...FOR UPDATE OF |
---|
comment:7 by , 7 years ago
This is a great feature. Thanks!
I've found an issue though, the validation seems to be a little-overzealous. It's not allowing reverse unique relationships (e.g. OneToOneField), even though they can be used with select_related.
comment:8 by , 7 years ago
Thanks for testing, Adam. I did not think of this scenario.
Please correct me of the following is incorrect:
Reverse OneToOne fields are nullable (might not exist on the other side), and Django does not allow select_for_update on such fields (you get "django.db.utils.NotSupportedError: FOR UPDATE cannot be applied to the nullable side of an outer join"). Hence, including such a field in of=(...)
should not be allowed. In fact, in order to use select_for_update
at all on such a query, you must use of=(...)
*excluding* the reverse field, otherwise it will fail with the error above.
comment:9 by , 7 years ago
Hi Ran, I'm using select_for_update
from the reverse side of a OneToOneField succesfuly, it just requires that you exclude(<related_field>=None)
as well.
This is what I'm working with at the moment to be able to use of=
as well: https://github.com/achidlow/django/commit/83fa6963a0b4b9f02c88b3abc84e20bce60d9035
(I'm not familiar with django's SQL internals so I don't know if that's the right fix, and in the unlikely event that it is, _get_field_choices
also needs updating. However it's working for me right now)
comment:10 by , 7 years ago
Cc: | added |
---|
comment:11 by , 7 years ago
I sent a patch https://github.com/django/django/pull/9043 to allow reverse relations.
comment:12 by , 7 years ago
Awesome, thanks! Do we need to change any flags here to bring the new pull request to someones attention?
Allowing the usage of
FOR UPDATE OF
is definitely something we want to add.Now for the proposed options the one that makes the most sense to me is
1.
.2.
would be backward incompatible given we actually lock all rows by default and some backends don't allow locking only a specific table which would makeselect_for_update()
behave quite a bit differently on different backends.