Opened 11 years ago

Closed 6 years ago

Last modified 5 years ago

#21171 closed Cleanup/optimization (fixed)

Skip transaction creation for single statement operations

Reported by: Anssi Kääriäinen Owned by: nobody
Component: Database layer (models, ORM) Version: 1.5
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

There are a couple of cases where Django will execute just a single data modifying statement for the operation done. Examples include single table .save(), .delete() that doesn't cascade, bulk_create() and .update(). Atomicity of single statements guarantees that these operations work without creating / releasing transaction. This could increase performance of these statements when used in autocommit mode.

The save() method should be easy to handle - it is easy to check if the table has parents or not. Bulk create is also somewhat easy, although there are cases where multiple queries need to be executed. Model deletion might be hard to handle, signals are sent inside the same transaction and it isn't that far fetched to think that signal handler will do additional data modifications (log changes for example).

It might be there are more operations that could benefit from this - get_or_create() and update_or_create() for example.

I got the idea while inspecting if cache db backend could use ORM directly. Currently the db backend doesn't wrap all data modifications in atomic blocks, but if using ORM directly that would happen and this could lead to performance problems in the backend.

Change History (7)

comment:1 by Tim Graham, 11 years ago

Triage Stage: UnreviewedAccepted

comment:2 by Shai Berger, 11 years ago

I must be missing something -- in current master, get_or_create() and update_or_create() are not single-statement operations.

Also, as far as I see, the signals argument applies to save() just as much as it applies to delete().

Perhaps add a keyword argument? subtransaction=True, so if you want to save a bit you pass in False?

comment:3 by Anssi Kääriäinen, 11 years ago

For save() signals are sent outside the transaction but for delete inside the transaction. So, for delete, if you skip the transaction creation, then queries ran from signals aren't inside transaction control. But for save() this isn't the case - any DML is ran outside transaction in any case.

The delete() case is a change in behaviour.

You are right about get_or_create() and update_or_create().

All in all, I'm beginning to wonder if this is really worth it...

comment:4 by Tim Graham, 6 years ago

Has patch: set

comment:5 by Florian Apolloner <apollo13@…>, 6 years ago

Resolution: fixed
Status: newclosed

In bc7dd849:

Fixed #21171 -- Avoided starting a transaction when a single (or atomic queries) are executed.

Checked the following locations:

  • Model.save(): If there are parents involved, take the safe way and use transactions since this should be an all or nothing operation.

If the model has no parents:

  • Signals are executed before and after the previous existing transaction -- they were never been part of the transaction.
  • if force_insert is set then only one query is executed -> atomic by definition and no transaction needed.
  • same applies to force_update.
  • If a primary key is set and no force_* is set Django will try an UPDATE and if that returns zero rows it tries an INSERT. The first case is completly save (single query). In the second case a transaction should not produce different results since the update query is basically a no-op then (might miss something though).
  • QuerySet.update(): no signals issued, single query -> no transaction needed.
  • Model/Collector.delete(): This one is fun due to the fact that is does many things at once.

Most importantly though: It does send signals as part of the
transaction, so for maximum backwards compatibility we need to be
conservative.

To ensure maximum compatibility the transaction here is removed only
if the following holds true:

  • A single instance is being deleted.
  • There are no signal handlers attached to that instance.
  • There are no deletions/updates to cascade.
  • There are no parents which also need deletion.

comment:6 by GitHub <noreply@…>, 5 years ago

In f8ef5f2c:

Refs #21171 -- Made Collector.delete() rollback in the correct database.

Regression in c7dd8490b882b2cefdc7faf431dc64c532b79c9.

comment:7 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

In aca675a:

[3.1.x] Refs #21171 -- Made Collector.delete() rollback in the correct database.

Regression in c7dd8490b882b2cefdc7faf431dc64c532b79c9.
Backport of f8ef5f2c86683bef3b200fd864efc14f1fbbc23b from master

Note: See TracTickets for help on using tickets.
Back to Top