Opened 2 years ago

Closed 19 months ago

#33759 closed Cleanup/optimization (fixed)

Using subquery to filter a model delete generates a sub-optimal SQL

Reported by: Christofer Bertonha Owned by: Aman Pandey
Component: Database layer (models, ORM) Version: 4.0
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Christofer Bertonha)

Taking in consideration the app of tutorial for demonstrate the issue.
using postgresql as db.

Executing this following code:

subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000]
Comment.objects.filter(id__in=subquery).delete()

Generates the following SQL:

DELETE FROM "comment" WHERE "comment"."id" IN (SELECT V0."id" FROM "comment" V0 WHERE V0."id" IN (SELECT U0."id" FROM "comment" U0 WHERE U0."created_at" < '2022-06-01T00:00:00+00:00'::timestamptz LIMIT 1000))

The inner select V0 is completely unnecessary

If a simple select with the same subquery is executed. It generates expected SQL query.

subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000]
comments = list( Comment.objects..filter(id__in=subquery) )
SELECT * FROM "comment" WHERE "comment"."id" IN (SELECT U0."id" FROM "comment" U0 WHERE U0."created_at" < '2022-06-01T00:00:00+00:00'::timestamptz LIMIT 1000),

Change History (11)

comment:1 by Christofer Bertonha, 2 years ago

Description: modified (diff)

comment:2 by Christofer Bertonha, 2 years ago

Description: modified (diff)

comment:3 by Christofer Bertonha, 2 years ago

Summary: Using subquery to filter a model delete creates sub-optimal SQLUsing subquery to filter a model delete generates a sub-optimal SQL

comment:4 by Mariusz Felisiak, 2 years ago

Cc: Simon Charette added
Type: BugCleanup/optimization

This behavior was changed in 4074f38e1dcc93b859bbbfd6abd8441c3bca36b3 to prevent:

django.db.utils.NotSupportedError: (1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")

on MySQL. I have no idea for an alternative fix.

comment:5 by Mariusz Felisiak, 2 years ago

Triage Stage: UnreviewedAccepted

We could add a new feature flag to add an extra subquery only on MySQL:

  • django/db/backends/base/features.py

    diff --git a/django/db/backends/base/features.py b/django/db/backends/base/features.py
    index c54d30cf73..e8737aadc2 100644
    a b class BaseDatabaseFeatures:  
    1212    allows_group_by_selected_pks = False
    1313    empty_fetchmany_value = []
    1414    update_can_self_select = True
     15    # Does the backend support self-reference subqueries in the DELETE
     16    # statement?
     17    delete_can_self_reference_subquery = True
    1518
    1619    # Does the backend distinguish between '' and None?
    1720    interprets_empty_strings_as_nulls = False
  • django/db/backends/mysql/features.py

    diff --git a/django/db/backends/mysql/features.py b/django/db/backends/mysql/features.py
    index 3ea3deeae3..848d891a84 100644
    a b class DatabaseFeatures(BaseDatabaseFeatures):  
    2525    supports_slicing_ordering_in_compound = True
    2626    supports_index_on_text_field = False
    2727    supports_update_conflicts = True
     28    delete_can_self_reference_subquery = False
    2829    create_test_procedure_without_params_sql = """
    2930        CREATE PROCEDURE test_procedure ()
    3031        BEGIN
  • django/db/models/sql/compiler.py

    diff --git a/django/db/models/sql/compiler.py b/django/db/models/sql/compiler.py
    index 9c7bd8ea1a..e57a43ac47 100644
    a b class SQLDeleteCompiler(SQLCompiler):  
    17211721        Create the SQL for this query. Return the SQL string and list of
    17221722        parameters.
    17231723        """
    1724         if self.single_alias and not self.contains_self_reference_subquery:
     1724        if self.single_alias and (
     1725            self.connection.features.delete_can_self_reference_subquery or
     1726            not self.contains_self_reference_subquery
     1727        ):
    17251728            return self._as_sql(self.query)
    17261729        innerq = self.query.clone()
    17271730        innerq.__class__ = Query

Another issue is that QuerySet.delete() from the ticket description:

subquery = Comment.objects.filter(created_at__lt=datetime(2022, 6, 1))[:1000]
Comment.objects.filter(id__in=subquery).delete()
...
  File "site-packages/MySQLdb/connections.py", line 254, in query
    _mysql.connection.query(self, query)
django.db.utils.NotSupportedError: (1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")

crashes on MySQL even with 4074f38e1dcc93b859bbbfd6abd8441c3bca36b3 (I'm not sure it's worth fixing.)

Last edited 2 years ago by Mariusz Felisiak (previous) (diff)

comment:6 by Simon Charette, 2 years ago

Mariusz I think that it makes sense to only enable the query wrapping when necessary through a feature flag as you've suggested.

As for the MySQL limited subquery issue that's a different can of worms. The unnecessary query wrapping should be testable on backends with the feature disabled by counting the number of SELECT.

Thanks for the ping and sorry for my late answer!

comment:9 by Aman Pandey, 19 months ago

Owner: set to Aman Pandey
Status: newassigned

comment:10 by Aman Pandey, 19 months ago

Has patch: set
Needs tests: set

comment:11 by Aman Pandey, 19 months ago

Needs tests: unset

PR ready.

comment:12 by Mariusz Felisiak, 19 months ago

Triage Stage: AcceptedReady for checkin

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 19 months ago

Resolution: fixed
Status: assignedclosed

In 0b0998dc:

Fixed #33759 -- Avoided unnecessary subquery in QuerySet.delete() with self-referential subqueries if supported.

Note: See TracTickets for help on using tickets.
Back to Top