Opened 8 years ago
Last modified 6 years ago
#27867 new Bug
Excluding a reverse relation produces inconsistent queries — at Version 1
Reported by: | Sébastien Diemer | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | diemersebastien@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
When excluding a reverse relation from a queryset, the query generated by the ORM depends on the order in which the exclude
kwargs are processed.
This order is random since python 3.3 (see https://docs.python.org/3/reference/datamodel.html#object.__hash).
Django converts a query like Tag.objects.filter(category_id=2).exclude(note__isnull=True)
to the following SQL:
SELECT "queries_tag"."id", "queries_tag"."name", "queries_tag"."parent_id", "queries_tag"."category_id" FROM "queries_tag" WHERE ( "queries_tag"."category_id" = 2 AND NOT ( "queries_tag"."id" IN ( SELECT U0."id" AS Col1 FROM "queries_tag" U0 LEFT OUTER JOIN "queries_note" U1 ON ( U0."id" = U1."tag_id" ) WHERE U1."id" IS NULL)) ) ORDER BY "queries_tag"."name" ASC
If another kwarg is added to exclude
, then the subquery "queries_tag"."id" IN
may be modified : a new clause is added to the WHERE
filter to avoid searching inside the entire queries_tag
table. WHERE U1."id" IS NULL
becomes WHERE (U1."id" IS NULL AND U0."id" = ("queries_tag"."id"))
.
This is not systematic, and seems to depend on the order on which the kwargs are processed (which is random for python3.3 and above).
This random behavior is the first thing to fix in my opinion.
In addition if the reverse relation is the only kwarg (like in my example above), the generated subquery misses this additional clause that enables to dramatically increase the performance of the query on large datasets. For performance reasons I think that the WHERE
filtering in the subquery should always be WHERE (U1."id" IS NULL AND U0."id" = ("queries_tag"."id"))
.
This is a second aspect of the bug that could be fixed.
see https://github.com/sebdiem/django/commit/e8ca265e8c30f61c88ea17b7c1cab72695d6a4ff for details on how to reproduce the error.
The bugs exist in Django 1.8 and master (and probably all version in between)