#31453 closed Cleanup/optimization (fixed)
Avoid comparison against a boolean value for Exists().
Reported by: | Balazs Endresz | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.0 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Django creates sql for Exists
subqueries like this: WHERE EXISTS(...) = true
But for some reason this results in a very inefficient query plan (at least in postgres 11.2).
Simply removing = true
makes it several hundred times faster in some cases.
With = true
:
EXPLAIN ANALYZE SELECT COUNT(*) FROM ( SELECT "ap_facilitysubtype"."facility_sub_type_id" AS Col1 FROM "ap_facilitysubtype" WHERE EXISTS( SELECT U0."facility_id" FROM "ap_facility" U0 WHERE U0."facility_sub_type_id" = ("ap_facilitysubtype"."facility_sub_type_id") ) = true ) subquery; QUERY PLAN --------------------- Aggregate (cost=225.56..225.57 rows=1 width=8) (actual time=1100.085..1100.086 rows=1 loops=1) -> Seq Scan on ap_facilitysubtype (cost=0.00..225.41 rows=58 width=0) (actual time=0.137..1100.040 rows=61 loops=1) Filter: (alternatives: SubPlan 1 or hashed SubPlan 2) Rows Removed by Filter: 54 SubPlan 1 -> Seq Scan on ap_facility u0 (cost=0.00..3248.91 rows=1704 width=0) (actual time=9.560..9.560 rows=1 loops=115) Filter: ("facility_sub_type_id" = ap_facilitysubtype."facility_sub_type_id") Rows Removed by Filter: 53523 SubPlan 2 -> Seq Scan on ap_facility u0_1 (cost=0.00..2993.33 rows=102233 width=4) (never executed) Planning Time: 1.184 ms Execution Time: 1100.749 ms
Without = true
:
EXPLAIN ANALYZE SELECT COUNT(*) FROM ( SELECT "ap_facilitysubtype"."facility_sub_type_id" AS Col1 FROM "ap_facilitysubtype" WHERE EXISTS( SELECT U0."facility_id" FROM "ap_facility" U0 WHERE U0."facility_sub_type_id" = ("ap_facilitysubtype"."facility_sub_type_id") ) ) subquery; QUERY PLAN --------------------- Aggregate (cost=52.24..52.25 rows=1 width=8) (actual time=1.759..1.760 rows=1 loops=1) -> Nested Loop Semi Join (cost=0.29..52.09 rows=60 width=0) (actual time=0.067..1.734 rows=61 loops=1) -> Seq Scan on ap_facilitysubtype (cost=0.00..6.15 rows=115 width=4) (actual time=0.015..0.052 rows=115 loops=1) -> Index Only Scan using "ap_facility_facility_sub_type_id_577e61d3" on ap_facility u0 (cost=0.29..108.41 rows=1704 width=4) (actual time=0.013..0.013 rows=1 loops=115) Index Cond: ("facility_sub_type_id" = ap_facilitysubtype."facility_sub_type_id") Heap Fetches: 61 Planning Time: 0.674 ms Execution Time: 1.841 ms
It's not likely to be the correct way to do this but = true
can be removed by adding the following to BuiltinLookup.as_sql
:
from django.db.models.expressions import Exists if isinstance(self.lhs, Exists) and self.rhs is True and params == [True]: return '%s' % (lhs_sql), []
I suspect = true
is not necessary in any other database either, so would it make sense to remove it when used with subqueries like this?
Change History (2)
comment:1 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Summary: | Speed up Exists subqueries → Avoid comparison against a boolean value for Exists(). |
comment:2 by , 5 years ago
That works great, thank you!
I installed django from master and can confirm = true
isn't there in the sql any more.
It's already fixed on master, see 37e6c5b79bd0529a3c85b8c478e4002fd33a2a1d.