Opened 20 months ago
Closed 20 months ago
#34450 closed Bug (fixed)
Lookup expressions across foreign keys introduce extra joins
Reported by: | Roman Odaisky | Owned by: | Simon Charette |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.1 |
Severity: | Normal | Keywords: | |
Cc: | David Sanders | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Assuming Child has a ForeignKey to Parent,
Parent.objects.filter(child__x="x", child__y="y")
results in
SELECT ... FROM "app_parent" INNER JOIN "app_child" ON ("app_parent"."id" = "app_child"."parent_id") WHERE ("app_child"."x" = ('x') AND "app_child"."y" = ('y'))
which makes perfect sense.
However,
Parent.objects.filter(Exact(F("child__x"), "x"), child__y="y")
unexpectedly produces
SELECT ... FROM "app_parent" LEFT OUTER JOIN "app_child" ON ("app_parent"."id" = "app_child"."parent_id") INNER JOIN "app_child" T3 ON ("app_parent"."id" = T3."parent_id") WHERE ("app_child"."x" = ('x') AND T3."y" = ('y'))
basically converting an AND lookup into an OR, which canβt be correct.
Same error here:
Parent.objects.filter(Exact(F("child__x"), "x") & Q(child__y="y"))
But these work correctly:
Parent.objects.filter(Q(child__y="y") & Exact(F("child__x"), "x")) Parent.objects.filter(Exact(F("child__x"), "x") & Exact(F("child__y"), "y"))
It would seem that all of the above should produce the same SQL as the first invocation instead of introducing extra joins that quietly turn ANDs into ORs, and most definitely there should be no dependence on the order of the operands.
Change History (12)
comment:1 by , 20 months ago
Triage Stage: | Unreviewed β Accepted |
---|
comment:2 by , 20 months ago
Some testing:
Correct joining:
query = Query(Parent) query.add_q(Q(child__x__exact= 'x')) query.add_q(Q(Exact(F("child__x"), 'x'))) print(query) SELECT "ticket_34450_parent"."id" FROM "ticket_34450_parent" INNER JOIN "ticket_34450_child" ON ("ticket_34450_parent"."id" = "ticket_34450_child"."parent_id") WHERE ("ticket_34450_child"."x" = x AND "ticket_34450_child"."x" = (x))
On the other hand with the order reversed:
query = Query(Parent) query.add_q(Q(Exact(F("child__x"), 'x'))) query.add_q(Q(child__x__exact= 'x')) print(query) SELECT "ticket_34450_parent"."id" FROM "ticket_34450_parent" LEFT OUTER JOIN "ticket_34450_child" ON ("ticket_34450_parent"."id" = "ticket_34450_child"."parent_id") INNER JOIN "ticket_34450_child" T3 ON ("ticket_34450_parent"."id" = T3."parent_id") WHERE ("ticket_34450_child"."x" = (x) AND T3."x" = x)
comment:3 by , 20 months ago
QuerySet.filter
results βin a single Query.add_q
so I'm not sure that comparing the effect of two distinct add_q
call should be the area of focus here as the latter is more alike to filter()
chaining calls (e.g. QuerySet.filter().filter()
) which has different semantic when dealing with multi-valued relationships.
It appears there is an issue in Query.build_filter
when dealing with expressions, possibly the lack of can_reuse
/used_aliases
passing to Expression.resolve_expression
.
comment:4 by , 20 months ago
Haven't run the full test suite yet but it seems that intuitively passing reuse
so the JOINs are reused does the trick for this particular issue at least
diff --git a/django/db/models/sql/query.py b/django/db/models/sql/query.py index 3cb8bdbc8a..1ed7d7cb6e 100644 --- a/django/db/models/sql/query.py +++ b/django/db/models/sql/query.py @@ -1373,7 +1373,7 @@ def build_filter( if not getattr(filter_expr, "conditional", False): raise TypeError("Cannot filter against a non-conditional expression.") condition = filter_expr.resolve_expression( - self, allow_joins=allow_joins, summarize=summarize + self, allow_joins=allow_joins, reuse=can_reuse, summarize=summarize, ) if not isinstance(condition, Lookup): condition = self.build_lookup(["exact"], condition, True)
comment:5 by , 20 months ago
Owner: | changed from | to
---|---|
Status: | new β assigned |
comment:7 by , 20 months ago
Has patch: | set |
---|
comment:8 by , 20 months ago
@Simon
so I'm not sure that comparing the effect of two distinct add_q call should be the area of focus
Sure, I'm not saying that was the area of focus, just that it was some interesting behaviour I saw that Query will try to demote if possible based on existing inner joins (which appears to be done in add_q()
) (and that it was worth sharing to whoever might be interested)
comment:9 by , 20 months ago
Oh and it also raises the question of whether
Parent.objects.filter(Exact(F("child__x"), "x")
should be demoted (produce an inner join) because at it produces a left join even with Simon's patch.
This is because the moment build_filter()
doesn't attempt to find any "needed inners" for anything that looks like an expression, thus _add_q()
doesn't have anything to try to demote to.
π
comment:10 by , 20 months ago
Cc: | added |
---|
comment:11 by , 20 months ago
Just FYI Simon's patch resolves the reported issue:
> print(Parent.objects.filter(Exact(F("child__x"), "x"), child__y="y").query) SELECT "ticket_34450_parent"."id" FROM "ticket_34450_parent" INNER JOIN "ticket_34450_child" ON ("ticket_34450_parent"."id" = "ticket_34450_child"."parent_id") WHERE ("ticket_34450_child"."x" = (x) AND "ticket_34450_child"."y" = y)
Hi, thanks for the report π
Confirmed on latest main that a left outer join is produced.