#32449 closed New feature (wontfix)
Allow specifying tables in RawSQL().
Reported by: | João Carneiro Haas | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.1 |
Severity: | Normal | Keywords: | annotate, count, RawSQL, select_related, JOIN |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
We have in our application a RawSQL annotation which depends on a internal join to work:
# accounts/models.py class Org(models.Model): name = models.TextField() class Account(models.Model): org = models.ForeignKey(Org) query = Account.objects.select_related("org").annotate(org_name=RawSQL("SELECT custom_name(accounts_org.name)"))
Well, sometimes Django tries to optimize the queries and removes unneded JOINs, such as when calling count:
query.count() # SELECT COUNT(*) FROM ( # SELECT "accounts_account"."id" AS Col1, (SELECT "accounts_org"."name") AS "org_name" # FROM "accounts_org" # GROUP BY "devices_device"."id", (SELECT "accounts_org"."name") # )
Which results in an 'missing FROM-clause' error.
Since (I believe) it's impossible for Django to analyze the passed RawSQL query, I would like to request some way to specify which joins/select_related the annotated RawSQL depends on.
My suggestion would be to pass a list of table dependencies to the RawSQL:
raw = RawSQL("SELECT custom_name(accounts_org.name)", tables=["org"])
If the ticket gets 'confirmed' and an API gets specified I can try tackling this issue.
For now I'm using the following workaround, which forces a join:
query = Accounts.objects.annotate(org_id=F('org__id'), custom_name=...)
Obs.: Fixing this before https://code.djangoproject.com/ticket/28477 would result in queries with unused annotations (such as the ones above) to always perform the join. I believe it won't impact the development of this ticket, but who knows ¯\_(ツ)_/¯
Change History (5)
follow-up: 3 comment:1 by , 4 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Type: | Uncategorized → New feature |
comment:2 by , 4 years ago
Summary: | Issues with RawSQL annotations dependencies and count() → Allow specifying tables in RawSQL(). |
---|
follow-up: 4 comment:3 by , 4 years ago
Replying to Mariusz Felisiak:
You can do the same by creating a custom function
Ok, that actually solves the issue I presented, but I'm at fault here, the custom_name
example was just the easiest way I found to explain the issue, we're actually doing a Postgres recursive query, which as far as I know there's no way to represent in a Django structure.
Here's a more truthful example on how we're using the RawSQL
. It's not the exact query and models, but I believe if there's a way to solve for this our case would also be solved:
# graph/models.py class Node(models.Model): pass class Edge(models.Model): parent = models.ForeignKey(Node) child = models.ForeignKey(Node) class A(models.Model): node = models.ForeignKey(Node) class B(models.Model): a = models.ForeignKey(A) # b_qs is a really complex 'B' queryset b_qs = b_qs.objects.select_related('a').annotate(graph_child_count=RawSQL(""" WITH RECURSIVE children(id) AS ( SELECT "e"."child_id" FROM "graph_edge" AS "e" WHERE "e"."parent_id" = "graph_a"."node_id" UNION SELECT "e"."child_id" FROM "graph_edge" AS "e", "children" WHERE "e"."parent_id" = "children"."id" ) SELECT COUNT(*) FROM "graph_children" """, [])).filter(graph_child_count__lt=10).count()
Now, regarding this:
RawSQL()
is the last resort
using QuerySet.extra()
extra
is the real last resort though. Every place in the documentation says it should be avoided, and it will be deprecated as soon as it's not needed anymore, and if needed, you should use a RawSQL
with an annotate. Yes, I believe I could use extra
to solve my issue, but I want to use a reccomended way to solve this issue.
So, to summarize, if there's a way I'm unaware of about encapsulating my query in a Django structure (such as in the Func
custom_name
example), yeah, I think this ticket is invalid. Otherwise, I really think there should be a way to run more specific queries while having the guarantee that stuff inside the final query will be there.
Edit: So, maybe the ticket should be changed to implement some kind of CustomQuery
class which would allow me to define a query and dependent fields, or receive arguments with __
operators as in the Func
example.
follow-up: 5 comment:4 by , 4 years ago
Replying to João Carneiro Haas:
Ok, that actually solves the issue I presented, but I'm at fault here, the
custom_name
example was just the easiest way I found to explain the issue, we're actually doing a Postgres recursive query, which as far as I know there's no way to represent in a Django structure.
We have an accepted ticket for CTE, see #28919.
So, to summarize, if there's a way I'm unaware of about encapsulating my query in a Django structure (such as in the
Func
custom_name
example), yeah, I think this ticket is invalid. Otherwise, I really think there should be a way to run more specific queries while having the guarantee that stuff inside the final query will be there.
There is no need for a new API, because resolving #28477 will strip unused annotation, so crash is a duplicate of #28477. You can also push everything to a subquery.
comment:5 by , 4 years ago
Replying to Mariusz Felisiak:
We have an accepted ticket for CTE, see #28919.
Yeah, I believe this will end up solving my issue.
There is no need for a new API, because resolving #28477 will strip unused annotation, so crash is a duplicate of #28477
The crash will still occur even after #28477 is fixed, since you can (and probably will) use the annotation on other statements which will influence count
:
b_qs = b_qs.objects.annotate(graph_child_count=RawSQL(...)).filter(graph_child_count__lt=10).count()
So yeah, the CTE ticket solves our specific issue, although I still believe this feature would be useful for other non-supported Django stuff which require a RawSQL
.
Thanks for this ticket, however
RawSQL()
is the last resort and I don't think we would like to add any extra API to it. You can do the same by creating a custom function (preferable):or by using
QuerySet.extra()
(which also supportstables
):or by specifying the full subquery in
RawSQL()
:I treated this as a request for a new API, because resolving #28477 will strip unused annotation, so crash is a duplicate of #28477.