Opened 9 years ago
Closed 9 years ago
#25403 closed New feature (duplicate)
Subqueries: a common case where people get tempted to use .extra()
Reported by: | Paolo | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | QuerySet.extra |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
As written in the documentation I submit a use case where the .extra()
method seems the only way to do what I need. Given this model:
from django.db import models class Post(models.Model): text = models.CharField(max_length=255) class Comment(models.Model): text = models.CharField(max_length=255) needs_attention = models.BooleanField(default=False) created_at = models.DateTimeField(auto_now_add=True) post = models.ForeignKey(Post)
I can annotate
the list of posts with:
Post.objects.annotate(last_comment_date=Max('comment__created_at'))
to know the last_comment_date
, how to annotate to set a value for last_comment_needs_attention
?
I mean how can I obtain more informations from the comments while listing the posts
?
I see that this is a quite common request as a feature of the Django ORM, the most detailed description of this problem can be found in this nice blog post .
Is there any plan to allow such a way to annotate? Thx
Change History (9)
follow-up: 6 comment:1 by , 9 years ago
comment:2 by , 9 years ago
Version: | 1.8 → master |
---|
I'm not sure if there's already a ticket tracking this feature request but Anssi would know.
I guess we could provide an API similar to the prefetch_related()
/Prefetch()
one.
from django.db import models from django.utils import timezone class Post(models.Model): text = models.CharField(max_length=255) class Comment(models.Model): post = models.ForeignKey(Post, related_name='comments') text = models.CharField(max_length=255) created_at = models.DateTimeField(default=timezone.now) Post.objects.select_related( latest_comment=Select( 'comments', queryset=Comments.objects.order_by('-created_at') ) )
comment:3 by , 9 years ago
The prefetch api already lets you get the correct result at the expense of an extra query. Unfortunately, Django doesn't have a good story when it comes to supporting joins to subqueries just yet. Note that the linked blog post also causes an extra query, so the below is comparable (and supported!)
>>> p1 = Post.objects.create(text='Post 1') >>> p2 = Post.objects.create(text='Post 2') >>> c1 = Comment.objects.create(text='Comment 1', post=p1) >>> c2 = Comment.objects.create(text='Comment 2', post=p1) >>> c3 = Comment.objects.create(text='Comment 3', post=p2) >>> latest_comment_ids = Comment.objects.values('post_id').annotate(max_id=Max('id')).values('max_id') >>> latest_comments = Comment.objects.filter(id__in=latest_comment_ids) >>> posts = Post.objects.prefetch_related(Prefetch('comment_set', queryset=latest_comments, to_attr='latest_comment')) >>> for post in posts: ... print(post.text) ... for comment in post.latest_comment: ... print(comment.text) ... Post 1 Comment 2 Post 2 Comment 3
The queries executed will be close to:
SELECT * FROM "scratch_post" -- and -- SELECT "scratch_comment"."id", "scratch_comment"."text", "scratch_comment"."needs_attention", "scratch_comment"."created_at", "scratch_comment"."post_id" FROM "scratch_comment" WHERE "scratch_comment"."id" IN ( SELECT MAX(U0."id") AS "max_id" FROM "scratch_comment" U0 GROUP BY U0."post_id" )
comment:4 by , 9 years ago
I am convinced having something like
Post.objects.select_related( latest_comment=Select( 'comments', queryset=Comments.objects.order_by('-created_at') ) )
in Django is a great idea.
If we got .filter() to accept expressions, then you could do exactly the same raw SQL query with expressions as the blog mentions. That is, extra(where=['(blog_id, added) IN %s' % (values,)])
== .filter(RawSQL('(blog_id, added) IN %s', (values,))
.
Note that the .extra() version used in the blog post uses string formatting. While as used in the blog post the query is safe, you might be vulnerable to SQL injection attacks if you use the same pattern with user editable char fields for example.
comment:5 by , 9 years ago
Summary: | A quite common use of .extra() → Subqueries: a common case where people get tempted to use .extra() |
---|
comment:6 by , 9 years ago
Replying to timgraham:
Thanks Tim for noting this, does the title make more sense now?
follow-up: 8 comment:7 by , 9 years ago
comment:8 by , 9 years ago
comment:9 by , 9 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
I don't know, I guess it depends what's actually implemented. I'll add a comment on #25403 asking to reopen this ticket if it doesn't address this issue if/when it's completed.
Could you please try to edit the ticket description to describe the query/behavior you are trying to achieve? Otherwise, it seems we have to translate the models you've provided into the the details of the blog post. Thanks.