#32433 closed Bug (fixed)
Delete distinct produces an unexpected and potentially harmful SQL
Reported by: | egism | Owned by: | egism |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.1 |
Severity: | Normal | Keywords: | orm, delete, distinct |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I was looking for a way to delete the first Comment of each Post (a sample domain). Since I know that every new Post starts with a system generated comment I decided to go with:
Comment.objects.order_by('post_id', 'created_at').distinct('post_id').delete()
Before proceeding I tested it with:
Comment.objects.order_by('post_id', 'created_at').distinct('post_id').count()
Made sure the result actually contains what I needed and proceeded with the delete()
. The result was rather surprising. I was notified that the whole table was wiped clean. I then checked the actual SQL that was executed and it was a simple DELETE FROM comments;
.
As an ORM user, I would say it is the worst outcome possible and I would at least expect an error in such a case or ideally a SQL of what I was trying to achieve. At the same time, count
and delete
produces an inconsistent result which is even more mistaking.
Potential solutions:
- raise an error with a decent explanation
- produce a desired SQL according to the query
Since I have never submitted a change to Django, I have a very limited knowledge of the ORM and its intricacies. I could give it a try and issue a patch for this with some guidance.
Change History (10)
comment:1 by , 4 years ago
Description: | modified (diff) |
---|
follow-up: 3 comment:2 by , 4 years ago
Keywords: | postgresql removed |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Bug |
comment:3 by , 4 years ago
Hi Simon,
Sounds very simple, I will give it a go. Will be a good opportunity to make my first contribution. Thanks for directions.
comment:4 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 4 years ago
Has patch: | set |
---|
comment:7 by , 4 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
I think we should start by failing loudly when
distinct().delete()
is used as adding support for it would require a subquery pushdown (DELETE FROM table WHERE field IN (SELECT DISTINCT field FROM table
) which would require a non negligible amount of work and additional complexity for the rare cases where this is useful and a manual__in
subquery lookup can be used to achieve the same results e.g.egism if you want to work on a patch you should raise a
TypeError
inQuerySet.delete
ifself.query.distinct or self.query.distinct_fields
. A regression test can then be added todelete_regress/tests
(source).