Opened 12 years ago
Closed 9 years ago
#19513 closed Bug (fixed)
update() after annotate(val=Sum(...)) crashes on PostgreSQL & Oracle
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.4 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
An QuerySet operation like:
Combination.objects.all().update(tag='TagName')
The structure like:
class Combination: items = ForeignKey(Item)
CombinationManager: def get_query_set(self): qs = super(CombinationManager, self).get_query_set() qs = qs.annotate(price=Sum('items__price'))
The SQLite and MySQL runs fine with such structure, not with Oracle, it will encounter "ORA-00913: too many values"
Change History (11)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
I had to use way too much time reproducing this. Your models do not actually work, the SQL in the comment doesn't match the queries in the description. The queries in the description doesn't match the models.
Here are the models & code to reproduce this (on master):
class Item(models.Model): price = models.IntegerField() class CombinationManager(models.Manager): def get_query_set(self): qs = super(CombinationManager, self).get_query_set() qs = qs.annotate(price=models.Sum('items__price')) return qs class Combination(models.Model): tag = models.CharField(max_length=32) items = models.ForeignKey(Item) objects = CombinationManager() Combination.objects.all().update(tag='TagName')
The problem is the SUM() select in the subquery. Happens on PostgreSQL too.
comment:3 by , 9 years ago
Summary: | Annotate broken django.db.backends.oracle while updating → update() after annotate(val=Sum(...)) crashes on PostgreSQL & Oracle |
---|
comment:5 by , 9 years ago
Has there been any recent progress on this issue? It really limits adding annotations in a Manager because those annotations may prevent updates. I'm seeing this with sqlite3 as well, so the original (3 year old) description is not accurate in that regard. I believe MySQL is OK for the reason in #1, it appears to do a SELECT first and then update only those IDs. The example code akaariai gave still replicates the issue on sqlite3 with the slight tweak that 'get_query_set' is now 'get_queryset'.
I have a fix that I believe nicely resolves the issue, but I don't know nearly enough about query and SQL generating codebase to know if it's missing something obvious or will cause other potential problems.
The fix is fairly simple I simply clear annotations (unsure if the set_annotation_mask
call is necessary) on the query after it is cloned and the update values are added in QuerySet.update. That resolved 90% of the issue. On #25171 zauddelig made a comment that this approach of clearing annotations would mess with filters and and F functions using the annotations. At least with filters on sqlite3 this is not a problem, as the filters are applied to the query before the update occurs and simply inserts the resolved annotation into the WHERE clause. Aggregations are already not allowed in the update so really it's only situations where you have an annotation like annotate(range=F('max') - F('min')).update(foo=F('range')). I fixed this by resolving the update values in UpdateQuery. add_update_fields so that they're resolved before stripping the annotations from the UPDATE query.
I believe this fix to be 'safe' in the sense that annotations create new names and aren't allowed to shadow a field on the model so any potential problem from clearing the annotations should appear as a name failing to resolve in a query because the annotation was removed. I don't believe it could cause any subtle mischief or corruption.
Is there any risk in resolving expressions at the time they're added to the update query as an update field?
BTW, zauddelig's comment actually points out an edge case in the MySQL generated code (and for a different reason sqlite3, I tested both) where if you try to use an annotation which spans tables in an update, everything will resolve and attempt to execute but the SQL will fail because for MySQL the query is broken into a SELECT and then an UPDATE, but the necessary info is not available in the UPDATE.
For example:
class Bar(models.Model): name = models.CharField(max_length=32) class Foo(models.Model): related_bar = models.ForeignKey(Bar) bar_name = models.CharField(max_length=32) ---------------------- Foo.objects.annotate(bar_name=F('related_bar__name')).update(name=F('bar_name')) ------------------------ SELECT `test_foo`.`id`, `test_bar`.`name` AS `bar_name` FROM `test_foo` LEFT OUTER JOIN `test_bar` ON ( `test_foo`.`related_bar` = `test_bar`.`id` ); UPDATE `test_foo` SET `bar_name` = `test_bar`.`name` WHERE `test_foo`.`id` IN (1);
The generated SQL for sqlite3 is similar with the SELECT inside the WHERE clause. Seems like if this case is allowed (which is a narrow use case since aggregates aren't allowed in the update at the moment) it would need to add the JOIN to the update query for everything to work as expected. Instead it fails currently during the SQL execution. Note, I only tested this on Django 1.8.7 but I haven't seen any recent changes in the related code to suggest that a similar result wouldn't happen on master.
comment:6 by , 9 years ago
There isn't much risk in adding a regression test to your patch and sending a pull request. If it passes the test suite, we'll at least know there aren't any known issues with the approach you suggested and query experts are more likely to provide feedback. Thanks!
comment:7 by , 9 years ago
It may be a bit before I get the time to do a test for it, so I'll let it sit here for a bit and hopefully if it's a waste of time someone will chime in with that info before I spend the time writing a test. I can probably get to a test in a couple weeks.
comment:8 by , 9 years ago
Opened a PR request with a regression test as requested. The PR should also fix #18580 which is the same underlying issue as this ticket.
comment:9 by , 9 years ago
Has patch: | set |
---|
I think that I know why MySQL and SQlite runs fine, cause both of them will query first then update the queried sets, however Oracle backend will not, it simply puts everything in the same query.
Here is the buggy Oracle Query set
Here is MySQL Query Set