Opened 12 years ago

Closed 8 years ago

#19513 closed Bug (fixed)

update() after annotate(val=Sum(...)) crashes on PostgreSQL & Oracle

Reported by: mengzhuo1203@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

An QuerySet operation like:
Combination.objects.all().update(tag='TagName')

The structure like:

class Combination:
    items = ForeignKey(Item)
CombinationManager:
    def get_query_set(self):
        qs = super(CombinationManager, self).get_query_set()
        qs = qs.annotate(price=Sum('items__price'))

The SQLite and MySQL runs fine with such structure, not with Oracle, it will encounter "ORA-00913: too many values"

Change History (11)

comment:1 by Meng Zhuo, 12 years ago

I think that I know why MySQL and SQlite runs fine, cause both of them will query first then update the queried sets, however Oracle backend will not, it simply puts everything in the same query.

Here is the buggy Oracle Query set

UPDATE "ITEM_COMBINATION" SET "ACTIVE" = :arg0 WHERE "ITEM_COMBINATION"."ID" IN (
    SELECT U0."ID", SUM(U8."PRICE") AS "PRICE" 
           FROM "ITEM_COMBINATION" U0 
           LEFT OUTER JOIN "ITEM_COMBINATION_ITEMS" U7 ON (U0."ID" = U7."COMBINATION_ID") 
           LEFT OUTER JOIN "ITEM_ITEM" U8 ON (U7."ITEM_ID" = U8."ID") WHERE U0."DURATION_ID" = :arg1  
           GROUP BY U0."ID", U0."ITEM_ID", U0."NAME", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."ACTIVE", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."CATEGORY_ID"); args=(False, 2)

Here is MySQL Query Set

SELECT U0.`id`, SUM(U8.`price`) AS `price`, MIN(U8.`store`) AS `store` FROM `item_combination` U0 LEFT OUTER JOIN `item_combination_items` U7 ON (U0.`id` = U7.`combination_id`) LEFT OUTER JOIN `item_item` U8 ON (U7.`item_id` = U8.`id`) GROUP BY U0.`id`, U0.`item_id`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`active`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM` ORDER BY NULL; args=()

UPDATE `item_combination` SET `active` = 1 WHERE `item_combination`.`id` IN (1, 3); args=(True, 1, 3)

comment:2 by Anssi Kääriäinen, 12 years ago

Triage Stage: UnreviewedAccepted

I had to use way too much time reproducing this. Your models do not actually work, the SQL in the comment doesn't match the queries in the description. The queries in the description doesn't match the models.

Here are the models & code to reproduce this (on master):

class Item(models.Model):
    price = models.IntegerField()
  
class CombinationManager(models.Manager):
    def get_query_set(self):
        qs = super(CombinationManager, self).get_query_set()
        qs = qs.annotate(price=models.Sum('items__price'))
        return qs
  
class Combination(models.Model):
    tag = models.CharField(max_length=32)
    items = models.ForeignKey(Item)
    objects = CombinationManager()

Combination.objects.all().update(tag='TagName')

The problem is the SUM() select in the subquery. Happens on PostgreSQL too.

comment:3 by Tim Graham, 9 years ago

Summary: Annotate broken django.db.backends.oracle while updatingupdate() after annotate(val=Sum(...)) crashes on PostgreSQL & Oracle

comment:4 by Josh Smeaton, 9 years ago

See also #25171

comment:5 by David Sanders, 9 years ago

Has there been any recent progress on this issue? It really limits adding annotations in a Manager because those annotations may prevent updates. I'm seeing this with sqlite3 as well, so the original (3 year old) description is not accurate in that regard. I believe MySQL is OK for the reason in #1, it appears to do a SELECT first and then update only those IDs. The example code akaariai gave still replicates the issue on sqlite3 with the slight tweak that 'get_query_set' is now 'get_queryset'.

I have a fix that I believe nicely resolves the issue, but I don't know nearly enough about query and SQL generating codebase to know if it's missing something obvious or will cause other potential problems.

The fix is fairly simple I simply clear annotations (unsure if the set_annotation_mask call is necessary) on the query after it is cloned and the update values are added in QuerySet.update. That resolved 90% of the issue. On #25171 zauddelig made a comment that this approach of clearing annotations would mess with filters and and F functions using the annotations. At least with filters on sqlite3 this is not a problem, as the filters are applied to the query before the update occurs and simply inserts the resolved annotation into the WHERE clause. Aggregations are already not allowed in the update so really it's only situations where you have an annotation like annotate(range=F('max') - F('min')).update(foo=F('range')). I fixed this by resolving the update values in UpdateQuery. add_update_fields so that they're resolved before stripping the annotations from the UPDATE query.

I believe this fix to be 'safe' in the sense that annotations create new names and aren't allowed to shadow a field on the model so any potential problem from clearing the annotations should appear as a name failing to resolve in a query because the annotation was removed. I don't believe it could cause any subtle mischief or corruption.

Is there any risk in resolving expressions at the time they're added to the update query as an update field?

BTW, zauddelig's comment actually points out an edge case in the MySQL generated code (and for a different reason sqlite3, I tested both) where if you try to use an annotation which spans tables in an update, everything will resolve and attempt to execute but the SQL will fail because for MySQL the query is broken into a SELECT and then an UPDATE, but the necessary info is not available in the UPDATE.

For example:

class Bar(models.Model):
    name = models.CharField(max_length=32)

class Foo(models.Model):
    related_bar = models.ForeignKey(Bar)
    bar_name = models.CharField(max_length=32)

----------------------

Foo.objects.annotate(bar_name=F('related_bar__name')).update(name=F('bar_name'))

------------------------

SELECT `test_foo`.`id`, `test_bar`.`name` AS `bar_name` FROM `test_foo` LEFT OUTER JOIN `test_bar` ON ( `test_foo`.`related_bar` = `test_bar`.`id` );

UPDATE `test_foo` SET `bar_name` = `test_bar`.`name` WHERE `test_foo`.`id` IN (1);

The generated SQL for sqlite3 is similar with the SELECT inside the WHERE clause. Seems like if this case is allowed (which is a narrow use case since aggregates aren't allowed in the update at the moment) it would need to add the JOIN to the update query for everything to work as expected. Instead it fails currently during the SQL execution. Note, I only tested this on Django 1.8.7 but I haven't seen any recent changes in the related code to suggest that a similar result wouldn't happen on master.

comment:6 by Tim Graham, 9 years ago

There isn't much risk in adding a regression test to your patch and sending a pull request. If it passes the test suite, we'll at least know there aren't any known issues with the approach you suggested and query experts are more likely to provide feedback. Thanks!

comment:7 by David Sanders, 9 years ago

It may be a bit before I get the time to do a test for it, so I'll let it sit here for a bit and hopefully if it's a waste of time someone will chime in with that info before I spend the time writing a test. I can probably get to a test in a couple weeks.

comment:8 by David Sanders, 9 years ago

Opened a PR request with a regression test as requested. The PR should also fix #18580 which is the same underlying issue as this ticket.

comment:9 by Tim Graham, 9 years ago

Has patch: set

comment:10 by Tim Graham, 9 years ago

Patch needs improvement: set

Comments for improvement are on the PR.

comment:11 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: newclosed

In a84344bc:

Fixed #19513, #18580 -- Fixed crash on QuerySet.update() after annotate().

Note: See TracTickets for help on using tickets.
Back to Top