#34502 closed Bug (duplicate)
Union giving wrong column ordering even when using .values/.values_list
Reported by: | Thiago Bellini Ribeiro | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.2 |
Severity: | Normal | Keywords: | |
Cc: | Simon Charette | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This is a simplified example of a union I have in a project:
Product.objects.all().annotate( kind=Value("base_price", output_field=models.CharField()), priority=Value(0), product_id=F("id"), pos_id=Value(None, output_field=models.BigAutoField()), ).values( "kind", "priority", "price", "product_id", "pos_id", ).order_by().union( PriceTableProduct.objects.all().annotate( priority=Value(1), kind=Value("price-table", output_field=models.CharField()), pos_id=F("price_table__pos__id"), ) .values( "kind", "priority", "price", "product_id", "pos_id", ), )
On both sqlite and postgresql this produces wrong sql. On sqlite:
SELECT "product_product"."price" AS "col1", 'base_price' AS "kind", 0 AS "priority", "product_product"."id" AS "product_id", NULL AS "pos_id" FROM "product_product" UNION SELECT "product_pricetableproduct"."price" AS "col1", "product_pricetableproduct"."product_id" AS "col2", 1 AS "priority", 'price-table' AS "kind", "pos_pos"."id" AS "pos_id" FROM "product_pricetableproduct" INNER JOIN "product_pricetable" ON ("product_pricetableproduct"."price_table_id" = "product_pricetable"."id") LEFT OUTER JOIN "pos_pos" ON ("product_pricetable"."id" = "pos_pos"."price_table_id") LIMIT 21
On postgreql:
( SELECT "product_product"."price" AS "col1", 'base_price' AS "kind", 0 AS "priority", "product_product"."id" AS "product_id", NULL AS "pos_id" FROM "product_product" ) UNION ( SELECT "product_pricetableproduct"."price" AS "col1", "product_pricetableproduct"."product_id" AS "col2", 1 AS "priority", 'price-table' AS "kind", "pos_pos"."id" AS "pos_id" FROM "product_pricetableproduct" INNER JOIN "product_pricetable" ON ("product_pricetableproduct"."price_table_id" = "product_pricetable"."id") LEFT OUTER JOIN "pos_pos" ON ("product_pricetable"."id" = "pos_pos"."price_table_id") ) LIMIT 21
You can see that the orders given to values
are not preserved, and thus the returned value is wrong (e.g. "kind" is the second column in the base query, but "product_id" is the second column on the union query)
I'm using django 4.2, so not sure if this is a 4.2 regression or some old issue.
obs. the .order_by
in the first query is also another issue. Since that model has an ordering
defined in its meta, I get DatabaseError: ORDER BY not allowed in subqueries of compound statements.
when running that query on sqlite (on postgresql it is fine and works without the empty order_by, but the main issue in this ticket still exists)
Change History (3)
comment:1 by , 20 months ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
comment:2 by , 6 months ago
This may not be a duplicate of #28900. The problem here is in how annotate is working with the SQL SELECT
.
In django 4.2 the order in which annotations are created matters here is an example:
class Foo(models.Model): name = models.CharField(max_length=16) value = models.SmallIntegerField() qs1 = ( Foo.objects.all() .annotate( kind=Value(1, output_field=SmallIntegerField()), pos=Value("value", output_field=CharField()), ) .values("name", "kind", "value", "pos") ) qs2 = ( Foo.objects.all() .annotate( pos=Value("other", output_field=CharField()), kind=Value(2, output_field=SmallIntegerField()), ) .values("name", "kind", "value", "pos") ) print(qs1.union(qs2).query)
In django 4.2 this gets translated by the ORM in the following SQL
( SELECT "django_issue_34502_foo"."name" AS "col1", "django_issue_34502_foo"."value" AS "col2", 1 AS "kind", value AS "pos" FROM "django_issue_34502_foo" ) UNION ( SELECT "django_issue_34502_foo"."name" AS "col1", "django_issue_34502_foo"."value" AS "col2", other AS "pos", 2 AS "kind" FROM "django_issue_34502_foo" )
As you can note the order of annotations is reflected in the SELECT
clause.
This owever seems to have been fixed in the 5.0 since the same invocation of the ORM now gives the fields in a compatible order:
( SELECT "django_issue_34502_foo"."name" AS "col1", "django_issue_34502_foo"."value" AS "col2", 1 AS "kind", value AS "pos" FROM "django_issue_34502_foo" ) UNION ( SELECT "django_issue_34502_foo"."name" AS "col1", "django_issue_34502_foo"."value" AS "col2", 2 AS "kind", other AS "pos" FROM "django_issue_34502_foo" )
However there is still a possibile issue: even in django 5.0 the order of fields is not the one declare in the .values
/.values_list
! First there are model fields and then annotations, it is like instead of calling .values("name", "kind", "value", "pos")
it was called .values("name", "value", "kind", "pos")
.
This should at least be pointed out in the docs, because it may confuse users which belive being able to control the order in which fields/functions are placed in the sql SELECT
which is very important when performing query operations like UNION
, INTERSECT
or DIFFERENCE
.
comment:3 by , 6 months ago
Cc: | added |
---|
I'm not sure in which issue this is captured but the underlying problem here is that order of fields specified in values
is always going to be
SELECT *extra_fields, *model_fields, *annotations
In other words doing values("annotation1", "field1", "field2", "annotation2")
is going to result in
SELECT "table"."field1", "table"."field2", annotation1_expr, annotation2_expr
What was fixed in Django 5.0 is that the local order (within the respective select groups) will respect the order specified in values
for annotations but we still haven't fixed the global ordering problem. See #28553 and d6b6e5d0fd4e6b6d0183b4cf6e4bd4f9afc7bf67.
I tried giving it a shot in this MR but there are complications regarding the usage of extra(select)
that forced me to take a detour in attempting to deprecate it entirely.
In the mean time your best shot at working around the issue is to always use annotations in values
so something like
qs3 = ( Foo.objects.all() .annotate( name_an=F("name"), value_an=F("value"), pos=Value("other", output_field=CharField()), kind=Value(2, output_field=SmallIntegerField()), ) .values("name_an", "kind", "value_an", "pos") )
Duplicate of #28900.