Opened 4 months ago

Closed 3 months ago

#35788 closed Cleanup/optimization (wontfix)

Order By using column number with Annotated fields

Reported by: Adrian Garcia
Owned by:
Component: Database layer (models, ORM)
Version: 5.1
Severity: Normal
Keywords: order_by, annotate, column number
Cc: Adrian Garcia, Simon Charette
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

As the title states, .order_by() uses a deprecated method of selecting which column to order by when the column is an annotation: it references the column by its position in the SELECT list. While most modern DBs still allow this, the use of integer constants is discouraged.

>>> from django.db import models
>>> class Test(models.Model):
...     name = models.CharField()
...     class Meta:
...         app_label = "test"
...
>>> # Use of a constant when referencing an annotation:
>>> print(Test.objects.all().annotate(test_annotation=models.F("name")).order_by("test_annotation").query)
SELECT "test_test"."id", "test_test"."name", "test_test"."name" AS "test_annotation" FROM "test_test" ORDER BY 3 ASC
>>>
>>> # Use of the column name when not referencing an annotation:
>>> print(Test.objects.all().annotate(test_annotation=models.F("name")).order_by("name").query)
SELECT "test_test"."id", "test_test"."name", "test_test"."name" AS "test_annotation" FROM "test_test" ORDER BY "test_test"."name" ASC
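
For comparison, the two ORDER BY forms can be run side by side outside Django. This is only a minimal sketch using the standard-library sqlite3 module; the sample rows are made up for illustration, and the alias-based query is hand-written rather than something the ORM currently emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "test_test" ("id" INTEGER PRIMARY KEY, "name" TEXT)')
# Hypothetical sample rows, inserted out of alphabetical order on purpose.
conn.executemany(
    'INSERT INTO "test_test" ("name") VALUES (?)', [("b",), ("c",), ("a",)]
)

select = (
    'SELECT "test_test"."id", "test_test"."name", '
    '"test_test"."name" AS "test_annotation" FROM "test_test"'
)

# Form generated by the ORM for the annotation: ordinal position in the SELECT list.
by_position = conn.execute(select + " ORDER BY 3 ASC").fetchall()

# Hand-written alternative: reference the annotation by its alias.
by_alias = conn.execute(select + ' ORDER BY "test_annotation" ASC').fetchall()

print(by_position == by_alias)
```

On SQLite both statements sort on the third result column, so the rows come back in the same order.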

I'd be happy to draft a PR for this if this is deemed something worth addressing.

Change History (5)

comment:1 by Sarah Boyce, 4 months ago

Resolution: needsinfo
Status: new → closed

Hi Adrian, can you share some references explaining why this shouldn't be used?
I can't find anything that says this should be a concern.

comment:2 by Adrian Garcia, 4 months ago

The MySQL documentation references its removal from the standard. Additionally, this article specifically references that the use of constants was defined in the ANSI SQL-92 standard, and subsequently removed in ANSI SQL-99, but goes on to say that most RDBMS vendors still support this practice.

All DBs that Django currently supports (and many that are supported via third-party libraries) appear to honor the deprecated column numbers, and given how long ago this change was made, it's likely they will continue to do so. It's mostly the inconsistency of order_by using an integer constant _only_ with annotated fields that bothers me, rather than any risk of the feature suddenly breaking, which is why I offered to make the change if that's acceptable.


Since these older versions of the SQL spec can be found online for free, I was able to find the actual definitions.
From pages 371 and 372 of ANSI SQL-92:

<order by clause> ::=
    ORDER BY <sort specification list>

<sort specification list> ::=
    <sort specification> [ { <comma> <sort specification> }... ]

<sort specification> ::=
    <sort key> [ <collate clause> ] [ <ordering specification> ]

<sort key> ::=
      <column name>
    | <unsigned integer>

<ordering specification> ::= ASC | DESC

...

10) If ORDER BY is specified, then each <sort specification> in the <order by clause> shall identify a column of T.
    Case:
    a) If a <sort specification> contains a <column name>, then T shall contain exactly one column with that <column name> and the <sort specification> identifies that column.
    b) If a <sort specification> contains an <unsigned integer>, then the <unsigned integer> shall be greater than 0 and not greater than the degree of T. The <sort specification> identifies the column of T with the ordinal position specified by the <unsigned integer>.
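
To make the two cases of rule 10 concrete, here is a small sketch of how a sort key would be resolved against the columns of T. The function name resolve_sort_key and the list-of-names representation of T are my own, for illustration only; they come from neither the standard nor Django.

```python
def resolve_sort_key(columns, key):
    """Return the 0-based index of the column of T that a SQL-92 sort key identifies.

    `columns` is the list of result column names of T (hypothetical representation);
    `key` is either a column name (case a) or an unsigned integer (case b).
    """
    if isinstance(key, int):
        # Case b): the integer must be greater than 0 and not greater than
        # the degree of T; it is a 1-based ordinal position.
        if not 1 <= key <= len(columns):
            raise ValueError(f"ordinal {key} is outside 1..{len(columns)}")
        return key - 1
    # Case a): T must contain exactly one column with that name.
    matches = [i for i, name in enumerate(columns) if name == key]
    if len(matches) != 1:
        raise ValueError(f"{key!r} matches {len(matches)} columns, expected exactly 1")
    return matches[0]


columns = ["id", "name", "test_annotation"]
print(resolve_sort_key(columns, 3))       # third column, index 2
print(resolve_sort_key(columns, "name"))  # the single column called "name", index 1
```

Note that under case a) a name shared by two columns of T identifies nothing, whereas an ordinal always identifies exactly one column.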

From page 651 of ANSI SQL-99:

<order by clause> ::=
    ORDER BY <sort specification list>
<sort specification list> ::=
    <sort specification> [ { <comma> <sort specification> }... ]
<sort specification> ::=
    <sort key> [ <collate clause> ] [ <ordering specification> ]
<sort key> ::=
    <value expression>
<ordering specification> ::= ASC | DESC

...

NOTE 287 – A previous version of ISO/IEC 9075 allows <sort specification> to be a <signed integer> to denote a column reference of a column of T. That facility no longer exists. See Annex E, "Incompatibilities with ISO/IEC 9075:1992 and ISO/IEC 9075-4:1996".

Last edited 4 months ago by Adrian Garcia

comment:3 by Sarah Boyce, 3 months ago

Resolution: needsinfo removed
Status: closed → new

comment:4 by Simon Charette, 3 months ago

Cc: Simon Charette added

I'm curious whether there is any bug or breakage that using index-based ORDER BY or GROUP BY causes, or if it's solely an ideological request to adhere to the SQL spec (which the queries the ORM generates often diverge from).

I'm asking because there are two legitimate reasons I can think of why the ORM does that for GROUP BY and ORDER BY clauses:

  1. To avoid ambiguity when referencing columns with colliding aliases; see #34346 and #34176, which introduced the changes.
  2. To support prepared statements and server-side parameter binding without requiring a significant rewrite of how SQL is generated for GROUP BY and ORDER BY clauses; see #35028 and #34255, which uncovered the problem, and #20516, which discussed the addition of prepared statements.
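
As a rough illustration of the second point, consider an annotation whose expression carries a bound parameter. The sketch below uses the standard-library sqlite3 module with a made-up item table; it does not reproduce the issues in the tickets above, it only shows that repeating the expression in ORDER BY repeats the placeholder, while the ordinal form binds the value once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, for illustration only.
conn.execute("CREATE TABLE item (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [("a", 3), ("b", 1), ("c", 2)])

# Repeating the annotated expression: the placeholder appears twice, so the
# same value has to be bound twice and kept in sync by the SQL generator.
repeated_sql = "SELECT name, score * ? AS weighted FROM item ORDER BY score * ? DESC"
repeated = conn.execute(repeated_sql, (10, 10)).fetchall()

# Referencing the column by position: the expression, and its placeholder,
# appear only once.
ordinal_sql = "SELECT name, score * ? AS weighted FROM item ORDER BY 2 DESC"
ordinal = conn.execute(ordinal_sql, (10,)).fetchall()

print(repeated == ordinal)
```

With the ordinal form there is a single occurrence of the expression, so nothing has to recognize two separately bound placeholders as the same value.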

comment:5 by Sarah Boyce, 3 months ago

Resolution: wontfix
Status: new → closed

Given the input from Simon, I feel we can accept this ticket only if we encounter a bug or breakage caused by index-based ORDER BY.
Adrian, if you find something, feel free to comment and reopen the ticket.
