Opened 10 months ago

Closed 8 months ago

#35028 closed Bug (fixed)

psycopg3: distinct query iteration causes psycopg.errors.InvalidColumnReference: for SELECT DISTINCT, ORDER BY expressions must appear in select list

Reported by: Richard Ebeling Owned by: Simon Charette
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Simon Charette, Florian Apolloner, Daniele Varrazzo, Jonathan Weth Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Richard Ebeling)

Using PostgreSQL, after upgrading from psycopg2 to psycopg3 (psycopg[binary]==3.1.14), iterating a distinct queryset of a model with an ordering that uses Value expressions causes psycopg.errors.InvalidColumnReference error saying that the ORDER BY expression must appear in the select list.

This is a minimal reproducible example:

### MODELS

from django.db import models

class TestModel(models.Model):
    test_field = models.CharField()

    class Meta:
        ordering = [NullIf("test_field", Value(""))]

### TEST

from name.models import TestModel
from django.test import TestCase

class TestTestModel(TestCase):
    def test_iterating(self):
        for el in TestModel.objects.all().distinct().iterator():
            pass

which gives this output when running:

$ ./manage.py test -k test_iterating
Found 1 test(s).
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
E
======================================================================
ERROR: test_iterating (name.tests.test_models.TestTestModel)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/.../venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 105, in _execute
    return self.cursor.execute(sql, params)
  File "/.../venv/lib/python3.10/site-packages/psycopg/server_cursor.py", line 294, in execute
    raise ex.with_traceback(None)
psycopg.errors.InvalidColumnReference: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ..._field", $1) FROM "name_testmodel" ORDER BY NULLIF("ev...
                                                             ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ".../tests/test_models.py", line 43, in test_iterating
    for el in TestModel.objects.all().distinct().iterator():
  File "/.../venv/lib/python3.10/site-packages/django/db/models/query.py", line 516, in _iterator
    yield from iterable
  File "/.../venv/lib/python3.10/site-packages/django/db/models/query.py", line 91, in __iter__
    results = compiler.execute_sql(
  File "/.../venv/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1562, in execute_sql
    cursor.execute(sql, params)
  File "/.../venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 79, in execute
    return self._execute_with_wrappers(
  File "/.../venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 92, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/.../venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 100, in _execute
    with self.db.wrap_database_errors:
  File "/.../venv/lib/python3.10/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/.../venv/lib/python3.10/site-packages/django/db/backends/utils.py", line 105, in _execute
    return self.cursor.execute(sql, params)
  File "/.../venv/lib/python3.10/site-packages/psycopg/server_cursor.py", line 294, in execute
    raise ex.with_traceback(None)
django.db.utils.ProgrammingError: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ..._field", $1) FROM "name_testmodel" ORDER BY NULLIF("ev...
                                                             ^

----------------------------------------------------------------------
Ran 1 test in 0.010s

FAILED (errors=1)
Destroying test database for alias 'default'...

Removing the .distinct() call prevents the error.

This is the relevant excerpt from the postgres log:

2023-12-11 19:15:15.900 UTC [18994] name@test_name ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list at character 235
2023-12-11 19:15:15.900 UTC [18994] name@test_name STATEMENT:  DECLARE "_django_curs_139772156350464_sync_1" NO SCROLL CURSOR FOR SELECT DISTINCT "name_testmodel"."id", "name_testmodel"."test_field", NULLIF("name_testmodel"."test_field", $1) FROM "name_testmodel" ORDER BY NULLIF("name_testmodel"."test_field", $2) ASC

When using psycopg2, this is the resulting postgres log (without errors):

2023-12-11 19:19:32.456 UTC [19038] name@test_name LOG:  statement: DECLARE "_django_curs_140476300926976_sync_1" NO SCROLL CURSOR WITHOUT HOLD FOR SELECT DISTINCT "name_testmodel"."id", "name_testmodel"."test_field", NULLIF("name_testmodel"."test_field", '') FROM "name_testmodel" ORDER BY NULLIF("name_testmodel"."test_field", '') ASC
2023-12-11 19:19:32.456 UTC [19038] name@test_name LOG:  statement: FETCH FORWARD 2000 FROM "_django_curs_140476300926976_sync_1"
2023-12-11 19:19:32.456 UTC [19038] name@test_name LOG:  statement: CLOSE "_django_curs_140476300926976_sync_1"

Change History (10)

comment:1 by Richard Ebeling, 10 months ago

Description: modified (diff)

comment:2 by Simon Charette, 10 months ago

Last edited 10 months ago by Simon Charette (previous) (diff)

in reply to:  2 comment:3 by Richard Ebeling, 10 months ago

Replying to Simon Charette:

Have you enabled server-side parameters bindings?

No, in the database configuration, no OPTIONS are defined, so it should default to client-side binding.

The test I posted above fails independently of this setting (i.e., both with "OPTIONS": { "server_side_binding": True } and with "OPTIONS": { "server_side_binding": False }).

comment:4 by Simon Charette, 10 months ago

Cc: Simon Charette Florian Apolloner Daniele Varrazzo added
Triage Stage: UnreviewedAccepted
Version: 5.04.2

It appears that psycopg3 will always use server-side cursors and ignores cursor_factory when using a named cursor and we know they are causing issues with how the ORM generates SQL (DISTINCT, GROUP BY, ORDER BY) as it doesn't have proper prepared statement support yet (#20516).

I see a few paths forward here

  1. Adapt psycopg3 to allow the use of named client cursors or have Django directly create cursors instead of calling create_cursor so it can issue the proper DECLARE, FETCH, CLOSE commands. I'm not sure this is possible but I assume it is given it was the case of psycopg2.
  2. On the Django side, when on psycopg>=3, make it so the DISABLE_SERVER_SIDE_CURSORS setting defaults to not db_settings.get("server_side_binding", False) and consider merging both settings under a single server_side_cursors: bool setting instead. This would disable server side cursors on psycopg>=3 entirely until #20516 is fixed which I think is the right thing to do until we've demonstrated that we actually support this configuration properly.
  3. Invest significant efforts in getting #20516 fixed so we can enable server side cursors which require the ORM to generate SQL that can be prepared.

Until a solution lands Richard you have two choices

  1. Keep using psycopg2
  2. Use psycopg>=3 but disable server side cursors
Version 2, edited 10 months ago by Simon Charette (previous) (next) (diff)

comment:5 by Simon Charette, 9 months ago

Owner: changed from nobody to Simon Charette
Status: newassigned

comment:6 by Simon Charette, 9 months ago

Has patch: set
Severity: NormalRelease blocker

Elevating to release blocker for 4.2 as it's a bug in a newly released feature (psycopg>=3 support).

Just submitted a patch for it.

comment:7 by Mariusz Felisiak, 9 months ago

Severity: Release blockerNormal

Django 4.2 is in extended support so it no longer receive bugfixes (except security patches).

comment:8 by Jonathan Weth, 9 months ago

Cc: Jonathan Weth added

comment:9 by Mariusz Felisiak, 8 months ago

Triage Stage: AcceptedReady for checkin

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 8 months ago

Resolution: fixed
Status: assignedclosed

In 92d6cff:

Fixed #35028 -- Disabled server-side bindings for named cursors on psycopg >= 3.

While we provide a cursor_factory based on the value of the
server_side_bindings option to psycopg.Connection it is ignored by
the cursor method when name is specified for QuerySet.iterator()
usage and it causes the usage of psycopg.ServerCursor which performs
server-side bindings.

Since the ORM doesn't generates SQL that is suitable for server-side
bindings when dealing with parametrized expressions a specialized cursor
must be used to allow server-side cursors to be used with client-side
bindings.

Thanks Richard Ebeling for the report.

Thanks Florian Apolloner and Daniele Varrazzo for reviews.

Note: See TracTickets for help on using tickets.
Back to Top