Opened 9 years ago

Closed 8 years ago

#25476 closed Cleanup/optimization (fixed)

Use pg_catalog tables to introspect constraints

Reported by: heasus Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: psycopg2 fetchall inspectdb permissions
Cc: gagangupt16@…, marcos@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description (last modified by Claude Paroz)

Using inspectdb of Django1.8.4 with psycopg2 2.6.1 to import an existing db, get_constraints() attempts to read table constraints. It exits here parsing foreign keys with the traceback:

File "/home/www/db/lib/python3.4/site-packages/django/db/backends/postgresql_psycopg2/introspection.py", line 171, in get_constraints
"foreign_key": tuple(used_cols[0].split(".", 1)) if kind.lower() == "foreign key" else None,
IndexError: list index out of range

The array of columns returned from the db is populated:

SELECT
kc.constraint_name,
kc.column_name,
c.constraint_type,
array(SELECT table_name::text || '.' || column_name::text
FROM information_schema.constraint_column_usage
WHERE constraint_name = kc.constraint_name)
FROM information_schema.key_column_usage AS kc
JOIN information_schema.table_constraints AS c ON
kc.table_schema = c.table_schema AND
kc.table_name = c.table_name AND
kc.constraint_name = c.constraint_name
WHERE
kc.table_schema = 'public' AND
kc.table_name = 'things'

ORDER BY kc.ordinal_position ASC;
constraint_name     | column_name | constraint_type |                array

------------------------+-------------+-----------------+-----------------------

things_pkey | trunk | PRIMARY KEY | {things.trunk,things.c
hild}
things_child_fkey | child | FOREIGN KEY | {children.idx}
things_trunk_fkey | trunk | FOREIGN KEY | {children.idx}
things_pkey | child | PRIMARY KEY | {things.trunk,things.c
hild}

but empty from cursor.fetchall(): for constraint, column, kind, used_cols in cursor.fetchall(). A script using psycopg2 directly does not exhibit this problem. I;ve also tried a current development checkout of django.

This is as far as I've traced this so far and it does not seem to be a known issue.

Change History (29)

comment:1 by heasus, 9 years ago

Component: UncategorizedDatabase layer (models, ORM)

comment:2 by heasus, 9 years ago

Type: UncategorizedBug

comment:3 by Claude Paroz, 9 years ago

Description: modified (diff)

comment:4 by Claude Paroz, 9 years ago

Would it be possible to have the schema of the involved tables?

comment:5 by Claude Paroz, 9 years ago

Resolution: needsinfo
Status: newclosed

comment:6 by heasus, 9 years ago

Was this fixed? or did it go directly from 'need more info' directly to closed? I can provide more information, but have been traveling.

comment:7 by Tim Graham, 9 years ago

"needsinfo" is a "closed" state -- please reopen when you provide the additional info. Thanks.

comment:8 by heasus, 9 years ago

Resolution: needsinfo
Status: closednew

This schema reproduces the problem for me:

{{{CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';
SET search_path = public, pg_catalog;
SET default_tablespace = ;
SET default_with_oids = false;
CREATE TABLE children (

idx bigint NOT NULL,
fullname character varying(64) NOT NULL

);
CREATE SEQUENCE children_idx_seq

START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;

ALTER TABLE ONLY children ALTER COLUMN idx SET DEFAULT

nextval('children_idx_seq'::regclass);

ALTER TABLE ONLY children

ADD CONSTRAINT children_pkey PRIMARY KEY (idx);

CREATE TABLE things (

sibling bigint NOT NULL,
child bigint NOT NULL

);
ALTER TABLE ONLY things

ADD CONSTRAINT things_child_fkey FOREIGN KEY (child) REFERENCES
children(idx) ON DELETE CASCADE;

ALTER TABLE ONLY things

ADD CONSTRAINT things_sibling_fkey FOREIGN KEY (sibling) REFERENCES
children(idx) ON DELETE CASCADE;

# This is an auto-generated Django model module.
# You'll have to do the following manually to clean this up:
# * Rearrange models' order
# * Make sure each model has one field with primary_key=True
# * Remove managed = False lines if you wish to allow Django to create, modify, and delete the table
# Feel free to rename the models, but don't rename db_table values or field names.
#
# Also note: You'll have to insert the output of 'django-admin sqlcustom [app_label]'
# into your database.
from future import unicode_literals

from django.db import models

class Children(models.Model):

idx = models.BigIntegerField(primary_key=True)
fullname = models.CharField(max_length=64)

class Meta:

managed = False
db_table = 'children'

class Things(models.Model):
Traceback (most recent call last):

File "./manage.py", line 10, in <module>

execute_from_command_line(sys.argv)

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/core/management/init.py", line 338, in execute_from_command_line

utility.execute()

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/core/management/init.py", line 330, in execute

self.fetch_command(subcommand).run_from_argv(self.argv)

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/core/management/base.py", line 393, in run_from_argv

self.execute(*args, cmd_options)

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/core/management/base.py", line 444, in execute

output = self.handle(*args, options)

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/core/management/commands/inspectdb.py", line 25, in handle

for line in self.handle_inspection(options):

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/core/management/commands/inspectdb.py", line 72, in handle_inspection

constraints = connection.introspection.get_constraints(cursor, table_name)

File "/home/www/beerdb_debug/lib/python3.4/site-packages/django/db/backends/postgresql_psycopg2 /introspection.py", line 169, in get_constraints

"foreign_key": tuple(used_cols[0].split(".", 1)) if kind.lower() == "foreign key" else None,

IndexError: list index out of range

}}}

Version 0, edited 9 years ago by heasus (next)

comment:9 by Claude Paroz, 9 years ago

Thanks for the instructions, but I'm still unable to reproduce. This IndexError would imply that used_cols is empty, which is weird. Could you try debugging this issue by setting a break point near the error and examine the values of constraint, column, kind, used_cols just before the crash?

comment:10 by heasus, 9 years ago

Are our environments similar?

% python -V
Python 3.4.3
% pip freeze
Django==1.8.4
django-postgresql==0.0.3
psycopg2==2.6.1

%psql things
things=# \encoding
UTF8

It is indeed empty in get_constraints and, as best I could follow it through with the debugger, it is empty when it returns from psycopg. But, running the sql manually or with the test script below produces the expected output.

things=# SELECT
kc.constraint_name,
kc.column_name,
c.constraint_type,
array(SELECT table_name::text || '.' || column_name::text
FROM information_schema.constraint_column_usage
WHERE constraint_name = kc.constraint_name)
FROM information_schema.key_column_usage AS kc
JOIN information_schema.table_constraints AS c ON
kc.table_schema = c.table_schema AND
kc.table_name = c.table_name AND
kc.constraint_name = c.constraint_name
WHERE
kc.table_schema = 'public' AND
kc.table_name = 'things'
ORDER BY kc.ordinal_position ASC;
   constraint_name   | column_name | constraint_type |     array      
---------------------+-------------+-----------------+----------------
 things_child_fkey   | child       | FOREIGN KEY     | {children.idx}
 things_sibling_fkey | sibling     | FOREIGN KEY     | {children.idx}
(2 rows)

import psycopg2

conn = psycopg2.connect("dbname=things")
curs = conn.cursor()

curs.execute("""SELECT
                kc.constraint_name,
                kc.column_name,
                c.constraint_type,
                array(SELECT table_name::text || '.' || column_name::text
                      FROM information_schema.constraint_column_usage
                      WHERE constraint_name = kc.constraint_name)
            FROM information_schema.key_column_usage AS kc
            JOIN information_schema.table_constraints AS c ON
                kc.table_schema = c.table_schema AND
                kc.table_name = c.table_name AND
                kc.constraint_name = c.constraint_name
            WHERE
                kc.table_schema = 'public' AND
                kc.table_name = 'things'
             ORDER BY kc.ordinal_position ASC;
""")
print(curs.fetchone())
print(curs.fetchone())
print(curs.fetchone())
print(curs.fetchone())

% python arraytest.py
('things_child_fkey', 'child', 'FOREIGN KEY', ['children.idx'])
('things_sibling_fkey', 'sibling', 'FOREIGN KEY', ['children.idx'])
None
None

in pdb:

% ./manage.py inspectdb
# This is an auto-generated Django model module.
# You'll have to do the following manually to clean this up:
#   * Rearrange models' order
#   * Make sure each model has one field with primary_key=True
#   * Remove `managed = False` lines if you wish to allow Django to create, modify, and delete the table
# Feel free to rename the models, but don't rename db_table values or field names.
#
# Also note: You'll have to insert the output of 'django-admin sqlcustom [app_label]'
# into your database.
from __future__ import unicode_literals

from django.db import models


class Children(models.Model):
> /home/www/beerdb_debug/lib/python3.4/site-packages/django/db/backends/postgresql_psycopg2/introspection.py(163)get_constraints()
-> for constraint, column, kind, used_cols in cursor.fetchall():
(Pdb) 
(Pdb) n
> /home/www/beerdb_debug/lib/python3.4/site-packages/django/db/backends/postgresql_psycopg2/introspection.py(165)get_constraints()
-> if constraint not in constraints:
(Pdb) p used_cols
[]
(Pdb) 

I'm not proficient in python; I may be missing an obvious fix. I've also tried the git HEAD, which appears to have its own issues with timezones that I hacked around to test. I've also tried python 2.7 and on Debian. I'm happy to try other variations of the environment.

Last edited 9 years ago by Tim Graham (previous) (diff)

comment:11 by Tim Graham, 9 years ago

To rule out a bug in a third-party package, can you reproduce without django-postgresql? What version of PostgreSQL?

comment:12 by heasus, 9 years ago

Without django-postgresql, the behavior is the same.

This Postgresql 9.4.4 (client and server).

comment:13 by Tim Graham, 9 years ago

I also couldn't reproduce with the provided schema.

  • Python 3.4.3
  • PostgreSQL 9.4.5
  • psycopg2==2.6.1

One difference is that when I execute the query in the description, I only get two rows:

"things_child_fkey";"child";"FOREIGN KEY";"{children.idx}"
"things_sibling_fkey";"sibling";"FOREIGN KEY";"{children.idx}"

(no results for primary keys). I haven't tried to understand the issue further besides these steps.

comment:14 by heasus, 9 years ago

Updating postgres to 9.4.5 did not change the behavior for me.

I reduced the schema to just the minimum to replicate the issue, and indeed there are no default primary keys on table things but is one on children (the object of the foreign keys). However, this does not matter; the problem is in parsing the last field of the foreign keys as in the previous comment in this ticket.

Hmmm; I'm not sure how to proceed to debug this. Same environment, different results. The problem does appear to be within Django. Advice?

comment:15 by Tim Graham, 9 years ago

Could you try creating a failing test case for Django's test suite?

comment:16 by Tim Graham, 9 years ago

Resolution: needsinfo
Status: newclosed

comment:17 by Gagan Gupta, 9 years ago

Facing similar issue while using inspectdb.

Environment:
Python: 3.4.3
Django: 1.9
PostgreSQL: 9.3.10
psycopg2: 2.6.1

Traceback (most recent call last):
  File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.4/dist-packages/django/core/management/__init__.py", line 350, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.4/dist-packages/django/core/management/__init__.py", line 342, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.4/dist-packages/django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.4/dist-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.4/dist-packages/django/core/management/commands/inspectdb.py", line 25, in handle
    for line in self.handle_inspection(options):
  File "/usr/local/lib/python3.4/dist-packages/django/core/management/commands/inspectdb.py", line 70, in handle_inspection
    constraints = connection.introspection.get_constraints(cursor, table_name)
  File "/usr/local/lib/python3.4/dist-packages/django/db/backends/postgresql/introspection.py", line 171, in get_constraints
    "foreign_key": tuple(used_cols[0].split(".", 1)) if kind.lower() == "foreign key" else None,
IndexError: list index out of range

In this query,

cursor.execute("""
            SELECT
                kc.constraint_name,
                kc.column_name,
                c.constraint_type,
                array(SELECT table_name::text || '.' || column_name::text
                      FROM information_schema.constraint_column_usage
                      WHERE constraint_name = kc.constraint_name)
            FROM information_schema.key_column_usage AS kc
            JOIN information_schema.table_constraints AS c ON
                kc.table_schema = c.table_schema AND
                kc.table_name = c.table_name AND
                kc.constraint_name = c.constraint_name
            WHERE
                kc.table_schema = %s AND
                kc.table_name = %s
            ORDER BY kc.ordinal_position ASC
        """, ["public", table_name])

The subquery

"SELECT table_name::text || '.' || column_name::text
                      FROM information_schema.constraint_column_usage
                      WHERE constraint_name = kc.constraint_name"


is returning empty array.

Hence, Index out of bound exception.
However on running the same query on postgresql console and replacing with appropriate values. It was not returning array.

comment:18 by Gagan Gupta, 9 years ago

Cc: gagangupt16@… added
Resolution: needsinfo
Status: closednew
Version: 1.81.9

comment:19 by Simon Charette, 9 years ago

Could you provide a list of your INSTALLED_APPS?

in reply to:  19 comment:20 by Gagan Gupta, 9 years ago

Replying to charettes:

Could you provide a list of your INSTALLED_APPS?

INSTALLED_APPS = [

'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',

]

comment:21 by Claude Paroz, 9 years ago

We will probably need the dump file of a database causing this issue to be able to debug it.

comment:22 by Tim Graham, 9 years ago

Resolution: needsinfo
Status: newclosed

comment:23 by Ionuț Ciocîrlan, 9 years ago

This is most likely caused to by an import by a different user, with incomplete privileges (pg_dump -O maybe?). For example:

$ psql -qU user testdb
testdb=> select count(*) from information_schema.constraint_column_usage;
 count 
-------
     0
(1 row)

as opposed to

$ psql -qU superuser testdb
testdb=# select count(*) from information_schema.constraint_column_usage;
 count 
-------
   219
(1 row)

comment:24 by TitanFighter, 9 years ago

I had the same error.
After investigation I found, that one table did not have some FK values to another one (fields were NULL'ed). So, in order to fix this issue either rows with NULL'ed FK's can be removed or NULL'ed fields can be populated.

comment:25 by heasus, 9 years ago

Resolution: needsinfoinvalid

The problem was indeed permissions; the database user intentionally did not have superuser privileges, but these queries, though apparently denied, produce no error logs AFAICT.

Thanks for all the replies.

comment:26 by Marcos Del Sol Vives, 8 years ago

Cc: marcos@… added
Has patch: set
Keywords: inspectdb permissions added
Resolution: invalid
Status: closednew
Version: 1.9master

The reason inspectdb wasn't working for you it's because the inspectdb uses information_schema.constraint_column_usage, a view that, as explained on the PostgreSQL documentation:

[...] identifies all columns in the current database that are used by some constraint. Only those columns are shown that are contained in a table owned by a currently enabled role. [...]

So, although Django was able to list all contraints and get their names by using information_schema.key_column_usage and information_schema.table_constraints (which have no such requirement about owning the table - these let you view all data as long as you can modify the said table), it wasn't able to get the foreign key the contraint referenced to, and:

"foreign_key": tuple(used_cols[0].split(".", 1)) if kind.lower() == "foreign key" else None,

threw an exception because used_cols was an empty array.

The fix for this is easy: not using the constraint_column_usage table. I've submitted a PR on GitHub that uses pg_catalog tables so it can be used regardless of superuser status: https://github.com/django/django/pull/7106

comment:27 by Tim Graham, 8 years ago

Patch needs improvement: set
Summary: bug in array handling hit in get_constraints() for psycopg2?Use pg_catalog tables to introspect constraints
Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization

I'm not an expert there, but tentatively accepting, although the tests are passing with the proposed PR. That might just be a mistake rather than something that can't be fixed though.

comment:28 by Marcos Del Sol Vives, 8 years ago

I've tweaked a bit the PR to support unique_together, too. Seems to be working fine on every case now.

comment:29 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: newclosed

In d6b9aab:

Fixed #25476 -- Allowed PostgreSQL introspection to work regardless of table owner.

Note: See TracTickets for help on using tickets.
Back to Top