Opened 14 years ago

Closed 12 years ago

Last modified 12 years ago

#15586 closed Bug (wontfix)

Fields with missing columns populated when querying on PostgreSQL

Reported by: Sebastian Żurek Owned by: nobody
Component: Documentation Version: 1.2
Severity: Normal Keywords: postgresql
Cc: lrekucki@…, botondus@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

With PostgreSQL as backend, when I use CharField field and name it "name", i.e.:

class MyModel(models.Model):
   name = models.CharField(max_length=255, ....)

I can reference to a 'name' field even though the appropriate db column is missing... (I forgot to create it
manually that's how I've noticed that strange issue)

The issue went out when I updated the model (no 'name' field before), I did not create a DB column, but that did not
stop Django from filling the field with a value! On .save() everything is OK and expected DatabseError
exception is raised by psycopg2.

The value stored in the field looks like a tuple, namely its: '(A, B, C)' where A-C are integers.

Attachments (1)

test.sql (270 bytes ) - added by Łukasz Rekucki 14 years ago.
SQL query to check against.

Download all attachments as: .zip

Change History (19)

comment:1 by Ramiro Morales, 14 years ago

Resolution: wontfix
Status: newclosed

It is not completely clear to me what is the behavior you expect here.

If what you are suggesting is that Django flags the divergence between the model design and its corresponding database table structure as soon as possible (e.g. at Python model field population time) it would be a waste of resources (it would need to introspect the DB table every time). IMHO current behavior is a good compromise: Let the DB-API driver/database engine flag the error when actual interaction with the DB is performed. Remember, we are using a RDBMS plus an ORM, not a OODB, there are two separate realms, the model objects one and the database one.

If what you are asking for is database schema evolution so it follows the model design, it is a feature that hopefully will be added in a future version and there are third party tools that implement that now (but AFAIK none of them perform automatic schema evolution so they won't be of help in cases where the developer wants to get away with changing the model design at will and hoping the software will take care of the database changes).

See also http://docs.djangoproject.com/en/1.2/faq/models/#if-i-make-changes-to-a-model-how-do-i-update-the-database

Last edited 12 years ago by Ramiro Morales (previous) (diff)

comment:2 by Sebastian Żurek, 14 years ago

Of course testing the synchronization between the Django models & db schema every time would be a stupid idea - that was not my point.

Also, I was not asking for automated schema evolution, which is quite a tricky issue and in most cases, for complex DB schema updates, You just have to handle
it manually on your own.

I was just curious HOW 'SELECT' statement could retrieve something from database, when there was no column to get from?

Once again I try to describe the proces:

  • model MyMode was created and db was synced
  • to model was added a CharField and I did not create corresponding DB table on my own
  • I performed qs = MyMode.objects.all(), that I had something in: qs[0].name, qs[1].name and so on ... which I do not understand how this field was field when there was no data in DB? It looks to me like an unexpected behavior....

I would appreciate Your further comments.

comment:3 by Sebastian Żurek, 14 years ago

sorry I made a mistake - of course in second point of process workflow should be:

to model was added a CharField and I did not create corresponding DB *column* on my own

comment:4 by Łukasz Rekucki, 14 years ago

Keywords: postgresql added
Resolution: wontfix
Status: closedreopened
Summary: Field populated when no DB column presentFields with missing columns populated when querying on PostgreSQL
Triage Stage: UnreviewedDesign decision needed

This is a PostgreSQL related issue. The query looks something like this:

SELECT "ticket15586_mymodel"."id", "ticket15586_mymodel"."name" FROM "ticket15586_mymodel"

And with absence of name column on the table, one would expect it to fail (and it does on SQLite, for example). But PostgreSQL seems to let you select the table as a ROW object (or something like that - I'm not an expert on that). So the following queries are valid:

SELECT "ticket15586_mymodel" FROM "ticket15586_mymodel";
SELECT "ticket15586_mymodel".name FROM "ticket15586_mymodel";
SELECT ("ticket15586_mymodel")::name FROM "ticket15586_mymodel";

With the second being treated as the third, which is a type cast. Don't know if there is a way to make it unambiguous that we want a table column, not some weird type thing. I'm going to reopen, so maybe someone with more PostgreSQL-related knowledge can fix this.

Last edited 14 years ago by Łukasz Rekucki (previous) (diff)

comment:5 by Łukasz Rekucki, 14 years ago

Cc: lrekucki@… added
Component: UncategorizedDatabase layer (models, ORM)

comment:6 by Béres Botond, 14 years ago

Cc: botondus@… added

comment:7 by Ramiro Morales, 14 years ago

Resolution: needsinfo
Status: reopenedclosed

I can't reproduce this. Tried with Django trunk and 1.2.5

  • PostgresSQL version is 8.3.14
  • psycopg2 version is 2.0.7-4
# t15586/models.py

from django.db import models

class MyModel(models.Model):
    name = models.CharField(max_length=255)
$ ./manage.py dbshell
Welcome to psql 8.3.14, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

dbname=> select * from t15586_mymodel;
 id
----
(0 rows)

dbname=>

$ ./manage.py shell
Python 2.5.2 (r252:60911, Jan 24 2010, 14:53:14)
Type "copyright", "credits" or "license" for more information.

IPython 0.8.4 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]: from django.db import connection

In [2]: from t15586.models import MyModel

In [3]: MyModel.objects.all()
Out[3]: ---------------------------------------------------------------------------
DatabaseError                             Traceback (most recent call last)

dtest07/<ipython console> in <module>()

/var/lib/python-support/python2.5/IPython/Prompts.pyc in __call__(self, arg)
    549
    550             # and now call a possibly user-defined print mechanism
--> 551             manipulated_val = self.display(arg)
    552
    553             # user display hooks can change the variable to be stored in

/var/lib/python-support/python2.5/IPython/Prompts.pyc in _display(self, arg)
    575             return IPython.generics.result_display(arg)
    576         except TryNext:
--> 577             return self.shell.hooks.result_display(arg)
    578
    579     # Assign the default display method:

/var/lib/python-support/python2.5/IPython/hooks.pyc in __call__(self, *args, **kw)
    133             #print "prio",prio,"cmd",cmd #dbg
    134             try:
--> 135                 ret = cmd(*args, **kw)
    136                 return ret
    137             except ipapi.TryNext, exc:

/var/lib/python-support/python2.5/IPython/hooks.pyc in result_display(self, arg)
    163
    164     if self.rc.pprint:
--> 165         out = pformat(arg)
    166         if '\n' in out:
    167             # So that multi-line strings line up with the left column of

/usr/lib/python2.5/pprint.pyc in pformat(self, object)
    109     def pformat(self, object):
    110         sio = _StringIO()
--> 111         self._format(object, sio, 0, 0, {}, 0)
    112         return sio.getvalue()
    113

/usr/lib/python2.5/pprint.pyc in _format(self, object, stream, indent, allowance, context, level)
    127             self._readable = False
    128             return
--> 129         rep = self._repr(object, context, level - 1)
    130         typ = _type(object)
    131         sepLines = _len(rep) > (self._width - 1 - indent - allowance)

/usr/lib/python2.5/pprint.pyc in _repr(self, object, context, level)
    193     def _repr(self, object, context, level):
    194         repr, readable, recursive = self.format(object, context.copy(),
--> 195                                                 self._depth, level)
    196         if not readable:
    197             self._readable = False

/usr/lib/python2.5/pprint.pyc in format(self, object, context, maxlevels, level)
    205         and whether the object represents a recursive construct.
    206         """
--> 207         return _safe_repr(object, context, maxlevels, level)
    208
    209

/usr/lib/python2.5/pprint.pyc in _safe_repr(object, context, maxlevels, level)
    290         return format % _commajoin(components), readable, recursive
    291
--> 292     rep = repr(object)
    293     return rep, (rep and not rep.startswith('<')), False
    294

django/db/models/query.py in __repr__(self)
     65
     66     def __repr__(self):
---> 67         data = list(self[:REPR_OUTPUT_SIZE + 1])
     68         if len(data) > REPR_OUTPUT_SIZE:
     69             data[-1] = "...(remaining elements truncated)..."

django/db/models/query.py in __len__(self)
     80                 self._result_cache = list(self.iterator())
     81         elif self._iter:
---> 82             self._result_cache.extend(list(self._iter))
     83         return len(self._result_cache)
     84

django/db/models/query.py in iterator(self)
    269         model = self.model
    270         compiler = self.query.get_compiler(using=db)
--> 271         for row in compiler.results_iter():
    272             if fill_cache:
    273                 obj, _ = get_cached_row(model, row,

django/db/models/sql/compiler.py in results_iter(self)
    675         fields = None
    676         has_aggregate_select = bool(self.query.aggregate_select)
--> 677         for rows in self.execute_sql(MULTI):
    678             for row in rows:
    679                 if resolve_columns:

django/db/models/sql/compiler.py in execute_sql(self, result_type)
    730
    731         cursor = self.connection.cursor()
--> 732         cursor.execute(sql, params)
    733
    734         if not result_type:

django/db/backends/util.py in execute(self, sql, params)
     13         start = time()
     14         try:
---> 15             return self.cursor.execute(sql, params)
     16         finally:
     17             stop = time()

django/db/backends/postgresql_psycopg2/base.py in execute(self, query, args)
     42     def execute(self, query, args=None):
     43         try:
---> 44             return self.cursor.execute(query, args)
     45         except Database.IntegrityError, e:
     46             raise utils.IntegrityError, utils.IntegrityError(*tuple(e)), sys.exc_info()[2]

DatabaseError: column t15586_mymodel.name does not exist
LINE 1: SELECT "t15586_mymodel"."id", "t15586_mymodel"."name" FROM "...
                                      ^

comment:8 by Łukasz Rekucki, 14 years ago

Could you try with PostgreSQL >= 8.4 ? I was using PostgreSQL 8.4.7 and psycopg2 2.3.2. Only reference to a similar problem I could find is: http://archives.postgresql.org/pgsql-general/2010-05/msg01126.php.

by Łukasz Rekucki, 14 years ago

Attachment: test.sql added

SQL query to check against.

comment:9 by Sebastian Żurek, 14 years ago

I was using Django 1.2.3, Postgres 9.0.1, psycopg2 v. 2.2.2 (dt dec ext pq3). In a few hours I'll try once again, with a raw SQL query and I'll report the results. I'll try to test it also with Postgres 8.

comment:10 by Ramiro Morales, 14 years ago

Resolution: needsinfo
Status: closedreopened

I can reproduce this with PostgreSQL 8.4.7. Reopening the ticket.

comment:11 by Adrian Holovaty, 14 years ago

This is an interesting one! I was able to reproduce it in Postgres 8.4.1. The issue seems to be isolated to the column called "name". I tried selecting non-existent columns with other names besides "name", and it worked as expected, raising an error. Here's an example:

db=# CREATE TABLE test (id INTEGER PRIMARY KEY);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test"
CREATE TABLE
db=# INSERT INTO test VALUES (1), (2), (3);
INSERT 0 3
db=# SELECT "test"."id", "test"."name" FROM "test";
 id | name 
----+------
  1 | (1)
  2 | (2)
  3 | (3)
(3 rows)

db=# SELECT "test"."id", "test"."name2" FROM "test";
ERROR:  column test.name2 does not exist

Here is a thread from the Postgres mailing list with more info: http://archives.postgresql.org/pgsql-general/2010-02/msg01038.php. "Calling t.name is the same as name(t) if a column reference is not found."

I agree with the previous comments that we should avoid introspecting the database, and I can't immediately think of a user-friendly solution to this beyond documenting it for Postgres users. Any ideas?

comment:12 by Luke Plant, 14 years ago

Type: Bug

comment:13 by Luke Plant, 14 years ago

Severity: Normal

comment:14 by Sebastian Żurek, 14 years ago

Easy pickings: unset

After a few weeks I'm returning to this issue. To summarize what we see, let me point out the two elements:

  • the issue has rather low probability to appear: You have to use PG backend + You have to have Your model defining non-existing (in DB schema sense) field/column
  • I feel this is also a kind of a issue in PostgreSQL itself

What we should do now, is for sure to document this behavior --- while it's not clear in DB engine itself what does it mean to select column from table, I believe Django should not be smarter in handling it in other way than the engine does it.

comment:15 by Aymeric Augustin, 13 years ago

Component: Database layer (models, ORM)Documentation
Triage Stage: Design decision neededAccepted
UI/UX: unset

Since no one came up with a better plan, let's document it in ref/databases.txt.

We have comparable limitations documented for other databases, for instance, "don't create a field called timestamp if you're using Oracle".

comment:16 by Tim Graham, 12 years ago

The usefulness of adding something like the following seems very limited to me:

"If you are using PostgreSQL and add a field called 'name' to an existing model but forget to add the column in your database, SELECT statements will appear to work even though the column doesn't exist (since NAME is a Postgres function and SELECT foo.name FROM foo is equivalent to SELECT name(foo) FROM foo) ."

but since the ticket has been accepted, is that the essence of what is being suggested?

comment:17 by Łukasz Rekucki, 12 years ago

Note that "name" is just an example here. From my understanding, you can swap it to any other datatype that PostgreSQL supports and instead of raising an error about a missing column, it will treat it as a cast of the whole row to that datatype.

Anyways, I tried this on PostgreSQL 9.2 and the behavior is no longer there. I'm guessing it was removed in version 9, but don't have an instance to verify.

comment:18 by Tim Graham, 12 years ago

Resolution: wontfix
Status: reopenedclosed

Since this doesn't happen on PostgreSQL 9.2 and seems like it has a very low probability of happening, I don't think it's worth documenting at this point. If someone really wants to see this added and can provide a patch, I'll be happy to review and commit it.

Note: See TracTickets for help on using tickets.
Back to Top