Opened 14 years ago

Closed 9 years ago

#14518 closed Bug (worksforme)

Field.to_python not called on foreign key IDs

Reported by: David Wolever Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: David Wolever, kitsunde@…, info@…, anubhav9042@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Given a subclass of Field like this:

class MyIDField(Field):
    ...
    def to_python(self, value):
        return to_base_36(value)

Used like this:

class Foo(m.Model):
    id = MyIDField(primary_key=True)

class Bar(m.Model):
    foo = m.ForeignKey(Foo)

Then the MyIDField.to_python function isn't called when Bar.foo_id is accessed:

>>> b = Bar.objects.all()[0]
>>> b.foo_id
1234
>>> b.foo.id
'ya'
>>>

This is annoying for a few reasons, but most significantly because it breaks dropdown menus in admin: even if a value is set for a foreign key, the menu selects the default item (usually ------). This is because the menu items *have* been to_python'd, but the default value has *not* been to_python'd).

I'm seeing this in Django 1.2.1.

If this is confirmed as a bug, I can submit test cases.

Attachments (1)

14518.diff (1.1 KB ) - added by ANUBHAV JOSHI 10 years ago.

Download all attachments as: .zip

Change History (17)

comment:1 by Russell Keith-Magee, 14 years ago

Component: UncategorizedDatabase layer (models, ORM)
Triage Stage: UnreviewedAccepted

Accepting, because it's clearly an inconsistency.

However, I have a sneaking suspicion that this won't be easy to fix without breaking backwards compatibility. For historical reasons, it's possible to instantiate an object with string values, and those values aren't converted into field-appropriate values (integers, etc) until the object is saved to the database. This is an artefact of early-days Django form processing.

comment:2 by David Wolever, 14 years ago

Cc: David Wolever added

It may also be worth noting: I have a StackOverflow question asking the same thing here: http://stackoverflow.com/questions/3983510

comment:3 by Julien Phalip, 14 years ago

Severity: Normal
Type: Bug

comment:4 by avnimahajan@…, 13 years ago

Easy pickings: unset
UI/UX: unset

I am also facing same problem. We have changed the default behavior of having primary keys. Throughout our code it has been replaced with custom UUIDFiled(given below). Till django 1.1 we were not facing any problem but while migrating to django1.3, I started facing problem during syncdb(duplicate key error while inserting in auth_permissions).
Since there is change in "django/contrib/auth/management/init.py" create_permission code. Now it is referring to "content_type" which is a Foreign key. In my case its value is not being to_python'd. Thus it fails in comparison with a uuid.UUID type value (returned when referred as pk which is always to_python'd).

Please update by when this bug will be solved?

class UUIDField(models.Field):
    

    __metaclass__ = models.SubfieldBase
    empty_strings_allowed = False

    def __init__(self, *args, **kwargs):
        if kwargs.get('primary_key', False):
            kwargs['editable'] = False
        kwargs['db_index'] = True
        super(UUIDField, self).__init__(*args, **kwargs)

    def db_type(self, connection):
        return 'uuid'

    def get_internal_type(self):
        return 'UUIDField'

    def to_python(self, value):
        if (value is None) or isinstance(value, uuid.UUID):
            return value
        try:
            return uuid.UUID(value)
        except (AttributeError, TypeError, ValueError):
            return value

    def pre_save(self, obj, add):
        old_val = getattr(obj, self.attname)
        if (self.primary_key and add) and (old_val is None):
            value = uuid.uuid4()
            setattr(obj, self.attname, value)
            return value
        else:
            return old_val

    def get_db_prep_lookup(self, lookup_type, value,connection=None, prepared=False):
        if lookup_type == 'exact':
            return [self.get_db_prep_value(value,connection=None, prepared=False)]
        elif lookup_type == 'in':
            return [self.get_db_prep_value(v,connection=None, prepared=False) for v in value]
        elif lookup_type == 'isnull':
            return []
        raise TypeError(u"Field has invalid lookup type: %s" % lookup_type)

    def value_to_string(self, obj):
        value = getattr(obj, self.attname)
        if value is None:
            return value
        else:
            return unicode(value)

    def formfield(self, **kwargs):
        defaults = {
            'form_class': forms.RegexField,
            'regex': UUID_REGEX,
            'max_length': 47,
            'error_messages': {'invalid': u"Enter only valid UUID."}
        }
        defaults.update(kwargs)
        return super(UUIDField, self).formfield(**defaults)
Last edited 10 years ago by Tim Graham (previous) (diff)

comment:5 by Kit Sunde, 13 years ago

Cc: kitsunde@… added

comment:6 by Markus Holtermann, 11 years ago

Cc: info@… added

comment:7 by ANUBHAV JOSHI, 10 years ago

Well there have been some changes in the code and some things have changed.
Here is my models.py

from django.db import models


class MyField(models.CharField):
	__metaclass__ = models.SubfieldBase

	def __init__(self, *args, **kwargs):
		super(MyField, self).__init__(*args, **kwargs)

	def to_python(self, value):
		return "Value: %s" % value


class Foo(models.Model):
   	field = MyField(max_length=30, primary_key=True)

class Bar(models.Model):
    foo = models.ForeignKey(Foo)

In shell, I get correct values here:

>>> m1 = Foo.objects.create(field="anubhavjoshi")
>>> m2 = Bar.objects.create(foo=m1)
>>> m2.foo_id
u'Value: Value: anubhavjoshi'
>>> m2.foo.field
u'Value: Value: anubhavjoshi'

But if I do:

>>> m1 = Foo.objects.create(field="anubhavjoshi")
>>> m2 = Bar.objects.create(foo=m1)
>>> m2 = Bar.objects.all()[0]
>>> m2.foo_id
u'Value: Value: anubhavjoshi'
>>> m2.foo.field
DoesNotExist: Foo matching query does not exist.
>>> m2.foo
DoesNotExist: Foo matching query does not exist.

Following traceback:

---------------------------------------------------------------------------
DoesNotExist                              Traceback (most recent call last)
C:\Users\Anubhav\DjangoDev\apps\<ipython-input-7-3a4a21c18010> in <module>()
----> 1 m2.foo

c:\users\anubhav\django\django\db\models\fields\related.pyc in __get__(self, instance, instance_type)
    562                     qs = qs.filter(extra_filter, **params)
    563                 # Assuming the database enforces foreign keys, this won't fail.

--> 564                 rel_obj = qs.get()
    565                 if not self.field.rel.multiple:
    566                     setattr(rel_obj, self.field.related.get_cache_name(), instance)

c:\users\anubhav\django\django\db\models\query.pyc in get(self, *args, **kwargs)
    374             raise self.model.DoesNotExist(
    375                 "%s matching query does not exist." %
--> 376                 self.model._meta.object_name)
    377         raise self.model.MultipleObjectsReturned(
    378             "get() returned more than one %s -- it returned %s!" % (

comment:8 by ANUBHAV JOSHI, 10 years ago

Cc: anubhav9042@… added
Version: 1.2master

comment:9 by ANUBHAV JOSHI, 10 years ago

I am attaching a diff to fix what remains.

by ANUBHAV JOSHI, 10 years ago

Attachment: 14518.diff added

comment:10 by ANUBHAV JOSHI, 10 years ago

This might cause other tests to fail but might give some insight as to why this problem occured.
As queryset gets the correct value earlier on and is changed to empty qs after that condition.

comment:11 by Harm Geerts <hgeerts@…>, 10 years ago

The reason the first example works is because in Bar.objects.create(foo=m1) the m1 Foo instance is cached on the Bar instance, so m2.foo.field does not perform a database query.

The second example fails because m2 has to perform a database lookup, and that's where things get fun.

Your MyField.to_python is called multiple times so the value stored in the database is actually "Value: Value: anubhavjoshi".
When you access bar.foo the primary key value "Value: Value: anubhavjoshi" is passed through to_python again and becomes "Value: Value: Value: anubhavjoshi" and that primary key does not exist.

The reason your patch works is because you only had one Foo record in your dataset.
If you had multiple Foo records your test would raise MultipleObjectsReturned.
The only thing the patch does is disable related lookup filters which is not what you want ;)

It may be an idea to add a note to the documentation about custom field classes.

comment:12 by ANUBHAV JOSHI, 10 years ago

Thanks for your explanation. As I said my patch wasn't correct, it only was to tell which line caused errors.

Now that I tried the following:

def to_python(self, value):
	return ".".join(value)

So the lookup which failed earlier passes, but as explained by Harm above,

>>> m2.foo_id
u'a...n...u...b...h...a...v'
>>> m2.foo.field
u'a.......n.......u.......b.......h.......a.......v'

But it should have been: u'a.n.u.b.h.a.v'

It seems that to_python is called too many times and that might be wrong when it is defined in a way like in my examples..

Also if we remove __metaclass__ = models.SubfieldBase from our custom field then to_python does gets called but something strange happens as it appears it does not serves the purpose as always u'anubhav' is returned by m1.field or m2.foo_id.
There might be problem in that metaclass implementation, but thats just a wild guess.

Version 0, edited 10 years ago by ANUBHAV JOSHI (next)

comment:13 by Nick Sandford, 10 years ago

@anubhav9042: looks like this is a problem with the way you're subclassing the CharField. When to_python is called, it could well already be in the right format, or it might be unset. to_python gets called every time the field is assigned a value and you are responsible for ensuring it is in the right format (https://docs.djangoproject.com/en/1.6/howto/custom-model-fields/#django.db.models.Field.to_python). Notice the two lines at the beginning of that example code. You will need to do something similar.

comment:14 by antialiasis, 10 years ago

Is there any chance this might be fixed in some way at some point?

Like others, we've been using UUIDs as primary keys on a PostgreSQL database, and this bug is completely breaking prefetch_related. The way prefetching is implemented (at least in 1.7), instances are collected into a dictionary with their to_python'd primary keys as the keys, and then it tries to access them using the non-to_python'd foreign key ID field. Needless to say, this results in a KeyError if to_python changes the representation of the field at all (in our case, to strip PostgreSQL's native hyphens).

comment:15 by Tim Graham, 10 years ago

Yes, once we have a working patch and someone reviews it.

comment:16 by Tim Graham, 9 years ago

Resolution: worksforme
Status: newclosed

I can't reproduce the behavior described in the ticket as far back as Django 1.2. b.foo_id and b.foo.id both return an integer. A look of the follow up comments use SubfieldBase which is deprecated in favor of Field.from_db_value(). If problems remain in newer versions of Django, please open a new ticket with details.

Note: See TracTickets for help on using tickets.
Back to Top