#9619 closed Bug (fixed)
to_python not called when fetching data with .values(...)
Reported by: | Valera Grishin | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | values, custom field, datetime |
Cc: | amlau@…, olau@…, Chris Chambers, Kevin Christopher Henry, django@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Consider following models:
from django.db import models class BinaryField(models.Field): __metaclass__ = models.SubfieldBase def encode(self): # some encoding here def decode(self): # some decoding here def db_type(self): return 'text' def get_db_prep_value(self, value): return super(BinaryField, self).get_db_prep_value(BinaryField.__encode(value)) def to_python(self, value): return super(BinaryField, self).to_python(BinaryField.__decode(value)) class Logo(models.Model): channel = models.CharField(max_length=100) logo = BinaryField(editable=False)
Now assume you want to fetch values of "logo" field:
print Logo.objects.values('logo')[1]['logo']
This will print encoded value as it is get stored in the database. That is to_python not called.
Change History (22)
comment:1 by , 16 years ago
Cc: | added |
---|
comment:2 by , 16 years ago
Cc: | added; removed |
---|
comment:3 by , 16 years ago
milestone: | → 1.1 |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:4 by , 16 years ago
Triage Stage: | Accepted → Design decision needed |
---|
comment:5 by , 16 years ago
Also occurs to me that a new parameter to values()
and values_list()
might not be out of the realms of possibility here.
Something like do_the_extra_work_required_to_convert_to_python_even_though_I_could_use_a_normal_queryset_with_defer = True
, perhaps. :-)
comment:6 by , 16 years ago
Malcolm, you mentions defer. Got something you want to share with the class :)
comment:8 by , 16 years ago
milestone: | 1.1 |
---|
Having thought about this a lot more, it isn't really a bug in existing functionality since values()
has always behaved this way and the name is pretty descriptive. Worthwhile having a discussion about adding some feature to this in the future, but I can't see that it's a bug in the 1.1-release sense.
comment:9 by , 15 years ago
Keywords: | values custom field datetime added |
---|---|
Version: | 1.0 → SVN |
This seems like a bug to me. Please see my post for a better case:
http://groups.google.com/group/django-users/browse_thread/thread/7f6f75334ef1653c/0174e8ce8e1e514b?hl=en%C2%AEe8ce8e1e514b
It seems to me that 'values()' should return data as it's _meant_ to be consumed, not as it's stored. If you want raw data, get at something like a 'raw_values()' method or similar. As it stands now, I'm not sure how to use date_hierarchy and maybe others with a custom datetime field when 'to_python()' is not called without digging into the internals. Are there other standard apps that access API'd data 'through the back door' as well?
comment:10 by , 14 years ago
Hey, while we're waiting for a design decision, could we please get a quick fix on this that updates the documentation? It never has occurred to me that values() wouldn't deserialize, I've always thought it would just be an easy way to get plain dicts/tuples instead of the heavier objects, and the documentation doesn't mention raw values at all:
"Returns a ValuesQuerySet -- a QuerySet that returns dictionaries when used as an iterable, rather than model-instance objects.
Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects."
I have been using values() for a couple of years without noticing the difference before today, where it broke on me because I've defined a custom field.
Here's a suggestion for a wording: "Note that the values returned are directly from the database, no field-specific conversions are performed, e.g. for custom fields with to_python()."
Otherwise, +1 for getting this fixed, with raw=True/False or whatever. :)
comment:11 by , 14 years ago
Cc: | added |
---|
comment:12 by , 14 years ago
Cc: | added |
---|
comment:13 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:14 by , 13 years ago
Easy pickings: | unset |
---|---|
UI/UX: | unset |
IMHO, if this isn't a bug it's definitely a gotcha (and the documentation should be updated). I spent the last several hours trying to debug why values_list was not returning the python objects defined by my custom model field's to_python method.
comment:15 by , 13 years ago
I probably haven't been here long enough, but I'm finding this behavior is extremely unexpected.
I'll leave this ticket in DDN because I'm not sure I grasp all the consequences.
But my gut feeling is that we should swallow the backwards incompatibility and fix this at some point — or at the very least make it clear in the documentation that QuerySet.values()
returns raw database values, as spit by the database adapter.
comment:16 by , 12 years ago
Triage Stage: | Design decision needed → Accepted |
---|
I vote for raw kwarg, if True no conversion happens. One day the default should be False, but likely with backwards compatibility period where it default still to True.
It might be worth it to inspect if there is a way to skip those method calls that do not actually do anything, maybe just if 'col_field.to_python == Field.to_python'. The standard Field.to_python just returns the value as is. There might be need to ask the field explicitly if the given connection's values needs conversion, most of the to_python() implementations seem non-necessary (IntegerField does int(value) for example, but I don't believe this is necessary on any DB's values).
The overhead of calling no-ops in trivial test case is 0.7s vs 0.07s. The test is:
def noop(value): return value myvals = [ (0, 1, 2, 3, 4), ] * 100 def values(): for row in myvals: yield row # alternate: yield map(noop, row) from datetime import datetime start = datetime.now() for _ in range(0, 10000): for val in values(): pass print datetime.now() - start
Still, the test passes in under a second even with the map call, so it might be that the overhead doesn't matter in practice.
Anyways, marking as accepted, it would be good to end up in a situation where you can opt out from conversion, but the default is to use to_python() for the values.
comment:17 by , 11 years ago
Cc: | added |
---|
comment:18 by , 10 years ago
Cc: | added |
---|
comment:19 by , 10 years ago
If i see it right, this has been fixed with Django v1.8 and the change from to_python()
to from_db_value()
isn't it?
The docs at https://docs.djangoproject.com/en/1.8/howto/custom-model-fields/#converting-values-to-python-objects says:
If present for the field subclass, from_db_value() will be called in all circumstances when the data is loaded from the database, including in aggregates and values() calls.
EDIT: I add a unittest in https://github.com/django/django/pull/4875 to test it.
comment:21 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
This is a bit of a line-ball, since it's called values, not Python objects, so returning the data representation isn't a completely bad idea -- leaving it up to the user to decide what they want to do with it. For example, complete consistency here would then be converting foreign key values to their corresponding Python objects (kind of defeating the purpose of calling
values()
instead of a normal queryset). I've just wontfixed #8144 about that particular issue (largely because of the huge impact it would have on existing code).I'm kind of in favour of staying away from
django.db.models.fields.Field
entirely when working withvalues()
andvalues_list()
, keeping it as the fast path for raw data.I'm going to risk the wrath of Jacob and DDN this for a little bit. I think we need a discussion when there's a quiet moment.