#13442 closed (invalid)
only() and defer() will always select id
Reported by: | mampf | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.1 |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Hi there,
when you do a query with defer() or only(), you will still execute a SELECT query including the object's id. This makes it impossible to use distinct() on it.
last_5_uploaded_files = UploadFile.objects.order_by('upload_time')[:5] last_changed_list = MainObject.objects \ .filter(upload_file__in=last_5_uploaded_files) \ .only('field_1', 'hostname', 'target', 'modifed_date') \ .distinct() \ .order_by('-modified_date')[:50] response = render_to_response('output/mainobject_list.html', {'object_list': last_changed_list,}) connection.queries print("") # Still includes MainObject.id! WHY? print("current queue: %s" % connection.queries) print("") return response
Now if I print the executed SQL using connection.queries I get an SQL statement which includes queryset.id. This is why my distinct won't word - I want a distinct list created without id.
On the other hand, .values('field_1', 'field_n') does work. But this way I do not get the foreign table resolved, just it's id.
Please change only() and defer(), so it doesn't select the id in any case.
Thanks
Change History (4)
comment:1 by , 15 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
follow-up: 3 comment:2 by , 15 years ago
Hi lukeplant,
thanks for your answer. Here are some additional thoughts.
I used this document: http://docs.djangoproject.com/en/dev/ref/models/querysets/#defer-fields
It doesn't state that that the primary key an never be deferred.
only() is about only retrieving certain fields of objects
I thought I was doing this. But how useful is distinct when I need raw SQL to make it useful anyway? I see no usecase for that. The docs say:
However, if your query spans multiple tables, it's possible to get duplicate results when a QuerySet is evaluated. That's when you'd use distinct().
If I span over multiple tables, I always get ids as seen above. This is why I will never have an use case for distinct() then.
Thanks for your answer. It made some thinks about how the orm works a little clearer to me.
comment:3 by , 15 years ago
Replying to mampf:
Hi lukeplant,
thanks for your answer. Here are some additional thoughts.
I used this document: http://docs.djangoproject.com/en/dev/ref/models/querysets/#defer-fields
It doesn't state that that the primary key an never be deferred.
You need to read it again :-)
only() is about only retrieving certain fields of objects
I thought I was doing this. But how useful is distinct when I need raw SQL to make it useful anyway? I see no usecase for that. The docs say:
However, if your query spans multiple tables, it's possible to get duplicate results when a QuerySet is evaluated. That's when you'd use distinct().
If I span over multiple tables, I always get ids as seen above. This is why I will never have an use case for distinct() then.
The use case for distinct is, as stated, to eliminate duplicate results (which will have the same id).
comment:4 by , 12 years ago
Component: | ORM aggregation → Database layer (models, ORM) |
---|
The docs state that the primary key can never be deferred. We have to include the primary key, because the other fields in the return objects need to be lazily loaded. For that to work, we need to store the id so that we can later retrieve the right row of the database if needed.
Your use case doesn't really fit the reasons why either
distinct()
oronly()
exist.only()
is about only retrieving certain fields of objects, not generating arbitrary SQL. You may need to use some raw SQL to get what you want.Thanks.