Opened 3 years ago
Closed 2 years ago
#33191 closed Cleanup/optimization (fixed)
Avoid unnecessary clear of cached reference
Reported by: | Barry Johnson | Owned by: | AllenJonathan |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | John Speno | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Consider this case of ORM models "Parent" and "Child", where Child has a foreign key reference to Parent (and the database can return generated IDs following insert operations):
parent = Parent(name='parent_object') child = Child(parent=parent) parent.save() child.save() print(child.parent.name)
The print statement will cause an unnecessary lazy read of the parent object.
In the application where this behavior was first observed, the application was creating thousands of parent and child objects using bulk_create(). The subsequent lazy reads occurred when creating log entries to record the action, and added thousands of unwanted SELECT queries.
Closed ticket #29497 solved a problem with potential data loss in this situation by essentially executing child.parent_id = child.parent.pk
while preparing the child object to be saved. However, when the child's ForeignKeyDeferredAttrbute "parent_id" changes value from None to the parent's ID, the child's internal cache containing the reference to "parent" is cleared. The subsequent reference to child.parent then must do a lazy read and reload parent from the database.
A workaround to avoid this lazy read is to explicitly update both the "parent_id" and "parent" cache entry by adding this non-intuitive statement:
child.parent = child.parent
after executing parent.save()
But it appears that a simple change could avoid clearing the cache in this narrow case.
Within Model._prepare_related_fields_for_save(), replace
setattr(self, field.attname, obj.pk)
with
setattr(self, field.name, obj)
This suggested code has -not- been tested.
This change would set the associated "parent_id" attribute while ensuring that the currently referenced object remains referenced.
Change History (11)
comment:1 by , 3 years ago
Description: | modified (diff) |
---|
comment:2 by , 3 years ago
Description: | modified (diff) |
---|
comment:3 by , 3 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:4 by , 3 years ago
Resolution: | wontfix |
---|---|
Status: | closed → new |
Triage Stage: | Unreviewed → Accepted |
Follow up discussion on the mailing list.
Reopening based on the discussion there.
comment:5 by , 3 years ago
Cc: | added |
---|
comment:6 by , 3 years ago
Description: | modified (diff) |
---|
follow-up: 8 comment:7 by , 3 years ago
Changed the suggested correction based on the discussion within the mailing list.
comment:8 by , 3 years ago
Replying to Barry Johnson:
Changed the suggested correction based on the discussion within the mailing list.
Would you like to prepare a patch?
comment:10 by , 2 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Hi Barry. Thanks for the report.
For me, this is just how things are expected to work, so I'm going to say wontfix. I'm happy if you want to take it to the DevelopersMailingList for a wider discussion, but (in general) I'm suspicious of a "simple change" adjusting a long-standing behaviour like this… (Perhaps there's a case to be made on the mailing list though.)
To address the example, my first thought is that you should be saving
parent
before creating child:My second is that, you have
parent
, so print the name from there:I appreciate you've no-doubt reduced this in order to show the issue more clearly than in a real example, but I'd suspect those thoughts would go back to the more realistic case.
However, as I say, the DevelopersMailingList is a better venue to pursue this kind of thing.
Thanks.