#27017 closed Uncategorized (invalid)
Why doesn't Django's Model.save() save only the dirty fields by default? And how can I do that if I want?
Reported by: | prajnamort | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.8 |
Severity: | Normal | Keywords: | |
Cc: | Dan Tao | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I've noticed that Model.save() updates all fields by default, which can introduce a lot of race conditions.
If it updated only the dirty fields, the situation would be much better.
How can I do that?
Change History (6)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Please see TicketClosingReasons/UseSupportChannels for places to ask usage questions.
comment:3 by , 6 years ago
Cc: | added |
---|
Can I make a case for re-opening this?
I understand that update_fields makes it possible to only update specific fields of a model. But it places a significant burden on calling code and introduces a maintenance cost. For me to explain, first consider a typical function where update_fields can be useful:
    def update_thing(pk, foo):
        thing = Thing.objects.get(pk=pk)
        thing.foo = foo
        thing.save()
Code like this is incredibly common but potentially problematic, especially for sites with heavy production traffic. Different processes running to update various fields on the same model at the same time are prone to clobber each other's writes. This is where update_fields is currently the best fix available:
    def update_thing(pk, foo):
        thing = Thing.objects.get(pk=pk)
        thing.foo = foo
        thing.save(update_fields=['foo'])
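The clobbering described above can be reproduced without a database. The following is a minimal sketch in plain Python, not Django code: the `DB` dict, `load`, `save`, and `run` are hypothetical stand-ins for a table row, `Thing.objects.get()`, `Model.save()`, and two interleaved processes.

```python
# Toy stand-in for a database table: {pk: row}. Not Django; all names are illustrative.
DB = {1: {"foo": "old-foo", "bar": "old-bar"}}


def load(pk):
    """Return an in-memory copy of the row, like Thing.objects.get(pk=pk)."""
    return dict(DB[pk])


def save(pk, obj, update_fields=None):
    """Write back every field, or only the named ones if update_fields is given."""
    fields = list(obj) if update_fields is None else update_fields
    for name in fields:
        DB[pk][name] = obj[name]


def run(update_fields_a, update_fields_b):
    """Interleave two writers that both loaded the row before either saved."""
    DB[1] = {"foo": "old-foo", "bar": "old-bar"}
    a = load(1)  # process A reads
    b = load(1)  # process B reads the same snapshot
    a["foo"] = "new-foo"
    b["bar"] = "new-bar"
    save(1, a, update_fields_a)
    save(1, b, update_fields_b)  # B saves last
    return dict(DB[1])


full_save = run(None, None)
partial_save = run(["foo"], ["bar"])
print(full_save)     # B's stale copy of foo overwrites A's change
print(partial_save)  # both changes survive
```

With full saves, the second writer silently restores the old value of `foo`; restricting each write to the fields it changed avoids that particular lost update.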
I see two ways this could be better. First, this solution requires calling code to define the same information twice (what field(s) to update). Second, it adds a maintenance tax, as any developer who sets another field in the future has to remember to also update update_fields:
    def update_thing(pk, foo, bar):
        thing = Thing.objects.get(pk=pk)
        thing.foo = foo
        thing.bar = bar
        thing.save(update_fields=['foo', 'bar'])
The above example is contrived, of course; most real-world functions are bigger and more complex than this, meaning the opportunity to make mistakes is typically greater.
In my opinion Django could make most code bases inherently more resilient against latent race conditions by implementing some form of dirty field tracking and effectively providing the functionality of update_fields automatically. I would like to propose a new setting, something like SAVE_UPDATE_DIRTY_FIELDS_ONLY, to change the ORM's default behavior so that calls to Model.save() only update the fields that have been set on the model instance. Naturally for backwards compatibility this setting would be False by default.
I admit I probably haven't thought through all of the scenarios in which this might not be desirable. But my intuition is that more often than not, this change would be a very good one. Off the top of my head, some necessary exceptions to this behavior include:
- Calling save() on a new model instance without a PK (when inserting a record for the first time we obviously want to save all fields' default values)
- Fields that are designed to be set automatically, e.g. DateTimeField(auto_now=True)
- Any calls to save() where update_fields has been explicitly specified should remain untouched, I would think
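The proposed behavior and the three exceptions listed above can be sketched in plain Python. This is an illustration only, not Django's implementation or any existing API: `TrackedRecord`, `FIELDS`, `AUTO_FIELDS`, and `fields_to_save` are hypothetical names, and "dirty" is taken to mean "assigned after construction", as the proposal describes.

```python
class TrackedRecord:
    """Hypothetical stand-in for a model instance; not a Django class."""

    FIELDS = ("foo", "bar", "modified")
    AUTO_FIELDS = ("modified",)  # assumed analogue of auto_now fields

    def __init__(self, pk=None, **values):
        # Bypass __setattr__ so initial values are not counted as dirty.
        object.__setattr__(self, "_dirty", set())
        object.__setattr__(self, "pk", pk)
        for name in self.FIELDS:
            object.__setattr__(self, name, values.get(name))

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name in self.FIELDS:
            self._dirty.add(name)  # remember every field set by calling code

    def fields_to_save(self, update_fields=None):
        """Return the field names a save() call would write."""
        if update_fields is not None:
            return list(update_fields)  # explicit list is left untouched
        if self.pk is None:
            return sorted(self.FIELDS)  # first insert: write every field
        # Existing row: dirty fields plus automatically maintained ones.
        return sorted(self._dirty | set(self.AUTO_FIELDS))


new_thing = TrackedRecord(foo=1, bar=2)
existing = TrackedRecord(pk=5, foo=1, bar=2)
existing.foo = 10

print(new_thing.fields_to_save())                    # all fields
print(existing.fields_to_save())                     # foo plus the auto field
print(existing.fields_to_save(update_fields=["bar"]))  # caller's list wins
```

A real implementation would also have to decide how to treat in-place mutation of field values, deferred fields, and related objects, which this sketch ignores.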
If I'm making sense here, and there is support for re-opening this, perhaps it would make sense to update the title of this ticket to sound more like a feature request since I realize it currently reads like a usage question.
comment:4 by , 6 years ago
There's discussion in #4102 about trying to save only dirty fields. It looks like there were too many complications. If you want to try to tackle this, you should make your proposal on the DevelopersMailingList.
comment:5 by , 5 years ago
There is the django-dirtyfields package, which will at least tell you whether a model is dirty and which fields are dirty, but it won't do the saving...
https://github.com/romgar/django-dirtyfields
comment:6 by , 4 years ago
Just a note for anyone coming across Andreas' comment above. django-dirtyfields does now make it possible to update only the dirty (changed) fields: https://django-dirtyfields.readthedocs.io/en/develop/#saving-dirty-fields
You can manually pass update_fields to the save() method. Only the fields in that list will be updated through the query. See the docs: https://docs.djangoproject.com/en/1.9/ref/models/instances/#specifying-which-fields-to-save