Opened 9 months ago

Closed 9 months ago

#35425 closed Bug (fixed)

.save(force_update=True) not respected for model instances with default primary keys

Reported by: Jacob Walls Owned by: Jacob Walls
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Jacob Walls)

With this model,

class WithDefault(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4)
    message = models.CharField(null=True)

the first IntegrityError at line 5 is expected as of https://code.djangoproject.com/ticket/29260#comment:3, but the second one at line 6, I suggest, is not.

In [1]: from models import WithDefault

In [2]: import uuid

In [3]: known_uuid = uuid.uuid4()

In [4]: WithDefault.objects.create(pk=known_uuid)
Out[4]: <WithDefault: WithDefault object (0ccbf1df-6296-4efe-8f5d-8f9091d9ebdc)>

In [5]: WithDefault(pk=known_uuid, message="overwritten").save()
---------------------------------------------------------------------------
UniqueViolation                           Traceback (most recent call last)

UniqueViolation: duplicate key value violates unique constraint "models_withdefault_pkey"
DETAIL:  Key (id)=(0ccbf1df-6296-4efe-8f5d-8f9091d9ebdc) already exists.

...
IntegrityError: duplicate key value violates unique constraint "models_withdefault_pkey"
DETAIL:  Key (id)=(0ccbf1df-6296-4efe-8f5d-8f9091d9ebdc) already exists.


In [6]: WithDefault(pk=known_uuid, message="overwritten").save(force_update=True)
---------------------------------------------------------------------------
UniqueViolation                           Traceback (most recent call last)
File ~/django/django/db/backends/utils.py:105, in CursorWrapper._execute(self, sql, params, *ignored_wrapper_args)
    104 else:
--> 105     return self.cursor.execute(sql, params)

UniqueViolation: duplicate key value violates unique constraint "models_withdefault_pkey"
DETAIL:  Key (id)=(0ccbf1df-6296-4efe-8f5d-8f9091d9ebdc) already exists.


The above exception was the direct cause of the following exception:

IntegrityError                            Traceback (most recent call last)
Cell In[6], line 1
----> 1 WithDefault(pk=known_uuid, message="overwritten").save(force_update=True)

File ~/django/django/db/models/base.py:1185, in Model._do_insert(self, manager, using, fields, returning_fields, raw)
   1180 def _do_insert(self, manager, using, fields, returning_fields, raw):
   1181     """
   1182     Do an INSERT. If returning_fields is defined then this method should
   1183     return the newly created data for the model.
   1184     """
-> 1185     return manager._insert(
   1186         [self],
   1187         fields=fields,
   1188         returning_fields=returning_fields,
   1189         using=using,
   1190         raw=raw,
   1191     )
...
IntegrityError: duplicate key value violates unique constraint "models_withdefault_pkey"
DETAIL:  Key (id)=(0ccbf1df-6296-4efe-8f5d-8f9091d9ebdc) already exists.

I had an illuminating conversation on the forum regarding my surprise at the behavior of save() when dealing with the first failure on line 5. Stating what I learned from it in case helpful.

Until yesterday, I thought that the following two calls were equivalent, other than perhaps one being faster for updates and the other faster for inserts. In fact, I thought this equivalence would have been a nice clarification of the value prop of the ORM that “Django abstracts the need to use INSERT or UPDATE SQL statements.” In other words, what does save() do? It updates or creates.

def overwrite_1(known_uuid):
    MyModel.objects.update_or_create(
        pk=known_uuid,
        defaults={
            "other_field": 1,
        },
    )

def overwrite_2(known_uuid):
    MyModel(
        pk=known_uuid,
        other_field=1,
    ).save()

So, at least when I'm in an overwriting posture--and Ken brings up a good point that forcing the user to opt in to this potential for data loss for existing primary keys is worth something--I prefer save(). It's simpler and doesn't involve "defaults". (What's a "default" got to do with updating one row's values?) It's also faster for the UPDATE case, which is my hot path. I might be in the minority, and could be convinced otherwise. (A third variation is possible, .filter(pk=known).update(...), but not unless you know the objects necessarily exist.) But everything I just said is skating on pretty rarefied ice -- the basic point is that huh, I have to be careful with save() depending on the details of my field definitions?

This was acknowledged as an acceptable wart in #29260 and then documented further in #31071 with a fleshed out comment in the 3.0 release notes.

At all of those points, though, it was assumed that this would still succeed:

def overwrite_2(known_uuid):
    MyModel(
        pk=known_uuid,
        other_field=1,
    ).save(force_update=True)

This ticket is for that bug, the failure on line 6 in my REPL. (had a looksee, likely to be a one-line fix, happy to PR it 🤞).

---

But perhaps in another ticket, or on the forum, we could consider whether we should drive at the solution discussed several times by the participants on those tickets (~"if only there were a way to determine whether a field's value came from a default..."). On the thread I suggest we could do that by adjusting ._state.adding = False to agree with the documentation stating that when explicitly specifying primary keys, "Django will assume you’re changing the existing record rather than creating a new one."

To paraphrase Carlton, who suggested reverting #29260, that change traded one user's desire to optimize out one query, which can already be had with force_insert=True, for another person's obligation to start using force_update=True (once we fix it), which we discourage, or to avoid save() altogether in favor of the QuerySet API, which seems like a loss for feature parity / understandability.

I understand this might cause churn in the "release story" if we revisit it, but it's churn that wouldn't require user action. Appreciate all the hard work that went into developing and reviewing those changes--I'm totally fine with a decision to let it be. 🙂

---

I'll volunteer some small edits to the docs. I re-read the save() docs and didn't grok what "existing model instance" meant, since "exist" is also used in the sentence to refer to existing database rows. (We could say "fetched.") And Ken pointed me to the 3.0 release notes, which I wouldn't have discovered on my own developing a feature against 4.2. Suggest surfacing up the recipes in that 3.0 release note somewhere more permanent.

Change History (6)

comment:1 by Jacob Walls, 9 months ago

Description: modified (diff)

comment:2 by Jacob Walls, 9 months ago

Description: modified (diff)

comment:3 by Jacob Walls, 9 months ago

Has patch: set

comment:4 by Simon Charette, 9 months ago

Cc: Simon Charette added
Triage Stage: UnreviewedAccepted

Having save() always do the expected thing when called on a newly created instance of a model with a primary key configured with a Python level default is a complex situation as you've pointed out.

Given the ORM has historically expected newly created model instances to be used for additions and uses _state.adding to take a decision in the face of ambiguity I still believe that #29260 was the right thing to do.

With that being said, when being explicitly told what to do via force_update=True there is no save ambiguity and thus we should not fallback to force_insert. In this sense I think this represents a valid bug and the provided patch seems adequate.

comment:5 by Sarah Boyce, 9 months ago

Triage Stage: AcceptedReady for checkin

comment:6 by Sarah Boyce <42296566+sarahboyce@…>, 9 months ago

Resolution: fixed
Status: assignedclosed

In ceea86b:

Fixed #35425 -- Avoided INSERT with force_update and explicit pk.

Affected models where the primary key field is defined with a
default or db_default, such as UUIDField.

Note: See TracTickets for help on using tickets.
Back to Top