Opened 13 months ago
Last modified 8 months ago
#34943 assigned New feature
Support passing unique constraint names to bulk_create().
Reported by: | Alex Vandiver | Owned by: | HAMA Barhamou |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | bulk insert update upsert |
Cc: | Vitor Pereira, Chih Sean Hsu, Simon Charette, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
#31685 added support for INSERT ... ON CONFLICT(col1, col2) DO UPDATE SET ...
In some cases, however, the unique constraint may be on computed columns, which cannot be used with unique_fields
because they are directly quoted.
For instance:
from django.db.models.functions import Upper UserTopic.objects.bulk_create( [ut], update_conflicts=True, update_fields=['last_updated','visibility_policy'], unique_fields=['user_profile_id','stream_id',Upper('topic_name')], )
...fails because UserTopic has no field named 'Upper(F(topic_name))'
.
And:
UserTopic.objects.bulk_create( [ut], update_conflicts=True, update_fields=['last_updated','visibility_policy'], unique_fields=['user_profile_id','stream_id','upper(topic_name)'], )
...fails similarly, with UserTopic has no field named 'upper(topic_name)'
.
It would be a useful feature to be able to handle these cases.
Change History (11)
comment:1 by , 13 months ago
Cc: | added |
---|---|
Description: | modified (diff) |
Summary: | Support computed unique fields for INSERT ON CONFLICT ... UPDATE → Support passing database expressions to bulk_create()'s unique_fields. |
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → New feature |
comment:2 by , 13 months ago
Keywords: | bulk insert update upsert added |
---|
follow-up: 8 comment:3 by , 13 months ago
Maybe the would be an opportunity to deprecate/rename the unique_fields
kwarg name to unique_expressions
or unique_together
to denote that not only fields are acceptable here.
Or we could take an hybrid approach like we did with Index
and Constraint
where either allow update_fields
or update_expressions
to be specified.
I get a sense that we should have preferred a unique_constraint
kwarg that points to the name of an entry of Meta.constraints
or the name of a Field(unique)
in the first place instead of forcing users to specify the exact same index definition they have in their model, Meta.unique_together
, or Meta.constraints
instead.
For backends that support the ON CONFLICT ON CONSTRAINT
clause we could have simply used it for the constraint name specified otherwise we know what the index expression associated with the specified constraint is so we can generate the proper SQL.
In other words, for a model like
class UserTopic(models.Model): ... class Meta: constraints = [ UniqueConstraint( 'user_profile_id','stream_id',Upper('topic_name'), name='unique_profile_stream_topic' ) ]
The call to bulk_create
could simply have been
UserTopic.objects.bulk_create( [ut], update_fields=['last_updated','visibility_policy'], unique_constraint='unique_profile_stream_topic', ) --- My recommendation would be to introduce a `unique_constraint: str | tuple[str | Expression]` kwarg and deprecate `unique_fields`. When provided a `str` it would be a reference to a `UniqueConstraint` by `.name` and when it's a `tuple` it would be expected to be a index expression where `str` are resolved to field names. I guess the `str` -> `UniqueConstraint.name` part could be a distinct feature request.
comment:4 by , 13 months ago
Summary: | Support passing database expressions to bulk_create()'s unique_fields. → Support passing unique constraint names to bulk_create(). |
---|
Great suggestion, Simon!
comment:5 by , 13 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:6 by , 13 months ago
It's worth noting that [PostgreSQL's INSERT
documentation](https://www.postgresql.org/docs/current/sql-insert.html) suggests against ON CONFLICT ON CONSTRAINT
, preferring the raw list of expressions:
It is often preferable to use unique index inference rather than naming a constraint directly using
ON CONFLICT ON CONSTRAINT constraint_name
. Inference will continue to work correctly when the underlying index is replaced by another more or less equivalent index in an overlapping way, for example when usingCREATE UNIQUE INDEX ... CONCURRENTLY
before dropping the index being replaced.
That is, naming a specific constraint is in general more brittle, since it depends on the specific name, rather than naming the effect of the constraint.
In our particular use case, we also currently only have a unique index, not a constraint -- and I'm not sure how well doing that migration would go. But that's a separate issue.
comment:7 by , 13 months ago
Cc: | added |
---|
Makes sense.
I guess that from an application maintenance perspective there are still benefits in using references to already defined constraint though, think of urlpattern names, annotations, lookups.
To take back the example above nothing would prevent the ORM from simply resolving unique_constraint='unique_profile_stream_topic'
to ('user_profile_id', 'stream_id', Upper('topic_name'))
and avoid the database level usage of ON CONFLICT ON CONSTRAINT
if that's discouraged.
In other words, the aliasing capabilities could be solely done at the application level assuming users opt-in into the usage of Meta.constraints
. That would also happen to address the issue involved in dealing with the difference between unique indexes and constraints as Postgres treat them differently.
comment:8 by , 8 months ago
Replying to Simon Charette:
Maybe the would be an opportunity to deprecate/rename the
unique_fields
kwarg name tounique_expressions
orunique_together
to denote that not only fields are acceptable here.
Or we could take an hybrid approach like we did with
Index
andConstraint
where either allowupdate_fields
orupdate_expressions
to be specified.
I get a sense that we should have preferred a
unique_constraint
kwarg that points to the name of an entry ofMeta.constraints
or the name of aField(unique)
in the first place instead of forcing users to specify the exact same index definition they have in their model,Meta.unique_together
, orMeta.constraints
instead.
For backends that support the
ON CONFLICT ON CONSTRAINT
clause we could have simply used it for the constraint name specified otherwise we know what the index expression associated with the specified constraint is so we can generate the proper SQL.
In other words, for a model like
class UserTopic(models.Model): ... class Meta: constraints = [ UniqueConstraint( 'user_profile_id','stream_id',Upper('topic_name'), name='unique_profile_stream_topic' ) ]The call to
bulk_create
could simply have been
UserTopic.objects.bulk_create( [ut], update_fields=['last_updated','visibility_policy'], unique_constraint='unique_profile_stream_topic', )---
My recommendation would be to introduce a
unique_constraint: str | tuple[str | Expression]
kwarg and deprecateunique_fields
. When provided astr
it would be a reference to aUniqueConstraint
by.name
and when it's atuple
it would be expected to be a index expression wherestr
are resolved to field names.
I guess the
str
->UniqueConstraint.name
part could be a distinct feature request but it seems related to this issue given the only way Django supports creating unique constraints composed of expressions is throughMeta.constaints
and having to define the exact expression twice is error prone.
Hi, If you depreciated unique_fields, what would that mean for ticket #34277 https://code.djangoproject.com/ticket/34277, which I'm working on?
comment:9 by , 8 months ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:10 by , 8 months ago
Cc: | added |
---|
comment:11 by , 8 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
I will make a proposal to work on the ticket for GSoC 2024.
Sounds reasonable.