Opened 7 years ago
Last modified 3 months ago
#28821 assigned New feature
Allow QuerySet.bulk_create() on multi-table inheritance when possible
Reported by: | Joey Wilhelm | Owned by: | HAMA Barhamou |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | multi-tabel, bulk-creation, optimization, queryset, sql |
Cc: | Abhishek Gautam, Sardorbek Imomaliev, jon.dufresne@…, Shai Berger, Adam Johnson, Arthur, Paolo Melchiorre | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
According to this comment in bulk_create:
# When you bulk insert you don't get the primary keys back (if it's an
# autoincrement, except if can_return_ids_from_bulk_insert=True), so
# you can't insert into the child tables which references this.
This implies that, if we do retrieve primary keys from the parent model's bulk insert, then it is possible to bulk insert into the child tables automatically.
Now that Django does have the ability to automatically retrieve, and set, primary keys on a bulk create operation, it would be nice to allow this use case when possible (specifically, when the backend has can_return_ids_from_bulk_insert=True). Keying it off this feature would give PostgreSQL this ability immediately, and then let it work for Oracle as soon as retrieval of PKs is fully supported on that engine as well.
Also, regardless of whether Django does this automatically, I would like to be able to manually set the _ptr fields on the child records in order to effect a bulk_create without the need for automatic retrieval of IDs. However, even that is not possible, as the bulk_create method fails on multi-table inheritance in all cases.
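For illustration, the manual variant (parent keys known in advance, child rows pointing at them) can be sketched at the SQL level. This is not Django's implementation and does not call bulk_create; it uses the standard-library sqlite3 module, and the place / restaurant tables, the place_ptr_id column and the function name are hypothetical stand-ins for a parent model and its multi-table child.

```python
import sqlite3


def bulk_insert_with_known_pks(rows):
    """Insert parent rows, then child rows, using caller-supplied keys.

    rows: iterable of (pk, name, serves_pizza) tuples.
    Table and column names are hypothetical, chosen for this sketch.
    """
    rows = list(rows)
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the parent link
    conn.execute("CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE restaurant ("
        " place_ptr_id INTEGER PRIMARY KEY REFERENCES place (id),"
        " serves_pizza INTEGER NOT NULL)"
    )
    with conn:  # one transaction covering both batches
        # Batch 1: parent table, with the primary keys set explicitly.
        conn.executemany(
            "INSERT INTO place (id, name) VALUES (?, ?)",
            [(pk, name) for pk, name, _ in rows],
        )
        # Batch 2: child table, reusing the same keys as the parent pointer.
        conn.executemany(
            "INSERT INTO restaurant (place_ptr_id, serves_pizza) VALUES (?, ?)",
            [(pk, int(flag)) for pk, _, flag in rows],
        )
    return conn.execute(
        "SELECT p.id, p.name, r.serves_pizza"
        " FROM place p JOIN restaurant r ON r.place_ptr_id = p.id"
        " ORDER BY p.id"
    ).fetchall()
```

Because the keys are supplied by the caller, no IDs have to be read back from the first batch; the automatic variant would differ only in obtaining those keys from the parent insert on backends that can return them.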
Change History (18)
comment:1 by , 7 years ago
Summary: | Allow bulk_create on multi-table inheritance when possible → Allow QuerySet.bulk_create() on multi-table inheritance when possible |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 7 years ago
Cc: | added |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:3 by , 7 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:4 by , 5 years ago
Cc: | added |
---|
comment:5 by , 4 years ago
comment:7 by , 4 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:8 by , 4 years ago
Patch needs improvement: | set |
---|
comment:9 by , 3 years ago
Cc: | added |
---|
There's a naive implementation of a special case in the new broken-down-models library. Could be interesting to compare.
https://github.com/Matific/broken-down-models/blob/main/bdmodels/models.py#L114 (actual line number may have changed by the time you read this, of course)
comment:10 by , 3 years ago
Cc: | added |
---|
comment:11 by , 2 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:12 by , 11 months ago
Has patch: | unset |
---|---|
Owner: | set to |
Patch needs improvement: | unset |
Status: | new → assigned |
comment:13 by , 10 months ago
Hi Django Team,
I am picking up where @jdufresne left off. My recent commits (https://github.com/django/django/pull/17754) introduce the initial steps towards enabling QuerySet.bulk_create to support multi-table inheritance.
This is just the beginning, and I plan to make iterative improvements to this feature. Looking forward to your feedback and suggestions as we progress.
comment:14 by , 10 months ago
Has patch: | set |
---|
comment:15 by , 10 months ago
Patch needs improvement: | set |
---|
Marking as "needs improvement" as author mentioned that it's a draft.
comment:16 by , 5 months ago
I have a use case where I would like to bulk create the child model and the primary key / parent key is known in advance. That seems the most straightforward version to support; I would love to have that enabled.
comment:17 by , 5 months ago
Cc: | added |
---|
comment:18 by , 3 months ago
Cc: | added |
---|---|
Keywords: | multi-tabel bulk-creation optimization queryset sql added |
This would be an excellent feature and we would really love to be able to use it.
In certain circumstances (think UUID PKs) it might also be possible, at least with PostgreSQL (don't know about the others), to do it the other way around. Foreign key integrity checks are deferred to the end of the transaction, so you could save all of the child tables first, and then save all the parent tables. As long as the entire operation is wrapped in a single transaction, it wouldn't matter that _ptr temporarily held an invalid FK.
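A minimal sketch of the child-first ordering, under stated assumptions: it uses the standard-library sqlite3 module as a stand-in (SQLite also supports DEFERRABLE INITIALLY DEFERRED foreign keys), not PostgreSQL and not Django's ORM, and the place / restaurant tables, the place_ptr_id column and the function name are hypothetical.

```python
import sqlite3
import uuid


def insert_children_first(n=3):
    """Insert child rows before their parents inside one transaction."""
    # isolation_level=None: no implicit transactions; BEGIN/COMMIT are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE place (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE restaurant ("
        " place_ptr_id TEXT PRIMARY KEY"
        " REFERENCES place (id) DEFERRABLE INITIALLY DEFERRED,"
        " serves_pizza INTEGER NOT NULL)"
    )
    # Keys are generated client-side, so nothing has to be read back.
    pks = [str(uuid.uuid4()) for _ in range(n)]

    conn.execute("BEGIN")
    # Child rows first: the pointer refers to parents that do not exist yet.
    conn.executemany(
        "INSERT INTO restaurant (place_ptr_id, serves_pizza) VALUES (?, ?)",
        [(pk, 1) for pk in pks],
    )
    # Parent rows second, still inside the same transaction.
    conn.executemany(
        "INSERT INTO place (id, name) VALUES (?, ?)",
        [(pk, f"place-{i}") for i, pk in enumerate(pks)],
    )
    conn.execute("COMMIT")  # the deferred FK check runs here

    return conn.execute(
        "SELECT COUNT(*) FROM place p"
        " JOIN restaurant r ON r.place_ptr_id = p.id"
    ).fetchone()[0]
```

The constraint is only checked at COMMIT, so the order of the two batches does not matter as long as both complete within the same transaction.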