#28273 closed Cleanup/optimization (fixed)
Document how to prevent adding columns with defaults in migrations
Reported by: | Raphael Gaschignard | Owned by: | Caio Ariede |
---|---|---|---|
Component: | Documentation | Version: | 1.10 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
So normally the Django documentation says that it doesn't use database defaults.
If you have a model where you add a NullBooleanField
with a model-side default and generate a migration then you end up with the following SQL being executed (at least for Postgres)
BEGIN; -- -- Add field some_field to table -- ALTER TABLE "appname_model" ADD COLUMN "some_field" boolean DEFAULT true NULL; ALTER TABLE "appname_model" ALTER COLUMN "some_field" DROP DEFAULT; COMMIT;
This actually backfills the value for you in the DB during the ADD COLUMN command
But from the PostgreSQL documentation:
When a column is added with ADD COLUMN, all existing rows in the table are initialized with the column's default value (NULL if no DEFAULT clause is specified).
Adding a column with a non-null default or changing the type of an existing column will require the entire table and indexes to be rewritten.
So if you're adding a column to a big enough table, the migration run itself will be very painful, because it will update _every_ row of your table to set the default.
Usually you want to set the default
value model-side to help with the future backfill. So the way to handle this is to usually remove the automatically inserted default from the migration. But it can be easy to miss this (or to have custom fields that auto-set defaults).
But the generated SQL is something you basically never want to run on a big table. I would love to have a way to just prevent a migration from ever generating an ADD COLUMN with a DEFAULT clause.
I think there would be a way to handle this through simple changes in the schema editor but I couldn't figure out how the default code was generated.
Change History (11)
follow-up: 2 comment:1 by , 8 years ago
comment:2 by , 8 years ago
Replying to Tim Graham:
Can't you accomplish this by adding the column as
null=True
, populating the values, then changing the column tonull=False
(similar to the Migrations that add unique fields howto)? I believe defaults should only be added when adding non-nullable columns -- if that behavior were changed, how would the non-null constraint be avoided?
Because we want to deploy without downtime, we already deploy fields with null=True
, and add a backfill later. The issue I'm trying to point out here is that with default=a_value
set (even with null=True
) we still hit an issue because Postgres backfills defaults into existing rows when a column is added (even if the column is nullable). I am not aware of how other DB backends handle adding a column with DEFAULT
+ NULL
set.
Our conclusion is that any AddField
with a default
set cannot be accomplished without downtime in a production system (similar to a non-null field), so we've written the following check for our migrations:
from django.db.migrations.operations.fields import AddField from django.db.models.fields import NOT_PROVIDED from django.db.models.signals import pre_migrate def check_migration_safety(**kwargs): # `plan` is not part of the public API and is subject to change in later # versions of Django. plan = kwargs['plan'] for migration_cls, rolled_back in plan: # Check if this is going forwards or rolling back a migration. if not rolled_back: for operation in migration_cls.operations: if isinstance(operation, AddField): description = "{migration!r} model '{model}' field '{field}': ".format( migration=migration_cls, model=operation.model_name, field=operation.name, ) if not operation.field.null: raise ValueError(description + "do not add new non-nullable columns") if operation.field.default != NOT_PROVIDED: raise ValueError(description + "do not set a default when adding columns") pre_migrate.connect( check_migration_safety, dispatch_uid="operations.check_migration_safety", )
comment:3 by , 8 years ago
Component: | Database layer (models, ORM) → Migrations |
---|---|
Type: | Uncategorized → Cleanup/optimization |
If a model field default
is added after adding a nullable field, does it have the same issue? For example:
migration one:
adding foo = models.NullBooleanField()
ALTER TABLE "t28273_test" ADD COLUMN "foo" boolean NULL;
migration two:
changing to foo = models.NullBooleanField(default=True)
ALTER TABLE "t28273_test" ALTER COLUMN "foo" SET DEFAULT true; # does this cause the backfill problem?
ALTER TABLE "t28273_test" ALTER COLUMN "foo" DROP DEFAULT;
I'm trying to think of how to solve this by structuring migrations appropriately because I'm not seeing another elegant way forward.
comment:4 by , 8 years ago
Adding a default after the column was added does not seem to trigger a backfill. Like you're saying, creating two migrations doesn't cause this issue to appear (except when custom fields add a default, for example)
testing=# select * from t; a | b ----+--- hi | 3 hi | 4 hi | 5 (3 rows) testing=# alter table t add column c integer null; ALTER TABLE testing=# select * from t; a | b | c ----+---+--- hi | 3 | hi | 4 | hi | 5 | (3 rows) testing=# alter table t alter COLUMN c set default 3; ALTER TABLE testing=# select * from t; a | b | c ----+---+--- hi | 3 | hi | 4 | hi | 5 | (3 rows) testing=# alter table t add column d integer null default 10; ALTER TABLE testing=# select * from t; a | b | c | d ----+---+---+---- hi | 3 | | 10 hi | 4 | | 10 hi | 5 | | 10 (3 rows)
comment:6 by , 8 years ago
Component: | Migrations → Documentation |
---|---|
Summary: | Have a way to prevent adding columns with defaults in migrations → Document how to prevent adding columns with defaults in migrations |
Triage Stage: | Unreviewed → Accepted |
comment:7 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Replying to Tim Graham:
Would documenting this solution be sufficient?
Where do you think the documentation for this could go?
Can't you accomplish this by adding the column as
null=True
, populating the values, then changing the column tonull=False
(similar to the Migrations that add unique fields howto)? I believe defaults should only be added when adding non-nullable columns -- if that behavior were changed, how would the non-null constraint be avoided?