#34964 closed Cleanup/optimization (wontfix)
Reversing the order of Q objects in a CheckConstraint generates a migration
Reported by: | Jacob Walls | Owned by: | Jacob Walls |
---|---|---|---|
Component: | Migrations | Version: | dev |
Severity: | Normal | Keywords: | noop |
Cc: | David Sanders, David Wobrock | Triage Stage: | Unreviewed |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Change History (14)
comment:1 by , 14 months ago
Has patch: | set |
---|
comment:2 by , 14 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:3 by , 14 months ago
Cc: | added |
---|
To extend Simon's concerns; I think this needs more investigation as commutative doesn't always mean that the system will behave in an equivalent way 🤔
comment:4 by , 14 months ago
Simon's suggestion on the PR was to sort Q objects in the constructor of CheckConstraint
so that we wouldn't even get into a situation where the autodetector would be remaining quiet even with SQL changes. I see that this would come at the cost of causing migrations to be emitted for existing projects, though, so we might want a release note.
The case that originally prompted me to open a ticket was for a migration dropping and recreating a constraint with the exact same SQL, e.g. changing Q(A) & Q(B)
to Q(A, B)
. I just added a regression test for this case.
comment:5 by , 14 months ago
The creation of new migrations could be avoided entirely by performing the sorting only during __eq__
-
django/db/models/constraints.py
diff --git a/django/db/models/constraints.py b/django/db/models/constraints.py index 56d547e6b0..e221bebd8b 100644
a b def __repr__(self): 152 152 ) 153 153 154 154 def __eq__(self, other): 155 if isinstance(other, CheckConstraint): 156 return ( 157 self.name == other.name 158 and self.check == other.check 159 and self.violation_error_code == other.violation_error_code 160 and self.violation_error_message == other.violation_error_message 161 ) 162 return super().__eq__(other) 155 if not isinstance(other, CheckConstraint): 156 return super().__eq__(other) 157 # Relax check equality to equivalence comparisons to allow 158 # re-ordering of components. 159 if isinstance(self.check, Q): 160 if not isinstance(other.check, Q) or not self.check.equivalent_to( 161 other.check 162 ): 163 return False 164 elif self.check != other.check: 165 return False 166 return ( 167 self.name == other.name 168 and self.violation_error_code == other.violation_error_code 169 and self.violation_error_message == other.violation_error_message 170 ) 163 171 164 172 def deconstruct(self): 165 173 path, args, kwargs = super().deconstruct()
comment:6 by , 14 months ago
Thanks Simon. I just checked more carefully and found that I wasn't able to come up with a test case that creates a migration for an existing project. (Sorry!)
comment:7 by , 14 months ago
Cc: | added |
---|
comment:8 by , 14 months ago
I have doubts, in my experience database optimizers are sometimes very picky and the order of statements may be significant even for commutative operators.
Is it worth changing?
comment:9 by , 14 months ago
Thanks for the extra context around those edge cases. I'd love to know more about them so I can look for them in my own projects.
I notice we've been sorting the arguments to the Q constructor since #29125. So if someone did need to change the order to work around a database edge case, they'd want a change from Q(A, B)
to Q(A) & Q(B)
to generate a migration, since if B sorts before A, it can only be expressed in Django as the latter, and this PR would take away the migration to make it.
The argument in favor of this change (which is very slight, I know) is that refactoring the code for consistency/readability with the surrounding constraints might generate a migration with a table rebuild. That's something you might miss. In other words, you need to know that it's not worth paying that cost when you're editing/reviewing the code. Until I knew about these edge cases, I thought that was a potential DX win to reduce cognitive burden.
Finally, I guess if we still wanted this change we might ought to block it on a resolution of #25245. Should we Someday/Maybe this ticket, then?
comment:10 by , 14 months ago
I have a gut feeling that it's not worth doing, as there is a risk of potential regressions. If you modify check
of condition
you should be ready to deal with consequences of recreating the index, even if they theoretically have the same meaning 🤷 I'm not strongly against this patch, this is just a friendly warning ;)
... with a table rebuild. That's something you might miss.
I know how migrations work on SQLite ;)
comment:11 by , 14 months ago
Advice from #postgresql IRC channel is that an expression's operand order _does_ matter, but only for check constraints.
in most contexts ANDed and ORed terms are sliced-and-diced as the planner sees fit. but there is one case that might matter:
the check constraint obviously gets evaluated to check inserted rows, and I don't believe (but would have to check) that it gets reordered there
in which case CHECK (foo OR slowfunc(bar)) might be faster than CHECK (slowfunc(bar) OR foo)
let me check the code. sec.
looks like check expressions are not reordered by the planner for performance
inside queries, lists of ANDed quals evaluated at a single node are reordered when possible to put expensive ones last (subject to security restrictions)
but that doesn't apply to checking constraints on insert/update
so I would expect CHECK (foo OR slowfunc(bar)) to be noticably faster than CHECK (slowfunc(bar) OR foo) in the case where "foo" is usually true
tested it and it looks like i am correct
Advice courtesy of RhodiumToad 💚
comment:12 by , 14 months ago
Resolution: | → wontfix |
---|---|
Status: | assigned → closed |
Triage Stage: | Accepted → Unreviewed |
It looks like changing the order is not an option, at least not for CheckConstraint
s.
The only thing that slightly worries me is that I've seen Postgres be very finicky about the order of components in complex constraint when dealing with conditional of functional indices and in these rare cases the order matters even they are representing the same condition.