Opened 13 months ago

Closed 13 months ago

Last modified 10 months ago

#34964 closed Cleanup/optimization (wontfix)

Reversing the order of Q objects in a CheckConstraint generates a migration

Reported by: Jacob Walls Owned by: Jacob Walls
Component: Migrations Version: dev
Severity: Normal Keywords: noop
Cc: David Sanders, David Wobrock Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Changing Q(A) & Q(B) to Q(B) & Q(A) (or Q(B, A)) generates a migration that drops and recreates the exact same constraint.

Suggesting to build on the work in #34744 to adjust the identity property for Q objects to use a deterministic sort, given that ^, &, and | operations are commutative.

PR

Change History (14)

comment:1 by Jacob Walls, 13 months ago

Has patch: set

comment:2 by Simon Charette, 13 months ago

Triage Stage: UnreviewedAccepted

The only thing that slightly worries me is that I've seen Postgres be very finicky about the order of components in complex constraint when dealing with conditional of functional indices and in these rare cases the order matters even they are representing the same condition.

comment:3 by David Sanders, 13 months ago

Cc: David Sanders added

To extend Simon's concerns; I think this needs more investigation as commutative doesn't always mean that the system will behave in an equivalent way 🤔

comment:4 by Jacob Walls, 13 months ago

Simon's suggestion on the PR was to sort Q objects in the constructor of CheckConstraint so that we wouldn't even get into a situation where the autodetector would be remaining quiet even with SQL changes. I see that this would come at the cost of causing migrations to be emitted for existing projects, though, so we might want a release note.

The case that originally prompted me to open a ticket was for a migration dropping and recreating a constraint with the exact same SQL, e.g. changing Q(A) & Q(B) to Q(A, B) . I just added a regression test for this case.

comment:5 by Simon Charette, 13 months ago

The creation of new migrations could be avoided entirely by performing the sorting only during __eq__

  • django/db/models/constraints.py

    diff --git a/django/db/models/constraints.py b/django/db/models/constraints.py
    index 56d547e6b0..e221bebd8b 100644
    a b def __repr__(self):  
    152152        )
    153153
    154154    def __eq__(self, other):
    155         if isinstance(other, CheckConstraint):
    156             return (
    157                 self.name == other.name
    158                 and self.check == other.check
    159                 and self.violation_error_code == other.violation_error_code
    160                 and self.violation_error_message == other.violation_error_message
    161             )
    162         return super().__eq__(other)
     155        if not isinstance(other, CheckConstraint):
     156            return super().__eq__(other)
     157        # Relax check equality to equivalence comparisons to allow
     158        # re-ordering of components.
     159        if isinstance(self.check, Q):
     160            if not isinstance(other.check, Q) or not self.check.equivalent_to(
     161                other.check
     162            ):
     163                return False
     164        elif self.check != other.check:
     165            return False
     166        return (
     167            self.name == other.name
     168            and self.violation_error_code == other.violation_error_code
     169            and self.violation_error_message == other.violation_error_message
     170        )
    163171
    164172    def deconstruct(self):
    165173        path, args, kwargs = super().deconstruct()

comment:6 by Jacob Walls, 13 months ago

Thanks Simon. I just checked more carefully and found that I wasn't able to come up with a test case that creates a migration for an existing project. (Sorry!)

comment:7 by David Wobrock, 13 months ago

Cc: David Wobrock added

comment:8 by Mariusz Felisiak, 13 months ago

I have doubts, in my experience database optimizers are sometimes very picky and the order of statements may be significant even for commutative operators.

Is it worth changing?

comment:9 by Jacob Walls, 13 months ago

Thanks for the extra context around those edge cases. I'd love to know more about them so I can look for them in my own projects.

I notice we've been sorting the arguments to the Q constructor since #29125. So if someone did need to change the order to work around a database edge case, they'd want a change from Q(A, B) to Q(A) & Q(B) to generate a migration, since if B sorts before A, it can only be expressed in Django as the latter, and this PR would take away the migration to make it.

The argument in favor of this change (which is very slight, I know) is that refactoring the code for consistency/readability with the surrounding constraints might generate a migration with a table rebuild. That's something you might miss. In other words, you need to know that it's not worth paying that cost when you're editing/reviewing the code. Until I knew about these edge cases, I thought that was a potential DX win to reduce cognitive burden.

Finally, I guess if we still wanted this change we might ought to block it on a resolution of #25245. Should we Someday/Maybe this ticket, then?

comment:10 by Mariusz Felisiak, 13 months ago

I have a gut feeling that it's not worth doing, as there is a risk of potential regressions. If you modify check of condition you should be ready to deal with consequences of recreating the index, even if they theoretically have the same meaning 🤷 I'm not strongly against this patch, this is just a friendly warning ;)

... with a table rebuild. That's something you might miss.

I know how migrations work on SQLite ;)

comment:11 by David Sanders, 13 months ago

Advice from #postgresql IRC channel is that an expression's operand order _does_ matter, but only for check constraints.

in most contexts ANDed and ORed terms are sliced-and-diced as the planner sees fit. but there is one case that might matter:
the check constraint obviously gets evaluated to check inserted rows, and I don't believe (but would have to check) that it gets reordered there
in which case CHECK (foo OR slowfunc(bar)) might be faster than CHECK (slowfunc(bar) OR foo)

let me check the code. sec.
looks like check expressions are not reordered by the planner for performance
inside queries, lists of ANDed quals evaluated at a single node are reordered when possible to put expensive ones last (subject to security restrictions)
but that doesn't apply to checking constraints on insert/update
so I would expect CHECK (foo OR slowfunc(bar)) to be noticably faster than CHECK (slowfunc(bar) OR foo) in the case where "foo" is usually true

tested it and it looks like i am correct

Advice courtesy of RhodiumToad 💚

comment:12 by Mariusz Felisiak, 13 months ago

Resolution: wontfix
Status: assignedclosed
Triage Stage: AcceptedUnreviewed

It looks like changing the order is not an option, at least not for CheckConstraints.

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 10 months ago

In 7714ccfe:

Refs #34964 -- Doc'd that Q expression order is preserved.

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 10 months ago

In a8de04f8:

[5.0.x] Refs #34964 -- Doc'd that Q expression order is preserved.

Backport of 7714ccfeae969aca52ad46c1d69a13fac4086c08 from main

Note: See TracTickets for help on using tickets.
Back to Top