Opened 5 years ago

Closed 5 years ago

Last modified 4 years ago

#30841 closed Cleanup/optimization (fixed)

Prevent using __isnull lookup with non-boolean value.

Reported by: André Ericson Owned by: André Ericson
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Mariusz Felisiak)

__isnull should not allow for non-boolean values. Using truthy/falsey doesn't promote INNER JOIN to an OUTER JOIN but works fine for a simple queries. Using non-boolean values is undocumented and untested. IMO we should raise an error for non-boolean values to avoid confusion and for consistency.

Change History (22)

comment:1 by André Ericson, 5 years ago

Owner: changed from nobody to André Ericson
Status: newassigned

comment:3 by André Ericson, 5 years ago

Has patch: set

comment:4 by Mariusz Felisiak, 5 years ago

Summary: Using isnull to a non-boolean value does not promote joinUsing __isnull lookup to a non-boolean value doesn't promote join.
Triage Stage: UnreviewedAccepted

comment:5 by Mariusz Felisiak, 5 years ago

Resolution: wontfix
Status: assignedclosed
Type: BugNew feature

After the reconsideration I don't think that we should change this documented behavior (that is in Django from the very beginning). __isnull lookup expects boolean values in many places and IMO it would be confusing if we'll allow for truthy/falsy values, e.g. take a look at these examples field__isnull='false' or field__isnull='true' (both would return the same result). You can always call bool() on a right hand side.

Sorry for my previous acceptation (I shouldn't triage tickets in the weekend).

Last edited 5 years ago by Mariusz Felisiak (previous) (diff)

in reply to:  5 comment:6 by André Ericson, 5 years ago

Replying to felixxm:

After the reconsideration I don't think that we should change this documented behavior (that is in Django from the very beginning). __isnull lookup expects boolean values in many places and IMO it would be confusing if we'll allow for truthy/falsy values, e.g. take a look at these examples field__isnull='false' or field__isnull='true' (both would return the same result). You can always call bool() on a right hand side.

Sorry for my previous acceptation (I shouldn't triage tickets in the weekend).

I understand your point. But is there anything we can do to avoid people falling for the same pitfall I did? The problem, in my opinion, is that it works fine for simple queries but as soon as you add a join that needs promotion it will break, silently. Maybe we should make it raise an exception when a non-boolean is passed?

One valid example is to have a class that implements __bool__.

You can see here https://github.com/django/django/blob/d9881a025c15d87b2a7883ee50771117450ea90d/django/db/models/lookups.py#L465-L470 that non-bool value is converted to IS NULL and IS NOT NULL already using the truthy/falsy values.

IMO it would be confusing if we'll allow for truthy/falsy values, e.g. take a look at these examples fieldisnull='false' or fieldisnull='true' (both would return the same result).

This is already the case. It just is inconsistent, in lookups.py field__isnull='false' will be a positive condition but on the query.py it will be the negative condition.

comment:7 by André Ericson, 5 years ago

Maybe adding a note on the documentation? something like: "Although it might seem like it will work with non-bool fields, this is not supported and can lead to inconsistent behaviours"

comment:8 by Mariusz Felisiak, 5 years ago

Description: modified (diff)
Has patch: unset
Resolution: wontfix
Status: closednew
Summary: Using __isnull lookup to a non-boolean value doesn't promote join.Prevent using __isnull lookup with non-boolean value.
Type: New featureCleanup/optimization

Agreed, we should raise an error for non-boolean values, e.g.

diff --git a/django/db/models/lookups.py b/django/db/models/lookups.py
index 9344979c56..fc4a38c4fe 100644
--- a/django/db/models/lookups.py
+++ b/django/db/models/lookups.py
@@ -463,6 +463,11 @@ class IsNull(BuiltinLookup):
     prepare_rhs = False
 
     def as_sql(self, compiler, connection):
+        if not isinstance(self.rhs, bool):
+            raise ValueError(
+                'The QuerySet value for an isnull lookup must be True or '
+                'False.'
+            )
         sql, params = compiler.compile(self.lhs)
         if self.rhs:
             return "%s IS NULL" % sql, params

I changed the ticket description.

comment:9 by André Ericson, 5 years ago

Thanks, I'll work on it! Wouldn't that possibly break backward compatibility? I'm not familiar with how Django moves in that regard.

comment:10 by Mariusz Felisiak, 5 years ago

We can add a release note in "Backwards incompatible changes" or deprecate this and remove in Django 4.0. I have to thing about it, please give me a day, maybe I will change my mind :)

comment:11 by André Ericson, 5 years ago

No problem. Thanks for taking the time to look into this!

comment:12 by André Ericson, 5 years ago

Another interesting example related to this:

As an anecdote, I've also got bitten by this possibility.
An attempt to write WHERE (field IS NULL) = boolean_field as .filter(field__isnull=F('boolean_field')) didn't go as I expected.
Alexandr Aktsipetrov -- https://groups.google.com/forum/#!msg/django-developers/AhY2b3rxkfA/0sz3hNanCgAJ

This example will generate the WHERE .... IS NULL. I guess we also would want an exception thrown here.

comment:13 by Mariusz Felisiak, 5 years ago

André, IMO we should deprecate using non-boolean values in Django 3.1 (RemovedInDjango40Warning) and remove in Django 4.0 (even if it is untested and undocumented). I can imagine that a lot of people use e.g. 1 and 0 instead of booleans.

Attached diff fixes also issue with passing a F() expression.

     def as_sql(self, compiler, connection):
        if not isinstance(self.rhs, bool):
            raise RemovedInDjango40Warning(...)
        ....
Last edited 5 years ago by Mariusz Felisiak (previous) (diff)

in reply to:  13 comment:14 by André Ericson, 5 years ago

Replying to felixxm:

André, IMO we should deprecate using non-boolean values in Django 3.1 (RemovedInDjango40Warning) and remove in Django 4.0 (even if it is untested and undocumented). I can imagine that a lot of people use e.g. 1 and 0 instead of booleans.

Attached diff fixes also issue with passing a F() expression.

     def as_sql(self, compiler, connection):
        if not isinstance(self.rhs, bool):
            raise RemovedInDjango40Warning(...)
        ....

Sound like a good plan. Not super familiar with the branch structure of Django. So, I guess the path to follow is to make a PR to master adding the deprecation warning and eventually when master is 4.x we create the PR raising the ValueError. Is that right?

Thanks!

comment:15 by Simon Charette, 5 years ago

André, yes mostly. You can find more details about that from the documentation.

in reply to:  15 comment:16 by André Ericson, 5 years ago

Replying to Simon Charette:

André, yes mostly. You can find more details about that from the documentation.

Thanks! I had missed the bit about deprecation. Super helpful!

comment:17 by André Ericson, 5 years ago

Has patch: set
Last edited 5 years ago by André Ericson (previous) (diff)

comment:18 by Mariusz Felisiak, 5 years ago

Needs tests: set
Patch needs improvement: set

comment:19 by André Ericson, 5 years ago

Patch needs improvement: unset

comment:20 by Mariusz Felisiak, 5 years ago

Needs tests: unset
Triage Stage: AcceptedReady for checkin

comment:21 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

Resolution: fixed
Status: newclosed

In 3117403:

Fixed #30841 -- Deprecated using non-boolean values for isnull lookup.

comment:22 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In 396da8b9:

Refs #30841 -- Made isnull lookup raise ValueError for non-boolean values.

Per deprecation timeline.

Note: See TracTickets for help on using tickets.
Back to Top