Opened 5 years ago

Closed 5 years ago

Last modified 4 years ago

#31667 closed Cleanup/optimization (fixed)

Avoid passing NULL to the IN lookup

Reported by: Adam Johnson Owned by: Adam Johnson
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Currently prefetch_related on a FK passes the NULL through to the database for e.g. author_id IN (NULL, 2). Passing NULL is always unnecessary, since it's not allowed in FK's. There's a small risk from passing NULL that it could lead to incorrect with complex prefetch querysets using PK refs because of NULL's weirdness in SQL.

For example with these models:

from django.db import models


class Author(models.Model):
    pass


class Book(models.Model):
    author = models.ForeignKey(Author, null=True, on_delete=models.DO_NOTHING)

Prefetching authors on Books, when at least one Book has author=None, uses IN (..., NULL, ...) in the query:

In [1]: from example.core.models import Author, Book

In [2]: a1 = Author.objects.create()

In [3]: Book.objects.create(author=a1)
Out[3]: <Book: Book object (3)>

In [4]: Book.objects.create(author=None)
Out[4]: <Book: Book object (4)>

In [5]: Book.objects.prefetch_related('author')
Out[5]: <QuerySet [<Book: Book object (3)>, <Book: Book object (4)>]>

In [6]: from django.db import connection

In [7]: connection.queries
Out[7]:
[{'sql': 'INSERT INTO "core_author" ("id") VALUES (NULL)', 'time': '0.001'},
 {'sql': 'INSERT INTO "core_book" ("author_id") VALUES (2)', 'time': '0.001'},
 {'sql': 'INSERT INTO "core_book" ("author_id") VALUES (NULL)',
  'time': '0.001'},
 {'sql': 'SELECT "core_book"."id", "core_book"."author_id" FROM "core_book" LIMIT 21',
  'time': '0.000'},
 {'sql': 'SELECT "core_author"."id" FROM "core_author" WHERE "core_author"."id" IN (NULL, 2)',
  'time': '0.000'}]

Maybe this could generally be extended to use of __in with non-nullable fields?

Change History (8)

comment:1 by Simon Charette, 5 years ago

Triage Stage: UnreviewedAccepted

Maybe this could generally be extended to use of __in with non-nullable fields?

Since IN translates to OR = for each elements and NULL != NULL I assume it could be done at the __in lookup level even for non-nullable fields.

Last edited 5 years ago by Simon Charette (previous) (diff)

comment:2 by Mariusz Felisiak, 5 years ago

Has patch: set
Owner: changed from nobody to Adam Johnson
Status: newassigned

comment:3 by Adam Johnson, 5 years ago

Apologies for not doing my paperwork.

comment:4 by Adam Johnson, 5 years ago

Summary: Avoid passing NULL to prefetch_related() queriesAvoid passing NULL to the IN lookup

comment:5 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

Resolution: fixed
Status: assignedclosed

In 5776a166:

Fixed #31667 -- Made in lookup ignore None values.

comment:6 by Adam Sołtysik, 4 years ago

This has partially fixed #20024.

comment:7 by Simon Charette, 4 years ago

As Adam Sołtysik pointed out in #20024.this change also introduced an inconsistency

Foo.objects.filter(bar__in=list(Bar.objects.values_list('nullable_field', flat=True)))

Won't match

Foo.objects.filter(bar__in=Bar.objects.values_list('nullable_field', flat=True))

If any NULL values are returned.

We should make sure that a complete corrective for #20024 lands in 3.2 otherwise 5776a1660e54a95159164414829738b665c89916 should be reverted.

comment:8 by Adam Sołtysik, 4 years ago

There is also an inconsistency between In and RelatedIn lookups. The latter seems to be working consistenly with this change (details in #20024).

Note: See TracTickets for help on using tickets.
Back to Top