Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#20091 closed Bug (fixed)

outer join regression over nullable foreign keys

Reported by: Erin Kelly Owned by: nobody
Component: Database layer (models, ORM) Version: 1.5
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

We recently underwent an upgrade from Django 1.0.4 all the way to 1.5 and discovered a QuerySet regression at some intermediate point under Oracle. Given the models:

class Parent(models.Model):
    key = models.CharField(max_length=100)

class Child(models.Model):
    parent = models.ForeignKey(Parent, null=True)

Due to the way Oracle interprets empty strings as null, the following queries should both use an outer join and an IS NULL test:

Child.objects.filter(parent__key=None)
Child.objects.filter(parent__key=u'')

The first query correctly uses both an outer join and an IS NULL test. The code responsible for transforming this lookup into an isnull is at https://github.com/django/django/blob/master/django/db/models/sql/query.py#L1069.
The second query correctly uses an IS NULL test but with an inner join. The code responsible for transforming this lookup into an isnull is at https://github.com/django/django/blob/master/django/db/models/sql/where.py#L206.

I'm thinking that the latter code ought to be refactored into the former somehow.

Change History (9)

comment:1 by Erin Kelly, 12 years ago

Summary: outer join regressing over nullable foreign keysouter join regression over nullable foreign keys

comment:2 by Anssi Kääriäinen, 12 years ago

I see two possible fixes for this. The first is to add something like if lookup == 'exact' and connection.interprets_empty_strins_as_nulls and value = "": value = None just before the exact=None -> isnull conversion in build_filter (or add_filter in 1.5). This is somewhat dirty as the default connection will need to be used for checking the interprets_empty_strings_as_nulls condition (see is_nullable() in Query for another similar condition).

The other option is to do Oracle specific join promotion in compiler stage. This would involve going through all the where/having conditions checking for exact lookups against "". Then promote all the aliases found. This will be ugly to do correctly, but it would lead to correct results under multidb...

I think the first option is enough. If you are using multidb where your other database is Oracle, then you just need to do parent__key=None instead of u"" in your code.

comment:3 by Anssi Kääriäinen, 12 years ago

This seems to be regression from 1.1 to 1.2, in 1.1 add_filter() contains a value = "" and connection interprets emptry strings as nulls check, in 1.2 that is gone.

As such I don't see this as a regression and thus this isn't prime candidate for backpatch. Of course, the patch is going to be pretty simple, and if a core committer is willing to backpatch this to 1.5 I am not going to complain...

I will try to commit a patch to master soon, it will not apply directly to 1.5 due to changes between master and 1.5.

comment:4 by Anssi Kääriäinen, 12 years ago

A commit fixing this is available from https://github.com/akaariai/django/commit/be3f1d35be9f2db8c4c1f8ad5e7b0f1680675cb6

I am not sure if this should be actually fixed, wouldn't it be possible to just use somefield=None in this case? The fix will not work under multidb and I believe changes done for multidb were the reason for this regression.

I will leave the commit and backpatch decision to Ian.

comment:5 by Erin Kelly, 12 years ago

I believe the fix is important because while somefield=None would work, the ORM itself uses the empty string. For example, this would produce an empty queryset in Oracle:

somechild = Child.objects.create(parent=someparent)
Child.objects.filter(parent__somefield=somechild.parent.somefield)

comment:6 by Anssi Kääriäinen, 12 years ago

OK, convinced.

I will try to remember to do the patch + backport to 1.5 today.

comment:7 by Anssi Kääriäinen <akaariai@…>, 12 years ago

Resolution: fixed
Status: newclosed

In e17fa9e877e84e93b699c2bd13ea48dbbb86e451:

Fixed #20091 -- Oracle null promotion for empty strings

comment:8 by Anssi Kääriäinen <akaariai@…>, 12 years ago

In 207117ae731096d8148c17c6ca16f06ebf18537c:

[1.5.x] Fixed #20091 -- Oracle null promotion for empty strings

Backpatch of e17fa9e877e84e93b699c2bd13ea48dbbb86e451

comment:9 by Anssi Kääriäinen, 12 years ago

The fix is ugly, but it is the best that can be done easily. The alternative of doing nothing leads to Django itself generating queries that do not work.

I think something like Query.conditional_actions could be added if a perfect fix is needed. These are something that can be added during query construction and will be executed at start of compiling the query. So, one could add a conditional_promote to conditional_actions for the alias, and then promote the join if the given connection needs it. Then, this conditional promotion needs to be done at the end of compiling, otherwise the query will be changed in compiler and this doesn't work.

So, this would be a lot of work and some complicated code is needed to make the promotion work correctly. The gain is that correct join promotion happens when using non-default database for nullable strings. Doesn't seem to be worth it...

Note: See TracTickets for help on using tickets.
Back to Top