#20091 closed Bug (fixed)
outer join regression over nullable foreign keys
Reported by: | Erin Kelly | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.5 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
We recently underwent an upgrade from Django 1.0.4 all the way to 1.5 and discovered a QuerySet regression at some intermediate point under Oracle. Given the models:
class Parent(models.Model): key = models.CharField(max_length=100) class Child(models.Model): parent = models.ForeignKey(Parent, null=True)
Due to the way Oracle interprets empty strings as null, the following queries should both use an outer join and an IS NULL test:
Child.objects.filter(parent__key=None) Child.objects.filter(parent__key=u'')
The first query correctly uses both an outer join and an IS NULL test. The code responsible for transforming this lookup into an isnull is at https://github.com/django/django/blob/master/django/db/models/sql/query.py#L1069.
The second query correctly uses an IS NULL test but with an inner join. The code responsible for transforming this lookup into an isnull is at https://github.com/django/django/blob/master/django/db/models/sql/where.py#L206.
I'm thinking that the latter code ought to be refactored into the former somehow.
Change History (9)
comment:1 by , 12 years ago
Summary: | outer join regressing over nullable foreign keys → outer join regression over nullable foreign keys |
---|
comment:2 by , 12 years ago
comment:3 by , 12 years ago
This seems to be regression from 1.1 to 1.2, in 1.1 add_filter() contains a value = "" and connection interprets emptry strings as nulls check, in 1.2 that is gone.
As such I don't see this as a regression and thus this isn't prime candidate for backpatch. Of course, the patch is going to be pretty simple, and if a core committer is willing to backpatch this to 1.5 I am not going to complain...
I will try to commit a patch to master soon, it will not apply directly to 1.5 due to changes between master and 1.5.
comment:4 by , 12 years ago
A commit fixing this is available from https://github.com/akaariai/django/commit/be3f1d35be9f2db8c4c1f8ad5e7b0f1680675cb6
I am not sure if this should be actually fixed, wouldn't it be possible to just use somefield=None in this case? The fix will not work under multidb and I believe changes done for multidb were the reason for this regression.
I will leave the commit and backpatch decision to Ian.
comment:5 by , 12 years ago
I believe the fix is important because while somefield=None would work, the ORM itself uses the empty string. For example, this would produce an empty queryset in Oracle:
somechild = Child.objects.create(parent=someparent) Child.objects.filter(parent__somefield=somechild.parent.somefield)
comment:6 by , 12 years ago
OK, convinced.
I will try to remember to do the patch + backport to 1.5 today.
comment:7 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:9 by , 12 years ago
The fix is ugly, but it is the best that can be done easily. The alternative of doing nothing leads to Django itself generating queries that do not work.
I think something like Query.conditional_actions could be added if a perfect fix is needed. These are something that can be added during query construction and will be executed at start of compiling the query. So, one could add a conditional_promote to conditional_actions for the alias, and then promote the join if the given connection needs it. Then, this conditional promotion needs to be done at the end of compiling, otherwise the query will be changed in compiler and this doesn't work.
So, this would be a lot of work and some complicated code is needed to make the promotion work correctly. The gain is that correct join promotion happens when using non-default database for nullable strings. Doesn't seem to be worth it...
I see two possible fixes for this. The first is to add something like
if lookup == 'exact' and connection.interprets_empty_strins_as_nulls and value = "": value = None
just before the exact=None -> isnull conversion in build_filter (or add_filter in 1.5). This is somewhat dirty as the default connection will need to be used for checking the interprets_empty_strings_as_nulls condition (see is_nullable() in Query for another similar condition).The other option is to do Oracle specific join promotion in compiler stage. This would involve going through all the where/having conditions checking for exact lookups against "". Then promote all the aliases found. This will be ugly to do correctly, but it would lead to correct results under multidb...
I think the first option is enough. If you are using multidb where your other database is Oracle, then you just need to do
parent__key=None
instead of u"" in your code.