Opened 13 years ago

Closed 12 years ago

#16112 closed Bug (fixed)

Excluding some value in a related model excludes objects without that model

Reported by: adehnert Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords:
Cc: adehnert Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have models that look something like:

class Group(models.Model):
    name = models.CharField(max_length=100)
    activity_category = models.ForeignKey('ActivityCategory', null=True, blank=True, )
class ActivityCategory(models.Model):
    name = models.CharField(max_length=50)

I'm trying to find each Group that doesn't have some ActivityCategory:

qobj = Q(activity_category__name='Dorm')
queryset = groups.models.Group.objects.filter(~qobj)

Intuitively, I would expect this to find both groups with no ActivityCategory and groups where the ActivityCategory exists and it isn't "Dorm". Reading https://docs.djangoproject.com/en/dev/topics/db/queries/#lookups-that-span-relationships seems to support this interpretation --- "If you are filtering across multiple relationships and one of the intermediate models doesn't have a value that meets the filter condition, Django will treat it as if there is an empty (all values are NULL), but valid, object there." Unfortunately, the ORM generates a query using INNER JOIN instead of LEFT OUTER JOIN, so my Group with a null ActivityCategory will not be returned.

I can work around this by explicitly indicating that null is possible, but it's annoying and unintuitive:

qobj = Q(activity_category__isnull = True) | ~(Q(activity_category__name='Dorm') | Q(activity_category__name='FSILG'))
queryset = groups.models.Group.objects.filter(qobj)

Attachments (1)

models.py (1.4 KB ) - added by adehnert 13 years ago.
models.py to reproduce

Download all attachments as: .zip

Change History (8)

comment:1 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedDesign decision needed

The docs basically say that the ORM won't crash'n'burn when you're filtering on foo__bar=42 just because one of your objects doesn't have a related Foo object. See the paragraph that follows your quote. It doesn't really say anything about "negative queries".

That said, I agree with your analysis. Assuming a given object doesn't have an associated Foo object, the value of its foo__bar is something akin to NaN. Since NaN != <whatever> evaluates to True, this analogy supports your interpretation.

Fixing this problem would be backwards incompatible: it would change the result of "negative queries". IMO, we can't say it's an obvious bug, rather an under-defined behavior. So I'll mark it as DDN.

comment:2 by Aymeric Augustin, 13 years ago

Addendum: this might be a duplicate of #14876 — I'm not sure.

comment:3 by adehnert, 13 years ago

Mmm.

If the outcome of the design decision is to stick with the status quo (which would make me sad, but such is life), it'd be nice to more clearly document that behavior. (Admittedly, I'm not sure that it'll actually help very many people.)

comment:4 by Aleksandra Sendecka, 13 years ago

Resolution: worksforme
Status: newclosed
UI/UX: unset

Can't reproduce it anymore with a current trunk version. The query contains no INNER JOIN, but LEFT OUTER join.

by adehnert, 13 years ago

Attachment: models.py added

models.py to reproduce

comment:5 by adehnert, 13 years ago

Resolution: worksforme
Status: closedreopened

Hm, okay, so apparently I over-simplified the my steps to reproduce. It looks like the bug is actually in how combining Q objects is handled, somehow?

In particular, it appears that the LEFT OUTER JOIN occurs if (a) your qobj is a single clause, and (b) you don't try to NOT any Q wrapping the qobj. If you either do an OR (e.g., Q(activity_category__name='Dorm') | Q(activity_category__name='Dorm'), which should have the same result set) or NOT a wrapping Q (for example, by doing a .exclude instead of a .filter), things will break.

The following demonstrates behavior with various different Q objects (see the attached models.py for what they are exactly):

>>> import groups.models; groups.models.test()
or'd/exclude [<Group: foo (cat=1)>]
or'd/filter [<Group: foo (cat=1)>]
or'd/filterout [<Group: foo (cat=1)>]
or'd/filterin [<Group: foo (cat=1)>]
single/exclude [<Group: foo (cat=1)>]
single/filter [<Group: foo (cat=1)>, <Group: Bar (cat=None)>]
single/filterout [<Group: foo (cat=1)>]
single/filterin [<Group: foo (cat=1)>, <Group: Bar (cat=None)>]

SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  OR "groups_activitycategory"."name" = Dorm ))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT ("groups_activitycategory"."name" = Dorm )
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" LEFT OUTER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  AND NOT ("groups_activitycategory"."id" IS NULL)))
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT ("groups_activitycategory"."name" = Dorm )
SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" LEFT OUTER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm  AND NOT ("groups_activitycategory"."id" IS NULL)))

Given that ORing together identical Q objects will change the queryset produced by filtering on it, this seems a little more clearly a bug (rather than a wart) than it previously appeared.

comment:6 by Aymeric Augustin, 12 years ago

Status: reopenednew

comment:7 by Aymeric Augustin, 12 years ago

Resolution: fixed
Status: newclosed

(I was wrong in my first comment -- in SQL NULL != <whatever> evaluates to False.)

Anyway, I just tried your example, and Django now consistently uses LEFT OUTER JOIN. This was fixed, maybe at the same time as #14876.

Note: See TracTickets for help on using tickets.
Back to Top