Opened 14 years ago
Closed 12 years ago
#16112 closed Bug (fixed)
Excluding some value in a related model excludes objects without that model
Reported by: | adehnert | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.3 |
Severity: | Normal | Keywords: | |
Cc: | adehnert | Triage Stage: | Design decision needed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have models that look something like:
class Group(models.Model): name = models.CharField(max_length=100) activity_category = models.ForeignKey('ActivityCategory', null=True, blank=True, ) class ActivityCategory(models.Model): name = models.CharField(max_length=50)
I'm trying to find each Group that doesn't have some ActivityCategory:
qobj = Q(activity_category__name='Dorm') queryset = groups.models.Group.objects.filter(~qobj)
Intuitively, I would expect this to find both groups with no ActivityCategory and groups where the ActivityCategory exists and it isn't "Dorm". Reading https://docs.djangoproject.com/en/dev/topics/db/queries/#lookups-that-span-relationships seems to support this interpretation --- "If you are filtering across multiple relationships and one of the intermediate models doesn't have a value that meets the filter condition, Django will treat it as if there is an empty (all values are NULL), but valid, object there." Unfortunately, the ORM generates a query using INNER JOIN instead of LEFT OUTER JOIN, so my Group with a null ActivityCategory will not be returned.
I can work around this by explicitly indicating that null is possible, but it's annoying and unintuitive:
qobj = Q(activity_category__isnull = True) | ~(Q(activity_category__name='Dorm') | Q(activity_category__name='FSILG')) queryset = groups.models.Group.objects.filter(qobj)
Attachments (1)
Change History (8)
comment:1 by , 14 years ago
Triage Stage: | Unreviewed → Design decision needed |
---|
comment:3 by , 14 years ago
Mmm.
If the outcome of the design decision is to stick with the status quo (which would make me sad, but such is life), it'd be nice to more clearly document that behavior. (Admittedly, I'm not sure that it'll actually help very many people.)
comment:4 by , 14 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
UI/UX: | unset |
Can't reproduce it anymore with a current trunk version. The query contains no INNER JOIN, but LEFT OUTER join.
comment:5 by , 14 years ago
Resolution: | worksforme |
---|---|
Status: | closed → reopened |
Hm, okay, so apparently I over-simplified the my steps to reproduce. It looks like the bug is actually in how combining Q objects is handled, somehow?
In particular, it appears that the LEFT OUTER JOIN occurs if (a) your qobj is a single clause, and (b) you don't try to NOT any Q wrapping the qobj. If you either do an OR (e.g., Q(activity_category__name='Dorm') | Q(activity_category__name='Dorm')
, which should have the same result set) or NOT a wrapping Q (for example, by doing a .exclude instead of a .filter), things will break.
The following demonstrates behavior with various different Q objects (see the attached models.py for what they are exactly):
>>> import groups.models; groups.models.test() or'd/exclude [<Group: foo (cat=1)>] or'd/filter [<Group: foo (cat=1)>] or'd/filterout [<Group: foo (cat=1)>] or'd/filterin [<Group: foo (cat=1)>] single/exclude [<Group: foo (cat=1)>] single/filter [<Group: foo (cat=1)>, <Group: Bar (cat=None)>] single/filterout [<Group: foo (cat=1)>] single/filterin [<Group: foo (cat=1)>, <Group: Bar (cat=None)>] SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm OR "groups_activitycategory"."name" = Dorm )) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm OR "groups_activitycategory"."name" = Dorm )) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm OR "groups_activitycategory"."name" = Dorm )) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm OR "groups_activitycategory"."name" = Dorm )) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT ("groups_activitycategory"."name" = Dorm ) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" LEFT OUTER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm AND NOT ("groups_activitycategory"."id" IS NULL))) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" INNER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT ("groups_activitycategory"."name" = Dorm ) SELECT "groups_group"."id", "groups_group"."name", "groups_group"."activity_category_id" FROM "groups_group" LEFT OUTER JOIN "groups_activitycategory" ON ("groups_group"."activity_category_id" = "groups_activitycategory"."id") WHERE NOT (("groups_activitycategory"."name" = Dorm AND NOT ("groups_activitycategory"."id" IS NULL)))
Given that ORing together identical Q objects will change the queryset produced by filtering on it, this seems a little more clearly a bug (rather than a wart) than it previously appeared.
comment:6 by , 12 years ago
Status: | reopened → new |
---|
comment:7 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
(I was wrong in my first comment -- in SQL NULL != <whatever>
evaluates to False
.)
Anyway, I just tried your example, and Django now consistently uses LEFT OUTER JOIN. This was fixed, maybe at the same time as #14876.
The docs basically say that the ORM won't crash'n'burn when you're filtering on
foo__bar=42
just because one of your objects doesn't have a relatedFoo
object. See the paragraph that follows your quote. It doesn't really say anything about "negative queries".That said, I agree with your analysis. Assuming a given object doesn't have an associated
Foo
object, the value of itsfoo__bar
is something akin toNaN
. SinceNaN != <whatever>
evaluates toTrue
, this analogy supports your interpretation.Fixing this problem would be backwards incompatible: it would change the result of "negative queries". IMO, we can't say it's an obvious bug, rather an under-defined behavior. So I'll mark it as DDN.