#18785 closed Bug (fixed)
Extra-Join Regression from ticket #15316
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.4 |
Severity: | Normal | Keywords: | |
Cc: | trbs@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The change to prevent over-trimming in trim_joins() for https://code.djangoproject.com/ticket/15316 appears to have introduced new, unneeded joins (LEFT OUTERs) as a side effect. The result is that in certain cases Django 1.4+ creates less efficient SQL than Django 1.3.
The following test passes in 1.3.x and fails w/ #15316's change. The failure is due to a new LEFT OUTER JOIN appearing against Document's delete_log.
queries/models.py
class AuditLog(models.Model): info = models.CharField(max_length=50) class Person(models.Model): name = models.CharField(max_length=50) delete_log = models.ForeignKey(AuditLog) class Document(models.Model): status = models.CharField(max_length=50) delete_log = models.ForeignKey(AuditLog) person = models.ForeignKey(Person)
queries/tests:
class OneFourRegressionProof(unittest.TestCase): """ Should evolve this into a better test case that proves something positive, but for now this is attempting to provide evidence that for some queries we got less efficient in Django 1.4 """ def setUp(self): pass def test_it(self): """ Ideally, this would only create on INNER join on person. """ qs = Document.objects.exclude(delete_log__isnull=False).filter(status='something', person__delete_log__isnull=True) self.assertEquals(1, str(qs.query).count('INNER JOIN')) self.assertEquals(0, str(qs.query).count('OUTER JOIN'))
git bisect results:
(django-base)~/projects/elation/django-fork ((2e56066...)|BISECTING) philltornroth → git bisect bad 2e56066a5b93b528fbce37285bac591b44bc6ed7 is the first bad commit commit 2e56066a5b93b528fbce37285bac591b44bc6ed7 Author: Malcolm Tredinnick <malcolm.tredinnick@gmail.com> Date: Tue Aug 23 03:38:42 2011 +0000 Fixed an isnull=False filtering edge-case. Fixes #15316. The bulk of this patch is due to some fine analysis from Aleksandra Sendecka. git-svn-id: http://code.djangoproject.com/svn/django/trunk@16656 bcc190cf-cafb-0310-a4f2-bffc1f526a37 :100644 100644 5635f2141d5ed22bfcc03df7fd5456a48f7f051e 38c7601579c5b366a5027358a3806b24baeaddfb M AUTHORS :040000 040000 33a384a63b8e8d7e3b325640339f13a14babc315 fe57cf0e7bc932f15d181d3a186bffe453d6bb2a M django :040000 040000 afb2570e431195108c3d8b0b634e4b30eac42168 e1213b550270d389ffdca827116642b036855a6b M tests
It's worth noting that this is effectively an exacerbation of https://code.djangoproject.com/ticket/10790 . I began the discussion on that ticket and a patch is starting to take shape, but it looks too drastic to be a 1.4 merge candidate. I wanted to stand up this related ticket so that a more surgical fix to 1.4 could be considered, if reasonable.
Anecdotally, I'm seeing queries in our system where this change added 3-6 new joins, making an upgrade to 1.4 a complete non-starter for us unless there's a workaround that I haven't turned up yet.
Change History (9)
comment:1 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 12 years ago
Awesome! Thanks much for the support on this ticket and #10790. It's definitely appreciated. I'm still not well enough versed in the ORM code to be much help putting a patch together in short order, but I'd be more than willing to write more/better unit tests that verify the regression/fix... at this point I'm plenty familiar with the queries regression tests :)
comment:3 by , 12 years ago
I think you will need two thing here:
- better pruning logic in trim_joins(). You will need to have better information about which joins are direct foreign keys and thus trimmable.
- you will likely need to do similar improvement to Query.combine() as was done here: https://github.com/akaariai/django/blob/d301010fb92d6588d75fb3f9ecfe95ca6042cfed/django/db/models/sql/query.py#L480, that is, do not skip unused aliases, just unref them after addition.
I think you should try to avoid changing existing return value of setup_joins() or adding new mandatory parameters to trim_joins(). Breaking these APIs in a minor release doesn't look nice, even if they are private APIs.
Alternatively you might also want to check if you could solve #15316 in a way that doesn't introduce the regression. It seems that handling nonnull_check in the way it is done currently will simply not work.
If you can test #10790 and see if it solves this regression for you, that would be very welcome.
comment:4 by , 12 years ago
I have a pretty good understanding of what is going on. The trim joins logic should go as follows: if you have joins to forward direction (you are following a foreign key), and the value you want is the thing the foreign key points to (usually the PK of the foreign model), then you can trim the join. This will always be valid, as the value of the foreign key _must_ be the same it is one the referenced model, and there can only be one value on the other side (the other side must be unique). If the foreign key is nullable and contains a null, the join wouldn't produce anything anyways.
On the other hand if you are following a foreign key to the reverse direction (usually from some model's PK to the foreign key in the referenced model), then you can _never_ trim the join. The reason for this is that you can not know if there is anything on the other side, that is, even if you know the value the foreign key must have, you can't know if there is anything on the other side. As an example:
table a table b id 1 a_id 1 id 2
Now, if you make the assumption that if you know the value of id in table a, then you know the value of a_id in table b then you are wrong, as the id=2 value doesn't have any matching row on the other side.
Current Django code makes the assumption you can trim joins in the reverse case. This was blocked for the OneToOneKey case in #15316 by the nonnull_check, but I don't believe that to be the correct fix. Just block all trimming in the reverse case and you have your solution. The question is how do you pass the direction information from setup_joins to trim_joins with minimal (internal) API breakages. You might want to add a completely new variable to the Query class ("join_directions" or something like that), which tracks the join directions for each alias generated and nothing else. This way you need only internal changes to setup_joins, trim_joins, Query.__init__
and !Query.clone(). This solution isn't too beautiful, but I don't expect it to bite us. In any case it seems obvious code changes in this are will not be backportable, as the API is going to change a lot in 1.5.
I might be overcautious above. Another solution is to just add the direction to alias_map items and be done with it. This will break 3rd party code that uses alias_map. As this is totally private API one could claim that they had it coming all along... Still, it seems we can avoid this break somewhat easily, so why be evil...
You can get the directions from the patch in #10790, names_to_path().
If you still keep the left outer joins in place in trim_joins(), I believe you don't need any changes to !Query.combine(). In #10790 the left joins are trimmed, too, as I can't see any reason why they are non-trimmable.
The reverse join trimming issue isn't commonly seen, as it is somewhat rare to have reverse filters which are trimmable to begin with. The most common case is OneToOneKey, or doing something like A.objects.filter(b__a_id=1)
, but that usually doesn't make sense, instead you would just do A.objects.filter(id=1).
comment:5 by , 12 years ago
Cc: | added |
---|
comment:6 by , 12 years ago
This regression from 1.4 still appears broken in the last 1.4 and 1.5 builds. Are there any plans to fix it? Will improvements in 1.6 resolve this issue? Our project is orphaned on 1.3 because this change causes a huge query spike and abysmal performance.
comment:7 by , 12 years ago
I should point out though, the problem looks fixed in 1.6. I confirmed that our project unit test is now failing in an awesome direction using the newest bits.. the join count for the little test I wrote to remind us to look out for this error is 2 joins in v1.3, 4 joins in 1.4 & 1.5, and 1 join in 1.6. Huge thanks to Anssi who's work on https://code.djangoproject.com/ticket/10790 I believe is responsible for the major improvements. This ought to mean that basically everyone with a non-trivial Django project just got a really significant performance boost. If a 1.4 or 1.5 patch is too hairy, I suppose we can jump past them to 1.6 when it's stable.
comment:8 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:9 by , 12 years ago
Backpatching of all the ORM changes to 1.5 can't be done. Way too much risk, I am sure there are still regressions hiding in there. If a very targeted patch for this issue is written then that might be a candidate to 1.5 as a regression fix.
I am accepting this as a regression in 1.4. To get this fixed in 1.4 will need a simple patch. I have hopes that 1.5 will contain a more thorough setup_joins() refactoring from #10790, but as said, that will not help this ticket as the changes are way too large to backpatch.