Opened 7 years ago
Last modified 7 years ago
#29416 closed Bug
Undesired subquery added to the GROUP BY clause — at Initial Version
Reported by: | Antoine Pinsard | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.0 |
Severity: | Release blocker | Keywords: | groupby, subquery |
Cc: | Mariusz Felisiak | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I am facing an issue while upgrading from Django 1.11 to Django 2.0.
I have a complex query interacting with a legacy database, which I simplified below to hilight the issue:
>>> from user.models import Sponsor >>> from django.db.models import ExpressionWrapper, Count, DecimalField >>> from django.db.models.expressions import RawSQL >>> nb_reports = RawSQL("SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2", []) >>> str(Sponsor.objects.all().annotate(report_rate=ExpressionWrapper(nb_reports / Count('deliveries'), output_field=DecimalField())).order_by('-report_rate').query)
This code, in Django 1.11.9, gives me the following query:
SELECT `ala_sponsor`.`sponId`, [...], ((SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) / COUNT(`ala_sponsor_need`.`asnId`)) AS `report_rate` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId` ORDER BY `report_rate` DESC
This is the expected behavior and it works well.
However, in Django 2.0.5, the same code gives me this query:
SELECT `ala_sponsor`.`sponId`, [...], ((SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) / COUNT(`ala_sponsor_need`.`asnId`)) AS `report_rate` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) ORDER BY `report_rate` DESC
As you can see, the ORM appended the subquery (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2)
to the GROUP BY clause. Which is wrong, and takes forever to execute.
I tried to play with .values('id')
or such as I usually do when I get unexpected GROUP BY. I spent an afternoon on it but there's no way I could get rid of this undesired group by clause. The order_by
is not to blame either. Here is another example of what I tried:
str(Sponsor.objects.all().values('id').annotate(report_rate=ExpressionWrapper(nb_reports / Count('deliveries'), output_field=DecimalField())).order_by().query)
Which gives:
SELECT `ala_sponsor`.`sponId`, ((SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) / COUNT(`ala_sponsor_need`.`asnId`)) AS `report_rate` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) ORDER BY NULL
Also note that this is the annotate(report_rate=ExpressionWrapper(nb_reports / Count('deliveries'), output_field=DecimalField()))
that causes this issue. If I only do annotate(nb_reports=nb_reports)
or annotate(nb_deliveries=COUNT('deliveries'))
there is no additional GROUP BY clause generated.
In [40]: str(Sponsor.objects.all().values('id').annotate(nb_reports=nb_reports).order_by().query) Out[40]: "SELECT `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) AS `nb_reports` FROM `ala_sponsor`" In [41]: str(Sponsor.objects.all().values('id').annotate(nb_deliveries=Count('deliveries')).order_by().query) Out[41]: 'SELECT `ala_sponsor`.`sponId`, COUNT(`ala_sponsor_need`.`asnId`) AS `nb_deliveries` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId` ORDER BY NULL' In [42]: str(Sponsor.objects.all().values('id').annotate(nb_reports=nb_reports, nb_deliveries=Count('deliveries')).order_by().query) Out[42]: "SELECT `ala_sponsor`.`sponId`, COUNT(`ala_sponsor_need`.`asnId`) AS `nb_deliveries`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) AS `nb_reports` FROM `ala_sponsor` LEFT OUTER JOIN `ala_sponsor_need` ON (`ala_sponsor`.`sponId` = `ala_sponsor_need`.`asnSponId`) GROUP BY `ala_sponsor`.`sponId`, (SELECT COUNT(*) FROM pro_moderation WHERE objType='sponsor' AND objId=ala_sponsor.sponId AND state=2) ORDER BY NULL"