Opened 10 months ago
Last modified 9 months ago
#35099 closed Bug
Combining QuerySets with "|" or "&" produce side effects affecting further queries — at Version 1
Reported by: | Alan | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 5.0 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Hello everyone.
Combining some queries with "|" or "&" somehow affects queries involved in the operation, leading to malformed SQL and unexpected results.
Here are details and steps to reproduce. Apologise, for maybe a bit confusing model names, I copied them from production.
class SiteUser(models.Model): pass class Notification(models.Model): user = models.ForeignKey(to=SiteUser, on_delete=models.CASCADE) class PayoutRequest(models.Model): requester = models.ForeignKey(to=SiteUser, on_delete=models.CASCADE)
Test:
from django.test import TestCase from django.db.models import OuterRef, Exists from reproduce.models import Notification, SiteUser, PayoutRequest class Reproduce(TestCase): def test(self): u01 = SiteUser.objects.create() u02 = SiteUser.objects.create() u03 = SiteUser.objects.create() Notification.objects.create(user=u01) PayoutRequest.objects.create(requester=u01) Notification.objects.create(user=u02) PayoutRequest.objects.create(requester=u03) are_active = SiteUser.objects.all().distinct() got_money = SiteUser.objects.filter( Exists(PayoutRequest.objects.filter(requester=OuterRef('pk'))) ).distinct() whatever_query = SiteUser.objects.all().distinct() # Execute queries first time need_help = are_active.exclude(pk__in=got_money) notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True) query_before = str(notified.query) self.assertEqual(len(notified), 1) # correct whatever_query | got_money # Touch "got_money" with any other query # Execute same queries second time need_help = are_active.exclude(pk__in=got_money) notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True) query_after = str(notified.query) print(query_before) print(query_after) self.assertEqual(len(notified), 1) # expected 1, got 0 self.assertEqual(query_before, query_after) # false
As you can see, merely touching the got_money
query with any other query leads to modifying the results of the same queries executed after that.
This test case probably may be simplified even further, but unfortunately, I have no more time resources to dig much deeper.
I had another queries built using simple .filter() and .exclude(). Those were not affected by combining.
I found only this query got_money
using Exists()
and OuterRef()
to be affected. There might be more of which I am not aware of.
The reason for this I don't know, but query_before
and query_after
differs.
query_before
correctly separates subqueries using W0, U0, V0 aliases, while the query_after
uses a single U0 alias for all subqueries, leading to incorrect results.
Before
SELECT "reproduce_notification"."user_id" FROM "reproduce_notification" WHERE "reproduce_notification"."user_id" IN ( SELECT DISTINCT W0."id" FROM "reproduce_siteuser" W0 WHERE NOT ( W0."id" IN ( SELECT DISTINCT V0."id" FROM "reproduce_siteuser" V0 WHERE EXISTS( SELECT 1 AS "a" FROM "reproduce_payoutrequest" U0 WHERE U0."requester_id" = (V0."id") LIMIT 1 ) ) ) )
After
SELECT "reproduce_notification"."user_id" FROM "reproduce_notification" WHERE "reproduce_notification"."user_id" IN ( SELECT DISTINCT U0."id" FROM "reproduce_siteuser" U0 WHERE NOT ( U0."id" IN ( SELECT DISTINCT U0."id" FROM "reproduce_siteuser" U0 WHERE EXISTS( SELECT 1 AS "a" FROM "reproduce_payoutrequest" U0 WHERE U0."requester_id" = (U0."id") LIMIT 1 ) ) ) )
Found bug in version 4.2.7, but reproduced it in 5.0.1 the same way.
Feel free to request any additional information you might need for this.