Opened 16 months ago
Closed 16 months ago
#34791 closed Bug (fixed)
Prefetch's queryset attribute is ignored when the same nested relation is used multiple times.
Reported by: | Maxime Toussaint | Owned by: | Maxime Toussaint |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.2 |
Severity: | Normal | Keywords: | prefetch, prefetch_related |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Note: Edited the description following the discussion
There seems to be an issue when using a Prefetch object to fetch something that has already been fetched. The issue only seems to happen when there is depth in the prefetch. Here is an example I made this morning that fails:
pizzas = Pizza.objects.all().prefetch_related( "toppings__origin", Prefetch( "toppings__origin", queryset=Country.objects.filter(label="China"), to_attr="china", ), ) china = pizzas[0].toppings.all()[0].china
Here, when trying to get china, I would assume it to either be a Country object or None. However, I get the message: AttributeError: 'Topping' object has no attribute 'china'
Here are the models I set up for my test:
class Country(models.Model): label = models.CharField(max_length=50) class Pizza(models.Model): label = models.CharField(max_length=50) class Topping(models.Model): pizza = models.ForeignKey(Pizza, on_delete=models.CASCADE, related_name="toppings") label = models.CharField(max_length=50) origin = models.ForeignKey( Country, on_delete=models.CASCADE, related_name="toppings" )
And here are the queries made when calling the queryset:
1. SELECT "tests_pizza"."id", "tests_pizza"."label", "tests_pizza"."provenance_id" FROM "tests_pizza" 2. SELECT "tests_topping"."id", "tests_topping"."pizza_id", "tests_topping"."label", "tests_topping"."origin_id" FROM "tests_topping" WHERE "tests_topping"."pizza_id" IN (1, 2) 3. SELECT "tests_country"."id", "tests_country"."label", "tests_country"."continent_id" FROM "tests_country" WHERE "tests_country"."id" IN (2, 3, 4)
Note that the filter by label='china' has completely disappeared.
Now, if I switch the prefetches around like so:
pizzas = Pizza.objects.all().prefetch_related( Prefetch( "toppings__origin", queryset=Country.objects.filter(label="China"), to_attr="china", ), "toppings__origin", )
Fetching china now works, and here are the queries being made:
1. SELECT "tests_pizza"."id", "tests_pizza"."label", "tests_pizza"."provenance_id" FROM "tests_pizza" 2. SELECT "tests_topping"."id", "tests_topping"."pizza_id", "tests_topping"."label", "tests_topping"."origin_id" FROM "tests_topping" WHERE "tests_topping"."pizza_id" IN (1, 2) 3. SELECT "tests_country"."id", "tests_country"."label", "tests_country"."continent_id" FROM "tests_country" WHERE ("tests_country"."label" = 'China' AND "tests_country"."id" IN (2, 3, 4)) 4. SELECT "tests_country"."id", "tests_country"."label", "tests_country"."continent_id" FROM "tests_country" WHERE "tests_country"."id" IN (2, 3, 4)
This time, both calls to Country were made.
I did follow the code a bit yesterday, and I believe it comes from the fact that the ForeignKey field defines the cache name as being simply the name of the field. I am not certain though, and it would likely require someone with more knowledge than me to look into it.
Change History (14)
comment:1 by , 16 months ago
comment:2 by , 16 months ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
I have created a few models of my own and tested your queries. The queries are correct in both cases and I get expected results. I have created many toppings and onion
and tomato
are out of stock:
>>> def print_toppings(q): ... print([t.name for p in q for t in p.toppings.all()], [t.name for p in q for t in p.out_of_stock_toppings]) ... >>> q1 = Pizza.objects.all().prefetch_related('toppings',Prefetch('toppings', queryset=Topping.objects.filter(is_in_stock=False), to_attr='out_of_stock_toppings')) >>> print_toppings(q1) ['sweet pepper', 'egg', 'ham', 'bacon', 'onion', 'cheese', 'tomato', 'onion', 'sweet pepper', 'egg', 'cheese', 'onion', 'tomato'] ['onion', 'tomato', 'onion', 'onion', 'tomato'] >>> print("\n\n".join(i["sql"] for i in connection.queries)) SELECT "ticket_34791_pizza"."id", "ticket_34791_pizza"."name" FROM "ticket_34791_pizza" SELECT ("ticket_34791_pizza_toppings"."pizza_id") AS "_prefetch_related_val_pizza_id", "ticket_34791_topping"."id", "ticket_34791_topping"."name", "ticket_34791_topping"."is_in_stock" FROM "ticket_34791_topping" INNER JOIN "ticket_34791_pizza_toppings" ON ("ticket_34791_topping"."id" = "ticket_34791_pizza_toppings"."topping_id") WHERE "ticket_34791_pizza_toppings"."pizza_id" IN (2, 3, 4, 1) SELECT ("ticket_34791_pizza_toppings"."pizza_id") AS "_prefetch_related_val_pizza_id", "ticket_34791_topping"."id", "ticket_34791_topping"."name", "ticket_34791_topping"."is_in_stock" FROM "ticket_34791_topping" INNER JOIN "ticket_34791_pizza_toppings" ON ("ticket_34791_topping"."id" = "ticket_34791_pizza_toppings"."topping_id") WHERE (NOT "ticket_34791_topping"."is_in_stock" AND "ticket_34791_pizza_toppings"."pizza_id" IN (2, 3, 4, 1))
The second query also works fine:
>>> q2 = Pizza.objects.all().prefetch_related(Prefetch('toppings', queryset=Topping.objects.filter(is_in_stock=False), to_attr='out_of_stock_toppings'),'toppings') >>> print_toppings(q2) ['sweet pepper', 'egg', 'ham', 'bacon', 'onion', 'cheese', 'tomato', 'onion', 'sweet pepper', 'egg', 'cheese', 'onion', 'tomato'] ['onion', 'tomato', 'onion', 'onion', 'tomato'] >>> print("\n\n".join(i["sql"] for i in connection.queries)) SELECT "ticket_34791_pizza"."id", "ticket_34791_pizza"."name" FROM "ticket_34791_pizza" SELECT ("ticket_34791_pizza_toppings"."pizza_id") AS "_prefetch_related_val_pizza_id", "ticket_34791_topping"."id", "ticket_34791_topping"."name", "ticket_34791_topping"."is_in_stock" FROM "ticket_34791_topping" INNER JOIN "ticket_34791_pizza_toppings" ON ("ticket_34791_topping"."id" = "ticket_34791_pizza_toppings"."topping_id") WHERE (NOT "ticket_34791_topping"."is_in_stock" AND "ticket_34791_pizza_toppings"."pizza_id" IN (2, 3, 4, 1)) SELECT ("ticket_34791_pizza_toppings"."pizza_id") AS "_prefetch_related_val_pizza_id", "ticket_34791_topping"."id", "ticket_34791_topping"."name", "ticket_34791_topping"."is_in_stock" FROM "ticket_34791_topping" INNER JOIN "ticket_34791_pizza_toppings" ON ("ticket_34791_topping"."id" = "ticket_34791_pizza_toppings"."topping_id") WHERE "ticket_34791_pizza_toppings"."pizza_id" IN (2, 3, 4, 1)
You should note that when using to_attr
, the result is stored in a list as documented here (see the Note): https://docs.djangoproject.com/en/4.2/ref/models/querysets/#prefetch-objects
comment:3 by , 16 months ago
Hi, thanks a lot for the quick answer. I seem to have misunderstood the parameters of my issue, and the example I gave does work for me. In trying to simplify it, I took out the part that was making it fail.
So the issue only seems to happen when there is depth in the prefetch. Here is an example I made this morning that fails:
pizzas = Pizza.objects.all().prefetch_related( "toppings__origin", Prefetch( "toppings__origin", queryset=Country.objects.filter(label="China"), to_attr="china", ), ) china = pizzas[0].toppings.all()[0].china
Here, when trying to get china, I would assume it to either be a Country object or None. However, I get the message: AttributeError: 'Topping' object has no attribute 'china'
Here are the models I set up for my test:
class Country(models.Model): label = models.CharField(max_length=50) class Pizza(models.Model): label = models.CharField(max_length=50) class Topping(models.Model): pizza = models.ForeignKey(Pizza, on_delete=models.CASCADE, related_name="toppings") label = models.CharField(max_length=50) origin = models.ForeignKey( Country, on_delete=models.CASCADE, related_name="toppings" )
And here are the queries made when calling the queryset:
1. SELECT "tests_pizza"."id", "tests_pizza"."label", "tests_pizza"."provenance_id" FROM "tests_pizza" 2. SELECT "tests_topping"."id", "tests_topping"."pizza_id", "tests_topping"."label", "tests_topping"."origin_id" FROM "tests_topping" WHERE "tests_topping"."pizza_id" IN (1, 2) 3. SELECT "tests_country"."id", "tests_country"."label", "tests_country"."continent_id" FROM "tests_country" WHERE "tests_country"."id" IN (2, 3, 4)
Note that the filter by label='china' has completely disappeared.
Now, if I switch the prefetches around like so:
pizzas = Pizza.objects.all().prefetch_related( Prefetch( "toppings__origin", queryset=Country.objects.filter(label="China"), to_attr="china", ), "toppings__origin", )
Fetching china now works, and here are the queries being made:
1. SELECT "tests_pizza"."id", "tests_pizza"."label", "tests_pizza"."provenance_id" FROM "tests_pizza" 2. SELECT "tests_topping"."id", "tests_topping"."pizza_id", "tests_topping"."label", "tests_topping"."origin_id" FROM "tests_topping" WHERE "tests_topping"."pizza_id" IN (1, 2) 3. SELECT "tests_country"."id", "tests_country"."label", "tests_country"."continent_id" FROM "tests_country" WHERE ("tests_country"."label" = 'China' AND "tests_country"."id" IN (2, 3, 4)) 4. SELECT "tests_country"."id", "tests_country"."label", "tests_country"."continent_id" FROM "tests_country" WHERE "tests_country"."id" IN (2, 3, 4)
This time, both calls to Country were made.
I did follow the code a bit yesterday, and I believe it comes from the fact that the ForeignKey field defines the cache name as being simply the name of the field. I am not certain though, and it would likely require someone with more knowledge than me to look into it.
Thanks again, hopefully we can figure out what is happening!
comment:4 by , 16 months ago
Resolution: | worksforme |
---|---|
Status: | closed → new |
comment:5 by , 16 months ago
Description: | modified (diff) |
---|
comment:7 by , 16 months ago
I did follow the code a bit yesterday, and I believe it comes from the fact that the ForeignKey field defines the cache name as being simply the name of the field. I am not certain though, and it would likely require someone with more knowledge than me to look into it.
As far as I'm aware this is an expected behavior, subsequent lookups in prefetch_related()
affect each other, see docs:
"The ordering of lookups matters."
You prefetched toppings__origin
with a default queryset, so the second Prefetch()
ignores a custom queryset on the same relation. You should use Prefetch(..., to_attr="")
in both cases to have two independent queries.
In such cases, we may consider raising ValueError
.
comment:8 by , 16 months ago
Summary: | Issue when using Prefetch objects in prefetch_related → Prefetch's queryset attribute is ignored when the same nested relation is used multiple times. |
---|---|
Triage Stage: | Unreviewed → Accepted |
Tentatively accepted.
comment:9 by , 16 months ago
So, it makes sense that the order matters, especially for Prefetch with to_attr, since until it has been fetched, it does not exist yet. Here though, these are 2 very separate calls to the database that have absolutely no interaction with each other. Shouldn't they be independent? It doesn't make sense to me that it works in one order but it does not in the other. I am vey surprised to hear that such a thing could be the intended behaviour, as I can't make sense of why it could be. If you could give more depth to your reasoning, it would be much appreciated.
comment:10 by , 16 months ago
I am vey surprised to hear that such a thing could be the intended behaviour, as I can't make sense of why it could be. If you could give more depth to your reasoning, it would be much appreciated.
I changed my mind and agreed that we should fix it, that's why I accepted this ticket.
comment:11 by , 16 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:13 by , 16 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
Hello! Thank you for your report. Could you please post your models.py definition (reduced to this example)? Also it would be useful if you can paste the query each example generate.
This way, we can try to reproduce and triage accordingly. Thanks!