Opened 4 years ago
Closed 4 years ago
#32089 closed Bug (fixed)
prefetch_related_objects() does not work for reused model instances.
Reported by: | Dennis Kliban | Owned by: | AlexeyNigin |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.2 |
Severity: | Normal | Keywords: | |
Cc: | François Freitag, Adam Johnson | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Our project processes instances in a stream. In some cases the instances are repeated. In these cases, we discovered that prefetch_related_objects()
does not set the to_attr
on all of the instances if the first instance in the list already has it set.
When Django determines that the very first instance in the list is_fetched
[0], it does not call into the the prefetch_one_level()
[1]. The attributed specified in the to_attr parameter is only set in the prefetch_one_level()
method[2].
is_fetched
is set by looking for the to_attr
attribute on the instance[3].
[0] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1605-L1609
[1] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1624
[2] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1799
[3] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1708
Attachments (1)
Change History (12)
comment:1 by , 4 years ago
Description: | modified (diff) |
---|
by , 4 years ago
Attachment: | 32089-test.diff added |
---|
comment:2 by , 4 years ago
Cc: | added |
---|---|
Component: | Uncategorized → Database layer (models, ORM) |
Resolution: | → needsinfo |
Status: | new → closed |
Summary: | prefetch_related_objects() does not set the `to_attr` on all instances if the first instance in the list already has it set → prefetch_related_objects() does not set the to_attr on all instances if the first instance in the list already has it. |
Type: | Uncategorized → Bug |
comment:3 by , 4 years ago
Our project processes instances in a stream.
Could you give more details on that? prefetch_related_objects()
loads all objects upfront due to https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1546. Even with an overloaded __bool__
, the QuerySet
is iterated over by https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1585.
comment:4 by , 4 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → new |
I will put together a simple test that demonstrates the problem with Books and Authors. Here is the code from Pulp that causes this issue:
We have asyncio stages that process items in batches. Each batch is processed here[0]. That method calls into _needed_remote_artifacts()
[1] which calls into prefetch_related_objects()
[2].
This bug is experienced when the very first item in a batch that was already present in a previous batch. These batches contain other information also, so we don't consider the data fully duplicated.
[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/stages/artifact_stages.py#L243
[1] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/stages/artifact_stages.py#L248
[2] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/stages/artifact_stages.py#L270
comment:5 by , 4 years ago
I've written a simple test here[0]. This test fails with the following error:
(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 tests]$ ./runtests.py --settings=settings prefetch_related.test_prefetch_related_objects Testing against Django installed in '/usr/local/lib/pulp/lib64/python3.6/site-packages/django' with up to 4 processes Creating test database for alias 'default'... System check identified no issues (0 silenced). .......E.. ====================================================================== ERROR: test_prefetch_object_to_attr_twice (prefetch_related.test_prefetch_related_objects.PrefetchRelatedObjectsTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/vagrant/devel/django/tests/prefetch_related/test_prefetch_related_objects.py", line 114, in test_prefetch_object_to_attr_twice self.assertCountEqual(book3.the_authors, [self.author3]) AttributeError: 'Book' object has no attribute 'the_authors' ---------------------------------------------------------------------- Ran 10 tests in 0.091s FAILED (errors=1) Destroying test database for alias 'default'...
[0] https://github.com/dkliban/django/commit/8093932797a6597a5075ecd3038e2f91c632023a
comment:6 by , 4 years ago
Cc: | added |
---|---|
Summary: | prefetch_related_objects() does not set the to_attr on all instances if the first instance in the list already has it. → prefetch_related_objects() does not work for reused model instances. |
Triage Stage: | Unreviewed → Accepted |
Thanks Dennis for details and a regression test! We should fix this issue or at least document the restriction if it's not feasible.
comment:7 by , 4 years ago
I had a brief look at fixing this. I think it's possible by filtering the objects before trying to prefetch for them. See https://github.com/adamchainz/django/commit/bc2991c0908abbf4973cbb4850ffff0dea4bbd6f . This currently fails three prefetch tests - one for executing an extra query, two for executing one fewer query (which may actually represent a true optimization).
comment:8 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Hello! I would like to work on this ticket to complete a requirement for the final project in my software engineering class (https://web.eecs.umich.edu/~weimerw/481/hw6.html). Since this is my first time contributing to Django, it might take me some time to make progress on this ticket. Thank you for your patience, and if you need to fix this ticket urgently in the meantime, feel free to unassign me from it.
comment:9 by , 4 years ago
Has patch: | set |
---|
PR: https://github.com/django/django/pull/13774
I modified Adam Johnson's patch so that it doesn't fail any of the prefetch tests, and added a minimized version of Dennis Kliban's test for regression. Would appreciate a review!
comment:10 by , 4 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Thanks for this report, however I cannot reproduce this issue on Django 2.2 or on the current master (maybe I'm missing sth). Can you provide a sample project or a regression test? I tried the following test case 32089-test.diff.