Opened 4 years ago

Closed 4 years ago

#32089 closed Bug (fixed)

prefetch_related_objects() does not work for reused model instances.

Reported by: Dennis Kliban Owned by: AlexeyNigin
Component: Database layer (models, ORM) Version: 2.2
Severity: Normal Keywords:
Cc: François Freitag, Adam Johnson Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Dennis Kliban)

Our project processes instances in a stream. In some cases the instances are repeated. In these cases, we discovered that prefetch_related_objects() does not set the to_attr on all of the instances if the first instance in the list already has it set.

When Django determines that the very first instance in the list is_fetched[0], it does not call into the the prefetch_one_level()[1]. The attributed specified in the to_attr parameter is only set in the prefetch_one_level() method[2].

is_fetched is set by looking for the to_attr attribute on the instance[3].

[0] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1605-L1609
[1] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1624
[2] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1799
[3] https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1708

Attachments (1)

32089-test.diff (2.2 KB ) - added by Mariusz Felisiak 4 years ago.

Download all attachments as: .zip

Change History (12)

comment:1 by Dennis Kliban, 4 years ago

Description: modified (diff)

by Mariusz Felisiak, 4 years ago

Attachment: 32089-test.diff added

comment:2 by Mariusz Felisiak, 4 years ago

Cc: François Freitag added
Component: UncategorizedDatabase layer (models, ORM)
Resolution: needsinfo
Status: newclosed
Summary: prefetch_related_objects() does not set the `to_attr` on all instances if the first instance in the list already has it setprefetch_related_objects() does not set the to_attr on all instances if the first instance in the list already has it.
Type: UncategorizedBug

Thanks for this report, however I cannot reproduce this issue on Django 2.2 or on the current master (maybe I'm missing sth). Can you provide a sample project or a regression test? I tried the following test case 32089-test.diff.

comment:3 by François Freitag, 4 years ago

Our project processes instances in a stream.

Could you give more details on that? prefetch_related_objects() loads all objects upfront due to https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1546. Even with an overloaded __bool__, the QuerySet is iterated over by https://github.com/django/django/blob/stable/2.2.x/django/db/models/query.py#L1585.

comment:4 by Dennis Kliban, 4 years ago

Resolution: needsinfo
Status: closednew

I will put together a simple test that demonstrates the problem with Books and Authors. Here is the code from Pulp that causes this issue:

We have asyncio stages that process items in batches. Each batch is processed here[0]. That method calls into _needed_remote_artifacts()[1] which calls into prefetch_related_objects()[2].

This bug is experienced when the very first item in a batch that was already present in a previous batch. These batches contain other information also, so we don't consider the data fully duplicated.

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/stages/artifact_stages.py#L243
[1] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/stages/artifact_stages.py#L248
[2] https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/stages/artifact_stages.py#L270

comment:5 by Dennis Kliban, 4 years ago

I've written a simple test here[0]. This test fails with the following error:

(pulp) [vagrant@pulp2-nightly-pulp3-source-centos7 tests]$ ./runtests.py --settings=settings prefetch_related.test_prefetch_related_objects
Testing against Django installed in '/usr/local/lib/pulp/lib64/python3.6/site-packages/django' with up to 4 processes
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
.......E..
======================================================================
ERROR: test_prefetch_object_to_attr_twice (prefetch_related.test_prefetch_related_objects.PrefetchRelatedObjectsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vagrant/devel/django/tests/prefetch_related/test_prefetch_related_objects.py", line 114, in test_prefetch_object_to_attr_twice
    self.assertCountEqual(book3.the_authors, [self.author3])
AttributeError: 'Book' object has no attribute 'the_authors'

----------------------------------------------------------------------
Ran 10 tests in 0.091s

FAILED (errors=1)
Destroying test database for alias 'default'...

[0] https://github.com/dkliban/django/commit/8093932797a6597a5075ecd3038e2f91c632023a

comment:6 by Mariusz Felisiak, 4 years ago

Cc: Adam Johnson added
Summary: prefetch_related_objects() does not set the to_attr on all instances if the first instance in the list already has it.prefetch_related_objects() does not work for reused model instances.
Triage Stage: UnreviewedAccepted

Thanks Dennis for details and a regression test! We should fix this issue or at least document the restriction if it's not feasible.

comment:7 by Adam Johnson, 4 years ago

I had a brief look at fixing this. I think it's possible by filtering the objects before trying to prefetch for them. See https://github.com/adamchainz/django/commit/bc2991c0908abbf4973cbb4850ffff0dea4bbd6f . This currently fails three prefetch tests - one for executing an extra query, two for executing one fewer query (which may actually represent a true optimization).

comment:8 by AlexeyNigin, 4 years ago

Owner: changed from nobody to AlexeyNigin
Status: newassigned

Hello! I would like to work on this ticket to complete a requirement for the final project in my software engineering class (https://web.eecs.umich.edu/~weimerw/481/hw6.html). Since this is my first time contributing to Django, it might take me some time to make progress on this ticket. Thank you for your patience, and if you need to fix this ticket urgently in the meantime, feel free to unassign me from it.

comment:9 by AlexeyNigin, 4 years ago

Has patch: set

PR: https://github.com/django/django/pull/13774

I modified Adam Johnson's patch so that it doesn't fail any of the prefetch tests, and added a minimized version of Dennis Kliban's test for regression. Would appreciate a review!

comment:10 by Mariusz Felisiak, 4 years ago

Triage Stage: AcceptedReady for checkin

comment:11 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

Resolution: fixed
Status: assignedclosed

In b9ba85a:

Fixed #32089 -- Fixed prefetch_related_objects() when some objects are already fetched.

Thanks Dennis Kliban for the report and Adam Johnson for the initial
patch.

Co-authored-by: Adam Johnson <me@…>

Note: See TracTickets for help on using tickets.
Back to Top