prefetch_related fails with SQLite when used with 1000 parent records
This is described in ticket:16937#comment:3, but I don't see any issue filed for it.
"[prefetch_related] will not work on some backends if you have a lot of objects in your queryset. For example the SQLite backend has a limitation of 999 parameters to single SQL query, so if you have 1000 objects, prefetch_related will fail as you need to supply 1000 id values to the query."'
Batch it up like in #16426 and #17788, which dealt with the same limitation?
Change History
(12)
Description: |
modified (diff)
|
Triage Stage: |
Unreviewed → Accepted
|
Owner: |
changed from nobody to Raphael Michel
|
Status: |
new → assigned
|
Patch needs improvement: |
set
|
Patch needs improvement: |
unset
|
Patch needs improvement: |
set
|
Patch needs improvement: |
unset
|
Triage Stage: |
Accepted → Ready for checkin
|
Triage Stage: |
Ready for checkin → Accepted
|
Patch needs improvement: |
set
|
Owner: |
Raphael Michel removed
|
Status: |
assigned → new
|
Owner: |
set to Yashas Donthi
|
Status: |
new → assigned
|
As mentioned in the PR, I think there is one major issue:
prefetch_related
expects the results of the prefetch query to be unique.Splitting the query in batches do not enforce uniqueness and can lead to incorrect results. In short, if related objects are shared between several instances, when instances are split into batches, nothing prevents a related objects from appearing twice in the results. For example:
Prefetching in batches of 2 results in prefetch queries for:
[A, B]
then[C]
.Which gives the following mapping of related objects:
[(A, 1), (B, 2)]
then[(A, 1)]
. Therel_obj_cache
would then be:A more detailed example is available on the PR.
Maybe the way forward is to use a set or a dict to store the related queries' results (either for
all_related_objects
or forrel_obj_cache
).