#35279 closed Cleanup/optimization (invalid)
Memory Leak with `prefetch_related`
Reported by: | Ken Tong | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 4.2 |
Severity: | Normal | Keywords: | memory leak |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Memory leak after calling `queryset.prefetch_related()` or `prefetch_related_objects()`.
To reproduce:
    import gc

    from django.db import models
    from django.db.models import prefetch_related_objects


    class Foo(models.Model):
        id = models.AutoField(primary_key=True)


    class Bar(models.Model):
        id = models.AutoField(primary_key=True)
        foo = models.ForeignKey(Foo, on_delete=models.CASCADE)


    def prepare_data():
        if Foo.objects.exists():
            return
        foo = Foo()
        foo.save()
        bar = Bar(foo=foo)
        bar.save()


    def test1():
        # no prefetch
        for foo in Foo.objects.all():
            for bar in foo.bar_set.all():
                print(foo.id, bar.id)


    def test2():
        # queryset.prefetch_related()
        for foo in Foo.objects.prefetch_related("bar_set").all():
            for bar in foo.bar_set.all():
                print(foo.id, bar.id)


    def test3():
        # prefetch_related_objects()
        foo_list = list(Foo.objects.all())
        prefetch_related_objects(foo_list, "bar_set")
        for foo in foo_list:
            for bar in foo.bar_set.all():
                print(foo.id, bar.id)


    def run():
        prepare_data()

        # warm up
        test1()
        test2()
        test3()

        gc.collect()
        gc.set_debug(gc.DEBUG_LEAK)
        gc.collect()
        print(f"baseline - garbage count: {len(gc.garbage)}")

        test1()
        gc.collect()
        print(f"test1 - garbage count: {len(gc.garbage)}")

        test2()
        gc.collect()
        print(f"test2 - garbage count: {len(gc.garbage)}")

        test3()
        gc.collect()
        print(f"test3 - garbage count: {len(gc.garbage)}")

        gc.set_debug(0)


    run()
Output
    1 1
    1 1
    1 1
    baseline - garbage count: 0
    1 1
    test1 - garbage count: 0  # no memory leak
    1 1
    test2 - garbage count: 23  # 23 objects leaked
    1 1
    test3 - garbage count: 46  # another 23 objects leaked
Change History (6)
comment:1 by , 10 months ago
comment:2 by , 10 months ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Bug → Cleanup/optimization |
Interesting, thanks for the report. Tentatively accepted for further investigation.
comment:3 by , 10 months ago
The following code snippet shows the same result:
    import gc


    class Parent:
        def __init__(self):
            self.cache = {}


    class Child:
        def __init__(self, parent):
            self.parent = parent


    def test():
        foo = Parent()
        bar = Child(parent=foo)
        foo.cache["bars"] = [bar]
        print(foo.cache, bar.parent)


    test()
    gc.collect()
    print(len(gc.garbage))

    gc.set_debug(gc.DEBUG_LEAK)
    gc.collect()
    print(len(gc.garbage))

    test()
    gc.collect()
    print(len(gc.garbage))
This results in the following output:

    {'bars': [<__main__.Child object at 0x6f520cdd90>]} <__main__.Parent object at 0x6f520cd6d0>
    0
    0
    {'bars': [<__main__.Child object at 0x6f520b32d0>]} <__main__.Parent object at 0x6f520b1fd0>
    gc: collectable <Parent 0x6f520b1fd0>
    gc: collectable <Child 0x6f520b32d0>
    gc: collectable <list 0x6f520b1600>
    gc: collectable <dict 0x6f520b1e80>
    4
After removing the `gc.set_debug` statement, `gc.garbage` is always empty, so this looks like a side effect of `DEBUG_LEAK`.
    {'bars': [<__main__.Child object at 0x7535cf1d90>]} <__main__.Parent object at 0x7535cf1650>
    0
    0
    {'bars': [<__main__.Child object at 0x7535cd7310>]} <__main__.Parent object at 0x7535cd5fd0>
    0
As per the `gc` documentation:
To debug a leaking program call gc.set_debug(gc.DEBUG_LEAK). Notice that this includes gc.DEBUG_SAVEALL, causing garbage-collected objects to be saved in gc.garbage for inspection.
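To make the quoted behaviour concrete, here is a minimal sketch (not taken from the ticket) that collects one reference cycle with and without `gc.DEBUG_SAVEALL`. The `Node` class and the `garbage_after_collect` helper are hypothetical names introduced only for this illustration, and `DEBUG_SAVEALL` is used on its own so that nothing is printed to stderr.

```python
import gc


class Node:
    """Hypothetical stand-in for any object that takes part in a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects pointing at each other; unreachable once this function returns.
    a, b = Node(), Node()
    a.other, b.other = b, a


def garbage_after_collect(flags):
    """Return len(gc.garbage) after collecting one cycle under the given debug flags."""
    previous = gc.get_debug()
    gc.collect()          # flush unrelated garbage first
    gc.garbage.clear()
    gc.set_debug(flags)
    try:
        make_cycle()
        gc.collect()
        return len(gc.garbage)
    finally:
        # Restore the previous debug flags and drop the saved objects.
        gc.set_debug(previous)
        gc.garbage.clear()


if __name__ == "__main__":
    # DEBUG_LEAK is a combination of flags that includes DEBUG_SAVEALL.
    print(bool(gc.DEBUG_LEAK & gc.DEBUG_SAVEALL))
    print("no flags:", garbage_after_collect(0))
    print("DEBUG_SAVEALL:", garbage_after_collect(gc.DEBUG_SAVEALL))
```

Under these assumptions, the count should be zero without flags and non-zero with `DEBUG_SAVEALL`, even though the same cycle is collected in both runs.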
So using `DEBUG_LEAK` causes collected objects to be stored in `gc.garbage`. Looking at `gc.garbage` in this case therefore does not identify a memory leak; on the contrary, it shows objects that were garbage collected.
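One way to check whether such a cycle is really freed, without relying on `gc.garbage` under `DEBUG_LEAK`, is to hold weak references to the objects and see whether they are cleared after a collection. The following is only a sketch under that assumption: it reuses the `Parent` and `Child` classes from the snippet above, while `make_cycle` and `cycle_is_freed` are hypothetical helpers that are not part of Django or of the original report.

```python
import gc
import weakref


class Parent:
    def __init__(self):
        self.cache = {}


class Child:
    def __init__(self, parent):
        self.parent = parent


def make_cycle():
    """Build a Parent <-> Child cycle and return weak references to both objects."""
    foo = Parent()
    bar = Child(parent=foo)
    foo.cache["bars"] = [bar]
    return weakref.ref(foo), weakref.ref(bar)


def cycle_is_freed():
    """Return (freed, unreachable_count, garbage_count) for one collected cycle."""
    gc.set_debug(0)       # make sure DEBUG_SAVEALL is not active
    gc.collect()          # flush unrelated garbage first
    gc.garbage.clear()
    parent_ref, child_ref = make_cycle()
    unreachable = gc.collect()  # number of unreachable objects found
    # A cleared weak reference returns None when called.
    freed = parent_ref() is None and child_ref() is None
    return freed, unreachable, len(gc.garbage)


if __name__ == "__main__":
    print(cycle_is_freed())
```

If the weak references are cleared and `gc.garbage` stays empty, the cycle was reclaimed; a real leak would leave at least one weak reference alive after `gc.collect()`.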
comment:4 by , 10 months ago
Thank you for your detailed explanation, Antoine. I confirm that the memory leak is a false alarm, and I am sorry about it.
comment:5 by , 10 months ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Hi Team,
So far I am adding the code below in the appropriate lines in order to fix the memory leak in my projects. Hopefully there will be a fix and documented way to properly clean up the cache.
Thank you for your attention!