#12374 closed (worksforme)
QuerySet .iterator() loads everything into memory anyway
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.1 |
Severity: | Keywords: | orm, cache, iterator | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Iterating through the result of .iterator() still causes a huge spike in memory consumption. In contrast, loading only one record with [:1] does not.
Others have run into this problem:
http://stackoverflow.com/questions/1443279/django-iterate-over-a-query-set-without-cache
Notice his follow-up comment to the suggestion of using .iterator():
"Its still chewing through a ton of RAM when I use your call. :("
This has been my experience as well.
Change History (5)
comment:1 by , 15 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
comment:2 by , 15 years ago
Argh, maybe it is psycopg2.
http://www.velocityreviews.com/forums/t649192-psycopg2-and-large-queries.html
comment:3 by , 15 years ago
Resolution: | worksforme |
---|---|
Status: | closed → reopened |
I'm having this problem with QuerySet.iterator() as well. We have a table with about 2 million records in it, and looping through it the following way causes the Python process to grow to upwards of 2GB of RAM (from about 50MB to start):
for p in Property.objects.all().iterator():
    print "hello"
Definitely confused because there's very little out there that says iterator() didn't help with memory consumption, and lots more out there that says iterator() did help. I'm running Django 1.1 with Python 2.6.4.
Does Django make use of the psycopg2 mentioned above?
comment:4 by , 15 years ago
Resolution: | → worksforme |
---|---|
Status: | reopened → closed |
The most likely cause of problems here is having DEBUG=True enabled; this will cause the debug cursor to soak up memory.
If you can validate that this problem still exists with DEBUG=False (and without the MySQL query cache problems), please reopen.
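To illustrate the mechanism described above, here is a minimal sketch, not Django's actual debug cursor. It assumes only that a debug cursor keeps a record of every statement it executes; `QueryLoggingCursor` and `run_queries` are hypothetical names, and the standard-library `sqlite3` module stands in for the real backend.

```python
import sqlite3


class QueryLoggingCursor:
    """Toy stand-in for a debug cursor: remembers every statement it runs."""

    def __init__(self, cursor, query_log):
        self.cursor = cursor
        self.query_log = query_log

    def execute(self, sql, params=()):
        # The log entry outlives the query, so memory grows with query count.
        self.query_log.append({"sql": sql, "params": params})
        return self.cursor.execute(sql, params)


def run_queries(count, debug):
    """Run `count` trivial queries and return how many log entries were kept."""
    connection = sqlite3.connect(":memory:")
    query_log = []
    raw_cursor = connection.cursor()
    cursor = QueryLoggingCursor(raw_cursor, query_log) if debug else raw_cursor
    for i in range(count):
        cursor.execute("SELECT ?", (i,))
    connection.close()
    return len(query_log)


logged_with_debug = run_queries(1000, debug=True)
logged_without_debug = run_queries(1000, debug=False)
```

In this toy model the log grows by one entry per query and is never trimmed, which is why a long-running loop that issues many queries can keep growing under a debug cursor. In Django itself, `django.db.reset_queries()` clears the recorded queries.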
comment:5 by , 15 years ago
I have experienced a similar problem, and I have set DEBUG=False in settings. I have over 7 million rows in my PostgreSQL database.
My current setup is Debian lenny running Django 1.2 (lenny backports) and PostgreSQL 8.3.7.
Currently I use a pagination approach to iterate through all records, as iterator() uses up close to all of my system memory.
However, it was interesting to see that I could iterate through the dataset by doing the following (using only a fraction of my system memory):
query_set = Article.objects.all()
cache = query_set._fill_cache
i = 0
while (True):
    try:
        r = cache.im_self[i]
        print r.id, r.title, r.author
        i += 1
    except Exception, e:
        break
Would appreciate it if someone could give some insight to this.
Regards.
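The pagination approach mentioned in comment:5 could look roughly like the following sketch. The comment does not say which form of pagination was used, so paging by primary key is an assumption here, as are the `article` table layout and the `iter_in_pages` helper name. The standard-library `sqlite3` module is used so the example runs without a PostgreSQL server.

```python
import sqlite3


def iter_in_pages(connection, page_size=1000):
    """Yield (id, title) rows page by page, holding one page in memory at a time."""
    last_id = 0  # assumes ids are positive integers
    while True:
        page = connection.execute(
            "SELECT id, title FROM article WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            break
        for row in page:
            yield row
        # Resume after the last id seen instead of using a growing OFFSET.
        last_id = page[-1][0]


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
connection.executemany(
    "INSERT INTO article (id, title) VALUES (?, ?)",
    [(i, "title %d" % i) for i in range(1, 26)],
)

rows = list(iter_in_pages(connection, page_size=10))
```

With the ORM, the same idea would be a loop over `Article.objects.filter(pk__gt=last_id).order_by('pk')[:page_size]`, so that each query returns at most one page of rows.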
The author there identifies the issue as being the MySQL query cache (in MySQLdb, I imagine). There's nothing we can do about that; when you use .iterator(), Django doesn't cache the data.