Opened 13 years ago
Closed 13 years ago
#17251 closed Cleanup/optimization (fixed)
select_for_update tests threads should close connections manually
Reported by: | Anssi Kääriäinen | Owned by: | Aymeric Augustin |
---|---|---|---|
Component: | Uncategorized | Version: | |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This is a small patch which closes two connections taken in threads.
It is good to close the conncetions manually, as they are thread locals in DatabaseWrapper, and thus new connections are created for new threads.
The connections will be closed by GC eventually, but IMHO it is good practice to close the connections manually, as they will not be used again anyways. They will be idle in transaction until they happen to be GCed.
Attachments (1)
Change History (7)
by , 13 years ago
Attachment: | 17251.diff added |
---|
comment:1 by , 13 years ago
comment:2 by , 13 years ago
I did some more digging into when separate thread connections are closed. It happens to be so that the current behaviour relies on DatabaseWrapper
being thread-local. The problem is that the chain from django.db.connection to the real underlying connection (for example psycopg2 connection) contains cycles. Because of the cycles, the connections will not be GCed (and thus closed) immediately when last reference to them is closed. It seems that an object which is threading.local subclass has a bit different rules to when it will be GCed (or more precisely, when its __dict__
will be collected). This makes select_for_update tests leave open connections behind if you move the threading.local from DatabaseWrapper
to the ConnectionHandler
._connections dictionary. This also has the possibility to make for some really hard to debug problems if Python (or PyPy
or Jython or...) decides to change the rules how GCing is implemented.
Anyways, it would be a good idea to always close connections in separate threads. The abovementioned patch does this for select_for_update tests. Some method decorators would be useful so that you could easily ensure your connections are closed when running command line tasks, or when creating separate threads. I think it would also be a good idea to break the cyclic references in DatabaseWrapper
(for example DatabaseWrapper
-> DatabaseFeatures
-> DatabaseWrapper
). This will however break every external backend... If this is not done, at least document the gotchas in this area of the code.
Additional note: do not use __del__
to check when the objects are GCed, this will make any cyclic reference uncollectable by the GC. I happen to know this isn't nice to debug... :)
But most importantly: it seems it is essential connections for separate threads are closed manually in the current Django implementation. So, lets do that for select_for_update tests.
comment:3 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 13 years ago
I don't think there is much to do in this ticket. The monologue above about breaking cycles in DatabaseWrapper is not relevant, as it would break every external backend, and it doesn't belong into this ticket anyways.
The patch is pretty trivial, so I think this is ready for commit. But as it is my patch, I can't set ready for checkin...
comment:5 by , 13 years ago
Owner: | changed from | to
---|
Yes, I noticed the patch is quite straightforward. I'll review it.
The GCed part of the description is somewhat wrong. Threadlocals do not get GCed when the thread goes away. However it seems they do get GCed if the thread has gone away and _another_ thread is then accessing the same thread-local object. Or at least this is my working hypothesis based on the fact that the connection gets closed on this line in the transactions.py, managed() [L111]:
And this is ran from main thread! I assume this is the first time the default_db_alias' DatabaseWrapper object (which is thread-local) is used after the another thread has gone away. I assume Python GCes the state of the other thread because it has gone away, and it sees it only when the object is accessed.
If you want to try tracking this down, I have a branch where you can run:
Then wait until you get into PDB. Begin stepping through the code. Note when the 'I got deleted' for thread 1 is printed. Note also which thread you are currently in. It is also good to use postgresql and set log_connections / log_disconnections to ON, and also log at least the backend pid in the log_line_prefix (%p).
I think this enforces the point that it would be good to close the connection in the tester threads. The current behavior seems to be based on pure luck :) I am also beginning to see why some of the core developers don't like thread-locals...
The test code for this is at:
https://github.com/akaariai/django/tree/connection_close