#9878 closed Uncategorized (invalid)
Database backend leaves idling connections when used in multiple threads
Reported by: | Fredde | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.0 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
When creating and saving model instances in multiple threads, the database backend leaves idling connections. This will finally cause a "Too many connections"-exception.
I'm using Django 1.0 with a MySQL database
This script will leave ~10-20 open connections
def createArticle(): a = Article(title = '...', text = '...') a.save() threads = [] for i in xrange(200): t = threading.Thread(target = createArticle) t.start() threads.append(t) for t in threads: t.join()
To see the active connections in the mysql client
mysql> show full processlist; +--------+------+-----------+------+---------+------+-------+-----------------------+ | Id | User | Host | db | Command | Time | State | Info | +--------+------+-----------+------+---------+------+-------+-----------------------+ | 463987 | ---- | localhost | NULL | Query | 0 | NULL | show full processlist | | 463994 | ---- | localhost | ---- | Sleep | 44 | | NULL | | 464020 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464042 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464045 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464048 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464067 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464093 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464105 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464136 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464139 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464165 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464166 | ---- | localhost | ---- | Sleep | 85 | | NULL | | 464183 | ---- | localhost | ---- | Sleep | 85 | | NULL | +--------+------+-----------+------+---------+------+-------+-----------------------+ 14 rows in set (0.00 sec)
Change History (7)
follow-up: 3 comment:1 by , 16 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 16 years ago
Hi,
You are correct in that we are not using the HTTP layer. We have a back-end application and it was very convenient to reuse Django's model API that we use for the front-end Webapp. However there is clearly a bug in Django, as the connections are not closed even when the threads using those connections are terminated. Those DB connections are just hanging there for a long time (or forever, don't really know that).
We know that this is not exactly the way Django is supposed to be used, but Django has grown too powerful and useful to limit it only to the front-end.
norvegh
comment:3 by , 16 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Replying to mtredinnick:
However, Django closes its connection to the database at the end of each request/response cycle
This is not true even when using the HTTP layer (only tested with the development server). After all the threads have joined, there is still idling connections left behind. Try the code I posted erlier and you will see.
comment:4 by , 16 years ago
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
What Malcolm means is that the database connection Django itself created automatically for the ORM in the current request/thread will be closed. Additional connections that you yourself cause to be opened are your responsibility to clean up (just as files you open are your responsibility to close, mail-server connections you open are your responsibility to close, etc., etc.).
comment:5 by , 16 years ago
(and to see why this is so for the case of DB connections, note that Django's automatic connection-closing happens on the request_finished
signal, which fires only in a thread that's processing an HTTP request, not in any other threads you may have spawned on your own. Django takes care of its thread, and you should take care of yours)
comment:6 by , 16 years ago
So at the end of each thread django.db.connection.close() should be called? But still, not every thread leaves an open connection. Only about 10% of them, so it looks like a race condition to me.
comment:7 by , 12 years ago
Easy pickings: | unset |
---|---|
Severity: | → Normal |
Type: | → Uncategorized |
UI/UX: | unset |
I just got bitten by this after migration from Django 1.3.4 to 1.4.2. Test runner started to hang during cleanup (on an instruction that issues 'drop database test_<appname>
' query). Turns out that's because the threads one of the methods under test is spawning were not closing the database connection (even though they themselves seem exit correctly) and server was waiting for them to do so. show full processlist;
in mysql client was showing the threads as sleeping with no command to execute.
I understand that this is a bug in my code, but if the connections need to be closed explicitly, wouldn't it be better if I was also required to open them explicitly? I wasn't even aware that I'm using the database in those threads. They don't explicitly import any models or database-related code from Django. The fact that they're using it is the actual bug and it would be better if it caused an error instead of failing silently.
Another thing is that the test runner should be prepared to handle misbehaving code. Wouldn't it make sense for it check if there are connections from other threads still open (is it possible if the threads have already exited and the runner did not have any references to them?) and if so, print an error and forcefully close them? Letting it just hang like that makes debugging harder.
The particular example you give is not the way Django is used, however. So, yes, if you do something to deliberately create lots of connections, lot of connections will be created. However, Django closes its connection to the database at the end of each request/response cycle, so there is only one connection in operation per thread or process handling requests and responses. If you're not using the HTTP layer, it's still only one connection per thread of execution and you are in complete control of the number of threads you create.
So there's not really any Django bug here. The connection usage is predictable and controlled.