#18135 closed Bug (fixed)
Sleeping Database Connections on Startup with MySQL
Reported by: | Michael Newman | Owned by: | Aymeric Augustin |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.4 |
Severity: | Normal | Keywords: | MySQL |
Cc: | Triage Stage: | Design decision needed | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I am digging into this a little further, but right now, starting a new project starting it up with runserver, or gunicorn, creates a sleeping MySQL connection.
./manage.py startproject testproject cd testproject
Change the settings for a local MySQL database.
./manage.py runserver
On another command prompt
./manage.py dbshell
SHOW PROCESSLIST;
Observe the 2 sleeping connections (on for the shell and one for runserver).
Removing all the default apps seems to stop the connection. Adding contenttypes creates the connection. Adding staticfiles does not. Due to this, I believe that it is some code in contenttypes that is causing this connection.
Attachments (1)
Change History (32)
comment:1 by , 13 years ago
follow-up: 23 comment:2 by , 13 years ago
Practical effect is that a database only has so many connections available. Each instance fires up a connection, so if a database has 50 available connections and you have 10 servers with 5 instances each, you all of a sudden are out of database connections. Sleeping connections are a really bad thing.
follow-up: 4 comment:3 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
I did a little digging and the reason for the connection is that model validation needs to know server version, and checking server version opens a connection which is then not closed. So, this code in mysql/validation.py:
try: db_version = self.connection.get_server_version() text_version = '.'.join([str(n) for n in db_version[:3]]) except OperationalError: db_version = None text_version = ''
seems to need some way to close the connection. I am not sure if it is safe to just have a "finally: close connection" in there. Maybe it should be checked that if there is no existing connection when entering this function, but there is one when leaving it, then close the created connection.
EDIT: reading further it seems the only usage for the version in this part of the code is checking for pre-5.0.3 versions, and as support for those versions is going to be dropped, it seems it might be possible that 1.5 does not need any other fix here than removing the get_server_version() call altogether.
comment:4 by , 13 years ago
Replying to akaariai:
EDIT: reading further it seems the only usage for the version in this part of the code is checking for pre-5.0.3 versions, and as support for those versions is going to be dropped, it seems it might be possible that 1.5 does not need any other fix here than removing the get_server_version() call altogether.
See #18116.
comment:5 by , 13 years ago
So, I guess now the question is should this be back-patched? I am not the right person to answer this as I am not too experienced with Django's back-patching rules.
Another question which arises is how to create a regression test for this? This is the kind of bug which could easily resurface later on.
comment:6 by , 13 years ago
In a self serving way, I do think that this should be back-ported. This is kind of a shocking problem that people use ORMs to avoid in their own code. I was lucky, I have had sleeping connections in the past that have brought down my DB, so I monitor for sleeping connections to the database. People on smaller machines with databases that only allow a limited number of connections will reach their DB's limit very quickly as their sites scale.
Also, fixing this problem is difficult. The only way to open those connections back up is to go into the database and kill those processes. This takes quite a bit of knowledge of the DB, which is another reason that people use ORMs.
comment:7 by , 13 years ago
The backport policy is clear:
Patches applied to the trunk will also be applied to the last minor release, to be released as the next micro release, when they fix critical problems:
- Security issues.
- Data-loss bugs.
- Crashing bugs.
- Major functionality bugs in newly-introduced features.
This bug doesn't fall into any of these categories.
If you want to backport the patch, you can. But you don't have to.
follow-up: 9 comment:8 by , 13 years ago
I would argue that the bug leads to crashing, quite reliably. When a database runs out of connections, Django doesn't run. Thanks for reminding me of those policies, my first note should have mentioned that.
I will backport it myself. I am more worried about new people coming to Django. This took me a long time to figure out and I knew what I was looking for.
comment:9 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Replying to Mnewman:
I would argue that the bug leads to crashing, quite reliably. When a database runs out of connections, Django doesn't run. Thanks for reminding me of those policies, my first note should have mentioned that.
I will backport it myself. I am more worried about new people coming to Django. This took me a long time to figure out and I knew what I was looking for.
Fixed in [17921].
Given this reasoning I now think it could be right to fix this in previous releases. Will try to port fixes to 1.4.X and 1.3.X and attach patches here for review and testing.
by , 13 years ago
Attachment: | 18135-sleeping-mysql-connection-1.4.X.diff added |
---|
Patch for this issue in 1.4.X branch
comment:10 by , 13 years ago
I've attached a fix for this issue in the 1.4.X and 1.3.X branches (in the latter it applies with fuzz 1 (offset -73 lines)
).
Testing report and feedback welcome.
comment:11 by , 13 years ago
My quick test indicates that the extra connection is gone both in 1.4.X and 1.3.X. So, looks good to me. Only thing I wonder is if there should be try-finally to ensure the new connection is closed. But I guess if the finally branch is ever hit the open connection is not the biggest of your worries...
comment:12 by , 13 years ago
Did this ticket get closed by accident? I don't see it patched in the repo.
I can confirm that this patch closes the lingering connection.
I opened pull requests with this patch for the master, 1.4.x and 1.3.x banches on GitHub:
comment:13 by , 13 years ago
Has patch: | set |
---|---|
Resolution: | fixed |
Status: | closed → reopened |
Triage Stage: | Accepted → Ready for checkin |
Yes, it looks this was only partly fixed in [17921] - that is the server version checking still can leave open connections, it is just not called in the same place as before. It would probably be a good idea to apply all of the above pull requests.
comment:14 by , 13 years ago
Verified here also that the issue is present both in 1.3 and 1.4 and the patch fixes the issues.
Mnewman: If you could add [1.4.x] Fixed... and [1.3.x] Fixed... prefixes to the commit messages I think we could directly merge the pull requests to Django for all versions. Maybe you should add a paragraph of "Made sure server version checking does not leave open connection behind. Backport of <COMMIT_HASH of the master commit>".
comment:15 by , 13 years ago
Quick crash course in git rebase (which is awesome) and I think I have done it. Please double check that I did it right, I am still figuring out Git. Thanks
comment:16 by , 12 years ago
I think the problem is gone in master, but this still needs to be fixed in 1.4 and 1.3. Master should get this fix, too, even if the problem isn't present in model validation.
comment:17 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
comment:20 by , 12 years ago
Why has this patch been applied to the 1.3.X branch? The *only* patches that should be applied to 1.3 are those that will result in a security release.
comment:21 by , 12 years ago
I missed the security-only status of 1.3.x. I will revert the patch later on today from 1.3.
comment:23 by , 12 years ago
Replying to Mnewman:
Practical effect is that a database only has so many connections available. Each instance fires up a connection, so if a database has 50 available connections and you have 10 servers with 5 instances each, you all of a sudden are out of database connections. Sleeping connections are a really bad thing.
I stumbled upon this ticket while working on persistent database connections and I don't understand this argument.
If you have 50 workers, and they're all serving requests simultaneously, then you're going to have 50 database connections in parallel anyway.
If your database is only allows, say, 20 connections at a time, then you can't have more than 20 workers active, and there's no point in having 50 of them!
I fail to see how sleeping connections are a bad thing.
Currently, the connection created at startup will be closed after the first query. In the near future, it will stay open. (See the discussion on django-developers.)
I'm going to revert this change.
comment:24 by , 12 years ago
Resolution: | fixed |
---|---|
Status: | closed → new |
comment:25 by , 12 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Owner: | changed from | to
Status: | new → assigned |
Triage Stage: | Ready for checkin → Design decision needed |
follow-up: 27 comment:26 by , 12 years ago
This is not scalable. It makes sense for small applications to have a database query open for every thread that you have running the Django app, but what happens as your application scales? You can leverage agressive caching techniques or NoSQL to make the load on the databases less and less. Some applications that I run only need connection to the database when a write operation is performed.
So what happens on sites, like I have, that run millions of concurrents? I need hundreds of thousands threads to properly scale. I cannot afford a relational database that requires that many open "sleeping" connections.
Persistant connections are a fine thing for small sites that don't autoscale along with swaths of users that are looking to improve performance by the couple of hundredths of second that it takes to connect to a local database, but when it comes to a large site, the overhead of a connection on each Django instance is really detrimental.
This needs to at least be an option. The difference in price between a database that has 50 available connections to one that requires 500 is extreme. Applications can cache queries to the database, they can't magically increase the number of available connections to the database.
I will weigh in on the discussion on django-dev, can you point me in the right direction? I have not seen it pass through my inbox.
comment:27 by , 12 years ago
Replying to Mnewman:
So what happens on sites, like I have, that run millions of concurrents? I need hundreds of thousands threads to properly scale. I cannot afford a relational database that requires that many open "sleeping" connections.
Exaggeration doesn't help your credibility. No one's running hundred of thousands of Django workers connected to a MySQL database without a pooler. And the topic of this ticket isn't persistent connections. At the moment, they're just a proposal, open for discussion on the mailing list.
Back to the topic.
I missed a use case in my previous comment: it's possible to have a MySQL database that's only occasionally hit by workers, especially if it isn't the primary database. Such a database could support fewer simultaneous connections than the total amount of workers.
In the current codebase, the only place that uses the MySQL server version is sequence_reset_by_name_sql
, and this function is only used by the tests. It appears that mysql_version
was called at import time in previous versions of Django, but that isn't true any longer.
One could argue for keeping this code in case future versions of Django introduce MySQL-version-dependent code that's evaluated at compile time, but in general, unnecessary code hurts maintainability.
comment:29 by , 12 years ago
Today, I learned that some people indeed run hundreds of thousands of Django threads... My comment above is wrong!
comment:30 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I see this also with 1.3.X.
Also, the contenttypes app isn't the only one that causes this, activating any of the DB-using apps (e.g. auth) makes the connection appear.
What is the practical effect of this issue? Is it an issue at all?