Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#31997 closed Uncategorized (invalid)

Regression in Django 3 related to ORM in async tasks (OperationalError: database is locked)

Reported by: Andrey Zelenchuk Owned by: nobody
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords: async
Cc: Andrew Godwin, Carlton Gibson Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Andrey Zelenchuk)

https://docs.djangoproject.com/en/3.1/topics/async/#async-safety

Certain key parts of Django are not able to operate safely in an async environment <...>. The ORM is the main example <...>.

This is not accurate. Actually, the ORM is not able to operate safely in some asynchronous (async) environments (for example, async views), but can do it (and perfectly did it in Django 2) in some other async environments (for example, see the steps below).

Starting from Django 3.0 (see commit a415ce7), Django ORM prevents reusing database (DB) connections between async tasks. This breaks some use cases that worked before.

Steps to reproduce

The full demo project demonstrating this bug: https://github.com/AndreyMZ/django-ticket-31997

It is based on the Polls application from the tutorial. The meaningful part is polls/management/commands/demo.py:

import asyncio
import os

import django
from django.core.management.base import BaseCommand
from django.db import transaction
from django.utils import timezone

from polls.models import Question, Choice

N = 100
M = 10


class Command(BaseCommand):
    def handle(self, *args, **options):
        if django.VERSION >= (3, 0):
            # https://docs.djangoproject.com/en/3.1/topics/async/#envvar-DJANGO_ALLOW_ASYNC_UNSAFE
            os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"
        
        asyncio.run(handle_async())


async def handle_async():
    with transaction.atomic():
        Question.objects.all().delete()

        async def _process_task(i):
            await asyncio.sleep(0) # Real application would make e.g. an async HTTP request here.
            with transaction.atomic():
                question = Question.objects.create(question_text=f"demo question {i}", pub_date=timezone.now())
                for j in range(N // M - 1):
                    Choice.objects.create(question=question, choice_text=f"demo choice {i}.{j}")

        tasks = [_process_task(i) for i in range(M)]
        await asyncio.gather(*tasks)

To reproduce the bug, checkout the project and run the following:

python manage.py migrate
docker-compose build
docker-compose up django-2
docker-compose up django-3

Actual result

C:\mysite>docker-compose up django-2
Starting mysite_django-2_1 ... done
Attaching to mysite_django-2_1
mysite_django-2_1 exited with code 0

C:\mysite>docker-compose up django-3
Starting mysite_django-3_1 ... done
Attaching to mysite_django-3_1
django-3_1  | Traceback (most recent call last):
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
django-3_1  |     return self.cursor.execute(sql, params)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
django-3_1  |     return Database.Cursor.execute(self, query, params)
django-3_1  | sqlite3.OperationalError: database is locked
django-3_1  |
django-3_1  | The above exception was the direct cause of the following exception:
django-3_1  |
django-3_1  | Traceback (most recent call last):
django-3_1  |   File "manage.py", line 21, in <module>
django-3_1  |     main()
django-3_1  |   File "manage.py", line 17, in main
django-3_1  |     execute_from_command_line(sys.argv)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
django-3_1  |     utility.execute()
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
django-3_1  |     self.fetch_command(subcommand).run_from_argv(self.argv)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/core/management/base.py", line 330, in run_from_argv
django-3_1  |     self.execute(*args, **cmd_options)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/core/management/base.py", line 371, in execute
django-3_1  |     output = self.handle(*args, **options)
django-3_1  |   File "/workspace/polls/management/commands/demo.py", line 18, in handle
django-3_1  |     asyncio.run(handle_async())
django-3_1  |   File "/usr/local/lib/python3.7/asyncio/runners.py", line 43, in run
django-3_1  |     return loop.run_until_complete(main)
django-3_1  |   File "/usr/local/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
django-3_1  |     return future.result()
django-3_1  |   File "/workspace/polls/management/commands/demo.py", line 32, in handle_async
django-3_1  |     await asyncio.gather(*tasks)
django-3_1  |   File "/workspace/polls/management/commands/demo.py", line 27, in _process_task
django-3_1  |     question = Question.objects.create(question_text=f"demo question {i}", pub_date=timezone.now())
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/manager.py", line 85, in manager_method
django-3_1  |     return getattr(self.get_queryset(), name)(*args, **kwargs)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/query.py", line 447, in create
django-3_1  |     obj.save(force_insert=True, using=self.db)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/base.py", line 751, in save
django-3_1  |     force_update=force_update, update_fields=update_fields)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/base.py", line 789, in save_base
django-3_1  |     force_update, using, update_fields,
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/base.py", line 892, in _save_table
django-3_1  |     results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/base.py", line 932, in _do_insert
django-3_1  |     using=using, raw=raw,
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/manager.py", line 85, in manager_method
django-3_1  |     return getattr(self.get_queryset(), name)(*args, **kwargs)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/query.py", line 1249, in _insert
django-3_1  |     return query.get_compiler(using=using).execute_sql(returning_fields)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1395, in execute_sql
django-3_1  |     cursor.execute(sql, params)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 98, in execute
django-3_1  |     return super().execute(sql, params)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 66, in execute
django-3_1  |     return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
django-3_1  |     return executor(sql, params, many, context)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
django-3_1  |     return self.cursor.execute(sql, params)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
django-3_1  |     raise dj_exc_value.with_traceback(traceback) from exc_value
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
django-3_1  |     return self.cursor.execute(sql, params)
django-3_1  |   File "/usr/local/lib/python3.7/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
django-3_1  |     return Database.Cursor.execute(self, query, params)
django-3_1  | django.db.utils.OperationalError: database is locked
mysite_django-3_1 exited with code 1

Expected result

C:\mysite>docker-compose up django-2
Starting mysite_django-2_1 ... done
Attaching to mysite_django-2_1
mysite_django-2_1 exited with code 0

C:\mysite>docker-compose up django-3
Starting mysite_django-3_1 ... done
Attaching to mysite_django-3_1
mysite_django-3_1 exited with code 0

Possible solution

We should invent and implement some more sophisticated mechanism to prevent reusing DB connections between async views. Such mechanism must not prevent reusing DB connections between other async tasks.

Workaround (limited)

Variant 1

Patch django/db/utils.py:

  • django\db\utils.py

     
     1import os
    12import pkgutil
     3import threading
    24from importlib import import_module
    35from pathlib import Path
    46
     
    145147        # their code from async contexts, but this will give those contexts
    146148        # separate connections in case it's needed as well. There's no cleanup
    147149        # after async contexts, though, so we don't allow that if we can help it.
    148         self._connections = Local(thread_critical=True)
     150        if os.environ.get('DJANGO_ALLOW_ASYNC_REUSE_DB_CONNECTIONS'):
     151            self._connections = threading.local()
     152        else:
     153            self._connections = Local(thread_critical=True)
    149154
    150155    @cached_property
    151156    def databases(self):

Variant 2

Monkey-patch django.db.connections (e.g. in mysite/__init__.py):

import os
import threading

import django.db
import django.db.utils


class ConnectionHandler(django.db.utils.ConnectionHandler):
    def __init__(self, databases=None):
        self._databases = databases
        if os.environ.get('DJANGO_ALLOW_ASYNC_REUSE_DB_CONNECTIONS'):
            self._connections = threading.local()
        else:
            self._connections = django.db.utils.Local(thread_critical=True)

def django_db_connections_exist() -> bool:
    # noinspection PyProtectedMember
    connections = django.db.connections._connections
    databases = django.db.connections.databases
    return any(getattr(connections, alias, None) for alias in databases)

def monkey_patch_django_db_connections():
    assert not django_db_connections_exist()
    django.db.connections = ConnectionHandler()


monkey_patch_django_db_connections()

Warning

Do not use this workaround (do not set the DJANGO_ALLOW_ASYNC_REUSE_DB_CONNECTIONS environment variable) for processes which use async views!

Change History (6)

comment:1 by Mariusz Felisiak, 4 years ago

Cc: Andrew Godwin Carlton Gibson added

comment:2 by Andrey Zelenchuk, 4 years ago

Description: modified (diff)

comment:3 by Andrew Godwin, 4 years ago

I'm not quite sure what the bug is here - Django doesn't correctly allow sharing of connections between async tasks due to the threadlocal implementation, and so we disabled it for safety reasons; you're saying that if you turn the safety mechanism off, then it breaks? Or is the regression here that Django 2.0 used to be simple enough to let you do this, even though it has potential for data corruption?

Specifically, you can't run db.connections as a normal Threading.local, because then if you start a transaction in one async task, suspend it, and another async task comes along, they'll both be running in the same transaction even though they're technically different contexts, as they're all on the same thread. Constructing a test for this is hard, but if you try putting some long, actual sleeps in your async views when you're doing this, and run several in parallel, you should see them start to break serialization with each other - which is the sort of subtle bug that we immediately put a guardrail around so people don't only run into it in production.

comment:4 by Mariusz Felisiak, 4 years ago

Resolution: invalid
Status: newclosed

Marking as invalid based on Andrew's comment.

comment:5 by Andrey Zelenchuk, 4 years ago

you can't run db.connections as a normal Threading.local

I do not suggest doing this. As I wrote, this workaround is limited. A final solution should be different (I do not know it).

if you start a transaction in one async task, suspend it, and another async task comes along, they'll both be running in the same transaction

Yes, this is yet another case when it is not possible to share a DB connection. As I wrote, I understand that the ORM is not able to operate safely in some use cases.

But you did not consider the use case which I showed in the "steps to reproduce" section. I believe that it has no potential for data corruption. Don't you agree?

I think I understand the changes introduced in Django 3 related to sharing DB connections between async tasks. Django 2 allowed both safe and unsafe use cases. Now Django 3 allows neither unsafe nor safe use cases. In a perfect world, I want Django 3 to always allow all safe use cases and deny all unsafe ones. In a not so perfect world, I want Django 3 to deny all unsafe use cases (even if it denies some safe ones) by default, and to allow all safe use cases (even if it allows some unsafe ones) optionally (not by default, for those who understand what he/she does). This is a fair wish, isn't it?

comment:6 by Andrew Godwin, 4 years ago

But you did not consider the use case which I showed in the "steps to reproduce" section. I believe that it has no potential for data corruption. Don't you agree?

(edit, after reading again) It doesn't have as high a potential for data corruption, but it's also not really running asynchronously - because create is a blocking synchronous call, it'll hold up the event loop every time it runs, and you've mostly written a fancier for loop. I don't _think_ it would cause corruption, but I'm also not sure if that inner transaction is doing savepoints or pure transactions - if it's doing savepoints because of the outer transaction, then it's absolutely going to run into itself.

In a perfect world, I want Django 3 to always allow all safe use cases and deny all unsafe ones. In a not so perfect world, I want Django 3 to deny all unsafe use cases (even if it denies some safe ones) by default, and to allow all safe use cases (even if it allows some unsafe ones) optionally (not by default, for those who understand what he/she does). This is a fair wish, isn't it?

As do I - trust me, I would have made a slightly less high wall around this if there was a way to have made it safe without an ORM refactor. As it stands, this is why we ship DJANGO_ALLOW_ASYNC_UNSAFE - if you absolutely know what you're doing, you can turn off the safety and go back to the way it was, but as you are hopefully seeing, it is very, very hard to do this correctly.

I believe that if you write async code where there are:

  • No awaits inside any transactions
  • No per-view transactions (autocommit is on)
  • No connection-level parameters being set

then it might be possible, but there's no way we could enforce that given the current state of the Django ORM.

Last edited 4 years ago by Andrew Godwin (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top