Opened 4 years ago
Closed 22 months ago
#31920 closed New feature (fixed)
ASGI/ASYNC SessionMiddleware - SynchronousOnlyOperation exception if request.user is not unwrapped in sync code
Reported by: | Michael Galler | Owned by: | Jon Janzen |
---|---|---|---|
Component: | contrib.sessions | Version: | 3.1 |
Severity: | Normal | Keywords: | async |
Cc: | Andrew Godwin, Carlton Gibson, Jon Janzen, Adam Johnson | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
class AuthenticationMiddleware(MiddlewareMixin): def process_request(self, request): request.user = SimpleLazyObject(lambda: get_user(request))
AuthenticationMiddleware wraps the request.user in a SimpleLazyObject, so the Database Query is called lazy
If i start now a project via django-admin and I add a user and log in and request the following view:
from asgiref.sync import sync_to_async from django.core.exceptions import SynchronousOnlyOperation from django.http import HttpResponse @sync_to_async def get_user_from_request(request): return request.user if bool(request.user) else None async def async_test(request): # CASE 1 try: print(request.user.is_authenticated) except SynchronousOnlyOperation: print('error 1') # CASE 2 try: user = await sync_to_async(request.user) print(user) except SynchronousOnlyOperation: print('error 2') # CASE 3 try: user = await sync_to_async(request.user)() print(user) except SynchronousOnlyOperation: print('error 3') # CASE 4 try: user = await sync_to_async(bool)(request.user) print(request.user) except SynchronousOnlyOperation: print('error 4') # CASE 5 try: user = await get_user_from_request(request) print(user) except SynchronousOnlyOperation: print('error 5') return HttpResponse('Hello, async world!')
Only case 4 and 5 works, i find it a little awkward that i have to do one of this cases.
Change History (22)
comment:1 by , 4 years ago
Cc: | added |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Cleanup/optimization → New feature |
follow-up: 8 comment:2 by , 4 years ago
So the problem here is that attribute access is always synchronous, so lazy loading simply isn't possible. SynchronousOnlyOperation
was added to at least defend against this being done by accident, but if we want request.user
to simply work during async views, it will need to be greedily loaded at middleware execution time.
This is tricky, as the middleware doesn't know if the view below it is sync or async. There's a variety of solutions to this - a setting to always greedily load it, a nice-ish async-compatible function on user/SimpleLazyObject that loads it, trying to escape to the async loop using greenlets - but I'm not sure which one I prefer right now.
follow-up: 4 comment:3 by , 4 years ago
would not something like the following be easier?
class AuthenticationMiddleware(MiddlewareMixin): def process_view(self, request, view_func, args, kwargs): hasattr(request.user, 'foobar')
This unpacks the user after all process_requests calls and before any views are called, also there are no conflicts with other middlewares like RemoteUserMiddleware.
follow-up: 5 comment:4 by , 4 years ago
Right, that's just removing the lazy-loading aspect of it for every request ("greedily loading it"). We can't ship that in default Django as it would be a performance regression from now, essentially removing the lazy loading part permanently. This is why I suggested we maybe do this with a setting.
However, it's a useful workaround if you want to write your own middleware.
Replying to Michael Galler:
would not something like the following be easier?
class AuthenticationMiddleware(MiddlewareMixin): def process_view(self, request, view_func, args, kwargs): hasattr(request.user, 'foobar')This unpacks the user after all process_requests calls and before any views are called, also there are no conflicts with other middlewares like RemoteUserMiddleware.
comment:5 by , 4 years ago
Replying to Andrew Godwin:
Right, that's just removing the lazy-loading aspect of it for every request ("greedily loading it"). We can't ship that in default Django as it would be a performance regression from now, essentially removing the lazy loading part permanently. This is why I suggested we maybe do this with a setting.
However, it's a useful workaround if you want to write your own middleware.
Wouldn't it make sense to have a decorator that automatically executes the function synchronously thread sensitive when it is executed in a async context?
What i came up with after some time is the following based on async_unsafe:
async def run_in_sync(func, current_executor, *args, **kwargs): if current_executor: AsyncToSync.executors.current = current_executor return await sync_to_async(func, thread_sensitive=True)(*args, **kwargs) def ensure_sync_environment(message): def decorator(func): @functools.wraps(func) def inner(*args, **kwargs): try: loop = asyncio.get_running_loop() except RuntimeError: return func(*args, **kwargs) else: current_executor = getattr(AsyncToSync.executors, "current", None) with ThreadPoolExecutor(max_workers=1) as executor: return executor.submit(asyncio.run, run_in_sync(func, current_executor, *args, **kwargs)).result() return inner # If the message is actually a function, then be a no-arguments decorator. if callable(message): func = message message = '' return decorator(func) else: return decorator
If I now decrorate the function get_user with it
@ensure_sync_environment def get_user(request): if not hasattr(request, '_cached_user'): request._cached_user = auth.get_user(request) return request._cached_user
The problem is gone.
This works with asgi and wsgi and does not break any thread sensitivity, even the database connections are correctly cleared.
comment:6 by , 4 years ago
Cc: | added |
---|---|
Keywords: | async added |
comment:7 by , 4 years ago
Unfortunately that only appears to work reliably - you're still running synchronous blocking code inside of a coroutine, so you're holding up the main loop. There's unfortunately literally no way around this from inside attribute access, as you have no way to signal the outer event loop that it can pause the current coroutine and continue with other work.
If we put this into Django, you'd end up with every view that accessed user locking the entire async loop while it did so (which given it usually calls a database, is not going to be super quick).
comment:8 by , 4 years ago
Replying to Andrew Godwin:
This is tricky, as the middleware doesn't know if the view below it is sync or async. There's a variety of solutions to this - a setting to always greedily load it, a nice-ish async-compatible function on user/SimpleLazyObject that loads it, trying to escape to the async loop using greenlets - but I'm not sure which one I prefer right now.
Just checking to see if the current state of Django lent itself to any of the proposed ideas here (setting to greedily load / async-compatible function / greenlets / something else)? I'd be keen to have a crack at this but wanted to see what the best approach might be before doing a deep dive on a badly chosen one. Given the above choices I guess my next questions would be:
- For the greedy load, I would imagine the greedy load would be for all of the middleware? I think
AuthenticationMiddleware
is the only one that lazy loads an object (althoughGZipMiddleware
has a_lazy_re_compile
which I've not wrapped my head around yet to know if that would need to be updated too).
- The user /
SimpleLazyObject
async function sounds like it might potentially be the neatest, but I don't really understand how that could be implemented. Do you have any existing examples that do similar things (or could you elaborate on how I might approach this)?
- I've not encountered "greenlets" before. Are you referencing this library? https://github.com/python-greenlet/greenlet
comment:9 by , 2 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:10 by , 2 years ago
Chatted with lirontb about this at the DCUS2022 sprints, and we believe we can just implement the __await__
method on SimpleLazyObject, roughly like this:
class X: def __await__(self): if isinstance(self.wrapped, coroutine): async def inner(): result = await coroutine self.wrapped = result return result return inner().__await__() else: async def result(): return self.wrapped return result().__await__()
That would mean we could simply await request.user
(or any other SimpleLazyObject) and get the value out. If this works, we could even consider this for the related object descriptors for foreign keys...!
comment:12 by , 2 years ago
Needs documentation: | set |
---|---|
Patch needs improvement: | set |
comment:13 by , 2 years ago
I don’t like the suggestion of adding SimpleLazyObject.__await__
. It’s too “magical” and it makes the type of request.user
complex.
Currently django-stubs types request.user
as User | AnonymousUser
(with a documented technique to use just User
for logged in requests). If we wanted to support the __await__
method in type hints, then the type would become User | AnonymousUser | Awaitable[[], User | AnonymousUser]
. This would require type narrowing everywhere request.user
is used... so, we probably wouldn’t want to do that in django-stubs, but that would make it harder to have a correctly type-checked async Django app.
I propose instead making the middleware add a coroutine request.auser()
that fetches from request.user
, used like:
user = await request.auser() if user.is_authenticated: ...
Yes, it’s a little more code, but it follows Django’s other async API’s and it looks like a normal coroutine call.
comment:14 by , 2 years ago
Cc: | added |
---|
comment:15 by , 23 months ago
I think Adam is likely right here. We should likely have a separate auser()
, similarly to elsewhere.
To await, or not await, 'tis the question
comment:16 by , 23 months ago
Should auser
cache the results of the DB query? AIUI that's how SimpleLazyObject
works right now so it should probably be conistent.
For myself, I know there are certain codepaths in my personal installation that read the equivalent of request.user
several times during a request.
If it doesn't need to be cached, I see this as a fairly trivial and I'd be happy to do it. But I'll wait for guidance on my question before moving forward.
Also, while I'm commenting here: I posted about a larger issue about asyncifying the auth app overall which touches on this ticket if anyone is interested in that: https://forum.djangoproject.com/t/asyncifying-django-contrib-auth-and-signals-and-maybe-sessions/18770
comment:17 by , 22 months ago
Owner: | changed from | to
---|
I went ahead and created a PR with a very simple cache:
comment:18 by , 22 months ago
Needs tests: | set |
---|
comment:19 by , 22 months ago
Cc: | added |
---|
Just to follow-up, there was a mypy proposal to allow `typing.overload` to work with `T | Awaitable[T type examples], but that was closed as
wontfix`.
No, this isn't something mypy is going to do. Separate methods … is an easy solution.
So, I take it that the auser()
approach is going with the wind.
Current PR is looking OK. I just want to investigate if/how sharing the cache between .user
and .auser()
might work — but any other reviews or input welcomed.
comment:20 by , 22 months ago
Needs documentation: | unset |
---|---|
Needs tests: | unset |
Patch needs improvement: | unset |
comment:21 by , 22 months ago
Triage Stage: | Accepted → Ready for checkin |
---|
HI Michael, I’m going to accept this as a New Feature.
I accept totally that you just want to do
request.user
, and that’s clearly where we need to go, but I don’t think that’s entailed by the view and middleware support in 3.1, so this is something we need to think about and address.(If you want to help there, that would be cool.)
Maybe we can treat it as a bug in a new feature, but I worry we have to draw the line somewhere between what was there for 3.1 and what wasn’t.
It might be that we can document this somehow in the meantime? (If that’s too rough, then I would then lean towards making it a Release Blocker.)