Opened 3 years ago

Closed 3 months ago

#33092 closed Bug (worksforme)

PyMemcacheCache backend fails when running as a wsgi application with gevent worker class

Reported by: Martijn van der Blom Owned by: nobody
Component: Core (Cache system) Version: 3.2
Severity: Normal Keywords:
Cc: Andrew Godwin, Nick Pope Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Nick Pope)

For thread-safety when using pymemcache the option 'use_pooling': True can be passed via OPTIONS which will make pymemcache.HashClient use pymemcache.PooledClient instead of pymemcache.Client internally.

Some documentation should be added or improved.


In our application we were using the MemcachedCache backend to connect to a memcached server. Since this backend will be removed in Django 4.1 we thought we'd migrate to the alternative PyMemcacheCache backend as suggested in the Django documentation at: https://docs.djangoproject.com/en/3.2/topics/cache/

After upgrading we encountered several errors when running the application in gunicorn with the gevent worker class.

Summary of errors:

  • gevent._socketcommon.cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet
  • gevent.exceptions.ConcurrentObjectUseError: This socket is already used by another greenlet: <bound method Waiter.switch of <gevent._gevent_c_waiter.Waiter object at 0x7f8e77b00a40>>
  • OSError: [Errno 9] Bad file descriptor

These errors seem to be related to either the Django backend implementation or Pymemcache not handling multi-threading/thread-safety properly.
There is a related bug for the Pymemcache library where a member of that team states that is up the application using Pymemcache to handle thread-safety (https://github.com/pinterest/pymemcache/issues/195#issuecomment-452523524). In this case the Django framework. This comment in Django's BaseMemcachedCache implementation indicates that that was the original intent: https://github.com/django/django/blob/main/django/core/cache/backends/memcached.py#L38
so i think PyMemcacheCache should handle this.

Example project that reproduces the error:
https://github.com/mvanderblom/django-memcached-bugreport

For us, this error prevents us from using the PyMemcacheCache backend and thus from upgrading to Django 4.1 when it gets released.

Attachments (4)

example.log (169.2 KB ) - added by Martijn van der Blom 3 years ago.
Log file that contains the log from a run of the example project. The log file contains the full stacktraces for this error.
use_pooling_true.log (125.6 KB ) - added by Martijn van der Blom 3 years ago.
Logfile of a run with use_pooling set to true
20241018-without-pooling.log (115.8 KB ) - added by Nick Pope 3 months ago.
20241018-with-pooling.log (115.8 KB ) - added by Nick Pope 3 months ago.

Download all attachments as: .zip

Change History (19)

by Martijn van der Blom, 3 years ago

Attachment: example.log added

Log file that contains the log from a run of the example project. The log file contains the full stacktraces for this error.

comment:1 by Mariusz Felisiak, 3 years ago

Cc: Andrew Godwin Nick Pope added

Thanks for the report. This can be also an issue with asgiref.local.Local. Can you confirm that this issue still exists with asgiref==3.4.1 and on the Django's main branch?

comment:2 by Nick Pope, 3 years ago

Description: modified (diff)
Summary: PyMemcacheCache backend fails when running as a wsgi application with gevent worker classAdd note regarding thread-safety when using PyMemcacheCache.
Triage Stage: UnreviewedAccepted
Type: BugCleanup/optimization

So, yes, pymemcache.Client is not thread-safe. We are using pymemcache.HashClient so that we can support connections to multiple servers.

I note that pymemcache.PooledClient is thread-safe according to the documentation. We can pass the use_pooling flag to HashClient. Unfortunately pymemcache's documentation is a little sparse!

If I add 'OPTIONS': {'use_pooling': True} to your CACHES configuration in your reproducer the problem goes away for me.

Would you be prepared to open a PR with a tweak to the documentation? I already mentioned use_pooling in at the end of the cache arguments section, so maybe we just need to amend the sentence before?

by Martijn van der Blom, 3 years ago

Attachment: use_pooling_true.log added

Logfile of a run with use_pooling set to true

comment:3 by Martijn van der Blom, 3 years ago

Thanks for the quick reply! I've tried your suggested solution, but although les frequent (3 vs 31 times), the error still occurs.
I've verified that the config actually causes the HashClient to use a PooledClient and it does.
I've updated the example in the git repo and attached a log to this issue.

Do you guys have any clue as to what's happening here or any hints about what to do next?

comment:4 by Nick Pope, 3 years ago

That's interesting - I'll have to have another try later.

I don't know a great deal about this... Perhaps the issue is that PooledClient is using threading.Lock. See here.

As mentioned here some monkey patching needs to be done with gevent.monkey.patch_all(), but it seems that is already handled by gunicorn for the gevent worker.

Perhaps you can try passing lock_generator in OPTIONS with a different lock type, e.g. see https://www.gevent.org/api/gevent.lock.html

comment:5 by Martijn van der Blom, 3 years ago

I've tried passing gevent.lock.BoundedSemaphore as a lock_generator, but unfortunately that didn't make any difference. As you say, the gevent worker already calls monkey.patch_all() that should cause all usages of threading.[R]Lock to use the gevent variant. I'm still not sure how to proceed. Any further help would be much appreciated.

comment:6 by Martijn van der Blom, 3 years ago

Hi Nick, did you get a chance to look at this issue again? I'm still in the dark as to how to resolve this issue.
For now this prevents me from using the suggested alternative to MemcachedCache and when time comes, it will prevent me (and probably others) to update to Django 4.1.
If you want me to investigate something, please let me know.

comment:7 by Martijn van der Blom, 3 years ago

Summary: Add note regarding thread-safety when using PyMemcacheCache.PyMemcacheCache backend fails when running as a wsgi application with gevent worker class
Type: Cleanup/optimizationBug

I really don't think this issue can be resolved with only changes to the docs. Again, please let me know if you want me to investigate something further

Last edited 3 years ago by Martijn van der Blom (previous) (diff)

comment:8 by Iuri de Silvio, 3 years ago

I'm not sure it is related, but I found CacheMiddleware is not thread-safe and provided a solution in #33252.

comment:9 by Martijn van der Blom, 3 years ago

I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be fixed by #33252

in reply to:  9 comment:10 by Nick Pope, 3 years ago

Replying to Martijn van der Blom:

I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be fixed by #33252

That fix will not be backported to the 3.2.x series according to our backport policy.

It would be handy if you can run your test against the master branch to see if #33252 resolves the issue.

comment:11 by Giacomo Persichini, 3 years ago

Will this be fixed in future releases of Django 4? I'm encountering the exact same issue with Django + mod_wsgi

Django==4.0
asgiref==3.4.1
pymemcache==3.5.0

Setting the "use_pooling" option helped, but didn't get rid of all the errors.

comment:12 by Aaron Mader, 2 years ago

I've set up a simple project to reproduce this error here.

From my testing: This issue still occurs on django==4.0.7, but appears to be resolved on django==4.1.0. Perhaps this is the first version to include the fix from #33252? (I'm not sure how to tell which release(s) contains the fix made for that ticket).

comment:13 by aseem-md, 16 months ago

Another issue I recently ran into is that pymemcache.base.client.PooledClient.gets does not have the same signature as pymemcache.base.client.Client.gets. The latter accepts default as a keyword argument whereas the former does not.

When combined with the recommend configuration ignoring exceptions by setting ignore_exc: True you can run into a situation where calling gets there is an error that gets swallowed despite the value being in the cache and you get nothing back. This took a while to track down and debug.

comment:14 by Iuri de Silvio, 12 months ago

From my testing: This issue still occurs on django==4.0.7, but appears to be resolved on django==4.1.0. Perhaps this is the first version to include the fix from #33252?

The #33252 merge commit is 3ff7b15bb79f2ee5b7af245c55ae14546243bb77. It is tagged from 4.1a1.

by Nick Pope, 3 months ago

by Nick Pope, 3 months ago

Attachment: 20241018-with-pooling.log added

comment:15 by Nick Pope, 3 months ago

Resolution: worksforme
Status: newclosed

I've just tested this using the following:

python==3.9.20
memcached==1.6.31

django==4.2.16
gevent==24.10.2
gunicorn==23.0.0
pymemcache==4.0.0

I was unable to reproduce the original problem using the provided reproducer on multiple runs, with or without {"use_pooling": True}, so it seems to have been resolved.


Replying to aseem-md:

Another issue I recently ran into is that pymemcache.base.client.PooledClient.gets does not have the same signature as pymemcache.base.client.Client.gets. The latter accepts default as a keyword argument whereas the former does not.

When combined with the recommend configuration ignoring exceptions by setting ignore_exc: True you can run into a situation where calling gets there is an error that gets swallowed despite the value being in the cache and you get nothing back. This took a while to track down and debug.

This is unrelated to the original issue. However, I'll note that there have been a bunch of improvements in pymemcache==4.0.0 such that the default is returned if from .get() if ignore_exc is True and no server is available. See the release notes for details.

Note: See TracTickets for help on using tickets.
Back to Top