Opened 7 weeks ago

Last modified 5 weeks ago

#35801 new Bug

Signals are dispatched to receivers associated with dead senders

Reported by: bobince Owned by:
Component: Core (Other) Version: 5.1
Severity: Normal Keywords:
Cc: bobince, Simon Charette Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

django.dispatch.Signal.receivers is keyed on the id() of the receiver and sender (modulo a thing to make bound methods consistent).

If a sender connected to a signal is destroyed, and a new object is allocated with the same id() and then also connected as a sender to the signal, when the signal fires it will match the original sender and call the receiver that was connected for that sender (as well as the new one).

Signal works around the problem of re-used ids for the receiver by having a weakref to the receiver (since it needs a reference anyway to be able to call it), but it doesn't for sender.

In my case this resulted in post-migrate hooks for the wrong apps being occasionally called in migration tests that mutated INSTALLED_APPS causing AppConfig senders to be re-created, but it can be more simply provoked with:

from django.dispatch import Signal

receivers_called = []
def create_receiver(i):
    def receiver(**kwargs):
        receivers_called.append(i)
    return receiver

n = 100
receivers = [create_receiver(i) for i in range(n)]

signal = Signal()
for i in range(n):
    sender = object()
    signal.connect(receivers[i], sender=sender)
    receivers_called = []
    _ = signal.send(sender=sender)
    # only the receiver for the new sender object should be called
    assert receivers_called == [i], f'Expected [{i}], called {receivers_called}'

(how readily this explodes may depend on local memory allocation differences, but it dies pretty consistently for me on iteratio
n 3.)

Perhaps Signal should be keeping a weakref to each sender as well as receiver, and detecting when it has gone None? Does this need to participate in the _dead_receivers mechanism?

Change History (5)

comment:1 by Natalia Bidart, 6 weeks ago

Resolution: needsinfo
Status: newclosed

Hello bobince, thank you for taking the time to create this ticket report.

I have investigated and I have used the code that you shared to reproduce. I applied a small change because I don't think it's correct to clear receivers_called = [] like that inside the loop. I also added some kwargs when sending the Signal to ensure we are debugging the right thing. What I find more "correct" is the following, which unless I'm missing something should be equivalent to your proposal. Sadly, this does not fail for me at any iteration (I even increased n to be 1000):

from collections import defaultdict

from django.dispatch import Signal


signal_calls = defaultdict(list)


def create_receiver(i):
    def receiver(sender, **kwargs):
        signal_calls[i].append((sender, kwargs))

    return receiver


n = 1000
receivers = [create_receiver(i) for i in range(n)]
signal = Signal()
for i in range(n):
    sender = object()
    signal.connect(receivers[i], sender=sender)
    _ = signal.send(sender=sender, i=i)
    # only the receiver for the new sender object should be called
    expected = [(sender, {"signal": signal, "i": i})]
    assert signal_calls[i] == expected, f"{expected=}, got {signal_calls=}"

Do you think you could, instead of a script, provide a failing test case for the Django test suite to illustrate the issue you are describing?

comment:2 by bobince, 6 weeks ago

I don't think it's correct to clear receivers_called = [] like that inside the loop

It should be fine as long as the loop is in the same scope as the definition of receivers_called. But yes, it might be clearer to reset the list without rebinding it by doing del receivers_called[:].

Sadly, this does not fail for me at any iteration

That's because the new example stores a reference to sender in signal_calls, which prevents the sender being destroyed and its id re-used.

If you store id(sender) or str(sender) instead you'll get the error again.

in reply to:  2 comment:3 by Natalia Bidart, 6 weeks ago

Cc: Simon Charette added
Resolution: needsinfo
Status: closednew
Triage Stage: UnreviewedAccepted

Replying to bobince:

I don't think it's correct to clear receivers_called = [] like that inside the loop

It should be fine as long as the loop is in the same scope as the definition of receivers_called. But yes, it might be clearer to reset the list without rebinding it by doing del receivers_called[:].

Right, I would suggest receivers_called.clear()

Sadly, this does not fail for me at any iteration

That's because the new example stores a reference to sender in signal_calls, which prevents the sender being destroyed and its id re-used.

If you store id(sender) or str(sender) instead you'll get the error again.

Great catch, that makes perfect sense. I tweaked my script and was able to confirm your report. I'll accept this ticket on that basis, I couldn't find a previous report about this.

Would you like to prepare a patch?

(Adding Simon as cc since he might have some good ideas for a robust fix.)

comment:4 by bobince, 6 weeks ago

Right, I would suggest receivers_called.clear()

(Ha ha, what is this crazy new-fangled API? Oh... Python 3.3 you say. well. huh!)

Would you like to prepare a patch?

Here's a candidate fix: https://github.com/bobince/django/compare/main..signals_from_dead_senders

It raises some questions about what to do when ‘sender’ isn't weakreffable — which admittedly is uncommon, and would already break if ‘use_caching’, but there's no doc that rules it out as far as I can see. In this commit it quietly falls back to making a strong reference, which would _probably_ be better than the buggy behaviour, but is still a change in behaviour.

So would certainly appreciate thoughts from someone familiar with the background of dispatch before PR'ing.

comment:5 by Simon Charette, 5 weeks ago

In my case this resulted in post-migrate hooks for the wrong apps being occasionally called in migration tests that mutated INSTALLED_APPS causing AppConfig senders to be re-created

Could you provide a minimal project demonstrating the issue in a non-synthetic manner? Before we commit to a solution (which yours appear to be correct from a quick look) we'd want to establish under which organic conditions the problem can be reproduced.

The Django test suite has many instances of @override_settings(INSTALLED_APPS) and doesn't run into this issue and I'm not sure it's fair to expect Django to behave properly under memory pressure circumstances that make id and therefore hash (as the default implementation of __hash__ is id() based) to return the same value for two distinct objects.

I think a possible alternative solution could be to identify under which organic circumstances this problem is likely to happen, document that signal connected with a sender must be disconnected prior to the object being garbage collected, and adjust the infringing cases within the framework itself.

Note: See TracTickets for help on using tickets.
Back to Top