Opened 11 years ago
Last modified 18 months ago
#20752 new Bug
Error signals are not reliable, especially when dealing with database errors
Reported by: | Owned by: | ||
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | signals errors databaseError |
Cc: | tal@…, preston@…, schacki, Ülgen Sarıkavak | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
In core.handlers.base, unhandled exceptions are processed as such:
except: # Handle everything else, including SuspiciousOperation, etc. # Get the exception info now, in case another exception is thrown later. signals.got_request_exception.send(sender=self.__class__, request=request) response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
Which delivers the error signal to the various handlers (database transaction resets, things like Sentry etc).
However, the code that dispatches signals aborts the dispatch if any of the handlers fail (to try..except block):
for receiver in self._live_receivers(_make_id(sender)): response = receiver(signal=self, sender=sender, **named) responses.append((receiver, response)) return responses
This is perfectly reasonable in most cases, but is problematic when dealing with top-level error handling, as this prevents the unhandled error from reaching handlers that just want to report it.
This is very noticeable in cases where "something bad" happens to the database connection, as the first handler is always the database rollback handler, which only catches DatabaseError, which excludes a lot of errors that can and do crop up in production
def _rollback_on_exception(**kwargs): from django.db import transaction for conn in connections: try: transaction.rollback_unless_managed(using=conn) except DatabaseError: pass
I think that it will be better to have the error signal dispatcher defer except raising until after all the handlers were called (or just swallow them all).
Alternatively, a different signal for error reporting is possible, but I think that's a little confusing.
Thoughts?
Attachments (2)
Change History (21)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
Has patch: | set |
---|
comment:3 by , 11 years ago
Easy pickings: | set |
---|---|
Needs tests: | set |
Patch needs improvement: | set |
send_robust()
implies (correctly) that errors are expected. But the current patch proposes to just silence them, and that seems wrong to me.
Also, if you see a real problem, can you provide a test that exemplifies it (and can be used to verify that the solution is correct)?
comment:4 by , 11 years ago
I was on the fence on send_robust(): I generally dislike mechanisms that just swallow exceptions, because it makes bugs slip through the cracks. On the other hand, some signal handlers (db rollback) are actually extremely likely to raise an exception if something bad happened to the underlying connection, which is what we were seeing.
You can simulate something like this by closing the underlying connections. You can simulate it by doing something like:
from django.db import connection connection.connection.close()
At a view or a middleware.
Some alternatives to send_robust:
1) implement something that will aggregate the exceptions and just raise a "SignalHandlersException" container at the end. No exceptions are lost, but I question the overall value of it in this scenario.
2) Print out exceptions to stderr, which is what django actually ends up doing when the error signal handler go rogue.
I'm leaning towards option 2.
comment:5 by , 11 years ago
I like option 2 as well, but to take it just one step further, what do you think of doing a logger.error()
call with the stacktrace? Seems to me that this gives the most control and flexibility to developers.
comment:6 by , 11 years ago
shai: while I agree with you on silencing errors being wrong, it does seem like a problem with send_robust, as it's actually pretty sensible to rely on the existing "robust" dispatch.
I can also have send_robust emit a call to logger.error (as proposed by Ian) as a part of this patch, although it might be a little out of scope :)
What do you think? Right now we're working around this issue by doing some setattr magic:
def make_unhandled_signal_safe(): from django.core import signals as core_signals setattr(core_signals.got_request_exception, 'send', core_signals.got_request_exception.send_robust)
and we would very much like to get rid of it :)
comment:7 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:8 by , 11 years ago
Please find my patch attached. I have changed the core handler in a way such that:
- the initial error is logged as expected, so the root cause of the trouble is visible and not covered by any other exceptions
- all connected handlers are completely executed as expected, so no unexpected side-effects from one handler to the other
- any exceptions from the handlers are logged to the error log, so no errors pass silently.
I have also tried to add unit tests. Unfortunately, I have not managed, to really read the logging output. Any help is appreciated.
by , 11 years ago
Attachment: | 20752_patch.diff added |
---|
comment:9 by , 11 years ago
I have just sent this pull request: https://github.com/django/django/pull/1778
comment:10 by , 11 years ago
Component: | Core (Other) → HTTP handling |
---|---|
Easy pickings: | unset |
Triage Stage: | Unreviewed → Accepted |
Version: | 1.5 → master |
I agree there is some room to improve the robustness of the handler here in the case of "something bad happened".
Signal receiver's shouldn't be doing anything fancy when getting the "got_request_exception" signal (basically heeding the warning in the docstring of "handle_uncaught_exception", but the handler itself should be robust against any poorly written signal handler.
Another related place the handler could be more robust is in "handle_uncaught_exception" itself. While hopefully rare, someone could have attempted to install some fancy 500 handler which itself raises some unforseen exceptoin - so maybe a try except is needed there as well, with some logging, and then returning a minimal 500 ala:
return http.HttpResponseServerError('<h1>Server Error (500)</h1>', content_type='text/html')
Also it is not clear to me why the more robust sending of the signal step shouldn't just be rolled into handle_uncaught_exception. Those two steps occur together every single time in the handler, so I'm not sure why they need to be separate steps - sending the signal could easily be considered as "part of" handling the uncaught exception.
comment:11 by , 11 years ago
Cc: | added |
---|
comment:12 by , 11 years ago
I have changed the code as follows:
- Refactored the got_request_exception handling into handle_uncaught_exception. This makes to code much cleaner and the process more transparent
- The inital exception, that triggered the handle_uncaught_exception is the main focus. I.e. this exception is logged first, this exception is raised if settings.DEBUG_PROPAGATE_EXCEPTIONS is True. This makes debugging and analyzing the root cause of all the trouble much easier (which was one of the reasons, why this ticket was raised), nevertheless all follow-up errors are still logged.
- Implemented proposal from ptone, to make the resolve500 more robust.
- Commented the code in handle_uncaugh_exception to make the process and ideas behind it more transparent.
comment:13 by , 11 years ago
Cc: | added |
---|
comment:14 by , 11 years ago
Hi all, may I ask on the status here. Anything I could/should? Thanks (was not aware that I was not logged in while writing my previous message)
comment:15 by , 11 years ago
schacki - there aren't any steps to do you haven't already, thank you for the patch. I myself have not been able to make time to give it another review, I hope to be able to late this week, or early next. You can always try to recruit any member of the Django community to review the patch (not just core members), more eyes is always better. Thank you for your patience.
comment:16 by , 11 years ago
The PR looks good to me. I am not marking it as RFC though, because it would be better if core-dev does that.
comment:18 by , 20 months ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:19 by , 18 months ago
Cc: | added |
---|
Oh, just saw that the defer thing is implemented in send_robust(). Maybe just switch it from send() to send_robust() then?