Restore the default django.utils.log.AdminEmailHandler when
ERROR_REPORTING is enabled. Those with more sophisticated needs can
turn it off and use Sentry or a Sentry-compatible system.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Running notify_server_error directly from the logging handler can lead
to database queries running in a random context. Among the many
potential problems that could cause, one actual problem is a
SynchronousOnlyOperation exception when running in an asyncio event
loop.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The return type of `ugettext_lazy('...')` (aliased as `_`) is a
promise, which is only forced into a string when it is dealt with in
string context. This `django.utils.functional.lazy.__proxy__` object
is not entirely transparent, however -- it cannot be serialized by
`orjson`, and `isinstance(x, str) == False`, which can lead to
surprising action-at-a-distance.
In the two places which will serialize the role value (either into
Zulip's own error reporting queue, or Sentry's), force the return
value. Failure to do this results in errors being dropped
mostly-silently, as they cannot be serialized and enqueued by the
error reporter logger, which has no recourse but to just log a
warning; see previous commit.
When we do this forcing, explicitly override the language to be the
realm default. Failure to provide this override would translate the
role into the role in the language of the _request_, yielding varying
results.
AdminNotifyHandler is used to notify admins of errors; it is a
critical piece of logic. Failures in reporting errors will compound,
since its `except Exception` clauses cannot generate logging at the
`error` or `exception` level, as that would be recursive. It must
settle for logging at the `warning` level, and hope that admins are
vigilant to the logging there.
Increase the chances of being notified of failures in this logger, by
bubbling up those exceptions to Sentry, which is an orthogonal
reporting stack.
To make it easier to check if there is user information to be used
in the error report emails, we create a user object inside report.
Now, to check if we have the user's full name, email, etc, we just
need to do report['user']['user_full_name'] rather than check
each information one by one, because if the value of one key in
the report is different than None, all the others will be as well.
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We no longer have intermediate constants of
`git_described` and `zulip_version_const`.
Instead, we make a `deployment_data` dictionary
that is grep-friendly, and we just let
`deployment_repr` do simple formatting
without translating string constants.
This is pretty easy to test:
- set DEBUG_ERROR_REPORTING = True
- modify some code to throw an exception
- see error output in #errors
- use "/emails" with text-only option to view
errors
This code was bitrotted--we no longer have a file
called `version`.
The info that was probably reported when that feature
was originally written probably lives now
in `zulip-git-version`, although I didn't research
all the history here. Here is the relevant
excerpt from `version.py`:
zulip_git_version_file = os.path.join(
os.path.dirname(os.path.abspath(__file__)),
'zulip-git-version')
if os.path.exists(zulip_git_version_file):
with open(zulip_git_version_file) as f:
version = f.read().strip()
if version:
ZULIP_VERSION = version
The file gets written as follows:
$ cat tools/cache-zulip-git-version
#!/usr/bin/env bash
set -e
cd "$(dirname "$0")/.."
git describe --tags --match='[0-9]*' > zulip-git-version || true
Here is what that might look like:
2.2-dev-2102-gf256ea39eb
Here is an excerpt from one of our recent error reports,
which demonstrates that the code I eliminated here was not
functioning (the third field is missing):
Deployed code:
- git: 2.2-dev-2028-g99ce96d49b-dirty
- ZULIP_VERSION: 2.2-dev-2028-g99ce96d49b
This fixes the main problem reported on #7868. I think
we may just want to close the issue, since the other
`nocoverage` stuff seems harmless to me.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This has the benefit that we now get the usual data about the
user/request/etc. in error emails related to bugdown exceptions;
previously we were just getting the traceback in the emails (since our
`mail_admins` template was very simple) and no other debugging
details.
Comments tweaked by tabbott to help make clear exactly what's going on
here, since it's a little subtle and a little hacky.
Fixes#8843.
This cuts out about 11 calls to `git describe`. In a nice fast LXC
container following our instructions for development on a Linux host,
this might save "only" about 1.5s; in a dev environment on a Windows
host, the savings have been clocked at 49s, presumably due to an
extremely slow filesystem in the VM.
The tests weren't doing much with this codepath as they were, and
there isn't a lot of value to be gained by testing it anyway; it's
totally non-critical and rarely changes.
[Commit message rewritten by greg.]
This is nicer than the "For manual testing ..." comment. :-)
Also as a proper setting we can have it control some logging I
added locally while testing my recent changes to pika logging.
When I added this "Deployed code" feature to the error reporting,
I apparently hadn't worked out enough of how this code works to
realize that `notify_server_error` may be in a different process,
at a different time and potentially even on a different machine
from the actual error being reported.
Given that architecture, all the data about the error must be computed
in `AdminNotifyHandler`, before sending the report through the queue,
or else it risks being wrong. The job of `notify_server_error` and
friends is only to format the data and send it off. So, move the
implementation of this feature in order to do that.
(@showell added some "nocoverage" directives here for code that
is hard to test (exceptions being thrown, deployment files not
existing) and that was originally part of a file that didn't
require 100% coverage)
This helps prevent them from diverging and getting different sets of
features and fixes. As a bonus, the email path gets a nice tweak that
the Zulip path has had for years, since f7f2ec0ac, which makes the
emails clearer and less broken-looking when logging a message with no
stack trace.
This deduplicates a little bit of logic, and also has us always put
things into `report` the same way.
Empirically an exception in this codepath is very rare, so we won't
complicate the code by trying to salvage a lot of partial information
if it happens -- just log the traceback, and try to get a minimal
notification sent of the bare fact this happened.
This name hasn't been right since f7f2ec0ac back in 2013; this handler
sends the log record to a queue, whose consumer will not only maybe
send a Zulip message but definitely send an email. I found this
pretty confusing when I first worked on this logging code and was
looking for how exception emails got sent; so now that I see exactly
what's actually happening here, fix it.