When pulling batches out of the ScheduledEmail list in a single
transaction, an unexpected failure to send an email will result in the
whole batch getting retried. This will result in infinite email
sending loops.
Pull a single row off at a time and send it. We continue without
retries to the next email on EmailNotDeliveredException, but will
retry infinitely on other exceptions.
Fixes: #20943.
The tool needs to run this function, since it uses django's send_email
directly instead of going through our zerver.lib.send_email.send_email
codepath.
Django 3.2 expects a list, and Django 4.1 will require one. Fixes
“RemovedInDjango41Warning: Using a boolean value for
requires_system_checks is deprecated. Use '__all__' instead of True,
and [] (an empty list) instead of False.”
Signed-off-by: Anders Kaseorg <anders@zulip.com>
TOR users are legitimate users of the system; however, that system can
also be used for abuse -- specifically, by evading IP-based
rate-limiting.
For the purposes of IP-based rate-limiting, add a
RATE_LIMIT_TOR_TOGETHER flag, defaulting to false, which lumps all
requests from TOR exit nodes into the same bucket. This may allow a
TOR user to deny other TOR users access to the find-my-account and
new-realm endpoints, but this is a low cost for cutting off a
significant potential abuse vector.
If enabled, the list of TOR exit nodes is fetched from their public
endpoint once per hour, via a cron job, and cached on disk. Django
processes load this data from disk, and cache it in memcached.
Requests are spared from the burden of checking disk on failure via a
circuitbreaker, which trips of there are two failures in a row, and
only begins trying again after 10 minutes.
Unhandled exceptions propagating to process_queue were not caught there,
causing improper logging - errors didn't land in errors.log as expected.
Exceptions should be caught and explicitly logged by the process_queue
logger. Exceptions occurring during consuming events are caught and
handled inside the worker's logic - however those that happen while
setting up the worker were not addressed at all, and that's the core bug
we mean to address here.
Furthermore, in multi-threaded mode we want the autoreload mechanism to
be working - which it doesn't without catching the exceptions. The
correct approach is to - again - catch the exception, log it and then
send SIGUSR1 signal to trigger exit and autoreload.
A SIGTERM can show up at any point in the ioloop, even in places which
are not prepared to handle it. This results in the process ignoring
the `sys.exit` which the SIGTERM handler calls, with an uncaught
SystemExit exception:
```
2021-11-09 15:37:49.368 ERR [tornado.application:9803] Uncaught exception
Traceback (most recent call last):
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/http1connection.py", line 238, in _read_message
delegate.finish()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/httpserver.py", line 314, in finish
self.delegate.finish()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/routing.py", line 251, in finish
self.delegate.finish()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2097, in finish
self.execute()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2130, in execute
**self.path_kwargs)
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
yielded = next(result)
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 1510, in _execute
result = method(*self.path_args, **self.path_kwargs)
File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 150, in get
request = self.convert_tornado_request_to_django_request()
File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 113, in convert_tornado_request_to_django_request
request = WSGIRequest(environ)
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/django/core/handlers/wsgi.py", line 66, in __init__
script_name = get_script_name(environ)
File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/event_queue.py", line 611, in <lambda>
signal.signal(signal.SIGTERM, lambda signum, stack: sys.exit(1))
SystemExit: 1
```
Supervisor then terminates the process with a SIGKILL, which results
in dropping data held in the tornado process, as it does not dump its
queue.
The only command which is safe to run in the signal handler is
`ioloop.add_callback_from_signal`, which schedules the callback to run
during the course of the normal ioloop. This callbacks does an
orderly shutdown of the server and the ioloop before exiting.
For export realm following changes have been made:
- `./manage.py export --upload` would delete `.tar.gz` and unpacked dir
- `./manage.py export` would only delete `unpacked dir`
Besides, we have removed `--delete-after-upload` as we have set it as
the default.
Fixes#20081
It is confusing to have the plan type constants not be namespaced
by the thing they represent. We already have a namespacing
convention in place for constants, so we should use it for
Realm.plan_type as well.
This adds the X-Smokescreen-Role header to proxy connections, to track
usage from various codepaths, and enforces a timeout. Timeouts were
kept consistent with their previous values, or set to 5s if they had
none previously.
This is a roundabout way to appease a semgrep complaint about
‘error_msg = error_msg % (string_id,)’ while also improving the code.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The ability to use multiple ports has been removed a long time ago.
And the "optional" note in the help message is in fact incorrect
since `addrport` being `None` is not supported.
The auth attempt rate limit is quite low (on purpose), so this can be a
common scenario where a user asks their admin to reset the limit instead
of waiting. We should provide a tool for administrators to handle such
requests without fiddling around with code in manage.py shell.
Only clear_scheduled_emails previously took a lock on the users before
removing them; make deliver_scheduled_emails do so as well, by using
prefetch_related to ensure that the table appears in the SELECT. This
is not necessary for correctness, since all accesses of
ScheduledEmailUser first access the ScheduledEmail and lock it; it is
merely for consistency.
Since SELECT ... FOR UPDATE takes an UPDATE lock on all tables
mentioned in the SELECT, merely doing the prefetch is sufficient to
lock both tables; no `on=(...)` is needed to `select_for_update`.
This also does not address the pre-existing potential deadlock from
these two use cases, where both try to lock the same ScheduledEmail
rows in opposite orders.
These checks suffer from a couple notable problems:
- They are only enabled on staging hosts -- where they should never
be run. Since ef6d0ec5ca, these supervisor processes are only
run on one host, and never on the staging host.
- They run as the `nagios` user, which does not have appropriate
permissions, and thus the checks always fail. Specifically,
`nagios` does not have permissions to run `supervisorctl`, since
the socket is owned by the `zulip` user, and mode 0700; and the
`nagios` user does not have permission to access Zulip secrets to
run `./manage.py print_email_delivery_backlog`.
Rather than rewrite these checks to run on a cron as zulip, and check
those file contents as the nagios user, drop these checks -- they can
be rewritten at a later point, or replaced with Prometheus alerting,
and currently serve only to cause always-failing Nagios checks, which
normalizes alert failures.
Leave the files installed if they currently exist, rather than
cluttering puppet with `ensure => absent`; they do no harm if they are
left installed.
This is effectively a step closer to what was proposed in
https://github.com/zulip/zulip/pull/18678#discussion_r644490540 when
this code was written in #18678.
If the Customer object has neither of a Stripe id, nor any historical
plans, then there's no real billing association contained in the
existence of the Customer object, and it's safe to delete.
This commit allows to import the following from rocketchat:
* All users
* All public/private channels
* All teams and its public/private channels
* All discussion rooms as topics in their parent channel
* All the messages in all the channels
* All private conversations
* Reactions on messages (except for custom emojis)
* Mentions in messages (except @all, @here mentions)
Zulip identifies users by realm+delivery_email which means that the
Django changepassword command doesn't work well -
since it looks only at the .email field.
Thus we fork its code to our own change_password command.
Commit 81d7dd1fda broke this nearly
eight years ago, so probably nobody cares except the ever-watchful eye
of mypy.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This command was part of the complex migration to introduce the
`unread_msgs` data structure as the source of truth for unreads.
Effectively, it's a migration to remove anomalies that we ran several
times before turning it into the final 0104_fix_unreads.py migration.
Fixes part of #18898.
This command is part of a statsd infrastructure that we stopped
supporting years ago. Its only purpose for some time has been to
provide sample code for how the restart script might trigger a
notification to a graphing system, which doesn't justify maintaining
it.
Fixes part of #18898.
This command was part of early prototyping of the digests feature, and
in particular its purpose is better served via the organization-level
setting to control digest emails for the organization.
Fixes part of #18898.
This command was written to allow generating multiuse invite links
before the "Invite a user" UI supported them. It no longer has a
purpose and can be safely deleted.
Fixes part of #18898.
This command predates there being a normal UI for inviting users to
Zulip. It no longer has a role for which it's a better way to do
things. (Especially with upcoming API documentation for the endpoint).
Fixes part of #18898.
This command was introduced in 2013 via
6d6c3364dc as part of implementing
marking messages as read in a separate process for performance reasons.
We fixed the performance issues and removed that pipeline years ago,
but forgot to delete this.
Fixes part of #18898.
Sometimes the Slack import zip file we get isn't quite the canonical
form that Slack produces -- often because the user has unzip'd it,
looked at it, and re-zip'd it, resulting in extra nested directories
and the like.
For such cases, support passing in a path to an unpacked Slack export
tree.
The `create_user` API and data import tools can result in our having
active users in the database who haven't intentionally created a Zulip
account or agreed to the ToS; we should never email such users.
The check for `TOS_VERSION is not None` is necessary for the
development environment, which has `TERMS_OF_SERVICE` set but not
`TOS_VERSION`.
It's likely that we will want this check in other places as well.
`deliver_scheduled_emails` and `deliver_scheduled_messages` use their
respective tables like a queue, but do not have guarantees that there
was only one consumer (besides the EMAIL_DELIVERER_DISABLED setting),
and could send duplicate messages if multiple consumers raced in
reading rows.
Use database locking to ensure that the database only feeds a given
ScheduledMessage or ScheduledEmail row to a single consumer. A second
consumer, if it exists, will block until the first consumer commits
the transaction.
This makes it parallel with deliver_scheduled_messages, and clarifies
that it is not used for simply sending outgoing emails (e.g. the
`email_senders` queue).
This also renames the supervisor job to match.
Since the invariant we're trying to protect is that every realm has an
active owner, we should check precisely that.
The root bug here, which the parent commit failed to fix properly, is
that we were doing a "greater than" check when we clearly originally
meant a "less than" check -- lower role numbers have more permissions.
Long-term, we probably want to make the filtering options more
generic, but there's little harm in adding an option for a specific
group we're likely to email multiple times.
Add a `--dry-run` flag to send_custom_email management command
in order to provide a mechanism to verify the emails of the recipients
and the text of the email being sent before actually sending them.
Add tests to:
- Check that no emails are actually sent when we are in the dry-run mode.
- Check if the emails are printed correctly when we are in the dry-run mode.
Fixes#17767
model__id syntax implies needing a JOIN on the model table to fetch the
id. That's usually redundant, because the first table in the query
simply has a 'model_id' column, so the id can be fetched directly.
Django is actually smart enough to not do those redundant joins, but we
should still avoid this misguided syntax.
The exceptions are ManytoMany fields and queries doing a backward
relationship lookup. If "streams" is a many-to-many relationship, then
streams_id is invalid - streams__id syntax is needed. If "y" is a
foreign fields from X to Y:
class X:
y = models.ForeignKey(Y)
then object x of class X has the field x.y_id, but y of class Y doesn't
have y.x_id. Thus Y queries need to be done like
Y.objects.filter(x__id__in=some_list)
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Plurals are handled natively by the ICU MessageFormat syntax, so I
think we don’t have to do anything here.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This is a prep change to eventually completely
replace the term "filter" with "linkifier" in
the codebase.
This only renames files. Code changes will be
done in further commits.
These flags were put in place in the first commit that introduced
Tornado (9afd63692f) with unclear
utility.
Remove them, since they have never been documented, and do not have a
clear need.
Add a `--allow-reserved-subdomain` flag which allows creation of
reserved keyword domains. This also always enforces that the domain
is not in use, which was removed in 0258d7d.
Fixes#16924.
Allowing any admins to create arbitrary users is not ideal because it
can lead to abuse issues. We should require something stronger that
requires the server operator's approval and thus we add a new
can_create_users permission.
We change the return type of check_message to be dataclass instead of
Dict[str, Any]. This refactoring helps us to understand the context of the
data structure returned by check_message clearly which was not possible
when using Dict.
SendMessageRequest class is added in zerver/lib/message.py inspite of it
not being used in that file itself just to maintain consistency as other
TypedDicts and dataclasses are defined in that file and to avoid circular
dependency as SendMessageRequest is being used in lib/widget.py as well.
We also rename local variable to 'send_request' for accessing
SendMessageRequest objects.
The {addr} part isn't directly useful, since connections to Tornado
are done on localhost anyway, and made the development environment
output a bit more confusing.
Also, use the same phrasing for restarts we use for Django.
We export a realm's data, and disable the realm, because the user
is moving from Zulip Cloud (e.g. https://example.zulipchat.com/) to
self-hosting or another platform (e.g. https://zulip.example.com/)
which we do not control. This commit adds a field in the realm object
called deactivated_redirect to store the url to which the realm has
moved.
By registering a post_delete handler to clear appropriate caches in a
nicer way, we can get rid of the ugly flush-memcached call in the
delete_realm command.
We replace knight command with change_user_role command which
allows us to change role of a user to owner, admins, member and
guest. We can also give/revoke api_super_user permission using
this command.
Tweaked by tabbott to improve the logging output and update documentation.
Fixes#16586.
Right now the list of languages in Display settings → Default language
is sorted in an unintuitive order due to the varying case conventions:
British English
Chinese (Taiwan)
Deutsch
English
Hindi
Indonesian (Indonesia)
Lietuviškai
Magyar
Malayalam
Nederlands
Português
Română
Tiếng Việt
Türkçe
català
español
français
galego
italiano
polski
suomi
svenska
česky
Русский
Українська
български
српски
فارسی
தமிழ்
日本語
简体中文
繁體中文
한국어
Fix the sort to use the locale-independent Unicode Collation
Algorithm:
British English
català
česky
Chinese (Taiwan)
Deutsch
English
español
français
galego
Hindi
Indonesian (Indonesia)
italiano
Lietuviškai
Magyar
Malayalam
Nederlands
polski
Português
Română
suomi
svenska
Tiếng Việt
Türkçe
български
Русский
српски
Українська
فارسی
தமிழ்
한국어
日本語
简体中文
繁體中文
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This class removes a lot of the annoying tuples
we were passing around.
Also, by including the user everywhere, which
is easily available to us when we make instances
of SubInfo, it sets the stage to remove
select_related('user_profile').
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.
This may be overkill--I would also be happy if we
just included the assertions from this commit.
do_send_messages has side effects outside the database and may not
work reliably if its database effects are reordered by being inside a
transaction.
This also fixes a bug where we were doing the update incorrectly on
the Message table.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
SIGALRM is the simplest way to set a specific maximum duration that
queue workers can take to handle a specific message. This only works
in non-threaded environments, however, as signal handlers are
per-process, not per-thread.
The MAX_CONSUME_SECONDS is set quite high, at 10s -- the longest
average worker consume time is embed_links, which hovers near 1s.
Since just knowing the recent mean does not give much information[1],
it is difficult to know how much variance is expected. As such, we
set the threshold to be such that only events which are significant
outliers will be timed out. This can be tuned downwards as more
statistics are gathered on the runtime of the workers.
The exception to this is DeferredWorker, which deals with quite-long
requests, and thus has no enforceable SLO.
[1] https://www.autodesk.com/research/publications/same-stats-different-graphs
We call build_message_send_dict from check_message instead of
do_send_messages.
This is a prep commit for adding a new setting for handling
wildcard mentions in large streams.
We display the text of the consent message, and then continue with the
export, which will scroll the content off the screen. Allow the
administrator time to examine the contents of the message, and decide
whether to proceed based on that and the fraction of users that have
responded so far.
`zproject/settings.py` itself is mostly-empty now. Adjust the
references which should now point to `zproject/computed_settings.py`
or `zproject/default_settings.py`.
These weren’t wrong since orjson.JSONDecodeError subclasses
json.JSONDecodeError which subclasses ValueError, but the more
specific ones express the intention more clearly.
(ujson raised ValueError directly, as did json in Python 2.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The exception trace only goes from where the exception was thrown up
to where the `logging.exception` call is; any context as to where
_that_ was called from is lost, unless `stack_info` is passed as well.
Having the stack is particularly useful for Sentry exceptions, which
gain the full stack trace.
Add `stack_info=True` on all `logging.exception` calls with a
non-trivial stack; we omit `wsgi.py`. Adjusts tests to match.
The S3 data export tool's upload code path uses this nice boto
callback feature for showing a progress bar, which is nice for the
management command. It's spammy/broken in production and the backend
tests, so we change percent_callback to be a parameter passed in so
that it can only be used in the contexts where it makes sense.
A few major themes here:
- We remove short_name from UserProfile
and add the appropriate migration.
- We remove short_name from various
cache-related lists of fields.
- We allow import tools to continue to
write short_name to their export files,
and then we simply ignore the field
at import time.
- We change functions like do_create_user,
create_user_profile, etc.
- We keep short_name in the /json/bots
API. (It actually gets turned into
an email.)
- We don't modify our LDAP code much
here.
Added new Event Type in AbstractRealmAuditLog STREAM_CREATED.
Since we finally create streams in create_stream_if_needed function
in zerver/lib/streams.py so logged realm_audit there.
Passed acting_user when create_stream_if_needed or ensure_stream
function is called.
Added tests in test_audit_log.
A generator that yields values without receiving or returning them is
an Iterator. Although every Iterator happens to be iterable, Iterable
is a confusing annotation for generators because a generator is only
iterable once.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes#14960.
The default of 6 thread may not be appropriate in certain
configurations. Taking half of the numer of CPUs available to the
process will be more flexible.
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We're migrating to using the cleaner zulip.com domain, which involves
changing all of our links from ReadTheDocs and other places to point
to the cleaner URL.
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
datetime.timezone is available in Python ≥ 3.2. This also lets us
remove a pytz dependency from the PostgreSQL scripts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit merges do_change_is_admin and do_change_is_guest to a
single function do_change_user_role which will be used for changing
role of users.
do_change_is_api_super_user is added as a separate function for
changing is_api_super_user field of UserProfile.