The `create_user` API and data import tools can result in our having
active users in the database who haven't intentionally created a Zulip
account or agreed to the ToS; we should never email such users.
The check for `TOS_VERSION is not None` is necessary for the
development environment, which has `TERMS_OF_SERVICE` set but not
`TOS_VERSION`.
It's likely that we will want this check in other places as well.
`deliver_scheduled_emails` and `deliver_scheduled_messages` use their
respective tables like a queue, but do not have guarantees that there
was only one consumer (besides the EMAIL_DELIVERER_DISABLED setting),
and could send duplicate messages if multiple consumers raced in
reading rows.
Use database locking to ensure that the database only feeds a given
ScheduledMessage or ScheduledEmail row to a single consumer. A second
consumer, if it exists, will block until the first consumer commits
the transaction.
This makes it parallel with deliver_scheduled_messages, and clarifies
that it is not used for simply sending outgoing emails (e.g. the
`email_senders` queue).
This also renames the supervisor job to match.
Since the invariant we're trying to protect is that every realm has an
active owner, we should check precisely that.
The root bug here, which the parent commit failed to fix properly, is
that we were doing a "greater than" check when we clearly originally
meant a "less than" check -- lower role numbers have more permissions.
Long-term, we probably want to make the filtering options more
generic, but there's little harm in adding an option for a specific
group we're likely to email multiple times.
Add a `--dry-run` flag to send_custom_email management command
in order to provide a mechanism to verify the emails of the recipients
and the text of the email being sent before actually sending them.
Add tests to:
- Check that no emails are actually sent when we are in the dry-run mode.
- Check if the emails are printed correctly when we are in the dry-run mode.
Fixes#17767
model__id syntax implies needing a JOIN on the model table to fetch the
id. That's usually redundant, because the first table in the query
simply has a 'model_id' column, so the id can be fetched directly.
Django is actually smart enough to not do those redundant joins, but we
should still avoid this misguided syntax.
The exceptions are ManytoMany fields and queries doing a backward
relationship lookup. If "streams" is a many-to-many relationship, then
streams_id is invalid - streams__id syntax is needed. If "y" is a
foreign fields from X to Y:
class X:
y = models.ForeignKey(Y)
then object x of class X has the field x.y_id, but y of class Y doesn't
have y.x_id. Thus Y queries need to be done like
Y.objects.filter(x__id__in=some_list)
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Plurals are handled natively by the ICU MessageFormat syntax, so I
think we don’t have to do anything here.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This is a prep change to eventually completely
replace the term "filter" with "linkifier" in
the codebase.
This only renames files. Code changes will be
done in further commits.
These flags were put in place in the first commit that introduced
Tornado (9afd63692f) with unclear
utility.
Remove them, since they have never been documented, and do not have a
clear need.
Add a `--allow-reserved-subdomain` flag which allows creation of
reserved keyword domains. This also always enforces that the domain
is not in use, which was removed in 0258d7d.
Fixes#16924.
Allowing any admins to create arbitrary users is not ideal because it
can lead to abuse issues. We should require something stronger that
requires the server operator's approval and thus we add a new
can_create_users permission.
We change the return type of check_message to be dataclass instead of
Dict[str, Any]. This refactoring helps us to understand the context of the
data structure returned by check_message clearly which was not possible
when using Dict.
SendMessageRequest class is added in zerver/lib/message.py inspite of it
not being used in that file itself just to maintain consistency as other
TypedDicts and dataclasses are defined in that file and to avoid circular
dependency as SendMessageRequest is being used in lib/widget.py as well.
We also rename local variable to 'send_request' for accessing
SendMessageRequest objects.
The {addr} part isn't directly useful, since connections to Tornado
are done on localhost anyway, and made the development environment
output a bit more confusing.
Also, use the same phrasing for restarts we use for Django.
We export a realm's data, and disable the realm, because the user
is moving from Zulip Cloud (e.g. https://example.zulipchat.com/) to
self-hosting or another platform (e.g. https://zulip.example.com/)
which we do not control. This commit adds a field in the realm object
called deactivated_redirect to store the url to which the realm has
moved.
By registering a post_delete handler to clear appropriate caches in a
nicer way, we can get rid of the ugly flush-memcached call in the
delete_realm command.
We replace knight command with change_user_role command which
allows us to change role of a user to owner, admins, member and
guest. We can also give/revoke api_super_user permission using
this command.
Tweaked by tabbott to improve the logging output and update documentation.
Fixes#16586.
Right now the list of languages in Display settings → Default language
is sorted in an unintuitive order due to the varying case conventions:
British English
Chinese (Taiwan)
Deutsch
English
Hindi
Indonesian (Indonesia)
Lietuviškai
Magyar
Malayalam
Nederlands
Português
Română
Tiếng Việt
Türkçe
català
español
français
galego
italiano
polski
suomi
svenska
česky
Русский
Українська
български
српски
فارسی
தமிழ்
日本語
简体中文
繁體中文
한국어
Fix the sort to use the locale-independent Unicode Collation
Algorithm:
British English
català
česky
Chinese (Taiwan)
Deutsch
English
español
français
galego
Hindi
Indonesian (Indonesia)
italiano
Lietuviškai
Magyar
Malayalam
Nederlands
polski
Português
Română
suomi
svenska
Tiếng Việt
Türkçe
български
Русский
српски
Українська
فارسی
தமிழ்
한국어
日本語
简体中文
繁體中文
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This class removes a lot of the annoying tuples
we were passing around.
Also, by including the user everywhere, which
is easily available to us when we make instances
of SubInfo, it sets the stage to remove
select_related('user_profile').
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.
This may be overkill--I would also be happy if we
just included the assertions from this commit.
do_send_messages has side effects outside the database and may not
work reliably if its database effects are reordered by being inside a
transaction.
This also fixes a bug where we were doing the update incorrectly on
the Message table.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
SIGALRM is the simplest way to set a specific maximum duration that
queue workers can take to handle a specific message. This only works
in non-threaded environments, however, as signal handlers are
per-process, not per-thread.
The MAX_CONSUME_SECONDS is set quite high, at 10s -- the longest
average worker consume time is embed_links, which hovers near 1s.
Since just knowing the recent mean does not give much information[1],
it is difficult to know how much variance is expected. As such, we
set the threshold to be such that only events which are significant
outliers will be timed out. This can be tuned downwards as more
statistics are gathered on the runtime of the workers.
The exception to this is DeferredWorker, which deals with quite-long
requests, and thus has no enforceable SLO.
[1] https://www.autodesk.com/research/publications/same-stats-different-graphs
We call build_message_send_dict from check_message instead of
do_send_messages.
This is a prep commit for adding a new setting for handling
wildcard mentions in large streams.
We display the text of the consent message, and then continue with the
export, which will scroll the content off the screen. Allow the
administrator time to examine the contents of the message, and decide
whether to proceed based on that and the fraction of users that have
responded so far.
`zproject/settings.py` itself is mostly-empty now. Adjust the
references which should now point to `zproject/computed_settings.py`
or `zproject/default_settings.py`.
These weren’t wrong since orjson.JSONDecodeError subclasses
json.JSONDecodeError which subclasses ValueError, but the more
specific ones express the intention more clearly.
(ujson raised ValueError directly, as did json in Python 2.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The exception trace only goes from where the exception was thrown up
to where the `logging.exception` call is; any context as to where
_that_ was called from is lost, unless `stack_info` is passed as well.
Having the stack is particularly useful for Sentry exceptions, which
gain the full stack trace.
Add `stack_info=True` on all `logging.exception` calls with a
non-trivial stack; we omit `wsgi.py`. Adjusts tests to match.
The S3 data export tool's upload code path uses this nice boto
callback feature for showing a progress bar, which is nice for the
management command. It's spammy/broken in production and the backend
tests, so we change percent_callback to be a parameter passed in so
that it can only be used in the contexts where it makes sense.
A few major themes here:
- We remove short_name from UserProfile
and add the appropriate migration.
- We remove short_name from various
cache-related lists of fields.
- We allow import tools to continue to
write short_name to their export files,
and then we simply ignore the field
at import time.
- We change functions like do_create_user,
create_user_profile, etc.
- We keep short_name in the /json/bots
API. (It actually gets turned into
an email.)
- We don't modify our LDAP code much
here.
Added new Event Type in AbstractRealmAuditLog STREAM_CREATED.
Since we finally create streams in create_stream_if_needed function
in zerver/lib/streams.py so logged realm_audit there.
Passed acting_user when create_stream_if_needed or ensure_stream
function is called.
Added tests in test_audit_log.
A generator that yields values without receiving or returning them is
an Iterator. Although every Iterator happens to be iterable, Iterable
is a confusing annotation for generators because a generator is only
iterable once.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes#14960.
The default of 6 thread may not be appropriate in certain
configurations. Taking half of the numer of CPUs available to the
process will be more flexible.
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We're migrating to using the cleaner zulip.com domain, which involves
changing all of our links from ReadTheDocs and other places to point
to the cleaner URL.
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
datetime.timezone is available in Python ≥ 3.2. This also lets us
remove a pytz dependency from the PostgreSQL scripts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit merges do_change_is_admin and do_change_is_guest to a
single function do_change_user_role which will be used for changing
role of users.
do_change_is_api_super_user is added as a separate function for
changing is_api_super_user field of UserProfile.
I don't think we've had a use for these tools since our unread systems
stabilized shortly after they were written, so it makes sense to just
remove them rather than updating them for the pointer migration.
Refactored code in actions.py and streams.py to move stream related
functions into streams.py and remove the dependency on actions.py.
validate_sender_can_write_to_stream function in actions.py was renamed
to access_stream_for_send_message in streams.py.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We use retry_event in queue_processors.py to handle trying on failures,
without getting stuck in permanent retry loops if the event ends up
leading to failure on every attempt and we just keep sending NACK to
rabbitmq forever (or until the channel crashes). Tornado queues haven't
been using this, but they should.
After subscribing a stream email address to a Mailman email list
and receiving a message from it (using the polling configuration
with an Exim + Dovecot mailserver), the following error message
is emitted by Zulip:
Logger zerver.lib.email_mirror, from module zerver.lib.email_mirror line 77:
Error generated by Anonymous user (not logged in) on zulip deployment
Sender: "Foo Bar" <foo@example.com>
To: No recipient found
Missing recipient in mirror email
This is because the To: header on the received email corresponds
to the email list, and there are no other headers to indicate the
final recipient, apart from the "Envelope-To" header added by
Exim. To resolve this problem, the commit adds "Envelope-To" to
the list of headers to check for a match.
setup_event_queue() generates some logs about loaded event queues, and
it's good for the logging system to have access to the port at that
point already.
Django 2.2.x is the next LTS release after Django 1.11.x; I expect
we'll be on it for a while, as Django 3.x won't have an LTS release
series out for a while.
Because of upstream API changes in Django, this commit includes
several changes beyond requirements and:
* urls: django.urls.resolvers.RegexURLPattern has been replaced by
django.urls.resolvers.URLPattern; affects OpenAPI code and related
features which re-parse Django's internals.
https://code.djangoproject.com/ticket/28593
* test_runner: Change number to suffix. Django changed the name in this
ticket: https://code.djangoproject.com/ticket/28578
* Delete now-unnecessary SameSite cookie code (it's now the default).
* forms: urlsafe_base64_encode returns string in Django 2.2.
https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode
* upload: Django's File.size property replaces _get_size().
https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/
* process_queue: Migrate to new autoreload API.
* test_messages: Add an extra query caused by .refresh_from_db() losing
the .select_related() on the Realm object.
* session: Sync SessionHostDomainMiddleware with Django 2.2.
There's a lot more we can do to take advantage of the new release;
this is tracked in #11341.
Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera squashed
are squashed into this commit.
Fixes#10835.
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".
At the time, we just needed a name quickly, but it was never a good
name, just a placeholder. This removes that placeholder name from
much of the codebase. A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
These docstrings hadn't been properly updated in years, and bad an
awkward mix of a bad version of the user-facing documentation and
details that are no longer true (e.g. references to "Voyager").
(One important detail is that we have real documentation for this
system now).
Closes#13736.
zerver.lib.server_initialization.create_internal has precisely the same
code (you can copy-and-paste swap them, with one level of indentation
adjustment, without generating any diff) so they can be trivially
deduplicated.
zerver.lib.server_initialization.create_users has precisely the same
code (you can copy-and-paste swap them without generating any diff) so
they can be trivially deduplicated.
This doesn't change any behavior, the purpose of this is to make the
function identical to what we have in server_initialization.py so that
it can be deduplicated in follow-up commits.
This legacy cross-realm bot hasn't been used in several years, as far
as I know. If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.
Fixes#13533.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
return in that loop was a bug, which would lead to the To: header not
being set even though data['recipient'] = str(message['To']) is being
run next, thus requiring the header. We can remove the return
statement and now the loop will overwrite all the potentially
troublesome headers.
This is a preparatory commit for using isort for sorting all of our
imports, merging changes to files where we can easily review the
changes as something we're happy with.
These are also files with relatively little active development, which
means we don't expect much merge conflict risk from these changes.
If ldap sync is run while ldap is misconfigured, it can end up causing
troublesome deactivations due to not finding users in ldap -
deactivating all users, or deactivating all administrators of a realm,
which then will require manual intervention to reactivate at least one
admin in django shell.
This change prevents such potential troublesome situations which are
overwhelmingly likely to be unintentional. If intentional, --force
option can be used to remove the protection.
This allows us to email sets of users on a server with a nicely
formatted email similar to our onboarding emails, built off of a
Markdown template.
The code was based on send_password_reset_email, but it doesn't
replace that use case, since one cannot include special values like
password reset tokens in these emails.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
Previously, incorrectly passing an existing directory to the
`manage.py export --output` option would remove its contents without
warning. Abort instead.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Also cleans up the interface between the management command and the
LDAP backends code to not guess/recompute under what circumstances
what should be logged.
Co-authored-by: mateuszmandera <mateusz.mandera@protonmail.com>
Apparently, the filters written for the send_password_reset_email (and
some other management commands) didn't correctly consider the case of
deactivated users.
While some commands, like syncing LDAP data (which can include whether
a user should be deactivated) want to process all users, other
commands generally only want to interact with active users. We fix
this and add some tests.
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We simply state that certain options are `Optional`.
The following files are affected:
add_users_to_mailing_list
send_to_email_mirror
fill_memcached_caches
client_activity
When typing `**options` as an `Optional[str]` we will see errors
in the from of `None type has no attribute 'split'`. This change
allows mypy to effectively handle the `None` case.
As of commit cff40c557b (#9300), these
files are no longer served directly to the browser. Disentangle them
from the static asset pipeline so we can refactor it without worrying
about them.
This has the side effect of eliminating the accidental duplication of
translation data via hash-naming in our release tarballs.
This reverts commit b546391f0b (#1148).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
When parsing custom HTTP headers in the integrations dev panel, http
headers from fixtures system and the send_webhook_fixture_message
we now use a singular source of logic: standardize_headers which
will take care of converting a dictionary of input headers into a
standard form that Django expects.
The argument parser has default empty values set for the options
`--password` and `--password-file`, and this causes the script to try and
read a password file even when the argument was not provided.
The upload option will no longer be limited to strictly S3 uploads. This
commit serves as a preliminary step for supporting LOCAL_UPLOADS_DIR as
part of the public only export feature.
This is a very old commit for #106, which has been on hiatus for a few
years. It was significantly modified by tabbott to:
* Improve coding style and variable names
* Update mypy annotations style
* Clean up the testing logic
* Update for API changes elsewhere in our system
But the actual runtime code is essentially unmodified from the
original work by Kirill.
It contains basic support for archiving Messages, UserMessages, and
Attachments with a nice test suite. It's still not usable in
production (e.g. it will probably break Reactions, SubMessages, etc.),
but upcoming commits will address that.
This lets us handle directly in our tooling the user experience that
we document for exporting a realm with member consent (before, it
required unpleasant manual work).
This renames Subscription.in_home_view field to is_muted, for greater
clarity as to what it does just from seeing the setting name, without
having to look it up.
Also disabled an obsolete test_migrations test.
Fixes#10042.
Using sys.exit in a management command makes it impossible
to unit test the code in question. The correct approach to do the same
thing in Django management commands is to raise CommandError.
Followup of b570c0dafa
This commit serves as the first step in supporting "public export" as a
webapp feature. The refactoring was done as a means to allow calling
the export logic from elsewhere in the codebase.
These functions don't really belong in actions.py, so we move them out,
into email_mirror_helpers.py. They can't go directly into
email_mirror.py or we'd get circular imports resulting in ImportError.
We change the send_to_email_mirror management command, to send messages
to the email mirror through the mirror_email_message function instead of
process_message - this makes the message follow a similar codepath as
emails sent into the mirror with the postfix configuration, which means
going through the MirrorWorker queue. The reason for this is to make
this command useful for testing the new email mirror rate limiter.
When soft deactivation is run for in "auto" mode (no emails are
specified and all users inactive for specified number of days are
deactivated), catch-up is also run in the "auto" mode if
AUTO_CATCH_UP_SOFT_DEACTIVATED_USERS is True.
Automatically catching up soft-deactivated users periodically would
ensure a good user experience for returning users, but on some servers
we may want to turn off this option to save on some disk space.
Fixes#8858, at least for the default configuration, by eliminating
the situation where there are a very large number of messages to recover.
Earlier the behavior was to raise an exception thereby stopping the
whole sync. Now we log an error message and skip the field. Also
fixes the `query_ldap` command to report missing fields without
error.
Fixes: #11780.
This avoids a spurious permission error inside the Postgres
`resolve_symlinks` function if we don’t have access to the current
working directory (e.g. we’re running with cwd /root inside `su
zulip`).
While we’re here, add a defensive `--` argument.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We do this since we are yet to figure out how the entire realm
internal bots scenerio should work and therefore for the timming
we will use notification bot to deliver the reminders.