For 3000 messages and 400 users, this saved
about 30 seconds.
We only do two queries per batch of messages
now, and the algorithm is easier to analyze,
as it's just three nested loops.
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.
This may be overkill--I would also be happy if we
just included the assertions from this commit.
SIGALRM is the simplest way to set a specific maximum duration that
queue workers can take to handle a specific message. This only works
in non-threaded environments, however, as signal handlers are
per-process, not per-thread.
The MAX_CONSUME_SECONDS is set quite high, at 10s -- the longest
average worker consume time is embed_links, which hovers near 1s.
Since just knowing the recent mean does not give much information[1],
it is difficult to know how much variance is expected. As such, we
set the threshold to be such that only events which are significant
outliers will be timed out. This can be tuned downwards as more
statistics are gathered on the runtime of the workers.
The exception to this is DeferredWorker, which deals with quite-long
requests, and thus has no enforceable SLO.
[1] https://www.autodesk.com/research/publications/same-stats-different-graphs
We set wildcard_mention_policy in the test database so that we can
avoid future changes in mention puppeteer tests, as the default
membership of streams in the Zulip development organization is large
enough to prevent random users from using wildcard mentions.
We call build_message_send_dict from check_message instead of
do_send_messages.
This is a prep commit for adding a new setting for handling
wildcard mentions in large streams.
Having both of these is confusing; TORNADO_SERVER is used only when
there is one TORNADO_PORT. Its primary use is actually to be _unset_,
and signal that in-process handling is to be done.
Rename to USING_TORNADO, to parallel the existing USING_RABBITMQ, and
switch the places that used it for its contents to using
TORNADO_PORTS.
A few major themes here:
- We remove short_name from UserProfile
and add the appropriate migration.
- We remove short_name from various
cache-related lists of fields.
- We allow import tools to continue to
write short_name to their export files,
and then we simply ignore the field
at import time.
- We change functions like do_create_user,
create_user_profile, etc.
- We keep short_name in the /json/bots
API. (It actually gets turned into
an email.)
- We don't modify our LDAP code much
here.
The prior version clobbered all flags, which means
we had unrealistic values for is_private.
Now we only touch the unread flag, which
also means when we go next to create alert words,
those will now work.
Added new Event Type in AbstractRealmAuditLog STREAM_CREATED.
Since we finally create streams in create_stream_if_needed function
in zerver/lib/streams.py so logged realm_audit there.
Passed acting_user when create_stream_if_needed or ensure_stream
function is called.
Added tests in test_audit_log.
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We change do_create_user and create_user to accept
role as a parameter instead of 'is_realm_admin' and 'is_guest'.
These changes are done to minimize data conversions between
role and boolean fields.
This commit merges do_change_is_admin and do_change_is_guest to a
single function do_change_user_role which will be used for changing
role of users.
do_change_is_api_super_user is added as a separate function for
changing is_api_super_user field of UserProfile.
Instread of using stream_name + Intergers as topics, we now
generate topics using pos in `config.generate_data.json`.
This helps us create and test more realistic topics.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We now restrict emails on the zulip realm, and now
`email` and `delivery_email` will be different for
users.
This change should make it more likely to catch
errors where we leak delivery emails or use the
wrong field for lookups.
The email domain restriction to @zulip.com is annoying in development
environment when trying to test sign up. For consistency, it's best to
have tests use the same default, and the tests that require domain
restriction can be adjusted to set that configuration up for themselves
explicitly.
This index is intended to optimize the performance of the very
frequently run query of "what is the presence status of all users in a
realm?".
Main changes:
- add realm_id to UserPresence
- add index for realm_id
- backfill realm_id for old rows
- change all writes to UserPresence to include
realm_id
The index is of this form:
"zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id)
We will create an index on (realm_id, timestamp) in a
future commit, but I think it's a bit faster if you do
the backfill before the index.
There's also a minor tweak to the populate_db script.
This commit includes a new `stream_post_policy` setting,
by replacing the `is_announcement_only` field from the Stream model,
which is done by mirroring the structure of the existing
`create_stream_policy`.
It includes the necessary schema and database migrations to migrate
the is_announcement_only boolean field to stream_post_policy,
a smallPositiveInteger field similar to many other settings.
This change is done to allow organization administrators to restrict
new members from creating and posting to a stream. However, this does
not affect admins who are new members.
With many tweaks by tabbott to documentation under /help, etc.
Fixes#13616.
This is a preparatory commit for using isort for sorting all of our
imports, merging changes to files where we can easily review the
changes as something we're happy with.
These are also files with relatively little active development, which
means we don't expect much merge conflict risk from these changes.
This fixes a similar problem to the last commit; we don't use
memcached with the test database, so we don't need to flush memcached
when rebuilding it.
(And if we try, we'll get exceptions trying to access the relevant
settings).
Our recent fixes to using the system's configured memcached settings
broke populate_db, because its hacky clear_database helper is called
with a hacked-up settings module.
We fix this by first moving this out-of-place code from models.py into
populate_db, and then saving the settings required to access memcached
so that we can use them in clear_database.
We also fix a mypy erorr in flush-memcached that matches the same
issue fixed in clear_database.
Extracting this calculation makes it easier to hack
it when you're trying to load lots of users.
We probably want a slightly more realistic calculation
here for stress testing. And also fewer rows. But
at least now it's a little more clear what it's doing.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This commit alone breaks things, needs to be merged with the follow-up
ones.
welcome-bot is removed from the explicit list, because it already is in
settings.INTERNAL_BOTS.
Django’s default FileSystemFinder disallows STATICFILES_DIRS from
containing STATIC_ROOT (by raising an ImproperlyConfigured exception),
because STATIC_ROOT is supposed to be the result of collecting all the
static files in the project, not one of the potentially many sources
of static files.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We rename the send_messages function to generate_and_messages, and
factor out the actual sending part of it into a separate function, which
now gets the name send_messages().
Sending messages one-by-one is significantly slower compared to taking
advantage of the batch-handling code in do_send_messages, so we pass all
the messages to the function in one go. This is memory-hungry if there's
a lot of messages, so we will allow splitting into smaller batches in
the next commit.
The code generating pub_dates for messages would fail to distribute them
across days if tot_messages was too large.
We refactor this code as a separate function (for clarity and to unit
test for the bug we're fixing), and change the structure and naming to a
form that more clearly describes what's happening. We also shift away
from the approach of all the float-to-int conversions as this is in
general tricky and bug prone - django's timedelta() handles floats as
arguments, so we take advantage of that.
Add new custom profile field type, External account.
External account field links user's social media
profile with account. e.g. GitHub, Twitter, etc.
Fixes part of #12302
Rename URL type custom profile field in populate db to avoid confusion
with the "GitHub profile" custom external account profile field we'll
be adding shortly.
The hope is that by having a shorter list of initial streams, it'll
avoid some potential confusion confusion about the value of topics.
At the very least, having 5 streams each with 1 topic was not a good
way to introduce Zulip.
This commit minimizes changes to the message content in
`send_initial_realm_messages` to keep the diff readable. Future commits will
reshape the content.
For internal stream messages, most of the time, we have access to
a Stream object. For the few corner cases where we don't, it is a
much cleaner approach to have a separate function that accepts a
stream name than having one multi-option helper that accepts both
names and objects.
If the caller has access to a Stream object, it is wasteful to
query a database for a stream by ID or name. In addition, not
having to go through stream names eliminates various classes of
possible bugs involved with getting a Stream object back.
A key part of this is the new helper, get_user_by_delivery_email. Its
verbose name is important for clarity; it should help avoid blind
copy-pasting of get_user (which we'll also want to rename).
Unfortunately, it requires detailed understanding of the context to
figure out which one to use; each is used in about half of call sites.
Another important note is that this PR doesn't migrate get_user calls
in the tests except where not doing so would cause the tests to fail.
This probably deserves a follow-up refactor to avoid bugs here.
This fixes a lot of spammy output of the form:
2018-11-27 17:46:48.279 INFO [zerver.lib.push_notifications] Sending push notification to user 46
when running populate_db, which is both confusing (since we're not
actually sending push notifications here) and spammy.
random_api_key, the function we use to generate random tokens for API
keys, has been moved to zerver/lib/utils.py because it's used in more
parts of the codebase (apart from user creation), and having it in
zerver/lib/create_user.py was prone to cyclic dependencies.
The function has also been renamed to generate_api_key to have an
imperative name, that makes clearer what it does.
Now reading API keys from a user is done with the get_api_key wrapper
method, rather than directly fetching it from the user object.
Also, every place where an action should be done for each API key is now
using get_all_api_keys. This method returns for the moment a single-item
list, containing the specified user's API key.
This commit is the first step towards allowing users have multiple API
keys.
This renames Realm.restricted_to_domain field to
emails_restricted_to_domains, for greater clarity as to what it does
just from seeing the setting name, without having to look it up.
Fixes part of #10042.
As detailed in the documentation changes, this simplifies the
development workflow for doing UI work on the /stats pages.
The cost is a ~10% increase the time it takes to run `populate_db`,
which doesn't happen very often (and for most purposes manifests as a
1% increase in the time it takes to rebuild the database from scratch).
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This reflects the changes to the default URL publicly
displayed to the user. It also changes the default
URL of the default test server outgoing webhook, which
prevented the test server flaskbotrc from working out
of the box.
We essentially stop running create_realm_internal_bots during
every provisioing and move its operations to run from populate db.
In fact to speed things up a bit we actually make populate db call the
funcs which create_realm_internal_bots calls behind the scenes.
Fixes: #9467.
Previously, the stream colors index i was accidentally a function only
of the user, so each user got the same color for all their streams.
This should provide a lot nicer-looking development environment
experience.
Makes announce stream `is_announcement_only` for the dev db for easier
manual testing. The default value for `is_announcement_only` in
`bulk_create_streams` is False.
We flip the Stream "Rome" to be a web public stream. Also we add
attribute is_web_public in various stream dicts and in the
bulk_create_streams function of bulk_create.py responsible for
default stream creation in dev environment.
Apparently, this bot account was not properly being tagged as an API
super user in the test database; resulting in incorrect behavior if we
tried to send to a private stream in a test.
(Note that there seems to also be a similar issue in production, that
we don't understand the cause of; that is unrelated).
Issue #2088 asked for a wrapper to be created for
`create_stream_if_needed` (called `ensure_stream`) for the 25 times that
`create_stream_if_needed` is called and ignores whether the stream was
created. This commit replaces relevant occurences of
`create_stream_if_needed` with `ensure_stream`, including imports.
The changes weren't significant enough to add any tests or do any
additional manual testing.
The refactoring intended to make the API easier to use in most cases.
The majority of uses of `create_stream_if_needed` ignored the second
parameter.
Fixes: #2088.
To ensure that we have some basic data for custom profile settings,
in the `populate_db` data set, remove `options['test_suite']` check
for adding intial custom profile data.
This commit migrates realm emoji to be addressed by their `id` rather
than their name. This fixes a long standing issue which was causing
an error on uploading an emoji with same name as a deactivated realm
emoji.
Fixes: #6977.
This commit adds a generic function called check_send_webhook_message
that does the following:
* If a stream is specified in the webhook URL, it sends a stream
message, otherwise sends a PM to the owner of the bot.
* In the case of a stream message, if a custom topic is specified
in the webhook URL, it uses that topic as the subject of the
stream message.
Also, note that we need not test this anywhere except for the
helloworld webhook. Since helloworld is our default example for
webhooks, it is here to stay and it made sense that tests for a
generic function such as check_send_webhook_message be tested
with an actual generic webhook!
Fixes#8607.
This makes it easier to work on features that depend on messages
having been sent in the past (E.g. the date parts of recipient bars).
The new feature only works with --threads=1; since with the ~100
default messages that populate_db generates, the multi-threaded
feature shouldn't have significant performance impact (and it would be
tricky to make increasing timestamps work with the multi-threaded
model), it's reasonable to just set the default number of threads to 1
for now and have this timestamp-spreading feature only supported with
--threads=1.
Fixes#8277.
We'd rather this work be just executed immediately, rather than
queued, since queued events can confuse the queue workers if the
database is dropped and recreated repeatedly.
The 'simple' realm was super broken and confusing for new users. We
should replace this with having an easy way to make a new realm in
development, done properly.
Fixes#6116.
While it might be useful to have created welcome-bot earlier in a
certain sense, it's definitely not a good idea in this populate_db
implementation, because doing so threw off the random initial
assignment of users to streams and thus broke the casper tests.
This makes the standard checkboxes 7% darker and makes the disabled
ones about 12% darker + 7% darker than they were before, to
increase visibility.
Fixes: #6331.
Create a generator script to pull lines from a play, enhancing
random lines with emoji, Markdown and other flair.
With numerous contributions from Rein Zustand and Tim Abbott to finish
the project.
Fixes: #1666.
This system hasn't been in active use for several years, and had some
problems with it's design. So it makes sense to just remove it to declutter
the codebase.
Fixes#5655.
Once we implement org_type-specific features, it'll be easy to change a
corporate realm to a community realm, but hard to go the other way. The main
difference (the main thing that makes migrating from a community realm to a
corporate realm hard) is that you'd have to make everyone sign another terms
of service.
Change `from django.utils.timezone import now` to
`from django.utils import timezone`.
This is both because now() is ambiguous (could be datetime.datetime.now),
and more importantly to make it easier to write a lint rule against
datetime.datetime.now().
This fixes an issue where one would get errors of the form:
`ValueError: unsupported pickle protocol: 3`
in a `run-dev.py` server run against Python 2 if you ran `provision`.
Provision currently runs `populate_db` with Python 3, storing Python 3
based data in memcached, which then can't be read by Python 2.
The realm with string_id of "simple" just has three users
named alice, bob, and cindy for now. It is useful for testing
scenarios where realms don't have special zulip.com exception
handling.
This old helper has for years been used only by populate_db, and got
buggy (as of a recent refactoring). So we just call do_send_messages
directly instead.
Fixes the provisioning error we currently get in Travis CI.
Finishes the refactoring started in c1bbd8d. The goal of the refactoring is
to change the argument to get_realm from a Realm.domain to a
Realm.string_id. The steps were
* Add a new function, get_realm_by_string_id.
* Change all calls to get_realm to use get_realm_by_string_id instead.
* Remove get_realm.
* (This commit) Rename get_realm_by_string_id to get_realm.
Part of a larger migration to remove the Realm.domain field entirely.
First step in cleaning up populate_db.create_streams and
bulk_create.bulk_create_streams. Part of a series of commits to remove
Realm.domain from populate_db.
We are prone to case-sensitivity bugs, so I added AARON and ZOE.
Also, for good measure, I insert them in non-alphabetical order
to try to drive out bugs from non-consistent sorting of user ids.
This adds a couple new tools that can be used to determine whether a
particular change in Zulip's backend markdown processor would impact
the rendering of historical messages, without a human actually looking
at the message content. This is a useful way to verify whether a
change to our markdown syntax is likely to create problems.
[commit message and code tweaked by tabbott]
Previously, we set restrict_to_domain and invite_required differently
depending on whether we were setting up a community or a corporate
realm. Setting restrict_to_domain requires validation on the domain of the
user's email, which is messy in the web realm creation flow, since we
validate the user's email before knowing whether the user intends to set up
a corporate or community realm. The simplest solution is to have the realm
creation flow impose as few restrictions as possible (community defaults),
and then worry about restrict_to_domain etc. after the user is already in.
We set the test suite to explictly use the old defaults, since several of
the tests depend on the old defaults.
This commit adds a database migration.
Does a database migration to rename Realm.subdomain to
Realm.string_id, and makes Realm.subdomain a property. Eventually,
Realm.string_id will replace Realm.domain as the handle by which we
retrieve Realm objects.
This is a preliminary step towards eliminating the realm.domain field
in favor of realm.subdomain. Includes a database migration to create
these for existing realms.
The command to render old messages now looks for all messages
not matching the bugdown version, and it no longer directly calls
into model code. We should still be extremely cautious about
using this code.
This adds support for running a Zulip production server with each
realm on its own unique subdomain, e.g. https://realm_name.example.com.
This patch includes a ton of important features:
* Configuring the Zulip sesion middleware to issue cookier correctly
for the subdomains case.
* Throwing an error if the user tries to visit an invalid subdomain.
* Runs a portion of the Casper tests with REALMS_HAVE_SUBDOMAINS
enabled to test the subdomain signup process.
* Updating our integrations documentation to refer to the current subdomain.
* Enforces that users can only login to the subdomain of their realm
(but does not restrict the API; that will be tightened in a future commit).
Note that toggling settings.REALMS_HAVE_SUBDOMAINS on a live server is
not supported without manual intervention (the main problem will be
adding "subdomain" values for all the existing realms).
[substantially modified by tabbott as part of merging]