The hope is that by having a shorter list of initial streams, we'll
avoid some potential confusion about the value of topics.
At the very least, having 5 streams each with 1 topic was not a good
way to introduce Zulip.
This commit minimizes changes to the message content in
`send_initial_realm_messages` to keep the diff readable. Future commits will
reshape the content.
There were several problems with the old format:
* The sender was not necessarily the message's sender; it was the
person who did the deletion (which could be an organization
administrator)
* It didn't include the ID of the sender, just the email address.
* It didn't include the recipient ID, instead having a semi-malformed
recipient_type_id under the weird name recipient_user_ids.
Since nothing was relying on the old behavior, we can just fix the
event structure.
We change the send_to_email_mirror management command to send messages
to the email mirror through the mirror_email_message function instead
of process_message. This makes the message follow a codepath similar to
emails sent into the mirror with the postfix configuration, which means
going through the MirrorWorker queue. The reason for this is to make
this command useful for testing the new email mirror rate limiter.
Closes #2420
We add rate limiting (max X emails within Y seconds per realm) to the
email mirror. By creating a RateLimitedRealmMirror class, inheriting
from RateLimitedObject, and a rate_limit_mirror_by_realm function,
following the mechanism used by rate_limit_user, we're able to have
this implementation mostly rely on the existing, proven-over-time
rate_limiter.py code. The rules are configurable in settings.py via
RATE_LIMITING_MIRROR_REALM_RULES, analogous to RATE_LIMITING_RULES.
Rate limit verification happens in the MirrorWorker in
queue_processors.py. We don't rate limit missed-message emails, because
their use of one-time addresses means they're not a spam threat.
test_mirror_worker is adapted to the altered MirrorWorker code, and a
new test, test_mirror_worker_rate_limiting, is added in
test_queue_worker.py to provide coverage for these changes.
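A minimal sketch of the structure described above; the RateLimitedObject
method names and the rate_limiter helpers used here are assumptions
based on this description, not the exact implementation:

```python
from django.conf import settings

# Interface assumed from the description; the real rate_limiter.py
# helpers may have different names/signatures.
from zerver.lib.rate_limiter import RateLimitedObject, incr_ratelimit, is_ratelimited

class RateLimitedError(Exception):
    # Hypothetical exception for the MirrorWorker to catch.
    pass

class RateLimitedRealmMirror(RateLimitedObject):
    def __init__(self, realm):
        self.realm = realm

    def key_fragment(self):
        # Namespaces the Redis keys per realm, mirroring rate_limit_user.
        return "emailmirror:%s" % (self.realm.id,)

    def rules(self):
        # A list of (seconds, max_emails) pairs, analogous to
        # RATE_LIMITING_RULES.
        return settings.RATE_LIMITING_MIRROR_REALM_RULES

def rate_limit_mirror_by_realm(recipient_realm):
    entity = RateLimitedRealmMirror(recipient_realm)
    if is_ratelimited(entity)[0]:
        raise RateLimitedError()
    incr_ratelimit(entity)
```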
We clean up test_mirror_worker for readability, and make it verify
that mirror_email gets called the correct number of times and uses a
correct rcpt_to address, so that the test doesn't fail when some
verification of the address is added in the following commits
implementing rate limiting in the email mirror.
Fixes #9840.
Old addresses caused bugs in some cases with non-Latin characters in
stream names (see the issue number above). We switch to using Django's
slugify helper function to convert stream names to pure ASCII, while
also getting rid of problematic non-alphanumeric characters in a
reasonable way. See Django's documentation for slugify for details on
how this function works.
Tests extended by tabbott to cover cases where we do end up with ASCII.
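For illustration, Django's slugify lowercases, decomposes accented
characters to ASCII, and replaces runs of non-alphanumeric characters
with hyphens; characters with no ASCII decomposition are dropped:

```python
from django.utils.text import slugify

slugify("Weekly Check-ins!")  # -> 'weekly-check-ins'
slugify("Café & Friends")     # -> 'cafe-friends'
slugify("日本語")              # -> '' (no ASCII decomposition, so nothing remains)
```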
To prepare for changing how the stream name gets encoded into mirror
email addresses while making sure old addresses keep working, we ignore
the stream_name part when receiving emails into the mirror and we only
look at the email_token to identify into which stream to mirror the
email.
I'm surprised that this wasn't a mypy error; we were passing a Realm
object as an integer, and predictably, this resulted in us
constructing a cache key that looked like this:
stream_by_realm_and_name:<Realm: zulip 1>:dd5...
Previously, these cache keys looked like:
:1:9c26164d3a393e316e0f8210efe270e08710d45astream_by_realm_and_name:...
Now, they look like this:
:1:9c26164d3a393e316e0f8210efe270e08710d45a:stream_by_realm_and_name:...
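Illustratively, the two fixes amount to the following (the helper names
below are a sketch; only the key formats come from the commits):

```python
# Per-deployment prefix, as in the examples above.
KEY_PREFIX = ":1:9c26164d3a393e316e0f8210efe270e08710d45a"

def stream_cache_key(realm_id, name_digest):
    # First fix: pass realm.id (an int) rather than the Realm object,
    # which stringified into the key as "<Realm: zulip 1>".
    return "stream_by_realm_and_name:%s:%s" % (realm_id, name_digest)

def full_cache_key(key):
    # Second fix: an explicit ':' between the prefix and the key.
    return KEY_PREFIX + ":" + key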
This avoids a bunch of duplicated calls to auth_enabled_helper for our
social auth backends, which added up because auth_enabled_helper can
take 100us to run.
See the comment, but this is a significant performance optimization
for all of our pages using common_context, because this code path is
called more than a dozen times (recursively) by common_context.
We have a few code paths that call get_realm_from_request multiple
times on the same request (e.g. the login page), once inside the view
function and once inside the common context processor code. This
change saves a useless duplicate database query in those code paths.
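A minimal sketch of the deduplication, assuming we memoize the result
on the request object (the attribute name is illustrative):

```python
from zerver.lib.subdomains import get_subdomain
from zerver.models import get_realm

def get_realm_from_request(request):
    # Memoize on the request so the view function and the common
    # context processor share a single database query.
    if not hasattr(request, "_cached_realm"):
        request._cached_realm = get_realm(get_subdomain(request))
    return request._cached_realm
```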
This block of code with 2 database queries is solely for the /devlogin
endpoint. Removing that block from the /login code path makes it
easier to test /login perf in development.
We never intended to render them for this use case as the result would
not look good, and now we have a convenient bugdown option for
controlling this behavior.
Since we're not storing the markdown rendering anywhere, there's
conveniently no data migration required.
Fixes#11889.
This renames references to user avatars, bot avatars, and organization
icons to profile pictures. The strings in the UI are updated,
in addition to the help files, comments, and documentation. Actual
variable/function names, changelog entries, routes, and S3 buckets are
left as-is in order to avoid introducing bugs.
Fixes#11824.
Follow-up to 92dc363. This modifies the ScheduledEmail model
and send_future_email to properly support multiple recipients.
Tweaked by tabbott to add some useful explanatory comments and fix
issues with the migration.
Apparently, our invalid realm error page had HTTP status 200, which
could be confusing and, in particular, broke our mobile app's error
handling for this case.
When soft deactivation is run in "auto" mode (no emails are
specified and all users inactive for the specified number of days are
deactivated), catch-up is also run in "auto" mode if
AUTO_CATCH_UP_SOFT_DEACTIVATED_USERS is True.
Automatically catching up soft-deactivated users periodically would
ensure a good user experience for returning users, but on some servers
we may want to turn off this option to save on some disk space.
Fixes#8858, at least for the default configuration, by eliminating
the situation where there are a very large number of messages to recover.
A user who has been soft deactivated for a long time might have tens of
thousands of messages of history that were "soft deactivated". It might
take a minute or more to add UserMessage rows for all of these
messages, causing timeouts. So, we paginate the creation of these
UserMessage rows.
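A hedged sketch of the paginated backfill; the helper shape and batch
size here are assumptions:

```python
from zerver.models import UserMessage

BATCH_SIZE = 1000  # hypothetical page size

def add_missing_messages(user_profile, missing_message_ids):
    # Create UserMessage rows one page at a time, so no single query
    # has to insert tens of thousands of rows at once.
    for start in range(0, len(missing_message_ids), BATCH_SIZE):
        batch = missing_message_ids[start:start + BATCH_SIZE]
        UserMessage.objects.bulk_create(
            [UserMessage(user_profile=user_profile,
                         message_id=message_id,
                         flags=0)
             for message_id in batch])
```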
This logic for passing through whether the user was logged in never
worked, because we were trying to read the client.
Fix this, and add tests to ensure it never breaks again.
Restructured by tabbott to have completely different code with the
same intent.
Fixes#11802.
Previously, the LDAP authentication model ignored the realm-level
settings for who can join a realm. This was sort of reasonable at the
time, because the original LDAP auth was an SSO solution that didn't
allow multiple realms, and so one could fully configure authentication
settings on the LDAP side. But now that we allow multiple realms with
the LDAP backend, one could easily imagine wanting different
restrictions on them, and so it makes sense to add this enforcement.
This field is primarily intended to let us avoid displaying the
"more topics" feature in new organizations and streams, where we might
know that all messages in the stream are already available in the
browser.
Based on original work by Roman Godov, and significantly modified by
tabbott.
The second migration involved here could be expensive on Zulip Cloud,
but is unlikely to be an issue on other servers.
The actual bug in #11791 was caused by code reverted in
3ed85f4cd7, so technically #11791 is
already fixed. However, it makes sense to add tests to ensure that it
doesn't regress in the future as part of closing out the issue.
Fixes #11791.
Apparently, our new validator for stream colors having a valid format
incorrectly handled colors with duplicated hex digits in them.
(This was caused in part by the spectrum.js logic automatically
converting #ffff00 to #ff0, which our validator rejected.) Given that
we had old stream colors in the #ff0 format in our database anyway for
legacy reasons, there's no benefit to banning these colors.
In the future, we could imagine standardizing the format, but doing so
will require also changing the frontend to submit colors only in the
6-character format.
Fixes an issue reported in
https://github.com/zulip/zulip/issues/11845#issuecomment-471417073
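A sketch of a validator that accepts both forms (the actual validator
in Zulip may differ):

```python
import re

HEX_COLOR_RE = re.compile(r"^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")

def check_color(color: str) -> bool:
    # Accepts both '#ff0' (spectrum.js shorthand) and '#ffff00'.
    return HEX_COLOR_RE.match(color) is not None
```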
According to GitHub's webhook docs, the scope of a membership
event can only be limited to 'teams', which holds true when a
new member is added to a team. However, we just found a payload
in our logs that indicates that when a user is removed from a
team, the scope of the membership is erroneously set to
'organization', not 'team'. This is most likely a bug on
GitHub's end because such behaviour is a direct violation of
their webhook API event specifications. We account for this
by explicitly restricting membership events to teams, at least
until GitHub's docs suggest otherwise.
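In sketch form, the workaround is just an explicit scope check (the
function name is ours; the scope field comes from GitHub's payloads):

```python
def should_handle_membership_event(payload):
    # GitHub documents 'team' as the only valid scope for membership
    # events, but removals have been observed arriving with scope
    # 'organization'; we handle team-scoped events only.
    return payload.get("scope") == "team"
```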
Now that we've more or less stabilized our authentication/registration
subsystem how we want it, it seems worth adding proper documentation
for this.
Fixes #7619.
Addresses point 2 of #10612. We use a regex to detect whether a form
of FWD indicator is present at the beginning of the subject, which
means the message has been forwarded.
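A hedged sketch of such a check; the exact pattern Zulip uses may
differ:

```python
import re

# Matches common forwarded-subject prefixes like "Fwd:", "FW:", "Forwarded:".
FORWARDED_SUBJECT_RE = re.compile(r"^\s*(fwd?|forward(ed)?)\s*:", re.IGNORECASE)

def is_forwarded(subject: str) -> bool:
    return FORWARDED_SUBJECT_RE.match(subject) is not None

is_forwarded("Fwd: meeting notes")  # True
```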
A remove_quotations argument is added to a couple of functions where
it's necessary.
In filter_footer, the criterion for a line to be a possible beginning
of a footer is changed to line.strip() == "--", instead of
line.strip().startswith("--"), because the latter would remove
quotations from plaintext emails. This change makes sense because
RFC 3676 specifies "-- " as the separator line between the body
and the signature of a message:
https://tools.ietf.org/html/rfc3676
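In sketch form (the real filter_footer has more structure):

```python
def filter_footer(text: str) -> str:
    lines = text.split("\n")
    for i, line in enumerate(lines):
        # The RFC 3676 signature separator is "-- "; stripping leaves "--".
        # The stricter equality check avoids eating quoted text that merely
        # starts with dashes, e.g. "-----Original Message-----".
        if line.strip() == "--":
            return "\n".join(lines[:i]).rstrip()
    return text
```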
We remove the 'subject' argument of process_stream_message and make
subject processing happen inside the function, as it's a more
appropriate place than the general process_message function and is
needed to have a good way of disabling removing quotations in forwarded
emails sent into the mirror.
This used to have a single test function, test_email_subject_stripping,
which would run through a sizeable list of example subjects from the
subjects.json fixture, form an email with each subject, send it to the
email mirror, and check whether the resulting stream message has a
correctly stripped topic. That took too much time, because we ran
through the entire process_message and most_recent_message codepaths
many times.
We change the way of testing to:
1. Ensure process_message applies subject stripping (only need to run
process_message twice here)
2. Test the strip_from_subject function separately, on all the examples
from the subjects.json fixture. This is very fast.
Some URLs that end with image file extensions (e.g. .jpg) may link to
HTML pages. This adds handling for linx.li, wikipedia.org and
pasteboard.co. Where possible, we redirect to the actual image URL;
otherwise, we do not attempt to render it as an image.
Fixes #10438.
When a Zephyr user deactivates their account, they should be
automatically turned into a mirror dummy user (so that other users can
continue to interact with them as normal for a Zephyr user who isn't
using Zulip).
Fixes part 3 of #10612. When sending an email to the email mirror to a
stream address, if "+show-sender" is added to the address, the stream
message will now include "From: <sender>" at the top.
The test_events system was in several tests using get_realm to fetch a
realm object, rather than accessing self.user_profile.realm. This
created subtle problems where we were neither directly editing nor
refreshing the `realm` object associated with our UserProfile object
from the database after the `do_*` methods.
The payoff for this is we can update the previously confused
`do_change_icon_source` test to actually change the state and have the
correct result.
This reverts commit ff90c0101c but keeps
the test cases added for reference.
This was reverted because it was both not a clean solution and created
other realm filter bugs involving dashes (etc.).
In commit de65a04 we can see that if the need ever arises to modify
how stream descriptions are rendered, we would need to make changes
at 5 different call points, which can be quite cumbersome. So this
functionality has been extracted into a new method called
'render_stream_descriptions'.
Earlier, the behavior was to raise an exception, thereby stopping the
whole sync. Now we log an error message and skip the field. Also
fixes the `query_ldap` command to report missing fields without
error.
Fixes: #11780.
This fixes an issue where an invalid emoji name prevented subsequent
emoji from rendering.
This reverts the code change in
8842349629, while still passing the
tests added in that commit (it seems the original commit had
misdiagnosed an ordering bug and thus introduced this issue).
Fixes: #11770.
The night logo synchronization on the settings page was perfect, but
the actual display logic had a few problems:
* We were including the realm_logo in context_processors, even though
it is only used in home.py.
* We used different variable names for the templating in navbar.html
than anywhere else in the codebase.
* The behavior that the night logo would default to the day logo if
only one was uploaded was not correctly implemented for the navbar
position, either in the synchronization code for updates or in the
logic in the navbar.html template.
This commit leverages the Aho-Corasick algorithm to build a set of
user_ids that have their alert_words present in the message. It runs
in time linear in the length of the input message, as opposed to the
number of alert_words. This is after building an ahocorasick Automaton,
which takes O(number of alert_words in the entire realm) and is usually
cached.
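A hedged sketch of the approach using the pyahocorasick library; the
word-to-user_ids mapping is illustrative, and word-boundary handling is
omitted:

```python
import ahocorasick

def build_alert_words_automaton(alert_words_by_user):
    automaton = ahocorasick.Automaton()
    for user_id, words in alert_words_by_user.items():
        for word in words:
            word = word.lower()
            # Accumulate the set of user_ids interested in each word.
            if automaton.exists(word):
                automaton.get(word).add(user_id)
            else:
                automaton.add_word(word, {user_id})
    automaton.make_automaton()  # built once per realm, then cached
    return automaton

def users_with_alert_words(automaton, content):
    # A single linear pass over the message content.
    return {user_id
            for _, user_ids in automaton.iter(content.lower())
            for user_id in user_ids}
```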
This fixes an issue where blank lines between blocks were causing
auto-numbering of lists to stop before the blank line, resulting
in two separate numbered lists instead of one.
Edited significantly by tabbott to explain the tricky details in the
comments.
Fixes: #11651.
Add a `max_int_size` parameter to `to_non_negative_int()` in
decorator.py so it will be able to validate that the integer doesn't
exceed a specified maximum.
Fixes #11451
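In sketch form (the default limit shown here is an assumption):

```python
def to_non_negative_int(s: str, max_int_size: int = 2**32 - 1) -> int:
    x = int(s)
    if x < 0:
        raise ValueError("argument is negative")
    if x > max_int_size:
        raise ValueError("%d is too large (max %d)" % (x, max_int_size))
    return x
```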
This is important for situations such as with our Zapier app,
where the requesting user may be a bot that would like to access
its owner's subscriptions.
Tweaked by tabbott to eliminate the 2^N growth of cases in
do_get_streams.
Tests now run in 7.649s, down from 9.297s. And this test works just as
well with 3 bots, since only 3 database queries with 3 bots confirms
we're not doing linear queries in the number of bots in the
organization.
We want to use the baseline features of bugdown, but not fancy things
like inline URL previews, since the whole structure of stream
descriptions is to have a single-line thing supporting some
formatting.
The migration part of this change fixes a bug encountered by some
organizations upgrading from older versions of Zulip.
This allows us to have some features using bugdown rendering where
inline image previews will not be rendered (which would be problematic
for e.g. stream descriptions).
Guest users will just get an empty list of default streams; we also
hide the "Default streams" organization view in the UI for guest users.
This is for consistency with not providing guest users the full list
of streams in an organization.
Fixing this involves fixing the backend to handle unchanged field
submissions of the Zoom credentials without trying to re-validate the
credentials (for performance) as well as to fetch the already-sent
secret.
Visually, #zoom_help_text acts like
.organization-settings-parent div:first-of-type when the Zoom option
is selected, but isn't treated as such.
No visual change with the #google_hangouts_domain change; just there to make
the code more readable/defensible.
This avoids a spurious permission error inside the Postgres
`resolve_symlinks` function if we don’t have access to the current
working directory (e.g. we’re running with cwd /root inside `su
zulip`).
While we’re here, add a defensive `--` argument.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This saves about 8% of the runtime of our total response middleware,
or equivalently close to 2% of the total Tornado response time, which
is pretty significant given that we're not sure anyone is using statsd
in production.
It's also useful outside Tornado, but the effect is particularly
significant because of how important Tornado performance is.
This avoids parsing these functions on every request, which was
adding roughly 350us to our per-request response times.
The overall impact was more than 10% of basic Tornado response
runtime.
We'll use this in the push-notifications code, in a context where
there should definitely already be UserMessage rows if everything's
gone normally... but explicitly checking at the top seems like the
right pattern from a secure-coding perspective.
When a bunch of messages with active notifications are all read at
once -- e.g. by the user choosing to mark all messages, or all in a
stream, as read, or just scrolling quickly through a PM conversation
-- there can be a large batch of this information to convey. Doing it
in a single GCM/FCM message is better for server congestion, and for
the device's battery.
The corresponding client-side logic is in zulip/zulip-mobile#3343.
Existing clients today only understand one message ID at a time, so we
accommodate them by sending individual GCM/FCM messages up to an
arbitrary threshold, with the rest only as a batch.
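A hedged sketch of that strategy; the threshold and payload field names
are assumptions, not the actual notification format:

```python
MAX_INDIVIDUAL_REMOVES = 2  # hypothetical compatibility threshold

def build_remove_payloads(message_ids):
    # Older clients understand only a single ID per notification, so the
    # first few IDs go out individually...
    payloads = [{"event": "remove", "zulip_message_id": message_id}
                for message_id in message_ids[:MAX_INDIVIDUAL_REMOVES]]
    # ...and the full set goes out once as a single batched payload.
    payloads.append({
        "event": "remove",
        "zulip_message_ids": ",".join(str(i) for i in message_ids),
    })
    return payloads
```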
Also add an explicit test for this logic. The existing tests
that happen to cause this function to run don't exercise the
last condition, so without a new test `--coverage` complains.
Adblock Plus's "Block social media icons tracking" setting blocked
integration logos for social media platforms from loading, so the logos
are renamed to bypass this.
Fixes #11590.
This change was prompted by a possible bug on Clubhouse's end. In
general, if a branch is added, it also prompts a workflow state
transition in its primary story.
However, our webhook error logs show an erroneous payload where
the same branch is added to another story, but there is no
workflow state transition. Also, both the stories are grouped in
the same payload, which confused/broke our code. Ideally, this
shouldn't happen and is most likely a bug on Clubhouse's end.
Changes included in Clubhouse payloads rarely pertain to more than
one parent entity (story) simultaneously, and we usually operate
under the assumption that the changes included therein are related
to each other in terms of their parent object (story or epic) and
not a child object (the GitHub branch).
A check suite is a collection of check runs. We care a lot more
about the outcomes of check runs in this case because check_run
payloads are a lot more informative than check_suite payloads.
(And in any case, the check_suite events are primarily for notifying
tools like CI to run checks).