An estimated traffic of 0 suggests a stream is dead, and has pretty
different semantics from any non-zero value. So we should round up any
number between 0 and 1 to 1.
We don't ever use this value, but it's confusing to have the incorrect
calculation in the code.
Ideally we would set this to "None", but I don't know the code well enough
to be confident nothing would break.
We've for a long time had the behavior that a bot mentioned in a
stream message receives the notification, regardless of whether the
bot was actually subscribed to the stream.
Apparently, this behavior also triggered if you mentioned a bot in a
private message (i.e. the bot would be delievered the private message
and would probably respond unhelpfully in a new group private message
thread with the PMs original recipients plus the bot).
The fix for this bug is simple: To exclude this feature for private
messages.
This fixes an issue where if you make #announce (the default
announcement stream) announce-only, then creating a new stream will
throw an exception (because notification-bot can't send there).
Fixes#9636.
The user can now specify the value while creating a stream.
An admin can later change it via `Change stream permissions`
modal. Add is_announcement_only to subscription type text.
For some reason in my original version I was sending both
content and data to the client for submessage events,
where data === JSON.parse(content). There's no reason
to not just let the client parse it, since the client
already does it for data that comes on the original
message, and since we might eventually have non-JSON
payloads.
The server still continues to validate that the payload
is JSON, and the client will blueslip if the server
regressses and sends bad JSON for some reason.
API users, particularly bots, can now send a field
called "widget_content" that will be turned into
a submessage for the web app to look at. (Other
clients can still rely on "content" to be there,
although it's up to the bot author to make the
experience good for those clients as well.)
Right now widget_content will be a JSON string that
encodes a "zform" widget with "choices." Our first
example will be a trivia bot, where users will see
something like this:
Which fruit is orange in color?
[A] orange
[B] blackberry
[C] strawberry
The letters will be turned into buttons on the webapp
and have canned replies.
This commit has a few parts:
- receive widget_content in the request (simply
validating that it's a string)
- parse the JSON in check_message and deeply
validate its structure
- turn it into a submessage in widget.py
This should significantly improve the user experience for creating
additional accounts on zulipchat.com.
Currently, disabled in production pending some work on visual styling.
We extract the entire operations of the management command to a
function create_if_missing_realm_internal_bots in the
zerver/lib/onboarding.py. The logic for determining if there are any realm
internal bots which have not been created is extracted to a function
missing_any_realm_internal_bots in actions.py.
If a user's account has been deactivated, we want to provide a special
error message that makes clear what's going on.
Future work is to provide some administrative controls on whether a
user should be able to re-activate their account.
The only slash command implemented in this initial
version is an extremely crippled version of a
"/stats" slash command that reports that you are
running 1 server.
- do_change_is_admin now raises AssertionError when a non-admin
permission is given.
- adds test to test_users to ensure admin asserts on invalid
permission values.
It makes sense to refactor out the last_reminder logic out of
send_pm_if_empty_stream and have a generic function that can send
rate-limited PM notifications to a bot owner and can be used by
methods other than send_pm_if_empty_stream.
We send add events on upload, update events when sending a message
referencing it, and delete updates on removal.
This should make it possible to do real-time sync for the attachments
UI.
Based in part on work by Aastha Gupta.
A typo in my reading of 6cc2e8bbff meant
that we were incorrectly doing database queries for each Service
object, just to get the user_profile.id, which we already had.
This eliminates the need to call user_ids_to_users inside the
get_service_dicts_for_bots code path, saving a database query.
This completes my refactor to fix backend performance issues in this
code path. Previously, our messy layering of queries that resulted in
Zulip doing work even if none of the bots actually had Services or
config_data.
This commit adds a new field history_public_to_subscribers to the
Stream model, which serves a similar function to the old
settings.PRIVATE_STREAM_HISTORY_FOR_SUBSCRIBERS; we still use that
setting as the default value for new streams to avoid breaking
backwards-compatibility for those users before we are ready with an
actual UI for users to choose directly.
This also comes with a migration to set the value of the new field for
existing streams with an algorithm matching that used at runtime.
With significant changes by Tim Abbott.
This is an initial part of our efforts on #9232.
The removed code path was only needed due to buggy setup code in the
test_cross_realm_scenarios test. We address that with a less buggy
workaround, and which lets us remove unnecessary complexity from this
important validation function.
Thanks for Umair Waheed for some preliminary work on this.
Fixes#7561.
Add realm setting to set time limit for message deleitng.
Set default value of message_content_delete_limit_seconds
to 600 seconds(10 min).
Thanks to Shubham Dhama for rebasing and reworking this. Some final
edits also done by Tim Abbott.
Fixes#7344.
This is a simple computed field. It's intended to more clearly
capture the meaning of this restriction for the users in zephyr mirror
realms, and eventually support guest user accounts in normal Zulip
realms.
This is part of the effort to remove the use of is_zephyr_mirror_realm
across the code path for situations that might be relevant for other
users. It helps keep the code readable.
This commit sends the event for renaming of a private stream to
organization admins of the realm, in addition to the obvious list of
subscribers of the private stream.
Normally, admins can manage a private stream (e.g. unsubscribing a
user). But when the admin tried to unsubscribes a user from a
previously renamed stream, we previously were throwing a JS error, as
the webapp hadn't been notified about the new stream name.
Fixes#9034.
Remove models.get_user_profiles_by_ids() and
obtain user's bots profiles in actions.get_service_dicts_for_bots() by
users.user_ids_to_users() instead of models.get_user_profiles_by_ids().
Fixes#8939
Implement few optimizations for reading admin's bot dicts from database
for a constants number of requests:
- add models.get_user_profiles_by_ids() for reading bots profiles
by single query from database
- add models.get_services_for_bots() for reading services for bots
by single query from database
- add bot_config.get_bot_configs() for reading config data for bots
by single query from database
Fixes#8838
Fixes#8853.
In certain cases, the browser is not able to look up the message.
Include the recipient data for the message in the delete_message event,
so look up of those attributes by the browser isn't required.
This bot was basically a duplicate of NOTIFICATION_BOT for some
specific corner cases, and didn't add much value. It's better to just
eliminate it, which also removes some ugly corner cases around what
happens if the user account doesn't exist.
webhook-errors.log file is cluttered with Stream.DoesNotExist
errors, which hides the errors that we actually need to see. So,
since check_message already sends the bot_owner a PM if the webhook
bot tries to send a message to a non-existent stream, we can ignore
such exceptions.
Add function in user-groups.py for getting member ids
for a group.
Update view to enforce checks for modifying user-groups.
Only admins and user group members can modify user-groups.
There were two instances of `ensure_stream` being called and assigned to
a variable with the variable not being used elsewhere. pyflakes picked
up on this (where it didn't in the previous version likely due to tuple
unpacking), so the the variable assignment has been replaced with a call
to `ensure_stream`.
Issue #2088 asked for a wrapper to be created for
`create_stream_if_needed` (called `ensure_stream`) for the 25 times that
`create_stream_if_needed` is called and ignores whether the stream was
created. This commit replaces relevant occurences of
`create_stream_if_needed` with `ensure_stream`, including imports.
The changes weren't significant enough to add any tests or do any
additional manual testing.
The refactoring intended to make the API easier to use in most cases.
The majority of uses of `create_stream_if_needed` ignored the second
parameter.
Fixes: #2088.
This commit migrates realm emoji to be addressed by their `id` rather
than their name. This fixes a long standing issue which was causing
an error on uploading an emoji with same name as a deactivated realm
emoji.
Fixes: #6977.
We disabled the original "colorized HTML edit-history" feature way
back in 2013 in c51056ff8e.
That original feature involved showing what had been edited inline in
message bodies, so one could easily see what had been changed.
That old feature has since been replaced with the "view edit history"
menu option, and we're unlikely to ever want the old feature back.
So, we can just remove its code. There's a few supporting variables
that were created to help implement this; we can clean those up and
simplify the `update_message` code now that this feature is fully
removed.
This applies only on a server open for anyone to create a realm.
Moreover, if the server admins have granted any given realm a
max_invites greater than the default, that realm is exempt too.
This fixes an unpleasant regression in
f5edeb01ae, where we stopped correctly
filtering users who have an open browser session that's idle. These
users are tagged as "UserPresence.IDLE" with an current timestamp in
the database, and should be treated as idle for presence purposes.
As a result, if you had an open Zulip browser session, you incorrectly
wouldn't get missed-message emails for PMs and mentions before this fix.
Currently, when other private stream subscriber add realm admin to
stream, new copy private stream is created in realm admin's streams.
Which resulted in error, cause there are two similar stream element
in stream settings.
If new subscriber is added to private stream, we first send them
stream `create` event, cause private stream are not visible until
user don't get subscribed at least once. But realm admins can now
always access private stream, so when realm admin is subscribed to
stream, realm admin get stream `create` event even if stream already
exist in on realm admin client side.
Fix this by extracting realm admins from stream `create` event on
`add` subscription operation and sending private stream `create`
event to all realm admins on stream creation operation.
Fixes#8695
This commit adds a generic function called check_send_webhook_message
that does the following:
* If a stream is specified in the webhook URL, it sends a stream
message, otherwise sends a PM to the owner of the bot.
* In the case of a stream message, if a custom topic is specified
in the webhook URL, it uses that topic as the subject of the
stream message.
Also, note that we need not test this anywhere except for the
helloworld webhook. Since helloworld is our default example for
webhooks, it is here to stay and it made sense that tests for a
generic function such as check_send_webhook_message be tested
with an actual generic webhook!
Fixes#8607.
This will allow realm admins to access subscribers of unsubscribed
private stream. This is a preparatory commit for letting realm admins
remove those users.
Apparently, we did essentially all the work to support showing full
topic history to newly subscribed users from a data flow perspective,
but didn't actually enable this feature by having the topic history
endpoint grant access to historical topics. This fixes that gap.
I'm not altogether happy with how the code and tests read for this
feature; the code itself has more duplication than I'd like, and the
tests do too, but it works.
If new private stream is created by realm admin without realm admin
subscribed to it, then it doesn't automatically add created stream to
realm admin's stream list. We have to reload the browser to get newly
created stream in stream list. Cause private stream creation event is
only sent to the subscribed users to private stream, so even if realm
admin is acting user, they don't get creation event.
We should send private stream creation event to realm admin users along
with subscribed user to stream, as realm admins can access unsubscribed
private streams.
Tweaked by tabbott to fix various typos and clean up the code.
Adds realm_bot delete event. On bot ownership change, add event is
sent to the bot_owner(if not admin) and delete event to the
previous bot owner(if not admin). For admin, update event is sent.
This sets up a new test class with a simple
test, mostly for increasing coverage. The class
should in the future be extended to properly
verify the handle_feedback() logic.
Previously, when a user updated the config data of an
embedded bot, only the updated fields were dispatched
back to the client. Dispatching all config data fields
makes the client updating code less brittle.
models.py should only contain thin wrapper functions. Furthermore,
this move allows us to remove the circular imports. The two moved
functions are interdependent and are thus moved in one commit.
Creating the very first organization administrator user and
subscribing them to streams before any messages were sent resulted in
RealmAuditLog entries being created with a `event_last_message_id` of
None, because that's the maximum ID in the empty set.
We correct this by fixing the incorrectly created RealmAuditLog
entries, both for new servers and also fixing old broken entries on
existing servers.
This fixes an issue where if a user setup a Zulip server with just the
organization administrator, and then forgot about it (so that the
initial user became soft-deactivated), trying to sign in 3 weeks later
would throw an exception.
This fixes the issue reported here:
https://chat.zulip.org/#narrow/stream/9-issues/subject/500.20error.20on.20login/near/511981
This adds button under "Organization profile" settings, which
deactivates the organization and sends an "event" to all the
active user and log out them.
Fixes: #8212.
This is based on usage in bulk_change_user_names.py, and that
the RealmAuditLog acting_user field is Optional[UserProfile].
This could be more meaningfully changed in future, perhaps to
indicate that the command was run by a specific zulip user.
In order to get test coverage on topic name checks, we
do them in Addressee, so that we don't hit an assertion
first. The assertion in question is in Addressee.topic(),
and it was added partly to appease mypy.
Adds a check for newline that was present on backend, but missing in the
frontend markdown implementation. Updating messages uses is_me_message flag
received from server instead of its own partial test. Similarly, rendering
previews uses markdown code.
Fixes#6493.
This is the first step for allowing users
to edit a bot's service entries, name the
outgoing webhook configuration entries. The
chosen data structures allow for a future
with multiple services per bot; right now,
only one service per bot is supported.
We add two functions:
1.) check_schedule_message(): This function is responsible for
doing the essential initial checkes to verify the validity of
the message. These checkes include things like if user is
allowed to send messages to some stream or not or if the user is
a super_user. All this is basically done by further calling
check_message() with appropriate parameters. This is on the same
lines as is check_send_message().
2.) do_schedule_messages(): This function is responsible for
creating ScheduleMessage table rows for a list of messages that
are to be scheduled. This basically accumulates the ScheduleMessage
objects in a list and then bulk creates the rows.
The original logic is buggy now that emails can belong to (and be
invited to) multiple realms.
The new logic in the `invites` queue worker also avoids the bug where
when the PreregistrationUser was gone by the time the queue worker got
to the invite (e.g., because it'd been revoked), we threw an exception.
[greg: fix upgrade-compatibility logic; add test; explain
revoked-invite race above]
This code changes frequently enough that errors are bound to creep in. The
main change is that this sends the original invitation email instead of the
reminder email, but I think that's fine.
[Modified by greg to (1) keep `USERNAME_FIELD = 'email'`,
(2) silence the corresponding system check, and (3) ban
reusing a system bot's email address, just like we do in
realm creation.]
As we migrate to allow reuse of the same email with multiple realms,
we need to replace the old "no email reuse" validators. Because
stealing the email for a system bot would be problematic, we still ban
doing so.
This commit only affects the realm creation logic, not registering an
account in an existing realm.
We would allow a user with a valid invitation for one realm to use it
on a different realm instead. On a server with multiple realms, an
authorized user of one realm could use this (by sending invites to
other email addresses they control) to create accounts on other
realms. (CVE-2017-0910)
With this commit, when sending an invitation, we record the inviting
user's realm on the PreregistrationUser row; and when registering a
user, we check that the PregistrationUser realm matches the realm the
user is trying to register on. This resolves CVE-2017-0910 for
newly-sent invitations; the next commit completes the fix.
[greg: rewrote commit message]
This fixes some subtle JavaScript exceptions we've been getting in
zulipchat.com, caused by the system bot realm there not being "zulip"
interacting with get_cross_realm_users.
This should help protect us from future issues with the way that
`bulk_get_users` does caching.
It's likely that we'll want to further restructure `bulk_get_users` to
not have this base_query code path altogether (since it's kinda
buggy), but I'm going to defer that for a time when we have another
user.
We include ERROR_BOT in this set, even though it's not technically
cross-realm (it just lives in the admin realm).
This code path does not correctly handle emails that correspond to
multiple accounts (because `get_system_bot` does not). Since it's
intended to only be used by system bots, we add an appropriate
assertion to ensure it is only used for system bots.
Previously, this was a ValidationError, but that doesn't really make
sense, since this condition reflects an actual bug in the code.
Because this happened to be our only test coverage the ValidationError
catch on line 84 of registration.py, we add nocoverage there for now.
This fixes a bug where, when a user is unsubscribed from a stream,
they might have unread messages on that stream leak. While it might
seem to be a minor problem, it can cause significant problems for
computing the `unread_msgs` data structures, since it means we need to
add an extra filter for whether the user is still subscribed, either
in the backend or in the UI.
Fixes#7095.
This often can cause minor caching problems.
Obviously, it'd be better if we had access to the AST and thus could
do this rule for UserProfile objects in general.
This endpoint will allow us to add/delete emoji reactions whose emoji
got renamed during various emoji infra changes. This was also a
required change for realm emoji migration.
This commit was tweaked significantly by tabbott for greater clarity
(with no changes to the actual logic).
In remove_members_from_group_backend, we are passing user group to
remove_members_from_user_group. In remove_members_from_user_group,
expect user_group_id.
This fixes a regression in ae5ba7f4fd,
where Zulip would 500 if the newly added system bots didn't exist on
the server.
This also fixes a moderate size performance problem where we'd fetch 5
users from memcached or the database in a loop.
Generally emails are not written with markdown in mind and hence
sometimes render in strange ways. This commit fixes a particular
issue that was causing whitespace before paragraphs to be treated
as code block due to which email content was being rendered in a
box that scrolls in right direction a lot.
Fixes: #7045.
The original PR to allow generic bots to be mentioned had
some merge issues that we detected about a week after the
fact. This commit restores the logic from the original PR.
The reason we didn't detect this bug earlier is that the
merge issues didn't break any existing behavior. Instead,
they made it so that only UserMessage rows got written for
bots, but no events were being set. The part of the commit
that got lost is restored here, so now events get sent as
well.
Thanks to @derAnfaenger for reporting this and being patient
as we tracked it down.
Fixes#7140
We extract get_bulk_stream_subscriber_info() from this
function to remove some of the complexity. Also, in that
new function we avoid a hop to the database by querying
on stream ids instead of recipient ids. The query that
gets changed here does require a join to the recipient
table (to get the stream id), so it's a little bit of a
tradeoff.
There's an implicit assumption in bulk_remove_subscriptions
that all users belong to the same realm. We use the realm
for things like comparing occupied streams before and
after our main operation of deactivating streams.
Before this change, we just used the user_profile variable
that leaked from some prior loop to look up the realm, which
was super brittle.
Now we're a bit more explicit.
Note that this code leads to a slightly different query, because
we join to one row in the small Recipient table to match
stream_id to recipient.type_id.
The first method we extract to this library is
get_active_subscriptions_for_stream_id().
We also move num_subscribers_for_stream_id() to here, which
is slightly annoying (having the method on Stream was nice)
but avoids some circular dependency issues.
This extraction moves all the huddle logic into models.py, which
hopefully can reduce friction for things like re-organizing our
caches (there are two cache entries for every huddle) and/or
just putting huddle_id on Message directly.
Do you call get_recipient(Recipient.STREAM, stream_id) or
get_recipient(stream_id, Recipient.STREAM)? I could never
remember, and it was not very type safe, since both parameters
are integers.
Almost all callers to do_create_user were trying to
create active users, except for one test. The
active=False codepath was kind of broken (things
like sending welcome messages had sort of undefined
behavior there), so instead of trying to maintain it,
we just update the one test (`test_people`) to flip the
`is_active` flag manually.
Fixes#7197
Lets administrators view a list of open(unconfirmed) invitations and
resend or revoke a chosen invitation.
There are a few changes that we can expect for the future:
* It is currently possible to invite an email that you have already
invited, it might make sense to change this behavior.
* Resend currently sends an invite reminder instead of resending the
original invite, this is because 'custom_body' was not stored when
the first invite was sent.
Tweaked in various minor ways, primarily in the backend, by tabbott,
mostly for style consistency with the rest of the codebase.
Fixes: #1180.
Tweaked by tabbott to have the field before the invitation is
completed be called invite_as_admins, not invited_as_admins, for
readability.
Fixes#6834.
This change allows normal bots to get UserMessage rows when
they are mentioned on a stream, even if they are not actually
subscribed to the stream.
Fixes#7140.
We now find all (possibly) relevant service bots for a message
in the call to get_recipient_info. This allows us to eliminate
some code that would patch them after we rendered.
The get_service_bot_events() function will ignore any service
bots that weren't actually mentioned in the message (due to
backticks) or part of the active user ids.
We now have a MentionData class that encapsulates
the users who are possibly mentioned in a message.
Not that the rendering code may not keep all the mentions,
since things like backticks will suppress the mention.
We populate this now in do_send_messages, so that we can use
the info earlier in the message-sending process. This info
now gets passed down the call stack as an optional parameter.
Note that bugdown.convert() still populates the data when its
callers decline to pass in a MentionData object.
This is mostly a preparatory commit, as we don't take advantage
of the data yet in do_send_messages.
In do_send_messages, we only produce one dictionary for
the event queues, instead of different flavors for text
vs. html. This prevents two unnecessary queries to the
database.
It also means we only put one dictionary on the "message"
event queue instead of two, albeit a wider one that has
some values that won't be sent to the actual clients.
This wider dictionary from MessageDict.wide_dict is also
used for the `feedback_messages` queue and service bot
queues. Since the extra fields are possibly useful down
the road, and they'll just be ignored for now, we don't
bother to remove them. Also, those queue processors won't
have access to `content_type`, which they shouldn't need.
Fixes#6947
Before this change, we populated two cache entries for each
message that we sent. The entries were largely redundant,
with the only difference being whether we sent the content
as raw markdown or as the rendered HTML.
This commit makes it so we only have one cache entry per
message, and it includes both content and rendered_content.
One legacy source on confusion here is that `content`
changes meaning when you're on the front end. Here is the
situation going forward:
database:
content = raw
rendered_contented = rendered
cache entry:
content = raw
rendered_contented = rendered
payload for the frontend:
content = raw (for apply_markdown=False)
content = rendered (for apply_markdown=True)
Wherever possible, we always want to move checking for error
conditions to the views code, so that we don't need to worry about
handling failures with (in this case) a user that's half-created
because a DefaultStreamGroup doesn't exist.
This effectively implements the feature of default stream groups,
except for a UI, nice styling, etc.
Note that we're careful to not have this do anything in an
organization that doesn't have any default stream groups.
Add this field to the Stream model will prevent us from having
to look at realm data for several types of stream operations, which
can be prone to either doing extra database lookups or making
our cached data bloated.
Going forward, we'll set stream.is_zephyr to True whenever the
realm's string id is "zephyr".
Since subscribed_to_stream is only doing an id lookup
on the Stream model to find out if a user is subscribed to
a stream, there's no reason to require a full Stream object.
It's currently the case that all callers do have full Stream
objects handy to pass in to this function, but it's still a
good practice to have functions only ask for objects that they
need.
We now return user_ids for subscribers to streams in add-stream
events. This allows us to eliminate the UserLite class for
both bulk adds and bulk removes. It also simplifies some JS
code that already wanted to use user_ids, not emails.
Fixes#6898
Using lightweight objects will speed up adding new users
to realms.
We also sort the query results, which lets us itertools.groupby
to more efficiently build the data structure.
Profiling on a large data set shows about a 25x speedup for this
function, and before the optimization, this function accounts
for most of the time spend in bulk_add_subscriptions.
There's a lot less memory to allocate. I didn't measure
the memory difference.
When we test-deployed this to chat.zulip.org, we got about a 6x
speedup.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.
This effectively boils down to sending notifications to newly
mentioned users. The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person. If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.
We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.
The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.
We can expect a couple changes in the future:
* Requirements may change here, and it might make sense
to send offline notifications on the update side even
in circumstances where the original message had a
notification.
* We may track more notifications in a DB model, which
may simplify our short-circuit logic.
In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
Postgres doesn't like them, we don't have an obvious way to escape
them, and they tend to be sent by buggy tools where it'd be better for
the user to get an error.
This fixes a 500 we were getting occasionally.
We have two different concepts of "idle", and this function
is based on the "presence" aspect of idleness. There is also
idleness in terms of a user having no current client
descriptors accepting messages, and we check that later in
the process for things like sending missed message emails.
check_send_stream_message is a simpler version of
check_send_message for sending messages where the addressee is
a stream. Instead of relying on Addressee.legacy_build,
check_send_stream_message uses Addressee.for_stream. Consequently,
it eschews many of check_send_message's kwargs that aren't needed
when the intended recipient of a message is a stream.
Having Addressee take care of setting stream_name to
sender.default_sending_stream.name makes us able to have
the invariant that stream_name is never None when the
message type is 'stream', which will help for mypy, among
other things.
One thing to be aware of is that Addressee does do a little
bit of validation work, and this adds yet another JsonableError
exception. I don't view this as a bad thing, just something to
know.
The dictionary result for get_user_info_for_message_updates()
now has a `mention_user_ids` field that is a set of user ids
who were mentioned in a message.
There are several reasons to extract this function:
* It's easy to unit test without extensive mocking.
* It will show up when we profile code.
* It is something that you can mostly ignore for
most messages.
The main reason to extract this, though, is that we are about
to do some fairly complex splicing of data for the use case
of mentioning service bots on streams they are not subscribed to,
and we want to localize the complexity.
This fixes a bug where the internal_prep_message code path would
incorrectly ignore the `realm` that was passed into it. As a result,
attempts to send messages using the system bots with this code path
would crash.
As a sidenote, we really need to make our test system consistent with
production in terms of whether the user's realm is the same as the
system realm.
We don't access any attributes of the sender other than the realm, and
as it turns out, we in some cases want to use a different realm than
the sender's.
Previously, invitation reminder emails were only being cleared after a
successful signup if newsletter_data was available, since that was the
circumstance in which we were calling the relevant queue processor
code. Now, we (1) clear them when a human user finishes signing up
and (2) correctly clear them using the 'address' field of
ScheduleEmail, not user_id.
We don't need full Realm objects to find DefaultStream
objects for a realm. So now a few functions related to
adding/removing default streams use realm_id for lookups.
Similarly, we don't need a full Stream object to find
out if a stream exists in DefaultStream, so we do id
lookups there as well.
This sets us up to use thinner objects in callers.
We now have a dedicated cache for active_user_ids() that only
stores a list of user_ids.
Before this commit, active_user_ids() used a cache of UserProfile
dictionaries, so it incurred unnecessary deserialization costs for
all the user fields that it sliced away in a list comprehension.
Because the cache is skinnier here, we also need to invalidate it
less frequently. Basically, all we care about is new users, realm
deactivations, and user deactivations.
It's hard to measure how much this will improve performance, because
the speedup for any operation here is pretty minor, but we use this
function a lot, so hopefully it will make the overall system more
healthy.
This is mostly a preparatory commit for an upcoming optimization
related to stream data, but it probably does save us an
occasional DB hop to the realm table.
This leads to more than a 2x speedup when tested with
20k+ total subscribers. (For large realms with lots of default
streams, this function deals with LOTS of data, so it is important
to optimize.)
This class encapsulates the mapping of stream ids to
recipient ids, and it is optimized for bulk use and
repeated use (i.e. it remembers values it already fetched).
This particular commit barely improves the performance
of gather_subscriptions_helper, but it sets us up for
further optimizations.
Long term, we may try to denormalize stream_id on to the
Subscriber table or otherwise modify the database so we
don't have to jump through hoops to do this kind of mapping.
This commit will help enable those changes, because we
isolate the mapping to this one new class.
We were mostly excluding inactive users before this fix, but
now we completely ignore them.
This potentially changes some of the data we return from
get_recipient_info(), but the extra user ids before this fix
were effectively ignored by the caller.
The prior code would queue up feedback messages even if the
feedback bot was deactivated, which was just due to oversight
most likely. (People probably rarely disable the feedback bot,
but they should have that option.)
This sets us up a subsequent commit where we need more data
from the Subscription table to build recipient info, so the
function boundary doesn't work any more for get_recipient_info,
which is part of the heavily optimized send-message
path.
We used to share code here with typing notifications, but
typing notifications need a lot less data than the
send-message path, so it's useful to decouple these two
things. The idioms that are duplicated here are pretty simple
one-liners.
This commit makes get_recipient_info() faster by never creating
Django ORM objects. We use the ORM to create a values query
instead, and then we iterate over the rows to create various
collections of ids.
In order to avoid lots of code duplication, this commit unifies
how we query UserProfile for PMs and streams. Prior to this
commit we were getting "wide" UserProfile objects out of
our memcached cache. Now we just go to the database with our
list of userids. The new approach at worst adds one hop to the
database for PMs, which aren't really a performance bottleneck
(compared to streams). And the new approach actually saves a
hop when both partners aren't in cache (plus we don't pay the
penalty of hitting the cache itself).
The performance improvement here is easy to measure for messages
to streams with many users, even with all the other activity
that goes on inside do_send_messages(). I took test_performance()
in test_messages.py, set num_extra_users to 3000, and consistently
measured a ~20% speedup in do_send_messages().
This commit also eliminates fetching of emails. We probably
could have done that in a prior commit, but in this commit it
is very explicit that we don't need it. While removing email
from the query is a no-brainer, it actually had a negigible
impact on performance. Almost all the savings here comes from
not create UserProfile objects.
This function returns a summary of recipient data for a message
that's being sent. It's mostly just moving code into the
old function called get_recipient_user_profiles().
This commit is necessary to prevent bringing back emails from the
DB for all N recipients of a message just to see if the feedback
bot is being invoked.
We calculate `service_bot_tuples` earlier in the function, so that
we don't need "full" UserProfile objects later in the function.
This is part of consolidating code that basically just needs to
triage user_ids.
This starts to phase out the need for UserProfile objects in
do_send_messages(). UserProfile objects are expensive to create
for large streams with lots of users. The objects in the code
before this commit aren't even full UserProfile objects.
This change mostly sets up future performance improvements, but
we also get a minor speedup here when we run a test with 3000
stream subscribers.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.
By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
This function is essentially a copy of get_recipient_user_profiles,
which is about to go away. The new function enforces the contract of
typing indicators, which is that they don't apply to streams, which
allows us to use a relatively simple approach for getting user
profile objects.
We are diverging this code, because the send-message path needs
more optimizations.
This change introduces an extra hop to the database, but it is
generally faster due to nuances of the DB and the ORM. It
also sets us up to optimize get_recipient_user_profiles() by
avoiding creating ORM objects.
I measured the impact of this using a stream with 3000
subscribers, half of whom were idle, and it speeds things up
by 10%.
Usually a small minority of users are eligible to receive missed
message emails or mobile notifications.
We now filter users first before hitting UserPresence to find idle
users. We also simply check for the existence of recent activity
rather than borrowing the more complicated data structures that we
use for the buddy list.
This commit completely switches us over to using a
dedicated model called MutedTopic to track which topics
a user has muted.
This includes the necessary migrations to create the
table and populate it from legacy data in UserProfile.
A subsequent commit will actually remove the old field
in UserProfile.
Admins need to know about private streams to delete them, even
if they are not subscribed. We send the minimal info possible
to the client to allow them to have a UI for that.
This never made sense to be a flag on the UserMessage table, since
it's not per-user state. And in fact it doesn't need to be in a
database at all, since it's easily computed from content anyway.
Fixes#1099.
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where
were updating muted_topics spuriously before calling
do_update_muted_topic.
Unlike creating a stream, there's really no reason one would want to
call the function to create a realm while uncertain whether that realm
already existed.
This change is mostly based on a similar commit from hackerkid
in a feature branch. It borrows both code and ideas. Some of
it's my own stuff, as I was working on a newer branch.
We now call get_user_including_cross_realm_email() inside of
user_profiles_from_unvalidated_emails(), instead of using
get_user_profile_by_email.
This requires a few of our callers to pass down sender into us.
One consequence of this change is that we change the symptoms
for trying to send to emails outside of your realm. In some
cases, we simply raise an error that an email is invalid to us
instead of getting into the deeper validate_recipient_user_profiles
check.
We are trying to convert emails to user_profiles earlier in
the codepath. This may cause subtle changes in which errors
appear, but it's probably generally good to report on bad
addressees sooner than later.
This class simplifies the calling sequence to methods like
check_message and _internal_prep_message, and it's also more
type safe.
Checking for message types is encapsulated with calls to is_stream()
and is_private(). There are also shortcut constructors when you
know that the type of the address (stream vs. private), which is often.
This function optimizes marking streams and topics as read,
by using UserMessage.where_unread(), which uses a partial
index on the "read" flag.
This also simplifies the code path for ordinary message
flag updates.
In order to keep 100% line coverage, I simplified the
logging in update_message_flags, so now all requests
will show the "actually" format.
This is an interim step toward creating dedicated endpoints
for marking streams/topics as reads, so we do error checking
with asserts for flag/operation, so we don't introduce a
temporary translation string.
This is the first part of a larger migration to convert Zulip's
reactions storage to something based on the codepoint, not the emoji
name that the user typed in, so that we don't need to worry about
changes in the names we're using breaking the emoji storage.
We apparently were not correctly clearing the user_profile's email
address from caches when changing email addresses, which meant that
trying to look up the old email in the user_profile caches would still
work.
Fixes#6035.
The "all" option for 'message/flags' was dangerous, as it could
apply to any of our flags. The only flag it made sense for, the
"read" flag, now has a dedicated endpoint.
This change simplifies how we mark all messages as read. It also
speeds up the backend by taking advantage of our partial index
for unread messages. We also use a new statsd indicator.