We now return user_ids for subscribers to streams in add-stream
events. This allows us to eliminate the UserLite class for
both bulk adds and bulk removes. It also simplifies some JS
code that already wanted to use user_ids, not emails.
Fixes#6898
This function truncates the textual content at correct length.
(It will be updated later to handle corner cases of unicode
combining characters and tags when we start supporting them.)
Using lightweight objects will speed up adding new users
to realms.
We also sort the query results, which lets us itertools.groupby
to more efficiently build the data structure.
Profiling on a large data set shows about a 25x speedup for this
function, and before the optimization, this function accounts
for most of the time spend in bulk_add_subscriptions.
There's a lot less memory to allocate. I didn't measure
the memory difference.
When we test-deployed this to chat.zulip.org, we got about a 6x
speedup.
Sort of a hacky hammer, but
* The original design of the analytics system mistakenly attempted to play
nicely with non-UTC datetimes.
* Timezone errors are really hard to find and debug, and don't jump out that
easily when reading code.
I don't know of any outstanding errors, but putting a few "assert this
timezone is in UTC" around will hopefully reduce the chance that there are
any current or future timezone errors.
Note that none of these functions are called outside of the analytics code
(and tests). This commit also doesn't change any current behavior, assuming
a database where all datetimes have been being stored in UTC.
Previously, entering a non-UTC end time for a daily stat would give you
incorrect results. This is because:
* All daily stats are collected at and have end_times in the database in
midnight UTC.
* For daily stats, time_range returns a list of datetimes at midnight in the
timezone of its end argument. These datetimes are the only ones we look
for when looking for rows corresponding to the stat in the database.
* Previously, we passed on the end argument from the API to time_range,
without modification.
The logic to apply events to page_params['unread_msgs'] was
complicated due to the aggregated data structures that we pass
down to the client.
Now we defer the aggregation logic until after we apply the
events. This leads to some simplifications in that codepath,
as well as some performance enhancements.
The intermediate data structure has sets and dictionaries that
generally are keyed by message_id, so most message-related
updates are O(1) in nature.
Also, by waiting to compute the counts until the end, it's a
bit less messy to try to keep track of increments/decrements.
Instead, we just update the dictionaries and sets during the
event-apply phase.
This change also fixes some corner cases:
* We now respect mutes when updating counts.
* For message updates, instead of bluntly updating
the whole topic bucket, we update individual
message ids.
Unfortunately, this change doesn't seem to address the pesky
test that fails sporadically on Travis, related to mention
updates. It will change the symptom, slightly, though.
We now have two helper functions:
* get_raw_unread_data
* aggregate_unread_data
Separating the concerns is nice. The first function does
all the data collection. The second function should be fast,
and it only re-organizes the data into an aggregated form
that makes the page_params payload smaller and easier for
clients to work with.
For the first function, we try to return data structures
that are easier to manipulate than the end result. This
will allow us to apply events more easily, in a subsequent
commit.
Instead of using `unified_reactions` mapping start using
`name_to_codepoint` mapping for converting emoji name to
codepoints. We were using `unified_reactions` mapping
because prior to emoji web PR `name_to_codepoint` mapping
was generated using emoji_map.json which contained old
codepoints but for reactions new codepoints were required
to display them using sprite sheets.
Create a new custom email backend which would automatically
logs the emails that are send in the dev environment as
well as print a friendly message in console to visit /emails
for accessing all the emails that are sent in dev environment.
Since django.core.mail.backends.console.EmailBackend is no longer
userd emails would not be printed to the console anymore.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.
This effectively boils down to sending notifications to newly
mentioned users. The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person. If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.
We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.
The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.
We can expect a couple changes in the future:
* Requirements may change here, and it might make sense
to send offline notifications on the update side even
in circumstances where the original message had a
notification.
* We may track more notifications in a DB model, which
may simplify our short-circuit logic.
In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
They're rarely useful, usually displayed invisibly in most tools
anyway, and this helps make sure the message makes it into Zulip
rather than being rejected.
Postgres doesn't like them, we don't have an obvious way to escape
them, and they tend to be sent by buggy tools where it'd be better for
the user to get an error.
This fixes a 500 we were getting occasionally.
We have two different concepts of "idle", and this function
is based on the "presence" aspect of idleness. There is also
idleness in terms of a user having no current client
descriptors accepting messages, and we check that later in
the process for things like sending missed message emails.
check_send_stream_message is a simpler version of
check_send_message for sending messages where the addressee is
a stream. Instead of relying on Addressee.legacy_build,
check_send_stream_message uses Addressee.for_stream. Consequently,
it eschews many of check_send_message's kwargs that aren't needed
when the intended recipient of a message is a stream.
This isn't something that a user can ever modify, so it doesn't belong
in DEFAULT_SETTINGS. While we're at it, we align the appearance of
the email gateway in the docs with whether this setting in the docs
will be valid.
This commit switches to use sprite sheets for rendering emojis
in all the remaining places, i.e., message bodies and composebox
typeahead. This commit also includes some changes to notifications.py
file so that the spans used for rendering emojis can be converted
to corresponding image tags so that we don't break the emoji rendering
in missed message emails since we can't use sprite sheets there.
As part of switching the bugdown system to use sprite sheets, we need
to switch the name_to_codepoint mappings to match the new sprite
sheets. This has the side effect of fixing a bunch of emoji like
numbers and flag emoji in the emoji pickers.
Fixes: #3895.
Fixes: #3972.
These are long enough to still be self-explanatory (the only one I'm
at all in doubt about there is DEBG; I avoided "DBUG" because it reads
"BUG" which suggests a high-priority message, and those are the
opposite of that), while saving a good bit of horizontal space
vs. padding everything to the 8 characters of "CRITICAL".
Also add a linter exception to allow easy-to-read alignment here,
similar to several existing exceptions for other alignment cases.
This also gives us a place to hang the originating module, if we write a bit
of logic to work that out; sadly it doesn't come out of the box, only
the filename (which is likely to have a bunch of noise that just shows the
path to the deployment or virtualenv.)
This doesn't yet do much, but it gives us a suitable place to
add code to customize how log messages are displayed, beyond what
a format string passed to the default formatter can do.