This removes sender names from the message cache, since
they aren't guaranteed to be valid, and they're inexpensive
to add.
This commit will make the message cache entries smaller
by removing sender___full_name and sender__short_name
fields.
Then we add in the sender fields to the message payloads
by doing a query against the unique sender ids of the
messages we are processing.
This change leads to 2 extra database hops for most of
our message-related codepaths. The reason there are 2 hops
instead of 1 is that we basically re-calculate way too
much data to get a no-markdown dictionary.
Introduce MessageDict.post_process_dicts() will allow us
the ability to do the following:
* use less memory in the cache for repeated data
* prevent cache invalidation
* format data according to different client needs
The first use of this function is pretty inconsequential, but
it sets us up for more consequential changes.
In this commit we defer the MessageDict.hydrate_recipient_info
step until after we pull data out of the cache. This impacts
cache size as follows:
* streams - negligibly bigger
* PMs/huddles - slimmer due to not needing to repeat
sender data like email/full_name
Again, the main point of this change is to start setting up
the infrastructure to do post-processing.
This is a first step to eventually slimming the message cache,
but there are still some moving parts there to be worked through.
The more immediate benefit of extracting this function is that
we can put tests on it. Also, it isolates some functionality
that may go away as our clients gets smarter.
This endpoint is about to become an API-style route and have the legacy
decorator removed from its view. The json/fetch_api_key endpoint will be
used in tests instead of it.
We now use a `.values` query to get just the fields we need
in order to fulfill '/json/users' requests.
The main benefit is that we don't do O(N) queries for bot
owners, but we also have less data on UserProfile to process.
On receiving a request for deleting a reaction, just check if such
a reaction exists or not. If it exists then just delete the reaction
otherwise send an error message that such a reaction doesn't exist.
It doesn't make sense to check whether an emoji name is valid or not.
This commit prepares us to introduce a StreamLite class. For
these tests, we don't care about the actual contents of the
Stream, just the right stream is there.
Since subscribed_to_stream is only doing an id lookup
on the Stream model to find out if a user is subscribed to
a stream, there's no reason to require a full Stream object.
It's currently the case that all callers do have full Stream
objects handy to pass in to this function, but it's still a
good practice to have functions only ask for objects that they
need.
The original "quality score" was invented purely for populating
our password-strength progress bar, and isn't expressed in terms
that are particularly meaningful. For configuration and the core
accept/reject logic, it's better to use units that are readily
understood. Switch to those.
I considered using "bits of entropy", defined loosely as the log
of this number, but both the zxcvbn paper and the linked CACM
article (which I recommend!) are written in terms of the number
of guesses. And reading (most of) those two papers made me
less happy about referring to "entropy" in our terminology.
I already knew that notion was a little fuzzy if looked at
too closely, and I gained a better appreciation of how it's
contributed to confusion in discussing password policies and
to adoption of perverse policies that favor "Password1!" over
"derived unusual ravioli raft". So, "guesses" it is.
And although the log is handy for some analysis purposes
(certainly for a graph like those in the zxcvbn paper), it adds
a layer of abstraction, and I think makes it harder to think
clearly about attacks, especially in the online setting. So
just use the actual number, and if someone wants to set a
gigantic value, they will have the pleasure of seeing just
how many digits are involved.
(Thanks to @YJDave for a prototype that the code changes in this
commit are based on.)
We now return user_ids for subscribers to streams in add-stream
events. This allows us to eliminate the UserLite class for
both bulk adds and bulk removes. It also simplifies some JS
code that already wanted to use user_ids, not emails.
Fixes#6898
This test suite works by using the expected_output and new text_output
fields in the bugdown test cases to verify that each syntax is
correctly translated by this new function.
Some of these translations, like strikethrough, are kinda poor; but
this framework should make it easy to iterate on the formatting.
Fixes: #6720.
This function truncates the textual content at correct length.
(It will be updated later to handle corner cases of unicode
combining characters and tags when we start supporting them.)
It's fairly difficult to debug tests that use
EventsRegisterTest.do_test, and when they fail on
Travis, it's particularly challengning. Now we make
the main diff less noisy, and we also include
the events that were applied.
Using lightweight objects will speed up adding new users
to realms.
We also sort the query results, which lets us itertools.groupby
to more efficiently build the data structure.
Profiling on a large data set shows about a 25x speedup for this
function, and before the optimization, this function accounts
for most of the time spend in bulk_add_subscriptions.
There's a lot less memory to allocate. I didn't measure
the memory difference.
When we test-deployed this to chat.zulip.org, we got about a 6x
speedup.
This reverts commit ba8dc62132.
As best I can tell, the old configuration was correct for what Django
wanted. Further testing is required, but this at least brings
.tx/config to match the actual filenames; I think our Chinese
translations have been broken until now.
Since the REALMS_HAVE_SUBDOMAINS migration in development, we've had
scattered reports of users who found trying to open 127.0.0.1:9991
resulting in a redirect loop between zulipdev.com:9991,
zulipdev.com:9991/devlogin, and zulipdev.com:9991/devlogin/, and back
to zulipdev.com:9991.
We fix this temporarily through a small cleanup, which is to have that
last step in the loop send the user to the subdomain where they're
actually logged in, zulip.zulipdev.com:9991.
There's more to be done before this system will make sense, though.
We originally wrote this because when testing subdomains, you wanted
to be sure you were actually testing subdomains. Now that subdomains
is the default, doesn't seem to actually be a good reason why we
should need this.
Previously we used to mark a key as unstranlated if its value was equal
to it in translations.json. This had an issue because it didn't allow
otherwise valid cases where key was equal to the value.
This commit solves the problem by disallowing an empty string as a valid
translation and then using the empty string as the value for all the
unstranslated keys.
Fixes#5261
Previously, this would always send one to homepage, making visiting
the /help/ documentation in the development environment using the
localhost URL unpleasant.
While this fixes the proximal bug, it's not clear to me that we need
this redirect logic at all, so I'm going to try removing it soon.
This endpoint is part of the old tutorial, which we've removed, and
has some security downsides as well.
This includes a minor refactoring of the tests.
Sort of a hacky hammer, but
* The original design of the analytics system mistakenly attempted to play
nicely with non-UTC datetimes.
* Timezone errors are really hard to find and debug, and don't jump out that
easily when reading code.
I don't know of any outstanding errors, but putting a few "assert this
timezone is in UTC" around will hopefully reduce the chance that there are
any current or future timezone errors.
Note that none of these functions are called outside of the analytics code
(and tests). This commit also doesn't change any current behavior, assuming
a database where all datetimes have been being stored in UTC.
Previously, entering a non-UTC end time for a daily stat would give you
incorrect results. This is because:
* All daily stats are collected at and have end_times in the database in
midnight UTC.
* For daily stats, time_range returns a list of datetimes at midnight in the
timezone of its end argument. These datetimes are the only ones we look
for when looking for rows corresponding to the stat in the database.
* Previously, we passed on the end argument from the API to time_range,
without modification.
The logic to apply events to page_params['unread_msgs'] was
complicated due to the aggregated data structures that we pass
down to the client.
Now we defer the aggregation logic until after we apply the
events. This leads to some simplifications in that codepath,
as well as some performance enhancements.
The intermediate data structure has sets and dictionaries that
generally are keyed by message_id, so most message-related
updates are O(1) in nature.
Also, by waiting to compute the counts until the end, it's a
bit less messy to try to keep track of increments/decrements.
Instead, we just update the dictionaries and sets during the
event-apply phase.
This change also fixes some corner cases:
* We now respect mutes when updating counts.
* For message updates, instead of bluntly updating
the whole topic bucket, we update individual
message ids.
Unfortunately, this change doesn't seem to address the pesky
test that fails sporadically on Travis, related to mention
updates. It will change the symptom, slightly, though.
We now have two helper functions:
* get_raw_unread_data
* aggregate_unread_data
Separating the concerns is nice. The first function does
all the data collection. The second function should be fast,
and it only re-organizes the data into an aggregated form
that makes the page_params payload smaller and easier for
clients to work with.
For the first function, we try to return data structures
that are easier to manipulate than the end result. This
will allow us to apply events more easily, in a subsequent
commit.
Emojis which are represented by a sequence of codepoints or emojis
with ZWJ are not included until we implement a mechanism for dealing
with their unicode versions.
Fixes: #6279.
Instead of using `unified_reactions` mapping start using
`name_to_codepoint` mapping for converting emoji name to
codepoints. We were using `unified_reactions` mapping
because prior to emoji web PR `name_to_codepoint` mapping
was generated using emoji_map.json which contained old
codepoints but for reactions new codepoints were required
to display them using sprite sheets.
Create a new custom email backend which would automatically
logs the emails that are send in the dev environment as
well as print a friendly message in console to visit /emails
for accessing all the emails that are sent in dev environment.
Since django.core.mail.backends.console.EmailBackend is no longer
userd emails would not be printed to the console anymore.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.
This effectively boils down to sending notifications to newly
mentioned users. The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person. If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.
We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.
The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.
We can expect a couple changes in the future:
* Requirements may change here, and it might make sense
to send offline notifications on the update side even
in circumstances where the original message had a
notification.
* We may track more notifications in a DB model, which
may simplify our short-circuit logic.
In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
They're rarely useful, usually displayed invisibly in most tools
anyway, and this helps make sure the message makes it into Zulip
rather than being rejected.
Postgres doesn't like them, we don't have an obvious way to escape
them, and they tend to be sent by buggy tools where it'd be better for
the user to get an error.
This fixes a 500 we were getting occasionally.
We have two different concepts of "idle", and this function
is based on the "presence" aspect of idleness. There is also
idleness in terms of a user having no current client
descriptors accepting messages, and we check that later in
the process for things like sending missed message emails.
This commit migrates all webhooks to use check_send_stream_message
instead of check_send_message. The only two webhooks that still
use check_send_message are our yo and teamcity webhooks. They
both use check_send_message for private messages.
check_send_stream_message is a simpler version of
check_send_message for sending messages where the addressee is
a stream. Instead of relying on Addressee.legacy_build,
check_send_stream_message uses Addressee.for_stream. Consequently,
it eschews many of check_send_message's kwargs that aren't needed
when the intended recipient of a message is a stream.
This isn't something that a user can ever modify, so it doesn't belong
in DEFAULT_SETTINGS. While we're at it, we align the appearance of
the email gateway in the docs with whether this setting in the docs
will be valid.
This commit switches to use sprite sheets for rendering emojis
in all the remaining places, i.e., message bodies and composebox
typeahead. This commit also includes some changes to notifications.py
file so that the spans used for rendering emojis can be converted
to corresponding image tags so that we don't break the emoji rendering
in missed message emails since we can't use sprite sheets there.
As part of switching the bugdown system to use sprite sheets, we need
to switch the name_to_codepoint mappings to match the new sprite
sheets. This has the side effect of fixing a bunch of emoji like
numbers and flag emoji in the emoji pickers.
Fixes: #3895.
Fixes: #3972.
These are long enough to still be self-explanatory (the only one I'm
at all in doubt about there is DEBG; I avoided "DBUG" because it reads
"BUG" which suggests a high-priority message, and those are the
opposite of that), while saving a good bit of horizontal space
vs. padding everything to the 8 characters of "CRITICAL".
Also add a linter exception to allow easy-to-read alignment here,
similar to several existing exceptions for other alignment cases.
This also gives us a place to hang the originating module, if we write a bit
of logic to work that out; sadly it doesn't come out of the box, only
the filename (which is likely to have a bunch of noise that just shows the
path to the deployment or virtualenv.)
This doesn't yet do much, but it gives us a suitable place to
add code to customize how log messages are displayed, beyond what
a format string passed to the default formatter can do.
This should make it a little easier to understand our logging config
and make changes to it with confidence.
Many of these items that are now redundant used to be required when we
were setting disable_existing_loggers to True (before 500d81bf2), in
order to exempt those loggers from being cleared out. Now they're not.
One bit of test code needed a tweak to how it got its hands on the
AdminZulipHandler instance; it can do it from the list on the root
logger just as well as on the `django` logger.
Most of the paths leading through this except clause were cut in
73e8bba37 "ldap auth: Reassure django_auth_ldap". The remaining one
had no test coverage -- the case that leads to it had a narrow unit
test, but no test had the exception actually propagate here. As a
result, the clause was mistakenly cut, in commit
8d7f961a6 "LDAP: Remove now-impossible except clause.", which could
lead to an uncaught exception in production.
Restore the except clause, and add a test for it.
Having Addressee take care of setting stream_name to
sender.default_sending_stream.name makes us able to have
the invariant that stream_name is never None when the
message type is 'stream', which will help for mypy, among
other things.
One thing to be aware of is that Addressee does do a little
bit of validation work, and this adds yet another JsonableError
exception. I don't view this as a bad thing, just something to
know.
TRELLO_MESSAGE_TEMPLATE and TRELLO_SUBJECT_TEMPLATE are
redundant. This commit removes them. Now, subjects don't end
in periods. And where a period is necessary in the message body,
one is appended at the end of the specific template for that
message.
This is just enough of a quick fix to work with a stock Zulip 1.6
server. We should really also make this robust to arbitrary input
from the remote Zulip server, even though it'll be a little tedious.
The dictionary result for get_user_info_for_message_updates()
now has a `mention_user_ids` field that is a set of user ids
who were mentioned in a message.
This checks what arguments it passes into the enqueuing function.
Note, however, that the arguments are wrong for various cases, we'll
update the tests as we fix those bugs.
This ensures that as we expand the logic for under what circumstances
email and push notifications should be sent, we can be confident about
this code path always doing the right thing.
This fixes a problem introduced in the recent refactoring where
`triggers` would not be set correctly when a push or email
notification was triggered by missedmessage_hook.
Fixes#6612.
Now, the two code paths do the same thing for this check.
It seems like there may be more work to do here, in that
wildcard_mentioned messages seem to not be eligible for sending
email/push notifications. We probably want to add some logic there
for the user doing the mention to control whether or not it does.
This makes GoogleSubdomainLoginTest consistently access subdomains the
standard way, replacing the original hacky approach it had that
predated the library.
There are several reasons to extract this function:
* It's easy to unit test without extensive mocking.
* It will show up when we profile code.
* It is something that you can mostly ignore for
most messages.
The main reason to extract this, though, is that we are about
to do some fairly complex splicing of data for the use case
of mentioning service bots on streams they are not subscribed to,
and we want to localize the complexity.
It's unlikely to be of any real consequence, but this code bugged me
in that it makes a whole set before throwing it away to make nearly
the same set.
Sadly Python's comprehensions lack a way to write these cleanly as one
comprehension; but with no extra code complexity we can make the
temporary a genexp, which does the job.
We need a migration to clear the tutorial_status for existing users,
so that we don't show hotspots to anyone who signed up for Zulip in
the month or so since we deleted the old tutorial.
This fixes a bug where the internal_prep_message code path would
incorrectly ignore the `realm` that was passed into it. As a result,
attempts to send messages using the system bots with this code path
would crash.
As a sidenote, we really need to make our test system consistent with
production in terms of whether the user's realm is the same as the
system realm.
We don't access any attributes of the sender other than the realm, and
as it turns out, we in some cases want to use a different realm than
the sender's.
The plan is to have everything expect subdomains, so it makes sense to
move these tests to the subdomains-only test class and style.
Most of the remaining GoogleLoginTest tests are now either duplicates
or basic API-level tests where subdomains are irrelevant.
Previously, this accessed realm.uri via trying to use
zulip_default_context. That doesn't make any sense, because
zulip_default_context expects an HttpRequest object, and those are
nowhere in sight in the code path. We do, however, have the outgoing
webhook bot user involved in the event, and that's the object to
access realm.uri from here.
These arguments are only intended to be used for realm creation, and
they make the code more confusing.
We need to make a few changes after doing this, because some tests
were relying on these extra arguments causing the form to not submit
for their error handling.
We don't apply these changes to the LDAP tests, since fixing those
seems complicated.
This commit implements support for rendering static files in
under static/generated/bots/ in the same manner as we render
our webhooks/integration documentation. Said static files are
generated by tools/setup/generate_zulip_bots_static_files.py
during provisioning.
This commit implements support for copying over static files
for all bots in the zulip_bots package to
static/generated/bots/ during provisioning. This directory
isn't tracked by Git. This allows us to have access to files
stored in an arbitrary zulip_bots package directory somewhere
on the system. For now, logo.* and doc.md files are copied over.
This commit should act as a starting point for extending our
macro-based Markdown framework to our bots/API packages'
documentation and eventually rendering these static files
alongside our webhooks' documentation.