This function truncates the textual content at correct length.
(It will be updated later to handle corner cases of unicode
combining characters and tags when we start supporting them.)
Using lightweight objects will speed up adding new users
to realms.
We also sort the query results, which lets us itertools.groupby
to more efficiently build the data structure.
Profiling on a large data set shows about a 25x speedup for this
function, and before the optimization, this function accounts
for most of the time spend in bulk_add_subscriptions.
There's a lot less memory to allocate. I didn't measure
the memory difference.
When we test-deployed this to chat.zulip.org, we got about a 6x
speedup.
Sort of a hacky hammer, but
* The original design of the analytics system mistakenly attempted to play
nicely with non-UTC datetimes.
* Timezone errors are really hard to find and debug, and don't jump out that
easily when reading code.
I don't know of any outstanding errors, but putting a few "assert this
timezone is in UTC" around will hopefully reduce the chance that there are
any current or future timezone errors.
Note that none of these functions are called outside of the analytics code
(and tests). This commit also doesn't change any current behavior, assuming
a database where all datetimes have been being stored in UTC.
Previously, entering a non-UTC end time for a daily stat would give you
incorrect results. This is because:
* All daily stats are collected at and have end_times in the database in
midnight UTC.
* For daily stats, time_range returns a list of datetimes at midnight in the
timezone of its end argument. These datetimes are the only ones we look
for when looking for rows corresponding to the stat in the database.
* Previously, we passed on the end argument from the API to time_range,
without modification.
The logic to apply events to page_params['unread_msgs'] was
complicated due to the aggregated data structures that we pass
down to the client.
Now we defer the aggregation logic until after we apply the
events. This leads to some simplifications in that codepath,
as well as some performance enhancements.
The intermediate data structure has sets and dictionaries that
generally are keyed by message_id, so most message-related
updates are O(1) in nature.
Also, by waiting to compute the counts until the end, it's a
bit less messy to try to keep track of increments/decrements.
Instead, we just update the dictionaries and sets during the
event-apply phase.
This change also fixes some corner cases:
* We now respect mutes when updating counts.
* For message updates, instead of bluntly updating
the whole topic bucket, we update individual
message ids.
Unfortunately, this change doesn't seem to address the pesky
test that fails sporadically on Travis, related to mention
updates. It will change the symptom, slightly, though.
We now have two helper functions:
* get_raw_unread_data
* aggregate_unread_data
Separating the concerns is nice. The first function does
all the data collection. The second function should be fast,
and it only re-organizes the data into an aggregated form
that makes the page_params payload smaller and easier for
clients to work with.
For the first function, we try to return data structures
that are easier to manipulate than the end result. This
will allow us to apply events more easily, in a subsequent
commit.
Instead of using `unified_reactions` mapping start using
`name_to_codepoint` mapping for converting emoji name to
codepoints. We were using `unified_reactions` mapping
because prior to emoji web PR `name_to_codepoint` mapping
was generated using emoji_map.json which contained old
codepoints but for reactions new codepoints were required
to display them using sprite sheets.
Create a new custom email backend which would automatically
logs the emails that are send in the dev environment as
well as print a friendly message in console to visit /emails
for accessing all the emails that are sent in dev environment.
Since django.core.mail.backends.console.EmailBackend is no longer
userd emails would not be printed to the console anymore.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.
This effectively boils down to sending notifications to newly
mentioned users. The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person. If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.
We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.
The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.
We can expect a couple changes in the future:
* Requirements may change here, and it might make sense
to send offline notifications on the update side even
in circumstances where the original message had a
notification.
* We may track more notifications in a DB model, which
may simplify our short-circuit logic.
In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
They're rarely useful, usually displayed invisibly in most tools
anyway, and this helps make sure the message makes it into Zulip
rather than being rejected.
Postgres doesn't like them, we don't have an obvious way to escape
them, and they tend to be sent by buggy tools where it'd be better for
the user to get an error.
This fixes a 500 we were getting occasionally.
We have two different concepts of "idle", and this function
is based on the "presence" aspect of idleness. There is also
idleness in terms of a user having no current client
descriptors accepting messages, and we check that later in
the process for things like sending missed message emails.
check_send_stream_message is a simpler version of
check_send_message for sending messages where the addressee is
a stream. Instead of relying on Addressee.legacy_build,
check_send_stream_message uses Addressee.for_stream. Consequently,
it eschews many of check_send_message's kwargs that aren't needed
when the intended recipient of a message is a stream.
This isn't something that a user can ever modify, so it doesn't belong
in DEFAULT_SETTINGS. While we're at it, we align the appearance of
the email gateway in the docs with whether this setting in the docs
will be valid.
This commit switches to use sprite sheets for rendering emojis
in all the remaining places, i.e., message bodies and composebox
typeahead. This commit also includes some changes to notifications.py
file so that the spans used for rendering emojis can be converted
to corresponding image tags so that we don't break the emoji rendering
in missed message emails since we can't use sprite sheets there.
As part of switching the bugdown system to use sprite sheets, we need
to switch the name_to_codepoint mappings to match the new sprite
sheets. This has the side effect of fixing a bunch of emoji like
numbers and flag emoji in the emoji pickers.
Fixes: #3895.
Fixes: #3972.
These are long enough to still be self-explanatory (the only one I'm
at all in doubt about there is DEBG; I avoided "DBUG" because it reads
"BUG" which suggests a high-priority message, and those are the
opposite of that), while saving a good bit of horizontal space
vs. padding everything to the 8 characters of "CRITICAL".
Also add a linter exception to allow easy-to-read alignment here,
similar to several existing exceptions for other alignment cases.
This also gives us a place to hang the originating module, if we write a bit
of logic to work that out; sadly it doesn't come out of the box, only
the filename (which is likely to have a bunch of noise that just shows the
path to the deployment or virtualenv.)
This doesn't yet do much, but it gives us a suitable place to
add code to customize how log messages are displayed, beyond what
a format string passed to the default formatter can do.
Having Addressee take care of setting stream_name to
sender.default_sending_stream.name makes us able to have
the invariant that stream_name is never None when the
message type is 'stream', which will help for mypy, among
other things.
One thing to be aware of is that Addressee does do a little
bit of validation work, and this adds yet another JsonableError
exception. I don't view this as a bad thing, just something to
know.
This is just enough of a quick fix to work with a stock Zulip 1.6
server. We should really also make this robust to arbitrary input
from the remote Zulip server, even though it'll be a little tedious.
The dictionary result for get_user_info_for_message_updates()
now has a `mention_user_ids` field that is a set of user ids
who were mentioned in a message.
There are several reasons to extract this function:
* It's easy to unit test without extensive mocking.
* It will show up when we profile code.
* It is something that you can mostly ignore for
most messages.
The main reason to extract this, though, is that we are about
to do some fairly complex splicing of data for the use case
of mentioning service bots on streams they are not subscribed to,
and we want to localize the complexity.
It's unlikely to be of any real consequence, but this code bugged me
in that it makes a whole set before throwing it away to make nearly
the same set.
Sadly Python's comprehensions lack a way to write these cleanly as one
comprehension; but with no extra code complexity we can make the
temporary a genexp, which does the job.
This fixes a bug where the internal_prep_message code path would
incorrectly ignore the `realm` that was passed into it. As a result,
attempts to send messages using the system bots with this code path
would crash.
As a sidenote, we really need to make our test system consistent with
production in terms of whether the user's realm is the same as the
system realm.
We don't access any attributes of the sender other than the realm, and
as it turns out, we in some cases want to use a different realm than
the sender's.
Previously, this accessed realm.uri via trying to use
zulip_default_context. That doesn't make any sense, because
zulip_default_context expects an HttpRequest object, and those are
nowhere in sight in the code path. We do, however, have the outgoing
webhook bot user involved in the event, and that's the object to
access realm.uri from here.
This commit implements support for rendering static files in
under static/generated/bots/ in the same manner as we render
our webhooks/integration documentation. Said static files are
generated by tools/setup/generate_zulip_bots_static_files.py
during provisioning.
Previously, invitation reminder emails were only being cleared after a
successful signup if newsletter_data was available, since that was the
circumstance in which we were calling the relevant queue processor
code. Now, we (1) clear them when a human user finishes signing up
and (2) correctly clear them using the 'address' field of
ScheduleEmail, not user_id.
We don't need full Realm objects to find DefaultStream
objects for a realm. So now a few functions related to
adding/removing default streams use realm_id for lookups.
Similarly, we don't need a full Stream object to find
out if a stream exists in DefaultStream, so we do id
lookups there as well.
This sets us up to use thinner objects in callers.
We want to convert stream names to stream ids as close
to the "edges" of our system as possible, so we let our
caller do the work of finding the stream id for a stream
narrow.
We now have a dedicated cache for active_user_ids() that only
stores a list of user_ids.
Before this commit, active_user_ids() used a cache of UserProfile
dictionaries, so it incurred unnecessary deserialization costs for
all the user fields that it sliced away in a list comprehension.
Because the cache is skinnier here, we also need to invalidate it
less frequently. Basically, all we care about is new users, realm
deactivations, and user deactivations.
It's hard to measure how much this will improve performance, because
the speedup for any operation here is pretty minor, but we use this
function a lot, so hopefully it will make the overall system more
healthy.
This is mostly a preparatory commit for an upcoming optimization
related to stream data, but it probably does save us an
occasional DB hop to the realm table.
Previously, this was its own separate test script; now it's a normal
part of the test suite.
Tweaked by tabbott to use a proper test method.
Fixes#6327.
This leads to more than a 2x speedup when tested with
20k+ total subscribers. (For large realms with lots of default
streams, this function deals with LOTS of data, so it is important
to optimize.)
This class encapsulates the mapping of stream ids to
recipient ids, and it is optimized for bulk use and
repeated use (i.e. it remembers values it already fetched).
This particular commit barely improves the performance
of gather_subscriptions_helper, but it sets us up for
further optimizations.
Long term, we may try to denormalize stream_id on to the
Subscriber table or otherwise modify the database so we
don't have to jump through hoops to do this kind of mapping.
This commit will help enable those changes, because we
isolate the mapping to this one new class.
Moves SEND_ALL to inside get_next_hotspots, since it is not something other
files should call.
Also changes the delay to 0s, and gates the code behind an
`if settings.DEVELOPMENT`.
We were mostly excluding inactive users before this fix, but
now we completely ignore them.
This potentially changes some of the data we return from
get_recipient_info(), but the extra user ids before this fix
were effectively ignored by the caller.
The prior code would queue up feedback messages even if the
feedback bot was deactivated, which was just due to oversight
most likely. (People probably rarely disable the feedback bot,
but they should have that option.)
We now triage message content for possible mentions before
going to the cache/DB to get name info. This will create an
extra data hop for messages with mentions, but it will save
a fairly expensive cache lookup for most messages. (This will
be especially helpful for large realms.)
[Note that we need a subsequent commit to actually make the speedup
happen here, since avatars also cause us to look up all users in
the realm.]
This sets us up a subsequent commit where we need more data
from the Subscription table to build recipient info, so the
function boundary doesn't work any more for get_recipient_info,
which is part of the heavily optimized send-message
path.
We used to share code here with typing notifications, but
typing notifications need a lot less data than the
send-message path, so it's useful to decouple these two
things. The idioms that are duplicated here are pretty simple
one-liners.
compilemessages command now does all the heavy lifting by creating a
language_name_map.json file under locale directory. This file is used
by get_language_list to retrieve the require information.
Fixes: #6486
This commit makes get_recipient_info() faster by never creating
Django ORM objects. We use the ORM to create a values query
instead, and then we iterate over the rows to create various
collections of ids.
In order to avoid lots of code duplication, this commit unifies
how we query UserProfile for PMs and streams. Prior to this
commit we were getting "wide" UserProfile objects out of
our memcached cache. Now we just go to the database with our
list of userids. The new approach at worst adds one hop to the
database for PMs, which aren't really a performance bottleneck
(compared to streams). And the new approach actually saves a
hop when both partners aren't in cache (plus we don't pay the
penalty of hitting the cache itself).
The performance improvement here is easy to measure for messages
to streams with many users, even with all the other activity
that goes on inside do_send_messages(). I took test_performance()
in test_messages.py, set num_extra_users to 3000, and consistently
measured a ~20% speedup in do_send_messages().
This commit also eliminates fetching of emails. We probably
could have done that in a prior commit, but in this commit it
is very explicit that we don't need it. While removing email
from the query is a no-brainer, it actually had a negigible
impact on performance. Almost all the savings here comes from
not create UserProfile objects.
This function returns a summary of recipient data for a message
that's being sent. It's mostly just moving code into the
old function called get_recipient_user_profiles().
This commit is necessary to prevent bringing back emails from the
DB for all N recipients of a message just to see if the feedback
bot is being invoked.
We calculate `service_bot_tuples` earlier in the function, so that
we don't need "full" UserProfile objects later in the function.
This is part of consolidating code that basically just needs to
triage user_ids.
This starts to phase out the need for UserProfile objects in
do_send_messages(). UserProfile objects are expensive to create
for large streams with lots of users. The objects in the code
before this commit aren't even full UserProfile objects.
This change mostly sets up future performance improvements, but
we also get a minor speedup here when we run a test with 3000
stream subscribers.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.
By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
This function is essentially a copy of get_recipient_user_profiles,
which is about to go away. The new function enforces the contract of
typing indicators, which is that they don't apply to streams, which
allows us to use a relatively simple approach for getting user
profile objects.
We are diverging this code, because the send-message path needs
more optimizations.
This change introduces an extra hop to the database, but it is
generally faster due to nuances of the DB and the ORM. It
also sets us up to optimize get_recipient_user_profiles() by
avoiding creating ORM objects.
I measured the impact of this using a stream with 3000
subscribers, half of whom were idle, and it speeds things up
by 10%.
The commit() call in fix() breaks migrations and tests (unless you
mock) due to outer transactions.
We now explicitly call commit() from the management command.
Usually a small minority of users are eligible to receive missed
message emails or mobile notifications.
We now filter users first before hitting UserPresence to find idle
users. We also simply check for the existence of recent activity
rather than borrowing the more complicated data structures that we
use for the buddy list.
This commit completely switches us over to using a
dedicated model called MutedTopic to track which topics
a user has muted.
This includes the necessary migrations to create the
table and populate it from legacy data in UserProfile.
A subsequent commit will actually remove the old field
in UserProfile.
Admins need to know about private streams to delete them, even
if they are not subscribed. We send the minimal info possible
to the client to allow them to have a UI for that.
This never made sense to be a flag on the UserMessage table, since
it's not per-user state. And in fact it doesn't need to be in a
database at all, since it's easily computed from content anyway.
Fixes#1099.
And it works!
A couple of things still to do:
* When a device token is no longer active, we'll get HTTP status 410.
We should then remove the token from the database so we don't keep
trying to push to it. This is fairly urgent.
* The library we're using has a nice asynchronous API, but this
version doesn't use it. This is OK now, but async will be
essential at scale.
This code empirically doesn't work. It's not entirely clear why, even
having done quite a bit of debugging; partly because the code is quite
convoluted, and because it shows the symptoms of people making changes
over time without really understanding how it was supposed to work.
Moreover, this code targets an old version of the APNs provider API.
Apple deprecated that in 2015, in favor of a shiny new one which uses
HTTP/2 to meet the same needs for concurrency and scale that the old
one had to do a bunch of ad-hoc protocol design for.
So, rip this code out. We'll build a pathway to the new API from
scratch; it's not that complicated.
We'd been getting errors from APNs that appeared to say that the
device tokens we were trying to send to were invalid. It turned out
that the device tokens didn't match the "topic" (i.e. app ID) we were
sending, which was because the topic was wrong, which was because we
were using the wrong SSL cert. But for a while we thought it might be
that we were somehow messing up the device tokens we put into the
database. This logging helped us work out that wasn't the issue, and
would have helped our debugging sooner.
This brings type-checking to the last place we fetch
data from Redis, with the exception of our APNs code
which is being replaced (with a Redis-free version,
thanks to improvements in Apple's APNs API) shortly.
This gives us type-checking, to help prevent bugs like the
last couple of commits fixed in our Tornado code and our
missed-message email handling. Fortunately no behavior
changes are needed here.
Redis and the Redis client know nothing but bytes. When we take a
`bytes` object it returns and pass it down as `subject` here, it
causes an exception deep inside message processing if the realm has
any filters, when `bugdown.subject_links` attempts to search the
subject for the filters, which are of course `str` patterns.
For symmetry, make the conversion to bytes on the storing side
explicit too.
Previously, we didn't pass customized HTTP_HOST headers when making
network requests. As we move towards a world where everything is on a
subdomain, we'll want to start doing that.
The vast majority of our test code is written to interact with the
default "zulip" realm, which has a subdomain of "zulip". While
probably longer-term, we'll wish this was the root domain, for now, we
need to make our HTTP requests match what is expected by the test
code.
This commit almost certainly introduces some weird bugs where code was
expecting a different subdomain but the tests doesn't fail yet. It's
not clear how to find all of these, but I've done some grepping.
get_realm_by_email_domain was intended to be registration flow code
not used in other code, but it was leaked to a few places. This
removes one of the main remaining references to it outside the
registration code path.
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where
were updating muted_topics spuriously before calling
do_update_muted_topic.
Unlike creating a stream, there's really no reason one would want to
call the function to create a realm while uncertain whether that realm
already existed.
This change is mostly based on a similar commit from hackerkid
in a feature branch. It borrows both code and ideas. Some of
it's my own stuff, as I was working on a newer branch.
We now call get_user_including_cross_realm_email() inside of
user_profiles_from_unvalidated_emails(), instead of using
get_user_profile_by_email.
This requires a few of our callers to pass down sender into us.
One consequence of this change is that we change the symptoms
for trying to send to emails outside of your realm. In some
cases, we simply raise an error that an email is invalid to us
instead of getting into the deeper validate_recipient_user_profiles
check.
We are trying to convert emails to user_profiles earlier in
the codepath. This may cause subtle changes in which errors
appear, but it's probably generally good to report on bad
addressees sooner than later.
This class simplifies the calling sequence to methods like
check_message and _internal_prep_message, and it's also more
type safe.
Checking for message types is encapsulated with calls to is_stream()
and is_private(). There are also shortcut constructors when you
know that the type of the address (stream vs. private), which is often.
An expression like `force_bytes(chr(...))`, on Python 3 where the
`force_bytes` finds itself with something to do because `chr` returns
a text string, gives the UTF-8 encoding of the given value as a
Unicode codepoint.
Here, we don't want that -- rather we want the given value as a
single byte. We can do that with `struct.pack`.
This fixes an issue where the "Link with Webathena" flow was producing
invalid credential caches when run on Python 3, breaking the Zephyr
mirror for any user who went through it anew.
Given typeahed and the fact that this only worked if the person had a
full name that didn't contain whitespace, this side effect of the
original @shortname mentionfeature that we removed was experienced by
users as a bug.
Fixes#6142.
This function optimizes marking streams and topics as read,
by using UserMessage.where_unread(), which uses a partial
index on the "read" flag.
This also simplifies the code path for ordinary message
flag updates.
In order to keep 100% line coverage, I simplified the
logging in update_message_flags, so now all requests
will show the "actually" format.
This is an interim step toward creating dedicated endpoints
for marking streams/topics as reads, so we do error checking
with asserts for flag/operation, so we don't introduce a
temporary translation string.
This is the first part of a larger migration to convert Zulip's
reactions storage to something based on the codepoint, not the emoji
name that the user typed in, so that we don't need to worry about
changes in the names we're using breaking the emoji storage.
This commits adds new helper functions which are:
* get_users_for_soft_deactivation(): This function can be used to
fetch a list of human users which pass the criteria of minimum
inactivity period (in days) passed as a parameter to the function.
* do_soft_activate_users(): Given a list of users this function
reactivates them and help them catch up with the missing message
rows for them in the UserMessage table.
This function will help us in creating undisturbed experience for
returning soft deactivated users.
Tweaked by tabbott to fix minor performance and clarity issues.
This field is convenient for bankruptcy checks. Clients could
calculate it from page_params.unread_msgs before this change, but
it would kind of a painful calculation.
To add count, we had to simplify the mypy annotations, which weren't
really accurate before.
We were exiting this function in certain cases before updating
mentions. This bug was always there, but it was flaky in terms
of database setup whether the tests would fail, so now the
relevant test sends three consecutive messages.
We also avoid putting duplicate message ids in mentions.
It's hard to find literature with the community tone we're going for, that
is consistent with the Zulip code of conduct, etc.
This commit removes the special tooling for Gutenberg plays, and changes the
text to be some mixture of scigen, Communications From Elsewhere,
chat.zulip.org, and various books from the public domain.
We apparently were not correctly clearing the user_profile's email
address from caches when changing email addresses, which meant that
trying to look up the old email in the user_profile caches would still
work.
Fixes#6035.
The "all" option for 'message/flags' was dangerous, as it could
apply to any of our flags. The only flag it made sense for, the
"read" flag, now has a dedicated endpoint.
This change simplifies how we mark all messages as read. It also
speeds up the backend by taking advantage of our partial index
for unread messages. We also use a new statsd indicator.
Create a generator script to pull lines from a play, enhancing
random lines with emoji, Markdown and other flair.
With numerous contributions from Rein Zustand and Tim Abbott to finish
the project.
Fixes: #1666.
Some of this code was only used by the `active_user_stats`
management command deleted in the previous commit. Other
code appears to have already been dead. Remove it all.
interface_type select menu will be used to choose the interface
for outgoing webhooks. It will be displayed only when the selected
bot type is OUTGOING WEBHOOK type. The default value is GENERIC
interface type (1).
We are adding a new list of unread message ids grouped by
conversation to the queue registration result. This will allow
clients to show accurate unread badges without needing to load an
unbound number of historic messages.
Jason started this commit, and then Steve Howell finished it.
We only identify conversations using stream_id/user_id info;
we may need a subsequent version that includes things like
stream names and user emails/names for API clients that don't
have data structures to map ids -> attributes.
In anticipation of have all unread message ids available to the
web app in page_params (via a separate effort), we are simplifying
the /topics endpoint to no longer return unread counts.
Instead we have a list of tiny dictionaries with these fields:
name - name of the topic
max_id - max message id for the topic (aka most recent)
The items in the list are order by most-recent-topic-first.
This allows us to reliably parse the error in code, rather than
attempt to parse the error text. Because the error text gets
translated into the user's language, this error-handling path
wasn't functioning at all for users using Zulip in any of the
seven non-English languages for which we had a translation for
this string.
Together with 709c3b50f which fixed a similar issue in a
different error-handling path, this fixes#5598.
Process the unicode emojis in twitter link previews and render them
properly. Before this we were not processing the unicode emojis in
twitter link previews and hence on the systems which don't have
fonts for displaying them they were rendered as blank boxes.
Fixes: #5427.
This commit renames list named `to_linkify` in twitter link processor
to `to_process` and adds a `type` field to each entry in it to
indicate the type of data represented by that particular entry.
Add test to check if the embedded bot service being used is in the
registry or not.
Add test to check if the bot being added to the registry has a valid
bot corresponding to it.
Move 'get_bot_handler' to 'zerver/lib/bot_lib.py' as it is an independent
function, not related to the 'EmbeddedBotWorker' class that it was
previously a part of.
This fixes the original issue that #5598 was the root cause of; when
the user returns to a Zulip browser tab after they've been idle past
the timeout (10 min, per IDLE_EVENT_QUEUE_TIMEOUT_SECS), we now
correctly reload the page even if they're using Zulip in German or
another non-English language where we have a translation for the
relevant error message.
All JsonableError subclasses now have corresponding ErrorCode values
of their own, reducing the number of different patterns for using
the new JsonableError API.
This provides the main infrastructure for fixing #5598. From here,
it's a matter of on the one hand upgrading exception handlers -- the
many except-blocks in the codebase that look for JsonableError -- to
look beyond the string `msg` and pass on the machine-readable full
error information to their various downstream recipients, and on the
other hand adjusting places where we raise errors to take advantage
of this mechanism to give the errors structured details.
In an ideal future, I think all exception handlers that look (or
should look) for a JsonableError would use its contents in structured
form, never mentioning `msg`; but the majority of error sites might
continue to just instantiate JsonableError with a string message. The
latter is the simplest thing to do, and probably most error types will
never have code looking for them specifically.
Because the new API refactors the `to_json_error_msg` method which was
designed for subclasses to override, update the 4 subclasses that did
so to take full advantage of the new API instead.
This simplifies things for all codepaths not involving this feature.
Using this feature becomes slightly easier when you're already
defining a subclass, but now requires you to define a subclass.
Currently we use it just once out of >100 uses of JsonableError, and
that use already has a subclass, so this seems like a win.
With #5598 there will soon be an application-level error code
optionally associated with a `JsonableError`, so rename this
field to make clear that it specifically refers to an
HTTP status code.
Also take this opportunity to eliminate most of the places
that refer to it, which only do so to repeat the default value.
The file `zerver/lib/request.py` doesn't have type annotations
of its own; if they did, they would duplicate the annotations that
exist in its stub file `zerver/lib/request.pyi`. The latter exists
so that we can provide types for the highly dynamic `REQ` and
`has_request_variables`, which are beyond the type-checker's ken
to type-check, but we should minimize the scope of code that gets
that kind of treatment and `JsonableError` is not at all the sort of
code that needs it.
So move the definition of `JsonableError` into a file that does
get type-checked.
In doing so, the type-checker points out one issue already:
`__str__` should return a `str`, but we had it returning a `Text`,
which on Python 2 is not the same thing. Indeed, because the
message we pass to the `JsonableError` constructor is generally
translated, it may well be a Unicode string stuffed full of
non-ASCII characters. This is potentially a bit of a landmine.
But (a) it can only possibly matter in Python 2 which we intend to
be off before long, and (b) AFAIK it hasn't been biting us in
practice, so we've probably reasonably well worked around it where
it could matter. Leave it as is.
The whole thing is an error, so "message" is a more apt word for the
error message specifically. We abbreviate that as `msg` in the actual
HTTP responses and in the signatures of `json_error` and friends, so
do the same here.
In order to benefit from the modern conveniences of type-checking,
add concrete, non-Any types to the interface for JsonableError.
Relatedly, there's no need at this point to duck-type things at
the places where we receive a JsonableError and try to use it.
Simplify those by using straightforward standard typing.
This fixes some error message strings and skips converting request_data
into json. From now, conversion would be the responsibility of interface.
Also, base_url is now not passed into event structure.
Splitting bot_lib.py file into 2 files led to unnecessary
redirection of the code workflow. For an embedded bot/service to
send a reply, it was being redirected 3 times.
First, the code flow comes to "EmbeddedBotHandler" class to send
reply, then it goes to the common function in "zulip_bots/lib.py",
then it would come back to "EmbeddedBotHandler". Later on, if we
create an abstract class, from where the bot work flow would
directly hit and then from there it is classified into
EmbeddedBotHandler or ExternalBotHandler and accordingly it would
get redirected.
Now, first the bot flow goes to it's handler class External or
Embedded (where we pass that this is External or Embedded bot as
parameter) and then goes to a common point and then comes back to
the same class.
This is required, since we just reorganized the python-zulip-api
repository into 3 packages.
A nice side effect is that we get to eliminate some now-unnecessary
code for editing sys.path.
In most cases, we do have the data for which other user was
responsible for subscribing the target user to new streams.
The main case where we don't is when the user is created and gets the
default streams.
ScheduledJob was written for much more generality than it ended up being
used for. Currently it is used by send_future_email, and nothing
else. Tailoring the model to emails in particular will make it easier to do
things like selectively clear emails when people unsubscribe from particular
email types, or seamlessly handle using the same email on multiple realms.
Both the queue processor and ScheduledJob emails need to sometimes pass a
to_user_id and sometimes pass a to_email, and it's more convenient to just
have one function that they can call that can handle either.
Also removes the now redundant send_email_to_user.
Better to see "noreply@..." when replying to a message that you can't reply
to than to see "Zulip" (for email clients that hide the email address when
there is a display name).
This is required for test fixtures which contain `stream_id`. Prior
to python 3.3 hashes were not randomized but after a security fix
hash randomization was enabled by default in python 3.3 which made
iteration of dictionaries and sets completely unpredictable.
This new setting controls whether or not users are allowed to see the
edit history in a Zulip organization. It controls access through 2
key mechanisms:
* For long-ago edited messages, get_messages removes the edit history
content from messages it sends to clients.
* For newly edited messages, clients are responsible for checking the
setting and not saving the edit history data. Since the webapp was
the only client displaying it before this change, this just required
some changes in message_events.js.
Significantly modified by tabbott to fix some logic bugs and add a
test.
I pushed a bunch of commits that attempted to introduce
the concept of `client_message_id` into our server, as
part of cleaning up our codepaths related to messages you
sent (both for the locally echoed case and for the host
case).
When we deployed this, we had some strange failures involving
double-echoed messages and issues advancing the pointer that appeared
related to #5779. We didn't get to the bottom of exactly why the PR
caused havoc, but I decided there was a cleaner approach, anyway.
We are deprecating local_id/local_message_id on the Python server.
Instead of the server knowing about the client's implementation of
local id, with the message id = 9999.01 scheme, we just send the
server an opaque id to send back to us.
This commit changes the name from local_id -> client_message_id,
but it doesn't change the actual values passed yet.
The goal for client_key in future commits will be to:
* Have it for all messages, not just locally rendered messages
* Not have it overlap with server-side message ids.
The history behind local_id having numbers like 9999.01 is that
they are actually interim message ids and the numerical value is
used for rendering the message list when we do client-side rendering.
Prior to this commit, 7 megabytes of images (through 253 individual requests)
were heavily slowing down the initial load. With this commit, we load only the
logos (60 or so images).
Documentation and images for the individual integration sub-pages is requested
separately using the /integrations/doc/ endpoint, which returns HTML.
We were using relative URLs for realm emojis in missed message emails.
Since the email server is not same as the Zulip server and doesn't
have the realm emoji files, they were rendered as broken images. This
commit fixes them to use absolute URL.
Fixes: #5692.
The Realm.DoesNotExist exception will never be raised if we pass a
string_id as in that case zerver.models.get_realm function would be
used for fetching. realm.zerver.models.get_realm uses filter query
so the exception will not be raised even if there are no realm which
matches the query.
Tests added by tabbott.
This system hasn't been in active use for several years, and had some
problems with it's design. So it makes sense to just remove it to declutter
the codebase.
Fixes#5655.
This new library is intended to make it easy for management commands
to access a realm or a user in a realm without having to duplicate any
of the annoying parsing/extraction code.
Make it less likely that further development will break compatibility with
ZULIP_ADMINISTRATORs of the form "name <email>".
Note that the suggested value for this setting has been
'zulip-admin@example.com' for a while, so hopefully this commit causes no
change for most installations.
This prevents users from accidentally sending a confirmation link
specific to their account to their Zulip administrator if they reply
to the invitation, invitation reminder, account confirmation, or new
email confirmation emails.
Instead of removing an emoji from the database, just mark them as
deactivated so that they can't be used further but can be rendered
properly in reactions and messages.
Fixes: #4750.
No change in behavior.
Also makes the first step towards converting all uses of
settings.ZULIP_ADMINISTRATOR and settings.NOREPLY_EMAIL_ADDRESS to
FromAddress.*.
Once everything is converted, it will be easier to ensure that future
development doesn't break backwards compatibility with the old style of
settings emails.
This will allow for customized senders for emails, e.g. 'Zulip Digest' for
digest emails and 'Zulip Missed Messages' for missed message emails.
Also:
* Converts the sender name to always be "Zulip", if the from_email used to
be settings.NOREPLY_EMAIL_ADDRESS or settings.ZULIP_ADMINISTRATOR.
* Changes the default value of settings.NOREPLY_EMAIL_ADDRESS in the
prod_setting_template to no longer have a display name. The only use of
that display name was in the email pathway.
Unicode emojis when rendered should display canonical short name.
Similarly, the alt text should be of the format `:<short_name>:`.
For both of these we currently display the actual unicode symbol.
As some systems don't have the fonts necessary for displaying them
properly, they are rendered as empty square blocks. This commit also
ensures that the markup generated for emoji generated by canonical
name and by an unicode emoji is same.
Fixes: #5555.
Once we implement org_type-specific features, it'll be easy to change a
corporate realm to a community realm, but hard to go the other way. The main
difference (the main thing that makes migrating from a community realm to a
corporate realm hard) is that you'd have to make everyone sign another terms
of service.
Also move a couple of comments inside the functions they describe,
because they're about the implementation of the functions rather
than their interface.