We use the message a lot for the query modified
here, so I think it's worth taking the up-front
hit of getting bulkier objects to avoid O(N)
hops back to the database.
Our webhook-errors.log file is riddled with exceptions that are
logged when a webhook is incorrectly configured to send data in
a non-JSON format. To avoid this, api_key_only_webhook_view
now supports an additional argument, notify_bot_owner_on_invalid_json.
This argument, when True, will send a PM notification to the bot's
owner notifying them of the configuration issue.
We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
The various vars here that had recipient_subject
in the name now have either bucket or bucket_tup
there.
The shorter names are a bit easier to read, and the
original names were misleading for the PM case.
This was basically two search/replaces, and we have
good test coverage here, so it's pretty low risk
despite the messy diff.
We now attach zulip_db_data to the markdown engines
for classes that need it. This was the last remaining
global we had, so we remove `arguments.py` here.
The Markdown processor makes it fairly simple for
the helper classes to access the `md` engine. We
now write `_md_engine.zulip_message` to avoid having
the current message in the global namespace.
Note that we do reuse engines for multiple messages,
but each engine is specific to a realm. And we therefore
avoid even the theoretical possibility of leaking message
data between realms.
This makes us consistent with how we import codehilite.
Using Python's normal import mechanism avoids some overhead
with Markdown having to parse dotted notation.
These modules are tiny, so they shouldn't impact startup
too much. Also, by explicitly importing them, we avoid
the pitfall of having a sucessful startup and a broken
renderer.
We were building the same link regex every time
we build a Markdown engine, which happens twice
per realm. It's an expensive operation due to
the complexity of the regex and us reading a file.
Nested classes are kind of expensive in Python,
particularly when you throw in mypy annotations.
Also, flatter is arguably better, although it is
kind of a pain here not to have closures.
This change avoids hitting the Django ORM when
we don't find any possible group mentions in
the message content.
Django doesn't necessarily actually hit the database,
but it's still slow and shows up in profiles.
This commit speeds up the import by avoiding
sender lookups and instead using the data
for users that we already have in memory.
This avoids a few DB hops, many hops to memcached,
plus some object construction.
We now call do_render_markdown() directly. This
also makes it more explicit that the import has
never rendered alert words.
For the import-data codepath, we will call
the extracted function directly in a
subsequent commit.
The do_render_markdown() function has more
required parameters, which allows for more
explicit code and also allows us to flatten
out some logic related to alert words. (We
just pass in empty sets/dicts as needed).
We can rely on `message_realm` being the same
as `message.sender.realm`, which allows us to
skip two queries to the database for the rare
Zephyr mirroring case.
This function requires a message object, whereas
we want to work with JSON data to avoid necessary
queries when we import data. Inlining the function
sets us up for a subsequent refactoring.
We change the way we deal with theoretical return
values of `None` to use an assertion; otherwise,
we would have to loosen up a bunch of mypy types
from `str` to `Optional[str]`. It's not clear `None`
is even possible--we've moved toward throwing exceptions
there instead of silently failing.
This is somewhat hairy logic, so it's nice
to extract it and not worry about variable leaks.
Also, this moves some legacy "subject" references out
of actions.py.
We start by including functions that do custom
queries for topic history.
The goal of this library is partly to quarantine
the legacy "subject" column on Message.
Recently, one of our users reported that a JIRA webhook was not
able to send messages to a stream with a space character in its
name. Turns out that JIRA does something weird with webhook URLs,
such that escaped space characters (%20) are escaped again, so
that when the request gets to Zulip, the double escaped %20 is
evaluated as the literal characters `%20`, and not as a space.
We fix this by unescaping the stream name on our end before
sending the message forward!
The previous logic was incorrect, in that if `content_type` was set to
None (which happens with Slack/HipChat export, among other things),
then we wouldn't run the `guess_type` logic to auto-detect the
Content-Type to send to S3.
The UserMessage table can be huge, so creating a
bunch of entries in `ID_MAP` can overflow memory.
We don't have any tables that depend on `UserMessage`,
and we don't send the 'id' fields from `zerver_usermessage`
to the database, so re-mapping them was just busy-work.
This is a preparator refactor for supporting hosting different Tornado
processes on different servers; to look up which Tornado server we
should be sending the event to, we'll need the realm object.
Apparently, the QUERY_STRING property of the report object wasn't
actually a string; since we only care about its string representation,
we should just stringify it.
Fixes#10745.
Use get_display_recipient to get stream names, and remove the
references to message.stream_name in push_notifications.py which were
added in 97571a203, as the actual stream names were being retrived
only for Message objects associated with public streams.
When you send a message to a bot that wants
to talk via an outgoing webhook, and there's
an error (e.g. server is down), we send a
message to the bot's owner that links to the
message that triggered the error.
The code to produce those links was out of
date.
Now we move the important code to the
`url_encoding.py` library and fix the PM
links to use the more modern style (user_ids
instead of emails). We also replace "subject"
with "topic" in the stream urls.
This supports guest user in the user-info-form-modal as well as in the
role section of the admin-user-table.
With some fixes by Tim Abbott and Shubham Dhama.
This will help us eliminate camo from our production installs.
Basically it helps us de duplicate some code from upcoming code
which will help us check validity of a camo url.
Also, rename get_alert_from_message to get_gcm_alert.
With the implementation of the and get_apns_alert_title and
get_apns_alert_subtitle, the logic within get_alert_from_message
is only relevant to the GCM payload, so we adjust the name
accordingly.
Progresses #9949.
Resolves https://github.com/zulip/zulip-mobile/issues/1316.
The string that is returned from get_alert_from_message is
dependent upon the same message that is passed into get_apns_payload
and get_gcm_payload. The contents of those payloads that are tested via
TestGetAPNsPayload and TestGetGCMPayload, which makes the tests for
get_alert_from_message redundant.
Also, simplify the logic by removing the last elif conditional.
If we use an outgoing webhook and the web server
responds with `widget_content` in the payload, we
include that in what we send through the send-message
codepath.
This makes outgoing webhook bots more consistent with
generic bots.
Bots are not allowed to use the same name as
other users in the realm (either bot or human).
This is kind of a big commit, but I wanted to
combine the post/patch (aka add/edit) checks
into one commit, since it's a change in policy
that affects both codepaths.
A lot of the noise is in tests. We had good
coverage on the previous code, including some places
like event testing where we were expediently
not bothering to use different names for
different bots in some longer tests. And then
of course I test some new scenarios that are relevant
with the new policy.
There are two new functions:
check_bot_name_available:
very simple Django query
check_change_bot_full_name:
this diverges from the 3-line
check_change_full_name, where the latter
is still used for the "humans" use case
And then we just call those in appropriate places.
Note that there is still a loophole here
where you can get two bots with the same
name if you reactivate a bot named Fred
that was inactive when the second bot named
Fred was created. Also, we don't attempt
to fix historical data. So this commit
shouldn't be considered any kind of lockdown,
it's just meant to help people from
inadvertently creating two bots of the same
name where they don't intend to. For more
context, we are continuing to allow two
human users in the same realm to have the
same full name, and our code should generally
be tolerant of that possibility. (A good
example is our new mention syntax, which disambiguates
same-named people using ids.)
It's also worth noting that our web app client
doesn't try to scrub full_name from its payload in
situations where the user has actually only modified other
fields in the "Edit bot" UI. Starting here
we just handle this on the server, since it's
easy to fix there, and even if we fixed it in the web
app, there's no guarantee that other clients won't be
just as brute force. It wasn't exactly broken before,
but we'd needlessly write rows to audit tables.
Fixes#10509
When we create new ids for message rows, we
now sort the new ids by their corresponding
pub_date values in the rows.
This takes a sizable chunk of memory.
This feature only gets turned on if you
set sort_by_date to True in realm.json.
We could migrate all the current PREMIUM_FREE organizations to have more
invites, but this setting mainly affects orgs right as they are starting, so
it's probably fine.
We seemed to have been doing too much of sharpening on the thumbnails.
The purpose of sharpening here was to just counter the softening
effects of a resize on an image but overdoing it is bad.
Value sharpen(0.5,0.2,true) seems to look good for achieving the
best results here on different displays as revealed in the manual
hit and trial based testing.
Thanks to @borisyankov for pointing out the issue and suggesting
the values.
For some webhook endpoints where the third-party API requires us to do
this, the user's API key might appear in error emails through
appearing in the `QUERY_STRING` parameter. Fix that by filtering any
actual content from those; what we usually need for debugging is just
what set of parameters were provided.
The APNS client libraries (especially the hyper.http20 one) were
determined via profiling to take significant time during the import
process, so we move them to be lazily imported in order to optimize
the overall Zulip import process. This save up to about 100ms in
import time.
These libraries are only used in certain Django processes inside
zulipchat.com, and so are unnecessary both in development as well as
for self-hosted Zulip servers.
This is a prepartory commit for the upcoming changes. It was meaningful
to extract this one out because this function is essentially a condition
check on whether a given url is one of the user_uploads or an external
one. Based on its value we decide whether a url must be thumbnailed or
not and thus this function will also be used in an upcoming commit
patching lib/thumbnail.py to do the same check before thumbnail url
generation.
We are basically adding a check for url's to be external (belonging
to some 3rd party web site hosting the image) or be one of the
user uploaded files. User uploaded files are served by a separate
endpoint which is /user_uploads/. Any other local url such as
/user_avatars/ or /static/ should never be sent to thumbor for
thumbnailing.
Not sending /user_avatars/ to thumbor for thumbnailing makes sense
because they are already properly thumbnailed and stored properly.
/static/ urls host very few images we use for demo and can be safely
be excluded from thumbnailing.
The Zulip API is to be used on both development and production
servers, and really we just need to talk about zuliprc files.
There's a similar issue for the JS docs, but we need to fix the
copy/paste issues with those as well.
We don't want really long urls to lead to truncated
keys, or we could theoretically have two different
urls get mixed up previews.
Also, this suppresses warnings about exceeding the
250 char limit.
Finally, this gives the key a proper prefix.
We use UserMessageLite to avoid Django overhead, and we
do updates in chunks of 10000. (The export may be broken
into several files already, but a reasonable chunking at
import time is good defense against running out of memory.)
Now that we allow multiple users to have registered the same token, we
need to configure calls to unregister tokens to only query the
targeted user_id.
We conveniently were already passing the `user_id` into the push
notification bouncer for the remove API, so no migration for older
Zulip servers is required.
If cordelia searches on pm-with:iago@zulip.com,cordelia@zulip.com,
we now properly treat that the same way as pm-with:iago@zulip.com.
Before this fix, the query would initially go through the
huddle code path. The symptom wasn't completely obvious, as
eventually a deeper function would return a recipient id
corresponding to a single PM with @iago@zulip.com, but we would
only get messages where iago was the recipient, and not any
messages where he was the sender to cordelia.
I put the helper function for this in zerver/lib/addressee, which
is somewhat speculative. Eventually, we'll want pm-with queries
to allow for user ids, and I imagine there will be some shared
logic with other Addressee code in terms of how we handle these
strings. The way we deal with lists of emails/users for various
endpoints is kind of haphazard in the current code, although
granted it's mostly just repeating the same simple patterns. It
would be nice for some of this code to converge a bit. This
affects new messages, typing indicators, search filters, etc.,
and some endpoints have strange legacy stuff like supporting
JSON-encoded lists, so it's not trivial to clean this up.
Tweaked by tabbott to add some additional tests.
For our bots that use GenericOutgoingWebhookService
(which are basically Zulip style bots), we now
include a "content-type" header of "application/json".
We accomplish this by having the service classes
implement their own custom method called
`send_data_to_server`. For the Slack-related
code, we just extracted code from `do_rest_call`,
and then for the Zulip-related code, we added
a `headers` parameter.
If we omit methods in subclasses, they're likely to
be caught by linters or unit tests, and even if they
aren't, raising NotImplementedError doesn't actually
prevent user problems.
I've been fighting these in refactoring, and it's
just been a bunch of busy work, plus comments are
highly likely to bitrot.
This fixes a couple things:
* process_event() is a pretty vague name
* returning tuples should generally be avoided
* we were producing the same REST parameters in both
subclasses
* relative_url_path was always blank
* request_kwargs was always empty
Now process_event() is called build_bot_request(),
and it only returns request data,
not a tuple of `rest_operation` and `request_data`.
By no longer returning `rest_operation`, there are
fewer moving parts. We just have `do_rest_call` make
a POST call.
Before this change, we instantiated base_url into a superclass
of subclasses that returned base_url into a dictionary that
gets returned to our caller.
Now we just pull base_url out of service when we need to make
the REST call.
We move the JSON parsing step into the
higher level function: process_success_response().
In the unlikely event that we'll start integrating
with a solution that doesn't use JSON, we can deal
with that, and for now doing the parsing in one
place will help us make error reporting more
consistent.
In a subsequent commit we'll introduce better
error handling for malformed JSON.
The earlier code here, if it got a payload with
"response_string" as a key, would prefix the
corresponding value with "Success!". We just
want the bot to set its own content.
The code is reorganized here so that process_success()
always produces a value keyed by "content" from
incoming data, and then process_success_response()
doesn't do any fancy munging of the data.
Previously, Zulip did not correctly handle the case of a mobile device
being registered with a push device token being registered for
multiple accounts on the same server (which is a common case on
zulipchat.com). This was because our database `unique` and
`unique_together` indexes incorrectly enforced the token being unique
on a given server, rather than unique for a given user_id.
We fix this gap, and at the same time remove unnecessary (and
incorrectly racey) logic deleting and recreating the tokens in the
appropriate tables.
There's still an open mobile app bug causing repeated re-registrations
in a loop, but this should fix the fact that the relevant mobile bug
causes the server to 500.
Follow-up work that may be of value includes:
* Removing `ios_app_id`, which may not have much purpose.
* Renaming `last_updated` to `data_created`, since that's what it is now.
But none of those are critical to solving the actual bug here.
Fixes#8841.
We now allow outgoing webhooks to provide us a
"content" field, which is probably a more guessable
name than "response_string", particularly for folks
that use our other bot-related APIs. And we don't
modify content as we do response_string, i.e. no
"Success!" prefix.
If we're not too concerned about backward compatibility,
we can do a subsequent commit that makes "content"
and "response_string" true synonyms and get rid of
the "Success!" prefix, which was probably accidental
to begin with.
This commit starts by changing the third
argument of send_response_message to be a Dict
instead of a string, so that the data can be more
structured going forward.
That change makes the 2nd/3rd parameters both be
dicts, so to be defensive, I now have all the callers
pass in explicit keyword names. And then I rename
message to message_info, so that the callers have
more clear code.
And that changes the implementation inside of
send_response_message() a bit.
Sorry this commit is a bit coarse, but the intermediate
commits would have been kind of ugly, too.
At the end of the day, it's pretty simple:
bot_id: never changed
message_info: just renamed from message
response_data: is a Dict with the key of "content"
And the innards of send_response_message() are basically
simply dictionary lookups and function calls.
There's no reason to return a failure message in
process_success(), since it's implied to be part of
the success codepath. I didn't look at the full history
of how the strange API evolved, but the second element
of the tuple was clearly noise by the time I got here.
Neither of the subclasses ever set it, and none of the
consumers used it.
This two-line function wasn't really carrying its
weight, and it just made it harder to refactor the
overall codepath.
Eliminating the function forces us to mock at a slightly
deeper level, which is probably a good thing for what
the test intends to do. The deeper mock still verifies that
we're sending the message (good) without digging into
all the details of how we send it (good).
Note that we will still keep around the similarly named
`fail_with_message` helper, which is a lot more useful.
(The succeed/fail scenarios aren't really symmetric here.
For success, there are fewer codepaths that do more complex
things, whereas we have lots and lots of failure codepaths
that all do the same simple thing of replying with a canned
message.)
Before this change subclasses of OutgoingWebhookServiceInterface
would return a raw string as the first element of its return
tuple in process_success(). This is not a very flexible
design, as it prevents the bot from passing extra data like
`widget_content`.
It's also possible in the future that we'll want to let outgoing
bots reply directly to senders who mention them on streams, and
again the original design was overly constrained for that.
This commit does not actually change any functionality yet.
The code was needlessly querying the DB to get full
objects for entities where we only needed user_id,
realm_id, and stream_id.
With my test data of ~1000 records this sped up the
function from ~8s to ~0.5s. The speedup would probably
be even more for larger data sets.
Fixes the urgent part of #10397.
It was discovered that soft-deactivated users don't get mobile push
notifications for messages on private streams that they have configured
to send push notifications.
Reason: `handle_push_notification` calls `access_message`, and that
logic assumes that a user who is a recipient of a message has an
associated UserMessage row. Those UserMessage rows are created
lazily for soft-deactivated users, so they might not exist (yet)
until the user comes back.
Solution: Ensure that userMessage row is created for
stream_push_user_ids and stream_email_user_ids in create_user_messages.
At some point as part of the process of supporting renumbering data,
we changed the structure of our file uploads to expect `path` to match
`s3_path`, with both having the relative path within the overall
hierarchy (including the realm ID). This change updates the more
rarely-used S3 export code path to use that model, fixing a crash when
messages reference an Attachment object with a rewritten path_id.
If any user had sent the reply to the welcome bot recommended by our
tutorial, then the Zulip export/import process didn't work properly,
because we weren't including (and then remapping) the recipient ID for
sending PMs to the cross-realm bots. This commit fixes that gap, by
recording the necessary data on the export side, and doing the
appropriate remapping on the import side.
Previously, our realm import logic only did the special remapping
logic for the original notifications_stream_id; when we added the new
signup_notifications_stream_id field, we neglected to handle it in the
same way.
In the event that two processes are racing to be the
first to load data from zulip.yaml, we now make the
race scenario be duplicated effort instead of having
the second racer get an attribute error on `data`.
We do this by declaring victory only after setting
`data`. "Declaring victory" in this case is a matter
of setting `last_update`.
We are still possibly vulnerable to corrupted data
here, so we should investigate a mutex, or just
read the data on every call (but it's strangely
expensive, almost 3.5s on my instance), or converting
the YAML to code before launching the server.
We start by stripping the ids in front of the name before the database
lookup. This has the advantage of not mentioning anyone if an incorrect
user id and full name combination is specified, as well as not having
the query the database twice, once by fullname and next by id.
Previously, we were storing only the most recent person with the same
full name as others; this commit adds new keys to the dict such that
simply looking by name would get you the newest user with this name,
and the get_user_by_id function can index the remaining users.
We also remove some unreachable code. Calling
split() always returns at least one token, even
if it's just the empty string. This is tested
directly on this commit, plus messages with
empty content get rejected pretty early in
the execution path.
In user type custom field, field value is list of user ids. We weren't
converting list to json object in update event payload. This throws
error in frontend, cause we store stringify representation of custom
field value. Therefore, after update event is recieved field-value-
type gets updated to array from string which throws json parsing error.
Having HTML (or HTML-like) content in the examples was making parts of
the content invisible, since the browser identified them as HTML tags
rather than verbose text.
There are some endpoints that don't fall into the currently available
categories, so this new function will be used for calling the tests for
server and realm-related endpoints.
Using early-exit here allows us to more easily
comment why there are certain exemptions to
this logic.
We also only require callers to pass in realm,
not the whole user object.
Since this class was built, folks have always chosen
to subclass JsonableError for situations where
the default of ErrorCode.BAD_REQUEST is insufficient.
So now we simplify the use cases, which also gets
us 100% coverage on this core module.
This commit add FIELD_TYPE_CHOICES_DICT to page_params and replace
FIELD_TYPE_CHOICES.
FIELD_TYPE_CHOICES_DICT includes all field types with keyword, id
and display name. Using this field-type-dict, we can access field
type information by it's keyword, and remove all static use of
field-type'a name or id in frontend.
This commit also modifies functions in js where this page_params
field-types is used.
This commit modifies FIELD_TYPE_DATA dict in `CustomProfileField`
model to store keyword of field types. And create new dict
FIELD_TYPE_CHOICES_DICT to store all field type information
by field type keyword, i.e. id, name.
This is preparatory commit to remove all static use of field
types in frontend and access field type with keyword instead
of display name.
This prevents leaking some variables into an already
cluttered function.
We also add test coverage for what's now an
early-exit condition in the new function--we exempt
public MIT streams from these events.
This change was partially driven by a quirk in Python
where peephole optimizations make `continue` lines
appear not to be covered.
I also think it's generally a good idiom to extract
functions for loop bodies when they don't actually
accumulate values or maintain other state. With this
commit we now prevent potential bugs for vars like
`is_stream` leaking between loop iterations.
We simulate a race condition by mocking create_user
to actually create a user, but then raise an
IntegrityError (as if another process had actually
created the user, not our test).
I also changed the real code to use explicitly
named parameters.
I don't understand why this didn't cause test failures in CI; this
change was clearly required and test_change_realm_property was failing
consistently for me locally.
Our get_streams_traffic function used to query
all streams in the StreamCount table if you
passed in `None` for `streams`.
Now we require that you pass in a list of
stream_ids.
I don't know how much work this will save
the database, since probably the bulk of
the work is aggregating. If we need to fine
tune DB performance, we could possibly add
`realm` as an argument and add it to the filter.
What we'll immediately get, for large multi-realm
installations, is less data over the wire and
less work for the ORM.
The prior code uses an awkward idiom that
pre-dates the `exists()` function, and it
had an unreachable line of code.
The new version should be faster, since we
don't create a throwaway heavy Django object
or send needless data over the wire.
This functions appears to be redundant to
`access_stream_by_name`. The only
meaningful line of code in the function that we're
removing, the code that raises an error,
appears to be unreachable, despite reasonably
extensive tests.
The only thing the function was restricting
was that the case where the bot's owner was
unsubscribed to a private stream, which
is already locked down in
`access_stream_by_name` calls inside of
`patch_bot_backend`.
This commit increases test coverage
by removing unreachable code.
It's possible this function had
some theoretical value before we
introduced the `require_non_guest_human_user`
decorator to the `patch_bot_backend`
view, since in theory the bot itself
could have subscribed to a stream that
the owner didn't subscribe to. Even
then it's not clear that allowing the
bot to set that as a default stream
would have been harmful, since they
can already access it.
We want our methodology for extracting the last message
id to be consistent, particularly in terms of how we
handle edge cases. (I'll concede that the
`bulk_remove_subscriptions` codepath never hits that
corner case in practice, but it's harmless to handle
the theoretical case.)
It may also be nice to have this function show up
clearly in profiling.
This also adds some direct testing to the function.
It's not clear to me why we don't use `latest('id')`
in the implementation, but that's outside the scope
of this commit.
This de-clutters check_message a bit and also makes
it easy to audit our rules for who can write to a
stream.
Also, this works around a bug with Python where its
optimizations for the `pass` instruction make them
not appear to run and show up as uncovered in
coverage reports.
Fixes#10124.
Users in the waiting period category cannot subscribe other users to
a stream. When a user tries to mention another unsubscribed user, a
warning message appears with a subscribe button on it to subscribe
the other user.
This commit removes the subscribe button and changes the warning text
for users in the waiting period category.
Right now it only has one function, but the function
we removed never really belonged in actions.py, and
now we have better test coverage on actions.py, which
is an important module to get to 100%.
In this commit we fix a bug due to which url preview images for urls
to custom emojis, realm icons or user avatars appeared broken when
such urls would be part of a Zulip message.
This is a preparatory commit to fix a bug in which a user posts
a link of custom emoji, user avatar or realm icon in a Zulip
message.
In this commit we are just adjusting the url generation in the
backend to have the '/user_uploads/' in the encrypted url generated
which the user is supposed to be redirected to and therefore
essentially reaching thumbor with the encrypted url.
This is necessary because 'user_uploads' and 'user_avatars' (or any
other item under 'user_avatars' endpoint) have a different folder
location under the local file storage backend. 'user_uploads'
endpoint's stuff is stored in a 'files' directory whereas stuff
'user_avatars' endpoint's stuff is stored in a 'avatars' directory.
Thumbor needs to know from which directory a particular local file
needs to be retrieved and therefore the zthumbor/loaders.py adds
a prefix location for the directory.
Since in an upcoming commit we are going to add user_avatars
directory location 'avatars' folder as a prefix this preparatory
commit helps simply doing the changes.
The 'last_modified' value in emoji records is
needed for uploading the file to the S3 backend.
We set the same in the function 'import_uploads_s3'.
We also have to remove the keyword 'last_modified'
while building the RealmEmoji dict, as it is not
a field which exists in RealmEmoji objects.
This uses the recently introduced active_mobile_push_notification
flag; messages that have had a mobile push notification sent will have
a removal push notification sent as soon as they are marked as read.
Note that this feature is behind a setting,
SEND_REMOVE_PUSH_NOTIFICATIONS, since the notification format is not
supported by the mobile apps yet, and we want to give a grace period
before we start sending notifications that appear as (null) to
clients. But the tracking logic to maintain the set of message IDs
with an active push notification runs unconditionally.
This is designed with at-least-once semantics; so mobile clients need
to handle the possibility that they receive duplicat requests to
remove a push notification.
We reuse the existing missedmessage_mobile_notifications queue
processor for the work, to avoid materially impacting the latency of
marking messages as read.
Fixes#7459, though we'll need to open a follow-up issue for
using these data on iOS.
Fixes a regression introduced in 23246ff816.
However, we'll be shortly removing this feature, since it's legacy
support for an app that no longer is supported.
Following recent testing flakes that were traced down to this not
having been called causing `receiver_is_off_zulip` to depend on test
ordering, it makes sense to centralize this.
I think it should always have been in ZulipTestCase; it appears the
reason it wasn't from the beginning was that originally only
test_events.py interacted with it, and do_test there still needs to
call this directly (because it can be called multiple times within a
single test). And then we did the wrong thing as expanded use of
Tornado event_queue code in tests to more of the codebase.
The s3 import code path made a hard assumption about `user_profile_id`
being set (we'd already fixed this in the local uploads code path).
Ideally, it should be, and I've opened #10268 for fixing that, but for
now this is how it needs to work.
Private messages are not supported in Slack-format webhook.
Instead of raising a NotImplementedError, we warn the user
that PM service is not supported by sending a message to the
user.
Added tests for the same.
Fixes#9239
After the messages have been imported, set the rendered_content of the
messages instead of leaving its value to be 'None'.
This is important to ensure that:
(1) Performance for users is good after completing the import.
(2) The database's full-text indexes have all of the imported messages
(which only happens properly when Message rows have their
rendered_content field edited).
Fixes#9168.
The "/stats" command doesn't actually do anything
interesting yet, and it also writes to the message
feed instead of replying directly to the user.
The history of this command was that it was
written during a PyCon sprint. It was mainly intended
as an example for subsequent slash commands. The
ones we built after "/stats" have sort of outgrown
"/stats" and don't follow the original structure
for "/stats". (The "/day", "/ping", and "/settings"
commands were built shortly after.)j
We probably want to ressurect "/stats" fairly soon,
after figuring out some useful stats and refining
the UI.
As you can see from this commit, resurrecting the
code here shouldn't be too difficult, but it
may actually be pretty rare that we just translate
slash commands into fleshed out messages.
Since otp_encrypt_api_key only encrypts API keys, it doesn't require
access to the full UserProfile object to work properly. Now the
parameter it accepts is just the API key.
This is preparatory refactoring for removing the api_key field on
UserProfile.
random_api_key, the function we use to generate random tokens for API
keys, has been moved to zerver/lib/utils.py because it's used in more
parts of the codebase (apart from user creation), and having it in
zerver/lib/create_user.py was prone to cyclic dependencies.
The function has also been renamed to generate_api_key to have an
imperative name, that makes clearer what it does.
Now reading API keys from a user is done with the get_api_key wrapper
method, rather than directly fetching it from the user object.
Also, every place where an action should be done for each API key is now
using get_all_api_keys. This method returns for the moment a single-item
list, containing the specified user's API key.
This commit is the first step towards allowing users have multiple API
keys.
python-twitter was consuming a significant amount of import time.
However, this commit seems to not save any time at all, probably
because its recursive dependencies are imported elsewhere in Zulip.
Renaming a user group to a name shared by other group wasn't a scenario
handled by the backend, and the server errored whenever this was
attempted.
Now a json_error is returned, letting the user know that a user group
with that name already exists.
We found out in #9953 that, appparently, loading the OpenAPI file was
taking abut a 5% of the Zulip server startup time.
Since in many cases (especially in development) having the file loaded
won't be necessary at all, we read it on the first time data from the
OpenAPI spec is needed.
Tweaked by tabbott to add a test.
Automatically detect if the OpenAPI spec file has been modified since
the last time it was loaded into memory, and if it has, automatically
reload it to have the latest version.
This feature is designed with development environments in mind. The main
benefit is being able to see the changes made to the OpenAPI document
without needing to restart the development server, which is tedious and
slows the documentation workflow down.
When last user(only in case of admin) unsubscribe from private stream,
stream page doesn't get updated. Cause we delete the private stream
as soon as last user unsubscribe from stream.
So `sub` get undefined in frontend, cause that stream is deleted
before unsubscribe-user-from-stream event is received.
Fix this by changing order of events sent to frontend. Event
`subscription: remove` should be sent before `stream: delete` event
from backend.
This fixes a bug where administrators couldn't remove private
unsubscribed streams from the "default streams" list, because
access_stream_by_name didn't give them access to the stream object.
This commit adds 'resize_gif()' function which extracts each frame,
resize it and coalesces them again to form the resized GIF while
preserving the duration of the GIF. I read some stackoverflow
answers all of which were referring to BiggleZX's script
(https://gist.github.com/BigglesZX/4016539) for working with animated
GIF. I modified the script to fit to our usecase and did some manual
testing but the function was failing for some specific GIFs and was not
preserving the duration of animation. So I went ahead and read about
GIF format itself as well as PIL's `GifImagePlugin` code and came up
with this simple function which gets the worked done in a much cleaner
way. I tested this function on a number of GIF images from giphy.com
and it resized all of them correctly.
Fixes: #9945.
This adds a new function called handle_remove_push_notification in
zerver/lib/push_notifications.py which requires user_profile id and
the message id which has to be removed in the function.
For now, the function only supports GCM (and is mostly there for
prototyping).
The payload which is being delivered needs to contain the narrow
information and the content of the message.
This should make it much simpler for the mobile apps to line up the
data from server_settings against the data in the notifications.
Addresses part of #10094.
This ensures that the format of this data structures matches that for
in-realm bots in the main users data structure (including avatars,
etc.).
Fixes#10138.
For realms that don't have any presence-active users, we know for a
fact that there aren't any active clients that will be reloading just
after the server restarts, so we can skip filling the cache with data
related to that realm.
For zulipchat.com, this results in a significant performance
optimization for the recipient and stream caches, and a moderate
performance improvement for the user caches as well.
Private messages make up the bulk of Recipient objects. While private
messages are ~50% of messages, if you weight by messages received
(which is what is important for message-loading performance), it's
pretty strongly balanced towards stream messages.
We don't need to include long-term idle or other inactive users here,
since fetching them consumed to vast majority of the time.
(On chat.zulip.org, this decreased the runtime for populating the user
cache by 5x, removing only users we're unlikely to need to access).
This doesn't seem to have a huge performance downside (less than 1s
extra time for loading / on chat.zulip.org), and it means the
possibility of users having so many unreads that we get weird/buggy
behavior is much more unlikely to exist.
We'll still want a better experience for users who somehow go over
this limit, but it can be pretty firmly "you need to go mark some
things as read".
This renames Realm.show_digest_email field to
digest_emails_enabled, for greater clarity as to what it does
just from seeing the setting name, without having to look it up.
Fixes part of #10042.
We were getting event-handling exceptions in JS in production if a new
user was created and then went and set a custom profile field, because
there was no `.profile_data` on their user object. We were able to
trace the issue down to the fact that our events didn't include that
field when creating a new user.
This renames Realm.restricted_to_domain field to
emails_restricted_to_domains, for greater clarity as to what it does
just from seeing the setting name, without having to look it up.
Fixes part of #10042.
The previous error messages for this were written for a tool only to
be used by a couple people, and didn't make clear what potential
causes were. Tweak these to provide greater clarity about what's
going on.
The main cause of these errors appearing in practice was fixed in
7ea5987e5d, but nothing strongly
prevents a similar issue from being introduced in the future.
Fixes#10078.
Apparently, our old unminify logic relied on the fact that the
filenames displayed in tracebacks were of the form "app.js" (and the
`app.js` copy of the source map in the appropriate
`/home/zulip/deployments/`). The correct behavior is to just look up
the source map for the appropriate hash-named
`app.a40806b10565c1dee5bf.js` type file.
We fix this with a few small tweaks to the regular expressions. I
wish this file had reasonable unit tests.
As part of our effort to change the data model away from each user
having a single API key, we're eliminating the couple requests that
were made from Django to Tornado (as part of a /register or home
request) where we used the user's API key grabbed from the database
for authentication.
Instead, we use the (already existing) internal_notify_view
authentication mechanism, which uses the SHARED_SECRET setting for
security, for these requests, and just fetch the user object using
get_user_profile_by_id directly.
Tweaked by Yago to include the new /api/v1/events/internal endpoint in
the exempt_patterns list in test_helpers, since it's an endpoint we call
through Tornado. Also added a couple missing return type annotations.
Extracting this helper library will help us avoid an import loop
between notifications.py and message.py (with bugdown in between).
But in addition to that, it's a more natural model, since some of the
uses for these functions weren't part of the notifications code
anyway.
Implement this function in 'bulk_import_model'
and 'update_model_ids'.
This lets us save on redundant-feeling arguments in these
frequently-called helper functions.
One of the code examples for GET /users was using the raw call_endpoint
method from our Python bindings, rather than get_members, which has been
specifically designed to interact with this endpoint.
Now all the examples here use the appropiate method.
Whenever a parameter for an endpoint in our REST API has a default
value, it is displayed under the "Description" section of the
arguments table in the docs.
This way, we don't need to explicitly indicate the default values in the
description, thus avoiding duplicate information in the OpenAPI source.
This has the benefit that we now get the usual data about the
user/request/etc. in error emails related to bugdown exceptions;
previously we were just getting the traceback in the emails (since our
`mail_admins` template was very simple) and no other debugging
details.
Comments tweaked by tabbott to help make clear exactly what's going on
here, since it's a little subtle and a little hacky.
Fixes#8843.
Our nested code block processor wasn't using the correct test for
whether a paragraph was empty of other content; first, we need to
confirm no children, and second, we need to confirm there is no text
before/after the code block element inside the p tag.
None of the file types here are actually processed by our static asset
pipeline in a way that would result in the hash-named versions of the
files (stored in staticfiles.json) being accessed. (If they were,
we'd be using something like `render_bundle` to access their paths).
So, we should not be generating/shipping these hash-named versions of
image, audio, and locale files.
This change decreases the size of a Zulip release tarball from 153MB
to 93MB, by removing all of the duplicated copies of various asset
files.
This may also help with #10038, but I'm not marking it as completing
that issue yet, because part of #10038 is that the non-hash-named
image files in prod-static/generated/emoji do not seem to have been
properly overwritten on upgrade, and it's unclear why.
Fixes#5971.
A stream created in the last few hours likely won't be in StreamCount
(since that gets updated once a day), and hence won't be in the
recent_traffic dict.
However, get_average_weekly_stream_traffic should be None in this case,
not 0.
This commit closes a long pending issue which involved moving the
`EMOTICON_CONVERSION` mapping to build_emoji infrastructure so
that there is only one source of truth. This was pending from the
time when this feature was implemented.
The function 'update_model_ids' should be used on
the models BotStorageData and BotConfigData.
It is wrongly added here for UserGroup model.
Also the sequence name for BotStorageData and
BotConfigData is 'zerver_botuserstatedata_id_seq' and
'zerver_botuserconfigdata_id_seq' respectively, which
should be specifically mentioned in the function
'allocate_ids'.
This fixes some nondeterministic test failures.
The gitter mentions are in the format '@usermention'
and the mentions are included in the export data as:
"mentions": [
{
"screenName": "usermention",
"userId": "54d7876c15522ed4b3dbbefb",
"userIds": []
}]
We extract this data and map this mention to @**usermention**
for Zulip.
Messages can be bulky, and storing them in a single
data structure can cause a memory error.
In this commit, the messages are written to a file
batch-wise, thus avoiding the memory error.
Similar to commit 6b7b6b38ad
This will be used while for any ManyToMany field which
is being imported.
We add an internal function which takes in the old ID list
of the ManyToMany field and return the new updated ID list.
Various pieces of our thumbor-based thumbnailing system were already
merged; this adds the remaining pieces required for it to work:
* a THUMBOR_URL Django setting that controls whether thumbor is
enabled on the Zulip server (and if so, where thumbor is hosted).
* Replaces the overly complicated prototype cryptography logic
* Adds a /thumbnail endpoint (supported both on web and mobile) for
accessing thumbnails in messages, designed to support hosting both
external URLs as well as uploaded files (and applying Zulip's
security model for access to thumbnails of uploaded files).
* Modifies bugdown to, when THUMBOR_URL is set, render images with the
`src` attribute pointing /thumbnail (to provide a small thumbnail
for the image), along with adding a "data-original" attribute that
can be used to access the "original/full" size version of the image.
There are a few things that don't work quite yet:
* The S3 backend support is incomplete and doesn't work yet.
* The error pages for unauthorized access are ugly.
* We might want to rename data-original and /thumbnail?size=original
to use some other name, like "full", that better reflects the fact
that we're potentially not serving the original image URL.
This modifies the logic for formatting outgoing missed-message emails
to support the upcoming stream email notifications feature (providing
a new format for the subject, etc.).
Because we're passing through the trigger for notifications to
do_send_missedmessage_events_reply_in_zulip, we don't need to go back
to the database to determine which messages actually mentioned the
user.
This change converts our logic for determining whether the current
user was mentioned in a group of messages from the implicit "if it was
sent to a stream, it's a mention" to the explicit "we actually know
there was a mention in the message". This is an important
prerequisite for our upcoming feature to support getting email
notifications for streams always (even without a mention).
Because in upcoming commits, we'll want to pass additional per-message
data into do_send_missedmessage_events_reply_in_zulip, we need to
expand the format for how we represent messages to account for that.
This refactors the generate_topic_history_from_db_rows function to not
depend upon the assumption of rows passed as parameter to be sorted in
reverse order of max_message_id field.
Additionally, we add sorting and some tests that verify correct
handling of these cases.
In this commit we add a new endpoint so as to have a way of fetching
topic history for a given stream id without having to be logged in.
This can only happen if the said stream is web public otherwise we
just return an empty topics list. This endpoint is quite analogous
to get_topics_backend which is used by our main web app.
In this commit we also do a bit of duplication regarding the query
responsible for fetching all the topics from DB. Basically this
query is exactly the same as what we have in the
get_topic_history_for_stream function in actions.py. Basically
duplicating now is the right thing to do because this query is
really gonna change when we add another criteria for filtering
messages which is:
Only topics for messages which were sent during the period the
corresponding stream was web public should be returned.
Now when we will do this, the query will change and thus it won't
really be a code duplication!
We already include the issue title in the topic. But if one chooses
to group all gitlab notifications under one topic, the message body
is misleading in the sense that only the Issue ID and the description
are displayed, not the title, which isn't super helpful if the topic
doesn't tell you the title either.
I think we should err on the side of always including the title in
the main message body, which is what this commit does.
Fixes#9913.
This migrates Zulip to use a dramatically better set of names and
aliases for our emoji set, defined in emoji_names.py (which is in turn
manually generated from our hand-curated CSV file).
This should significantly improve the experience of using Zulip's
emoji picker and emoji typeahead for finding what one is looking for.
Fixes#7665
In case of invitation events, 'invites_changed' event without
any real payload is sent to all the realm admins and the user.
The event is handled by reloading the list to view recent changes.
Commit tweaked by shubhamdhama:
* Send an `invite_changed` event when an user accept an invite.
Also, added the test for the same.
* No need to delete the invite list in frontend, current logic
handles the case when the invite data is changed properly.
* Extracted the common logic for sending an event into
`notify_invites_changed`.
Some of the arguments in our REST API have to be sent as JSON objects,
which only accept double quotes for strings.
If we display the examples as normal Python objects, the syntax would be
quite similar but it would use simple quotes, which is invalid JSON (and
isn't accepted by the server).
That's why all the examples should be JSON-serialized in order to comply
with the API's requirements.
Until now, we were displaying an empty "Arguments" section in the REST
API docs whenever an endpoint didn't use input arguments.
In the case of OpenAPI-based docs, that was also annoying because it
required removing the {generate_api_arguments_table|...} template tag or
leaving an empty "parameters" field in zulip.yaml.
After this, we show a paragraph indicating that the endpoint doesn't
need arguments under the "Arguments" section.
If the argument table generator isn't able to reach a file that is
supposed to read, the two most likely causes are:
- The source .md documentation file that is requesting the table has a
typo in the path.
- The file with the arguments isn't there, for some reason.
In either case, we don't want the server to fail silently-ish and
display the docs as if there was no arguments for that endpoint. That's
why the most logic thing to do is to raise an exception and let the
admins know that there's something wrong.
For importing huddles we have to have unique huddle hashes.
Huddle hashes are extracted from the list of users participating
in a huddle. So to extract these user ids, we first use huddle
id to getting the matching recipient, and then we use subscription
to get the user ids from the recipient id.
Added tests for the same (tests slightly tweaked by tabbott).
The tests for GET /users were looking for a specific user, asuming that
it would always be in the same position. Since the users' sorting isn't
guaranteed in any way, this can lead to errors in the tests.
Now we make sure the user we grab from the list is the one we need by
checking its email address.
This is just a hotfix that addresses the short-term problem: we have
already made some efforts to make sure these tests are more
deterministic, and now we only need to finish the migration of the old
enpoints to the new system as a long-term solution.
This is all the plumbing that makes it possible to enable the
stream_email_notifications setting via the Zulip API. The flag still
doesn't do anything yet, but this is a nice checkpoint along the way
to implementing this feature.
This commit creates a new field called delivery_email. For now, it is
exactly the same as email upon user profile creation and should stay
that way even when email is changed, and is used only for sending
outgoing email from Zulip.
The purpose of this field is to support an upcoming option where the
existing `email` field in Zulip becomes effectively the user's
"display email" address, as part of making it possible for users
actual email addresses (that can receive email, stored in the
delivery_email field) to not be available to other non-administrator
users in the organization.
Because the `email` field is used in numerous places in display code,
in the API, and in database queries, the shortest path to implementing
this "private email" feature is to keep "email" as-is in those parts
of the codebase, and just set the existing "email" ("display email")
model field to be something generated like
"username@zulip.example.com" for display purposes.
Eventually, we'll want to do further refactoring, either in the form
of having both `display_email` and `delivery_email` as fields, or
renaming "email" to "username".
This commit adds a Markdown tree-processor extension that renders
multi-line code blocks that are nested inside lists with the
formatting. Note that the code block could be nested inside multiple
list levels and would still get rendered correctly.
Tim: This fixes the need for unpleasant workarounds like
f5bfa4e793 and makes nested code blocks
in our documentation look exactly how users would expect them to.
This adds a new settings, SOCIAL_AUTH_SUBDOMAIN, which specifies which
domain should be used for GitHub auth and other python-social-auth
backends.
If one is running a single-realm Zulip server like chat.zulip.org, one
doesn't need to use this setting, but for multi-realm servers using
social auth, this fixes an annoying bug where the session cookie that
python-social-auth sets early in the auth process on the root domain
ends up masking the session cookie that would have been used to
determine a user is logged in. The end result was that logging in
with GitHub on one domain on a multi-realm server like zulipchat.com
would appear to log you out from all the others!
We fix this by moving python-social-auth to a separate subdomain.
Fixes: #9847.
* If `zerver_realmauditlog` is present in the exported data,
`RealmAuditLog` would be imported normally.
* If it is not present, `create_subscription_events`
function in would create the `subscription_created`
events for RealmAuditLog. The reason this function
is in `import_realm` module and not in the individual
export tool scripts (like Slack) is because this
function would be common for all export tools.
This fixes#9846 for users who have not already done an import of
their organization from Slack.
Fixes#9846.
Custom profile field value are stored in different structure compare to
other profile fields in events, so generic way to update fields wasn't
updating custom profile fields in `apply_event` function.
Fix this by adding check for custom fields in `apply_event`.
This also adds the appropriate test_events test to verify this code path.
Fixes part of #9875.
We extract out the logic for generating a list of all historical
topics for a given stream as a separate function. This avoids code
duplication when we add the similar code path for grabbing all topics
for web public streams.
This has two advantages;
* We can split bugdown/__init__.py into several modules, and each
module can access these arguments by importing these
* We get rid of the super-ugly `global db_data` construct, replacing
it with a only slightly ugly monkey-ish patching of the
`zerver.lib.bugdown.arguments` module, which is at least
considerably more clear on reading as to what it's purpose is.
The main remaining todo for correctly populating
RealmAuditLog.requires_billing_update is supporting the de-seating (and
corresponding re-seating) that happens after being offline for two weeks.
In this commit we are fixing a kinda serious un-noticed bug with
the way run_db_migrations worked for test db.
Basically run_db_migrations runs new migrations on db (dev or test).
When we talk about the dev platform this process is straight forward.
We have a single DB zulip which was once created and now has some data.
Introduction of new migration causes a schema change or does something
else but bottom line being we just migrate the zulip DB and stuff works
fine.
Now coming to zulip test db (zulip_test) situation is a bit complex
in comparision to dev db. Basically this is because we make use of
what we call zulip_test_template to make test fixture restoration
after tests run fast. Now before we introduced the performance
optimisation of just doing migrations when possible, introduction of
a migration would ideally result in provisioning do a full rebuild of
the test database. When that used to happen sequence of events used to
be something like this:
* Create a zulip_test db from zulip_test_base template (An absolute
basic schema holding)
* Migrate and populate the zulip_test db.
* Create/Re-create zulip_test_template from the latest zulip_test.
Now after we introduced just do migrations instead of full db rebuild
when possible, what used to happen was that zulip_test db got
successfully migrated but when test suites would run they would try to
create zulip_test from zulip_test_template (so that individual tests
don't affect each other on db level).
This is where the problem resides; zulip_test_template wasn't migrated
and we just scrapped zulip_test and re-created it using
zulip_test_template as a template and hence zulip_test will not hold the
latest schema.
This is what we fix in this commit.
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Messages can be bulky, and storing them in a single
data structure can cause a memory error.
In this commit, the messages are written to a file
batch-wise, thus avoiding the memory error.
Previously, the messages where being stored in a output file from
outside the function 'convert_slack_workspace_messages', but
now we store it from the inside the mentioned function.
This will help in processing and saving the messages batch-wise
so as to avoid a memory error.
Reactions are returned separately from 'convert_slack_workspace_messages'
rather than 'message_json'.
Also updated test for 'convert_slack_workspace_messages' and an additional
test for reactions is added.
An estimated traffic of 0 suggests a stream is dead, and has pretty
different semantics from any non-zero value. So we should round up any
number between 0 and 1 to 1.
We don't ever use this value, but it's confusing to have the incorrect
calculation in the code.
Ideally we would set this to "None", but I don't know the code well enough
to be confident nothing would break.