This was a pretty nasty error, where we were accidentally accessing
the parent list in this inner loop function.
This appears to have been introduced as a refactoring bug in
7822ef38c2.
The various vars here that had recipient_subject
in the name now have either bucket or bucket_tup
there.
The shorter names are a bit easier to read, and the
original names were misleading for the PM case.
This was basically two search/replaces, and we have
good test coverage here, so it's pretty low risk
despite the messy diff.
We now attach zulip_db_data to the markdown engines
for classes that need it. This was the last remaining
global we had, so we remove `arguments.py` here.
The Markdown processor makes it fairly simple for
the helper classes to access the `md` engine. We
now write `_md_engine.zulip_message` to avoid having
the current message in the global namespace.
Note that we do reuse engines for multiple messages,
but each engine is specific to a realm. And we therefore
avoid even the theoretical possibility of leaking message
data between realms.
This makes us consistent with how we import codehilite.
Using Python's normal import mechanism avoids some overhead
with Markdown having to parse dotted notation.
These modules are tiny, so they shouldn't impact startup
too much. Also, by explicitly importing them, we avoid
the pitfall of having a sucessful startup and a broken
renderer.
We were building the same link regex every time
we build a Markdown engine, which happens twice
per realm. It's an expensive operation due to
the complexity of the regex and us reading a file.
Nested classes are kind of expensive in Python,
particularly when you throw in mypy annotations.
Also, flatter is arguably better, although it is
kind of a pain here not to have closures.
This change avoids hitting the Django ORM when
we don't find any possible group mentions in
the message content.
Django doesn't necessarily actually hit the database,
but it's still slow and shows up in profiles.
This commit speeds up the import by avoiding
sender lookups and instead using the data
for users that we already have in memory.
This avoids a few DB hops, many hops to memcached,
plus some object construction.
We now call do_render_markdown() directly. This
also makes it more explicit that the import has
never rendered alert words.
For the import-data codepath, we will call
the extracted function directly in a
subsequent commit.
The do_render_markdown() function has more
required parameters, which allows for more
explicit code and also allows us to flatten
out some logic related to alert words. (We
just pass in empty sets/dicts as needed).
We can rely on `message_realm` being the same
as `message.sender.realm`, which allows us to
skip two queries to the database for the rare
Zephyr mirroring case.
This function requires a message object, whereas
we want to work with JSON data to avoid necessary
queries when we import data. Inlining the function
sets us up for a subsequent refactoring.
We change the way we deal with theoretical return
values of `None` to use an assertion; otherwise,
we would have to loosen up a bunch of mypy types
from `str` to `Optional[str]`. It's not clear `None`
is even possible--we've moved toward throwing exceptions
there instead of silently failing.
This is somewhat hairy logic, so it's nice
to extract it and not worry about variable leaks.
Also, this moves some legacy "subject" references out
of actions.py.
We start by including functions that do custom
queries for topic history.
The goal of this library is partly to quarantine
the legacy "subject" column on Message.
A recent change to check_send_webhook_message allows webhooks to
unescape stream names before sending a message. This commit adds
a test for the edge case where the webhook URL is escaped twice by
a third-party.
Recently, one of our users reported that a JIRA webhook was not
able to send messages to a stream with a space character in its
name. Turns out that JIRA does something weird with webhook URLs,
such that escaped space characters (%20) are escaped again, so
that when the request gets to Zulip, the double escaped %20 is
evaluated as the literal characters `%20`, and not as a space.
We fix this by unescaping the stream name on our end before
sending the message forward!
The previous logic was incorrect, in that if `content_type` was set to
None (which happens with Slack/HipChat export, among other things),
then we wouldn't run the `guess_type` logic to auto-detect the
Content-Type to send to S3.
The last_modified field is intended to support setting the
orig-last-modified field in the S3 backend when importing, basically
to keep track of this bit of pre-export data for debugging. In the
event that it isn't available, the correct thing to do is not write
out an invalid `last_modified` field; we should just not write it out
at all.
This fixes the fact that these emoji were sometimes not displaying
properly (because of changes in the emoji names used in the codebase),
while also making this integration more standard (since it was the
only one with such an aggressive use of emoji).
The UserMessage table can be huge, so creating a
bunch of entries in `ID_MAP` can overflow memory.
We don't have any tables that depend on `UserMessage`,
and we don't send the 'id' fields from `zerver_usermessage`
to the database, so re-mapping them was just busy-work.
This is a preparator refactor for supporting hosting different Tornado
processes on different servers; to look up which Tornado server we
should be sending the event to, we'll need the realm object.
This should make it possible for there to safely be multiple Tornado
processes running on different ports on the same system.
It may also fix a rare race bug in development, where previously, it
was possible for the Tornados processes for Casper and the main
development server to interfere; I haven't investigated whether this
was a real bug or not, but now those two services will use independent
Tornado files.
We still need to add something to direct traffic between the different
Tornado processes.
This reverts commit 3645bb9225.
This change was incorrect, because the `is_zephyr_mirror_realm`
property on Realm is a property and thus isn't available in the
migration codebase.
Since this migration is only run for very old servers, this should
have no impact.
Apparently, the QUERY_STRING property of the report object wasn't
actually a string; since we only care about its string representation,
we should just stringify it.
Apparently, we weren't resetting the query counters inside the
websockets codebase, resulting in broken log results like this:
SOCKET 403 2ms (db: 1ms/2q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 5ms (db: 2ms/3q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 2ms (db: 3ms/4q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 2ms (db: 3ms/5q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 2ms (db: 4ms/6q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 2ms (db: 5ms/7q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 2ms (db: 5ms/8q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET 403 3ms (db: 6ms/9q) /socket/auth [transport=websocket] (unknown via ?)
The correct fix for this is to call reset_queries at the start of each
endpoint within the websockets system. As it turns out, we're already
calling record_request_start_data there, and in fact should be calling
`reset_queries` in all code paths that use that function (the other
code paths, in zerver/middleware.py, do it manually with
connection.connection.queries = []).
So we can clean up the code in a way that reduces risk for similar
future issues and fix this logging bug with this simple refactor.
Fixes#10745.
Use get_display_recipient to get stream names, and remove the
references to message.stream_name in push_notifications.py which were
added in 97571a203, as the actual stream names were being retrived
only for Message objects associated with public streams.
This is mostly an extraction, but it does change the
way we calculate `content`. We append the markdown
links from ALL files to any content that came in the
message itself.
Separating this out also allows us to add more
test coverage for the extracted code.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.
This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.
I am guessing this will improve performance for most
conversions. We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions. Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.
This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.
We now have all three third party
conversions (Gitter/Slack/Hipchat)
go through build_user_message().
Hipchat was already using this helper.
We also avoid callers having to pass in
an id to build_user_message().
When you send a message to a bot that wants
to talk via an outgoing webhook, and there's
an error (e.g. server is down), we send a
message to the bot's owner that links to the
message that triggered the error.
The code to produce those links was out of
date.
Now we move the important code to the
`url_encoding.py` library and fix the PM
links to use the more modern style (user_ids
instead of emails). We also replace "subject"
with "topic" in the stream urls.
This supports guest user in the user-info-form-modal as well as in the
role section of the admin-user-table.
With some fixes by Tim Abbott and Shubham Dhama.
The purpose of this commit is to pass information
to the frontend whether the message response recieved
has been limited due to plan restrictions or not.
To implement this, the backend for limiting the message
history had to be rewritten as we used to fetch
only the message rows whose id was greater than
first_visible_message_id. The filtered rows gives us
no information on whether the message history was
limited or not. So the backend was rewritten to not
do any restriction of limiting the message rows while
making the query. The limiting of rows is now done in
post_process_limited_query which will also return back
the value of history_limited flag.
Tweaked by tabbott to note a few cases where the results are
incorrect. I'm merging this despite those, because those cases don't
impact the correctness of the feature, and it may have tricky
performance implications to fix correctly.
Apparently, we weren't actually checking that found_oldest had the
correct value; fortunately, this didn't actually result in a problem,
because the values were always correct. But this will be important as
we start extending this test.
This is a preparatory commit which will help us with removing camo.
In the upcoming commits we introduce a new endpoint which is based
out on the setting CAMO_URI. Since camo could have been hosted on
a different server as well from the main Zulip server, this change
will help us realise in tests how that scenerio might be dealt with.
This will help us eliminate camo from our production installs.
Basically it helps us de duplicate some code from upcoming code
which will help us check validity of a camo url.
Also, rename get_alert_from_message to get_gcm_alert.
With the implementation of the and get_apns_alert_title and
get_apns_alert_subtitle, the logic within get_alert_from_message
is only relevant to the GCM payload, so we adjust the name
accordingly.
Progresses #9949.
Resolves https://github.com/zulip/zulip-mobile/issues/1316.
The string that is returned from get_alert_from_message is
dependent upon the same message that is passed into get_apns_payload
and get_gcm_payload. The contents of those payloads that are tested via
TestGetAPNsPayload and TestGetGCMPayload, which makes the tests for
get_alert_from_message redundant.
Also, simplify the logic by removing the last elif conditional.
If we use an outgoing webhook and the web server
responds with `widget_content` in the payload, we
include that in what we send through the send-message
codepath.
This makes outgoing webhook bots more consistent with
generic bots.
The test named `test_archiving_messages_with_attachment`
started flaking recently. We use sets for comparison
instead of lists to avoid arbitrary sorting differences.
Masking content can be useful for testing
out conversions where you're dealing
with data from customers and want to avoid
inadvertently reading their content (while
still having semi-realistic messages).
Having two smaller functions should make it
easier to customize the behavior for each specific
use case. The only reason they were ever coupled
was to keep ids in sequence, but the recent NEXT_ID
changes make that a non-issue now.
We now instantiate NEXT_ID in sequencer.py, which avoids
having multiple modules make multiple copies of a sequencer
and possibly causing id collisions.
Bots are not allowed to use the same name as
other users in the realm (either bot or human).
This is kind of a big commit, but I wanted to
combine the post/patch (aka add/edit) checks
into one commit, since it's a change in policy
that affects both codepaths.
A lot of the noise is in tests. We had good
coverage on the previous code, including some places
like event testing where we were expediently
not bothering to use different names for
different bots in some longer tests. And then
of course I test some new scenarios that are relevant
with the new policy.
There are two new functions:
check_bot_name_available:
very simple Django query
check_change_bot_full_name:
this diverges from the 3-line
check_change_full_name, where the latter
is still used for the "humans" use case
And then we just call those in appropriate places.
Note that there is still a loophole here
where you can get two bots with the same
name if you reactivate a bot named Fred
that was inactive when the second bot named
Fred was created. Also, we don't attempt
to fix historical data. So this commit
shouldn't be considered any kind of lockdown,
it's just meant to help people from
inadvertently creating two bots of the same
name where they don't intend to. For more
context, we are continuing to allow two
human users in the same realm to have the
same full name, and our code should generally
be tolerant of that possibility. (A good
example is our new mention syntax, which disambiguates
same-named people using ids.)
It's also worth noting that our web app client
doesn't try to scrub full_name from its payload in
situations where the user has actually only modified other
fields in the "Edit bot" UI. Starting here
we just handle this on the server, since it's
easy to fix there, and even if we fixed it in the web
app, there's no guarantee that other clients won't be
just as brute force. It wasn't exactly broken before,
but we'd needlessly write rows to audit tables.
Fixes#10509
This bug was introduced very recently and is an
aliasing bug. It caused extra UserMessage rows to
be created as we inadvertently updated the underlying
subscriber_map sets for multiple messages.
This probably mostly affected PMs.
It's doubtful the bug ever got out into the field.
Previously, MissedMessageWorker used a batching strategy of just
grabbing all the events from the last 2 minutes, and then sending them
off as emails. This suffered from the problem that you had a random
time, between 0s and 120s, to edit your message before it would be
sent out via an email.
Additionally, this made the queue had to monitor, because it was
expected to pile up large numbers of events, even if everything was
fine.
We fix this by batching together the events using a timer; the queue
processor itself just tracks the items, and then a timer-handler
process takes care of ensuring that the emails get sent at least 120s
(and at most 130s) after the first triggering message was sent in Zulip.
This introduces a new unpleasant bug, namely that when we restart a
Zulip server, we can now lose some missed_message email events;
further work is required on this point.
Fixes#6839.
These test are for the handling of HipChat
sender info. The data formats are somewhat
inconsistent and sometimes require us to
generate "mirror" users, so this is potentially
fragile code if we don't cover it well.
We extract this function and put it in the shared
library `import_util.py`.
Also, we make it one time higher up in the call
stack, rather than re-building it for every batch
of messages. I doubt this was super expensive, but
there's no reason to repeatedly execute this.
Before this fix, we were creating two copies of every
PM Message in zerver_message with only corresponding
UserMessage row.
Now we only create one PM Message per message, which
we accomplish by making sure we only use imported
messages from the sender's history.json file. And
then we write UserMessage rows for both participants
by making sure to include sender_id in the set of
user_ids that feeds into making UserMessage. For
the case where you PM yourself, there's just one
UserMessage row.
It does not appear that we need to support huddles
yet.
When we create new ids for message rows, we
now sort the new ids by their corresponding
pub_date values in the rows.
This takes a sizable chunk of memory.
This feature only gets turned on if you
set sort_by_date to True in realm.json.
We could migrate all the current PREMIUM_FREE organizations to have more
invites, but this setting mainly affects orgs right as they are starting, so
it's probably fine.
We recently received a bug report that implied that for certain
payloads, the `requested_reviewers` key was empty whereas a
singular `requested_reviewer` key containing one reviewer's
information was present in its stead. Naturally, this raised
some not so pretty IndexError exceptions.
After some investigation and generating a few similar payloads,
I discovered that in every case both the `requested_reviewers`
and the `requested_reviewer` keys were correctly populated, so I
had to manually edit the payload to reproduce the error on my end.
My guess is that this anomaly goes back to when GitHub's reviewer
request feature was new and didn't support requesting multiple
reviewers, and that the singular `requested_reviewer` key could
possibly just be there for backwards compatibility or might just
be mere oversight. Either way, the solution here is to look for the
plural `requested_reviewers` key, and if that is empty, fall back
to the singular `requested_reviewer` key.
We seemed to have been doing too much of sharpening on the thumbnails.
The purpose of sharpening here was to just counter the softening
effects of a resize on an image but overdoing it is bad.
Value sharpen(0.5,0.2,true) seems to look good for achieving the
best results here on different displays as revealed in the manual
hit and trial based testing.
Thanks to @borisyankov for pointing out the issue and suggesting
the values.
For some webhook endpoints where the third-party API requires us to do
this, the user's API key might appear in error emails through
appearing in the `QUERY_STRING` parameter. Fix that by filtering any
actual content from those; what we usually need for debugging is just
what set of parameters were provided.
Currently, if there is only one admin in realm and admin tries
to updates any non-adminuser's full name it throws error,
"Cannot remove only realm admin". Because in `/json/users/<user_id>`
api check_if_last_admin_is_changed is checked even if property
is_admin is not changed.
This commit fix this issue and add tests for it.
These lazy imports save a significant amount of time on Zulip's core
import process, because mock imports pbr, which in turn import
pkgresources, which is in turn incredibly slow to import.
Fixes part of #9953.
The APNS client libraries (especially the hyper.http20 one) were
determined via profiling to take significant time during the import
process, so we move them to be lazily imported in order to optimize
the overall Zulip import process. This save up to about 100ms in
import time.
These libraries are only used in certain Django processes inside
zulipchat.com, and so are unnecessary both in development as well as
for self-hosted Zulip servers.
This is a prepartory commit for the upcoming changes. It was meaningful
to extract this one out because this function is essentially a condition
check on whether a given url is one of the user_uploads or an external
one. Based on its value we decide whether a url must be thumbnailed or
not and thus this function will also be used in an upcoming commit
patching lib/thumbnail.py to do the same check before thumbnail url
generation.
We are basically adding a check for url's to be external (belonging
to some 3rd party web site hosting the image) or be one of the
user uploaded files. User uploaded files are served by a separate
endpoint which is /user_uploads/. Any other local url such as
/user_avatars/ or /static/ should never be sent to thumbor for
thumbnailing.
Not sending /user_avatars/ to thumbor for thumbnailing makes sense
because they are already properly thumbnailed and stored properly.
/static/ urls host very few images we use for demo and can be safely
be excluded from thumbnailing.
Previously, these timer accounting functions could be easily mistaken
for referring to starting/stopping the request. By adding timer to
the name, we make the code easier for the casual observer to read and
understand.
This commit adds a test for the payload that is generated when
a Task is moved from one user story to another on Taiga's Sprint
Taskboard UI.
This commit also gets up this webhook's test coverage up to 100%.
I generated multiple payloads and verified that there are no
`change` event payloads that will not contain the values in
question, so it is useless to catch these KeyErrors. If there are
any anomalies still, it is better to be notified about them than
to silently ignore them.
The Zulip API is to be used on both development and production
servers, and really we just need to talk about zuliprc files.
There's a similar issue for the JS docs, but we need to fix the
copy/paste issues with those as well.
Before, presence information for an entire realm could only be queried via
the `POST /api/v1/users/me/presence` endpoint. However, this endpoint also
updates the presence information for the user making the request. Therefore,
bot users are not allowed to access this endpoint because they don't have
any presence data.
This commit adds a new endpoint `GET /api/v1/realm/presence` that just
returns the presence information for the realm of the caller.
Fixes#10651.
Even individual "room" files from hipchat can be large,
so we process only 1000 messages at a time
within each file, which produces smaller JSON files.
We don't want really long urls to lead to truncated
keys, or we could theoretically have two different
urls get mixed up previews.
Also, this suppresses warnings about exceeding the
250 char limit.
Finally, this gives the key a proper prefix.
We use UserMessageLite to avoid Django overhead, and we
do updates in chunks of 10000. (The export may be broken
into several files already, but a reasonable chunking at
import time is good defense against running out of memory.)
Now that we allow multiple users to have registered the same token, we
need to configure calls to unregister tokens to only query the
targeted user_id.
We conveniently were already passing the `user_id` into the push
notification bouncer for the remove API, so no migration for older
Zulip servers is required.
If cordelia searches on pm-with:iago@zulip.com,cordelia@zulip.com,
we now properly treat that the same way as pm-with:iago@zulip.com.
Before this fix, the query would initially go through the
huddle code path. The symptom wasn't completely obvious, as
eventually a deeper function would return a recipient id
corresponding to a single PM with @iago@zulip.com, but we would
only get messages where iago was the recipient, and not any
messages where he was the sender to cordelia.
I put the helper function for this in zerver/lib/addressee, which
is somewhat speculative. Eventually, we'll want pm-with queries
to allow for user ids, and I imagine there will be some shared
logic with other Addressee code in terms of how we handle these
strings. The way we deal with lists of emails/users for various
endpoints is kind of haphazard in the current code, although
granted it's mostly just repeating the same simple patterns. It
would be nice for some of this code to converge a bit. This
affects new messages, typing indicators, search filters, etc.,
and some endpoints have strange legacy stuff like supporting
JSON-encoded lists, so it's not trivial to clean this up.
Tweaked by tabbott to add some additional tests.
For our bots that use GenericOutgoingWebhookService
(which are basically Zulip style bots), we now
include a "content-type" header of "application/json".
We accomplish this by having the service classes
implement their own custom method called
`send_data_to_server`. For the Slack-related
code, we just extracted code from `do_rest_call`,
and then for the Zulip-related code, we added
a `headers` parameter.
If we omit methods in subclasses, they're likely to
be caught by linters or unit tests, and even if they
aren't, raising NotImplementedError doesn't actually
prevent user problems.
I've been fighting these in refactoring, and it's
just been a bunch of busy work, plus comments are
highly likely to bitrot.
This fixes a couple things:
* process_event() is a pretty vague name
* returning tuples should generally be avoided
* we were producing the same REST parameters in both
subclasses
* relative_url_path was always blank
* request_kwargs was always empty
Now process_event() is called build_bot_request(),
and it only returns request data,
not a tuple of `rest_operation` and `request_data`.
By no longer returning `rest_operation`, there are
fewer moving parts. We just have `do_rest_call` make
a POST call.
Before this change, we instantiated base_url into a superclass
of subclasses that returned base_url into a dictionary that
gets returned to our caller.
Now we just pull base_url out of service when we need to make
the REST call.
We move the JSON parsing step into the
higher level function: process_success_response().
In the unlikely event that we'll start integrating
with a solution that doesn't use JSON, we can deal
with that, and for now doing the parsing in one
place will help us make error reporting more
consistent.
In a subsequent commit we'll introduce better
error handling for malformed JSON.
The earlier code here, if it got a payload with
"response_string" as a key, would prefix the
corresponding value with "Success!". We just
want the bot to set its own content.
The code is reorganized here so that process_success()
always produces a value keyed by "content" from
incoming data, and then process_success_response()
doesn't do any fancy munging of the data.
This is a preparatory commit for upcoming changes to move
/avatar/ to be a logged in or API accessible endpoint.
Basically we rename this variable because the new name is more
appropriate in the situation. Also user_profile will be used to
hold the user_profile of person accessing the endpoint in coming up
commit.
We simplify the code for is_realm_admin
and set is_guest as well.
I verified that build_user() is not used
by Slack/Gitter, so the extra argument there
should be fine.
Fixes#10639
Previously, Zulip did not correctly handle the case of a mobile device
being registered with a push device token being registered for
multiple accounts on the same server (which is a common case on
zulipchat.com). This was because our database `unique` and
`unique_together` indexes incorrectly enforced the token being unique
on a given server, rather than unique for a given user_id.
We fix this gap, and at the same time remove unnecessary (and
incorrectly racey) logic deleting and recreating the tokens in the
appropriate tables.
There's still an open mobile app bug causing repeated re-registrations
in a loop, but this should fix the fact that the relevant mobile bug
causes the server to 500.
Follow-up work that may be of value includes:
* Removing `ios_app_id`, which may not have much purpose.
* Renaming `last_updated` to `data_created`, since that's what it is now.
But none of those are critical to solving the actual bug here.
Fixes#8841.
We now allow outgoing webhooks to provide us a
"content" field, which is probably a more guessable
name than "response_string", particularly for folks
that use our other bot-related APIs. And we don't
modify content as we do response_string, i.e. no
"Success!" prefix.
If we're not too concerned about backward compatibility,
we can do a subsequent commit that makes "content"
and "response_string" true synonyms and get rid of
the "Success!" prefix, which was probably accidental
to begin with.
This commit starts by changing the third
argument of send_response_message to be a Dict
instead of a string, so that the data can be more
structured going forward.
That change makes the 2nd/3rd parameters both be
dicts, so to be defensive, I now have all the callers
pass in explicit keyword names. And then I rename
message to message_info, so that the callers have
more clear code.
And that changes the implementation inside of
send_response_message() a bit.
Sorry this commit is a bit coarse, but the intermediate
commits would have been kind of ugly, too.
At the end of the day, it's pretty simple:
bot_id: never changed
message_info: just renamed from message
response_data: is a Dict with the key of "content"
And the innards of send_response_message() are basically
simply dictionary lookups and function calls.
There's no reason to return a failure message in
process_success(), since it's implied to be part of
the success codepath. I didn't look at the full history
of how the strange API evolved, but the second element
of the tuple was clearly noise by the time I got here.
Neither of the subclasses ever set it, and none of the
consumers used it.
This two-line function wasn't really carrying its
weight, and it just made it harder to refactor the
overall codepath.
Eliminating the function forces us to mock at a slightly
deeper level, which is probably a good thing for what
the test intends to do. The deeper mock still verifies that
we're sending the message (good) without digging into
all the details of how we send it (good).
Note that we will still keep around the similarly named
`fail_with_message` helper, which is a lot more useful.
(The succeed/fail scenarios aren't really symmetric here.
For success, there are fewer codepaths that do more complex
things, whereas we have lots and lots of failure codepaths
that all do the same simple thing of replying with a canned
message.)
Before this change subclasses of OutgoingWebhookServiceInterface
would return a raw string as the first element of its return
tuple in process_success(). This is not a very flexible
design, as it prevents the bot from passing extra data like
`widget_content`.
It's also possible in the future that we'll want to let outgoing
bots reply directly to senders who mention them on streams, and
again the original design was overly constrained for that.
This commit does not actually change any functionality yet.
Tweaked by tabbott to use a declared constant rather than just use
5000 in multiple places; this also means we can change the count
without updating translations.
Fixes#10446.
IFTTT allows custom templating for their payloads, so the onus is
on the user to ensure that their custom templates conform to the
expectations outlined in our IFTTT webhook docs. For that reason,
these payloads weren't generated, but were manually edited.
After discovering a couple of bugs, I decided to thoroughly test
and rewrite this integration from scratch. The older code wasn't
generating coherent messages.
This also commit gets this integration up to 100% test coverage.
Test coverage was improved by removing an unused function and
removing some code (written by me) that was actually handling
Test Hook event types incorrectly.
It was a painful amount of work to generate the actual payload.
Since the only difference was a small build URL, I manually
edited the payload and used that for testing.
This commit gets our GitHub webhook up to 100% test coverage.
Some of the page build message code had insufficient test coverage.
I looked at generating the payloads that would allow me to test
the lines of code in question, but it was too much work to
generate the payloads and this seemed like a vague event anyway.
So I just rewrote the logic so that the lines missing
coverage are implicitly covered.
This is a part of our efforts to get this webhook's coverage
up to 100%.
Note that apart from just testing an uncovered line of code, this
commit also fixes a minor bug in the code for messages about issue
comment deletion and editing.
Note that Freshdesk allows custom templating for outgoing payloads
in their webhook UI. Therefore, the payloads added in this commit
did not have to be official payloads from Freshdesk.
Instead of just referring to the commit with the raw URL, we
should use the commit ID as the text of the hyperlink.
Note that in commit_status_changed type messages, the name of the
commit isn't available.
The function that generates the body of the commit_status_changed
event messages generated an invalid commit URL.
Most likely, we missed this because this event type is fairly
vague and it is possible it was never tested by users much,
if at all.
The lack of coverage was due to:
* An unused function that was never used anywhere.
* get_commit_status_changed_body was using a regex where it didn't
really need to use one. And there was an if statement that
assumed that the payload might NOT contain the URL to the commit.
However, I checked the payload and there shouldn't be any instances
where a commit event is generated but there is no URL to the commit.
* get_push_tag_body had an `else` condition that really can't happen
in any payload. I verified this by checking the BitBucket webhook
docs.
We shouldn't just ignore exceptions when encoding the incoming
auth credentials. Even if the incoming credentials are properly
encoded, it is better to know when that is the case or if
something else fails.
This is a very early version of a tool to convert Hipchat
tar files into data files that can be used by the Zulip
import process.
We include the most fundamental entities--users and
streams. Customers who don't care about past messages
or customizations could start an instance off of this
and start communicating.
Of course, there are a lot of things missing in the
initial version:
* messages!
* file assets -- avatars, emojis, attachments
* probably lots of other minor things
We currently ignore any incoming dates from Hipchat data
and just use the current time. This is consistent with
other imports.
We also don't have any docs yet, although the process
will be extremely similar to the "Slack" process:
https://zulipchat.com/help/import-from-slack
Also, there's a comment at the top of convert_hipchat_data.py
that describes how to test this in dev mode.
I tested this by following the steps in the comment above.
The users just "show up" in /devlogin, so that's nice, and
you can send messages to other users. To verify the stream
data you have to go into the gear menu and click on "All
Streams", then you can subscribe and send a message.
Production users will need to get new passwords and
re-subscribe to streams. We will probably auto-subscribe
all users to public streams.
The code was needlessly querying the DB to get full
objects for entities where we only needed user_id,
realm_id, and stream_id.
With my test data of ~1000 records this sped up the
function from ~8s to ~0.5s. The speedup would probably
be even more for larger data sets.
The `match_subject` field is supposed to contain HTML; that's how
the highlighting is done. But the `subject` field is plain text --
it must be encoded if we want corresponding HTML.
Of the three places the `match_subject` field is populated -- two
here in messages_in_narrow_backend, one in get_messages_backend --
two of them already do this correctly, via get_search_fields.
Fix the remaining one, where in a `/messages/matches_narrow` query
we populate `matches_subject` even if the query didn't involve a
full-text search.
This doesn't affect the webapp, which ignores `match_subject` unless
it knows it did a full-text search; nor the mobile app, which
doesn't use `/messages/matches_narrow` at all.
Fixes the urgent part of #10397.
It was discovered that soft-deactivated users don't get mobile push
notifications for messages on private streams that they have configured
to send push notifications.
Reason: `handle_push_notification` calls `access_message`, and that
logic assumes that a user who is a recipient of a message has an
associated UserMessage row. Those UserMessage rows are created
lazily for soft-deactivated users, so they might not exist (yet)
until the user comes back.
Solution: Ensure that userMessage row is created for
stream_push_user_ids and stream_email_user_ids in create_user_messages.
At some point as part of the process of supporting renumbering data,
we changed the structure of our file uploads to expect `path` to match
`s3_path`, with both having the relative path within the overall
hierarchy (including the realm ID). This change updates the more
rarely-used S3 export code path to use that model, fixing a crash when
messages reference an Attachment object with a rewritten path_id.
If any user had sent the reply to the welcome bot recommended by our
tutorial, then the Zulip export/import process didn't work properly,
because we weren't including (and then remapping) the recipient ID for
sending PMs to the cross-realm bots. This commit fixes that gap, by
recording the necessary data on the export side, and doing the
appropriate remapping on the import side.
Previously, our realm import logic only did the special remapping
logic for the original notifications_stream_id; when we added the new
signup_notifications_stream_id field, we neglected to handle it in the
same way.
Note we're no longer using subscriptions_html in the help docs, so no need
to test for it. There is already a test for subscriptions_html in
IntegrationTest.
In the event that two processes are racing to be the
first to load data from zulip.yaml, we now make the
race scenario be duplicated effort instead of having
the second racer get an attribute error on `data`.
We do this by declaring victory only after setting
`data`. "Declaring victory" in this case is a matter
of setting `last_update`.
We are still possibly vulnerable to corrupted data
here, so we should investigate a mutex, or just
read the data on every call (but it's strangely
expensive, almost 3.5s on my instance), or converting
the YAML to code before launching the server.
This fixes an inconsistent test failure with test_users.py (that
depended on the ordering between this migration and the creation of
test database users like hamlet).
We start by stripping the ids in front of the name before the database
lookup. This has the advantage of not mentioning anyone if an incorrect
user id and full name combination is specified, as well as not having
the query the database twice, once by fullname and next by id.
Previously, we were storing only the most recent person with the same
full name as others; this commit adds new keys to the dict such that
simply looking by name would get you the newest user with this name,
and the get_user_by_id function can index the remaining users.
This is largely inspired by requests from people not liking the
Google's new emojiset. A lot of people were requesting to revert
back to old blobs emojiset so we are re-enabling this feature
after making relevant infrastructure changes for supporting google's
old blob emojiset and re-adding support for twitter emojiset.
Fixes: #10158.
Fixes part of #10297.
Use FAKE_LDAP_NUM_USERS which specifies the number of LDAP users
instead of FAKE_LDAP_EXTRA_USERS which specified the number of
extra users.
This adds a feature in the "Notification" section of "Settings" tab,
which lets user enable or disable login emails notification.
Tweaked by tabbott to simplify the test.
Fixes: #5795, progress towards #5854.
Also use name for selecting form in casper tests
as form with action=new is present in both /new
and /accounts/new/send_confirm/ which breaks
test in CircleCI as
waitWhileVisible('form[action^="/new/"]) never stops
waiting.
We also remove some unreachable code. Calling
split() always returns at least one token, even
if it's just the empty string. This is tested
directly on this commit, plus messages with
empty content get rejected pretty early in
the execution path.
In user type custom field, field value is list of user ids. We weren't
converting list to json object in update event payload. This throws
error in frontend, cause we store stringify representation of custom
field value. Therefore, after update event is recieved field-value-
type gets updated to array from string which throws json parsing error.
Having HTML (or HTML-like) content in the examples was making parts of
the content invisible, since the browser identified them as HTML tags
rather than verbose text.
There are some endpoints that don't fall into the currently available
categories, so this new function will be used for calling the tests for
server and realm-related endpoints.
Using early-exit here allows us to more easily
comment why there are certain exemptions to
this logic.
We also only require callers to pass in realm,
not the whole user object.
The function being tested here was kind of an
emergency response to some spam attacks. It
works for a pretty specific set of circumstances,
so it requires a lot of setup.
We may eliminate this function as we improve
our realm "plan types", and if that happens, we
can either eliminate this test or repurpose it.
Since this class was built, folks have always chosen
to subclass JsonableError for situations where
the default of ErrorCode.BAD_REQUEST is insufficient.
So now we simplify the use cases, which also gets
us 100% coverage on this core module.
The output of generate_dev_ldap_dir was being tested against the fixture
located at zerver/tests/fixtures/ldap_dir.json. This didn't make much sense
as generate_dev_ldap_dir was itself used by developers to generate/update
the fixtures. Instead, test_generate_dev_ldap_dir checks the structure of
the dict returned by generate_dev_ldap_dir. The structure is checked by
regex checks, checking whether the dict contains some keys or not, etc.
This commit add FIELD_TYPE_CHOICES_DICT to page_params and replace
FIELD_TYPE_CHOICES.
FIELD_TYPE_CHOICES_DICT includes all field types with keyword, id
and display name. Using this field-type-dict, we can access field
type information by it's keyword, and remove all static use of
field-type'a name or id in frontend.
This commit also modifies functions in js where this page_params
field-types is used.
This commit modifies FIELD_TYPE_DATA dict in `CustomProfileField`
model to store keyword of field types. And create new dict
FIELD_TYPE_CHOICES_DICT to store all field type information
by field type keyword, i.e. id, name.
This is preparatory commit to remove all static use of field
types in frontend and access field type with keyword instead
of display name.
This prevents leaking some variables into an already
cluttered function.
We also add test coverage for what's now an
early-exit condition in the new function--we exempt
public MIT streams from these events.
This extends a test that proved only what Cordelia
could do with/without super_user privileges when she
was trying to send to an unsubscribed stream as herself.
Now the test shows the same powers extend to Cordelia
when she's sending messages on behalf of a mirrored
user.
This change was partially driven by a quirk in Python
where peephole optimizations make `continue` lines
appear not to be covered.
I also think it's generally a good idiom to extract
functions for loop bodies when they don't actually
accumulate values or maintain other state. With this
commit we now prevent potential bugs for vars like
`is_stream` leaking between loop iterations.
We simulate a race condition by mocking create_user
to actually create a user, but then raise an
IntegrityError (as if another process had actually
created the user, not our test).
I also changed the real code to use explicitly
named parameters.
I don't understand why this didn't cause test failures in CI; this
change was clearly required and test_change_realm_property was failing
consistently for me locally.
The missing fields are checked by `full_clean()` method.
The datetime field errors are ignored as they are fixed
in the `import_realm` script. The field that are
allowed to be null are not included while building
this object.
These test cases are used to test the cost of stream creation.
Three scenarios of stream creation are covered:
1) create a public stream;
2) create a private stream;
3) create a public stream with announce=true when there is a notification stream.
Fix: #4804.
We've been getting reports from users that our Freshdesk webhook
isn't working correctly. It turns out that the issue had nothing
to do with the webhook implementation itself!
In freshdesk/doc.md, we have a JSON template we ask users to
copy/paste into a textbox in the Freshdesk UI. That JSON template
contains "{{" and "}}" characters which we escaped as Unicode
decimals to prevent clashes with Jinja2 syntax in other parts
of the same template. This worked for a while!
But thanks to the changes introduced as part of the
nested_code_blocks extension, such escaped characters were never
decoded, leading users to copy/paste the same template but with
raw escaped unicode representations of "{{" and "}}" inside. And
that eventually broke our webhook implementation.
This commit makes sure that such characters are properly "unescaped",
just for Freshdesk docs.
We have code to prevent newbies on open realms
from inviting users. This is mostly intended
to hinder spammers. This commit just adds some
test coverage.
Our get_streams_traffic function used to query
all streams in the StreamCount table if you
passed in `None` for `streams`.
Now we require that you pass in a list of
stream_ids.
I don't know how much work this will save
the database, since probably the bulk of
the work is aggregating. If we need to fine
tune DB performance, we could possibly add
`realm` as an argument and add it to the filter.
What we'll immediately get, for large multi-realm
installations, is less data over the wire and
less work for the ORM.
The prior code uses an awkward idiom that
pre-dates the `exists()` function, and it
had an unreachable line of code.
The new version should be faster, since we
don't create a throwaway heavy Django object
or send needless data over the wire.
This functions appears to be redundant to
`access_stream_by_name`. The only
meaningful line of code in the function that we're
removing, the code that raises an error,
appears to be unreachable, despite reasonably
extensive tests.
The only thing the function was restricting
was that the case where the bot's owner was
unsubscribed to a private stream, which
is already locked down in
`access_stream_by_name` calls inside of
`patch_bot_backend`.
This commit increases test coverage
by removing unreachable code.
It's possible this function had
some theoretical value before we
introduced the `require_non_guest_human_user`
decorator to the `patch_bot_backend`
view, since in theory the bot itself
could have subscribed to a stream that
the owner didn't subscribe to. Even
then it's not clear that allowing the
bot to set that as a default stream
would have been harmful, since they
can already access it.
This commit adds some more tests related to patching
a bot's `default_sending_stream`.
Unfortunately, this didn't reach the code that I was
intending to add line coverage to, since checks happen
higher up in the stack, but the test code I added
is probably worthwhile.
We want our methodology for extracting the last message
id to be consistent, particularly in terms of how we
handle edge cases. (I'll concede that the
`bulk_remove_subscriptions` codepath never hits that
corner case in practice, but it's harmless to handle
the theoretical case.)
It may also be nice to have this function show up
clearly in profiling.
This also adds some direct testing to the function.
It's not clear to me why we don't use `latest('id')`
in the implementation, but that's outside the scope
of this commit.
If `TEXT_EMOJISET` is currently selected emojiset then fallback to
`GOOGLE_EMOJISET` for displaying emojis in emoji picker and
composebox typeahead. We should pre-load the spritesheets in`emoji.js`
even in case of text emojiset otherwise on slow networks emoji picker
will appear empty initially.
This de-clutters check_message a bit and also makes
it easy to audit our rules for who can write to a
stream.
Also, this works around a bug with Python where its
optimizations for the `pass` instruction make them
not appear to run and show up as uncovered in
coverage reports.
The timestamp used for new login notifications always used the 12-hour
format. Instead of that, we use now the one preferred by the user, as
reflected in their settings.
The TeamCity webhook plugin supports multiple payload formats that
are customized to be used by different services such as Slack,
Flowdock, etc. We don't support such payloads, so we should ignore
them and stick to parsing only the generic ones. We should also
notify that bot owner about the error.
This fixes an issue where passing a path like `~/exports/foo` would
result in a `~` directory being created and the export/import not
working correctly.
Fixes#10124.
Users in the waiting period category cannot subscribe other users to
a stream. When a user tries to mention another unsubscribed user, a
warning message appears with a subscribe button on it to subscribe
the other user.
This commit removes the subscribe button and changes the warning text
for users in the waiting period category.
Issue: When you created a new organization with /new, the "new login"
emails were emailed. We previously had a hack of adding the
.just_registered property to the user Python object to attempt to
prevent the emails, and checking that in zerver/signals.py. This
commit gets rid of the .just_registered check.
Instead of the .just_registered check, this checks if the user has
joined more than a minute before.
A test test_dont_send_login_emails_for_new_user_registration_logins
already exists.
Tweaked by tabbott to introduce the constant JUST_CREATED_THRESHOLD.
Fixes#10179.