The octet-stream content type is potentially under-specified, but it's
better than potentially submitting None and increases consistency of
this part of the codebase.
The boto library's s3 interface allows setting only string-format
metadata keys. So we need to cast the last_modified floating-point
timestamp into a string before storing on the S3 object.
This bug mostly broke uploading avatars when using the S3 storage backend.
This IntegrityError has been happening occasionally in production due
to races, likely due to some sort of mobile app double-post bug.
Handle this by avoiding a 500, and returning the same 400 we would do
if there hadn't been a race.
This commit adds a custom Markdown include extension which is
identical to the original except when a macro file can't
be found, it raises a custom JsonableError exception, which
we can catch and then trigger an appropriate test failure.
Fixes: #10947
Our HipChat conversion tool didn't properly handle basic avatar
images, resulting in only the medium-size avatar images being imported
properly. This fixes that bug by asking the import tool to do the
thumbnailing for the basic avatar image (from the .original file) as
well as the medium avatar image.
This adds a new realm_logo field, which is a horizontal-format logo to
be displayed in the top-left corner of the webapp, and any other
places where we might want a wide-format branding of the organization.
Tweaked significantly by tabbott to rebase, fix styling, etc.
Fixing the styling of this feature's loading indicator caused me to
notice the loading indicator for the realm_icon feature was also ugly,
so I fixed that too.
Fixes#7995.
Support for extended mention syntax was added as a part of
commit fbe99b812ee8fbca7257a5b7156c57a6cd74195b in the
python-zulip-api repository. The relevant function,
extract_query_without_mention now relies on the client's ID
in order to check for the extended syntax. Since the
EmbeddedBotHandler has no user_id attribute, the latest
python-zulip-api release broke a test in the main repo.
Apparently, when we renamed these files to no longer have a .txt
extension, we accidentally removed them from the set of strings for
translation, because `manage.py makemessages` by default only
processes .txt and .html files under the templates/ directory.
Fix this by adding a .txt extension.
It appears that our i18n logic was only using the recipient's language
for logged-in emails, so even properly tagged for translation and
translated emails for functions like "Find my team" and "password
reset" were being always sent in English.
With great work by Vishnu Ks on the tests and the to_emails code path.
The previous version was also doing almost the same thing.
But checking for DEVELOPMENT_LOG_EMAILS would allow us
to control the call of send_email by altering the value
of DEVELOPMENT_LOG_EMAILS in tests.
The previous logic for soft deactivation ended up doing a giant
transaction in the case that there were thousands of users to
deactivate; this was messy and potentially buggy.
The batched transactions were useful for RealmAuditLog management,
however. So the right solution is to do reasonably sized batches
(e.g. 100 users).
Apparently, our do_batch_update method (used, e.g., in a pgroonga
migration) was using semi-invalid syntax that was removed in postgres
10.
Thanks to Ilya Evseev for the report.
Fixes#11063.
This should make it possible for blueslip error reports to be sent on
our logged-out portico pages, which should in turn make it possible to
debug any such issues as they occur.
This checks if push_notification_enabled() is set to false in
handle_push_notification and adds an early return statement.
This is a significant performance optimization for our unit tests
because the push notifications code path does a number of database
queries, and this migration means we don't end up doing those queries
the hundreds of times we send PMs or mentions in our tests where we're
not trying to test the push notifications functionality.
This should also have a small message sending scalability improvement
for any Zulip servers without push notifications enabled.
Tweaked by tabbott to fix a few small issues.
Fixes#10895.
The previous migration code path was broken in two ways:
* ScheduledEmail objects generally contain a `None` value for
whichever of `to_user_id` and `to_email` isn't in use; this could
result in us sending a [None] to send_email(), which doesn't make
sense.
* We were calling handle_send_email_format_changes in the wrong order
with respect to the JSON loading process.
Thanks to Tom Daff for the report!
Fixes a bug in import_realm where secondary attributes like message
visibility weren't being set, and also makes bugs like this less likely in
the future.
Also, putting the plan_type change at the end of import_realm, so that
future restrictions to LIMITED realms don't affect the import process.
We should rate-limit users when our rate limiter deadlocks trying to
increment its count; we also now log at warning level (so it doesn't
send spammy emails) and include details on the user and route was, so
that we can properly investigate whether the rate-limiting on the
route was in error.
The code paths for accessing user-uploaded files are both (A) highly
optimized so as to not require a ton of work, and (B) a code path
where it's totally reasonable for a client to need to fetch 100+
images all at once (e.g. if it's the first browser open in a setting
with a lot of distinct senders with avatars or a lot of image
previews).
Additionally, we've been seeing exceptions logged in the production
redis configuration caused by this code path (basically, locking
failures trying to update the rate-limit data structures).
So we skip running our current rate limiting algorithm for these views.
Now that we've styled this feature properly, this makes it possible to
copy various user-preferences type profile data in production when
making a new account with the same email address as an existing
account.
Broaden the type of the AbstractEnum __reduce_ex__ parameter to object; this
matches the parameter type specified in the latest enum.pyi file in typeshed.
Fixes#10996.
Also, add a new notification sound, "ding". It comes from
https://freesound.org, where the original Zulip notification sound comes
from as well. In the future, new sounds can be added by adding audio
files to the `static/audio/notification_sounds` directory.
Tweaked significantly by tabbott:
* Avoided removing static/audio/zulip.ogg, because that file is
checked for by old versions of the desktop app.
* Added a views check for the sound being valid + tests.
* Added additional tests.
* Restructured the test_events test to be cleaner.
* Removed check_bool_or_string.
* Increased max length of notification_sound.
* Provide available_notification_sounds in events data set if global
notifications settings are requested.
Fixes#8051.
A key part of this is the new helper, get_user_by_delivery_email. Its
verbose name is important for clarity; it should help avoid blind
copy-pasting of get_user (which we'll also want to rename).
Unfortunately, it requires detailed understanding of the context to
figure out which one to use; each is used in about half of call sites.
Another important note is that this PR doesn't migrate get_user calls
in the tests except where not doing so would cause the tests to fail.
This probably deserves a follow-up refactor to avoid bugs here.
We've had a long stream of bugs existed because only one of these two
code paths was tested (usually the local uploads backend). By
deduplicating these functions, we ensure that this category of bugs no
longer happens.
Following my recent refactor, this is just a straightforward merge,
with code for one or the other backend ending up inside an if
statement.
Previously, we were incorrectly importing avatar PNGs to a filename
without the .png extension, resulting in them effectively not being
imported.
This was mitigated by the fact that we imported the originals and ran
the appropriate `ensure_` functions, but still a bug.
If a user deletes message between when it triggered a potential push
notification for another user, and when that notification was actually
sent, we'd end up with a situation where the `Message` table didn't
have an entry for the requested message ID.
The clean fix for this is to still throw an exception in the event
that the message doesn't exist at all, but if it exists in
ArchivedMessage, don't throw a user-facing exception.
This seems to happen when Apple is having a partial outage on some of
their APNS shards; it should be treated like other networking errors
connecting to APNS (with an automatic retry).
Apparently, we have a second code path where we might try to call
send_email library functions on old data, namely in the
queue_processors codebase. So we apply the same migration logic here.
This adds a function that sends provided email to all administrators
of a realm, but in a single email. As a result, send_email now takes
arguments to_user_ids and to_emails instead of to_user_id and
to_email.
We adjust other APIs to match, but note that send_future_email does
not yet support the multiple recipients model for good reasons.
Tweaked by tabbott to modify `manage.py deliver_email` to handle
backwards-compatibily for any ScheduledEmail objects already in the
database.
Fixes#10896.
This prevents the warning about push notifications not being
registered for from being printed in development environment startup
by default. In development, that's the expected state, and we don't
need to spam up the output with that notice.
Previously, we frequently accessed user_profile.realm from outside the
loops that interact with UserProfile objects. This variable reuse
outside the loop could be confusing and should be a style/lint
violation.
While in this case, the behavior was correct (in that all users in the
loops were within the same realm), extracting a separate `realm`
variable significantly clarifies what's going on here.
This reverts commit 2fa77d9d54.
Further investigation has determined that this did not fix the
password-reset problem described in the previous commit message;
meanwhile, it causes other problems. We still need to track down the
root cause of the original password-reset bug.
This provides a nice user experience for folks where we do know what
their LDAP credentials are.
Though we need to fix#10917 before the content in the email with be
correct.
This is the first step of letting users use Zulip markdown in their
SHORT_TEXT and LONG_TEXT custom profile fields, so that they can
include emphasis, links, etc.
This doesn't include any frontend logic yet, however.
This commit changes the return type of get_possible_mentions_info to a
list instead of a dict, thus disposing off the hacky logic of storing
users with duplicate full names with name|id keys that made the code
obfuscated.
The other functions continue to use the dicts as before, however, there
are minor variable changes where needed in accordance with the updated
definition of get_possible_mentions_info.
This function is equivalent to recipient_for_emails, but fetches
user_profiles by IDs, not by emails.
This commit is a part of our efforts surrounding #9474, but is
more primarily geared towards adding support for sending typing
notifications by user IDs.
Previously, get_user_profiles() was split into two functions:
* user_profiles_from_unvalidated_emails, which raised a
ValidationError upon encountering a non-existent user email.
* get_user_profiles, which caught the ValidationError raised
by user_profiles_from_unvalidated_emails and raised a
JsonableError instead.
According to Steve Howell, this complexity is partly a relic
of past refactoring and is unnecessarily heavy. It is better to
just raise JsonableError directly.
recipient_for_emails is used by our typing notifications code.
user_profiles_from_unvalidated_emails is used by our typing
notifications code *and* for sending messages.
user_profiles_from_unvalidated_emails is a part of a larger
framework used by Addressee to validate recipient emails when sending
messages and will eventually need to be removed as we move forward
with #9474. So it makes sense to just inline this function within
recipient_for_emails so that we don't break our typing notifications
code in the future.
This commit is a part of our efforts surrounding #9474.
This library was absolutely essential as part of our Python 2->3
migration process, but all of its calls should be either no-ops or
encode/decode operations.
Note also that the library has been wrong since the incorrect
refactoring in 1f9244e060.
Fixes#10807.
This adds a web flow and management command for reactivating a Zulip
organization, with confirmation from one of the organization
administrators.
Further work is needed to make the emails nicer (ideally, we'd send
one email with all the admins on the `To` line, but the `send_email`
library doesn't support that).
Fixes#10783.
With significant tweaks to the email text by tabbott.
The previous content made it sound like we were actually sending a
push notification, which could be confusing/alarming in some cases
(see e.g. 9c224ccdd3). Instead, we make
clear that we're sending it to all clients (which one might correctly
suspect is vacuous in the development environment).
While it could make sense to print these logging statements at WARN
level on server startup, it doesn't make sense to do so on every
message (though it perhaps did make sense to do so before more recent
changes added good ways to discover you forgot to configure push
notifications).
Instead, we now just do a WARN log on queue processor startup, and
then at DEBUG level for individual messages.
Fixes#10894.
Change the truncation marker from `...` to `\n[message truncated]`
when receiving messages from the API or through e-mail. Also, update
tests to account for the new change.
Fix#10871.
Our recent change in 2fa77d9d54 to
disable the cached_db cache backend broke upgrade-zulip-from-git with
an attributeerror; we fix that by checking the session engine before
trying to access its cache-related attributes.
This is initial work, which will help us establish habits of using a
well-tested approach for renaming a Zulip organization (since as part
of https://github.com/zulip/zulip-mobile/issues/3142, we'll likely
need to make this function do more).
This seems like kind of a silly function to extract
to topic.py, but it will theoretically help us sweep
"subject" if we change the DB.
It had test coverage.
We use the message a lot for the query modified
here, so I think it's worth taking the up-front
hit of getting bulkier objects to avoid O(N)
hops back to the database.
Our webhook-errors.log file is riddled with exceptions that are
logged when a webhook is incorrectly configured to send data in
a non-JSON format. To avoid this, api_key_only_webhook_view
now supports an additional argument, notify_bot_owner_on_invalid_json.
This argument, when True, will send a PM notification to the bot's
owner notifying them of the configuration issue.
We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
The various vars here that had recipient_subject
in the name now have either bucket or bucket_tup
there.
The shorter names are a bit easier to read, and the
original names were misleading for the PM case.
This was basically two search/replaces, and we have
good test coverage here, so it's pretty low risk
despite the messy diff.
We now attach zulip_db_data to the markdown engines
for classes that need it. This was the last remaining
global we had, so we remove `arguments.py` here.
The Markdown processor makes it fairly simple for
the helper classes to access the `md` engine. We
now write `_md_engine.zulip_message` to avoid having
the current message in the global namespace.
Note that we do reuse engines for multiple messages,
but each engine is specific to a realm. And we therefore
avoid even the theoretical possibility of leaking message
data between realms.
This makes us consistent with how we import codehilite.
Using Python's normal import mechanism avoids some overhead
with Markdown having to parse dotted notation.
These modules are tiny, so they shouldn't impact startup
too much. Also, by explicitly importing them, we avoid
the pitfall of having a sucessful startup and a broken
renderer.
We were building the same link regex every time
we build a Markdown engine, which happens twice
per realm. It's an expensive operation due to
the complexity of the regex and us reading a file.
Nested classes are kind of expensive in Python,
particularly when you throw in mypy annotations.
Also, flatter is arguably better, although it is
kind of a pain here not to have closures.
This change avoids hitting the Django ORM when
we don't find any possible group mentions in
the message content.
Django doesn't necessarily actually hit the database,
but it's still slow and shows up in profiles.
This commit speeds up the import by avoiding
sender lookups and instead using the data
for users that we already have in memory.
This avoids a few DB hops, many hops to memcached,
plus some object construction.
We now call do_render_markdown() directly. This
also makes it more explicit that the import has
never rendered alert words.
For the import-data codepath, we will call
the extracted function directly in a
subsequent commit.
The do_render_markdown() function has more
required parameters, which allows for more
explicit code and also allows us to flatten
out some logic related to alert words. (We
just pass in empty sets/dicts as needed).
We can rely on `message_realm` being the same
as `message.sender.realm`, which allows us to
skip two queries to the database for the rare
Zephyr mirroring case.
This function requires a message object, whereas
we want to work with JSON data to avoid necessary
queries when we import data. Inlining the function
sets us up for a subsequent refactoring.
We change the way we deal with theoretical return
values of `None` to use an assertion; otherwise,
we would have to loosen up a bunch of mypy types
from `str` to `Optional[str]`. It's not clear `None`
is even possible--we've moved toward throwing exceptions
there instead of silently failing.
This is somewhat hairy logic, so it's nice
to extract it and not worry about variable leaks.
Also, this moves some legacy "subject" references out
of actions.py.
We start by including functions that do custom
queries for topic history.
The goal of this library is partly to quarantine
the legacy "subject" column on Message.
Recently, one of our users reported that a JIRA webhook was not
able to send messages to a stream with a space character in its
name. Turns out that JIRA does something weird with webhook URLs,
such that escaped space characters (%20) are escaped again, so
that when the request gets to Zulip, the double escaped %20 is
evaluated as the literal characters `%20`, and not as a space.
We fix this by unescaping the stream name on our end before
sending the message forward!
The previous logic was incorrect, in that if `content_type` was set to
None (which happens with Slack/HipChat export, among other things),
then we wouldn't run the `guess_type` logic to auto-detect the
Content-Type to send to S3.
The UserMessage table can be huge, so creating a
bunch of entries in `ID_MAP` can overflow memory.
We don't have any tables that depend on `UserMessage`,
and we don't send the 'id' fields from `zerver_usermessage`
to the database, so re-mapping them was just busy-work.
This is a preparator refactor for supporting hosting different Tornado
processes on different servers; to look up which Tornado server we
should be sending the event to, we'll need the realm object.
Apparently, the QUERY_STRING property of the report object wasn't
actually a string; since we only care about its string representation,
we should just stringify it.
Fixes#10745.
Use get_display_recipient to get stream names, and remove the
references to message.stream_name in push_notifications.py which were
added in 97571a203, as the actual stream names were being retrived
only for Message objects associated with public streams.
When you send a message to a bot that wants
to talk via an outgoing webhook, and there's
an error (e.g. server is down), we send a
message to the bot's owner that links to the
message that triggered the error.
The code to produce those links was out of
date.
Now we move the important code to the
`url_encoding.py` library and fix the PM
links to use the more modern style (user_ids
instead of emails). We also replace "subject"
with "topic" in the stream urls.
This supports guest user in the user-info-form-modal as well as in the
role section of the admin-user-table.
With some fixes by Tim Abbott and Shubham Dhama.
This will help us eliminate camo from our production installs.
Basically it helps us de duplicate some code from upcoming code
which will help us check validity of a camo url.
Also, rename get_alert_from_message to get_gcm_alert.
With the implementation of the and get_apns_alert_title and
get_apns_alert_subtitle, the logic within get_alert_from_message
is only relevant to the GCM payload, so we adjust the name
accordingly.
Progresses #9949.
Resolves https://github.com/zulip/zulip-mobile/issues/1316.
The string that is returned from get_alert_from_message is
dependent upon the same message that is passed into get_apns_payload
and get_gcm_payload. The contents of those payloads that are tested via
TestGetAPNsPayload and TestGetGCMPayload, which makes the tests for
get_alert_from_message redundant.
Also, simplify the logic by removing the last elif conditional.
If we use an outgoing webhook and the web server
responds with `widget_content` in the payload, we
include that in what we send through the send-message
codepath.
This makes outgoing webhook bots more consistent with
generic bots.
Bots are not allowed to use the same name as
other users in the realm (either bot or human).
This is kind of a big commit, but I wanted to
combine the post/patch (aka add/edit) checks
into one commit, since it's a change in policy
that affects both codepaths.
A lot of the noise is in tests. We had good
coverage on the previous code, including some places
like event testing where we were expediently
not bothering to use different names for
different bots in some longer tests. And then
of course I test some new scenarios that are relevant
with the new policy.
There are two new functions:
check_bot_name_available:
very simple Django query
check_change_bot_full_name:
this diverges from the 3-line
check_change_full_name, where the latter
is still used for the "humans" use case
And then we just call those in appropriate places.
Note that there is still a loophole here
where you can get two bots with the same
name if you reactivate a bot named Fred
that was inactive when the second bot named
Fred was created. Also, we don't attempt
to fix historical data. So this commit
shouldn't be considered any kind of lockdown,
it's just meant to help people from
inadvertently creating two bots of the same
name where they don't intend to. For more
context, we are continuing to allow two
human users in the same realm to have the
same full name, and our code should generally
be tolerant of that possibility. (A good
example is our new mention syntax, which disambiguates
same-named people using ids.)
It's also worth noting that our web app client
doesn't try to scrub full_name from its payload in
situations where the user has actually only modified other
fields in the "Edit bot" UI. Starting here
we just handle this on the server, since it's
easy to fix there, and even if we fixed it in the web
app, there's no guarantee that other clients won't be
just as brute force. It wasn't exactly broken before,
but we'd needlessly write rows to audit tables.
Fixes#10509
When we create new ids for message rows, we
now sort the new ids by their corresponding
pub_date values in the rows.
This takes a sizable chunk of memory.
This feature only gets turned on if you
set sort_by_date to True in realm.json.
We could migrate all the current PREMIUM_FREE organizations to have more
invites, but this setting mainly affects orgs right as they are starting, so
it's probably fine.
We seemed to have been doing too much of sharpening on the thumbnails.
The purpose of sharpening here was to just counter the softening
effects of a resize on an image but overdoing it is bad.
Value sharpen(0.5,0.2,true) seems to look good for achieving the
best results here on different displays as revealed in the manual
hit and trial based testing.
Thanks to @borisyankov for pointing out the issue and suggesting
the values.
For some webhook endpoints where the third-party API requires us to do
this, the user's API key might appear in error emails through
appearing in the `QUERY_STRING` parameter. Fix that by filtering any
actual content from those; what we usually need for debugging is just
what set of parameters were provided.
The APNS client libraries (especially the hyper.http20 one) were
determined via profiling to take significant time during the import
process, so we move them to be lazily imported in order to optimize
the overall Zulip import process. This save up to about 100ms in
import time.
These libraries are only used in certain Django processes inside
zulipchat.com, and so are unnecessary both in development as well as
for self-hosted Zulip servers.
This is a prepartory commit for the upcoming changes. It was meaningful
to extract this one out because this function is essentially a condition
check on whether a given url is one of the user_uploads or an external
one. Based on its value we decide whether a url must be thumbnailed or
not and thus this function will also be used in an upcoming commit
patching lib/thumbnail.py to do the same check before thumbnail url
generation.
We are basically adding a check for url's to be external (belonging
to some 3rd party web site hosting the image) or be one of the
user uploaded files. User uploaded files are served by a separate
endpoint which is /user_uploads/. Any other local url such as
/user_avatars/ or /static/ should never be sent to thumbor for
thumbnailing.
Not sending /user_avatars/ to thumbor for thumbnailing makes sense
because they are already properly thumbnailed and stored properly.
/static/ urls host very few images we use for demo and can be safely
be excluded from thumbnailing.
The Zulip API is to be used on both development and production
servers, and really we just need to talk about zuliprc files.
There's a similar issue for the JS docs, but we need to fix the
copy/paste issues with those as well.
We don't want really long urls to lead to truncated
keys, or we could theoretically have two different
urls get mixed up previews.
Also, this suppresses warnings about exceeding the
250 char limit.
Finally, this gives the key a proper prefix.
We use UserMessageLite to avoid Django overhead, and we
do updates in chunks of 10000. (The export may be broken
into several files already, but a reasonable chunking at
import time is good defense against running out of memory.)
Now that we allow multiple users to have registered the same token, we
need to configure calls to unregister tokens to only query the
targeted user_id.
We conveniently were already passing the `user_id` into the push
notification bouncer for the remove API, so no migration for older
Zulip servers is required.
If cordelia searches on pm-with:iago@zulip.com,cordelia@zulip.com,
we now properly treat that the same way as pm-with:iago@zulip.com.
Before this fix, the query would initially go through the
huddle code path. The symptom wasn't completely obvious, as
eventually a deeper function would return a recipient id
corresponding to a single PM with @iago@zulip.com, but we would
only get messages where iago was the recipient, and not any
messages where he was the sender to cordelia.
I put the helper function for this in zerver/lib/addressee, which
is somewhat speculative. Eventually, we'll want pm-with queries
to allow for user ids, and I imagine there will be some shared
logic with other Addressee code in terms of how we handle these
strings. The way we deal with lists of emails/users for various
endpoints is kind of haphazard in the current code, although
granted it's mostly just repeating the same simple patterns. It
would be nice for some of this code to converge a bit. This
affects new messages, typing indicators, search filters, etc.,
and some endpoints have strange legacy stuff like supporting
JSON-encoded lists, so it's not trivial to clean this up.
Tweaked by tabbott to add some additional tests.
For our bots that use GenericOutgoingWebhookService
(which are basically Zulip style bots), we now
include a "content-type" header of "application/json".
We accomplish this by having the service classes
implement their own custom method called
`send_data_to_server`. For the Slack-related
code, we just extracted code from `do_rest_call`,
and then for the Zulip-related code, we added
a `headers` parameter.
If we omit methods in subclasses, they're likely to
be caught by linters or unit tests, and even if they
aren't, raising NotImplementedError doesn't actually
prevent user problems.
I've been fighting these in refactoring, and it's
just been a bunch of busy work, plus comments are
highly likely to bitrot.
This fixes a couple things:
* process_event() is a pretty vague name
* returning tuples should generally be avoided
* we were producing the same REST parameters in both
subclasses
* relative_url_path was always blank
* request_kwargs was always empty
Now process_event() is called build_bot_request(),
and it only returns request data,
not a tuple of `rest_operation` and `request_data`.
By no longer returning `rest_operation`, there are
fewer moving parts. We just have `do_rest_call` make
a POST call.
Before this change, we instantiated base_url into a superclass
of subclasses that returned base_url into a dictionary that
gets returned to our caller.
Now we just pull base_url out of service when we need to make
the REST call.
We move the JSON parsing step into the
higher level function: process_success_response().
In the unlikely event that we'll start integrating
with a solution that doesn't use JSON, we can deal
with that, and for now doing the parsing in one
place will help us make error reporting more
consistent.
In a subsequent commit we'll introduce better
error handling for malformed JSON.
The earlier code here, if it got a payload with
"response_string" as a key, would prefix the
corresponding value with "Success!". We just
want the bot to set its own content.
The code is reorganized here so that process_success()
always produces a value keyed by "content" from
incoming data, and then process_success_response()
doesn't do any fancy munging of the data.
Previously, Zulip did not correctly handle the case of a mobile device
being registered with a push device token being registered for
multiple accounts on the same server (which is a common case on
zulipchat.com). This was because our database `unique` and
`unique_together` indexes incorrectly enforced the token being unique
on a given server, rather than unique for a given user_id.
We fix this gap, and at the same time remove unnecessary (and
incorrectly racey) logic deleting and recreating the tokens in the
appropriate tables.
There's still an open mobile app bug causing repeated re-registrations
in a loop, but this should fix the fact that the relevant mobile bug
causes the server to 500.
Follow-up work that may be of value includes:
* Removing `ios_app_id`, which may not have much purpose.
* Renaming `last_updated` to `data_created`, since that's what it is now.
But none of those are critical to solving the actual bug here.
Fixes#8841.
We now allow outgoing webhooks to provide us a
"content" field, which is probably a more guessable
name than "response_string", particularly for folks
that use our other bot-related APIs. And we don't
modify content as we do response_string, i.e. no
"Success!" prefix.
If we're not too concerned about backward compatibility,
we can do a subsequent commit that makes "content"
and "response_string" true synonyms and get rid of
the "Success!" prefix, which was probably accidental
to begin with.
This commit starts by changing the third
argument of send_response_message to be a Dict
instead of a string, so that the data can be more
structured going forward.
That change makes the 2nd/3rd parameters both be
dicts, so to be defensive, I now have all the callers
pass in explicit keyword names. And then I rename
message to message_info, so that the callers have
more clear code.
And that changes the implementation inside of
send_response_message() a bit.
Sorry this commit is a bit coarse, but the intermediate
commits would have been kind of ugly, too.
At the end of the day, it's pretty simple:
bot_id: never changed
message_info: just renamed from message
response_data: is a Dict with the key of "content"
And the innards of send_response_message() are basically
simply dictionary lookups and function calls.
There's no reason to return a failure message in
process_success(), since it's implied to be part of
the success codepath. I didn't look at the full history
of how the strange API evolved, but the second element
of the tuple was clearly noise by the time I got here.
Neither of the subclasses ever set it, and none of the
consumers used it.
This two-line function wasn't really carrying its
weight, and it just made it harder to refactor the
overall codepath.
Eliminating the function forces us to mock at a slightly
deeper level, which is probably a good thing for what
the test intends to do. The deeper mock still verifies that
we're sending the message (good) without digging into
all the details of how we send it (good).
Note that we will still keep around the similarly named
`fail_with_message` helper, which is a lot more useful.
(The succeed/fail scenarios aren't really symmetric here.
For success, there are fewer codepaths that do more complex
things, whereas we have lots and lots of failure codepaths
that all do the same simple thing of replying with a canned
message.)
Before this change subclasses of OutgoingWebhookServiceInterface
would return a raw string as the first element of its return
tuple in process_success(). This is not a very flexible
design, as it prevents the bot from passing extra data like
`widget_content`.
It's also possible in the future that we'll want to let outgoing
bots reply directly to senders who mention them on streams, and
again the original design was overly constrained for that.
This commit does not actually change any functionality yet.
The code was needlessly querying the DB to get full
objects for entities where we only needed user_id,
realm_id, and stream_id.
With my test data of ~1000 records this sped up the
function from ~8s to ~0.5s. The speedup would probably
be even more for larger data sets.
Fixes the urgent part of #10397.
It was discovered that soft-deactivated users don't get mobile push
notifications for messages on private streams that they have configured
to send push notifications.
Reason: `handle_push_notification` calls `access_message`, and that
logic assumes that a user who is a recipient of a message has an
associated UserMessage row. Those UserMessage rows are created
lazily for soft-deactivated users, so they might not exist (yet)
until the user comes back.
Solution: Ensure that userMessage row is created for
stream_push_user_ids and stream_email_user_ids in create_user_messages.
At some point as part of the process of supporting renumbering data,
we changed the structure of our file uploads to expect `path` to match
`s3_path`, with both having the relative path within the overall
hierarchy (including the realm ID). This change updates the more
rarely-used S3 export code path to use that model, fixing a crash when
messages reference an Attachment object with a rewritten path_id.
If any user had sent the reply to the welcome bot recommended by our
tutorial, then the Zulip export/import process didn't work properly,
because we weren't including (and then remapping) the recipient ID for
sending PMs to the cross-realm bots. This commit fixes that gap, by
recording the necessary data on the export side, and doing the
appropriate remapping on the import side.
Previously, our realm import logic only did the special remapping
logic for the original notifications_stream_id; when we added the new
signup_notifications_stream_id field, we neglected to handle it in the
same way.
In the event that two processes are racing to be the
first to load data from zulip.yaml, we now make the
race scenario be duplicated effort instead of having
the second racer get an attribute error on `data`.
We do this by declaring victory only after setting
`data`. "Declaring victory" in this case is a matter
of setting `last_update`.
We are still possibly vulnerable to corrupted data
here, so we should investigate a mutex, or just
read the data on every call (but it's strangely
expensive, almost 3.5s on my instance), or converting
the YAML to code before launching the server.
We start by stripping the ids in front of the name before the database
lookup. This has the advantage of not mentioning anyone if an incorrect
user id and full name combination is specified, as well as not having
the query the database twice, once by fullname and next by id.
Previously, we were storing only the most recent person with the same
full name as others; this commit adds new keys to the dict such that
simply looking by name would get you the newest user with this name,
and the get_user_by_id function can index the remaining users.
We also remove some unreachable code. Calling
split() always returns at least one token, even
if it's just the empty string. This is tested
directly on this commit, plus messages with
empty content get rejected pretty early in
the execution path.
In user type custom field, field value is list of user ids. We weren't
converting list to json object in update event payload. This throws
error in frontend, cause we store stringify representation of custom
field value. Therefore, after update event is recieved field-value-
type gets updated to array from string which throws json parsing error.
Having HTML (or HTML-like) content in the examples was making parts of
the content invisible, since the browser identified them as HTML tags
rather than verbose text.
There are some endpoints that don't fall into the currently available
categories, so this new function will be used for calling the tests for
server and realm-related endpoints.
Using early-exit here allows us to more easily
comment why there are certain exemptions to
this logic.
We also only require callers to pass in realm,
not the whole user object.
Since this class was built, folks have always chosen
to subclass JsonableError for situations where
the default of ErrorCode.BAD_REQUEST is insufficient.
So now we simplify the use cases, which also gets
us 100% coverage on this core module.
This commit add FIELD_TYPE_CHOICES_DICT to page_params and replace
FIELD_TYPE_CHOICES.
FIELD_TYPE_CHOICES_DICT includes all field types with keyword, id
and display name. Using this field-type-dict, we can access field
type information by it's keyword, and remove all static use of
field-type'a name or id in frontend.
This commit also modifies functions in js where this page_params
field-types is used.
This commit modifies FIELD_TYPE_DATA dict in `CustomProfileField`
model to store keyword of field types. And create new dict
FIELD_TYPE_CHOICES_DICT to store all field type information
by field type keyword, i.e. id, name.
This is preparatory commit to remove all static use of field
types in frontend and access field type with keyword instead
of display name.
This prevents leaking some variables into an already
cluttered function.
We also add test coverage for what's now an
early-exit condition in the new function--we exempt
public MIT streams from these events.
This change was partially driven by a quirk in Python
where peephole optimizations make `continue` lines
appear not to be covered.
I also think it's generally a good idiom to extract
functions for loop bodies when they don't actually
accumulate values or maintain other state. With this
commit we now prevent potential bugs for vars like
`is_stream` leaking between loop iterations.
We simulate a race condition by mocking create_user
to actually create a user, but then raise an
IntegrityError (as if another process had actually
created the user, not our test).
I also changed the real code to use explicitly
named parameters.
I don't understand why this didn't cause test failures in CI; this
change was clearly required and test_change_realm_property was failing
consistently for me locally.
Our get_streams_traffic function used to query
all streams in the StreamCount table if you
passed in `None` for `streams`.
Now we require that you pass in a list of
stream_ids.
I don't know how much work this will save
the database, since probably the bulk of
the work is aggregating. If we need to fine
tune DB performance, we could possibly add
`realm` as an argument and add it to the filter.
What we'll immediately get, for large multi-realm
installations, is less data over the wire and
less work for the ORM.
The prior code uses an awkward idiom that
pre-dates the `exists()` function, and it
had an unreachable line of code.
The new version should be faster, since we
don't create a throwaway heavy Django object
or send needless data over the wire.
This functions appears to be redundant to
`access_stream_by_name`. The only
meaningful line of code in the function that we're
removing, the code that raises an error,
appears to be unreachable, despite reasonably
extensive tests.
The only thing the function was restricting
was that the case where the bot's owner was
unsubscribed to a private stream, which
is already locked down in
`access_stream_by_name` calls inside of
`patch_bot_backend`.
This commit increases test coverage
by removing unreachable code.
It's possible this function had
some theoretical value before we
introduced the `require_non_guest_human_user`
decorator to the `patch_bot_backend`
view, since in theory the bot itself
could have subscribed to a stream that
the owner didn't subscribe to. Even
then it's not clear that allowing the
bot to set that as a default stream
would have been harmful, since they
can already access it.
We want our methodology for extracting the last message
id to be consistent, particularly in terms of how we
handle edge cases. (I'll concede that the
`bulk_remove_subscriptions` codepath never hits that
corner case in practice, but it's harmless to handle
the theoretical case.)
It may also be nice to have this function show up
clearly in profiling.
This also adds some direct testing to the function.
It's not clear to me why we don't use `latest('id')`
in the implementation, but that's outside the scope
of this commit.