This is a first step toward two goals:
* support dictionary-like narrows when registering events
* use readable dataclasses internally
This is going to be a somewhat complicated exercise due to how
events get serialized, but fortunately this interim step
doesn't require any serious shims, so it improves the codebase
even if the long-term goals may take a while to get sorted
out.
The two places where we have to use a helper to convert narrows
from tuples to dataclasses will eventually rely on their callers
to do the conversion, but I don't want to re-work the entire
codepath yet.
Note that the new NarrowTerm dataclass makes it more explicit
that the internal functions currently either don't care about
negated flags or downright don't support them. This way mypy
protects us from assuming that we can just add negated support
at the outer edges.
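A minimal sketch of the shape of this change; the helper name and the
legacy tuple format are assumptions, not necessarily the final API:
```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass
class NarrowTerm:
    # Deliberately no `negated` field yet: mypy will flag any code that
    # assumes negation support before it actually exists.
    operator: str
    operand: str

def narrow_dataclasses_from_tuples(tups: Sequence[Tuple[str, str]]) -> List[NarrowTerm]:
    # The interim helper: convert legacy (operator, operand) tuples
    # until callers construct NarrowTerm objects themselves.
    return [NarrowTerm(operator=op, operand=val) for op, val in tups]
```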
OTOH I do make a tiny effort here to slightly restructure
narrow_filter in a way that paves the way for negation support.
The bigger goal by far, though, is to at least support the
dictionary format.
We no longer pass in a big opaque event to narrow_filter
(which is inside build_narrow_filter). We instead explicitly
pass in message and flags. This leads to a bit more type
safety, and it's also more flexible. There's no reason to
build an entire event just to see if a message belongs to
a narrow.
The changes to the test work around the fact that the fixtures
are sloppy with types. I plan a subsequent commit to clean
up those tests significantly.
django-stubs 4.2.1 gives transaction.on_commit a more accurate type
annotation, but this exposed that mypy can’t handle the lambda default
parameters that we use to recapture loop variables such as
for stream_id in public_stream_ids:
peer_user_ids = …
event = …
transaction.on_commit(
lambda event=event, peer_user_ids=peer_user_ids: send_event(
realm, event, peer_user_ids
)
)
https://github.com/python/mypy/issues/15459
A workaround that mypy accepts is
transaction.on_commit(
(
lambda event, peer_user_ids: lambda: send_event(
realm, event, peer_user_ids
)
)(event, peer_user_ids)
)
But that’s kind of ugly and potentially error-prone, so let’s make a
helper function for this very common pattern.
send_event_on_commit(realm, event, peer_user_ids)
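A sketch of such a helper, assuming send_event keeps its existing
(realm, event, user_ids) calling convention:
```python
from django.db import transaction

def send_event_on_commit(realm, event, user_ids) -> None:
    # send_event is assumed to be the existing function in this module.
    # The arguments are bound here, so by the time the transaction
    # commits, it no longer matters that the loop variables moved on.
    transaction.on_commit(lambda: send_event(realm, event, user_ids))
```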
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit makes it possible for users to control the wildcard
mention notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting
through the UI yet.
This commit makes it possible for users to control
the push notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting
through the UI yet.
This commit makes it possible for users to control
the email notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting
through the UI yet.
Add five new fields to the UserBaseSettings class for
the "followed topic notifications" feature, similar to
stream notifications. However, this commit consists only of
the implementation of email notifications.
In #23380 we want to replace all occurrences of `uri` with `url`.
This commit replaces the occurrences that appear in the variable
name `tornado_uri` and the function name `get_tornado_uri`.
This swaps out url_format_string from all of our APIs and replaces it
with url_template. Note that the documentation changes in the following
commits will be squashed with this commit.
We change the "url_format" key to "url_template" for the
realm_linkifiers events in event_schema, along with updating
LinkifierDict. "url_template" is the name chosen to normalize
mixed usages of "url_format_string" and "url_format" throughout
the backend.
The markdown processor is updated to stop handling the format string
interpolation and delegate the task of template expansion to the
uri_template library instead.
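For illustration, the expansion now delegated to the library looks
roughly like this (the linkifier pattern here is made up):
```python
from uri_template import URITemplate

# Hypothetical linkifier mapping "#1234" to a GitHub issue URL:
template = URITemplate("https://github.com/zulip/zulip/issues/{id}")
print(template.expand(id="1234"))
# https://github.com/zulip/zulip/issues/1234
```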
This change affects many test cases. We mostly just replace "%(name)s"
with "{name}", "url_format_string" with "url_template" to make sure that
they still pass. There are some test cases dedicated to testing "%"
escaping, which aren't relevant anymore and are subject to removal.
But for now we keep most of them as-is, and make sure that "%" is always
escaped since we do not use it for variable substitution any more.
Since url_format_string is not populated anymore, a migration is created
to remove this field entirely, and make url_template non-nullable since
we will always populate it. Note that it is possible to have
url_template being null after migration 0422 and before 0424, but
in practice, url_template will not be None after backfilling and the
backend now is always setting url_template.
With the removal of url_format_string, the RealmFilter model will now be
cleaned with URL template checks, and the old checks for escapes are removed.
We also modified RealmFilter.clean to skip the validation when the
url_template is invalid. This avoids raising multiple ValidationErrors
when calling full_clean on a linkifier. But we might eventually want to
have a more centralized approach to data validation instead of having
the same validation in both the clean method and the validator.
Fixes #23124.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Refactors instances of `message_type_name` and `message_type`
that are referring to API message type value ("stream" or
"private") to use `recipient_type_name` instead.
Prep commit for adding "direct" as a value for endpoints with a
`type` parameter to indicate whether the message is a stream or
direct message.
This code is called in the hot path when Tornado is processing events.
As such, making this code performant is important. Profiling shows
that a significant portion of the time is spent calling asdict() to
serialize the UserMessageNotificationsData dataclass. In this case
`asdict` does several steps which we do not need, such as attempting
to recurse into its fields, and deepcopy'ing the values of the fields.
In our use case, these add a notable amount of overhead:
```py3
from zerver.tornado.event_queue import UserMessageNotificationsData
from dataclasses import asdict
from timeit import timeit
o = UserMessageNotificationsData(1, False, False, False, False, False, False, False, False, False, False, False)
%timeit asdict(o)
%timeit {**vars(o)}
```
Replace the `asdict` call with a direct access of the fields. We
perform a shallow copy because we do need to modify the resulting
fields.
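A sketch of the replacement, reusing the object from the snippet above
(the modification shown is hypothetical):
```python
# Shallow copy of the fields, skipping asdict()'s recursion and
# deepcopies; safe because we only reassign top-level keys afterward.
internal_data = {**vars(o)}
internal_data["flags"] = ["mentioned"]  # mutates only the copy, not `o`
```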
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.
(This does not actually upgrade our Python environment to Black 23
yet.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
A missed message email notification, where the message is the welcome
message sent by the welcome bot on account creation, gets sent when
the user somehow does not focus the browser tab during account creation.
No missed message email or push notifications should be sent for the
messages generated by the welcome bot.
'internal_send_private_message' accepts a parameter
'disable_external_notifications', which is set to 'True' when the
sender is the welcome bot.
A check is introduced in `trivially_should_not_notify` to skip
notifying when `disable_external_notifications` is true, as sketched below.
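A sketch of the added check; the class and its surrounding early-return
logic are heavily abridged:
```python
from dataclasses import dataclass

@dataclass
class UserMessageNotificationsData:
    # Only the field relevant here; the real class has many more.
    disable_external_notifications: bool

    def trivially_should_not_notify(self, acting_user_id: int) -> bool:
        if self.disable_external_notifications:
            # Messages from e.g. the welcome bot never notify externally.
            return True
        # ...existing early-return checks continue here...
        return False
```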
Test cases are updated to include the `disable_external_notifications`
check in the early (False) return patterns of `is_push_notifiable` and
`is_email_notifiable`.
One query is saved for both `test_create_user_with_multiple_streams`
and `test_register`.
Reason: when the welcome bot sends a message after user creation,
`do_send_messages` calls `get_active_presence_idle_user_ids`;
`user_ids` in `get_active_presence_idle_user_ids` remains empty if
`disable_external_notifications` is true, because `is_notifiable` returns
false.
`get_active_presence_idle_user_ids` calls `filter_presence_idle_user_ids`
and since the `user_ids` is empty, the query inside the function doesn't
get executed.
MissedMessageHookTest updated.
Fixes: #22884
When doing rapid-pace mark-as-unread in the Zulip web application, one
observed assertion failures showing that the server would send an
event containing multiple message IDs but with only one of the messages
present in the message_details side data structure.
The cause of this was the "virtual events" compression system; two
flags/remove/read events were being combined by simply concatenating
the lists of events, without any attempt to merge the
`message_details` field on those objects.
The immediate fix is to disable virtual events compression for this
event class, but it's not unlikely we'll need to just eliminate the
virtual_events system entirely, because it seems difficult to make it
soundly handle a message whose state for a given flag changes back and
forth while the client is offline.
But we'll leave that for later, since removing that optimization
deserves more discussion than fixing this event corruption bug.
`return` inside `finally` blocks causes exceptions to be silenced.
Although these blocks follow blanket ‘except Exception’ handlers, they
do not seem to have a goal of silencing BaseException and exceptions
thrown by the exception handler, so rewrite them to avoid it.
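A minimal demonstration of the hazard:
```python
def swallowed() -> int:
    try:
        raise KeyboardInterrupt  # a BaseException, not an Exception
    except Exception:
        pass  # does not catch it...
    finally:
        return 0  # ...but this return silently discards it

print(swallowed())  # prints 0; the KeyboardInterrupt is lost
```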
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Mobile clients older than v27.192 do not support PRONOUNS type
custom profile fields, so we instead change the type of it to
SHORT_TEXT in the data sent with register response and also in
the events sent to those clients.
One should now be able to configure a regex by appending _regex to the
port number:
[tornado_sharding]
9802_regex = ^[l-p].*\.zulipchat\.com$
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We now send a new user_topic event while muting and unmuting topics.
fetch_initial_state_data now returns an additional user_topics array to
the client that will maintain the user-topic relationship data.
This will support any future addition of new features to modify the
relationship between a user and a topic.
This commit adds the relevant backend code and schema for the new
event.
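For illustration, the new event looks roughly like this (the exact
field set is an assumption):
```python
event = {
    "type": "user_topic",
    "stream_id": 1,
    "topic_name": "hello world",
    "last_updated": 1666400000,
    # visibility_policy generalizes a plain muted/unmuted boolean:
    "visibility_policy": 1,  # e.g., 1 = muted, 0 = none
}
```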
Instead of using `request.POST` to access the `data` parameter used
in the internal `notify_tornado` path, this adds the
`has_request_variables` decorator and accesses `data` as a `REQ`
parameter.
Expands `test_tornado_endpoint` in `test_event_system.py` for
`data` being a required parameter for this path.
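A sketch of the rewritten view, using the REQ pattern (the body and
validator details are elided):
```python
from django.http import HttpRequest, HttpResponse

from zerver.lib.request import REQ, has_request_variables

@has_request_variables
def notify(request: HttpRequest, data: str = REQ()) -> HttpResponse:
    # `data` is now a required REQ parameter rather than a raw
    # request.POST lookup; omitting it produces the standard
    # "missing parameter" JSON error that the test now asserts.
    ...
```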
This was for the old /messages/latest API that was removed in commit
e06722657a.
If we wanted a new check like this, it shouldn’t go in zulip_finish,
because that only runs when the client gets an asynchronous response
from polling an initially-empty queue, and not when the client gets a
synchronous response from polling a nonempty queue.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
For the same reason as with `handler_id`, we define `_request`
as a class attribute. Note that the name `request` is already taken.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This prevents us from relying on a side-effect of `allocate_handler_id`
that monkey-patches `handler_id` on the `AsyncDjangoHandler` object,
allowing mypy to acknowledge the existence of `handler_id` as an `int`.
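A sketch of the declared attributes on the handler class (base-class
details elided):
```python
from django.http import HttpRequest
from tornado import web

class AsyncDjangoHandler(web.RequestHandler):
    # Declared up front so mypy knows about them, instead of being
    # monkey-patched onto instances at runtime:
    handler_id: int        # assigned via allocate_handler_id
    _request: HttpRequest  # converted Django request; "request" is taken
```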
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Commit 6fd1a558b7 (#21469) introduced an
await point where get_events_backend calls fetch_events in order to
switch threads. This opened the possibility that, in the window
between the connect_handler call in fetch_events and the old location
of this assignment in get_events_backend, an event could arrive,
causing ClientDescriptor.add_event to crash on missing
handler._request. Fix this by assigning handler._request earlier.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
According to the documentation: “Pika does not have any notion of
threading in the code. If you want to use Pika with threading, make
sure you have a Pika connection per thread, created in that thread. It
is not safe to share one Pika connection across threads, with one
exception: you may call the connection method add_callback_threadsafe
from another thread to schedule a callback within an active pika
connection.”
https://pika.readthedocs.io/en/stable/faq.html
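The one documented thread-safe pattern, sketched (the callback here is
illustrative):
```python
import threading

import pika

# Created and otherwise used only on this thread:
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))

def from_worker_thread() -> None:
    # Do NOT touch `connection` directly here; schedule the work back
    # onto the connection's own thread instead.
    connection.add_callback_threadsafe(
        lambda: print("runs on the connection's thread")
    )

threading.Thread(target=from_worker_thread).start()
```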
This also means that synchronous Django code running in Tornado will
use its own synchronous SimpleQueueClient rather than sharing the
asynchronous TornadoQueueClient, which is unfortunate but necessary as
they’re about to be on different threads.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We previously forked tornado.autoreload to work around a problem where
it would crash if you introduced a syntax error and would not recover
when you fixed it (https://github.com/tornadoweb/tornado/issues/2398).
A much more maintainable workaround for that issue, at least in
current Tornado, is to use tornado.autoreload as the main module.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
When `update_message` events were updated to have a consistent
format for both normal message updates/edits and special
rendering preview updates, the logic used in the tornado event
queue processor to identify the special events for sending
notifications no longer applied.
Updates that logic to use the `rendering_only` flag (if present)
that was added to the `update_message` event format to identify
if the event processor should potentially send notifications to
users.
For upgrade compatibility, if the `rendering_only` flag is not present,
the previous event structure is assumed, and the absence of the
`user_id` property indicates a special rendering preview update.
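A sketch of that compatibility logic:
```python
from typing import Any, Dict

def is_rendering_only_update(event: Dict[str, Any]) -> bool:
    if "rendering_only" in event:
        # New event format: the flag says it directly.
        return event["rendering_only"]
    # Old format (pre-upgrade events still in flight): rendering
    # preview updates were the ones sent without a user_id.
    return "user_id" not in event
```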
Fixes #16022.
`cachify` essentially caches the return value of a function, using only
the non-keyword-only arguments as the key.
The use case of the function in the backend can be sufficiently covered by
`functools.lru_cache` as an unbounded cache. There is no significant
difference apart from `cachify` overlooking keyword-only arguments, and
`functools.lru_cache` being conveniently typed.
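The replacement, sketched on a hypothetical cached function:
```python
from functools import lru_cache

# maxsize=None gives an unbounded cache, matching cachify's behavior;
# unlike cachify, keyword arguments participate in the cache key.
@lru_cache(maxsize=None)
def read_fixture(path: str) -> str:
    with open(path) as f:
        return f.read()
```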
Signed-off-by: Zixuan James Li <359101898@qq.com>
Adds request as a parameter to json_success as a refactor towards
making `ignored_parameters_unsupported` functionality available
for all API endpoints.
Also, removes any data parameters that are an empty dict or
a dict with the generic success response values.
This replaces the temporary (and testless) fix in
24b1439e93 with a more permanent
fix.
Instead of checking if the user is a bot just before
sending the notifications, we now just don't enqueue
notifications for bots. This is done by sending a list
of bot IDs to the event_queue code, just like other
lists which are used for creating NotificationData objects.
Credit @andersk for the test code in `test_notification_data.py`.
A SIGTERM can show up at any point in the ioloop, even in places which
are not prepared to handle it. This results in the process ignoring
the `sys.exit` which the SIGTERM handler calls, with an uncaught
SystemExit exception:
```
2021-11-09 15:37:49.368 ERR [tornado.application:9803] Uncaught exception
Traceback (most recent call last):
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/http1connection.py", line 238, in _read_message
delegate.finish()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/httpserver.py", line 314, in finish
self.delegate.finish()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/routing.py", line 251, in finish
self.delegate.finish()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2097, in finish
self.execute()
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2130, in execute
**self.path_kwargs)
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
yielded = next(result)
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 1510, in _execute
result = method(*self.path_args, **self.path_kwargs)
File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 150, in get
request = self.convert_tornado_request_to_django_request()
File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 113, in convert_tornado_request_to_django_request
request = WSGIRequest(environ)
File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/django/core/handlers/wsgi.py", line 66, in __init__
script_name = get_script_name(environ)
File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/event_queue.py", line 611, in <lambda>
signal.signal(signal.SIGTERM, lambda signum, stack: sys.exit(1))
SystemExit: 1
```
Supervisor then terminates the process with a SIGKILL, which results
in dropping data held in the tornado process, as it does not dump its
queue.
The only command which is safe to run in the signal handler is
`ioloop.add_callback_from_signal`, which schedules the callback to run
during the course of the normal ioloop. This callback does an
orderly shutdown of the server and the ioloop before exiting.
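A sketch of the safe pattern (the shutdown body is illustrative):
```python
import signal

from tornado.ioloop import IOLoop

ioloop = IOLoop.current()

def do_orderly_shutdown() -> None:
    # Runs on the normal ioloop: stop the HTTP server, dump the event
    # queues, then stop the loop so the process can exit cleanly.
    ioloop.stop()

signal.signal(
    signal.SIGTERM,
    # The only safe call from the handler itself:
    lambda signum, frame: ioloop.add_callback_from_signal(do_orderly_shutdown),
)
```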
This fixes a problem where we could not import zerver.lib.streams from
zerver.lib.message, which would otherwise be reasonable, because the
former implicitly imported many modules due to this issue.
This utilizes the generic `BaseNotes` we added for multipurpose
patching. With this migration as an example, we can further support
more types of notes to replace the monkey-patching approach we have used
throughout the codebase for type safety.
These changes are all independent of each other; I just didn’t feel
like making dozens of commits for them.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This fixes a bug where email notifications were sent for wildcard
mentions even if the `enable_offline_email_notifications` setting was
turned off.
This was because the `notification_data` class incorrectly considered
`wildcard_mentions_notify` as an independent setting, instead of a wrapper
around `enable_offline_email_notifications` and `enable_offline_push_notifications`.
Also add a test for this case.
This commit adds "user_settings_object" field to
client_capabilities which will be used to determine
if the client needs 'update_display_settings' and
'update_global_notifications' events.
Return zulip_merge_base alongside zulip_version
in `/register`, `/event` and `/server_settings`
endpoints so that the value can be used by other
clients.
Previously, we checked for the `enable_offline_email_notifications` and
`enable_offline_push_notifications` settings (which determine whether the
user will receive notifications for PMs and mentions) just before sending
notifications. This has a few problems:
1. We do not have access to all the user settings in the notification
handlers (`handle_missedmessage_emails` and `handle_push_notifications`),
and therefore, we cannot correctly determine whether the notification should
be sent. Checks like the following, which existed previously, will, for
example, incorrectly not send notifications even when stream email
notifications are enabled:
```
if not receives_offline_email_notifications(user_profile):
return
```
With this commit, we simply do not enqueue notifications if the "offline"
settings are disabled, which fixes that bug.
Additionally, this also fixes a bug with the "online push notifications"
feature, which was, if someone were to:
* turn off notifications for PMs and mentions (`enable_offline_push_notifications`)
* turn on stream push notifications (`enable_stream_push_notifications`)
* turn on "online push" (`enable_online_push_notifications`)
then, they would still receive notifications for PMs when online.
This isn't how the "online push enabled" feature is supposed to work;
it should only act as a wrapper around the other notification settings.
The buggy code was this in `handle_push_notifications`:
```
if not (
receives_offline_push_notifications(user_profile)
or receives_online_push_notifications(user_profile)
):
return
# send notifications
```
This commit removes that code, and extends our `notification_data.py` logic
to cover this case, along with tests.
2. The name for these settings is slightly misleading. They essentially
talk about "what to send notifications for" (PMs and mentions), and not
"when to send notifications" (offline). This commit improves this condition
by restricting the use of this term only to the database field, and using
clearer names everywhere else. This distinction will be important to have
non-confusing code when we implement multiple options for notifications
in the future as a dropdown (never / when offline / when offline or online, etc.).
3. We should ideally re-check all notification settings just before the
notifications are sent. This is especially important for email notifications,
which may be sent a long time after the message was sent. We will
in the future add code to thoroughly re-check settings before sending
notifications in a clean manner, but temporarily not re-checking isn't
a terrible scenario either.
This is necessary to break the uncollectable reference cycle created
by our ‘request_notes.saved_response = json_response(…)’, Django’s
‘response._resource_closers.append(request.close)’, and Python’s
https://bugs.python.org/issue44680.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This prevents a memory leak arising from Python’s inability to collect
a reference cycle from a WeakKeyDictionary value to its key
(https://bugs.python.org/issue44680).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This concludes the HttpRequest migration to eliminate arbitrary
attributes (except private ones that belong to Django) attached
to the request object at runtime, migrating them to a
separate data structure dedicated to the purpose of adding
information (so-called notes) to an HttpRequest.
This includes the migration of fields that require only trivial changes
to be stored with ZulipRequestNotes.
Specifically _requestor_for_logs, _set_language, _query, error_format,
placeholder_open_graph_description, saved_response, which were all
previously set on the HttpRequest object at some point. This migration
allows them to be typed.
We will no longer use the HttpRequest to store the rate limit data.
Using ZulipRequestNotes, we can access rate_limit and ratelimits_applied
with type hint support. We also avoid having to explicitly initialize
ratelimits_applied by giving it a default value.
We create a class called ZulipRequestNotes as a new home for all the
additional attributes that we add to the Django HttpRequest object.
This allows mypy to typecheck these attributes and enforces type safety.
Most of the attributes are added in the middleware, and thus it is
generally safe to assert that they are not None in a code path that
goes through the middleware. The caller is otherwise obligated to do
the type check manually.
This also resolves some cyclic dependencies that zerver.lib.request
has with zerver.lib.rate_limiter and zerver.tornado.handlers.
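A minimal sketch of the approach; the real class carries many more
fields:
```python
import weakref
from dataclasses import dataclass, field
from typing import List, Optional

from django.http import HttpRequest

@dataclass
class ZulipRequestNotes:
    error_format: Optional[str] = None
    ratelimits_applied: List[object] = field(default_factory=list)

# One notes object per request, without attaching attributes to it:
request_notes: "weakref.WeakKeyDictionary[HttpRequest, ZulipRequestNotes]" = (
    weakref.WeakKeyDictionary()
)

def get_request_notes(request: HttpRequest) -> ZulipRequestNotes:
    return request_notes.setdefault(request, ZulipRequestNotes())
```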
* `stream_name`: This field is actually redundant. The email/push
notification handlers don't use that field from the dict, and they
query for the message anyway, so we're safe in deleting this field,
even if in the future we end up needing the stream name.
* `timestamp`: This is totally unused by the email/push notification
handlers, and isn't sent to push clients either.
* `type` is used only for the push notifications handler, since only
push notifications can be revoked, so we move that logic to run only there.
We will later use this data to include text like:
`<sender> mentioned @<user_group>` instead of the current
`<sender> mentioned you` when someone mentions a user group
the current user is a part of in email/push notifications.
Part of #13080.
JsonableError has two major benefits over json_error:
* It can be raised from anywhere in the codebase, rather than
being a return value, which is much more convenient for refactoring,
as one doesn't potentially need to change error handling style when
extracting a bit of view code to a function.
* It is guaranteed to contain the `code` property, which is helpful
for API consistency.
Various stragglers are not updated because JsonableError requires
subclassing in order to specify custom data or HTTP status codes.
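For illustration, the raise-from-anywhere pattern (the helper shown is
hypothetical):
```python
from zerver.lib.exceptions import JsonableError

def check_stream_name(stream_name: str) -> None:
    # No need to thread a json_error return value back up the stack:
    if stream_name.strip() == "":
        raise JsonableError("Stream name can't be empty!")
```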
This removes some complexity from the event_queue module.
To avoid code duplication, we reduce the `is_notifiable` methods to
internally just call the `trigger` methods and check their return value.
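A sketch of the deduplication; method and parameter names are partly
assumed:
```python
from typing import Optional

class UserMessageNotificationsData:
    # ...fields elided...

    def get_push_notification_trigger(
        self, acting_user_id: int, idle: bool
    ) -> Optional[str]:
        ...  # returns a trigger string, or None if nothing should fire

    def is_push_notifiable(self, acting_user_id: int, idle: bool) -> bool:
        # No duplicated condition logic: just check the trigger.
        return self.get_push_notification_trigger(acting_user_id, idle) is not None
```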
* Modify `maybe_enqueue_notifications` to take in an instance of the
dataclass introduced in 951b49c048.
* The `check_notify` tests tested the "when to notify" logic in a way
which involved `maybe_enqueue_notifications`. To simplify things, we've
earlier extracted this logic in 8182632d7e.
So, we just kill off the `check_notify` test, and keep only those parts
which verify the queueing and return value behavior of that function.
* We retain the missedmessage_hook and message_edit_notifications
tests, since they are more integration-style.
* There's a slightly subtle change with the missedmessage_hook tests.
Before this commit, we short-circuited the hook if the sender was muted
(5a642cea11).
With this commit, we delegate the check to our dataclass methods.
So, `maybe_enqueue_notifications` will be called even if the sender was
muted, and the test needs to be updated.
* In our test helper `get_maybe_enqueue_notifications_parameters` which
generates default values for testing `maybe_enqueue_notifications` calls,
we keep `message_id`, `sender_id`, and `user_id` as required arguments,
so that the tests are super-clear and avoid accidental false positives.
* Because `do_update_embedded_data` also sends `update_message` events,
we deal with that case with some hacky code for now. See the comment
there.
This mostly completes the extraction of the "when to notify" logic into
our new `notification_data` module.
We already have this data in the `flags` for each user, so no need to
send this set/list in the event dictionary.
The `flags` in the event dict represent the after-message-update state,
so we can't avoid sending `prior_mention_user_ids`.
Since `flags` here could be iterated through multiple times
(to check for push/email notifiability), we use `Collection`.
Inspired by 871e73ab8f.
The other change here in the `event_queue` code is prep for using
the `UserMessageNotificationsData` class there.
Before this commit, we used to pre-calculate flags for user data and send
it to Tornado, like so:
```
{
"id": 10,
"flags": ["mentioned"],
"mentioned": true,
"online_push_enabled": false,
"stream_push_notify": false,
"stream_email_notify": false,
"wildcard_mention_notify": false,
"sender_is_muted": false,
}
```
This has the benefit of simplifying the logic in the event_queue code a bit.
However, because we sent such an object for each user receiving the event,
the string keys (like "stream_email_notify") get duplicated in the JSON
blob that is sent to Tornado.
For 1000 users, this data may take up to ~190KB of space, which can
cause performance degradation in large organisations.
Hence, as an alternative, we send just the list of user_ids fitting
each notification criteria, and then calculate the flags in Tornado.
This brings down the space to ~60KB for 1000 users.
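An illustrative shape of the slimmed-down payload; the field names are
assumptions based on the notification categories above:
```python
users = {
    "user_ids": [10, 11, 12],
    "online_push_user_ids": [10],
    "stream_push_user_ids": [],
    "stream_email_user_ids": [11],
    "wildcard_mention_user_ids": [],
    "muted_sender_user_ids": [12],
}
# Tornado recomputes each user's flags from membership in these lists.
```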
This commit reverts parts of the following commits:
- 2179275
- 40cd6b5
We will in the future, add helpers to create `UserMessageNotificationsData`
objects from these lists, so as to avoid code duplication.