Commit Graph

202 Commits

Author SHA1 Message Date
Anders Kaseorg ea6934c26d dependencies: Remove WebSockets system for sending messages.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013.  We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not.  Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.

While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.

This WebSockets code path had a lot of downsides/complexity,
including:

* The messy hack involving constructing an emulated request object to
  hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
  needs and must be provisioned independently from the rest of the
  server).
* A duplicate check_send_receive_time Nagios test specific to
  WebSockets.
* The requirement for users to have their firewalls/NATs allow
  WebSocket connections, and a setting to disable them for networks
  where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
  been poorly maintained, and periodically throws random JavaScript
  exceptions in our production environments without a deep enough
  traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
  server restart, and especially for large installations like
  zulipchat.com, resulting in extra delay before messages can be sent
  again.

As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients.  We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.

If we later want WebSockets, we’ll likely want to just use Django
Channels.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-14 22:34:00 -08:00
Tim Abbott f0fd812cc5 tornado: Add transitional code for sender_delivery_email.
This issue was introduced in 54e357e154.
2019-11-20 17:31:11 -08:00
Tim Abbott 1fe4f795af settings: Add notification settings checkboxes for wildcard mentions.
This change makes it possible for users to control the notification
settings for wildcard mentions as a separate control from PMs and
direct @-mentions.
2019-11-20 16:58:46 -08:00
Tim Abbott b85c9b0810 tornado: Use delivery_email in logging.
Eventually, we'll want to replace emails with user IDs here entirely,
but until we make that happen, we should at least use the same email
address present in our other logging.

I think we won't miss updating these in a future migration thanks to
mypy types.
2019-11-15 17:16:05 -08:00
Tim Abbott 993ed9c2b1 tornado: Remove stale user_profile_email field.
Since years ago, this field hasn't been used for anything other than
some logging that would be better off logging the user ID anyway.

It existed in the first place simply because we weren't passing the
user_profile_id to Tornado at all.
2019-11-15 17:07:52 -08:00
Anders Kaseorg 0d20145b93 mypy: Upgrade from 0.730 to 0.740.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-11-13 12:38:45 -08:00
Anders Kaseorg cafac83676 request: Tighten type checking on REQ.
Then, find and fix a predictable number of previous misuses.

With a small change by tabbott to preserve backwards compatibility for
sending `yes` for the `forged` field.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-11-13 12:35:55 -08:00
Tim Abbott b12d3d54c6 events: Fix documentation testing for /events.
Most of the failures were due to parameters that are not intended to
be used by third-party code, so the correct fix for those was the set
intentionally_undocumented=True.

Fixes #12969.
2019-10-21 16:50:10 -07:00
Tim Abbott 0ed0bb6828 messages: Add email/push notifications for wildcard mentions.
Historically, Zulip's implementation of wildcard mentions never
triggered either email or push notifications, instead being limited to
desktop notifications and the "mentions" counter.

We fix this just by plumbing the "wildcard_mentioned" flag through our
system.

Implements much of
https://github.com/zulip/zulip/issues/6040#issuecomment-510157264.
We're also now ready to seriously work on #3750.
2019-08-26 14:39:53 -07:00
Tim Abbott 7953e23d49 event_queue: Actually fix missing copy for edit-message events.
Apparently, our edit-message events did not guarantee that the outer
wrapper dictionary, which is intended to be unique for each client,
was unique for every client (instead only ensuring it was unique for
each user).

This led to clients unexpectedly getting last_event_id validation
errors in this code path when a user had multiple connected clients,
because the linear ordering of event IDs within a given queue was
corrupted.

In fd2a63b049, we accidentally fixed
this issue with a different set of userdata events, without fixing the
edit-message event bug.  This commit fixes the remaining issue.
2019-08-12 15:17:10 -07:00
Anders Kaseorg 7bf09067d1 event_queue: Clean up type ignores.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-09 17:42:33 -07:00
Anders Kaseorg becef760bf cleanup: Delete leading newlines.
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines.  Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-06 23:29:11 -07:00
Tim Abbott fd2a63b049 event_queue: Fix missing copy for edit-message events.
Apparently, our edit-message events did not guarantee that the outer
wrapper dictionary, which is intended to be unique for each client,
was unique for every client (instead only ensuring it was unique for
each user).

This led to clients unexpectedly getting last_event_id validation
errors in this code path when a user had multiple connected clients,
because the linear ordering of event IDs within a given queue was
corrupted.
2019-08-06 13:40:30 -07:00
Anders Kaseorg 68dd8e4ec8 mypy: Migrate from mypy_extensions to typing_extensions.
This gives us access to typing_extensions.Deque, which was not added
to typing until 3.5.4.

(PROVISION_VERSION is not bumped because the transitive dependency set
in dev.txt hasn’t changed.)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-05 17:24:09 -07:00
Anders Kaseorg 86a7fdddd7 events: Check last_event_id for validity, take 2.
This verifies that the client passed a last_event_id that actually
came from the queue instead of making up an ID from the future.  It
turns out one of our tests was making up such an ID, but legitimate
clients are expected not to do so.

The previous version of this commit (commit
e00d4be6d5, #12888) had to be reverted
(commit b86c5cc490) because it was
missing the `to_dict`/`from_dict` migration code.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-05 17:18:49 -07:00
Tim Abbott b86c5cc490 Revert "events: Check last_event_id for validity."
This isn't correct without a proper migration for existing queues,
which may not be implementable.

This reverts commit e00d4be6d5.
2019-08-02 14:44:35 -07:00
Tim Abbott 8be3df0e29 tornado: Fix bugs in Tornado autoreload library.
This fixes two issues:

* The syntax check logic we had for zerver.tornado.autoreload would
  end up clearing _reload_hooks if one of the files that had changed
  was zerver.tornado.autoreload itself (because we'd had re-imported
  the current module), which could be incredibly confusing when trying
  to test the autoreload logic.  It seems better to just not run the
  syntax check for syntax errors in this file.

  Similarly, because reloading event_queue.py would destroy the state
  in the queues, we avoid that as well.

* We make sure to flush stdout after running and reload hooks, to make
  sure their output reaches the user.
2019-08-02 12:47:49 -07:00
Tim Abbott a43b2f7a43 tornado: Fix incorrect import of autoreload library.
We were apparently not running our own forked Tornado autoreload
library when adding reload hooks, which meant that our autoreload
hooks didn't run at all.

This fixes an issue that made dump_event_queues never run and thus the
local development environment difficult to use for testing event queues.
2019-08-02 12:47:49 -07:00
Wyatt Hoodes 4beec5c6b9 typing: Use TYPE_CHECKING when dealing with cyclic dependencies. 2019-07-31 12:19:39 -07:00
Anders Kaseorg e00d4be6d5 events: Check last_event_id for validity.
This verifies that the client passed a last_event_id that actually
came from the queue instead of making up an ID from the future.  It
turns out one of our tests was making up such an ID, but legitimate
clients are expected not to do so.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-07-26 17:18:28 -07:00
Wyatt Hoodes a2fa1a6f25 handlers: Remove duplicate type annotation.
`self._request_middleware` is already typed in the `__init__` method.
2019-07-22 16:27:39 -07:00
Anders Kaseorg 3968fe9fdf tornado: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:33:13 -08:00
Anders Kaseorg e5bf0c0a69 event_queue: Avoid hardcoded paths in /var/tmp.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-01-15 16:12:05 -08:00
Tim Abbott 930e65d1be push: Include type in add-push-notification events.
This should make us able to clean up the logic for this in the future
(right now, we still need to do the .get() for backwards compatibility).
2018-12-15 13:58:52 -08:00
Tim Abbott cfeb87c1c9 tornado: Require non-negative lifespan_secs.
Previously, our validation for this field only checked it was an
integer, and you could in theory send invalid negative values here.
2018-12-05 14:50:37 -08:00
Tim Abbott 8e4d6fa045 event_queue: Rename IDLE_EVENT_QUEUE_TIMEOUT_SECS.
This is a default value, not an always-used value, and its name should
reflect that.
2018-12-05 14:48:40 -08:00
Tim Abbott 94dfff1c4e event queue: Don't set a minimum for lifespan_secs.
This makes it more convenient for developers to set very short values
for this (e.g. 1 minute) for the purposes of testing/debugging; there
aren't obvious problems with letting users set short values for this.
2018-12-05 14:47:53 -08:00
Tim Abbott a3c2d49f0c event_queue: Change garbage-collection frequency to 1 minute.
This is designed to help make it more convenient to do manual testing
where we need event queues to be garbage-collected.
2018-12-05 14:42:53 -08:00
Tim Abbott 6dd69b9bff event_queue: Rename ClientDescriptor.idle to expired.
This better reflects the situation with these event queues -- they're
not idle, they are expired and to be garbage collected.
2018-12-05 14:42:53 -08:00
Tim Abbott 408af032a0 event_queue: Remove queue_timeout migration code from 2013.
There's never going to be an event queue without a queue_timeout
property anymore.
2018-12-05 14:24:38 -08:00
Tim Abbott d0f71881f4 docs: Add detailed documentation on the process for sending messages.
This has long been something missing from our suite of documentation.
2018-11-29 16:25:35 -08:00
Tim Abbott 64960383e4 mypy: Fix missing type annotation in tornado code. 2018-11-20 19:08:14 -08:00
Tim Abbott 46acb608b1 tornado: Include port number in logging statements. 2018-11-20 18:45:22 -08:00
Tim Abbott 40ff41e135 tornado: Fix populate_db failing to call send_event properly.
This isn't the right long-term fix; theoretically, send_event
shouldn't be doing anything with populate_db, but that's for later.
2018-11-02 17:07:21 -07:00
Tim Abbott 0cac7e1cd3 tornado: Extract functions for Tornado queue names.
This moves all control for what queue to use for which realm in our
Tornado system to just the sharding.py file; no actual sharding is
done yet.
2018-11-02 17:00:10 -07:00
Tim Abbott 152c44b6d2 tornado: Extract function for specifying Tornado URI.
Since TORNADO_PROCESSES is 1 in all default configurations, this
doesn't have any user-facing effect.
2018-11-02 17:00:09 -07:00
Tim Abbott ec065e92ee tornado: Store port on SockJS connection object.
This will make it available for use inside our websockets code.
2018-11-02 16:55:33 -07:00
Tim Abbott ea1ec68899 events: Pass a realm object into send_event.
This is a preparator refactor for supporting hosting different Tornado
processes on different servers; to look up which Tornado server we
should be sending the event to, we'll need the realm object.
2018-11-02 16:47:39 -07:00
Tim Abbott 75e48459b5 tornado: Support using a port-aware file for dumping event queues.
This should make it possible for there to safely be multiple Tornado
processes running on different ports on the same system.

It may also fix a rare race bug in development, where previously, it
was possible for the Tornados processes for Casper and the main
development server to interfere; I haven't investigated whether this
was a real bug or not, but now those two services will use independent
Tornado files.

We still need to add something to direct traffic between the different
Tornado processes.
2018-11-02 16:47:39 -07:00
Tim Abbott 9339c191da tornado: Fix missing mypy annotation. 2018-11-02 14:43:55 -07:00
Tim Abbott cf24a20185 tornado: Refactor logic for persistent queue filenames.
Now, these are computed using a function, which will make it easier to
edit these paths to depend on which Tornado process it is in coming
commits.
2018-11-02 14:19:10 -07:00
Tim Abbott 98f28fa6ce tornado: Remove unused send_notification() function.
This hasn't been used in a long time, probably since
3fddc11cc2.
2018-11-02 14:14:39 -07:00
neiljp (Neil Pilgrim) 482383b6f7 mypy: Add Optional to tornado/descriptors.py; remove from mypy.ini. 2018-10-29 12:53:16 -07:00
Tim Abbott e4813e462b tornado: Rename async_request_{restart,stop} to mention timer.
Previously, these timer accounting functions could be easily mistaken
for referring to starting/stopping the request.  By adding timer to
the name, we make the code easier for the casual observer to read and
understand.
2018-10-16 15:39:10 -07:00
Tim Abbott 58307f80aa event_queue: Stop mocking push notifications in most tests.
Historically, queue_json_publish had a special third argument that was
basically its default mock behavior in the test suite.  We've been
migrating away from that model, because it was confusing and resulted
in poor test coverage of our queue worker code paths; this was one of
the last holdouts.

As it turns out, we don't exercise this code path in a way that
impacts tests much; the main downside of this change is a likely small
penalty to performance of the full test suite when sending private
messages.
2018-08-10 13:58:39 -07:00
Tim Abbott 02ae71f27f api: Stop using API keys for Django->Tornado authentication.
As part of our effort to change the data model away from each user
having a single API key, we're eliminating the couple requests that
were made from Django to Tornado (as part of a /register or home
request) where we used the user's API key grabbed from the database
for authentication.

Instead, we use the (already existing) internal_notify_view
authentication mechanism, which uses the SHARED_SECRET setting for
security, for these requests, and just fetch the user object using
get_user_profile_by_id directly.

Tweaked by Yago to include the new /api/v1/events/internal endpoint in
the exempt_patterns list in test_helpers, since it's an endpoint we call
through Tornado. Also added a couple missing return type annotations.
2018-07-30 12:28:31 -07:00
Tim Abbott 07af59d4cc tornado: Split get_events_backend into two functions.
The lower-layer function, now called get_events_backend, is intended
to be called by multiple code paths (including the upcoming
get_events_internal).
2018-07-30 12:28:31 -07:00
Sarah de947445ca event_queue: Add stream_email_notify.
This adds support to the event queue system for triggering
missed-message notifications (whether push or email) to support the
stream push notifications feature.
2018-07-14 12:19:33 +05:30
Tim Abbott 58a7a390c8 event_queue: Call build_offline_notification unconditionally.
Previously, maybe_enqueue_notifications had this very subtle logic,
where it set the notice variable only inside the block for push
notifications, but then also used it inside the block for email
notifications.

This "worked", because previously the conditions for push
notifications were always true if the conditions for email
notifications were, but the code was unnecessarily confusing.  The
only good reason to write it this way is if build_offline_notification
was expensive; in fact, the most expensive thing it does is calling
time.time(), so that reason does not apply here.

This was further confusing, in that in the original logic, we relied
on the fact that push notification code path edited the "notice"
dictionary for further processing.

Instead, we just call it separately and setup the data separately in
each code path.
2018-07-14 12:01:59 +05:30
Tim Abbott a09ebf0551 event_queue: Remove confusing comment about rabbitmq.
Whatever RabbitMQ check this comment used to be next to, it isn't next
to anymore.
2018-07-14 12:00:37 +05:30