zulip

Commit Graph

Author	SHA1	Message	Date
Tim Abbott	1ea2f188ce	tornado: Rewrite Django integration to duplicate less code. Since essentially the first use of Tornado in Zulip, we've been maintaining our Tornado+Django system, AsyncDjangoHandler, with several hundred lines of Django code copied into it. The goal for that code was simple: We wanted a way to use our Django middleware (for code sharing reasons) inside a Tornado process (since we wanted to use Tornado for our async events system). As part of the Django 2.2.x upgrade, I looked at upgrading this implementation to be based off modern Django, and it's definitely possible to do that: * Continue forking load_middleware to save response middleware. * Continue manually running the Django response middleware. * Continue working out a hack involving copying all of _get_response to change a couple lines allowing our Tornado code to not actually return the Django HttpResponse so we can long-poll. The previous hack of returning None stopped being viable with the Django 2.2 MiddlewareMixin.__call__ implementation. But I decided to take this opportunity to look at trying to avoid copying material Django code, and there is a way to do it: * Replace RespondAsynchronously with a response.asynchronous attribute on the HttpResponse; this allows Django to run its normal plumbing happily in a way that should be stable over time, and then we proceed to discard the response inside the Tornado `get()` method to implement long-polling. (Better yet might be raising an exception?). This lets us eliminate maintaining a patched copy of _get_response. * Removing the @asynchronous decorator, which didn't add anything now that we only have one API endpoint backend (with two frontend call points) that could call into this. Combined with the last bullet, this lets us remove a significant hack from our never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish after creating a duplicate request to process, rather than writing totally custom code to do that. This lets us eliminate maintaining a patched copy of Django's load_middleware. * Adding detailed comments explaining how this is supposed to work, what problems we encounter, and how we solve various problems, which is critical to being able to modify this code in the future. A key advantage of these changes is that the exact same code should work on Django 1.11, Django 2.2, and Django 3.x, because we're no longer copying large blocks of core Django code and thus should be much less vulnerable to refactors. There may be a modest performance downside, in that we now run both request and response middleware twice when longpolling (once for the request we discard). We may be able to avoid the expensive part of it, Zulip's own request/response middleware, with a bit of additional custom code to save work for requests where we're planning to discard the response. Profiling will be important to understanding what's worth doing here.	2020-02-13 16:13:11 -08:00
Tim Abbott	986706c7e5	tornado: Use common code for copying headers. This fixes a bug where our asynchronous requests were only copying the Content-Type header (i.e. the one case where we'd noticed) from the Django HttpResponse. I'm not sure what the impact of this would be; the rate-limiting headers rarely come up when breaking a long-polled request. But it seems clearly an improvement to do this in a consistent fashion. Only the headers piece is a change; in Tornado self.finish(x) is equivalent to: self.write(x) self.finish()	2020-02-07 16:14:19 -08:00
Tim Abbott	224a73a3ec	tornado: Extract a function for writing Tornado responses. This increases the readability of what's happening in our core Tornado handlers code, as well as making this logic reusable.	2020-02-07 16:13:49 -08:00
Tim Abbott	5305e8af85	tornado: Extract convert_tornado_request_to_django_request.	2020-02-07 16:03:58 -08:00
Tim Abbott	fc58ae117a	handlers: Rename confusingly named response to result_dict. This should somewhat increase the readability of zulip_finish.	2020-02-07 16:03:58 -08:00
Tim Abbott	2aab71e153	event_queue: Fix confusing event_queue.push interface. In `e3ad9baf1d`, we introduced yet another bug where we incorrectly shared event dictionaries between multiple queues. Fortunately, the logging that reports on "event was not in the queue" issues worked and detected this on chat.zulip.org, but this is a clear indication that the comments we have around this system were not sufficient to produce correct behavior. We fix this by changing event_queue.push, the code that mutates the event dictionaries, to do the shallow copies itself. The only downside here is process_message_event, a relatively low-traffic code path, does an extra per-queue dictionary copy. Given that presence, heartbeat, and message reading events are likely more traffic and dealing with HTTP is likely much more expensive than a dictionary copy, this probably doesn't matter performance-wise. (And if profiling later finds it is, there are potential workarounds like passing a skip_copy argument we can do).	2020-02-05 12:40:01 -08:00
Steve Howell	e3ad9baf1d	presence: Add process_presence_event. This lets us conditionally remove the email field from a presence event if the client has registered with the slim_presence flag.	2020-02-04 12:30:36 -08:00
Steve Howell	bf9144ff69	presence: Add slim_presence flag. This flag affects page_params and the payload you get back from POSTs to this url: users/me/presence The flag does not yet affect the presence events that get sent to a client.	2020-02-04 12:30:34 -08:00
Anders Kaseorg	ea6934c26d	dependencies: Remove WebSockets system for sending messages. Zulip has had a small use of WebSockets (specifically, for the code path of sending messages, via the webapp only) since ~2013. We originally added this use of WebSockets in the hope that the latency benefits of doing so would allow us to avoid implementing a markdown local echo; they were not. Further, HTTP/2 may have eliminated the latency difference we hoped to exploit by using WebSockets in any case. While we’d originally imagined using WebSockets for other endpoints, there was never a good justification for moving more components to the WebSockets system. This WebSockets code path had a lot of downsides/complexity, including: * The messy hack involving constructing an emulated request object to hook into doing Django requests. * The `message_senders` queue processor system, which increases RAM needs and must be provisioned independently from the rest of the server. * A duplicate check_send_receive_time Nagios test specific to WebSockets. * The requirement for users to have their firewalls/NATs allow WebSocket connections, and a setting to disable them for networks where WebSockets don’t work. * Dependencies on the SockJS family of libraries, which has at times been poorly maintained, and periodically throws random JavaScript exceptions in our production environments without a deep enough traceback to effectively investigate. * A total of about 1600 lines of our code related to the feature. * Increased load on the Tornado system, especially around a Zulip server restart, and especially for large installations like zulipchat.com, resulting in extra delay before messages can be sent again.
As detailed in https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it appears that removing WebSockets moderately increases the time it takes for the `send_message` API query to return from the server, but does not significantly change the time between when a message is sent and when it is received by clients. We don’t understand the reason for that change (suggesting the possibility of a measurement error), and even if it is a real change, we consider that potential small latency regression to be acceptable. If we later want WebSockets, we’ll likely want to just use Django Channels. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-14 22:34:00 -08:00
Tim Abbott	f0fd812cc5	tornado: Add transitional code for sender_delivery_email. This issue was introduced in `54e357e154`.	2019-11-20 17:31:11 -08:00
Tim Abbott	1fe4f795af	settings: Add notification settings checkboxes for wildcard mentions. This change makes it possible for users to control the notification settings for wildcard mentions as a separate control from PMs and direct @-mentions.	2019-11-20 16:58:46 -08:00
Tim Abbott	b85c9b0810	tornado: Use delivery_email in logging. Eventually, we'll want to replace emails with user IDs here entirely, but until we make that happen, we should at least use the same email address present in our other logging. I think we won't miss updating these in a future migration thanks to mypy types.	2019-11-15 17:16:05 -08:00
Tim Abbott	993ed9c2b1	tornado: Remove stale user_profile_email field. Since years ago, this field hasn't been used for anything other than some logging that would be better off logging the user ID anyway. It existed in the first place simply because we weren't passing the user_profile_id to Tornado at all.	2019-11-15 17:07:52 -08:00
Anders Kaseorg	0d20145b93	mypy: Upgrade from 0.730 to 0.740. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-11-13 12:38:45 -08:00
Anders Kaseorg	cafac83676	request: Tighten type checking on REQ. Then, find and fix a predictable number of previous misuses. With a small change by tabbott to preserve backwards compatibility for sending `yes` for the `forged` field. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-11-13 12:35:55 -08:00
Tim Abbott	b12d3d54c6	events: Fix documentation testing for /events. Most of the failures were due to parameters that are not intended to be used by third-party code, so the correct fix for those was to set intentionally_undocumented=True. Fixes #12969.	2019-10-21 16:50:10 -07:00
Tim Abbott	0ed0bb6828	messages: Add email/push notifications for wildcard mentions. Historically, Zulip's implementation of wildcard mentions never triggered either email or push notifications, instead being limited to desktop notifications and the "mentions" counter. We fix this just by plumbing the "wildcard_mentioned" flag through our system. Implements much of https://github.com/zulip/zulip/issues/6040#issuecomment-510157264. We're also now ready to seriously work on #3750.	2019-08-26 14:39:53 -07:00
Tim Abbott	7953e23d49	event_queue: Actually fix missing copy for edit-message events. Apparently, our edit-message events did not guarantee that the outer wrapper dictionary, which is intended to be unique for each client, was unique for every client (instead only ensuring it was unique for each user). This led to clients unexpectedly getting last_event_id validation errors in this code path when a user had multiple connected clients, because the linear ordering of event IDs within a given queue was corrupted. In `fd2a63b049`, we accidentally fixed this issue with a different set of userdata events, without fixing the edit-message event bug. This commit fixes the remaining issue.	2019-08-12 15:17:10 -07:00
Anders Kaseorg	7bf09067d1	event_queue: Clean up type ignores. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-09 17:42:33 -07:00
Anders Kaseorg	becef760bf	cleanup: Delete leading newlines. Previous cleanups (mostly the removals of Python __future__ imports) were done in a way that introduced leading newlines. Delete leading newlines from all files, except static/assets/zulip-emoji/NOTICE, which is a verbatim copy of the Apache 2.0 license. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-06 23:29:11 -07:00
Tim Abbott	fd2a63b049	event_queue: Fix missing copy for edit-message events. Apparently, our edit-message events did not guarantee that the outer wrapper dictionary, which is intended to be unique for each client, was unique for every client (instead only ensuring it was unique for each user). This led to clients unexpectedly getting last_event_id validation errors in this code path when a user had multiple connected clients, because the linear ordering of event IDs within a given queue was corrupted.	2019-08-06 13:40:30 -07:00
Anders Kaseorg	68dd8e4ec8	mypy: Migrate from mypy_extensions to typing_extensions. This gives us access to typing_extensions.Deque, which was not added to typing until 3.5.4. (PROVISION_VERSION is not bumped because the transitive dependency set in dev.txt hasn’t changed.) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-05 17:24:09 -07:00
Anders Kaseorg	86a7fdddd7	events: Check last_event_id for validity, take 2. This verifies that the client passed a last_event_id that actually came from the queue instead of making up an ID from the future. It turns out one of our tests was making up such an ID, but legitimate clients are expected not to do so. The previous version of this commit (commit `e00d4be6d5`, #12888) had to be reverted (commit `b86c5cc490`) because it was missing the `to_dict`/`from_dict` migration code. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-05 17:18:49 -07:00
Tim Abbott	b86c5cc490	Revert "events: Check last_event_id for validity." This isn't correct without a proper migration for existing queues, which may not be implementable. This reverts commit `e00d4be6d5`.	2019-08-02 14:44:35 -07:00
Tim Abbott	8be3df0e29	tornado: Fix bugs in Tornado autoreload library. This fixes two issues: * The syntax check logic we had for zerver.tornado.autoreload would end up clearing _reload_hooks if one of the files that had changed was zerver.tornado.autoreload itself (because we'd re-imported the current module), which could be incredibly confusing when trying to test the autoreload logic. It seems better to just not run the syntax check for syntax errors in this file. Similarly, because reloading event_queue.py would destroy the state in the queues, we avoid that as well. * We make sure to flush stdout after running any reload hooks, to make sure their output reaches the user.	2019-08-02 12:47:49 -07:00
Tim Abbott	a43b2f7a43	tornado: Fix incorrect import of autoreload library. We were apparently not running our own forked Tornado autoreload library when adding reload hooks, which meant that our autoreload hooks didn't run at all. This fixes an issue that made dump_event_queues never run and thus the local development environment difficult to use for testing event queues.	2019-08-02 12:47:49 -07:00
Wyatt Hoodes	4beec5c6b9	typing: Use TYPE_CHECKING when dealing with cyclic dependencies.	2019-07-31 12:19:39 -07:00
Anders Kaseorg	e00d4be6d5	events: Check last_event_id for validity. This verifies that the client passed a last_event_id that actually came from the queue instead of making up an ID from the future. It turns out one of our tests was making up such an ID, but legitimate clients are expected not to do so. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-26 17:18:28 -07:00
Wyatt Hoodes	a2fa1a6f25	handlers: Remove duplicate type annotation. `self._request_middleware` is already typed in the `__init__` method.	2019-07-22 16:27:39 -07:00
Anders Kaseorg	3968fe9fdf	tornado: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:33:13 -08:00
Anders Kaseorg	e5bf0c0a69	event_queue: Avoid hardcoded paths in /var/tmp. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-01-15 16:12:05 -08:00
Tim Abbott	930e65d1be	push: Include type in add-push-notification events. This should make us able to clean up the logic for this in the future (right now, we still need to do the .get() for backwards compatibility).	2018-12-15 13:58:52 -08:00
Tim Abbott	cfeb87c1c9	tornado: Require non-negative lifespan_secs. Previously, our validation for this field only checked it was an integer, and you could in theory send invalid negative values here.	2018-12-05 14:50:37 -08:00
Tim Abbott	8e4d6fa045	event_queue: Rename IDLE_EVENT_QUEUE_TIMEOUT_SECS. This is a default value, not an always-used value, and its name should reflect that.	2018-12-05 14:48:40 -08:00
Tim Abbott	94dfff1c4e	event queue: Don't set a minimum for lifespan_secs. This makes it more convenient for developers to set very short values for this (e.g. 1 minute) for the purposes of testing/debugging; there aren't obvious problems with letting users set short values for this.	2018-12-05 14:47:53 -08:00
Tim Abbott	a3c2d49f0c	event_queue: Change garbage-collection frequency to 1 minute. This is designed to help make it more convenient to do manual testing where we need event queues to be garbage-collected.	2018-12-05 14:42:53 -08:00
Tim Abbott	6dd69b9bff	event_queue: Rename ClientDescriptor.idle to expired. This better reflects the situation with these event queues -- they're not idle, they are expired and to be garbage collected.	2018-12-05 14:42:53 -08:00
Tim Abbott	408af032a0	event_queue: Remove queue_timeout migration code from 2013. There's never going to be an event queue without a queue_timeout property anymore.	2018-12-05 14:24:38 -08:00
Tim Abbott	d0f71881f4	docs: Add detailed documentation on the process for sending messages. This has long been something missing from our suite of documentation.	2018-11-29 16:25:35 -08:00
Tim Abbott	64960383e4	mypy: Fix missing type annotation in tornado code.	2018-11-20 19:08:14 -08:00
Tim Abbott	46acb608b1	tornado: Include port number in logging statements.	2018-11-20 18:45:22 -08:00
Tim Abbott	40ff41e135	tornado: Fix populate_db failing to call send_event properly. This isn't the right long-term fix; theoretically, send_event shouldn't be doing anything with populate_db, but that's for later.	2018-11-02 17:07:21 -07:00
Tim Abbott	0cac7e1cd3	tornado: Extract functions for Tornado queue names. This moves all control for what queue to use for which realm in our Tornado system to just the sharding.py file; no actual sharding is done yet.	2018-11-02 17:00:10 -07:00
Tim Abbott	152c44b6d2	tornado: Extract function for specifying Tornado URI. Since TORNADO_PROCESSES is 1 in all default configurations, this doesn't have any user-facing effect.	2018-11-02 17:00:09 -07:00
Tim Abbott	ec065e92ee	tornado: Store port on SockJS connection object. This will make it available for use inside our websockets code.	2018-11-02 16:55:33 -07:00
Tim Abbott	ea1ec68899	events: Pass a realm object into send_event. This is a preparatory refactor for supporting hosting different Tornado processes on different servers; to look up which Tornado server we should be sending the event to, we'll need the realm object.	2018-11-02 16:47:39 -07:00
Tim Abbott	75e48459b5	tornado: Support using a port-aware file for dumping event queues. This should make it possible for there to safely be multiple Tornado processes running on different ports on the same system. It may also fix a rare race bug in development, where previously, it was possible for the Tornado processes for Casper and the main development server to interfere; I haven't investigated whether this was a real bug or not, but now those two services will use independent Tornado files. We still need to add something to direct traffic between the different Tornado processes.	2018-11-02 16:47:39 -07:00
Tim Abbott	9339c191da	tornado: Fix missing mypy annotation.	2018-11-02 14:43:55 -07:00
Tim Abbott	cf24a20185	tornado: Refactor logic for persistent queue filenames. Now, these are computed using a function, which will make it easier to edit these paths to depend on which Tornado process it is in coming commits.	2018-11-02 14:19:10 -07:00
Tim Abbott	98f28fa6ce	tornado: Remove unused send_notification() function. This hasn't been used in a long time, probably since `3fddc11cc2`.	2018-11-02 14:14:39 -07:00
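
Several of the event_queue commits above (`2aab71e153`, `7953e23d49`, `fd2a63b049`) describe the same class of bug: one event dictionary shared between multiple queues, so that stamping a per-queue event ID in one queue corrupts the ID seen by another. The following is a minimal sketch of that failure mode and of the "push does the shallow copy itself" fix. It is not Zulip's actual code; `ToyQueue`, `push_shared`, and `deliver` are hypothetical names invented for illustration.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


class ToyQueue:
    """Hypothetical stand-in for a per-client event queue (not Zulip's class)."""

    def __init__(self) -> None:
        self.next_event_id = 0
        self.events: List[Event] = []

    def push_shared(self, event: Event) -> None:
        # Buggy variant: stamps the ID onto the caller's dictionary, so any
        # other queue holding the same object sees its ID overwritten.
        event["id"] = self.next_event_id
        self.next_event_id += 1
        self.events.append(event)

    def push(self, event: Event) -> None:
        # Fixed variant: take a shallow copy first, so the per-queue ID
        # lives on a dictionary owned by this queue alone.
        event = dict(event)
        event["id"] = self.next_event_id
        self.next_event_id += 1
        self.events.append(event)


def deliver(queues: List[ToyQueue], event: Event, copy: bool) -> None:
    """Send one event to every queue, with or without the copying push."""
    for queue in queues:
        if copy:
            queue.push(event)
        else:
            queue.push_shared(event)


if __name__ == "__main__":
    # Queue `a` has already handed out IDs 0..2; queue `b` is fresh.
    a, b = ToyQueue(), ToyQueue()
    a.next_event_id = 3
    deliver([a, b], {"type": "presence"}, copy=False)
    # Both queues hold the same object, and b's stamp overwrote a's.
    print(a.events[-1]["id"], b.events[-1]["id"])

    a, b = ToyQueue(), ToyQueue()
    a.next_event_id = 3
    deliver([a, b], {"type": "presence"}, copy=True)
    # Each queue now keeps its own ID sequence.
    print(a.events[-1]["id"], b.events[-1]["id"])
```

The copy is shallow on purpose: only the outer wrapper that carries the per-queue ID needs to be unique, which matches the trade-off the `2aab71e153` message describes (one extra dictionary copy per queue).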


210 Commits