zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	c734bbd95d	python: Modernize legacy Python 2 syntax with pyupgrade. Generated by `pyupgrade --py3-plus --keep-percent-format` on all our Python code except `zthumbor` and `zulip-ec2-configure-interfaces`, followed by manual indentation fixes. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-09 16:43:22 -07:00
Mateusz Mandera	4283a513d4	tornado: Reuse retry_event functions for failures in tornado queues. We use retry_event in queue_processors.py to handle retrying on failures, without getting stuck in permanent retry loops if the event ends up leading to failure on every attempt and we just keep sending NACK to rabbitmq forever (or until the channel crashes). Tornado queues haven't been using this, but they should.	2020-04-09 12:43:38 -07:00
Tim Abbott	a373387009	tornado: Fix parsing of delete_message events with no users. The change in `180d8abed6`, while correct for the Django part of the codebase, had the nasty side effect of exposing a failure mode in the process_notification logic if the users list was empty. This, in turn, could cause our process_notification code to fail with an IndexError when trying to process the event, which would result in that tornado process not automatically recovering, due to the outer try/except handler for consume triggering a NACK and thus repeating the event.	2020-04-09 05:39:47 -07:00
Udit107710	ef741bf317	messages: Return shallow copy of message object. When more than one outgoing webhook is configured, the message which is sent to the webhook bot passes through finalize_payload function multiple times, which mutated the message dict in a way that many keys were lost from the dict obj. This commit fixes that problem by having `finalize_payload` return a shallow copy of the incoming dict, instead of mutating it. We still mutate dicts inside of `post_process_dicts`, though, for performance reasons. This was slightly modified by @showell to fix the `test_both_codepaths` test that was added concurrently to this work. (I used a slightly verbose style in the tests to emphasize the transformation from `wide_dict` to `narrow_dict`.) I also removed a deepcopy call inside `get_client_payload`, since we now no longer mutate in `finalize_payload`. Finally, I added some comments here and there. For testing, I mostly protect against the root cause of the bug happening again, by adding a line to make sure that `sender_realm_id` does not get wiped out from the "wide" dictionary. A better test would exercise the actual code that exposed the bug here by sending a message to a bot with two or more services attached to it. I will do that in a future commit. Fixes #14384	2020-03-29 15:12:27 -07:00
Steve Howell	4c51a94bcd	message: Move transitional shim for delivery email. If we have an old event that's missing the field `sender_delivery_email`, we now patch it at the top of `process_message_event`, rather than for each call to `get_client_payload`. This will make an upcoming commit a bit easier to reason about. Basically, it's simpler to shim the incoming event one time rather than doing it up to four times. We know that `get_client_payload` is non-destructive, because it does a deepcopy.	2020-03-29 15:12:27 -07:00
Anders Kaseorg	39f9abeb3f	python: Convert json.loads(f.read()) to json.load(f). Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-03-24 10:46:32 -07:00
Mateusz Mandera	89394fc1eb	middleware: Use request.user for logging when possible. Instead of trying to set the _requestor_for_logs attribute in all the relevant places, we try to use request.user when possible (that will be when it's a UserProfile or RemoteZulipServer as of now). In other places, we set _requestor_for_logs to avoid manually editing the request.user attribute, as it should mostly be left for Django to manage it. In places where we remove the "request._requestor_for_logs = ..." line, it is clearly implied by the previous code (or the current surrounding code) that request.user is of the correct type.	2020-03-09 13:54:58 -07:00
Mateusz Mandera	0255ca9b6a	middleware: Log user.id/realm.string_id instead of _email.	2020-03-09 13:54:58 -07:00
Mateusz Mandera	2d544250b7	events: Add block for compatibility with old delete_message events.	2020-03-03 15:52:42 -08:00
Mateusz Mandera	3922fb3a92	events: Clean up delete_message event processing code.	2020-03-03 15:52:42 -08:00
Steve Howell	862515b7a4	presence: Avoid failures with obsolete events. We only recently added `user_id` to presence events.	2020-03-03 11:45:45 -08:00
Tim Abbott	1ea2f188ce	tornado: Rewrite Django integration to duplicate less code. Since essentially the first use of Tornado in Zulip, we've been maintaining our Tornado+Django system, AsyncDjangoHandler, with several hundred lines of Django code copied into it. The goal for that code was simple: We wanted a way to use our Django middleware (for code sharing reasons) inside a Tornado process (since we wanted to use Tornado for our async events system). As part of the Django 2.2.x upgrade, I looked at upgrading this implementation to be based off modern Django, and it's definitely possible to do that: * Continue forking load_middleware to save response middleware. * Continue manually running the Django response middleware. * Continue working out a hack involving copying all of _get_response to change a couple lines allowing us our Tornado code to not actually return the Django HttpResponse so we can long-poll. The previous hack of returning None stopped being viable with the Django 2.2 MiddlewareMixin.__call__ implementation. But I decided to take this opportunity to look at trying to avoid copying material Django code, and there is a way to do it: * Replace RespondAsynchronously with a response.asynchronous attribute on the HttpResponse; this allows Django to run its normal plumbing happily in a way that should be stable over time, and then we proceed to discard the response inside the Tornado `get()` method to implement long-polling. (Better yet might be raising an exception?). This lets us eliminate maintaining a patched copy of _get_response. * Removing the @asynchronous decorator, which didn't add anything now that we only have one API endpoint backend (with two frontend call points) that could call into this. Combined with the last bullet, this lets us remove a significant hack from our never_cache_responses function. 
* Calling the normal Django `get_response` method from zulip_finish after creating a duplicate request to process, rather than writing totally custom code to do that. This lets us eliminate maintaining a patched copy of Django's load_middleware. * Adding detailed comments explaining how this is supposed to work, what problems we encounter, and how we solve various problems, which is critical to being able to modify this code in the future. A key advantage of these changes is that the exact same code should work on Django 1.11, Django 2.2, and Django 3.x, because we're no longer copying large blocks of core Django code and thus should be much less vulnerable to refactors. There may be a modest performance downside, in that we now run both request and response middleware twice when longpolling (once for the request we discard). We may be able to avoid the expensive part of it, Zulip's own request/response middleware, with a bit of additional custom code to save work for requests where we're planning to discard the response. Profiling will be important to understanding what's worth doing here.	2020-02-13 16:13:11 -08:00
Tim Abbott	986706c7e5	tornado: Use common code for copying headers. This fixes a bug where our asynchronous requests were only copying the Content-Type header (i.e. the one case where we'd noticed) from the Django HttpResponse. I'm not sure what the impact of this would be; the rate-limiting headers rarely come up when breaking a long-polled request. But it seems clearly an improvement to do this in a consistent fashion. Only the headers piece is a change; in Tornado self.finish(x) is equivalent to: self.write(x) self.finish()	2020-02-07 16:14:19 -08:00
Tim Abbott	224a73a3ec	tornado: Extract a function for writing Tornado responses. This increases the readability of what's happening in our core Tornado handlers code, as well as making this logic reusable.	2020-02-07 16:13:49 -08:00
Tim Abbott	5305e8af85	tornado: Extract convert_tornado_request_to_django_request.	2020-02-07 16:03:58 -08:00
Tim Abbott	fc58ae117a	handlers: Rename confusingly named response to result_dict. This should somewhat increase the readability of zulip_finish.	2020-02-07 16:03:58 -08:00
Tim Abbott	2aab71e153	event_queue: Fix confusing event_queue.push interface. In `e3ad9baf1d`, we introduced yet another bug where we incorrectly shared event dictionaries between multiple queues. Fortunately, the logging that reports on "event was not in the queue" issues worked and detected this on chat.zulip.org, but this is a clear indication that the comments we have around this system were not sufficient to produce correct behavior. We fix this by changing event_queue.push, the code that mutates the event dictionaries, to do the shallow copies itself. The only downside here is process_message_event, a relatively low-traffic code path, does an extra per-queue dictionary copy. Given that presence, heartbeat, and message reading events are likely more traffic and dealing with HTTP is likely much more expensive than a dictionary copy, this probably doesn't matter performance-wise. (And if profiling later finds it is, there are potential workarounds like passing a skip_copy argument we can do).	2020-02-05 12:40:01 -08:00
Steve Howell	e3ad9baf1d	presence: Add process_presence_event. This lets us conditionally remove the email field from a presence event if the client has registered with the slim_presence flag.	2020-02-04 12:30:36 -08:00
Steve Howell	bf9144ff69	presence: Add slim_presence flag. This flag affects page_params and the payload you get back from POSTs to this url: users/me/presence The flag does not yet affect the presence events that get sent to a client.	2020-02-04 12:30:34 -08:00
Anders Kaseorg	ea6934c26d	dependencies: Remove WebSockets system for sending messages. Zulip has had a small use of WebSockets (specifically, for the code path of sending messages, via the webapp only) since ~2013. We originally added this use of WebSockets in the hope that the latency benefits of doing so would allow us to avoid implementing a markdown local echo; they were not. Further, HTTP/2 may have eliminated the latency difference we hoped to exploit by using WebSockets in any case. While we’d originally imagined using WebSockets for other endpoints, there was never a good justification for moving more components to the WebSockets system. This WebSockets code path had a lot of downsides/complexity, including: * The messy hack involving constructing an emulated request object to hook into doing Django requests. * The `message_senders` queue processor system, which increases RAM needs and must be provisioned independently from the rest of the server). * A duplicate check_send_receive_time Nagios test specific to WebSockets. * The requirement for users to have their firewalls/NATs allow WebSocket connections, and a setting to disable them for networks where WebSockets don’t work. * Dependencies on the SockJS family of libraries, which has at times been poorly maintained, and periodically throws random JavaScript exceptions in our production environments without a deep enough traceback to effectively investigate. * A total of about 1600 lines of our code related to the feature. * Increased load on the Tornado system, especially around a Zulip server restart, and especially for large installations like zulipchat.com, resulting in extra delay before messages can be sent again. 
As detailed in https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it appears that removing WebSockets moderately increases the time it takes for the `send_message` API query to return from the server, but does not significantly change the time between when a message is sent and when it is received by clients. We don’t understand the reason for that change (suggesting the possibility of a measurement error), and even if it is a real change, we consider that potential small latency regression to be acceptable. If we later want WebSockets, we’ll likely want to just use Django Channels. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-14 22:34:00 -08:00
Tim Abbott	f0fd812cc5	tornado: Add transitional code for sender_delivery_email. This issue was introduced in `54e357e154`.	2019-11-20 17:31:11 -08:00
Tim Abbott	1fe4f795af	settings: Add notification settings checkboxes for wildcard mentions. This change makes it possible for users to control the notification settings for wildcard mentions as a separate control from PMs and direct @-mentions.	2019-11-20 16:58:46 -08:00
Tim Abbott	b85c9b0810	tornado: Use delivery_email in logging. Eventually, we'll want to replace emails with user IDs here entirely, but until we make that happen, we should at least use the same email address present in our other logging. I think we won't miss updating these in a future migration thanks to mypy types.	2019-11-15 17:16:05 -08:00
Tim Abbott	993ed9c2b1	tornado: Remove stale user_profile_email field. Since years ago, this field hasn't been used for anything other than some logging that would be better off logging the user ID anyway. It existed in the first place simply because we weren't passing the user_profile_id to Tornado at all.	2019-11-15 17:07:52 -08:00
Anders Kaseorg	0d20145b93	mypy: Upgrade from 0.730 to 0.740. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-11-13 12:38:45 -08:00
Anders Kaseorg	cafac83676	request: Tighten type checking on REQ. Then, find and fix a predictable number of previous misuses. With a small change by tabbott to preserve backwards compatibility for sending `yes` for the `forged` field. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-11-13 12:35:55 -08:00
Tim Abbott	b12d3d54c6	events: Fix documentation testing for /events. Most of the failures were due to parameters that are not intended to be used by third-party code, so the correct fix for those was to set intentionally_undocumented=True. Fixes #12969.	2019-10-21 16:50:10 -07:00
Tim Abbott	0ed0bb6828	messages: Add email/push notifications for wildcard mentions. Historically, Zulip's implementation of wildcard mentions never triggered either email or push notifications, instead being limited to desktop notifications and the "mentions" counter. We fix this just by plumbing the "wildcard_mentioned" flag through our system. Implements much of https://github.com/zulip/zulip/issues/6040#issuecomment-510157264. We're also now ready to seriously work on #3750.	2019-08-26 14:39:53 -07:00
Tim Abbott	7953e23d49	event_queue: Actually fix missing copy for edit-message events. Apparently, our edit-message events did not guarantee that the outer wrapper dictionary, which is intended to be unique for each client, was unique for every client (instead only ensuring it was unique for each user). This led to clients unexpectedly getting last_event_id validation errors in this code path when a user had multiple connected clients, because the linear ordering of event IDs within a given queue was corrupted. In `fd2a63b049`, we accidentally fixed this issue with a different set of userdata events, without fixing the edit-message event bug. This commit fixes the remaining issue.	2019-08-12 15:17:10 -07:00
Anders Kaseorg	7bf09067d1	event_queue: Clean up type ignores. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-09 17:42:33 -07:00
Anders Kaseorg	becef760bf	cleanup: Delete leading newlines. Previous cleanups (mostly the removals of Python __future__ imports) were done in a way that introduced leading newlines. Delete leading newlines from all files, except static/assets/zulip-emoji/NOTICE, which is a verbatim copy of the Apache 2.0 license. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-06 23:29:11 -07:00
Tim Abbott	fd2a63b049	event_queue: Fix missing copy for edit-message events. Apparently, our edit-message events did not guarantee that the outer wrapper dictionary, which is intended to be unique for each client, was unique for every client (instead only ensuring it was unique for each user). This led to clients unexpectedly getting last_event_id validation errors in this code path when a user had multiple connected clients, because the linear ordering of event IDs within a given queue was corrupted.	2019-08-06 13:40:30 -07:00
Anders Kaseorg	68dd8e4ec8	mypy: Migrate from mypy_extensions to typing_extensions. This gives us access to typing_extensions.Deque, which was not added to typing until 3.5.4. (PROVISION_VERSION is not bumped because the transitive dependency set in dev.txt hasn’t changed.) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-05 17:24:09 -07:00
Anders Kaseorg	86a7fdddd7	events: Check last_event_id for validity, take 2. This verifies that the client passed a last_event_id that actually came from the queue instead of making up an ID from the future. It turns out one of our tests was making up such an ID, but legitimate clients are expected not to do so. The previous version of this commit (commit `e00d4be6d5`, #12888) had to be reverted (commit `b86c5cc490`) because it was missing the `to_dict`/`from_dict` migration code. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-05 17:18:49 -07:00
Tim Abbott	b86c5cc490	Revert "events: Check last_event_id for validity." This isn't correct without a proper migration for existing queues, which may not be implementable. This reverts commit `e00d4be6d5`.	2019-08-02 14:44:35 -07:00
Tim Abbott	8be3df0e29	tornado: Fix bugs in Tornado autoreload library. This fixes two issues: * The syntax check logic we had for zerver.tornado.autoreload would end up clearing _reload_hooks if one of the files that had changed was zerver.tornado.autoreload itself (because we'd re-imported the current module), which could be incredibly confusing when trying to test the autoreload logic. It seems better to just not run the syntax check for syntax errors in this file. Similarly, because reloading event_queue.py would destroy the state in the queues, we avoid that as well. * We make sure to flush stdout after running any reload hooks, to make sure their output reaches the user.	2019-08-02 12:47:49 -07:00
Tim Abbott	a43b2f7a43	tornado: Fix incorrect import of autoreload library. We were apparently not running our own forked Tornado autoreload library when adding reload hooks, which meant that our autoreload hooks didn't run at all. This fixes an issue that made dump_event_queues never run and thus the local development environment difficult to use for testing event queues.	2019-08-02 12:47:49 -07:00
Wyatt Hoodes	4beec5c6b9	typing: Use TYPE_CHECKING when dealing with cyclic dependencies.	2019-07-31 12:19:39 -07:00
Anders Kaseorg	e00d4be6d5	events: Check last_event_id for validity. This verifies that the client passed a last_event_id that actually came from the queue instead of making up an ID from the future. It turns out one of our tests was making up such an ID, but legitimate clients are expected not to do so. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-26 17:18:28 -07:00
Wyatt Hoodes	a2fa1a6f25	handlers: Remove duplicate type annotation. `self._request_middleware` is already typed in the `__init__` method.	2019-07-22 16:27:39 -07:00
Anders Kaseorg	3968fe9fdf	tornado: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:33:13 -08:00
Anders Kaseorg	e5bf0c0a69	event_queue: Avoid hardcoded paths in /var/tmp. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-01-15 16:12:05 -08:00
Tim Abbott	930e65d1be	push: Include type in add-push-notification events. This should make us able to clean up the logic for this in the future (right now, we still need to do the .get() for backwards compatibility).	2018-12-15 13:58:52 -08:00
Tim Abbott	cfeb87c1c9	tornado: Require non-negative lifespan_secs. Previously, our validation for this field only checked it was an integer, and you could in theory send invalid negative values here.	2018-12-05 14:50:37 -08:00
Tim Abbott	8e4d6fa045	event_queue: Rename IDLE_EVENT_QUEUE_TIMEOUT_SECS. This is a default value, not an always-used value, and its name should reflect that.	2018-12-05 14:48:40 -08:00
Tim Abbott	94dfff1c4e	event queue: Don't set a minimum for lifespan_secs. This makes it more convenient for developers to set very short values for this (e.g. 1 minute) for the purposes of testing/debugging; there aren't obvious problems with letting users set short values for this.	2018-12-05 14:47:53 -08:00
Tim Abbott	a3c2d49f0c	event_queue: Change garbage-collection frequency to 1 minute. This is designed to help make it more convenient to do manual testing where we need event queues to be garbage-collected.	2018-12-05 14:42:53 -08:00
Tim Abbott	6dd69b9bff	event_queue: Rename ClientDescriptor.idle to expired. This better reflects the situation with these event queues -- they're not idle, they are expired and to be garbage collected.	2018-12-05 14:42:53 -08:00
Tim Abbott	408af032a0	event_queue: Remove queue_timeout migration code from 2013. There's never going to be an event queue without a queue_timeout property anymore.	2018-12-05 14:24:38 -08:00
Tim Abbott	d0f71881f4	docs: Add detailed documentation on the process for sending messages. This has long been something missing from our suite of documentation.	2018-11-29 16:25:35 -08:00
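
Several of the event_queue commits above (`fd2a63b049`, `7953e23d49`, `2aab71e153`) describe the same failure mode: one event dictionary shared between multiple client queues, so that stamping a per-queue event ID in one queue corrupts the ordering seen by another. The sketch below is a toy illustration of that idea only. `ToyQueue`, `push_shared`, and `demo` are hypothetical names, not Zulip's actual classes or the real `event_queue.push` signature. It assumes each queue assigns its own increasing `id` to every event it stores.

```python
from typing import Any, Dict, List


class ToyQueue:
    """Hypothetical stand-in for a per-client event queue (not Zulip's class)."""

    def __init__(self, next_event_id: int = 0) -> None:
        self.next_event_id = next_event_id
        self.events: List[Dict[str, Any]] = []

    def push_shared(self, event: Dict[str, Any]) -> None:
        # Buggy variant: stamps the caller's dict, which other queues may hold too.
        event["id"] = self.next_event_id
        self.next_event_id += 1
        self.events.append(event)

    def push(self, event: Dict[str, Any]) -> None:
        # Fixed variant: shallow-copy first, so the id belongs to this queue only.
        event = dict(event)
        event["id"] = self.next_event_id
        self.next_event_id += 1
        self.events.append(event)


def demo() -> Dict[str, int]:
    # Two clients; the second queue has already handed out ids 0..6.
    a, b = ToyQueue(), ToyQueue(next_event_id=7)
    shared = {"type": "presence"}
    a.push_shared(shared)
    b.push_shared(shared)  # overwrites the id that queue a just assigned

    c, d = ToyQueue(), ToyQueue(next_event_id=7)
    fresh = {"type": "presence"}
    c.push(fresh)
    d.push(fresh)  # each queue keeps its own copy and its own id

    return {
        "buggy_id_in_a": a.events[0]["id"],
        "fixed_id_in_c": c.events[0]["id"],
        "fixed_id_in_d": d.events[0]["id"],
    }


if __name__ == "__main__":
    print(demo())
```

In the buggy variant, queue `a` ends up holding an event whose ID was assigned by queue `b`, which is the kind of corruption that would trip a `last_event_id` validity check. Copying inside `push` costs one extra dictionary copy per queue, the trade-off the `2aab71e153` message discusses.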


221 Commits