zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	a50eb2e809	mypy: Enable new error explicit-override. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-10-12 12:28:41 -07:00
Esther Anierobi	b2ea3125b2	exports: Improve notifications about completed data exports. Change the url in the notification message to point to the settings interface rather than linking to the export directly. This is a much better user experience in the case that the export has been deleted since the time the export was requested. Fixes: #26923.	2023-10-11 17:42:32 -07:00
Anders Kaseorg	cf4791264c	python: Replace functools.partial with type-safe returns.curry.partial. The type annotation for functools.partial uses unchecked Any for all the function parameters (both early and late). returns.curry.partial uses a mypy plugin to check the parameters safely. https://returns.readthedocs.io/en/latest/pages/curry.html Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-09-11 18:03:45 -07:00
Alex Vandiver	b94402152d	models: Always search Messages with a realm_id or id limit. Unless there is a limit on `id`, always provide a `realm_id` limit as well. We also notate which index is expected to be used in each query.	2023-09-11 15:00:37 -07:00
Zixuan James Li	30495cec58	migration: Rename extra_data_json to extra_data in audit log models. This migration applies under the assumption that extra_data_json has been populated for all existing and coming audit log entries. - This removes the manual conversions back and forth for extra_data throughout the codebase including the orjson.loads(), orjson.dumps(), and str() calls. - The custom handler used for converting Decimal is removed since DjangoJSONEncoder handles that for extra_data. - We remove None-checks for extra_data because it is now no longer nullable. - Meanwhile, we want the bouncer to support processing RealmAuditLog entries for remote servers before and after the JSONField migration on extra_data. - Since now extra_data should always be a dict for the newer remote server, which is now migrated, the test cases are updated to create RealmAuditLog objects by passing a dict for extra_data before sending over the analytics data. Note that while JSONField allows for non-dict values, a proper remote server always passes a dict for extra_data. - We still test out the legacy extra_data format because not all remote servers have migrated to use JSONField extra_data. This verifies that support for extra_data being a string or None has not been dropped. Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2023-08-16 17:18:14 -07:00
Steve Howell	51db22c86c	per-request caches: Add per_request_cache library. We have historically cached two types of values on a per-request basis inside of memory: * linkifiers * display recipients Both of these caches were hand-written, and they both actually cache values that are also in memcached, so the per-request cache essentially only saves us from a few memcached hits. I think the linkifier per-request cache is a necessary evil. It's an important part of message rendering, and it's not super easy to structure the code to just get a single value up front and pass it down the stack. I'm not so sure we even need the display recipient per-request cache any more, as we are generally pretty smart now about hydrating recipient data in terms of how the code is organized. But I haven't done thorough research on that hypothesis. Fortunately, it's not rocket science to just write a glorified memoize decorator and tie it into key places in the code: * middleware * tests (e.g. asserting db counts) * queue processors That's what I did in this commit. This commit definitely reduces the amount of code to maintain. I think it also gets us closer to possibly phasing out this whole technique, but that effort is beyond the scope of this PR. We could add some instrumentation to the decorator to see how often we get a non-trivial number of saved round trips to memcached. Note that when we flush linkifiers, we just use a big hammer and flush the entire per-request cache for linkifiers, since there is only ever one realm in the cache.	2023-08-11 11:09:34 -07:00
Alex Vandiver	c77c78f147	missed-message: Add a try-catch to prevent killing background thread. An exception which escapes from this loop can kill the background worker thread; this results in consuming the queue (leading to the illusion of progress) but more and more rows silently piling up in the ScheduledMessageNotificationEmail table. Wrap the inside of the `while True` loop in a try/catch to make sure that no exceptions escape and kill the background thread. To prevent even more indentation, the inner loop is extracted into its own function. It returns true/false to signal if the `self.stopping` was set to tell the loop to stop; we cannot check it ourselves in the outer loop because it needs to hold the lock to be examined.	2023-07-25 10:01:00 -07:00
Anders Kaseorg	b285813beb	error_notify: Remove custom email error reporting handler. Restore the default django.utils.log.AdminEmailHandler when ERROR_REPORTING is enabled. Those with more sophisticated needs can turn it off and use Sentry or a Sentry-compatible system. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-20 11:00:09 -07:00
Alex Vandiver	be960f4142	missed-message: Lock ScheduledMessageNotificationEmail rows. This prevents the rows from being deleted out from under the worker while it is sending emails.	2023-07-13 11:50:42 -07:00
Alex Vandiver	d87895a3ef	missed-message: Merge before calling handle_missedmessage_emails. The MissedMessage queue worker is the single callsite of `handle_missedmessage_emails`, which immediately transforms the list of events into a dict keyed by message-id. Skip the intermediate list step, and use defaultdict and a dataclass to simplify and make explicit the pieces. This removes the unused user_profile_id and message_id pieces of the data structure.	2023-07-13 11:50:42 -07:00
Alex Vandiver	c7d9a4784e	missed-message: Remove unnecessary select_related(). This was added in ebb4eab0f99d; neither the `user_profile` nor the `message` attribute are read off of the object.	2023-07-13 11:50:42 -07:00
Zixuan James Li	b6d1e56cac	queue_processors: Avoid queue worker timeouts in tests. For tests that use the dev server, like test-api, test-js-with-puppeteer, we don't have the consumers for the queues. As they eventually timeout, we get unnecessary error messages. This adds a new flag, disable_timeout, to disable this behavior for the test cases.	2023-06-28 11:06:24 -07:00
Lauryn Menard	d3f7cfccbc	zerver: Update comments with "private message" or "PM". Updates comments/doc-strings that use "private message" or "PM" in files in the `/zerver` directory to instead use "direct message".	2023-06-23 11:24:13 -07:00
Alex Vandiver	7811e99548	realm_export: Handle hard head-of-queue failures. Realm exports may OOM on deployments with low memory; to ensure forward progress, log the start time in the RealmAuditLog entry, and key off of the existence of that to prevent re-attempting an export which was already tried once.	2023-05-16 14:05:01 -07:00
Alex Vandiver	4a43856ba7	realm_export: Do not assume null extra_data is special. Fixes: #20197.	2023-05-16 14:05:01 -07:00
Alex Vandiver	362177b788	workers: Run realm export with one thread if in low-memory environment. We previously hard-coded 6 threads for the realm export; in low-memory environments, spawning 6 threads for an export can lead to an OOM, which kills the process and leaves a partial export on disk -- which is then tried again, since the export was never completed. This leads to excessive disk consumption and brief repeated outages of all other workers, until the failing export job is manually de-queued somehow. Lower the export to only use one thread if it is already running in a multi-threaded environment. Note that this does not guarantee forward progress, it merely makes it more likely that exports will succeed in low-memory deployments.	2023-05-16 14:05:01 -07:00
Alex Vandiver	9f231322c9	workers: Pass down if they are running multi-threaded. This allows them to decide for themselves if they should enable timeouts.	2023-05-16 14:05:01 -07:00
Alex Vandiver	daba72c116	error_notify: Drop any remaining browser-side errors in RabbitMQ queue.	2023-04-13 14:59:58 -07:00
Alex Vandiver	3efc0c9af3	workers: Rewrite missedmessage_emails with a worker thread. The previous implementation leaked database connections, as a new thread (and thus a new thread-local database connection) was made for each timer execution. While these connections were relatively lightweight in Python, they also incur memory overhead in the PostgreSQL server itself. The logic for managing the timer was also unclear, and the unavoidable deadlock in the stopping logic was rather unfortunate. Rewrite with one explicit worker thread which handles the delayed message sending. The RabbitMQ consumer creates the database rows, and notifies the worker to start its 5s timeout. Because it is controlled by a condition variable, it does not hold the lock while waiting, and can be notified to exit.	2023-04-10 17:38:08 -07:00
Alex Vandiver	02a73af386	deferred_work: Log at start of the work. This is helpful for debugging -- generally these tasks are in a worker queue because they take a long time to run, so knowing what long task is about to start before it does, rather than just after, is useful.	2023-02-09 12:06:38 -08:00
Anders Kaseorg	7e3a681f80	ruff: Fix S108 Probable insecure usage of temporary file. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-26 10:14:56 -08:00
Anders Kaseorg	25346bde98	ruff: Fix SIM118 Use `k in d` instead of `k in d.keys()`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-23 11:18:36 -08:00
Anders Kaseorg	46cdcd3f33	ruff: Fix PIE790 Unnecessary `pass` statement. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-04 16:25:07 -08:00
Anders Kaseorg	e1ed44907b	ruff: Fix SIM118 Use `key in dict` instead of `key in dict.keys()`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-04 16:25:07 -08:00
Anders Kaseorg	73c4da7974	ruff: Fix N818 exception name should be named with an Error suffix. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-17 16:52:00 -08:00
Anders Kaseorg	9a8a2bd345	ruff: Enable import sorting, replacing isort. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-16 09:29:11 -08:00
Anders Kaseorg	1735b8863e	ruff: Fix B012 return inside finally blocks. return inside finally blocks causes exceptions to be silenced. Although these blocks follow blanket ‘except Exception’ handlers, they do not seem to have a goal of silencing BaseException and exceptions thrown by the exception handler, so rewrite them to avoid it. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-16 09:29:11 -08:00
Tim Abbott	8010d06f9e	compatibility: Delete obsolete compatibility code. Both of these compatibility blocks can be deleted, since you can't upgrade directly to any supported release from the versions where the old event formats would be used.	2022-11-15 15:39:38 -08:00
Anders Kaseorg	b45484573e	python: Use format string for logging str(obj). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-10-10 08:32:29 -07:00
Anders Kaseorg	fcd81a8473	python: Replace avoidable uses of __special__ attributes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-10-10 08:32:29 -07:00
Christopher Chong	28173cafc8	message_flags: Fix deadlocks when updating message flags. Previously, an active production Zulip server would experience a class of deadlocks caused by two or more concurrent bulk update operations on the UserMessage table. This is because UPDATE ... SET ... WHERE statements that execute in parallel take row-level UPDATE locks as they get results; since the query plans may result in getting rows in different orders between two queries, this can result in deadlocks. Some databases allow ORDER BY on their UPDATE ... WHERE statements; PostgreSQL does not. In PostgreSQL, the answer is to do a sub-select with an ORDER BY ... FOR UPDATE to ensure consistent ordering on row locks. We do this in all code paths using bitand or bitor as part of bulk editing message flags, which should ensure that these concurrent operations obtain row level locks on the table in the same order. Fixes #19054.	2022-09-06 16:06:58 -07:00
Zixuan James Li	3ba51ef1e2	queue_processor: Fix type annotation for connection. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2022-07-26 18:00:24 -07:00
Zixuan James Li	cd8510607a	queue_processor: Remove unreachable code. This change was added in `c93f1d4eda (diff-d88010b113b79080cab5885fdfbbb56ae2d380cb601d8f520621b3361ad8cebc)`. `message.content` cannot be `None` by the model definition. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2022-07-19 17:30:15 -07:00
Alex Vandiver	cd9c69cd12	message_send: Remove unnecessary user_ids argument. `cfcbf58cd1` rightly removed the use of `user_ids` in `render_markdown`, which in turn makes it unnecessary in `render_incoming_message`. Remove the unnecessary parameter from `render_incoming_message`.	2022-05-04 14:45:18 -07:00
Alex Vandiver	74e9b086f9	embed_links: Check that the message still exists before proceeding.	2022-05-04 14:45:18 -07:00
Alex Vandiver	de63000db6	embed_links: Take a lock on the message object while editing. We leave the fetching of links outside of the lock, as they could take seconds, which is an unreasonable amount of time to hold a lock on the message row. This may result in unnecessary work, in the case that the message was since edited, but the unnecessary work is preferable to blocking other work on the message row for the duration.	2022-05-04 14:45:18 -07:00
Alex Vandiver	127108c7d1	workers: Log the exception if the export fails. We previously just swallowed the exception entirely.	2022-04-28 11:52:47 -07:00
Zixuan James Li	a8fd9eb701	email_notifications: Soft reactivate mentioned users. Signed-off-by: Zixuan James Li <359101898@qq.com>	2022-04-27 16:43:54 -07:00
Sahil Batra	61365fbe21	invites: Use expiration time in minutes instead of days. This commit changes the invite API to accept invitation expiration time in minutes since we are going to add a custom option in further commits which would allow a user to set expiration time in minutes, hours and weeks as well.	2022-04-20 13:31:37 -07:00
Alex Vandiver	351bdfaf78	preview: Use cache only as a non-durable cache, not an IPC. The `get_link_embed_data` / `link_embed_data_from_cache` pair as introduced in `c93f1d4eda` uses the cache as a temporary store inside of the `embed_links` worker; this means that it must be durable storage, or the worker will stall and re-fetch the same links to preview them. Switch to plumbing through the fetched URL embed data as a parameter to the Markdown evaluation which uses them, rather than using the cache as an intermediary. This frees up the cache to be merely a non-durable cache. As a side-effect, this removes get_cache_with_key, and link_embed_data_from_cache which was its only callsite.	2022-04-15 14:48:12 -07:00
Anders Kaseorg	eda000899b	actions: Split out zerver.actions.message_edit. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:36 -07:00
Anders Kaseorg	eb4e9fe1e7	actions: Split out zerver.actions.message_flags. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:36 -07:00
Anders Kaseorg	975066e3f0	actions: Split out zerver.actions.message_send. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:34 -07:00
Anders Kaseorg	b7adfb02f6	actions: Split out zerver.actions.presence. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:32 -07:00
Anders Kaseorg	6168c0110a	actions: Split out zerver.actions.user_activity. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:32 -07:00
Anders Kaseorg	8fc5922ebd	actions: Split out zerver.actions.realm_export. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:31 -07:00
Anders Kaseorg	ca8d374e21	actions: Split out zerver.actions.invites. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:31 -07:00
Sahil Batra	392b17da5f	invite: Add backend support for "Never expires" option. The database value for expiry_date is None for the invite that will never expire and the clients send -1 as value in the API similar to the message retention setting. Also, when passing invite_expire_in_days as an argument in various functions, invite_expire_in_days is passed as -1 for "Never expires" option since invite_expire_in_days is an optional argument in some functions and thus we cannot pass "None" value.	2022-02-24 16:32:19 -08:00
Mateusz Mandera	30ac291eba	emoji: Add migration to reupload all RealmEmoji and ensure .author. Fixes #19732.	2022-02-10 17:45:31 -08:00
Lauryn Menard	c532829c35	backend: Change `do_report_error` return value. As a preparatory step to refactoring json_success to accept request as a parameter, change `do_report_error`, which is called from the events queue for "error_reports", to return None instead of json_success. Adds an assertion error to `ErrorReporter` queue processor and removes `JsonableError` from `do_report_error`. It is likely that `do_report_error` was moved from a view in a previous refactor, but was not updated to no longer return an HttpResponse.	2022-02-04 15:16:55 -08:00
Alex Vandiver	3efed5f1e6	queue_processors: Shut down background missedmessage_emails thread. Python's behaviour on `sys.exit` is to wait for all non-daemon threads to exit. In the context of the missedmessage_emails worker, if any work is pending, a non-daemon Timer thread exists, which is waiting for 5 seconds. As soon as that thread is serviced, it sets up another 5-second Timer, a process which repeats until all ScheduledMessageNotificationEmail records have been handled. This likely takes two minutes, but may theoretically take up to a week until the thread exits, and thus sys.exit can complete. Supervisor only gives the process 30 seconds to shut down, so something else must prevent this endless Timer. When `stop` is called, take the lock so we can mutate the timer. However, since `stop` may have been called from a signal handler, our thread may _already_ have the lock. As Python provides no way to know if our thread is the one which has the lock, make the lock a re-entrant one, allowing us to always try to take it. With the lock in hand, cancel any outstanding timers. A race exists where the timer may not be able to be canceled because it has finished, maybe_send_batched_emails has been called, and is itself blocked on the lock. Handle this case by timing out the thread join in `stop()`, and signal the running thread to exit by unsetting the timer event, which will be detected once it claims the lock.	2021-11-23 10:45:49 -08:00
Mateusz Mandera	8af7ffd9da	rate_limit: Fix logging string when rate limiting email gateway. realm.name is not the right "name" to log, we should use realm.subdomain like everywhere else.	2021-11-22 10:28:56 -08:00
Alex Vandiver	faeffa2466	queue_processors: Set a bounded prefetch size on rabbitmq queues. RabbitMQ clients have a setting called prefetch[1], which controls how many un-acknowledged events the server forwards to the local queue in the client. The default is 0; this means that when clients first connect, the server must send them every message in the queue. This itself may cause unbounded memory usage in the client, but also has other detrimental effects. While the client is attempting to process the head of the queue, it may be unable to read from the TCP socket at the rate that the server is sending to it -- filling the TCP buffers, and causing the server's writes to block. If the server blocks for more than 30 seconds, it times out the send, and closes the connection with: ``` closing AMQP connection <0.30902.126> (127.0.0.1:53870 -> 127.0.0.1:5672): {writer,send_failed,{error,timeout}} ``` This is https://github.com/pika/pika/issues/753#issuecomment-318119222. Set a prefetch limit of 100 messages, or the batch size, to better handle queues which start with large numbers of outstanding events. Setting prefetch=1 causes significant performance degradation in the no-op queue worker, to 30% of the prefetch=0 performance. Setting prefetch=100 achieves 90% of the prefetch=0 performance, and higher values offer only minor gains above that. For batch workers, their performance is not notably degraded by prefetch equal to their batch size, and they cannot function on smaller prefetches than their batch size. We also set a 100-count prefetch on Tornado workers, as they are potentially susceptible to the same effect. [1] https://www.rabbitmq.com/confirms.html#channel-qos-prefetch	2021-11-16 11:48:50 -08:00
Alex Vandiver	64268f47e8	queue_processors: Drop unused current_queue_size, which was local size. The `current_queue_size` key in the queue monitoring stats file was the local queue size, not the global queue size -- `d5a6b0f99a` renamed the function, but did not adjust the queue monitoring JSON, despite the last use of it having been removed in `cd9b194d88`. The function is still used to mark "we emptied our queue," and it remains a reasonable metric for that.	2021-11-16 11:48:50 -08:00
Alex Vandiver	800e38016a	queue_rate: Output to CSV, and run multiple prefetch values.	2021-11-16 11:48:50 -08:00
Shlok Patel	893c9bc896	export: Remove `--delete-after-upload` flag in realm export. For export realm following changes have been made: - `./manage.py export --upload` would delete `.tar.gz` and unpacked dir - `./manage.py export` would only delete `unpacked dir` Besides, we have removed `--delete-after-upload` as we have set it as the default. Fixes #20081	2021-11-03 11:14:02 -07:00
Alex Vandiver	75f1070881	queue_processors: Disable timeouts with PushNotificationsWorker. Since `3853285241`, PushNotificationsWorker uses the aioapns library to send Apple push notifications. This introduces an asyncio event loop into this worker process, which, if unlucky, can respond poorly when a SIGALRM is introduced to it: ``` [asyncio] Task exception was never retrieved future: <Task finished coro=<send_apple_push_notification.<locals>.attempt_send() done, defined at /path/to/zerver/lib/push_notifications.py:166> exception=WorkerTimeoutException(30, 1)> Traceback (most recent call last): File "/path/to/zerver/lib/push_notifications.py", line 169, in attempt_send result = await apns_context.apns.send_notification(request) File "/path/to/zulip-py3-venv/lib/python3.6/site-packages/aioapns/client.py", line 57, in send_notification response = await self.pool.send_notification(request) File "/path/to/zulip-py3-venv/lib/python3.6/site-packages/aioapns/connection.py", line 407, in send_notification response = await connection.send_notification(request) File "/path/to/zulip-py3-venv/lib/python3.6/site-packages/aioapns/connection.py", line 189, in send_notification data = json.dumps(request.message, ensure_ascii=False).encode() File "/usr/lib/python3.6/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/usr/lib/python3.6/json/encoder.py", line 199, in encode chunks = self.iterencode(o, _one_shot=True) File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode return _iterencode(o, 0) File "/path/to/zerver/worker/queue_processors.py", line 353, in timer_expired raise WorkerTimeoutException(limit, len(events)) zerver.worker.queue_processors.WorkerTimeoutException: Timed out after 30 seconds processing 1 events ``` ...which subsequently leads to the worker failing to make any progress on the queue. Remove the timeout on the worker. This may result in failing to make forward progress if Apple/Google take overly long handling requests, but is likely preferable to failing to make forward progress if _one_ request takes too long and gets unlucky with when the signal comes through.	2021-10-21 08:59:56 -07:00
Alex Vandiver	ab985c0066	queue_processors: Add a comment clarifying that timeouts only happen when single-threaded.	2021-10-21 08:59:56 -07:00
shanukun	8c1ea78d7d	invite: Extend invite api for handling expiration duration. This extends the invite api endpoints to handle an extra argument, expiration duration, which states the number of days before the invitation link expires. For prereg users, expiration info is attached to event object to pass it to invite queue processor in order to create and send confirmation link. In case of multiuse invites, confirmation links are created directly inside do_create_multiuse_invite_link(). For filtering valid user invites, expiration info stored in Confirmation object is used, which is accessed by a prereg user using reverse generic relations. Fixes #16359.	2021-09-10 16:53:03 -07:00
Anders Kaseorg	646c04eff2	Rename default branch to ‘main’. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-09-06 12:56:35 -07:00
Alex Vandiver	5b45f8a128	queue_processors: Include queue name in the timeout exception. This information can be gleaned from the stacktrace, but making it explicit in the stringification makes it much easier to differentiate types of errors at a glance, particularly in Sentry.	2021-09-02 02:48:34 -07:00
Alex Vandiver	4d98b0552e	missedmessage_emails: Ensure forward progress. maybe_send_batched_emails handles batches of emails from different users at once; as it processes each user's batch, it enqueues messages onto the `email_senders` queue. If `handle_missedmessage_emails` raises an exception when processing a single user's email, no events are marked as handled -- including those that were already handled and enqueued onto `email_senders`. This results in an increasing number of users being sent repeated emails about the same missed messages. Catch and log any exceptions when handling an individual user's events. This guarantees forward progress, and that notifications are sent at-most-once, not at-least-once.	2021-08-20 07:21:39 -07:00
Mateusz Mandera	a01594e72b	bots: Pass realm to get_system_bot call in DeferredWorker.	2021-07-26 15:33:13 -07:00
Abhijeet Prasad Bodas	dd5e12d112	MissedMessageWorker: Use custom batching periods from UserProfile.	2021-07-23 12:13:46 -07:00
Abhijeet Prasad Bodas	9fcb6e51ce	MissedMessageWorker: Handle deleted messages. The test for the try-except block is hacky. See the comment for explanation.	2021-07-23 12:13:46 -07:00
Abhijeet Prasad Bodas	de78b015d9	MissedMessageWorker: Remove unnecessary transaction.atomic. We only have one query which will change database state in this function, and we already have a lock on the process itself, so there's no need for a transaction. This was added in `ebb4eab0f9`.	2021-07-23 12:13:46 -07:00
Abhijeet Prasad Bodas	ebb4eab0f9	worker: Rewrite MissedMessageWorker to not be lossy. Previously, we stored up to 2 minutes worth of email events in memory before processing them. So, if the server were to go down we would lose those events. To fix this, we store the events in the database. This is a prep change for allowing users to set custom grace period for email notifications, since the bug noted above will aggravate with longer grace periods.	2021-07-13 17:21:38 -07:00
Abhijeet Prasad Bodas	e63e86dcb2	worker: Ensure complete coverage for PushNotificationsWorker. The `# nocoverage` was unnecessary apart from for the compatibility code, so add a test for that code and remove the `# nocoverage`. The `message_id` -> `message_ids` conversion was done in `9869153ae8`.	2021-07-13 08:30:31 -07:00
Mateusz Mandera	58d9975cca	embed_links: Interrupt consume() function on worker timeout. This fixes a bug introduced in `95b46549e1` which made the worker simply log a warning about the timeout and then continue consume()ing the event that should have also been interrupted. The idea here is to introduce an exception which can be used to interrupt the consume() process without triggering the regular handling of exceptions that happens in _handle_consume_exception.	2021-07-07 09:24:50 -07:00
Mateusz Mandera	95b46549e1	embed_links: Only log warning if worker times out. Throwing an exception is excessive in case of this worker, as it's expected for it to time out sometimes if the urls take too long to process. With a test added by tabbott.	2021-07-06 14:17:24 -07:00
Mateusz Mandera	d9ab70bdde	queue_processors: Make timer_expired receive list of events as argument. This will give queue workers more flexibility when defining their own override of the method.	2021-07-06 13:46:48 -07:00
Mateusz Mandera	c101f3acd6	queue_processors: Make timer_expired() a method. This allows specific queue workers to override the default behavior and implement their own response to the timer expiring. We will want to use this for embed_links queue at least.	2021-07-06 13:46:48 -07:00
PIG208	75cea329b4	markdown: Refactor out additional properties added to Message. This adds a new class called MessageRenderingResult to contain the additional properties we added to the Message object (like alert_words) as well as the rendered content to ensure typesafe reference. No behavioral change is made except changes in typing. This is a preparatory change for adding django-stubs to the backend. Related: #18777	2021-06-24 18:14:53 -07:00
sahil839	37bf160298	queue_processor: Add language to the events added to invites queue. This is a prep commit for adding realm-level default for various user settings. We add the language, in which the invite email will be sent, to the dict added to queue itself to avoid making queries in a loop when sending multiple emails from queue. We also handle the case for old events in the queue.	2021-06-22 16:55:32 -07:00
Mateusz Mandera	496e744053	queue_processors: Log more detailed info when marking messages as read.	2021-05-26 11:17:21 -07:00
PIG208	7150fe5dc5	backend: Extract check_update_message from update_message_backend.	2021-05-09 20:44:04 -07:00
Cyril Pletinckx	e4ff372fc3	emails: Transform SMTPException into EmailNotDeliveredException. Django's default SMTP implementation can raise various exceptions when trying to send an email. In order to allow Zulip calling code to catch fewer exceptions to handle any cause of "email not sent", we translate most of them into EmailNotDeliveredException. The non-translated exceptions concern the connection with the SMTP server. They were not merged with the rest to keep some details about the nature of these. Tests are implemented in the test_send_email.py module.	2021-05-05 20:16:11 -07:00
Alex Vandiver	a9688ceb75	worker: Allow long MissedMessageWorker consumes. This will stop dropping events in the case that the background `maybe_send_batched_email` thread takes longer than 30s. However, see also #15280 and the TODO comment about how we lose events upon restart; this worker is still lossy.	2021-05-04 08:45:48 -07:00
Cyril Pletinckx	9afde790c6	email: Open a single SMTP connection to send email batches. Previously the outgoing emails were sent over several SMTP connections through the EmailSendingWorker; establishing a new connection each time adds notable overhead. Redefine EmailSendingWorker worker to be a LoopQueueProcessingWorker, which allows it to handle batches of events. At the same time, persist the connection across email sending, if possible. The connection is initialized in the constructor of the worker in order to keep the same connection throughout the whole process. The concrete implementation of the consume_batch function is simply processing each email one at a time until they have all been sent. In order to reuse the previously implemented decorator to retry sending failures a new method that meets the decorator's required arguments is declared inside the EmailSendingWorker class. This allows retrying the sending process of a particular email inside the batch if the caught exception leaves this process retriable. A second retry mechanism is used inside the initialize_connection function to redo the opening of the connection until it works or until three attempts have failed. For this purpose the backoff module has been added to the dependencies and a test has been added to ensure that this retry mechanism works well. The connection is closed when the stop method is called. Fixes: #17672.	2021-04-26 17:27:22 -07:00
Alex Vandiver	0ad17925eb	send_email: Remove unnecessary send_email_from_dict. This was introduced in `8321bd3f92` to serve as a sort of drop-in replacement for zerver.lib.queue.queue_json_publish, but its use has been subsequently cut out (e.g. `9fcdb6c83ac5`). Remove its last callsite.	2021-04-26 17:27:22 -07:00
Anders Kaseorg	178736c8eb	docs: Fix spelling errors caught by codespell. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-26 09:31:08 -07:00
Tim Abbott	260861426c	queue_processors: Document when we can remove compatibility code.	2021-04-16 09:55:14 -07:00
Anders Kaseorg	e7ed907cf6	python: Convert deprecated Django ugettext alias to gettext. django.utils.translation.ugettext is a deprecated alias of django.utils.translation.gettext as of Django 3.0, and will be removed in Django 4.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-15 18:01:34 -07:00
Anders Kaseorg	1fe29aad42	queue_processors: Simplify unnecessary use of Optional. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-13 08:54:26 -07:00
Alex Vandiver	a280905a89	outgoing_webhook: Join build_bot_request and send_data_to_server. The existing organization, of returning an opaque blob from `build_bot_request`, which was later consumed by `send_data_to_server`, is not particularly sensible; the steps become oddly split between the OutgoingWebhookWorker, `do_rest_call`, and the `OutgoingWebhookServiceInterface`. Make the `OutgoingWebhookServiceInterface` in charge of building, making, and returning the request in one method; another method handles extracting content from a successful response. `do_rest_call` is responsible for calling both halves of this, and doing common error handling.	2021-03-29 18:24:44 -07:00
Anders Kaseorg	d55dc6f8f1	requirements: Upgrade python-zulip-api from Git. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-26 16:31:03 -07:00
Mateusz Mandera	b9c1fed18c	invites: Delete old compat code in the invites queue worker. 1.7.* is old enough at this point that we can clean up this code.	2021-02-26 08:26:43 -08:00
Mateusz Mandera	09fc79f911	actions: Remove realm argument to internal_send_private_message. The argument is redundant.	2021-02-23 15:26:47 -08:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	5028c081cb	python: Merge concatenated string literals that Black would uglify. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Alex Vandiver	d0f0c2f2ed	digest: Fix the structure that we enqueue across when digesting. This rename was missed in `bfa0bdf3d6`. Without this fix, digest messages fail to send.	2021-02-08 17:28:59 -08:00
Anders Kaseorg	454144c35f	queue_processors: Fix retry_send_email_failures type. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-01-26 13:27:50 -08:00
Steve Howell	bfa0bdf3d6	email digests: Process users in chunks of 30. This should make the queue empty more quickly, because we do bulk queries to prevent database hops.	2021-01-17 11:28:30 -08:00
Alex Vandiver	c2526844e9	worker: Remove SignupWorker and friends. ZULIP_FRIENDS_LIST_ID and MAILCHIMP_API_KEY are not currently used in production. This removes the unused 'signups' queue and worker.	2021-01-17 11:16:35 -08:00
Alex Vandiver	d688e18de2	errors: Remove references to "deployment", use "host". The `deployment` key was only set in `do_report_error`, which is now only used in one codepath (the queue worker). The logging handlers on staging call notify_server_error directly, which omits the `deployment` key. Remove the odd one-off key, and instead simply do dispatch in `do_report_error`.	2021-01-17 11:08:12 -08:00
Anders Kaseorg	cc55393671	python: Open text files as text to skip decode operations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-30 11:36:38 -07:00
Abhijeet Prasad Bodas	e98a8856c7	logging: Add logging in deferred_work queue processor. Adds logging statements in deferred_work queue consume.	2020-10-29 10:34:53 -07:00
Alex Vandiver	142de0f670	queue: Increase default timeout to 30s, from 10s. Not all of the workers are known to be safe to interrupt; they might leave inconsistent state. As such, terminating them with timeouts should currently only be a last-resort against stalled queues, not a regular occurrence.	2020-10-27 16:39:31 -07:00
Alex Vandiver	c73dd194f0	sentry: Group all worker timeouts together, by queue. Since the exception can be triggered at arbitrary places in the stack based on whenever the alarm happens to fire, they do not often group together. Explicitly group them together, grouped only by which queue the work is in.	2020-10-27 16:39:31 -07:00

477 Commits