zulip

Commit Graph

Author	SHA1	Message	Date
Prakhar Pratyush	07dcee36b2	export_realm: Add RealmExport model. Earlier, we used to store the key data related to realm exports in RealmAuditLog. This commit adds a separate table to store those data. It includes the code to migrate the concerned existing data in RealmAuditLog to RealmExport. Fixes part of #31201.	2024-10-04 12:06:35 -07:00
Anders Kaseorg	1b4e02c5d0	thumbnail: Remove type: ignore. (An alternate solution is message_classes: list[type[Message | ArchivedMessage]].) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-10-04 13:54:14 -04:00
Alex Vandiver	912c1b5984	thumbnail: Tighten and clarify the "type: ignore" limitation.	2024-10-04 09:10:14 -07:00
Alex Vandiver	3cbbf2307b	thumbnail: Only lock the message row, not the Attachment row. This prevents a deadlock between the thumbnailing worker and message sending, as follows: 1. A user uploads an image, making Attachment and ImageAttachment rows, as well as enqueuing a job in the thumbnailing queue. 2. Message sending starts a transaction, creates the Message row, and calls `do_claim_attachments`, which edits the Attachment row of the upload (implicitly locking it). 3. The thumbnailing worker starts a transaction, locks the ImageAttachment row for its image, thumbnails it, and then attempts to `select_for_update()` the message objects (joined to the Attachments table) to find the ones which link to the attachment in question. This query blocks, since "a locking clause without a table list affects all tables used in the statement"[^1] and the message-send request already has a write lock on the Attachments row in question. 4. The message-send request attempts to re-fetch the ImageAttachment row inside the transaction, which tries to pull a lock on it. 5. Deadlock, because the message-send request has the Attachment lock, and waits for the ImageAttachment lock; the thumbnailing worker has the ImageAttachment lock, and waits for the Attachment lock. We break this deadlock by limiting the `update_message_rendered_content` `select_for_update` to only take the lock on the Message table, and not also the Attachments table -- no changes will be made to the Attachments, so no lock is necessary there. This allows the thumbnailing worker to successfully pull the empty list of messages (since the message-send request has not committed its transaction, and thus the Message row is not visible yet), and release its ImageAttachment lock so that the message-send request can proceed. [^1]: https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE	2024-10-04 09:10:14 -07:00
Prakhar Pratyush	65f465562f	export_realm: Remove the 'react on consent message' approach. For exporting full with consent: * Earlier, a message advertising users to react with thumbs up was sent and later used to determine the users who consented. * Now, we no longer need to send such a message. This commit updates the logic to use `allow_private_data_export` user-setting to determine users who consented. Fixes part of #31201.	2024-09-24 14:32:42 -07:00
Alex Vandiver	ce0df00e44	export: Notify all realm admins on realm export.	2024-09-23 10:02:43 -07:00
Alex Vandiver	b4764f49df	upload: Download files with their original names. Fixes: #29491.	2024-09-09 12:40:17 -07:00
Alex Vandiver	6f20c15ae9	thumbnail: Resolve a race condition when rendering messages. Messages are rendered outside of a transaction, for performance reasons, and then sent inside of one. This opens thumbnailing up to a race where the thumbnails have not yet been written when the message is rendered, but the message has not been sent when thumbnailing completes, causing `rewrite_thumbnailed_images` to be a no-op and the message being left with a spinner which never resolves. Explicitly lock and use the ImageAttachment data inside the message-sending transaction, to rewrite the message content with the latest information about the existing thumbnails. Despite the thumbnailing worker taking a lock on Message rows to update them, this does not lead to deadlocks -- the INSERT of the Message rows happens in a transaction, ensuring that either the message rendering blocks the thumbnailing until the Message row is created, or that the `rewrite_thumbnailed_images` and Message INSERT waits until thumbnailing is complete (and updated no Message rows).	2024-08-01 16:48:16 -07:00
Mateusz Mandera	aaca394813	presence: Remove the queue worker.	2024-07-31 16:46:42 -07:00
Alex Vandiver	2ea0cc0005	thumbnail: Add a data-original-dimensions attribute. This allows clients to potentially lay out the thumbnails more intelligently, or to provide a better "progressive-load" experience when enlarging the thumbnail.	2024-07-22 22:41:10 -04:00
Alex Vandiver	65828b20e9	thumbnail: Factor out a dataclass for markdown image metadata.	2024-07-22 22:41:10 -04:00
Alex Vandiver	b42863be4b	markdown: Show thumbnails for uploaded images. Fixes: #16210.	2024-07-21 18:41:59 -07:00
Alex Vandiver	4351cc5914	thumbnail: Move get_image_thumbnail_path and split_thumbnail_path.	2024-07-18 13:50:28 -07:00
Alex Vandiver	d121a80b78	upload: Serve thumbnailed images.	2024-07-16 13:22:15 -07:00
Alex Vandiver	2e38f426f4	upload: Generate thumbnails when images are uploaded. A new table is created to track which path_id attachments are images, and for those their metadata, and which thumbnails have been created. Using path_id as the effective primary key lets us ignore if the attachment is archived or not, saving some foreign key messes. A new worker is added to observe events when rows are added to this table, and to generate and store thumbnails for those images in differing sizes and formats.	2024-07-16 13:22:15 -07:00
Anders Kaseorg	1e9b6445a9	ruff: Fix PLR6104 Use `+=` to perform an augmented assignment directly. This is a preview rule, not yet enabled by default. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-14 13:49:51 -07:00
Anders Kaseorg	b96feb34f6	ruff: Fix SIM117 Use a single `with` statement with multiple contexts. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-14 13:48:32 -07:00
Anders Kaseorg	0fa5e7f629	ruff: Fix UP035 Import from `collections.abc`, `typing` instead. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-13 22:28:22 -07:00
Anders Kaseorg	531b34cb4c	ruff: Fix UP007 Use `X | Y` for type annotations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-13 22:28:22 -07:00
Anders Kaseorg	e08a24e47f	ruff: Fix UP006 Use `list` instead of `List` for type annotation. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-07-13 22:28:22 -07:00
Anders Kaseorg	b115d44b6a	requirements: Upgrade Python requirements. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-06-27 15:31:43 -07:00
Alex Vandiver	b2ebe34500	missedmessage_emails: Backoff the background worker retries.	2024-05-06 12:50:27 -07:00
Tim Abbott	0a756c652c	push_notifications: Shard mobile push notifications.	2024-05-02 14:25:10 -07:00
Alex Vandiver	572fbfe114	queue_processors: Pass the worker_num down into the class.	2024-05-02 14:25:10 -07:00
Alex Vandiver	9dfaa83aa8	invites: Remove invites worker, make confirmation object in-process. The "invites" worker exists to do two things -- make a Confirmation object, and send the outgoing email. Making the Confirmation object in a background process from where the PreregistrationUser is created temporarily leaves the PreregistrationUser in invalid state, and results in 500's, and the user not immediately seeing the sent invitation. That the "invites" worker also wants to create the Confirmation object means that "resending" an invite invalidates the URL in the previous email, which can be confusing to the user. Moving the Confirmation creation to the same transaction solves both of these issues, and leaves the "invites" worker with nothing to do but send the email; as such, we remove it entirely, and use the existing "email_senders" worker to send the invites. The volume of invites is small enough that this will not affect other uses of that worker. Fixes: #21306 Fixes: #24275	2024-05-02 14:23:04 -07:00
Anders Kaseorg	d32d4434dd	partial: Replace returns plugin with an annotation. The returns plugin hasn’t been updated for mypy ≥ 1.6. This annotation is more limited in that it only supports a fixed number of positional arguments and no keyword arguments, but is good enough for our purposes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-04-29 17:14:41 -07:00
Anders Kaseorg	72aeaf8d52	db: Split reset_queries into a new module zerver.lib.db_connections. Fixes an import cycle that breaks mypy inference with django-stubs: zproject.settings → zproject.computed_settings → zerver.lib.db → django.db → django.db.backends.base.base → django.db.backends.base.features → django.db.models.base → django.db.models.options → django.contrib.contenttypes.fields → django.contrib.contenttypes.models → confirmation.models → django.conf → zproject.settings Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-04-17 16:49:03 -07:00
Alex Vandiver	5654d051f7	worker: Split into separate files. This makes each worker faster to start up.	2024-04-16 23:00:02 -07:00
Anders Kaseorg	7e2ef11f61	ruff: Fix UP041 Replace aliased errors with `TimeoutError`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-04-01 18:32:52 -07:00
Alex Vandiver	9d8d2d138b	missedmessage_emails: Add Sentry spans to worker thread.	2024-03-21 12:46:13 -07:00
Alex Vandiver	9451d08bb9	worker: Split out worker sampling rate, and add Sentry transactions.	2024-03-21 12:46:13 -07:00
Alex Vandiver	3cbce0c5c7	missedmessage_emails: Clear caches and db query tracking per-loop. Otherwise, these accumulate and leak memory.	2024-03-21 12:46:13 -07:00
Alex Vandiver	6e91e326e9	deferred_work: Reduce batch size due to bad statistics. PostgreSQL's estimate of the number of usermessage rows for a single message can be wildly off, due to poor statistics generation. This causes this query, with 100-message batch sizes, to incorrectly estimate millions of matched rows, causing it to perform a full-table index scan, rather than piecemeal using the `message_id` index. Reduce the batch size to 50, which is enough to tip in favor of a rational query plan.	2024-03-11 09:24:59 -07:00
Anders Kaseorg	d748ec8d52	ruff: Fix PLW0108 Lambda may be unnecessary. This is a preview rule, not yet enabled by default. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-03-01 09:30:04 -08:00
Alex Vandiver	a808c730bc	deferred_work: Use an id high-water-mark instead of offsets. This solves the problem listed in the now-removed comment.	2024-02-27 17:02:34 -08:00
Alex Vandiver	58f0669997	deferred_work: Re-queue remaining "mark all as read" work after 30s.	2024-02-27 10:21:04 -08:00
Alex Vandiver	75e9903be5	deferred_work: Move all queries into the transaction. The presence of `len(messages)` outside the transaction caused the full resultset to be fetched outside of the transaction. This should ideally be inside the transaction, and also only need be the count. However, also note that the process of counting matching rows, and then executing a second query which embeds the same query, is susceptible to phantom reads, where a query with the same conditions returns different resultsets, under PostgreSQL's default transaction isolation of "read committed." While this is possible to resolve by pulling the returned IDs into a Python list, it would not address the issue that concurrent updates which change the resultset would make the overall algorithm still incorrect. Add a comment clarifying the conditions under which the algorithm is correct. A more correct algorithm would walk the UserMessage rows which are unread and in the stream, but this requires a whole-UserMessage index which would be quite large for such an infrequent use case.	2024-02-27 10:21:04 -08:00
Alex Vandiver	37fa181e5f	queue_processors: Process user_activity in one query. This leads to significant speedups. In a test, with 100 random unique event classes, the old code processed a batch of 100 rows (on average 66-ish unique in the batch) in 0.45 seconds. Doing this in a single query processes the same batch in 0.0076 seconds.	2024-01-22 16:25:13 -08:00
Alex Vandiver	e6a0284275	queue_processors: Defer initial email connection creation. We previously created the connection to the outgoing email server when the EmailSendingWorker was first created. Since creating the connection can fail (e.g. because of firewalls or typos in the hostname), this can cause the `QueueProcessingWorker` creation to raise an exception. In multi-threaded mode, exceptions in the worker threads which are _not_ during the handling of a specific event percolate out to `log_and_exit_if_exception` and trigger the termination of the entire process -- stopping all worker threads from making forward progress. Contain the blast radius of misconfigured email servers by deferring the opening of the connection until it is first needed. This will not cause any overall performance change, since it only affects the latency of the very first email after startup.	2024-01-12 08:38:46 -08:00
Anders Kaseorg	1f1b2f9a68	models: Extract zerver.models.bots. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-16 22:08:44 -08:00
Anders Kaseorg	bac027962f	models: Extract zerver.models.clients. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-16 22:08:44 -08:00
Anders Kaseorg	927d7a9a60	models: Extract zerver.models.prereg_users. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-16 22:08:44 -08:00
Anders Kaseorg	45bb8d2580	models: Extract zerver.models.users. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-16 22:08:44 -08:00
Prakhar Pratyush	c1daabd3c0	remote_server: Rename to 'send_server_data_to_push_bouncer'. This commit renames 'send_analytics_to_push_bouncer' to 'send_server_data_to_push_bouncer'.	2023-12-11 14:07:39 -08:00
Tim Abbott	5c1a5a816f	remote_server: Rename register_realm_with_push_bouncer. We plan to have this potentially happen more than once for a given realm.	2023-12-11 14:07:39 -08:00
Prakhar Pratyush	d763fae9d0	remote_server: Eliminate separate realms-only code path. Given that most of the use cases for realms-only code path would really like to upload audit logs too, and the others would likely produce a better user experience if they uploaded audit logs, we should just have a single main code path here i.e. 'send_analytics_to_push_bouncer'. We still only upload usage statistics according to documented option, and only from the analytics cron job. The error handling takes place in 'send_analytics_to_push_bouncer' itself.	2023-12-11 14:07:39 -08:00
Anders Kaseorg	223b626256	python: Use urlsplit instead of urlparse. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-05 13:03:07 -08:00
Anders Kaseorg	3853fa875a	python: Consistently use from…import for urllib.parse. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-05 13:03:07 -08:00
Anders Kaseorg	8a7916f21a	python: Consistently use from…import for datetime. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-05 12:01:18 -08:00
Mateusz Mandera	a67dd6dc1f	realms: Call send_realms_only_to_push_bouncer at realm creation/import.	2023-12-03 08:49:58 -08:00

477 Commits