Commit Graph

479 Commits

Author SHA1 Message Date
Prakhar Pratyush 9c9866461a transaction: Add `durable=True` to the outermost db transactions.
This commit adds `durable=True` to the outermost db transactions
created in the following:
* confirm_email_change
* handle_upload_pre_finish_hook
* deliver_scheduled_emails
* restore_data_from_archive
* do_change_realm_subdomain
* do_create_realm
* do_deactivate_realm
* do_reactivate_realm
* do_delete_user
* do_delete_user_preserving_messages
* create_stripe_customer
* process_initial_upgrade
* do_update_plan
* request_sponsorship
* upload_message_attachment
* register_remote_server
* do_soft_deactivate_users
* maybe_send_batched_emails

It helps to avoid creating unintended savepoints in the future.

This is as a part of our plan to explicitly mark all the
transaction.atomic calls with either 'savepoint=False' or
'durable=True' as required.

* 'savepoint=True' is used in special cases.
2024-11-05 17:58:47 -08:00
Anders Kaseorg ac2b1cd45d worker: Address sentry_sdk deprecations.
https://docs.sentry.io/platforms/python/migration/1.x-to-2.x#scope-configuring
https://github.com/getsentry/sentry-python/releases/2.0.0
https://github.com/getsentry/sentry-python/releases/2.15.0

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-10-22 10:05:01 -07:00
Prakhar Pratyush 07dcee36b2 export_realm: Add RealmExport model.
Earlier, we used to store the key data related to realm exports
in RealmAuditLog. This commit adds a separate table to store
those data.

It includes the code to migrate the concerned existing data in
RealmAuditLog to RealmExport.

Fixes part of #31201.
2024-10-04 12:06:35 -07:00
Anders Kaseorg 1b4e02c5d0 thumbnail: Remove type: ignore.
(An alternate solution is message_classes: list[type[Message |
ArchivedMessage]].)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-10-04 13:54:14 -04:00
Alex Vandiver 912c1b5984 thumbnail: Tighten and clarify the "type: ignore" limitation. 2024-10-04 09:10:14 -07:00
Alex Vandiver 3cbbf2307b thumbnail: Only lock the message row, not the Attachment row.
This prevents a deadlock between the thumbnailing worker and message
sending, as follows:

1. A user uploads an image, making Attachment and ImageAttachment
   rows, as well as enqueuing a job in the thumbnailing queue.

2. Message sending starts a transaction, creates the Message row,
   and calls `do_claim_attachments`, which edits the Attachment row
   of the upload (implicitly locking it).

3. The thumbnailing worker starts a transaction, locks the
   ImageAttachment row for its image, thumbnails it, and then
   attempts to `select_for_update()` the message objects (joined to
   the Attachments table) to find the ones which link to the
   attachment in question. This query blocks, since "a locking
   clause without a table list affects all tables used in the
   statement"[^1] and the message-send request already has a write
   lock on the Attachments row in question.

4. The message-send request attempts to re-fetch the ImageAttachment
   row inside the transaction, which tries to pull a lock on it.

5. Deadlock, because the message-send request has the Attachment
   lock, and waits for the ImageAttachment lock; the thumbnailing
   worker has the ImageAttachment lock, and waits for the Attachment
   lock.

We break this deadlock by limiting the
`update_message_rendered_content` `select_for_update` to only take
the lock on the Message table, and not also the Attachments table --
no changes will be made to the Attachments, so no lock is necessary
there. This allows the thumbnailing worker to successfully pull the
empty list of messages (since the message-send request has not
commits its transaction, and thus the Message row is not visible
yet), and release its ImageAttachment lock so that the message-send
request can proceed.

[^1]: https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE
2024-10-04 09:10:14 -07:00
Prakhar Pratyush 65f465562f export_realm: Remove the 'react on consent message' approach.
For exporting full with consent:

* Earlier, a message advertising users to react with thumbs up
  was sent and later used to determine the users who consented.

* Now, we no longer need to send such a message. This commit
  updates the logic to use `allow_private_data_export` user-setting
  to determine users who consented.

Fixes part of #31201.
2024-09-24 14:32:42 -07:00
Alex Vandiver ce0df00e44 export: Notify all realm admins on realm export. 2024-09-23 10:02:43 -07:00
Alex Vandiver b4764f49df upload: Download files with their original names.
Fixes: #29491.
2024-09-09 12:40:17 -07:00
Alex Vandiver 6f20c15ae9 thumbnail: Resolve a race condition when rendering messages.
Messages are rendered outside of a transaction, for performance
reasons, and then sent inside of one.  This opens thumbnailing up to a
race where the thumbnails have not yet been written when the message
is rendered, but the message has not been sent when thumbnailing
completes, causing `rewrite_thumbnailed_images` to be a no-op and the
message being left with a spinner which never resolves.

Explicitly lock and use he ImageAttachment data inside the
message-sending transaction, to rewrite the message content with the
latest information about the existing thumbnails.

Despite the thumbnailing worker taking a lock on Message rows to
update them, this does not lead to deadlocks -- the INSERT of the
Message rows happens in a transaction, ensuring that either the
message rending blocks the thumbnailing until the Message row is
created, or that the `rewrite_thumbnailed_images` and Message INSERT
waits until thumbnailing is complete (and updated no Message rows).
2024-08-01 16:48:16 -07:00
Mateusz Mandera aaca394813 presence: Remove the queue worker. 2024-07-31 16:46:42 -07:00
Alex Vandiver 2ea0cc0005 thumbnail: Add a data-original-dimensions attribute.
This allows clients to potentially lay out the thumbnails more
intelligently, or to provide a better "progressive-load" experience
when enlarging the thumbnail.
2024-07-22 22:41:10 -04:00
Alex Vandiver 65828b20e9 thumbnail: Factor out a dataclass for markdown image metadata. 2024-07-22 22:41:10 -04:00
Alex Vandiver b42863be4b markdown: Show thumbnails for uploaded images.
Fixes: #16210.
2024-07-21 18:41:59 -07:00
Alex Vandiver 4351cc5914 thumbnail: Move get_image_thumbnail_path and split_thumbnail_path. 2024-07-18 13:50:28 -07:00
Alex Vandiver d121a80b78 upload: Serve thumbnailed images. 2024-07-16 13:22:15 -07:00
Alex Vandiver 2e38f426f4 upload: Generate thumbnails when images are uploaded.
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.

A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
2024-07-16 13:22:15 -07:00
Anders Kaseorg 1e9b6445a9 ruff: Fix PLR6104 Use `+=` to perform an augmented assignment directly.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:49:51 -07:00
Anders Kaseorg b96feb34f6 ruff: Fix SIM117 Use a single `with` statement with multiple contexts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:48:32 -07:00
Anders Kaseorg 0fa5e7f629 ruff: Fix UP035 Import from `collections.abc`, `typing` instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg 531b34cb4c ruff: Fix UP007 Use `X | Y` for type annotations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg e08a24e47f ruff: Fix UP006 Use `list` instead of `List` for type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg b115d44b6a requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-06-27 15:31:43 -07:00
Alex Vandiver b2ebe34500 missedmessage_emails: Backoff the background worker retries. 2024-05-06 12:50:27 -07:00
Tim Abbott 0a756c652c push_notifications: Shard mobile push notifications. 2024-05-02 14:25:10 -07:00
Alex Vandiver 572fbfe114 queue_processors: Pass the worker_num down into the class. 2024-05-02 14:25:10 -07:00
Alex Vandiver 9dfaa83aa8 invites: Remove invites worker, make confirmation object in-process.
The "invites" worker exists to do two things -- make a Confirmation
object, and send the outgoing email.  Making the Confirmation object
in a background process from where the PreregistrationUser is created
temporarily leaves the PreregistrationUser in invalid state, and
results in 500's, and the user not immediately seeing the sent
invitation.  That the "invites" worker also wants to create the
Confirmation object means that "resending" an invite invalidates the
URL in the previous email, which can be confusing to the user.

Moving the Confirmation creation to the same transaction solves both
of these issues, and leaves the "invites" worker with nothing to do
but send the email; as such, we remove it entirely, and use the
existing "email_senders" worker to send the invites.  The volume of
invites is small enough that this will not affect other uses of that
worker.

Fixes: #21306
Fixes: #24275
2024-05-02 14:23:04 -07:00
Anders Kaseorg d32d4434dd partial: Replace returns plugin with an annotation.
The returns plugin hasn’t been updated for mypy ≥ 1.6.  This
annotation is more limited in that it only supports a fixed number of
positional arguments and no keyword arguments, but is good enough for
our purposes.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-29 17:14:41 -07:00
Anders Kaseorg 72aeaf8d52 db: Split reset_queries into a new module zerver.lib.db_connections.
Fixes an import cycle that breaks mypy inference with django-stubs:

zproject.settings → zproject.computed_settings → zerver.lib.db →
django.db → django.db.backends.base.base →
django.db.backends.base.features → django.db.models.base →
django.db.models.options → django.contrib.contenttypes.fields →
django.contrib.contenttypes.models → confirmation.models → django.conf
→ zproject.settings

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-17 16:49:03 -07:00
Alex Vandiver 5654d051f7 worker: Split into separate files.
This makes each worker faster to start up.
2024-04-16 23:00:02 -07:00
Anders Kaseorg 7e2ef11f61 ruff: Fix UP041 Replace aliased errors with `TimeoutError`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-01 18:32:52 -07:00
Alex Vandiver 9d8d2d138b missedmessage_emails: Add Sentry spans to worker thread. 2024-03-21 12:46:13 -07:00
Alex Vandiver 9451d08bb9 worker: Split out worker sampling rate, and add Sentry transactions. 2024-03-21 12:46:13 -07:00
Alex Vandiver 3cbce0c5c7 missedmessage_emails: Clear caches and db query tracking per-loop.
Otherwise, these accumulate and leak memory.
2024-03-21 12:46:13 -07:00
Alex Vandiver 6e91e326e9 deferred_work: Reduce batch size due to bad statistics.
PostgreSQL's estimate of the number of usermessage rows for a single
message can be wildly off, due to poor statistics generation.  This
causes this query, with 100-message batch sizes, to incorrectly
estimate millions of matched rows, causing it to perform a full-table
index scan, rather than piecemeal using the `message_id` index.

Reduce the batch size to 50, which is enough to tip in favor of a
rational query plan.
2024-03-11 09:24:59 -07:00
Anders Kaseorg d748ec8d52 ruff: Fix PLW0108 Lambda may be unnecessary.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-03-01 09:30:04 -08:00
Alex Vandiver a808c730bc deferred_work: Use an id high-water-mark instead of offsets.
This solves the problem listed in the now-removed comment.
2024-02-27 17:02:34 -08:00
Alex Vandiver 58f0669997 deferred_work: Re-queue remaining "mark all as read" work after 30s. 2024-02-27 10:21:04 -08:00
Alex Vandiver 75e9903be5 deferred_work: Move all queries into the transaction.
The presence of `len(messages)` outside the transaction caused the
full resultset to be fetched outside of the transaction.  This should
ideally be inside the transaction, and also only need be the count.

However, also note that the process of counting matching rows, and
then executing a second query which embeds the same query, is
susceptible to phantom reads, where a query with the same conditions
returns different resultsets, under PostgreSQL's default transaction
isolation of "read committed."  While this is possible to resolve by
pulling the returned IDs into a Python list, it would not address the
issue that concurrent updates which change the resultset would make
the overall algorithm still incorrect.

Add a comment clarifying the conditions under which the algorithm is
correct.  A more correct algorithm would walk the UserMessage rows
which are unread and in the stream, but this requires a
whole-UserMessage index which would be quite large for such an
infrequent use case.
2024-02-27 10:21:04 -08:00
Alex Vandiver 37fa181e5f queue_processors: Process user_activity in one query.
This leads to significant speedups.  In a test, with 100 random unique
event classes, the old code processed a batch of 100 rows (on average
66-ish unique in the batch) in 0.45 seconds.  Doing this in a single
query processes the same batch in 0.0076 seconds.
2024-01-22 16:25:13 -08:00
Alex Vandiver e6a0284275 queue_processors: Defer initial email connection creation.
We previously created the connection to the outgoing email server when
the EmailSendingWorker was first created.  Since creating the
connection can fail (e.g. because of firewalls or typos in the
hostname), this can cause the `QueueProcessingWorker` creation to
raise an exception.  In multi-threaded mode, exceptions in the worker
threads which are _not_ during the handling of a specific event
percolate out to `log_and_exit_if_exception` and trigger the
termination of the entire process -- stopping all worker threads from
making forward progress.

Contain the blast radius of misconfigured email servers by deferring
the opening of the connection until it is first needed.  This will not
cause any overall performance change, since it only affects the
latency of the very first email after startup.
2024-01-12 08:38:46 -08:00
Anders Kaseorg 1f1b2f9a68 models: Extract zerver.models.bots.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg bac027962f models: Extract zerver.models.clients.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg 927d7a9a60 models: Extract zerver.models.prereg_users.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg 45bb8d2580 models: Extract zerver.models.users.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Prakhar Pratyush c1daabd3c0 remote_server: Rename to 'send_server_data_to_push_bouncer'.
This commit renames 'send_analytics_to_push_bouncer'
to 'send_server_data_to_push_bouncer'.
2023-12-11 14:07:39 -08:00
Tim Abbott 5c1a5a816f remote_server: Rename register_realm_with_push_bouncer.
We plan to have this potentially happen more than once for a given
realm.
2023-12-11 14:07:39 -08:00
Prakhar Pratyush d763fae9d0 remote_server: Eliminate separate realms-only code path.
Given that most of the use cases for realms-only code path would
really like to upload audit logs too, and the others would likely
produce a better user experience if they upoaded audit logs, we
should just have a single main code path here i.e.
'send_analytics_to_push_bouncer'.

We still only upload usage statistics according to documented
option, and only from the analytics cron job.

The error handling takes place in 'send_analytics_to_push_bouncer'
itself.
2023-12-11 14:07:39 -08:00
Anders Kaseorg 223b626256 python: Use urlsplit instead of urlparse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00
Anders Kaseorg 3853fa875a python: Consistently use from…import for urllib.parse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00