Commit Graph

50 Commits

Author SHA1 Message Date
Prakhar Pratyush 9c9866461a transaction: Add `durable=True` to the outermost db transactions.
This commit adds `durable=True` to the outermost db transactions
created in the following:
* confirm_email_change
* handle_upload_pre_finish_hook
* deliver_scheduled_emails
* restore_data_from_archive
* do_change_realm_subdomain
* do_create_realm
* do_deactivate_realm
* do_reactivate_realm
* do_delete_user
* do_delete_user_preserving_messages
* create_stripe_customer
* process_initial_upgrade
* do_update_plan
* request_sponsorship
* upload_message_attachment
* register_remote_server
* do_soft_deactivate_users
* maybe_send_batched_emails

It helps to avoid creating unintended savepoints in the future.

This is part of our plan to explicitly mark all the
transaction.atomic calls with either 'savepoint=False' or
'durable=True' as required.

* 'savepoint=True' is used in special cases.
2024-11-05 17:58:47 -08:00
Mateusz Mandera da4443f392 thumbnail: Make thumbnailing work with data import.
We didn't have thumbnailing for images coming from data import and this
commit adds the functionality.

There are a few fundamental issues that the implementation needs to
solve.

1. The images come from an untrusted source and therefore we don't want
   to just pass them through to thumbnailing without checking. For that
   reason, we cannot just import ImageAttachment rows from the export
   data, even for zulip=>zulip imports.
   The right way to process images is to pass them to maybe_thumbnail(),
   which runs libvips_check_image() on them to verify we're okay with
   thumbnailing, creates ImageAttachment rows for them and sends them
   to the thumbnailing queue worker. This approach lets us handle both
   zulip=>zulip and 3rd party=>zulip imports in the same way.

2. There is a somewhat circular dependency between the Message,
   Attachment and ImageAttachment import process:

- ImageAttachments would ideally be created after importing
  Attachments, but they need to already exist at the time of Message
  import. Otherwise, the markdown processor doesn't know it has to add
  HTML for image previews to messages that reference images. This would
  mean that messages imported from 3rd party tools don't get image
  previews.
- Attachments only get created after Message import however, due to the
  many-to-many relationship between Message and Attachment.

This is solved by fixing up some data of Attachments pre-emptively, such
as the path_ids. This gives us the necessary information for creating
ImageAttachments before importing Messages.

While we generate ImageAttachment rows synchronously, the actual
thumbnailing job is sent to the queue worker. Theoretically, the worker
could be very backlogged and not process the thumbnails anytime soon.
This is fine - if the app is loaded and tries to display a message with
such a not-yet-generated thumbnail, the code in `serve_file` will
generate the thumbnails synchronously on the fly and the user will see
the image preview displayed normally. See:

1b47134d0d/zerver/views/upload.py (L333-L342)
2024-10-24 10:32:51 -07:00
Alex Vandiver a20673a267 upload: Allow filtering to just a prefix (e.g. a realm id). 2024-09-26 12:01:11 -07:00
Alex Vandiver 2dc737335e upload: Switch from BinaryIO to IO[bytes].
This is slightly more generally-compatible.
2024-09-26 12:01:11 -07:00
Alex Vandiver 287850d08d tusd: Remove non-ASCII characters from path-ids. 2024-09-26 12:00:43 -07:00
Alex Vandiver 9a1f78db22 thumbnail: Support checking for images from streaming sources.
We may not always have trivial access to all of the bytes of the
uploaded file -- for instance, if the file was uploaded previously, or
by some other process.  Downloading the entire image in order to check
its headers is an inefficient use of time and bandwidth.

Adjust `maybe_thumbnail` and dependencies to potentially take a
`pyvips.Source` which supports streaming data from S3 or disk.  This
allows making the ImageAttachment row, if deemed appropriate, based on
only a few KB of data, and not the entire image.
2024-09-17 12:51:30 -07:00
Alex Vandiver 903bfb31e6 upload: Provide the frontend with the less-modified filename. 2024-09-09 12:40:17 -07:00
Alex Vandiver b4764f49df upload: Download files with their original names.
Fixes: #29491.
2024-09-09 12:40:17 -07:00
Alex Vandiver 4351cc5914 thumbnail: Move get_image_thumbnail_path and split_thumbnail_path. 2024-07-18 13:50:28 -07:00
Alex Vandiver 2e38f426f4 upload: Generate thumbnails when images are uploaded.
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.

A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
2024-07-16 13:22:15 -07:00
Anders Kaseorg 0fa5e7f629 ruff: Fix UP035 Import from `collections.abc`, `typing` instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg 531b34cb4c ruff: Fix UP007 Use `X | Y` for type annotations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg e08a24e47f ruff: Fix UP006 Use `list` instead of `List` for type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Alex Vandiver 262689da76 thumbnail: Fix MAX_EMOJI_GIF_FILE_SIZE_BYTES check to be post-resize.
This check was intended to check the post-resized image size, not the
pre-resized image.
2024-07-12 13:26:47 -07:00
Alex Vandiver 54f2fabac0 thumbnail: Still emoji are always pngs. 2024-07-12 13:26:47 -07:00
Alex Vandiver f6b99171ce emoji: Derive the file extension from a limited set of content-types.
We thumbnail and serve emoji with the same format as they were
uploaded.  However, we preserved the original extension, which might
mismatch with the provided content-type.

Limit the content-type to a subset which is both (a) an image format
we can thumbnail, and (b) a media format which is widely-enough
supported that we are willing to provide it to all browsers.  This
prevents uploading a `.tiff` emoji, for instance.

Based on this limited content-type, we then reverse to find the
reasonable extension to use when storing it.  This is particularly
important because the local file storage uses the file extension to
choose what content-type to re-serve the emoji as.

This does nothing for existing emoji, which may have odd or missing
file extensions.
2024-07-12 13:26:47 -07:00
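The content-type-to-extension step described in f6b99171ce above can be sketched roughly as follows. The allowed set, the extensions, and the name `emoji_extension` are assumptions made for illustration; the commit message only says that the set is limited to widely supported formats that can be thumbnailed, and that `.tiff` is excluded.

```python
# Assumed allow-list for illustration; the real set lives in the Zulip code.
ALLOWED_EMOJI_CONTENT_TYPES = {
    "image/gif": ".gif",
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/webp": ".webp",
}


def emoji_extension(content_type: str) -> str:
    """Map a client-supplied content-type to the extension used for storage.

    Hypothetical helper: parameters such as "; charset=..." are dropped
    and the comparison is case-insensitive.
    """
    base_type = content_type.split(";", 1)[0].strip().lower()
    if base_type not in ALLOWED_EMOJI_CONTENT_TYPES:
        raise ValueError(f"Unsupported emoji content-type: {content_type}")
    return ALLOWED_EMOJI_CONTENT_TYPES[base_type]
```

Deriving the extension from the validated content-type, rather than keeping whatever extension the uploaded filename had, keeps the stored name and the served content-type consistent.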
Alex Vandiver 62a0611ddb emoji: Pass down content-type, rather than guessing from extension. 2024-07-12 13:26:47 -07:00
Alex Vandiver 4bc563128e thumbnail: Use a consistent set of supported image types. 2024-07-11 07:31:39 -07:00
Alex Vandiver ff90e5355f upload: Pass down content-type of realm icon/logo to backend.
This saves having to try to re-derive it from the file extension,
which may be ".original" in some cases.
2024-07-11 07:31:39 -07:00
Alex Vandiver 79f858b4b8 upload: Pass bytes to create_attachment.
This will be used to analyze the bytes for image metadata.
2024-07-07 14:40:07 -07:00
Alex Vandiver f97a30f240 upload: Reorder arguments to parallel upload_message_attachment. 2024-07-07 14:40:07 -07:00
Alex Vandiver f52a93bc14 upload: Stop requiring callers pass in the file size.
This can be calculated because we have the contents.
2024-07-07 14:40:07 -07:00
Alex Vandiver 58a9fe9af1 upload: Drop unused parameters to upload_message_attachment. 2024-07-07 14:40:07 -07:00
Alex Vandiver 0a296b2a6e upload: Start storing content-type for new uploads. 2024-07-07 14:40:07 -07:00
Alex Vandiver e29a455b2d avatars: Encode version into the filename.
Hash the salt, user-id, and now avatar version into the filename.
This allows the URL contents to be immutable, and thus to be marked as
immutable and cacheable.  Since avatars are served unauthenticated,
hashing with a server-side salt makes the current and past avatars not
enumerable.

This requires plumbing the current (or future) avatar version through
various parts of the upload process.

Since this already requires a full migration of current avatars, also
take the opportunity to fix the missing `.png` on S3 uploads (#12852).

We switch from SHA-1 to SHA-256, but truncate it such that avatar URL
data does not substantially increase in size.

Fixes: #12852.
2024-07-07 14:40:07 -07:00
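A minimal sketch of the naming scheme described in e29a455b2d above, assuming a particular input layout and truncation length (the commit message specifies neither). `avatar_filename_hash` is a hypothetical name, not the function in the Zulip code.

```python
import hashlib


def avatar_filename_hash(salt: str, user_id: int, version: int, length: int = 40) -> str:
    """Hash salt, user id, and avatar version into a truncated hex digest.

    Assumptions for illustration: the ':'-joined layout and the 40-character
    truncation are made up; the commit only states that SHA-256 is used and
    that the result is truncated.
    """
    payload = f"{user_id}:{version}:{salt}".encode()
    return hashlib.sha256(payload).hexdigest()[:length]


old_name = avatar_filename_hash("server-side-secret", user_id=17, version=1)
new_name = avatar_filename_hash("server-side-secret", user_id=17, version=2)
```

Because the version is part of the hashed input, uploading a new avatar yields a new filename, so the content behind any given URL never changes and can be cached as immutable. Without the salt, a client could compute the filename for any user id and version, which is what the commit means by enumerable.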
Sahil Batra 5ef14c3a8e users: Fix uploading user avatars.
Due to recent refactoring in 9fb03cb2c7, a user could not upload an
avatar if the server used the local upload backend and there was
already an avatar file for that user.

This commit fixes it by checking whether a file already exists only
when importing, and not when the user is actually trying to change
the avatar.

Fixes #30676.
2024-07-02 13:26:21 -07:00
Alex Vandiver 2eaf098c5d upload: Content-type is always defined. 2024-06-26 16:43:11 -07:00
Alex Vandiver 17fb23746f upload: Move methods into zerver.lib.upload from .base. 2024-06-26 16:43:11 -07:00
Alex Vandiver c826d80061 upload: Factor out common code into zerver.lib.upload. 2024-06-26 16:43:11 -07:00
Alex Vandiver 08b24484d1 upload: Remove redundant acting_user_profile argument.
This argument, effectively added in 9eb47f108c, was never actually
used.
2024-06-26 16:43:11 -07:00
Alex Vandiver fb929ca218 thumbnailing: Remove unnecessary third return value from resize_emoji. 2024-06-26 16:43:09 -07:00
Alex Vandiver b14a33c659 thumbnailing: Switch to libvips, from PIL/pillow.
This is done in as much of a drop-in fashion as possible.  Note that
libvips does not support animated PNGs[^1], and as such this
conversion removes support for them as emoji; however, libvips
includes support for webp images, which future commits will take
advantage of.

This removes the MAX_EMOJI_GIF_SIZE limit, since that existed to work
around bugs in Pillow.  MAX_EMOJI_GIF_FILE_SIZE_BYTES is fixed to
actually be 128KiB (not 128MiB, as it actually was), and is counted
_after_ resizing, since the point is to limit the amount of data
transfer to clients.

[^1]: https://github.com/libvips/libvips/discussions/2000
2024-06-26 16:42:57 -07:00
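The unit mix-up that b14a33c659 above corrects, and the post-resize check, amount to the arithmetic below. This is a sketch only: `emoji_within_limit` is a hypothetical helper, and whether the real comparison is inclusive is an assumption.

```python
KIB = 1024
MIB = 1024 * KIB

# Intended limit per the commit message: 128 KiB, not 128 MiB.
MAX_EMOJI_GIF_FILE_SIZE_BYTES = 128 * KIB
MISTAKEN_LIMIT_BYTES = 128 * MIB


def emoji_within_limit(resized_image: bytes) -> bool:
    """Check the size of the image *after* resizing.

    Hypothetical helper; the inclusive comparison is an assumption.
    """
    return len(resized_image) <= MAX_EMOJI_GIF_FILE_SIZE_BYTES
```

The two constants differ by a factor of 1024, so the mistaken value effectively never rejected anything. Checking the resized bytes matches the stated goal of limiting what is transferred to clients.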
Alex Vandiver 9fb03cb2c7 upload: Factor out common avatar logic. 2024-06-26 16:38:01 -07:00
Alex Vandiver d92993c972 upload: Factor out common emoji logic. 2024-06-26 16:38:01 -07:00
Anders Kaseorg fb4ad1422e mime_types: Add audio and image types missing from Python library.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-06-20 15:29:20 -07:00
Prakhar Pratyush 508c5611d1 claim_attachment: Remove the stale 'user_profile' parameter.
This commit removes the unused 'user_profile' parameter
of the 'claim_attachment' function.
2024-05-21 09:24:43 -07:00
Vector73 8ab526a25a models: Replace realm.uri with realm.url.
In #23380, we are changing all occurrences of uri with url in order to
follow the latest URL standard. Previous PRs #25038 and #25045 have
replaced the occurrences of uri that have no direct relation with realm.

This commit changes just the model property, which has no API
compatibility concerns.
2024-05-08 11:12:43 -07:00
Alex Vandiver 043d3127eb upload: Only load S3 backend (and thus boto3) if necessary.
Because loading boto3 is so slow, this saves a significant amount of
time (0.3s or so) in process startup on servers which are not using
the S3 file storage backend.
2024-04-15 13:12:51 -07:00
Anders Kaseorg 3853fa875a python: Consistently use from…import for urllib.parse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00
Anders Kaseorg 53e8c0c497 ruff: Fix E721 Do not compare types, use `isinstance()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-17 17:05:34 -07:00
Mateusz Mandera 414658fc8e scheduled_message: Handle attachments properly.
Fixes #25414.

We add Attachment.scheduled_messages relation to track ScheduledMessages
which reference the attachment.

The import bits can be done after merging this, by updating #25345.
2023-05-08 09:56:02 -07:00
Alex Vandiver e408f069fe uploads: Add a method to copy attachment contents out. 2023-04-07 09:13:48 -07:00
Alex Vandiver 3bf3f47b49 delete_old_unclaimed_attachments: Add flag to clean up storage.
Actions like deleting realms may leave unreferenced uploads in the
attachment storage backend.

Fix these by walking the complete contents of the attachment storage
backend, and removing files which are no longer present in the
database.  This may take quite some time, as it is necessarily O(n) in
the number of files uploaded to the system.
2023-03-02 16:36:19 -08:00
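The storage walk described in 3bf3f47b49 above can be sketched as a single pass over the backend's contents, compared against the set of paths the database knows about. The names `find_orphaned_paths`, `stored_paths`, and `known_path_ids` are hypothetical; how the real command lists storage or queries the database is not shown here.

```python
from collections.abc import Iterable, Iterator


def find_orphaned_paths(
    stored_paths: Iterable[str], known_path_ids: set[str]
) -> Iterator[str]:
    """Yield every stored path that has no corresponding database row.

    Hypothetical helper: each set lookup is O(1) on average, so the whole
    walk is linear in the number of stored files.
    """
    for path in stored_paths:
        if path not in known_path_ids:
            yield path


# Example: two files remain in storage after their rows were deleted.
in_storage = ["2/ab/cat.png", "2/cd/notes.txt", "3/ef/old.pdf", "3/gh/gone.zip"]
in_database = {"2/ab/cat.png", "2/cd/notes.txt"}
orphans = list(find_orphaned_paths(in_storage, in_database))
```

Taking an iterable rather than a list means the storage listing can be streamed, which matters when the backend holds a very large number of files.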
Alex Vandiver c9d1755a12 delete_realm: Optimize attachment cleanup by batching. 2023-03-02 16:36:19 -08:00
Alex Vandiver b31a6dc56c upload: Reorder functions into logical groupings. 2023-03-02 16:36:19 -08:00
Alex Vandiver 04e7621668 upload: Rename upload_message_image_from_request.
The table is named Attachment, and not all of them are images.
2023-03-02 16:36:19 -08:00
Alex Vandiver bd80c048be upload: Rename delete_message_image to use word "attachment".
The table is named Attachment, and not all of them are images.
2023-03-02 16:36:19 -08:00
Alex Vandiver 567d1d54e7 upload: Rename upload_message_file to use word "attachment".
For consistency with the table, which is named Attachment.
2023-03-02 16:36:19 -08:00
Alex Vandiver 862e3bb80a avatars: Use a helper method, rather than use upload_backend directly.
Importing `upload_backend` directly means that in testing it must also
be mocked where it is imported, in order to correctly test the right
backend.  Since `get_avatar_url` is part of the public
`ZulipUploadBackend` API, add another helper method to call that.
2023-01-09 18:23:58 -05:00
Alex Vandiver 7c0d414aff uploads: Split out S3 and local file backends into separate files.
The uploads file is large, and conceptually the S3 and local-file
backends are separable.
2023-01-09 18:23:58 -05:00