zulip/zerver/lib
Mateusz Mandera da4443f392 thumbnail: Make thumbnailing work with data import.
We didn't have thumbnailing for images coming from data import and this
commit adds the functionality.

There are a few fundamental issues that the implementation needs to
solve.

1. The images come from an untrusted source and therefore we don't want
   to just pass them through to thumbnailing without checking. For that
   reason, we cannot just import ImageAttachment rows from the export
   data, even for zulip=>zulip imports.
   The right way to process images is to pass them to maybe_thumbnail(),
   which runs libvips_check_image() on them to verify we're okay with
   thumbnailing, creates ImageAttachment rows for them and sends them
   to the thumbnailing queue worker. This approach lets us handle both
   zulip=>zulip and 3rd party=>zulip imports in the same way.

2. There is a somewhat circular dependency between the Message,
   Attachment and ImageAttachment import process:

- ImageAttachments would ideally be created after importing
  Attachments, but they need to already exist at the time of Message
  import. Otherwise, the markdown processor doesn't know it has to add
  HTML for image previews to messages that reference images. This would
  mean that messages imported from 3rd party tools don't get image
  previews.
- Attachments only get created after Message import however, due to the
  many-to-many relationship between Message and Attachment.

This is solved by fixing up some data of Attachments pre-emptively, such
as the path_ids. This gives us the necessary information for creating
ImageAttachments before importing Messages.
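
The ordering described above can be sketched with plain in-memory data.
This is only an illustration under stated assumptions, not the code in
import_realm.py: `ImportState`, `fix_path_id`, `looks_like_safe_image`
and `import_realm_sketch` are hypothetical names, the PNG signature check
is a stand-in for libvips_check_image(), and the "realm id prefix" layout
of path_ids is assumed for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ImportState:
    # path_ids that have a (simulated) ImageAttachment row
    image_attachments: set[str] = field(default_factory=set)
    # path_ids handed to the (simulated) thumbnailing queue worker
    thumbnail_queue: list[str] = field(default_factory=list)
    # message id -> (simulated) rendered content
    rendered: dict[int, str] = field(default_factory=dict)


def looks_like_safe_image(data: bytes) -> bool:
    # Stand-in for a real image check: only accepts the PNG signature.
    return data.startswith(b"\x89PNG\r\n\x1a\n")


def fix_path_id(old_path_id: str, new_realm_id: int) -> str:
    # Assumed layout: "<realm id>/<rest>"; swap in the new realm id.
    _, rest = old_path_id.split("/", 1)
    return f"{new_realm_id}/{rest}"


def import_realm_sketch(
    attachments: dict[str, bytes],
    messages: dict[int, str],
    new_realm_id: int,
) -> ImportState:
    state = ImportState()

    # Step 1: fix up path_ids before any Attachment rows exist.
    fixed = {fix_path_id(p, new_realm_id): d for p, d in attachments.items()}

    # Step 2: validate untrusted files, record image rows, enqueue thumbnailing.
    for path_id, data in fixed.items():
        if looks_like_safe_image(data):
            state.image_attachments.add(path_id)
            state.thumbnail_queue.append(path_id)

    # Step 3: render messages; previews are possible because step 2 already ran.
    for message_id, old_path_id in messages.items():
        path_id = fix_path_id(old_path_id, new_realm_id)
        kind = "preview" if path_id in state.image_attachments else "link"
        state.rendered[message_id] = f"{kind}:{path_id}"

    # Step 4 (not modelled): create Attachment rows and their message links.
    return state
```

If step 2 ran after step 3, every message in this sketch would render as
a plain link, which is the missing-preview problem described above.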

While we generate ImageAttachment rows synchronously, the actual
thumbnailing job is sent to the queue worker. Theoretically, the worker
could be very backlogged and not process the thumbnails anytime soon.
This is fine: if the app is loaded and tries to display a message with
such a not-yet-generated thumbnail, the code in `serve_file` will
generate the thumbnails synchronously on the fly and the user will see
the image preview displayed normally. See:

1b47134d0d/zerver/views/upload.py (L333-L342)
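
The fallback described above amounts to "generate on first request if
the worker has not done so yet". A minimal sketch of that pattern,
assuming a dict as the thumbnail store and a caller-supplied generator;
`serve_thumbnail_sketch` is a hypothetical name and is not the
`serve_file` implementation referenced above:

```python
from collections.abc import Callable


def serve_thumbnail_sketch(
    path_id: str,
    store: dict[str, bytes],
    generate: Callable[[str], bytes],
) -> bytes:
    # If the queue worker has not produced the thumbnail yet,
    # generate it synchronously and keep it for later requests.
    if path_id not in store:
        store[path_id] = generate(path_id)
    return store[path_id]
```

A second request for the same path_id is served from the store, so the
synchronous cost is paid at most once per thumbnail in this sketch.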
2024-10-24 10:32:51 -07:00
markdown python: Avoid deprecated cgi module, removed in Python 3.13. 2024-10-22 10:05:01 -07:00
upload thumbnail: Make thumbnailing work with data import. 2024-10-24 10:32:51 -07:00
url_preview python: Avoid deprecated cgi module, removed in Python 3.13. 2024-10-22 10:05:01 -07:00
webhooks integrations: Add support for release events to GitLab integration. 2024-09-16 09:26:20 -07:00
__init__.py
addressee.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
alert_words.py alert_words: Update remove_alert_word codepath to send event on commit. 2024-07-31 22:33:52 -07:00
async_utils.py mypy: Enable new error explicit-override. 2023-10-12 12:28:41 -07:00
attachments.py upload: Explicitly return a bool and the Attachment object. 2024-09-09 12:40:17 -07:00
avatar.py avatar: Add checks to make sure system bot avatar exists. 2024-10-23 10:35:42 -07:00
avatar_hash.py avatars: Encode version into the filename. 2024-07-07 14:40:07 -07:00
bot_config.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
bot_lib.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
bot_storage.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
bulk_create.py audit-log: Move user group event types to AuditLogEventType enum. 2024-09-09 11:50:13 -07:00
cache.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
cache_helpers.py analytics: Pass subgroup=None to improve indexing. 2024-10-02 14:11:44 -04:00
camo.py
ccache.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
compatibility.py ruff: Fix PLR6104 Use `+=` to perform an augmented assignment directly. 2024-07-14 13:49:51 -07:00
context_managers.py ruff: Fix SIM117 Use a single `with` statement with multiple contexts. 2024-07-14 13:48:32 -07:00
create_user.py api: Improve handling of delivery_email in the GET /users/{email} API. 2024-10-08 18:01:49 -07:00
data_types.py ruff: Fix UP038 Use `X | Y` in `isinstance` call instead of `(X, Y)`. 2024-07-13 22:28:22 -07:00
db.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
db_connections.py db: Split reset_queries into a new module zerver.lib.db_connections. 2024-04-17 16:49:03 -07:00
debug.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
default_streams.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
dev_ldap_directory.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
digest.py audit-log: Move subscription event types to AuditLogEventType enum. 2024-09-09 11:50:13 -07:00
display_recipient.py mypy: Remove use of ValuesQuerySet and QuerySetAny. 2024-08-24 17:30:41 -07:00
domains.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
drafts.py drafts: Update do_delete_draft to send event on commit. 2024-08-12 12:16:14 -07:00
email_mirror.py upload: Provide the frontend with the less-modified filename. 2024-09-09 12:40:17 -07:00
email_mirror_helpers.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
email_notifications.py onboarding: Use "Moving to Zulip" guide in emails & Welcome bot message. 2024-09-30 11:58:31 -07:00
email_validation.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
emoji.py emoji: Use a non-predictable filename. 2024-07-12 13:26:47 -07:00
emoji_utils.py emoji: Match emoji sequences in markdown. 2023-08-23 16:18:15 -07:00
event_schema.py realm_export: Add realm_export_consent feature to API. 2024-10-18 14:08:20 -07:00
events.py realm_export: Add realm_export_consent feature to API. 2024-10-18 14:08:20 -07:00
exceptions.py user_groups: Include settings and supergroups in error response. 2024-10-01 09:45:33 -07:00
export.py thumbnail: Make thumbnailing work with data import. 2024-10-24 10:32:51 -07:00
external_accounts.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
fix_unreads.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
generate_test_data.py ruff: Fix PLR6104 Use `+=` to perform an augmented assignment directly. 2024-07-14 13:49:51 -07:00
github.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
home.py user_groups: Handle deactivated groups in webapp. 2024-09-18 13:41:13 -07:00
html_diff.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
html_to_text.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
i18n.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
import_realm.py thumbnail: Make thumbnailing work with data import. 2024-10-24 10:32:51 -07:00
initial_password.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
integrations.py integrations: Lazily load webhook integrations. 2024-09-24 18:17:52 -07:00
invites.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
logging_util.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
management.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
mdiff.py node_tests: Move to web/tests. 2023-02-23 16:04:17 -08:00
mention.py mention: Do not include deactivated users in group mention data. 2024-10-10 11:37:44 -07:00
message.py requirements: Upgrade Python requirements. 2024-10-20 18:16:27 -07:00
message_cache.py edit_history: Remove 'prev_rendered_content_version' field. 2024-08-29 15:37:12 -07:00
migrate.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
mime_types.py emoji: Derive the file extension from a limited set of content-types. 2024-07-12 13:26:47 -07:00
mobile_auth_otp.py ruff: Fix B905 `zip()` without an explicit `strict=` parameter. 2024-07-13 22:28:22 -07:00
muted_users.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
name_restrictions.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
narrow.py message_fetch: Add message_ids parameter to /messages request. 2024-10-07 11:00:40 -07:00
narrow_helpers.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
narrow_predicate.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
notes.py ruff: Fix FURB180 Use of `metaclass=abc.ABCMeta`. 2024-07-14 13:53:40 -07:00
notification_data.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
onboarding.py onboarding: Use "Moving to Zulip" guide in emails & Welcome bot message. 2024-09-30 11:58:31 -07:00
onboarding_steps.py topic: Add a first-time explanation for "Resolve topic". 2024-10-09 18:12:55 -07:00
outgoing_http.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
outgoing_webhook.py ruff: Fix FURB180 Use of `metaclass=abc.ABCMeta`. 2024-07-14 13:53:40 -07:00
partial.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
per_request_cache.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
presence.py presence: Add history_limit_days param to the API. 2024-09-10 13:15:35 -07:00
profile.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
push_notifications.py settings: Rework how push notifications service is configured. 2024-07-17 17:14:06 -07:00
pysa.py
query_helpers.py mypy: Remove use of ValuesQuerySet and QuerySetAny. 2024-08-24 17:30:41 -07:00
queue.py ruff: Fix FURB180 Use of `metaclass=abc.ABCMeta`. 2024-07-14 13:53:40 -07:00
rate_limiter.py ruff: Fix B905 `zip()` without an explicit `strict=` parameter. 2024-07-13 22:28:22 -07:00
realm_description.py
realm_icon.py settings: Make DEFAULT_LOGO_URI/DEFAULT_AVATAR_URI use staticfiles. 2023-02-14 17:17:06 -05:00
realm_logo.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
recipient_parsing.py scheduled_messages: Migrate to typed_endpoint. 2024-08-20 10:03:22 -07:00
recipient_users.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
redis_utils.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
remote_server.py settings: Rework how push notifications service is configured. 2024-07-17 17:14:06 -07:00
request.py endpoints: Remove the has_request_variables decorator. 2024-09-05 16:02:12 -07:00
response.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
rest.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
retention.py retention: Limit number of ids passed to db in delete messages query. 2024-09-20 09:31:21 -07:00
safe_session_cached_db.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
scheduled_messages.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
scim.py do_change_user_delivery_email: Add acting_user kwarg. 2024-09-30 12:00:14 -07:00
scim_filter.py zerver: Replace uri with url in local variables and comments. 2024-07-14 22:30:28 -07:00
send_email.py custom_email: Add manage_preferences block to the plaintext version. 2024-09-10 09:36:56 -07:00
server_initialization.py avatar: Use fixed avatars for system bots. 2024-10-17 15:47:17 -07:00
sessions.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
singleton_bmemcached.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
soft_deactivation.py audit-log: Move subscription event types to AuditLogEventType enum. 2024-09-09 11:50:13 -07:00
sounds.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
sqlalchemy_utils.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
storage.py storage: Reformat hashed medium static avatar files. 2024-10-23 10:35:42 -07:00
stream_color.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
stream_subscription.py mypy: Remove use of ValuesQuerySet and QuerySetAny. 2024-08-24 17:30:41 -07:00
stream_topic.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
stream_traffic.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
streams.py settings: Handle guests separately for group-based settings. 2024-09-18 11:51:11 -07:00
string_validation.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
subdomains.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
subscription_info.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
templates.py help-links: Limit billing related relative gear menu links. 2024-09-30 11:35:45 -07:00
test_classes.py tests: Extract upload_image helpers from test_markdown_thumbnail. 2024-10-24 10:32:51 -07:00
test_console_output.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
test_data.source.txt
test_fixtures.py text_fixtures: Fix buggy skip-checks placement. 2024-09-24 15:00:46 -07:00
test_helpers.py tusd: Set metadata correctly in S3. 2024-09-26 12:00:43 -07:00
test_runner.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
tex.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
thumbnail.py thumbnail: Make thumbnailing work with data import. 2024-10-24 10:32:51 -07:00
timeout.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
timestamp.py python: Consistently use from…import for datetime. 2023-12-05 12:01:18 -08:00
timezone.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
topic.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
topic_sqlalchemy.py narrow: Migrate legacy SQLAlchemy select syntax. 2024-07-16 14:50:30 -07:00
transfer.py upload: Rename "upload_image_to_s3"; it is not only for images. 2024-09-09 12:40:17 -07:00
typed_endpoint.py endpoints: Remove the has_request_variables decorator. 2024-09-05 16:02:12 -07:00
typed_endpoint_validators.py user_settings: Migrate to typed_endpoint. 2024-07-31 17:10:06 -07:00
types.py custom_profile_fields: Add "editable_by_user" setting. 2024-09-23 18:09:38 -07:00
url_decoding.py urls: Generate narrow links in backend with "channel" operator. 2024-10-11 17:00:23 -07:00
url_encoding.py urls: Generate narrow links in backend with "channel" operator. 2024-10-11 17:00:23 -07:00
url_redirects.py help: Create "Configure send message keys" article. 2024-10-10 14:33:38 -07:00
user_agent.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
user_counts.py ruff: Fix UP006 Use `list` instead of `List` for type annotation. 2024-07-13 22:28:22 -07:00
user_groups.py user_groups: Add API support to add subgroups during group creation. 2024-10-17 14:27:21 -07:00
user_message.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
user_status.py ruff: Fix UP007 Use `X | Y` for type annotations. 2024-07-13 22:28:22 -07:00
user_topics.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
users.py user_groups: Do not allow updating memberships of deactivated users. 2024-10-10 11:37:44 -07:00
utils.py ruff: Fix UP035 Import from `collections.abc`, `typing` instead. 2024-07-13 22:28:22 -07:00
validator.py endpoints: Remove the has_request_variables decorator. 2024-09-05 16:02:12 -07:00
widget.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
zcommand.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
zephyr.py python: Simplify with str.removeprefix, str.removesuffix. 2024-09-03 12:30:16 -07:00
zulip_update_announcements.py audit-log: Move realm event types to AuditLogEventType enum. 2024-09-09 11:50:13 -07:00