Commit Graph

391 Commits

Author SHA1 Message Date
Mateusz Mandera da4443f392 thumbnail: Make thumbnailing work with data import.
We didn't have thumbnailing for images coming from data import and this
commit adds the functionality.

There are a few fundamental issues that the implementation needs to
solve.

1. The images come from an untrusted source and therefore we don't want
   to just pass them through to thumbnailing without checking. For that
   reason, we cannot just import ImageAttachment rows from the export
   data, even for zulip=>zulip imports.
   The right way to process images is to pass them to maybe_thumbail(),
   which runs libvips_check_image() on them to verify we're okay with
   thumbnailing, creates ImageAttachment rows for them and sends them
   to the thumbnailing queue worker. This approach lets us handle both
   zulip=>zulip and 3rd party=>zulip imports in the same way,

2. There is a somewhat circular dependency between the Message,
   Attachment and ImageAttachment import process:

- ImageAttachments would ideally be created after importing
  Attachments, but they need to already exist at the time of Message
  import. Otherwise, the markdown processor doesn't know it has to add
  HTML for image previews to messages that reference images. This would
  mean that messages imported from 3rd party tools don't get image
  previews.
- Attachments only get created after Message import however, due to the
  many-to-many relationship between Message and Attachment.

This is solved by fixing up some data of Attachments pre-emptively, such
as the path_ids. This gives us the necessary information for creating
ImageAttachments before importing Messages.

While we generate ImageAttachment rows synchronously, the actual
thumbnailing job is sent to the queue worker. Theoretically, the worker
could be very backlogged and not process the thumbnails anytime soon.
This is fine - if the app is loaded and tries to display a message with
such a not-yet-generated thumbnail, the code in `serve_file` will
generate the thumbnails synchronously on the fly and the user will see
the image preview displayed normally. See:

1b47134d0d/zerver/views/upload.py (L333-L342)
2024-10-24 10:32:51 -07:00
Prakhar Pratyush 55f97cd06f realm_export: Add support to create full data export via /export/realm.
Earlier, only public data export was possible via `POST /export/realm`
endpoint. This commit adds support to create full data export with
member consent via that endpoint.

Also, this adds a 'export_type' parameter to the dictionaries
in `realm_export` event type and `GET /export/realm` response.

Fixes part of #31201.
2024-10-11 13:20:42 -07:00
Prakhar Pratyush 07dcee36b2 export_realm: Add RealmExport model.
Earlier, we used to store the key data related to realm exports
in RealmAuditLog. This commit adds a separate table to store
those data.

It includes the code to migrate the concerned existing data in
RealmAuditLog to RealmExport.

Fixes part of #31201.
2024-10-04 12:06:35 -07:00
Prakhar Pratyush 5d9eb4e358 realm_export: Save stats in '.json' format instead of '.txt'.
This commit updates code to store the realm export stats in
json format instead of plain text.

This will help in storing the stats as JsonField in RealmExport table.
2024-10-04 12:06:35 -07:00
Anders Kaseorg 184c0203f3 upload: Lazily import boto3.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-09-24 16:38:37 -07:00
Vector73 9e4e85e140 saved_snippets: Add backend for saved snippets.
Part of #31227.
2024-09-24 15:27:58 -07:00
Prakhar Pratyush 65f465562f export_realm: Remove the 'react on consent message' approach.
For exporting full with consent:

* Earlier, a message advertising users to react with thumbs up
  was sent and later used to determine the users who consented.

* Now, we no longer need to send such a message. This commit
  updates the logic to use `allow_private_data_export` user-setting
  to determine users who consented.

Fixes part of #31201.
2024-09-24 14:32:42 -07:00
Alex Vandiver ce0df00e44 export: Notify all realm admins on realm export. 2024-09-23 10:02:43 -07:00
Alex Vandiver 7afe6800f7 export: Use relative paths, include more data. 2024-09-23 10:02:43 -07:00
Alex Vandiver e125ad823d exports: Add a separate bucket for realm exports.
This allows finer-grained access control and auditing.  The links
generated also expire after one week, and the suggested configuration
is that the underlying data does as well.

Co-authored-by: Prakhar Pratyush <prakhar@zulip.com>
2024-09-20 15:43:49 -07:00
Alex Vandiver 91ac5c3c8b export: Log before the compression step, which can be slow. 2024-09-20 15:43:49 -07:00
sujal shah 614caf111e user_groups: Add `creator` and `date_created` field in user groups.
This commit introduced 'creator' and 'date_created'
fields in user groups, allowing users to view who
created the groups and when.

Both fields can be null for groups without creator data.
2024-09-13 18:44:58 -07:00
Lauryn Menard d2c32f23db audit-log: Move realm event types to AuditLogEventType enum.
Event types moved: REALM_DEACTIVATED, REALM_REACTIVATED, REALM_SCRUBBED
REALM_PLAN_TYPE_CHANGED, REALM_LOGO_CHANGED, REALM_EXPORTED
REALM_PROPERTY_CHANGED, REALM_ICON_SOURCE_CHANGED, REALM_DISCOUNT_CHANGED
REALM_SPONSORSHIP_APPROVED, REALM_BILLING_MODALITY_CHANGED
REALM_REACTIVATION_EMAIL_SENT, REALM_SPONSORSHIP_PENDING_STATUS_CHANGED
REALM_SUBDOMAIN_CHANGED
2024-09-09 11:50:13 -07:00
Mateusz Mandera 5476340b52 import: Export and import .original emoji files correctly.
The export tool was only exporting the already-thumbnailed emoji file,
omitting the original one. Now we make sure to export the .original file
too, like we do for avatars, and make the import tool process it
directly, to thumbnail it directly and generate a still in the case of
animated emojis.

Otherwise, the imported realm wouldn't have the <emoji>.png.original
file that we generally expect to have accessible, and stills for
animated emojis were completely missing.
2024-08-21 16:30:19 -07:00
roanster007 7b3e163d55 refactor: Rename `huddle` to `direct_message_group` in non api files.
This commit completes rename of "huddle" to "direct_message_group"
in all the non API files.

Part of #28640
2024-07-31 23:25:56 -07:00
Alex Vandiver 2e38f426f4 upload: Generate thumbnails when images are uploaded.
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.

A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
2024-07-16 13:22:15 -07:00
Prakhar Pratyush 7d379e00b0 export: Fix 'OnboardingUserMessage' table not being exported.
Earlier, the export tool was logging a warning:
"??? NO DATA EXPORTED FOR TABLE zerver_onboardingusermessage!!!"

This bug was due to not configuring a Config object for
'OnboardingUserMessage' in 'get_realm_config()'.

This commit fixes the bug to export the table properly.
2024-07-16 09:36:02 -07:00
Anders Kaseorg 1e9b6445a9 ruff: Fix PLR6104 Use `+=` to perform an augmented assignment directly.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:49:51 -07:00
Anders Kaseorg 0fa5e7f629 ruff: Fix UP035 Import from `collections.abc`, `typing` instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg 531b34cb4c ruff: Fix UP007 Use `X | Y` for type annotations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Anders Kaseorg e08a24e47f ruff: Fix UP006 Use `list` instead of `List` for type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Alex Vandiver d5a4941691 django: Switch to .alias() instead .annotate() where possible.
When using the sub-expression purely for filtering, and not for
accessing the value in the resultset, .alias() is potentially faster
since it does not pull the value in as well.
2024-07-11 09:26:23 -07:00
roanster007 02d0566dc5 refactor: Rename `Huddle` Django model class to `DirectMessageGroup`.
This commit renames the "Huddle" Django model class to
"DirectMessageGroup", while maintaining the same table --
"zerver_huddle".

Fixes part of #28640.
2024-07-07 21:31:30 -07:00
Alex Vandiver e29a455b2d avatars: Encode version into the filename.
Hash the salt, user-id, and now avatar version into the filename.
This allows the URL contents to be immutable, and thus to be marked as
immutable and cacheable.  Since avatars are served unauthenticated,
hashing with a server-side salt makes the current and past avatars not
enumerable.

This requires plumbing the current (or future) avatar version through
various parts of the upload process.

Since this already requires a full migration of current avatars, also
take the opportunity to fix the missing `.png` on S3 uploads (#12852).

We switch from SHA-1 to SHA-256, but truncate it such that avatar URL
data does not substantially increase in size.

Fixes: #12852.
2024-07-07 14:40:07 -07:00
Prakhar Pratyush fb836a4f0a onboarding: Add 'OnboardingUserMessage' model.
This prep commit adds a new OnboardingUserMessage model
that will be used to mark the new onboarding messages
for new users as unread and the first message of each
onboarding topic as starred.

This table won't include the old onboarding messages.
2024-07-05 15:39:32 -07:00
roanster007 52692a6448 refactor: Rename `huddle` to `direct_message_group` in non API.
This commit performs a sweep on the first batch of non API
files to rename "huddle" to "direct_message_group`.

It also renames variables and methods of type -
"huddle_message" to "group_direct_message".

This is a part of #28640
2024-07-04 07:56:31 -07:00
Mateusz Mandera b1d50b511c presence: Handle PresenceSequence in the export/import system. 2024-06-02 22:08:28 -07:00
Prakhar Pratyush cc793612f0 export: Create REALM_EXPORTED audit log for exports via shell.
Earlier, we were creating RealmAuditLog with REALM_EXPORTED
event_type when export of public data took place via organization
settings panel.

We were not creating the audit log when the export was executed
via shell i.e './manage.py export'.

This commit creates the audit log in that case too. It will
help during import to distinguish readily between imports
from another Zulip server vs imports from another product.
2024-05-08 16:16:37 -07:00
Sahil Batra c9a7c13ea7 import: Add code to support import and export of NamedUserGroups. 2024-04-26 17:03:09 -07:00
Sahil Batra 71b601cf5a groups: Create NamedUserGroup table. 2024-04-26 17:03:09 -07:00
Anders Kaseorg a82a3eb4d7 ruff: Fix UP033 Use `@functools.cache`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-01 18:32:52 -07:00
John Lu a5cf0ec526
refactor: Replace HUDDLE with DIRECT_MESSAGE_GROUP.
Replaced HUDDLE attribute with DIRECT_MESSAGE_GROUP using VS Code search,
part of a general renaming of the object class.

Fixes part of #28640.

Co-authored-by: JohnLu2004 <JohnLu10212004@gmail.com>
2024-03-21 16:39:33 -07:00
Anders Kaseorg 3b114c516c ruff: Fix PLR2044 Line with empty comment.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-03-01 09:30:04 -08:00
Anders Kaseorg 570f3dd447 python: Reformat with Ruff formatter.
https://docs.astral.sh/ruff/formatter/

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-29 17:07:16 -08:00
Anders Kaseorg 53e80c41ea ruff: Fix SIM113 Use `enumerate()` for index variable in `for` loop.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-02 10:30:45 -08:00
Anders Kaseorg cd96193768 models: Extract zerver.models.realms.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg 45bb8d2580 models: Extract zerver.models.users.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg e601d0ae7c models: Rename zerver/models.py to zerver/models/__init__.py.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Prakhar Pratyush 62bfc20ebc models: Rename 'UserHotspot' model to 'OnboardingStep'.
This commit renames the 'UserHotspot' model to 'OnboardingStep'.

Also, it renames the 'hotspot' field in that model
to 'onboarding_step'.
2023-12-06 18:19:20 -08:00
Anders Kaseorg 8a7916f21a python: Consistently use from…import for datetime.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 12:01:18 -08:00
Anders Kaseorg 2665a3ce2b python: Elide unnecessary list wrappers.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-13 12:41:23 -07:00
Alex Vandiver b94402152d models: Always search Messages with a realm_id or id limit.
Unless there is a limit on `id`, always provide a `realm_id` limit as
well.  We also notate which index is expected to be used in each
query.
2023-09-11 15:00:37 -07:00
Zixuan James Li 30495cec58 migration: Rename extra_data_json to extra_data in audit log models.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.

- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.

- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.

- We remove None-checks for extra_data because it is now no longer
nullable.

- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.

- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.

- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.

Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-08-16 17:18:14 -07:00
Sahil Batra 35559ae324 export: Pass "user_profile" as arg to select_related call for Subscription.
This commit adds code to pass all the required arguments to
select_related call for Subscription objects such that only
the required related fields are fetched from the database.

Previously, we did not pass any arguments to select_related,
so all the directly and indirectly related fields were fetched
when many of them were actually not being used and made the
query unnecessarily complex.
2023-08-10 17:35:43 -07:00
Anders Kaseorg c2c96eb0cf python: Annotate type aliases with TypeAlias.
This is not strictly necessary but it’s clearer and improves mypy’s
error messages.

https://docs.python.org/3/library/typing.html#typing.TypeAlias
https://mypy.readthedocs.io/en/stable/kinds_of_types.html#type-aliases

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 10:02:49 -07:00
Alex Vandiver 54395612c7 export: Skip crossrealm bots, if they are in the exported realm.
This prevents them from being duplicated in the crossrealm users.
2023-07-17 17:22:57 -07:00
Alex Vandiver cfda414277 export: Include huddles subscription from mirrordummy users.
If there are two huddles, with users A + B + C + D and A + B + C, and
user D is deleted, it is replaced with a mirrordummy user.  If
mirrordummy subscriptions are not included in exports, then the two
huddles have duplicate member sets, and will not be able to be
imported successfully.

Include huddle subscriptions for mirrordummy users in exports.
2023-07-17 17:22:57 -07:00
Lauryn Menard d3f7cfccbc zerver: Update comments with "private message" or "PM".
Updates comments/doc-strings that use "private message" or "PM" in
files in the `/zerver` directory to instead use "direct message".
2023-06-23 11:24:13 -07:00
Mateusz Mandera b55adbef3d export: Handle RealmAuditLog with .acting_user in different realm. 2023-05-19 11:12:19 -07:00
Alex Vandiver 4a43856ba7 realm_export: Do not assume null extra_data is special.
Fixes: #20197.
2023-05-16 14:05:01 -07:00