Commit Graph

329 Commits

Author SHA1 Message Date
Anders Kaseorg d574200423 tests: Consume streaming responses.
Fixes warnings like “ResourceWarning: unclosed file <_io.FileIO
name='/srv/zulip/var/044e5d44-87aa-4c43-abbb-28a144fa6654/test-backend/run_1238680/worker_0/test_uploads/files/thumbnail/2/1e/jmUuDhQC8WlaSRCuc0zQyx7D/img.tif/100x75.webp'
mode='rb' closefd=True>” with warnings enabled.

deque(…, 0) is an efficient way to consume an iterator documented at
https://docs.python.org/3/library/itertools.html#itertools-recipes
under consume.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-19 09:20:56 -07:00
Vector73 d21ee6fa23 api: Deprecate uri and add url parameter in "/user_uploads" endpoint. 2024-07-14 22:32:36 -07:00
Anders Kaseorg b96feb34f6 ruff: Fix SIM117 Use a single `with` statement with multiple contexts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-14 13:48:32 -07:00
Anders Kaseorg 48202389b8 ruff: Bump target-version from py38 to py310.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-07-13 22:28:22 -07:00
Alex Vandiver 2b3da0e70f fixup! thumbnail: Fix MAX_EMOJI_GIF_FILE_SIZE_BYTES check to be post-resize. 2024-07-12 13:26:47 -07:00
Alex Vandiver f6b99171ce emoji: Derive the file extension from a limited set of content-types.
We thumbnail and serve emoji with the same format as they were
uploaded.  However, we preserved the original extension, which might
mismatch with the provided content-type.

Limit the content-type to a subset which is both (a) an image format
we can thumbnail, and (b) a media format which is widely-enough
supported that we are willing to provide it to all browsers.  This
prevents uploading a `.tiff` emoji, for instance.

Based on this limited content-type, we then reverse to find the
reasonable extension to use when storing it.  This is particularly
important because the local file storage uses the file extension to
choose what content-type to re-serve the emoji as.

This does nothing for existing emoji, which may have odd or missing
file extensions.
2024-07-12 13:26:47 -07:00
Alex Vandiver fa28e3aa0f tests: Split up test_upload.EmojiTest into test_thumbnail. 2024-07-12 13:26:47 -07:00
Alex Vandiver 382cb5bb13 thumbnail: Lock down which formats we parse. 2024-07-11 07:31:39 -07:00
Alex Vandiver 4bc563128e thumbnail: Use a consistent set of supported image types. 2024-07-11 07:31:39 -07:00
Alex Vandiver f52a93bc14 upload: Stop requiring callers pass in the file size.
This can be calculated because we have the contents.
2024-07-07 14:40:07 -07:00
Alex Vandiver e29a455b2d avatars: Encode version into the filename.
Hash the salt, user-id, and now avatar version into the filename.
This allows the URL contents to be immutable, and thus to be marked as
immutable and cacheable.  Since avatars are served unauthenticated,
hashing with a server-side salt makes the current and past avatars not
enumerable.

This requires plumbing the current (or future) avatar version through
various parts of the upload process.

Since this already requires a full migration of current avatars, also
take the opportunity to fix the missing `.png` on S3 uploads (#12852).

We switch from SHA-1 to SHA-256, but truncate it such that avatar URL
data does not substantially increase in size.

Fixes: #12852.
2024-07-07 14:40:07 -07:00
Alex Vandiver 17fb23746f upload: Move methods into zerver.lib.upload from .base. 2024-06-26 16:43:11 -07:00
Alex Vandiver fb929ca218 thumbnailing: Remove unnecessary third return value from resize_emoji. 2024-06-26 16:43:09 -07:00
Alex Vandiver 0070b5da78 tests: Switch from PIL to pyvips. 2024-06-26 16:42:59 -07:00
Alex Vandiver b14a33c659 thumbnailing: Switch to libvips, from PIL/pillow.
This is done in as much of a drop-in fashion as possible.  Note that
libvips does not support animated PNGs[^1], and as such this
conversion removes support for them as emoji; however, libvips
includes support for webp images, which future commits will take
advantage of.

This removes the MAX_EMOJI_GIF_SIZE limit, since that existed to work
around bugs in Pillow.  MAX_EMOJI_GIF_FILE_SIZE_BYTES is fixed to
actually be 128KiB (not 128MiB, as it actually was), and is counted
_after_ resizing, since the point is to limit the amount of data
transfer to clients.

[^1]: https://github.com/libvips/libvips/discussions/2000
2024-06-26 16:42:57 -07:00
Alex Vandiver 9fb03cb2c7 upload: Factor out common avatar logic. 2024-06-26 16:38:01 -07:00
Alex Vandiver 0153d6dbcd thumbnailing: Move resizing functions into zerver.lib.thumbnail. 2024-06-20 23:06:08 -04:00
Mateusz Mandera 9406bfbc0a analytics: Store realm disk space used as a CountStat.
Fixes #29632.

The issue description explains this well:

We currently recalculate `currently_used_upload_space_bytes` every file
upload, by dint of calling `flush_used_upload_space_cache`  on
save/delete, and then immediately calling
`user_profile.realm.currently_used_upload_space_bytes()` in
`notify_attachment_update`.  Since this walks the Attachments table,
recalculating this can take seconds in large realms.

Switch this to using a CountStat, so we don't need to walk significant
chunks of the Attachment table when we upload an attachment.  This will
also give us a historical daily graph of usage.
2024-05-09 10:54:44 -07:00
Anders Kaseorg 96fbe060a6 python: Mark regexes as raw strings.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-26 12:30:31 -07:00
Mateusz Mandera 8038e2322c realm: Change implementation approach for upload_quota_gb.
Most importantly, fixes a bug where a realm with a custom
.upload_quota_gb value (set by changing it in the database via e.g.
manage.py shell) would end up having it lowered while upgrading their
plan via the do_change_realm_plan_type function, which used to just set
it to the value implied by the new plan without caring about whether
that isn't lower than the original limit.

The new approach is cleaner since we don't do db queries by
upload_quota_gb so it's nicer to just generate these dynamically, making
changes to our limit-per-plan rules much easier - skipping the need for
migrations.
2024-04-15 15:08:56 -07:00
Alex Vandiver 7f46773ef1 tests: Clear in-memory Client caches before testing query counts.
This makes counts more apples-to-apples comparable when run
back-to-back.
2024-02-14 12:27:03 -08:00
Anders Kaseorg cff0b78771 models: Move some functions to zerver.lib.attachments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg cd96193768 models: Extract zerver.models.realms.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg 45bb8d2580 models: Extract zerver.models.users.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg 3853fa875a python: Consistently use from…import for urllib.parse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00
Alex Vandiver 82960d9bc2 upload: Redirect unauthorized anonymous requests to login.
Note that this also redirects rate-limited anonymous requests to the
login page, as we do not currently differentiate the cases.
2023-11-28 09:44:55 -08:00
Alex Vandiver f9884af114 upload: Return images for 404/403 responses with image Accept: headers.
If the request's `Accept:` header signals a preference for serving
images over text, return an image representing the 404/403 instead of
serving a `text/html` response.

Fixes: #23739.
2023-11-28 09:44:55 -08:00
Sahil Batra 58461660c3 users: Restrict accessing avatar for inaccessible users.
We now return the special avatar used for inaccessible users
when a guest user tries to access avatar of an inaccessibe
user using "/avatar" endpoint.
2023-11-21 23:58:45 -08:00
Anders Kaseorg a50eb2e809 mypy: Enable new error explicit-override.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-12 12:28:41 -07:00
Anders Kaseorg 6988622fe8 ruff: Enable B023 Function definition does not bind loop variable.
Python’s loop scoping is misdesigned, resulting in a very common
gotcha for functions that close over loop variables [1].  The general
problem is so bad that even the Go developers plan to break
compatibility in order to fix the same design mistake in their
language [2].

Enable the Ruff rule function-uses-loop-variable (B023) [3], which
conservatively prohibits functions from binding loop variables at all.

[1] https://docs.python-guide.org/writing/gotchas/#late-binding-closures
[2] https://go.dev/s/loopvar-design
[3] https://beta.ruff.rs/docs/rules/function-uses-loop-variable/

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-11 18:03:45 -07:00
Anders Kaseorg 81bd63cb46 ruff: Fix PIE808 Unnecessary `start` argument in `range`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-01 14:57:01 -07:00
Alex Vandiver b67108c8c6 retention: Prevent deletion of partially-archived messages.
Previously, this code:
```python3
old_archived_attachments = ArchivedAttachment.objects.annotate(
    has_other_messages=Exists(
        Attachment.objects.filter(id=OuterRef("id"))
        .exclude(messages=None)
        .exclude(scheduled_messages=None)
    )
).filter(messages=None, create_time__lt=delta_weeks_ago, has_other_messages=False)
```

...protected from removal any ArchivedAttachment objects where there
was an Attachment which had _both_ a message _and_ a scheduled
message, instead of _either_ a message _or_ a scheduled message.
Since files are removed from disk when the ArchivedAttachment rows are
deleted, this meant that if an upload was referenced in two messages,
and one was deleted, the file was permanently deleted when the
ArchivedMessage and ArchivedAttachment were cleaned up, despite being
still referenced in live Messages and Attachments.

Switch from `.exclude(messages=None).exclude(scheduled_messages=None)`
to `.exclude(messages=None, scheduled_messages=None)` which "OR"s
those conditions appropriately.

Pull the relevant test into its own file, and expand it significantly
to cover this, and other, corner cases.
2023-08-06 13:40:02 -07:00
Anders Kaseorg e932e2ce52 ruff: Fix UP032 Use f-string instead of `format` call.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-02 15:58:55 -07:00
arghyadeep10 1808cdec90 uploads: Improve file not found message.
It replaces the "File not found." text with:
"This file does not exist or has been deleted."

At present when a file is deleted it results in a confusing
experience when looking at the "File not found." message.
In order to clarify the situation is not a bug, the message
has been replaced with a better alternative.

Fixes part of Issue #23739.
2023-07-06 09:32:41 -07:00
Alex Vandiver e2847790b6 upload: Provide a default upload file name, rather than 500. 2023-07-03 21:51:58 -07:00
Ujjawal Modi f7346f36fc attachments: Refactor code for flushing used_upload_space cache.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the Attachment
objects being deleted after Realm has been deleted from
the database.

In order to handle this, we update
get_realm_used_upload_space_cache_key function to accept
realm_id as parameter instead of realm object, so that
the code for flushing the cache works even after the
realm is deleted. This change is fine because eventually
only realm_id is used by this function and there is no
need of the complete realm object.
2023-06-28 18:03:32 -07:00
Lauryn Menard d53b854a7c backend-tests: Update "private message" or "PM" to "direct message".
Updates comments and test strings/names with "private message" or
"PM" to use "direct message" instead.
2023-06-23 11:24:13 -07:00
Anders Kaseorg c09e7d6407 codespell: Correct “requestor” to “requester”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-06-20 16:17:55 -07:00
Anders Kaseorg 92c83c1df4 tests: Remove assert_streaming_content helper in favor of getvalue.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-06-15 16:49:27 -07:00
Alex Vandiver fbb831ff3b uploads: Allow access to the /download/ variant anonymously.
This was mistakenly left off of b799ec32b0.
2023-06-12 12:55:27 -07:00
Alex Vandiver 0dbe111ab3 test_helpers: Switch add/remove_ratelimit to a contextmanager.
Failing to remove all of the rules which were added causes action at a
distance with other tests.  The two methods were also only used by
test code, making their existence in zerver.lib.rate_limiter clearly
misplaced.

This fixes one instance of a mis-balanced add/remove, which caused
tests to start failing if run non-parallel and one more anonymous
request was added within a rate-limit-enabled block.
2023-06-12 12:55:27 -07:00
Lauryn Menard 154af5bb6b scheduled-messages: Remove ID from create scheduled message.
Part of splitting creating and editing scheduled messages.
Should be merged with final commit in series. Breaks tests.

Removes `scheduled_message_id` parameter from the create scheduled
message path.
2023-05-26 18:05:55 -07:00
Mateusz Mandera 414658fc8e scheduled_message: Handle attachments properly.
Fixes #25414.

We add Attachment.scheduled_messages relation to track ScheduledMessages
which reference the attachment.

The import bits can be done after merging this, by updating #25345.
2023-05-08 09:56:02 -07:00
Mateusz Mandera 4598607a46 test_uploads: Fix two typos. 2023-05-08 09:56:02 -07:00
AcKindle3 b0ef8f0822 test: Replace occurences of `uri` with `url`.
In all the tests files, replaced all occurences of `uri` with `url`
appeared in comments, local variablles, function names and their callers.
2023-04-08 16:27:55 -07:00
Anders Kaseorg a881918a05 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-04-03 22:39:21 -07:00
Lauryn Menard 7b225245c0 tests: Update ZulipTestCase.tearDown to remove local uploads.
Previously, tests that exercised code paths that added local
uploads did not always clean up `settings.LOCAL_UPLOADS_DIR`
after the test was complete.

Updates the `ZulipTestCase` class to remove any local uploads
in the unique `settings.LOCAL_UPLOADS_DIR` in `tearDown` for
all tests.
2023-03-28 14:38:06 -07:00
Anders Kaseorg 2d9b2a2a05 models: Remove type prefixes from __str__ values.
The Django convention is for __repr__ to include the type and __str__
to omit it.  In fact its default __repr__ implementation for models
automatically adds a type prefix to __str__, which has resulted in the
type being duplicated:

    >>> UserProfile.objects.first()
    <UserProfile: <UserProfile: emailgateway@zulip.com <Realm: zulipinternal 1>>>

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-08 22:56:55 -08:00
Alex Vandiver 880a3f95a7 tests: Split out s3 and local tests.
This mirrors the split of the code in 7c0d414aff.
2023-03-02 16:36:19 -08:00
Alex Vandiver bd80c048be upload: Rename delete_message_image to use word "attachment".
The table is named Attachment, and not all of them are images.
2023-03-02 16:36:19 -08:00