Commit Graph

203 Commits

Author SHA1 Message Date
Lauryn Menard aa796af0a8 upload: Remove `mimetype` url parameter in `get_file_info`.
This `mimetype` parameter was introduced in c4fa29a and its last
usage removed in 5bab2a3. This parameter was undocumented in the
OpenAPI endpoint documentation for `/user_uploads`, therefore
there shouldn't be client implementations that rely on it's
presence.

Removes the `request.GET` call for the `mimetype` parameter and
replaces it by getting the `content_type` value from the file,
which is an instance of Django's `UploadedFile` class and stores
that file metadata as a property.

If that returns `None` or an empty string, then we try to guess
the `content_type` from the filename, which is the same as the
previous behaviour when `mimetype` was `None` (which we assume
has been true since it's usage was removed; see above).

If unable to guess the `content_type` from the filename, we now
fallback to "application/octet-stream", instead of an empty string
or `None` value.

Also, removes the specific test written for having `mimetype` as
a url parameter in the request, and replaces it with a test that
covers when we try to guess `content_type` from the filename.
2022-08-08 16:06:09 -07:00
Anders Kaseorg 2508b579a6 upload: Replace boto3.Session with boto3.session.Session.
boto3-stubs seems to have dropped the former for some reason.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-07-30 06:46:34 -07:00
Zixuan James Li e68fb802f4 upload: Replace File with UploadedFile.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-29 14:09:12 -07:00
Zixuan James Li f42465319b upload: Refactor file size out of get_file_info.
We have already checked the size of the file in `upload_file_backend`.
This is the only caller of `upload_message_image_from_request`, and
indirectly the only caller of `get_file_info`. There is no need to
retrieve this information again.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-29 14:09:12 -07:00
Alex Vandiver 0830d5e7ea emoji: Write "original" file before attempting resize.
Resizing emoji can fail, especially for animated GIFs; in such cases,
it is useful to have the original data on hand, to be able to dissect
the failure.
2022-07-06 17:20:40 -07:00
Zixuan James Li 40b4da8f58 emoji: Add none checks for uploaded file name.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-06-23 19:25:48 -07:00
Anders Kaseorg e230ea2598 actions: Split out zerver.actions.uploads.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-14 17:14:32 -07:00
Alex Vandiver 4f93b4b6e4 uploads: Skip the outgoing proxy if S3_KEY is unset.
When the credentials are provided by dint of being run on an EC2
instance with an assigned Role, we must be able to fetch the instance
metadata from IMDS -- which is precisely the type of internal-IP
request that Smokescreen denies.

While botocore supports a `proxies` argument to the `Config` object,
this is not actually respected when making the IMDS queries; only the
environment variables are read from.  See
https://github.com/boto/botocore/issues/2644

As such, implement S3_SKIP_PROXY by monkey-patching the
`botocore.utils.should_bypass_proxies` function, to allow requests to
IMDS to be made without Smokescreen impeding them.

Fixes #20715.
2022-03-24 10:21:35 -07:00
Alex Vandiver abed174b12 uploads: Add an endpoint which forces a download.
This is most useful for images hosted in S3, which are otherwise
always displayed in the browser.
2022-03-22 15:05:02 -07:00
Alex Vandiver 95892a5ed3 emoji: Support animated PNGs. 2022-03-15 12:47:21 -07:00
Alex Vandiver 96a5fa9d78 upload: Fix resizing non-animated images.
5dab6e9d31 began honoring the list of disposals for every frame.
Unfortunately, passing a list of disposals for a non-animated image
raises an exception:
```
  File "zerver/lib/upload.py", line 212, in resize_emoji
    image_data = resize_gif(im, size)
  File "zerver/lib/upload.py", line 165, in resize_gif
    frames[0].save(
  File "[...]/PIL/Image.py", line 2212, in save
    save_handler(self, fp, filename)
  File "[...]/PIL/GifImagePlugin.py", line 605, in _save
    _write_single_frame(im, fp, palette)
  File "[...]/PIL/GifImagePlugin.py", line 506, in _write_single_frame
    _write_local_header(fp, im, (0, 0), flags)
  File "[...]/PIL/GifImagePlugin.py", line 647, in _write_local_header
    disposal = int(im.encoderinfo.get("disposal", 0))
TypeError: int() argument must be a string, a bytes-like object or a
number, not 'list'
```

`check_add_realm_emoji` calls this as:

```
    try:
        is_animated = upload_emoji_image(image_file, emoji_file_name, a
uthor)
        emoji_uploaded_successfully = True
    finally:
        if not emoji_uploaded_successfully:
            realm_emoji.delete()
            return None
        # ...
```

This is equivalent to dropping _all_ exceptions silently.  As such,
Zulip has silently rejected all non-animated images larger than 64x64
since 5dab6e9d31.

Adjust to only pass a single disposal if there are no additional
frames.  Add a test for non-animated images, which requires also
fixing the incidental bug that all GIF images were being recorded as
animated, regardless of if they had more than 1 frame or not.
2022-02-17 12:19:47 -08:00
Tim Abbott edf16cb861 upload: Mark migration code as nocoverage.
We want to add tests here, but it's more important to fix main failing
CI.
2022-02-11 10:37:58 -08:00
Mateusz Mandera 30ac291eba emoji: Add migration to reupload all RealmEmoji and ensure .author.
Fixes #19732.
2022-02-10 17:45:31 -08:00
Mateusz Mandera fe61243cfe upload: Don't access emoji_file.name attribute upload_emoji_image.
The S3 backend implementation of upload_emoji_image was accessing
emoji_file.name - which is redundant because emoji_file_name already
gets passed in and can be used, and an object of type IO[bytes] may not
have the .name attribute. Spotted by @Fingel.
2022-02-09 11:26:39 -08:00
Mateusz Mandera 4102816240 upload: Pass the target realm to create_attachment.
The target realm was not being passed to create_attachment in
upload_message_file implementations. This was a bug in the edge-case of
cross-realm messages - in particular, causing a bug in the email
gateway:
When an email with an attachment is sent, the message is mirrored to
Zulip with Email Gateway Bot as the message sender and uploader of the
attachment. Due to the realm not being passed to create_attachment, the
Attachment would get created with .realm being the system bot realm,
making the attachment inaccessible under some conditions due to failing
the following condition check (that's expected to pass, provided that
the .realm is set correctly):
```
    if (
        attachment.is_realm_public
        and attachment.realm == user_profile.realm
        and user_profile.can_access_public_streams()
    ):
        # Any user in the realm can access realm-public files
        return True
```
2022-01-27 17:23:44 -08:00
Anders Kaseorg 78e54a0d7a python: Replace deprecated jinja2.utils.Markup with markupsafe.Markup.
Fixes “DeprecationWarning: 'jinja2.Markup' is deprecated and will be
removed in Jinja 3.1. Import 'markupsafe.Markup' instead.”

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-13 14:22:48 -08:00
Tim Abbott 22b5e105e6 upload: Remove incorrect animated GIF asserts.
GIF files can be `.GIF`, and also we determine the file format by
inspecting the image data, so there's no reason to have this
assertion.

(The code for serving still images does not rely on the file being a
GIF.)
2021-12-16 16:13:00 -08:00
Anders Kaseorg 58920affd4 python: Remove re.UNICODE flag (redundant in Python 3).
https://docs.python.org/3/library/re.html#re.A

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-22 13:42:29 -07:00
Anders Kaseorg 3bd3173b1f avatar: Remove ?x=x kludge.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-14 12:47:43 -07:00
Riken Shah 8c31e6f96e emoji: Add backend changes to support still image for animated emojis.
Now, when we add a custom animated emoji to the realm
we also save a still image of it (1st frame of the gif). So
we can avoid showing an animated emoji every time.
2021-09-12 07:13:04 +00:00
rht 6ff659d199 upload: Extract generate_message_upload_path helper.
This helper will let us avoid copying this logic in the data import
code path.
2021-09-02 16:31:08 -07:00
PIG208 04f5f25478 typing: Replace `File` with `IO[bytes]`. 2021-08-20 06:02:28 -07:00
Anders Kaseorg 1bdb7b1141 mypy: Add boto3-stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg 5c90522e69 mypy: Add types-Pillow.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg 14f0594795 upload: Replace exif_rotate with Pillow exif_transpose.
Fixes #18599.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-09 20:32:19 -07:00
Anders Kaseorg ad5f0c05b5 python: Remove default "utf8" argument for encode(), decode().
Partially generated by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
PIG208 7d1c475f69 typing: Use assertions for function arguments.
Utilize the assert_is_not_None helper to eliminate errors of
'Argument x to "Foo" has incompatible type "Optional[Bar]"...'
2021-07-26 14:48:45 -07:00
Anders Kaseorg fb3ddf50d4 python: Fix mypy no_implicit_reexport errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-16 14:02:31 -07:00
Mateusz Mandera b9a8fb4453 upload: Deduplicate logic for public upload url creation.
get_public_upload_root_url and construct_public_upload_url_base were
both doing basically the same thing in the same. We deduplicate this,
making them share the same code, using the approach from
get_public_upload_root_url of using urljoin.

Using a format string is not a great idea, as it doesn't handle the case
of the URL already having parts that will be interpreted as format
string metacharacters. On the downside, this approach negatively affects
performance:

```
...: s = time.time()
...: for i in range(0, 250):
...:     r = u.get_public_upload_url("foo")
...: print(time.time()-s)
0.020366191864013672
```

up from 0.001 before this change.
2021-07-02 08:05:53 -07:00
Mateusz Mandera 85e19b2bde upload: Use URL manipulation for get_public_upload_url logic.
This is much faster than calling generate_presigned_url each time.

```
In [3]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
0.0010945796966552734
```
2021-06-22 09:35:56 -07:00
Mateusz Mandera e883ab057f upload: Cache the boto client to improve performance.
Fixes #18915

This was very slow, causing performance issues. After investigating,
generate_presigned_url is the cheap part of this, but the
session.client() call is expensive - so that's what we should cache.

Before the change:
```
In [4]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
6.408717393875122
```

After:
```
In [4]: t = time.time()
   ...: for i in range(250):
   ...:     x = u.get_public_upload_url("foo")
   ...: print(time.time()-t)
0.48990607261657715
```

This is not good enough to avoid doing something ugly like replacing
generate_presigned_url with some manual URL manipulation, but it's a
helpful structure that we may find useful with further refactoring.
2021-06-22 09:35:19 -07:00
Alex Vandiver 721546dfc0 subdomains: Extend "static" to include resources hosted on S3.
This causes avatars and emoji which are hosted by Zulip in S3 (or
compatible) servers to no longer go through camo.  Routing these
requests through camo does not add any privacy benefit (as the request
logs there go to the Zulip admins regardless), and may break emoji
imported from Slack before 1bf385e35f,
which have `application/octet-stream` as their stored Content-Type.
2021-06-08 15:28:10 -07:00
Tim Abbott 9f2daeee45 upload: Use get_public_upload_url for export tarballs too.
This deduplicates the code so that we now just have one function for
constructing S3 URLs.
2021-05-27 23:26:45 -07:00
ryanreh99 5a4aecfc40 s3 uploads: Refactor to access objects via `get_public_upload_url`.
Our current logic only allows S3 block storage providers whose
upload URL matches with the format used by AWS. This also allows
other styles such as the "virtual host" format used by Oracle cloud.

Fixes #17762.
2021-05-27 23:26:42 -07:00
Mateusz Mandera 6a8586e989 upload: Mention new difference between sanitize_name and slugify.
In Django 3.2 slugify strips trailing dashes and underscores:
0382ecfe02

sanitize_name doesn't so this difference should be documented like the
others.
2021-05-03 08:36:22 -07:00
Mateusz Mandera 389c7bdb5a upload: Fix docstring and regex in sanitize_name regarding underscore.
Underscore character is already covered by \w, so _ in the regex is
redundant. Also the docstring is mildly incorrect - underscore already
is an allowed character by django's slugify (and always was) for the
aforementioned reason.
2021-05-03 08:36:22 -07:00
Ganesh Pawar 830f1fa8c5 upload: Refactor and add tests for ensure_avatar_image in upload.py.
`ensure_basic_avatar_image` and `ensure_medium_avatar_image` are
essentially the same thing, except a size parameter.
So, refactor them into a single function.

This doesn't introduce any functional changes.
2021-04-29 21:18:13 -07:00
Anders Kaseorg e7ed907cf6 python: Convert deprecated Django ugettext alias to gettext.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-15 18:01:34 -07:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
ryanreh99 dfa7ce5637 uploads: Support non-AWS S3-compatible server.
Boto3 does not allow setting the endpoint url from
the config file. Thus we create a django setting
variable (`S3_ENDPOINT_URL`) which is passed to
service clients and resources of `boto3.Session`.

We also update the uploads-backend documentation
and remove the config environment variable as now
AWS supports the SIGv4 signature format by default.
And the region name is passed as a parameter instead
of creating a config file for just this value.

Fixes #16246.
2020-10-28 21:59:07 -07:00
ryanreh99 1c370a975c refactor: Access a bucket by calling `zerver.lib.uploads.get_bucket`. 2020-10-28 21:52:08 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Cody Piersall 5dab6e9d31 emoji-upload: Fix transparency issues on GIF emoji upload.
This preserves the alpha layer on GIF images that need to be resized
before being uploaded.  Two important changes occur here:

1. The new frame is a *copy* of the original image, which preserves the
   GIF info.
2. The disposal method of the original GIF is preserved.  This
   essentially determines what state each frame of the GIF starts from
   when it is drawn; see PIL's docs:
   https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving
   for more info.

This resolves some but not all of the test cases in #16370.
2020-10-11 16:23:07 -07:00
akshatdalton 52c411df8a emoji: Add padding around the gif on GIF emoji upload.
Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which
returns a sized and padded version of the image, expanded to fill the
requested aspect ratio and size.
Fixes part of #16370.
2020-10-06 17:28:02 -07:00
Anders Kaseorg faf600e9f5 urls: Remove unused URL names and shorten others.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-22 10:46:28 -07:00
Anders Kaseorg ddf8ec33df upload: Strip leading slash from deleted S3 export paths.
Previously, S3UploadBackend.delete_export_tarball failed to strip the
leading ‘/’ from the export path.  This mistake is now caught by Moto
1.3.15.  I expect it caused deletion failures in the real S3, although
I haven’t verified this.

We store export_path in the audit log with a leading ‘/’, but the
actual S3 keys do not have a leading ‘/’.  Changing either system
would require a migration.  So the new convention is that the
variables named ‘export_path’ have a leading ‘/’, while variables
named ‘path_id’ or ‘key’ do not.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-13 20:59:09 -07:00
Anders Kaseorg b7b7475672 python: Use standard secrets module to generate random tokens.
There are three functional side effects:

• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.

• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.

• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.

(The base32 alphabet is A-Z 2-7.  We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-09 15:52:57 -07:00
Anders Kaseorg f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Clara Dantas 05bf72a75c attachments: Add is_web_public field.
This commit adds the is_web_public field in the AbstractAttachment
class. This is useful when validating user access to the attachment,
as otherwise we would have to make a query in the db to check if
that attachment was sent in a message in a web-public stream or not.
2020-08-12 17:26:03 -07:00