We may not always have trivial access to all of the bytes of the
uploaded file -- for instance, if the file was uploaded previously, or
by some other process. Downloading the entire image in order to check
its headers is an inefficient use of time and bandwidth.
Adjust `maybe_thumbnail` and dependencies to potentially take a
`pyvips.Source` which supports streaming data from S3 or disk. This
allows making the ImageAttachment row, if deemed appropriate, based on
only a few KB of data, and not the entire image.
Providing a signed Camo URL for arbitrary URLs opened the server up to
being an open redirector. Return 403 if the URL is not a user upload,
and the backend image if it is. Since we do not have ImageAttachment
rows for uploads at a time we wrote `/thumbnail?` URLs, return the
full-size content.
Modern browsers respect the EXIF orientation information of images,
applying rotation and/or mirroring as specified in those tags. The
the `width="..."` and `height="..."` tags are to size the image
_after_ applying those orientation transformations.
The `.width` and `.height` properties of libvips' images are _before_
any transformations are applied. Since we intend to use these to hint
to rendering clients the size that the image should be _rendered at_,
change to storing (and providing to clients) the dimensions of the
rendered image, not the stored bytes.
Fixes warnings like “ResourceWarning: unclosed file <_io.FileIO
name='/srv/zulip/var/044e5d44-87aa-4c43-abbb-28a144fa6654/test-backend/run_1238680/worker_0/test_uploads/files/thumbnail/2/1e/jmUuDhQC8WlaSRCuc0zQyx7D/img.tif/100x75.webp'
mode='rb' closefd=True>” with warnings enabled.
deque(…, 0) is an efficient way to consume an iterator documented at
https://docs.python.org/3/library/itertools.html#itertools-recipes
under consume.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
A new table is created to track which path_id attachments are images,
and for those their metadata, and which thumbnails have been created.
Using path_id as the effective primary key lets us ignore if the
attachment is archived or not, saving some foreign key messes.
A new worker is added to observe events when rows are added to this
table, and to generate and store thumbnails for those images in
differing sizes and formats.
Failing to remove all of the rules which were added causes action at a
distance with other tests. The two methods were also only used by
test code, making their existence in zerver.lib.rate_limiter clearly
misplaced.
This fixes one instance of a mis-balanced add/remove, which caused
tests to start failing if run non-parallel and one more anonymous
request was added within a rate-limit-enabled block.
Thumbor and tc-aws have been dragging their feet on Python 3 support
for years, and even the alphas and unofficial forks we’ve been running
don’t seem to be maintained anymore. Depending on these projects is
no longer viable for us.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We now have this API...
If you really just need to log in
and not do anything with the actual
user:
self.login('hamlet')
If you're gonna use the user in the
rest of the test:
hamlet = self.example_user('hamlet')
self.login_user(hamlet)
If you are specifically testing
email/password logins (used only in 4 places):
self.login_by_email(email, password)
And for failures uses this (used twice):
self.assert_login_failure(email)
This reduces query counts in some cases, since
we no longer need to look up the user again. In
particular, it reduces some noise when we
count queries for O(N)-related tests.
The query count is usually reduced by 2 per
API call. We no longer need to look up Realm
and UserProfile. In most cases we are saving
these lookups for the whole tests, since we
usually already have the `user` objects for
other reasons. In a few places we are simply
moving where that query happens within the
test.
In some places I shorten names like `test_user`
or `user_profile` to just be `user`.
This closes an open redirect vulnerability, one case of which was
found by Graham Bleaney and Ibrahim Mohamed using Pysa.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We used to add sharpen filter for all the image sizes whereas it was
intended for resized images only which would have been smoothened
out a bit by the resize operation.
This unnecessary use of the filter used to result in weird issues
with full size images.
For example: Image located at this url:-
http://arqex.com/wp-content/uploads/2015/02/trees.png
When rendered in full size would have just boundaries visible.
This is a preparatory commit which will help us with removing camo.
In the upcoming commits we introduce a new endpoint which is based
out on the setting CAMO_URI. Since camo could have been hosted on
a different server as well from the main Zulip server, this change
will help us realise in tests how that scenerio might be dealt with.
We seemed to have been doing too much of sharpening on the thumbnails.
The purpose of sharpening here was to just counter the softening
effects of a resize on an image but overdoing it is bad.
Value sharpen(0.5,0.2,true) seems to look good for achieving the
best results here on different displays as revealed in the manual
hit and trial based testing.
Thanks to @borisyankov for pointing out the issue and suggesting
the values.
In this commit we fix a bug due to which url preview images for urls
to custom emojis, realm icons or user avatars appeared broken when
such urls would be part of a Zulip message.
This is a preparatory commit to fix a bug in which a user posts
a link of custom emoji, user avatar or realm icon in a Zulip
message.
In this commit we are just adjusting the url generation in the
backend to have the '/user_uploads/' in the encrypted url generated
which the user is supposed to be redirected to and therefore
essentially reaching thumbor with the encrypted url.
This is necessary because 'user_uploads' and 'user_avatars' (or any
other item under 'user_avatars' endpoint) have a different folder
location under the local file storage backend. 'user_uploads'
endpoint's stuff is stored in a 'files' directory whereas stuff
'user_avatars' endpoint's stuff is stored in a 'avatars' directory.
Thumbor needs to know from which directory a particular local file
needs to be retrieved and therefore the zthumbor/loaders.py adds
a prefix location for the directory.
Since in an upcoming commit we are going to add user_avatars
directory location 'avatars' folder as a prefix this preparatory
commit helps simply doing the changes.