zulip

Commit Graph

Author	SHA1	Message	Date
Mateusz Mandera	85e19b2bde	upload: Use URL manipulation for get_public_upload_url logic. This is much faster than calling generate_presigned_url each time. ``` In [3]: t = time.time() ...: for i in range(250): ...: x = u.get_public_upload_url("foo") ...: print(time.time()-t) 0.0010945796966552734 ```	2021-06-22 09:35:56 -07:00
Mateusz Mandera	e883ab057f	upload: Cache the boto client to improve performance. Fixes #18915 This was very slow, causing performance issues. After investigating, generate_presigned_url is the cheap part of this, but the session.client() call is expensive - so that's what we should cache. Before the change: ``` In [4]: t = time.time() ...: for i in range(250): ...: x = u.get_public_upload_url("foo") ...: print(time.time()-t) 6.408717393875122 ``` After: ``` In [4]: t = time.time() ...: for i in range(250): ...: x = u.get_public_upload_url("foo") ...: print(time.time()-t) 0.48990607261657715 ``` This is not good enough to avoid doing something ugly like replacing generate_presigned_url with some manual URL manipulation, but it's a helpful structure that we may find useful with further refactoring.	2021-06-22 09:35:19 -07:00
Alex Vandiver	721546dfc0	subdomains: Extend "static" to include resources hosted on S3. This causes avatars and emoji which are hosted by Zulip in S3 (or compatible) servers to no longer go through camo. Routing these requests through camo does not add any privacy benefit (as the request logs there go to the Zulip admins regardless), and may break emoji imported from Slack before `1bf385e35f`, which have `application/octet-stream` as their stored Content-Type.	2021-06-08 15:28:10 -07:00
Tim Abbott	9f2daeee45	upload: Use get_public_upload_url for export tarballs too. This deduplicates the code so that we now just have one function for constructing S3 URLs.	2021-05-27 23:26:45 -07:00
ryanreh99	5a4aecfc40	s3 uploads: Refactor to access objects via `get_public_upload_url`. Our current logic only allows S3 block storage providers whose upload URL matches with the format used by AWS. This also allows other styles such as the "virtual host" format used by Oracle cloud. Fixes #17762.	2021-05-27 23:26:42 -07:00
Mateusz Mandera	6a8586e989	upload: Mention new difference between sanitize_name and slugify. In Django 3.2 slugify strips trailing dashes and underscores: `0382ecfe02` sanitize_name doesn't so this difference should be documented like the others.	2021-05-03 08:36:22 -07:00
Mateusz Mandera	389c7bdb5a	upload: Fix docstring and regex in sanitize_name regarding underscore. Underscore character is already covered by \w, so _ in the regex is redundant. Also the docstring is mildly incorrect - underscore already is an allowed character by django's slugify (and always was) for the aforementioned reason.	2021-05-03 08:36:22 -07:00
Ganesh Pawar	830f1fa8c5	upload: Refactor and add tests for ensure_avatar_image in upload.py. `ensure_basic_avatar_image` and `ensure_medium_avatar_image` are essentially the same thing, except a size parameter. So, refactor them into a single function. This doesn't introduce any functional changes.	2021-04-29 21:18:13 -07:00
Anders Kaseorg	e7ed907cf6	python: Convert deprecated Django ugettext alias to gettext. django.utils.translation.ugettext is a deprecated alias of django.utils.translation.gettext as of Django 3.0, and will be removed in Django 4.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-15 18:01:34 -07:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
ryanreh99	dfa7ce5637	uploads: Support non-AWS S3-compatible server. Boto3 does not allow setting the endpoint url from the config file. Thus we create a django setting variable (`S3_ENDPOINT_URL`) which is passed to service clients and resources of `boto3.Session`. We also update the uploads-backend documentation and remove the config environment variable as now AWS supports the SIGv4 signature format by default. And the region name is passed as a parameter instead of creating a config file for just this value. Fixes #16246.	2020-10-28 21:59:07 -07:00
ryanreh99	1c370a975c	refactor: Access a bucket by calling `zerver.lib.uploads.get_bucket`.	2020-10-28 21:52:08 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Cody Piersall	5dab6e9d31	emoji-upload: Fix transparency issues on GIF emoji upload. This preserves the alpha layer on GIF images that need to be resized before being uploaded. Two important changes occur here: 1. The new frame is a copy of the original image, which preserves the GIF info. 2. The disposal method of the original GIF is preserved. This essentially determines what state each frame of the GIF starts from when it is drawn; see PIL's docs: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving for more info. This resolves some but not all of the test cases in #16370.	2020-10-11 16:23:07 -07:00
akshatdalton	52c411df8a	emoji: Add padding around the gif on GIF emoji upload. Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which returns a sized and padded version of the image, expanded to fill the requested aspect ratio and size. Fixes part of #16370.	2020-10-06 17:28:02 -07:00
Anders Kaseorg	faf600e9f5	urls: Remove unused URL names and shorten others. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-22 10:46:28 -07:00
Anders Kaseorg	ddf8ec33df	upload: Strip leading slash from deleted S3 export paths. Previously, S3UploadBackend.delete_export_tarball failed to strip the leading ‘/’ from the export path. This mistake is now caught by Moto 1.3.15. I expect it caused deletion failures in the real S3, although I haven’t verified this. We store export_path in the audit log with a leading ‘/’, but the actual S3 keys do not have a leading ‘/’. Changing either system would require a migration. So the new convention is that the variables named ‘export_path’ have a leading ‘/’, while variables named ‘path_id’ or ‘key’ do not. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-13 20:59:09 -07:00
Anders Kaseorg	b7b7475672	python: Use standard secrets module to generate random tokens. There are three functional side effects: • Correct an insignificant but mathematically offensive bias toward repeated characters in generate_api_key introduced in commit 47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased from 190.52864 bits to 190.53428 bits. • Use the base32 alphabet in confirmation.models.generate_key; its entropy is reduced from 124.07820 bits to the documented 120 bits, but now it uses 1 syscall instead of 24. • Use the base32 alphabet in get_bigbluebutton_url; its entropy is reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall instead of 10. (The base32 alphabet is A-Z 2-7. We could probably replace all of these with plain secrets.token_urlsafe, since I expect most callers can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without problems.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-09 15:52:57 -07:00
Anders Kaseorg	f91d287447	python: Pre-fix a few spots for better Black formatting. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 17:51:09 -07:00
Clara Dantas	05bf72a75c	attachments: Add is_web_public field. This commit adds the is_web_public field in the AbstractAttachment class. This is useful when validating user access to the attachment, as otherwise we would have to make a query in the db to check if that attachment was sent in a message in a web-public stream or not.	2020-08-12 17:26:03 -07:00
Tim Abbott	6130a61be0	export: Only print .s with percent_callback to console. The S3 data export tool's upload code path uses this nice boto callback feature for showing a progress bar, which is nice for the management command. It's spammy/broken in production and the backend tests, so we change percent_callback to be a parameter passed in so that it can only be used in the contexts where it makes sense.	2020-07-30 13:14:53 -07:00
Tim Abbott	0b6ebb4fbb	upload: Remove unused get_realm_for_filename.	2020-06-18 17:55:13 -07:00
Tim Abbott	5962d1ea14	upload: Avoid fetching bucket objects repeatedly. This takes advantage of saving the bucket object on the UploadBackend class to deduplicate a bunch of redundant code getting buckets.	2020-06-18 17:55:13 -07:00
Wyatt Hoodes	2ef791fc21	upload.py: Support using non S3-providers. With #14378, we regressed to the state prior to `7e0ea61b00`. We fix this by getting our avatar bucket on object initialization, and use the appropriate means of gathering the network location for the URLs. Fixes #14484.	2020-06-18 17:55:13 -07:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \\|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Graham Bleaney	461d5b1a3e	pysa: Introduce sanitizers, models, and inline marking safe. This commit adds three `.pysa` model files: `false_positives.pysa` for ruling out false positive flows with `Sanitize` annotations, `req_lib.pysa` for educating pysa about Zulip's `REQ()` pattern for extracting user input, and `redirects.pysa` for capturing the risk of open redirects within Zulip code. Additionally, this commit introduces `mark_sanitized`, an identity function which can be used to selectively clear taint in cases where `Sanitize` models will not work. This commit also puts `mark_sanitized` to work removing known false positive flows.	2020-06-11 12:57:49 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
Anders Kaseorg	444fbbf964	python: Whitespace fixes from autopep8. Generated by autopep8. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-08 15:21:30 -07:00
whoodes	cea7d713cd	requirements: Upgrade boto to boto3. Fixes: #3490 Contributors include: Author: whoodes <hoodesw@hawaii.edu> Author: zhoufeng1989 <zhoufengloop@gmail.com> Author: rht <rhtbot@protonmail.com>	2020-05-26 23:18:07 -07:00
Anders Kaseorg	bdc365d0fe	logging: Pass format arguments to logging. https://docs.python.org/3/howto/logging.html#optimization Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-02 10:18:02 -07:00
Anders Kaseorg	fead14951c	python: Convert assignment type annotations to Python 3.6 style. This commit was split by tabbott; this piece covers the vast majority of files in Zulip, but excludes scripts/, tools/, and puppet/ to help ensure we at least show the right error messages for Xenial systems. We can likely further refine the remaining pieces with some testing. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: - invoiced_through: Optional[LicenseLedger] = models.ForeignKey( + invoiced_through: Optional["LicenseLedger"] = models.ForeignKey( -_apns_client: Optional[APNsClient] = None +_apns_client: Optional["APNsClient"] = None - notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) + author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) - bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) + bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) - default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) - default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) -descriptors_by_handler_id: Dict[int, ClientDescriptor] = {} +descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {} -worker_classes: Dict[str, Type[QueueProcessingWorker]] = {} -queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {} +worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {} +queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {} -AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None +AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-22 11:02:32 -07:00
Mateusz Mandera	4018dcb8e7	upload: Include filename at the end of temporary access URLs.	2020-04-20 10:25:48 -07:00
Tim Abbott	0ccc0f02ce	upload: Support requesting a temporary unauthenticated URL. This is useful for the mobile and desktop apps to hand an uploaded file off to the system browser so that it can render PDFs (etc.). The S3 backend implementation is simple; for the local upload backend, we use Django's signing feature to simulate the same sort of 60-second lifetime token. Co-Author-By: Mateusz Mandera <mateusz.mandera@protonmail.com>	2020-04-17 09:08:10 -07:00
Tim Abbott	7f582b3861	upload: Increase the lifetime of signed upload URLs. For some mobile use cases, 15 seconds is potentially too short for a busy+slow device to open a browser and fetch the URL. 60 seconds is plenty, and doesn't carry a materially increased security risk.	2020-04-17 09:08:10 -07:00
Anders Kaseorg	c734bbd95d	python: Modernize legacy Python 2 syntax with pyupgrade. Generated by `pyupgrade --py3-plus --keep-percent-format` on all our Python code except `zthumbor` and `zulip-ec2-configure-interfaces`, followed by manual indentation fixes. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-09 16:43:22 -07:00
Anders Kaseorg	7ff9b22500	docs: Convert many http URLs to https. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-03-26 21:35:32 -07:00
Vishnu Ks	af3a37b58b	upload: Refactor out realm_avatar_and_logo_path function.	2020-02-03 14:09:05 -08:00
Tim Abbott	7ccc8373e2	bugdown: Fix logic for extracting attachment path_id. In `3892a8afd8`, we restructured the system for managing uploaded files to a much cleaner model where we just do parsing inside bugdown. That new model had potentially buggy handling of cases around both relative URLs and URLs starting with `realm.host`. We address this by further rewriting the handling of attachments to avoid regular expressions entirely, instead relying on urllib for parsing, and having bugdown output `path_id` values, so that there's no need for any conversions between formats outside bugdown. The check_attachment_reference_change function for processing message updates is significantly simplified in the process. The new check on the hostname has the side effect of requiring us to fix some previously weird/buggy test data. Co-Author-By: Anders Kaseorg <anders@zulipchat.com> Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>	2019-12-12 20:30:26 -08:00
Rohitt Vashishtha	3fbb050216	messages: Remove dependence on regex for claiming attachments. This commit wraps up the work to remove basic regex based parsing of messages to handle attachment claiming/unclaiming. We now use the more dependable Bugdown processor to find potential links and only operate upon those links instead of parsing the full message content again.	2019-12-11 11:03:49 -08:00
Tim Abbott	7e0ea61b00	upload: Support S3-compatible S3 hosting providers. Previously, we were hardcoding the domain s3.amazonaws.com. Given that we already have an interface for configuring the host in /etc/zulip/boto.cfg (which in turn, automatically configures boto), we just need to actually use the value configured in boto for what S3 hostname to use. We don't have tests for this new use case, in part because they're likely annoying to write with `moto` and there hasn't been a huge amount of demand for it. Since this doesn't regress existing S3 backend support, it seems worth merging.	2019-09-24 17:17:21 -07:00
Tim Abbott	b8b0ae362c	uploads: Only initialize S3 connection once in __init__. This should be a mild performance optimization for the S3 authentication backend, since we aren't initializing unnecessary duplicate connections.	2019-09-24 17:15:44 -07:00
Tim Abbott	96726c00ce	export: Fix broken URLs in UI with S3 backend. Apparently, the Zulip notifications (and resulting emails) were correct, but the download links inside the Zulip UI were incorrectly not including S3 prefix on the URL, making them not work. While we're at this, we rewrite the somewhat convoluted previous system for formatting the data export output.	2019-09-24 13:56:49 -07:00
Anders Kaseorg	780ecb672b	CVE-2019-16216: Fix MIME type validation. * Whitelist a small number of image/ types to be served as non-attachments. * Serve the file using the type that we validated rather than relying on an independent guess to match. This issue can lead to a stored XSS security vulnerability for older browsers that don't support Content-Security-Policy. It primarily affects servers using Zulip's local file uploads backend for servers running Ubuntu 16.04 Xenial or newer; the legacy local file upload backend for (now EOL) Ubuntu 14.04 Trusty was not affected and it has limited impact for the S3 upload backend (which uses an unprivileged S3 bucket domain to serve files). Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-09-11 15:46:36 -07:00
neiljp (Neil Pilgrim)	5f673f5820	mypy: Remove type ignores after boto stub improvements.	2019-08-06 23:24:56 -07:00
Wyatt Hoodes	22481f63bf	upload: Fix typing for key variable.	2019-07-29 15:23:10 -07:00
Wyatt Hoodes	b1900c406a	public_export: Add logic for deleting the export tarball. The path to the uploaded tarball is reconstructed via the relative url and removed with the canonical methods in `upload.py`.	2019-07-26 15:52:03 -07:00
Wyatt Hoodes	e331a758c3	python: Migrate open statements to use with. This is low priority, but it's nice to be consistently using the best practice pattern. Fixes: #12419.	2019-07-20 15:48:52 -07:00
Wyatt Hoodes	af4eb8c0d5	export/upload: Refactor tarball upload logic to upload_backend. The conditional block containing the tarball upload logic for both S3 and local uploads was deconstructed and moved to the more appropriate location within `zerver/lib/upload.py`.	2019-07-03 15:40:35 -07:00
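
Commit e883ab057f above describes caching the expensive `session.client()` call rather than rebuilding the client on every URL lookup. A minimal sketch of that idea follows, assuming a memoized accessor is an acceptable way to hold the client. The names `make_client`, `get_client`, and `construction_count` are hypothetical, and `make_client` is only a stand-in for the real constructor; this is not the code from the commit.

```python
import functools

# Hypothetical counter: records how often the "expensive" constructor runs.
construction_count = 0


def make_client():
    """Stand-in for an expensive call such as session.client(); not real boto3 code."""
    global construction_count
    construction_count += 1
    return object()


@functools.lru_cache(maxsize=None)
def get_client():
    """Build the client on first use, then return the same object on later calls."""
    return make_client()


# Mirror the 250-iteration loop from the commit's benchmark.
for _ in range(250):
    client = get_client()

print(construction_count)
```

With the cache in place the constructor runs once no matter how many lookups follow, which is the effect the before/after timings in that commit attribute to the change.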


174 Commits