Commit Graph

16380 Commits

Author SHA1 Message Date
Alex Vandiver f78c2eab2c message_send: Use stream object for name in "no subscribers" message.
In the case where a stream existed but had no subscribers, the error
message used to send to the owner always used `stream_name`, which
may have been None.

Switch to using `stream.name` rather than `stream_name` for this case.
2023-02-23 11:02:55 -08:00
Alex Vandiver 6427d85cf6 tornado: Replace dataclasses.asdict() call, as it is slow.
This code is called in the hot path when Tornado is processing events.
As such, making this code performant is important.  Profiling shows
that a significant portion of the time is spent calling asdict() to
serialize the UserMessageNotificationsData dataclass.  In this case
`asdict` does several steps which we do not need, such as attempting
to recurse into its fields, and deepcopy'ing the values of the fields.

In our use case, these add a notable amount of overhead:
```py3
from zerver.tornado.event_queue import UserMessageNotificationsData
from dataclasses import asdict
from timeit import timeit
o = UserMessageNotificationsData(1, False, False, False, False, False, False, False, False, False, False, False)
%timeit asdict(o)
%timeit {**vars(o)}
```

Replace the `asdict` call with a direct access of the fields.  We
perform a shallow copy because we do need to modify the resulting
fields.
2023-02-23 11:01:38 -08:00
Anders Kaseorg ed069ebe0e docs: Remove spaces before commas.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-22 17:17:25 -08:00
Sahil Batra 678d16a908 migrations: Fix extra_data of RealmAuditLog objects.
This commit adds migration to fix extra_data field
of RealmAuditLog objects created on changing
can_remove_subscribers_group setting to add "property"
field since the same event type will now be used for
other group based stream settings that will be added
in future.
2023-02-22 12:17:46 -08:00
Sahil Batra 0cf99cf5c3 streams: Refactor code to set group-based stream settings.
We add stream_permission_group_settings object which is
similar to property_types framework used for realm settings.

This commit also adds GroupPermissionSetting dataclass for
defining settings inside stream_permission_group_settings.

We add "do_change_stream_group_based_setting" function which
is called in loop to update all the group-based stream settings
and it is now used to update 'can_remove_subscribers_group'
setting instead of "do_change_can_remove_subscribers_group".

We also change the variable name for event_type field of
RealmAuditLog objects to STREAM_GROUP_BASED_SETTING_CHANGED
since this will be used for all group-based stream settings.

'property' field is also added to extra_data field to identify
the setting for which RealmAuditLog object was created.

We will add a migration in further commits which will add the
property field to existing RealmAuditLog objects created for
changing can_remove_subscribers_group setting.
2023-02-22 12:17:46 -08:00
Mateusz Mandera a593089770 presence: Return some presence params in the /register response. 2023-02-21 12:10:43 -08:00
Mateusz Mandera 4ba4305ff9 presence: Change server-side OFFLINE_THRESHOLD_SECS to 140s.
This old 300s value was meaningfully used in 2 places:
1. In the do_change_user_settings presence_enabled codepath when turning
   a user invisible. It doesn't matter there, 140s is just since the
   point is to make clients see this user as offline. And 140s is the
   threshold used by clients (see the presence.js constant).
2. For calculating whether to set "offline" "status" in
   result["presence"]["aggregated"] in get_presence_backend. It's fine
   for this to become 140s, since clients shouldn't be looking at the
   status value anymore anyway and just do their calculation based on
   the timestamps.
2023-02-21 12:10:43 -08:00
Alex Vandiver bd6b215cdc sentry: Fix accidental blockquote by escaping >. 2023-02-21 09:59:13 -08:00
Abhijeet Prasad Bodas 6e001d0672 user topic: Remove unnecessary check for double addition of muted topic.
This makes use of the new case insensitive UNIQUE index added in the
earlier commit. With that index present, we can now rely solely on the
database to correctly identify duplicates and throw integrity errors as
required.
2023-02-20 21:04:13 -08:00
Abhijeet Prasad Bodas 80bf6b0777 user topic: Add case insensitive UNIQUE constraint.
This will allow us to rely on the database to detect duplicate
`UserTopic`s (with the same `topic_name` with different cases)
and thus correctly throw IntegrityErrors when expected.
This is also important from a correctness point of view, since as
of now, when checking if topic is muted or requesting the backend for
muting a topic, the frontend does not check for case insensitivity.

There might exist duplicate UserTopics (in a case insensitive sense)
which need are removed before creating the new index.
The migration was tested manually using `./manage.py shell`.
2023-02-20 21:04:13 -08:00
Abhijeet Prasad Bodas 9fde88796a mute user: Remove unnecessary check for double muting.
In 141b0c4, we added code to handle races caused by duplicate muting
requests. That code can also handle the non-race condition, so we don't
require the first check.
2023-02-20 21:04:13 -08:00
Lauryn Menard 06dd6f8254 message-send: Deduplicate check of `settings.MAX_MESSAGE_LENGTH`.
Removes the initial check in `_internal_prep_message` of the length
of the message content because the `check_message` in the try block
will call `normalize_body` on the message content string, which
does a more robust check of the message content (empty string, null
bytes, length). If the message content length exceeds the value of
`settings.MAX_MESSAGE_LENGTH`, then it is truncated based on that
value. Updates associated backend test for these changes.

The removed length check would truncate the message content with a
hard coded value instead of using the value for
`settings.MAX_MESSAGE_LENGTH`.

Also, removes an extraneous comment about removing null bytes. If
there are null bytes in the message content, then `normalize_body`
will raise an error.

Note that the previous check had intentionally reduced any message over
the 10000 character limit to 3900 characters, with the code in
question dating to 2012's 100df7e349.

The 3900 character truncating rule was implemented for incoming emails
with the email gateway, and predated other features to help with
overly long messages (better stripping of email footers via Talon,
introduced in f1f48f305e, and
condensing, introduced in c92d664b44).
While we could preserve that logic if desired, it likely is no longer
a necessary or useful variation from our usual truncation rules.
2023-02-20 12:55:15 -08:00
Alex Vandiver d3403dde86 rabbitmq: Add a RABBITMQ_PORT setting. 2023-02-20 12:04:25 -08:00
Lauryn Menard 0463b0df7b api-docs: Update content parameter for `max_message_length`.
Updates the descriptions of content parameters (optional and
required) to note that the maximum size of the message content
should be based on the `max_message_length` value returned by
the register endpoint.

Previously these descriptions had a hardcoded value of 10000
bytes as the maximum message size.

Also, updates the description of `max_message_length` to clarify
that the value represents Unicode code points.
2023-02-20 12:03:12 -08:00
Lauryn Menard b25cba16a7 tests: Remove ignored parameter from `authenticated_json_view` test.
The password parameter being passed in the `_do_test` helper
function for `TestAuthenticatedJsonPostViewDecorator` tests was
being ignored, as the user needs to be logged in. Removes the
parameter from the helper function and updates the success test
to use `assert_json_success` instead of just checking the status
code.

Also adds a test case for when a user is not logged in to confirm
that it returns an UnauthorizedError.
2023-02-20 11:56:00 -08:00
Alex Vandiver 8998aa00cd Revert "create_user: Use transaction.atomic decorator for do_create_user."
This reverts commit 851d68e0fc.

That commit widened how long the transaction is open, which made it
much more likely that after the user was created in the transaction,
and the memcached caches were flushed, some other request will fill
the `get_realm_user_dicts` cache with data which did not include the
new user (because it had not been committed yet).

If a user creation request lost this race, the user would, upon first
request to `/`, get a blank page and a Javascript error:

    Unknown user_id in get_by_user_id: 12345

...where 12345 was their own user-id.  This error would persist until
the cache expired (in 7 days) or something else expunged it.

Reverting this does not prevent the race, as the post_save hook's call
to flush_user_profile is still in a transaction (and has been since
168f241ff0), and thus leaves the potential race window open.
However, it much shortens the potential window of opportunity, and is
a reasonable short-term stopgap.
2023-02-17 21:20:52 -05:00
Anders Kaseorg bf7d3124ab openapi: Correct “we-public” typo.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-15 15:21:54 -08:00
Alex Vandiver f8a16fb9af model: Handle parsed user-agents longer than 30 characters.
The Client.name field is only 30 characters long, but there is no
limit to the length of parsed User-Agent value which we may attempt to
store in it.  This can cause requests with long user-agents to 500
when the creation of the Client row fails.

Truncate the name at 30 characters for the cache key, and passing
`name` to `get_or_create`.
2023-02-14 14:58:15 -08:00
Alex Vandiver 3559bf416d settings: Allow customization of STATIC_URL.
Some requests will still be to `/static/`, either at the realm's
domain, or at the root domain.
2023-02-14 17:17:06 -05:00
Alex Vandiver dd21e472e7 subdomains: Extend tests for for same-host when STATIC_URL has domain. 2023-02-14 17:17:06 -05:00
Alex Vandiver a1967a7f2d markdown: Rewrite /static/ images to use staticfiles_storage.url. 2023-02-14 17:17:06 -05:00
Alex Vandiver ccecc8eb84 markdown: Comment why we do not hash or use STATIC_URL for :zulip:. 2023-02-14 17:17:06 -05:00
Alex Vandiver 6806bec2e9 email_notifications: Use stable, unhashed, URLs for emoji in emails. 2023-02-14 17:17:06 -05:00
Alex Vandiver 72ac38eff8 send_email: Use STATIC_URL as base for images in emails. 2023-02-14 17:17:06 -05:00
Alex Vandiver e31767dda4 settings: Make DEFAULT_LOGO_URI/DEFAULT_AVATAR_URI use staticfiles. 2023-02-14 17:17:06 -05:00
Alex Vandiver ede0ae9069 unminify: Do not assume /static/ exists in the stacktrace. 2023-02-14 17:17:06 -05:00
Alex Vandiver 25b97c2872 storage: Skip hashing markdown files. 2023-02-14 17:17:06 -05:00
Alex Vandiver d9af7f25e8 storage: Fix extension logic to not hash other resource files. 2023-02-14 17:17:06 -05:00
Abhijeet Prasad Bodas 9bc49ea4be message_send: Extract out stream notification user-set calculation logic.
This will allow us to re-use this logic later, when we add support for
re-checking notification settings just before sending email/push
notifications to the user.

Also, since this is essentially part of the notifiability logic,
this better belongs to `notification_data.py` and this change will
hopefully reduce the reading complexity of the message-send codepath.
2023-02-10 22:47:30 -08:00
Abhijeet Prasad Bodas 167ea63c1f message_send: Add comment on wildcard_mention_user_ids optimization.
The earlier commit explained the code, and not why we have this
optimization in the first place.
2023-02-10 22:47:30 -08:00
David Rosa 2b9b0acaa4 help: Update URL references to new "Configure email visibility" page. 2023-02-10 18:15:56 -08:00
Alya Abbott 9ea3572017 help: Document new user setting for configuring email visibility.
- Adds a URL redirect from the old page to the new page.
2023-02-10 18:15:56 -08:00
Sahil Batra 9d1dc20e6e settings: Remove realm-level email_address_visibility setting.
This was replaced by the new user-level version in recent commits.

Fixes #20035.
Fixes #18149.
2023-02-10 17:40:33 -08:00
Sahil Batra 1b110552ec settings: Migrate to use user-level email_address_visibility setting. 2023-02-10 17:35:50 -08:00
Sahil Batra 0ed5f76063 settings: Add backend code for using user email_address_visibility setting.
This commits update the code to use user-level email_address_visibility
setting instead of realm-level to set or update the value of UserProfile.email
field and to send the emails to clients.

Major changes are -

- UserProfile.email field is set while creating the user according to
RealmUserDefault.email_address_visbility.

- UserProfile.email field is updated according to change in the setting.

- 'email_address_visibility' is added to person objects in user add event
and in avatar change event.

- client_gravatar can be different for different users when computing
avatar_url for messages and user objects since email available to clients
is dependent on user-level setting.

- For bots, email_address_visibility is set to EVERYONE while creating
them irrespective of realm-default value.

- Test changes are basically setting user-level setting instead of realm
setting and modifying the checks accordingly.
2023-02-10 17:35:49 -08:00
Sahil Batra ea0b2286e0 settings: Add email_address_visbility user setting. 2023-02-10 17:35:49 -08:00
Sahil Batra 2396e6858f users: Always pass "delivery_email" field in user objects.
Previously, user objects contained delivery_email field
only when user had access to real email. Also, delivery_email
was not present if visibility setting is set to "everyone"
as email field was itself set to real email.

This commit changes the code to pass "delivery_email" field
always in the user objects with its value being "None" if
user does not have access to real email and real email otherwise.
The "delivery_email" field value is None for logged-out users.
For bots, the "delivery_email" is always set to real email
irrespective of email_address_visibility setting.

Also, since user has access to real email if visibility is set
to "everyone", "delivery_email" field is passed in that case
too.

There is no change in email field and it is same as before.

This commit also adds code to send event to update delivery_email
field when email_address_visibility setting changes to all the
users whose access to emails changes and also changes the code to
send event on changing delivery_email to users who have access
to email.
2023-02-10 17:35:49 -08:00
David Rosa 8c047275a4 help: Update URLs to match "Restrict moving messages" title.
Updates all references to the new URL and adds a URL redirect.

Follow up to #24329.
2023-02-10 15:56:16 -08:00
David Rosa 3254023fa3 help: Update URLs to match "Restrict message editing and deletion" title.
Updates all references to the new URL and adds a URL redirect.

Follow up to #24329.
2023-02-10 15:56:16 -08:00
Anders Kaseorg 41aadc8c63 message_send: Convert RecipientInfoResult from TypedDict to dataclass.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-10 15:51:22 -08:00
Tim Abbott 387f178ef1 lint: Fix import sort order. 2023-02-10 15:47:32 -08:00
Abhijeet Prasad Bodas 7d02b20bdb user topics: Rename test file.
This makes the file name in line with the model name and other files,
for better discoverability.

This probably should have been a part of
48883f1580.
2023-02-10 15:39:57 -08:00
Abhijeet Prasad Bodas 9d1c131dc6 muted users: Make file naming consistent.
This makes the names of the relevant files consistant with the database
model name as well as the frontend JS files.
2023-02-10 15:39:57 -08:00
Anders Kaseorg 0a1904a6a7 markdown: Rewrite YouTube URL parser without regex spaghetti.
This also adds support for the new YouTube Shorts URLs.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 22:34:51 -08:00
Anders Kaseorg 70ac144d57 markdown: Replace custom cache decorator with functools.lru_cache.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 15:46:11 -08:00
Alex Vandiver 02a73af386 deferred_work: Log at start of the work.
This is helpful for debugging -- generally these tasks are in a worker
queue because they take a long time to run, so knowing what long task
is about to start before it does, rather than just after, is useful.
2023-02-09 12:06:38 -08:00
Anders Kaseorg f90a41bab0 upload: Replace deprecated PIL.PngImagePlugin.APNG_DISPOSE_OP_NONE.
https://pillow.readthedocs.io/en/stable/deprecations.html#constants

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 10:06:25 -08:00
Anders Kaseorg 201debc1ea upload: Replace deprecated PIL.Image.ANTIALIAS with LANCZOS.
https://pillow.readthedocs.io/en/stable/deprecations.html#constants

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-08 17:53:45 -08:00
Anders Kaseorg 6992d3297a ruff: Fix PIE810 Call `startswith` once with a `tuple`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-08 16:40:35 -08:00
Sahil Batra 891f83601d message_edit: Use move_messages_between_streams_limit_seconds setting.
This commit adds time restriction on moving messages between streams
using the move_messages_between_streams_limit_seconds setting in the
backend. There is no time limit for admins and moderators.
2023-02-08 12:46:05 -08:00
Sahil Batra 2c4e076fef message_edit: Use move_messages_within_stream_limit_seconds setting.
We now use the newly added move_messages_within_stream_limit_seconds
setting to check for how long the user can edit the topic replacing
the previously used 3-day limit. As it was previously, there is no
time limit for admins and moderators.
2023-02-08 12:46:05 -08:00
Sahil Batra b919dfd489 realm: Add time limit setting for moving messages between streams.
This commit adds "move_messages_between_streams_limit_seconds"
setting which would be used to set a time limit to move messages
between streams.
2023-02-08 12:46:05 -08:00
Sahil Batra 73f0eae394 realm: Add time limit setting for moving message within stream.
This commit adds "move_messages_within_streams_limit_seconds"
setting which would be used to set a time limit to move messages
within stream.
2023-02-08 12:46:05 -08:00
Sahil Batra 5b1f6a696e message: Refactor code for parsing message time limit settings.
This commit renames parse_message_content_edit_or_delete_limit
to parse_message_time_limit_setting and also renames
MESSAGE_CONTENT_EDIT_OR_DELETE_LIMIT_SPECIAL_VALUES_MAP to
MESSAGE_TIME_LIMIT_SETTING_SPECIAL_VALUES_MAP.

We do this change since this function and object will also be
used for message move limit and it makes sense to have a more
generic name.
2023-02-08 10:59:28 -08:00
Sahil Batra 1a656d2e23 realm: Deduplicate code for parsing message edit and delete limit settings.
This commit extracts a function to parse message time limit type settings
and to set it if the new setting value is None.

This function is currently used for message_content_edit_limit_seconds and
message_content_delete_limit_seconds settings and will be used for
message_move_limit_seconds setting to be added in further commits.
2023-02-08 10:59:28 -08:00
Sahil Batra e53b09fb12 events: Remove code for settings which are included in property_types.
These lines in fetch_initial_state_data are redundant now since these
settings are already included in property_types after 04693b6ac1.
2023-02-08 10:59:28 -08:00
Alex Vandiver e3c976fed9 missed_messages: Include messages from topic which differ by case.
In Zulip, message topics are case-insensitive but case-preserving.
The `get_context_for_message` function erroneously did a
case-sensitive search, and thus only messages whose topic matched
exactly were pulled in as context.

Make the missed-message pipeline aware that message topics are not
case-sensitive.  This means that, when collapsing adjacent messages,
we merge messages with topic headers which are "different"; create a
separate explicit "grouping" to know which to collapse.
2023-02-07 11:04:21 -08:00
Alex Vandiver 23894fc9a3 uploads: Set Content-Type and -Disposition from Django for local files.
Similar to the previous commit, Django was responsible for setting the
Content-Disposition based on the filename, whereas the Content-Type
was set by nginx based on the filename.  This difference is not
exploitable, as even if they somehow disagreed with Django's expected
Content-Type, nginx will only ever respond with Content-Types found in
`uploads.types` -- none of which are unsafe for user-supplied content.

However, for consistency, have Django provide both Content-Type and
Content-Disposition headers.
2023-02-07 17:12:02 +00:00
Alex Vandiver 2f6c5a883e CVE-2023-22735: Provide the Content-Disposition header from S3.
The Content-Type of user-provided uploads was provided by the browser
at initial upload time, and stored in S3; however, 04cf68b45e
switched to determining the Content-Disposition merely from the
filename.  This makes uploads vulnerable to a stored XSS, wherein a
file uploaded with a content-type of `text/html` and an extension of
`.png` would be served to browsers as `Content-Disposition: inline`,
which is unsafe.

The `Content-Security-Policy` headers in the previous commit mitigate
this, but only for browsers which support them.

Revert parts of 04cf68b45e, specifically by allowing S3 to provide
the Content-Disposition header, and using the
`ResponseContentDisposition` argument when necessary to override it to
`attachment`.  Because we expect S3 responses to vary based on this
argument, we include it in the cache key; since the query parameter
has dashes in it, we can't use use the helper `$arg_` variables, and
must parse it from the query parameters manually.

Adding the disposition may decrease the cache hit rate somewhat, but
downloads are infrequent enough that it is unlikely to have a
noticeable effect.  We take care to not adjust the cache key for
requests which do not specify the disposition.
2023-02-07 17:09:52 +00:00
Alex Vandiver d41a00b83b uploads: Extra-escape internal S3 paths.
In nginx, `location` blocks operate on the _decoded_ URI[^1]:

> The matching is performed against a normalized URI, after decoding
> the text encoded in the “%XX” form

This means that if a user-uploaded file contains characters that are
not URI-safe, the browser encodes them in UTF-8 and then URI-encodes
them -- and nginx decodes them and reassembles the original character
before running the `location ~ ^/...` match.  This means that the `$2`
_is not URI-encoded_ and _may contain non-ASCII characters.

When `proxy_pass` is passed a value containing one or more variables,
it does no encoding on that expanded value, assuming that the bytes
are exactly as they should be passed to the upstream.  This means that
directly calling `proxy_pass https://$1/$2` would result in sending
high-bit characters to the S3 upstream, which would rightly balk.

However, a longstanding bug in nginx's `set` directive[^2] means that
the following line:

```nginx
set $download_url https://$1/$2;
```

...results in nginx accidentally URI-encoding $1 and $2 when they are
inserted, resulting in a `$download_url` which is suitable to pass to
`proxy_pass`.  This bug is only present with numeric capture
variables, not named captures; this is particularly relevant because
numeric captures are easily overridden by additional regexes
elsewhere, as subsequent commits will add.

Fixing this is complicated; nginx does not supply any way to escape
values[^3], besides a third-party module[^4] which is an undue
complication to begin using.  The only variable which nginx exposes
which is _not_ un-escaped already is `$request_uri`, which contains
the very original URL sent by the browser -- and thus can't respect
any work done in Django to generate the `X-Accel-Redirect` (e.g., for
`/user_uploads/temporary/` URLs).  We also cannot pass these URLs to
nginx via query-parameters, since `$arg_foo` values are not
URI-decoded by nginx, there is no function to do so[^3], and the
values must be URI-encoded because they themselves are URLs with query
parameters.

Extra-URI-encode the path that we pass to the `X-Accel-Redirect`
location, for S3 redirects.  We rely on the `location` block
un-escaping that layer, leaving `$s3_hostname` and `$s3_path` as they
were intended in Django.

This works around the nginx bug, with no behaviour change.

[^1]: http://nginx.org/en/docs/http/ngx_http_core_module.html#location
[^2]: https://trac.nginx.org/nginx/ticket/348
[^3]: https://trac.nginx.org/nginx/ticket/52
[^4]: https://github.com/openresty/set-misc-nginx-module#set_escape_uri
2023-02-07 17:09:52 +00:00
David Rosa 2b8dfbfe1f help: Improve relative settings links for documentation on bots.
Fixes the documentation generated from the Markdown macros
{settings_tab|your-bots} and {settings_tab|bot-list-admin} to
match the text labels in the Zulip UI and improves the text of
relative links to explicitly say if we are referring to the Bots
tab of the Personal or Organization settings menu.

Follow-up to #23256.
2023-02-06 15:06:15 -08:00
David Rosa df7e409cd9 help_settings_links: Refactor handleMatch function.
This code needs to be more flexible to improve the documentation
of items in the Personal and Organization settings menu when
using the `{settings_tab|[setting-name]}` Markdownm macro that
provides relative links or step-by-step instructions.

This commit moves the Markdown formatting code to a new function that
receives tuples from `link_mapping` as input. This is a preliminary
step to offer more flexibility than the current approach.
2023-02-06 15:06:15 -08:00
Prakhar Pratyush 906ff9243a user_mutes: Rename 'muting.py' to 'user_mutes.py'.
Rename 'muting.py' to 'user_mutes.py' because it, now
, contains only user-mute related functions.

Includes minor refactoring needed after renaming the file.
2023-02-07 00:23:47 +05:30
Prakhar Pratyush 49577bbdcd user_topics: Move topic muting functions to user_topics.py.
This commit moves topic related stuff i.e. topic muting functions
to a separate file 'views/user_topics.py'.

'views/muting.py' contains functions related to user-mutes only.
2023-02-07 00:23:47 +05:30
Kartik Srivastava 48883f1580 tests: Rename test_muting_topics to test_topic_visibility_policies.
As we'll be adding support for more visibility policies for topics,
this renamed file will contain the necessary tests.
2023-02-07 00:23:47 +05:30
Sahil Batra 73f11853ec streams: Allow setting can_remove_subscribers_group_id while creating streams.
This commit adds API support to set can_remove_subscribers_group setting
when creating streams.
2023-02-05 14:46:36 -08:00
Sahil Batra c3759814be streams: Allow changing can_remove_subscribers_group through API.
This commit adds API support to change can_remove_subscribers_group
setting for a stream.
2023-02-05 14:46:36 -08:00
Aman Agrawal 9965ad2ea3 registration: Track create organization page in GA.
This will help us track if users actually clicked on the
email confirmation link while creating a new organization.

Replaced all the `reder` calls in `accounts_register` with
`TemplateResponse` to comply with `add_google_analytics`
decorator.
2023-02-05 10:24:32 -08:00
Anders Kaseorg b91788b945 markdown: Replace deprecated UnescapePostprocessor.
See https://github.com/Python-Markdown/markdown/pull/1272.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:36:47 -08:00
Anders Kaseorg b08c81b312 ruff: Fix G201 Logging `.error(..., exc_info=True)` → `.exception(...)`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:36:20 -08:00
Anders Kaseorg 59eca10a43 ruff: Fix G004 Logging statement uses f-string.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:36:20 -08:00
Anders Kaseorg da3cf5ea7a ruff: Fix RSE102 Unnecessary parentheses on raised exception.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:34:55 -08:00
Prakhar Pratyush 9997131df3 message_edit: Fix resolve topic notifications.
When 'resolve|unresolve' and 'move stream' actions occurs in
the same api call, 'This topic was marked as resolved|unresolved'
notification is not sent.

Both 'topic moved' and 'topic resolved' notification should be generated.

This commit updates the logic of when and where to send
'topic resolve|unresolve' notification. Unlike previous logic, notification
may be sent even in the case 'new_stream' is not None.

In general, 'topic resolved|unresolved' notification is sent to
'stream_being_edited'. In this particular case ('new_stream' is not None),
notification is sent to the 'new_stream' after check.

Test case is included.

Fixes: #22973
2023-02-03 18:20:39 -08:00
Anders Kaseorg 81a7c7502f requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-03 16:36:54 -08:00
Anders Kaseorg 5b7c4206d7 ruff: Fix SIM300 Yoda conditions are discouraged.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-03 16:36:54 -08:00
Anders Kaseorg b85cb1e700 exceptions: Accept lazy translation as JsonableError argument.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-03 16:36:54 -08:00
Alessandro Toppi ff89590558 auth: Add JWT-based user API key fetch.
This adds a new endpoint /jwt/fetch_api_key that accepts a JWT and can
be used to fetch API keys for a certain user. The target realm is
inferred from the request and the user email is part of the JWT.

A JSON containing an user API key, delivery email and (optionally)
raw user profile data is returned in response.
The profile data in the response is optional and can be retrieved by
setting the POST param "include_profile" to "true" (default=false).

Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
2023-02-03 15:23:35 -08:00
Tim Abbott 7c7ca61e9f auth: Remove now unnecessary return_data parameter. 2023-02-03 15:23:35 -08:00
Mateusz Mandera 75b44337a8 auth: Extract get_api_key_fetch_authenticate_failure. 2023-02-03 15:23:35 -08:00
Mateusz Mandera 3a1d974cee auth: Extract process_api_key_fetch_authenticate_result function.
This will be useful for re-use in the implementation of another JWT
endpoint in the upcoming commits.
2023-02-03 15:23:35 -08:00
Mateusz Mandera 100f4a7152 auth: Add @require_post to remote_user_jwt. 2023-02-03 15:23:35 -08:00
Mateusz Mandera d466da1064 auth: Use REQ for getting token value and rename param in jwt paths.
This makes us use REQ properly instead of fetching from request.POST
manually - also renaming the param to "token" which is more standard.
2023-02-03 15:23:35 -08:00
Mateusz Mandera c9c255b3a8 auth: Improve JsonableError in get_..._jwt_authentication_request. 2023-02-03 15:23:35 -08:00
Mateusz Mandera 6c638a1057 auth: Extract token-check logic of remote_user_jwt.
This will be useful for re-use for implementation of an endpoint for
obtaining the API by submitting a JWT in the next commits.

It's not a pure refactor, as it requires some tweaks to remote_user_jwt
behavior:
1. The expected format of the request is changed a bit. It used to
   expect "user" and "realm" keys, from which the intended email was
   just generated by joining with @. Now it just expects "email"
   straight-up. The prior design was a bt strange to begin with, so this
   might be an improvement actually.
2. In the case of the codepath of new user signup, this will no longer
   pre-populate the Full Name in the registration form with the value
   from the "user" key. This should be a very minor lost of
   functionality, because the "user" value was not going to be a proper
   Full Name anyway. This functionality can be restored in a future
   commit if desired.

This is an API change, but this endpoint is nearly unused as far as
we're aware.
2023-02-03 15:23:35 -08:00
Tim Abbott 8d27d14354 models: Improve documentation for FOLLOWED UserTopics. 2023-02-03 11:25:05 -08:00
Aman Agrawal c3f0d5898a index: Remove `Loading...` text from the app loading overlay.
The new loading spinner animation conveys the message well
that the app is loading and this text is no longer required.
2023-02-03 10:51:39 -08:00
Lauryn Menard dbacc00f0f api-docs: Move markdown files to top level directory.
- Updates `.prettierignore` for the new directory.
- Updates any reference to the API documentation directory for
  markdown files to be `api_docs/` instead of `zerver/api/`.
- Removes a reference link from `docs/documentation/api.md` that
  hasn't referenced anything in the text since commit 0542c60.
- Update rendering of API documentation for new directory.
2023-02-02 17:25:40 -08:00
Lauryn Menard fc54ffd778 documentation: Move check for `api-doc-template.md`.
Moves the check for calling the `api-doc-template.md` directly,
so that we don't return a 500 error from the server, to happen
earlier with other checks for returning a 404 / missing page.

Also adds a specific test to `zerver/tests/test_urls` for this
template.

Prep commit for moving API documentation directory to be a top
level directory.
2023-02-02 17:23:31 -08:00
Alya Abbott 27b703f617
emails: Improve followup_day1 (registration confirmation) email.
- Clean up the language.
- Add a prominent "Go to organization" button.
- Link to guides for new users and admins.
- Fix duplication bug in text email version.

Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
2023-02-02 17:16:43 -08:00
Anders Kaseorg df001db1a9 black: Reformat with Black 23.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.

(This does not actually upgrade our Python environment to Black 23
yet.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-02 10:40:13 -08:00
Lauryn Menard b64d8e8a52 templates: Remove `base_path` argument from api_arguments_table_generator.
Removes `base_path` argument when making the markdown extension for
parameters in documentation for API endpoints.

This seems to have been originally included for API parameters that
were documented in JSON files, which is no longer in use. Now all
API endpoints in the documentation are documented in
`zerver/openapi/zulip.yaml`.
2023-02-01 13:12:53 -08:00
Lauryn Menard f38d5a6a26 templates: Remove `base_path` argument from api_return_values_generator.
Removes `base_path` argument when making the markdown extension for
return values in documentation for API endpoints.

This seems to have been a copy and paste error in commit d2ee99a2fd
when `zerver/lib/markdown/api_return_values_generator.py` was created.
2023-02-01 13:12:53 -08:00
Lauryn Menard 20fb9af974 help: Add redirect for `help/configure-default-view`.
Adds URL redirect that was missed in commit 73710e1cf0 when
`change-default-view.md` was renamed to `configure-default-view.md`.
2023-02-01 06:10:14 -08:00
Lalit 4ab824dc4c
emoji: Disallow `.` in custom emoji names.
Until now, custom emojis with "periods" in their name were allowed, even though
they don't really fit the pattern of how we name them, and in fact the Markdown
processor would not render such custom emoji. Fix this by just disallowing the
character.

Also update the error strings accordingly.

Note that this does not include a migration to eliminate any existing custom emoji with this 
character in their name.

Fixes #24066.
2023-01-31 17:28:33 -08:00
Alex Vandiver 92c8c17190 import: Add the UTF-8 flag on file entries in zipfiles from Slack.
Fixes: #22533.
2023-01-31 16:07:48 -08:00
N-Shar-ma 282bf680f2 emoji: Replace `built-in` with `default` in emoji overriding API error. 2023-01-31 15:52:33 -08:00
Tim Abbott 22712f634d fix_unreads: Remove dead duplicate build_topic_unmute_checker.
The only caller for this function, which is part of a very old
migration, was removed in b77da6b834.
2023-01-31 11:18:44 -08:00
Sahil Batra e14b1f03fa register: Refactor template for new account registration.
This commit refactors the template code for source-realm
select element to have same structure as other inputs
and select element in the page. Thus this change also
makes the styling of source-realm select element consistent
with other select element in the page.
2023-01-26 12:36:31 -08:00
Sahil Batra 851d68e0fc create_user: Use transaction.atomic decorator for do_create_user.
We change the do_create_user function to use transaction.atomic
decorator instead of using with block. Due to this change, all
send_event calls are made inside transaction.on_commit.

Some other changes -
- Remove transaction.atomic decorator from send_inital_realm_messages
since it is now called inside a transaction.
- Made changes in tests which tests message events and notifications
to make sure on_commit callbacks are executed.
2023-01-26 10:49:19 -08:00
Sahil Batra 6cc468f6d4 create_user: Use transaction.atomic as decorator with do_reactivate_user.
This commit changes the do_reactivate_user such that the complete function
is called inside an atomic transaction and events are called after the
transaction is commited using on_commit helper. This is a prep commit
for unsubscribing the bots of unaccessible private streams when reactivating
them.
2023-01-26 10:49:19 -08:00
Sahil Batra 5635881664 realm_export: Add transaction.atomic to export_realm. 2023-01-26 10:49:19 -08:00
David Rosa 8d1db6482f help center: Update relative gear menu macro for organization settings.
Renames "Manage organization" to "Organization settings" to reflect
changes in the previous commit.
2023-01-26 10:17:45 -08:00
Anders Kaseorg cb8c7f2a17 ruff: Fix UP032 Use f-string instead of `format` call.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-26 10:16:30 -08:00
Anders Kaseorg 4eda29bd86 ruff: Fix RUF005 Consider spread instead of concatenation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-26 10:16:30 -08:00
Anders Kaseorg 7e3a681f80 ruff: Fix S108 Probable insecure usage of temporary file.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-26 10:14:56 -08:00
Lauryn Menard dedea23745
help-docs: Move help center documentation to top level directory.
These files are not Jinja2 templates, so there's no reason that they needed
to be inside `templates/zerver`. Moving them to the top level reflects their
importance and also makes it feel nicer to work on editing the help center content, 
without it being unnecessary buried deep in the codebase.
2023-01-25 14:08:29 -08:00
Alex Vandiver 23f4cde91c email_mirror: Ensure that attachments get space to be included.
The content of a message is truncated to `MAX_MESSAGE_LENGTH`, which
is 1000 characters.  Since the email gateway places attachments at the
very end of the extracted body, that means that they are the first
thing to get truncated off.

That is, if an incoming email message contains 1000 `a`s and an image
attachment, the link that attaches the attachment to the message will
get truncated off, leaving it dangling in the database.

Truncate the message body content separately from the attachment links
which are included at the end of the body.
2023-01-24 13:22:13 -08:00
Lauryn Menard e7cecc989f documentation: Clean up `MarkdownDirectoryView.get_context_data`.
Changes the check for whether the documentation page is a policy
center page to be the `self.policies_view` boolean instead of the
`path_template` value as it reads much more clearly.

Moves a comment in the code to be contextually relevant.
2023-01-24 13:04:04 -08:00
Lauryn Menard 8859da911b documentation: Rename `article_path` variable when getting context.
Because of the overlap with the `DocumentationArticle` dataclass
field `article_path`, we rename the `article_path` variable used
in `MarkdownDirectoryView.get_context_data` for the absolute path
to be `article_absolute_path`.
2023-01-24 13:04:04 -08:00
Lauryn Menard f7d5cd5690 documentation: Remove "not_index_page" from context.
In commit bbecd41, we added "not_index_page" to the context for
some documentation articles, but use of that context key/value was
removed when the help documentation was removed in commit 1cf7ee9.

Changes `not_index_page` to be a boolean value that's used to set
the page title, but is not then passed on as a context key/value.

Also removes an irrelevant comment about disabling "Back to home"
on the homepage.
2023-01-24 13:04:04 -08:00
Aman Agrawal 37431cf0b5 urls: Provide `email` as a GET parameter.
Since we want to use `accounts/new/send_confirm` to know how many
users actually register after visiting the register page, we
added it to Google Tag Manager, but GTM tracks every user
registration separately due <email> in the URL
making it harder to track.

To solve this, we want to pass <email> as a GET parameter which
can be easily filtered inside GTM using a RegEx and all the
registrations can be tracked as one.
2023-01-24 11:29:50 -08:00
Aman Agrawal a51bf96c70 accounts_send_confirm: Show email to which the link was sent.
This can be useful for the user in case user is worried if they
typed the correct email.
2023-01-24 11:29:50 -08:00
Prakhar Pratyush 1a400b21e7 notifications: Fix missed message email notifications of welcome bot.
A missed message email notification, where the message is the welcome
message sent by the welcome bot on account creation, get sent when
the user somehow not focuses the browser tab during account creation.

No missed message email or push notifications should be sent for the
messages generated by the welcome bot.

'internal_send_private_message' accepts a parameter
'disable_external_notifications' and is set to 'True' when the sender
is 'welcome bot'.

A check is introduced in `trivially_should_not_notify`, not to notify
if `disable_external_notifications` is true.

TestCases are updated to include the `disable_external_notifications`
check in the early (False) return patterns of `is_push_notifiable` and
`is_email_notifiable`.

One query reduced for both `test_create_user_with_multiple_streams`
and `test_register`.
Reason: When welcome bot sends message after user creation
`do_send_messages` calls `get_active_presence_idle_user_ids`,
`user_ids` in `get_active_presence_idle_user_ids` remains empty if
`disable_external_notifications` is true because `is_notifiable` returns
false.
`get_active_presence_idle_user_ids` calls `filter_presence_idle_user_ids`
and since the `user_ids` is empty, the query inside the function doesn't
get executed.

MissedMessageHookTest updated.

Fixes: #22884
2023-01-24 11:16:21 -08:00
Prakhar Pratyush b40bbd6ca8 message_send: Refactor internal_send_*, internal_prep_* & _internal_prep_*.
This commit makes all the parameters after 'content' in
'internal_send_*', 'internal_prep_*' and '_internal_prep_*'
a mandatory keyword argument to increase code readability.
2023-01-24 11:16:21 -08:00
Prakhar Pratyush 4595b5d132 notifications: Add separate function for `is_notifiable` trivial checks.
A separate function named `trivially_should_not_notify` is added which
extracts the common checks from `get_push_notification_trigger` and
`get_email_notification_trigger` which are users' notification settings
independent and thus don't depend on what type of notification (email/push)
it is.
2023-01-24 11:16:14 -08:00
Alex Vandiver 994806c505 migrations: Fix ownership, and re-attach, mis-owned email attachments.
608c787c52 fixed a bug where messages sent by the email gateway "as"
a user failed to properly attribute ownership of their attachments,
leaving the attachments orphaned and thus with nobody with permissions
to view them.

These orphaned attachments only remain longer than a few weeks if the
`delete_unclaimed_attachments` script has not been run reliably.
Since there is currently no shipped cron job for this, that is most
likely all deployments.

Add a migration to find such orphaned attachments, and re-attach them
to their original message.  While theoretically the attachments
could have been later referenced in other messages -- which would be
very difficult to find and determine if they had access to the
attachment -- we only fix the original message.

In order to make this somewhat performant, we assume that the Message
rows associated with an Attachment made by the email gateway happened
within 5 minutes, since they must have been made during one HTTP
request.

This is complicated by the message potentially having been deleted; in
this case, the Attachment is moved into ArchivedAttachment, so it can
relate to the ArchivedMessage.  The many-to-many
`zerver_archivedattachment_messages` relationship table cannot use its
own `id` sequence for the value, since the `id` is re-used when the
row is inserted into the `zerver_attachment_messages` table -- we
instead consume a value from the `id` sequence of the
`zerver_attachment_messages` table.
2023-01-24 10:49:46 -08:00
Anders Kaseorg d3164016f5 ruff: Fix UP032 Use f-string instead of `format` call.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg e5d671bf2b ruff: Fix SIM210 Use `bool(…)` instead of `True if … else False`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg 7a7513f6e0 ruff: Fix SIM201 Use `… != …` instead of `not … == …`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg 25346bde98 ruff: Fix SIM118 Use `k in d` instead of `k in d.keys()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg 6303ebfc2f ruff: Fix SIM115 Use context handler for opening files.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg b8b29dc3ad ruff: Fix SIM110 Use `return any(…)` instead of `for` loop.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg ff1971f5ad ruff: Fix SIM105 Use `contextlib.suppress` instead of try-except-pass.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg b0e569f07c ruff: Fix SIM102 nested `if` statements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
David Rosa f70b321773 help center: Document bot name in org settings link to bot's user card.
Documents link to the bot's user card from the bot's name in
Organization settings > Bots, and information in the bot's user card.

Fixes part of #23970.
2023-01-19 11:13:33 -08:00
Alex Vandiver 608c787c52 email_mirror: Create attachments as the message sender.
When the email mirror gateway is sending messages "as" a user (as
triggered by having access to the missed-message email address),
attachments were still created as the Email Gateway bot.  Since the
sender (the end-user) was not the owner of those attachments (the
gateway bot), nor were they referenced yet anywhere, this resulted in
the attachments being "orphaned" and not allowed to be accessed by
anyone -- despite the attachment links being embedded in the message.
This was accompanied by the error:

```
WARN [] User 12345 tried to share upload 123/3LkSA4OcoG6OpAknS2I0SFAQ/example.jpf in message 123456, but lacks permission
INFO [zerver.lib.email_mirror] Successfully processed email from user 12345 to example-stream
```

We solve this by creating attachment objects as the users the message
will be sent from.
2023-01-18 15:42:40 -08:00
Anders Kaseorg 8f7a7877fe python: Clean up janky URL matching code with urlsplit.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-18 17:25:46 -05:00
Anders Kaseorg 9a7f33ab98 migrations: Fix ‘continue’ logic error in 0037.
The intention was to continue the outer ‘for’ loop, not the inner one
(but Python doesn’t have labelled ‘continue’).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-18 13:34:52 -08:00
Trident Pancake c6ea673cc9 markdown: Update max inline preview from 10 to 24.
The max inline preview limit was previously increased to 10 by #20789.
However, as issue #23624 shows, it's still causing confusion for users
when they include more than 10 links.

Bump this limit up to 24, which is a multiple of the 4 image preview
per line logic.
2023-01-18 14:58:00 -05:00
Lauryn Menard ba443cac03 templates: Set robots noindex for attribution corporate template.
Overrides the default context `allow_search_engine_indexing` to
always be `False` for `templates/corporate/attribution.html` so
that it does not appear in Google / search engine indexes.

Updates test of documentation pages in `test_docs.py` to have an
option for corporate pages to set this value in the template and
verifies that the meta tag for robots noindex, nofollow is
always in the response.
2023-01-17 15:00:39 -08:00
Lauryn Menard c2bcfb52aa api-tests: Reduce error output for `/register` openapi validation.
For descriptive endpoints, such as `/register`, that might raise
Schema Validation errors via `validate_against_openapi_schema`,
omits the OpenAPI schema definition in the error output.

Also omits the error instance definition in the error output
when it is a jsonschema object with over 100 properties. This
means that the test instance for objects, like user settings,
will be printed in the error output, but the test instance for
the entire endpoint will not be printed to the console.

The omitted output can be thousands of lines long making it
difficult to find the initial console output that actually helps
the contributor with debugging.

Adds a section in "Documenting REST API endpoints" about
debugging and understanding these errors that is linked to
in the error console output.
2023-01-17 14:50:42 -08:00
Lauryn Menard fe03d2a533 api-docs: Clarify only API doc paths check for endpoint info.
Previously, we got the directory path for all documentation pages
before checking for API method and path information in the OpenAPI
documentation. Instead, we now check the `path_template` is the
API documentation view template before getting the directory path.

Also, changes the confusingly named `article_path` variable, which
overlapped with the DocumentationArticle dataclass `article_path`
field, to now be `api_documentation_path`.

Prep commit for moving the help center documentation to a top level
directory.
2023-01-10 15:32:47 -08:00
Mateusz Mandera 89d1f1f385 messages: Eliminate redundant realm fetch in has_message_access.
Accessing .realm will cause a fetch query from the database if the
attribute hasn't been fetched already earlier in the codepath. That's
completely redundant if we're just comparing realms, and we should only
access .realm_id attribute. This seems to eliminate a query in some
codepaths, which is nice in this performance-sensitive function.
2023-01-10 15:27:55 -08:00
Lauryn Menard a7fd994cbd docs: Link to management commands documentation in user facing docs.
Adds links to the documentation about management commands in the
API documentation for creating users, as well as the `/devtools`
documentation, the GDPR compliance article and the incoming
webhooks tutorial.
2023-01-10 08:50:00 -08:00
Alex Vandiver 04cf68b45e uploads: Serve S3 uploads directly from nginx.
When file uploads are stored in S3, this means that Zulip serves as a
302 to S3.  Because browsers do not cache redirects, this means that
no image contents can be cached -- and upon every page load or reload,
every recently-posted image must be re-fetched.  This incurs extra
load on the Zulip server, as well as potentially excessive bandwidth
usage from S3, and on the client's connection.

Switch to fetching the content from S3 in nginx, and serving the
content from nginx.  These have `Cache-control: private, immutable`
headers set on the response, allowing browsers to cache them locally.

Because nginx fetching from S3 can be slow, and requests for uploads
will generally be bunched around when a message containing them are
first posted, we instruct nginx to cache the contents locally.  This
is safe because uploaded file contents are immutable; access control
is still mediated by Django.  The nginx cache key is the URL without
query parameters, as those parameters include a time-limited signed
authentication parameter which lets nginx fetch the non-public file.

This adds a number of nginx-level configuration parameters to control
the caching which nginx performs, including the amount of in-memory
index for he cache, the maximum storage of the cache on disk, and how
long data is retained in the cache.  The currently-chosen figures are
reasonable for small to medium deployments.

The most notable effect of this change is in allowing browsers to
cache uploaded image content; however, while there will be many fewer
requests, it also has an improvement on request latency.  The
following tests were done with a non-AWS client in SFO, a server and
S3 storage in us-east-1, and with 100 requests after 10 requests of
warm-up (to fill the nginx cache).  The mean and standard deviation
are shown.

|                   | Redirect to S3      | Caching proxy, hot  | Caching proxy, cold |
| ----------------- | ------------------- | ------------------- | ------------------- |
| Time in Django    | 263.0 ms ±  28.3 ms | 258.0 ms ±  12.3 ms | 258.0 ms ±  12.3 ms |
| Small file (842b) | 586.1 ms ±  21.1 ms | 266.1 ms ±  67.4 ms | 288.6 ms ±  17.7 ms |
| Large file (660k) | 959.6 ms ± 137.9 ms | 609.5 ms ±  13.0 ms | 648.1 ms ±  43.2 ms |

The hot-cache performance is faster for both large and small files,
since it saves the client the time having to make a second request to
a separate host.  This performance improvement remains at least 100ms
even if the client is on the same coast as the server.

Cold nginx caches are only slightly slower than hot caches, because
VPC access to S3 endpoints is extremely fast (assuming it is in the
same region as the host), and nginx can pool connections to S3 and
reuse them.

However, all of the 648ms taken to serve a cold-cache large file is
occupied in nginx, as opposed to the only 263ms which was spent in
nginx when using redirects to S3.  This means that to overall spend
less time responding to uploaded-file requests in nginx, clients will
need to find files in their local cache, and skip making an
uploaded-file request, at least 60% of the time.  Modeling shows a
reduction in the number of client requests by about 70% - 80%.

The `Content-Disposition` header logic can now also be entirely shared
with the local-file codepath, as can the `url_only` path used by
mobile clients.  While we could provide the direct-to-S3 temporary
signed URL to mobile clients, we choose to provide the
served-from-Zulip signed URL, to better control caching headers on it,
and greater consistency.  In doing so, we adjust the salt used for the
URL; since these URLs are only valid for 60s, the effect of this salt
change is minimal.
2023-01-09 18:23:58 -05:00
Alex Vandiver 58dc1059f3 uploads: Move unauth-signed tokens into view. 2023-01-09 18:23:58 -05:00
Alex Vandiver ed6d62a9e7 avatars: Serve /user_avatars/ through Django, which offloads to nginx.
Moving `/user_avatars/` to being served partially through Django
removes the need for the `no_serve_uploads` nginx reconfiguring when
switching between S3 and local backends.  This is important because a
subsequent commit will move S3 attachments to being served through
nginx, which would make `no_serve_uploads` entirely nonsensical of a
name.

Serve the files through Django, with an offload for the actual image
response to an internal nginx route.  In development, serve the files
directly in Django.

We do _not_ mark the contents as immutable for caching purposes, since
the path for avatar images is hashed only by their user-id and a salt,
and as such are reused when a user's avatar is updated.
2023-01-09 18:23:58 -05:00
Alex Vandiver f0f4aa66e0 uploads: Inline the one callsite of get_local_file_path.
This helps make more explicit the assert_is_local_storage_path which
makes using local_path safe.
2023-01-09 18:23:58 -05:00
Alex Vandiver 862e3bb80a avatars: Use a helper method, rather than use upload_backend directly.
Importing `upload_backend` directly means that in testing it must also
be mocked where it is imported, in order to correctly test the right
backend.  Since `get_avatar_url` is part of the public
`ZulipUploadBackend` API, add another helper method to call that.
2023-01-09 18:23:58 -05:00
Alex Vandiver 0c3d74ea31 test_helpers: Use a mock, rather than explicitly setting and unsetting. 2023-01-09 18:23:58 -05:00
Alex Vandiver 7ad06473b6 uploads: Add LOCAL_AVATARS_DIR / LOCAL_FILES_DIR computed settings.
This avoids strewing "avatars" and "files" constants throughout.
2023-01-09 18:23:58 -05:00
Alex Vandiver 24f95a3788 uploads: Move internal upload serving path to under /internal/. 2023-01-09 18:23:58 -05:00
Alex Vandiver b20ecabf8f tornado: Move internal tornado redirect to under /internal/. 2023-01-09 18:23:58 -05:00
Alex Vandiver cc9b028312 uploads: Set X-Accel-Redirect manually, without using django-sendfile2.
The `django-sendfile2` module unfortunately only supports a single
`SENDFILE` root path -- an invariant which subsequent commits need to
break.  Especially as Zulip only runs with a single webserver, and
thus sendfile backend, the functionality is simple to inline.

It is worth noting that the following headers from the initial Django
response are _preserved_, if present, and sent unmodified to the
client; all other headers are overridden by those supplied by the
internal redirect[^1]:
 - Content-Type
 - Content-Disposition
 - Accept-Ranges
 - Set-Cookie
 - Cache-Control
 - Expires

As such, we explicitly unset the Content-type header to allow nginx to
set it from the static file, but set Content-Disposition and
Cache-Control as we want them to be.

[^1]: https://www.nginx.com/resources/wiki/start/topics/examples/xsendfile/
2023-01-09 18:23:58 -05:00
Alex Vandiver 43fe24a5a0 uploads: Make realm_avatar_and_logo_path non-abstract. 2023-01-09 18:23:58 -05:00
Alex Vandiver 8e68d68f32 uploads: Be consistent about first arguments to write_local_file.
Enforcing a consistent `type` helps us double-check that we're not
playing fast-and-loose with any file paths for local files.  As noted
in the comment, this is purely for defense-in-depth.

Passing `write_local_file` a consistent `type` requires removing the
"avatars" out of `realm_avatar_and_logo_path` -- which makes it
consistent across upload backends.

This, in turn, requires a compensatory change to zerver.lib.export, to
be explicit that the realm icons are exported from the avatars
directory.  This clarity is likely an improvement.
2023-01-09 18:23:58 -05:00
Alex Vandiver 679fb76acf uploads: Provide our own Content-Disposition header.
sendfile already applied a Content-Disposition header, but the
algorithm may provide both `filename=` and `filename*=` values (which
is potentially confusing to clients) and incorrectly slash-escapes
quotes in Unicode strings.

Django provides a correct implementation, but it is only accessible to
FileResponse objects.  Since the entire point is to offload the
filehandle handling, we cannot use a FileResponse.

Django 4.2 will make the function available outside of FileResponse.
Until then, extract our own Content-Disposition handling, based on
Django's.

We remove the very verbose comment added in d4360e2287, describing
Content-Disposition headers, as it does not add much.
2023-01-09 18:23:58 -05:00
Alex Vandiver 83fd807885 uploads: Remove unncessary return in create_attachment. 2023-01-09 18:23:58 -05:00
Alex Vandiver 7c0d414aff uploads: Split out S3 and local file backends into separate files.
The uploads file is large, and conceptually the S3 and local-file
backends are separable.
2023-01-09 18:23:58 -05:00
Abhijeet Prasad Bodas 932ce6cee4 MissedMessageHookTest: Add more tests.
Add more tests analogous to existing ones but for different scenarios.

This is mostly boring text, but is important for completeness, since the
notificability logic underneath is subtle.
2023-01-06 17:08:20 -08:00