Primary goal of library replacement is improving execution speed.
This commit should not affect the functionality of the system
or make any changes to it.
This is a follow up to #24673, we want to modify every webhook events to
follow the same pattern and consistency where branch name should only
show on opened and merged events.
- Being more specific about what the user will get.
- Putting less emphasis on entering multiple emails, since most
people probably just have one email they need to check.
- Using more intuitive wording and hint that deactivated or
deleted accounts won't be included.
Fixes: #24890.
This is a follow up to #24673, we want to modify every webhook events to
follow the same pattern and consistency where branch name should only
show on opened and merged events.
This is a prep commit to help make the changes to make changes to pull
event message easier. Our Bitbucket has been using a custom template to
render the reviewers. This means that values are fixed to how the templates
like it. These changes will allow `get_pull_request_event_message` to
support reviewer and allow for a easier and flexible adjustment to these
messages if needed.
Previously, the assignee message would stick around in the middle of the
event message. This doesn't look as good as if we put it to the end of
the event message. These changes does just that and move the assignee
messages towards the end of the event message to make it look better
and cleaner for the readers.
Previously, there was a stale code that didn't verify
if 'muted_topics' and 'user_topic' events are sent correctly.
This commit updates the test to verify if the expected
users are notified via 'muted_topics' and 'user_topic'
events.
This commit updates the move-topic codepath to perform
bulk database operations on the UserTopic record using
user_profiles for each visibility_policy instead of
previously looping over each user_profile one by one.
This commit refactors 'set_user_topic_visibility_policy_in_database'
to perform bulk database operations and the related changes.
There is an increase in database query count because requests
to delete user_topic rows now take two queries instead of one.
This is required for logging the info for a request to delete
a non-existent user_topic row while performing bulk operations
at the same time.
The overall query count will be lower while performing
bulk operations (multiple user_profiles instead of one).
This commit updates the 'do_update_message' codepath to
update the UserTopic records regardless of visibility policy
during the "move-topic" operation.
This is required before offering new visibility policies
in the UI.
Previously, UserTopic records were moved or deleted only
for objects with a MUTED visibility policy.
Fixes: #24574
This is a prep commit that renames 'set_topic_mutes' and
'topic_is_muted' to 'set_topic_visibility_policy' and
'topic_has_visibility_policy' respectively, and refactors
them to work with any visibility_policy, not only MUTED.
Previously, some call sites for the function provided optional
arguments as positional arguments. These changes will allow the
arguments to be passed as keyword arguments to the function and
fix up the call sites of the function to pass keyword arguments
instead.
Previously, some call sites for the function provided optional
arguments as positional arguments. These changes will allow the
arguments to be passed as keyword arguments to the function and
fix up the call sites of the function to pass keyword arguments
instead.
Previously, tests that exercised code paths that added local
uploads did not always clean up `settings.LOCAL_UPLOADS_DIR`
after the test was complete.
Updates the `ZulipTestCase` class to remove any local uploads
in the unique `settings.LOCAL_UPLOADS_DIR` in `tearDown` for
all tests.
This commit adds code to create a "Nobody" system user group
to realms which will be used in settings to represent "Nobody"
option.
We also add a migration to add this group to existing realms.
This commit updates the pattern for dealing with tuples
returned by the delete() query.
The '(num_deleted, ignored) = ModelName.objects.filter().delete()'
pattern is preferred due to better readability.
We avoid the pattern '(num_deleted, _)' because Django uses _
for translation, which may lead to future bugs.
This commit adds a new helper submit_realm_creation_form,
similar to existing submit_reg_form_for_user, to avoid
duplicate code for creating realms in tests.
This commit adds the fields related to realm creation form using
get_realm_create_form_context in the context passed to register.html
template to avoid duplication.
Since we have updated the registration code to use
PreregistrationRealm objects for realm creation in
previous commits, some of the code has become
redundant and this commit removes it.
We remove the following code -
- The modification to PreregistrationUser objects in
process_new_human_user can now be done unconditionally
because prereg_user is passed only during user creation
and not realm creation. And we anyway do not expect
any PreregistrationUser objects inside the realm
during the creation.
- There is no need of "realm_creation" parameter in
create_preregistration_user function, since we now
use create_preregistration_realm during realm creation.
Fixes part of #24307.
In previous commits, we updated the realm creation flow to show
the realm name, type and subdomain fields in the first form
when asking for the email of the user. This commit updates the
user registration form to show the already filled realm details
as non-editable text and there is also a button to edit the
realm details before registration.
We also update the sub-heading for user registration form as
mentioned in the issue.
Fixes part of #24307.
We now use PreregistrationRealm objects in registration_helper
function when creating new realms instead of PreregistrationUser
objects.
Fixes part of #24307.
We now show inputs for realm details like name, type and URL
in the create_realm.html template opened for "/new" url and
these information will be stored in PreregistrationRealm
objects in further commits.
We add a new class RealmDetailsForm in forms.py for this
such that it is used as a base class for RealmCreationForm
and we define RealmDetailsForm such that we can use it as
a subclass for RegistrationForm as well to avoid duplication.
This commit adds PreregistrationRealm class which will be
similar to PreregistrationUser and will store initial
information of the realm before its creation as we are
changing the organization creation flow as per #24307.
Fixes part of #24307.
This commit renames prereg_user variable in
check_prereg_key and get_prereg_key_and_redirect
functions in zerver/views/registration.py to
prereg_object as in further commits the
preregistration object could also be
PreregistrationRealm object as part of changes
for #24307.
This is a follow up to #24673, we want to modify every webhook events to
follow the same pattern and consistency where branch name should only
show on opened and merged events.
Previously, when a user moves a message to another topic, the Notification
bot will post a message saying "This topic was moved here from..." This is
confusing when the topic already contains messages. The changes aims to make
the messages more clear by changing the logic for the Notification bot. When
there is already messages in the topic, the bot will post "A message was
moved here from..." or "N messages were moved here from...". The bot will
post "This topic was moved here from (somewhere) by (someone)." when the
topic is empty.
Fixes#23267.
To avoid people calling "create_user_group" instead of
"check_add_user_group", we rename it to make its purpose clearer.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
"check_add_user_group" is a safer helper function than
"create_user_group" to use when creating user_groups. It does
error handling and notify the client with the appropriate event.
Note that the populate_db command still uses "create_user_group"
because we do not need to enqueue events at that point.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Since this function creates a new user group into the database,
it is more appropriate to have it not as a generic "lib" function
but as an "action".
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Some well-intentioned adblockers also block Sentry client-side error
reporting. Provide an endpoint on the Zulip server which forwards to
the Sentry server, so that these requests are not blocked.
Prior to commit a9b3a9c, the server implementation for documented
search operators with dashes, also implicitly supported clients
sending those same operators with underscores. This has been the
case sense the server side support for narrow filtering was
introduced in commit 3af2bf345a.
Updates the stricter version of mapping operator strings to `by*`
functions, to also include the underscore version of any operators
that have dashes. Adds a note that these undocumented versions are
tied to the support for the documented versions.
This is a follow up to #24673, we want to modify every webhook events to
follow the same pattern and consistency where branch name should only
show on opened and merged events.
We previously created RealmAuditLog entries for user notification
settings only. This commit changes the code to create entries for
all user settings. We cannot backfill the entries since we don't
have the data to do that.
When a new realm is created, a notification message is sent to
the realm configured as the settings.SYSTEM_BOT_REALM if there
is a "signups" stream that exists in that realm. This is used
for Zulip Cloud, but is an undocumented feature.
The topic of the message has been the subdomain of the new realm,
and the message content has been "Signups enabled" translated
into the default language of the new realm.
In order to make these messages more explicitly for Zulip Cloud,
the settings.CORPORATE_ENABLED is checked before sending these
messages.
To make these messages more useful, the topic for these
notifications is changed to be "new organizations". The content
of these messages is updated to have the new realm name (with a
link to the admin realm's activity support page for the realm),
subdomain (with a link to the realm), and organization type.
Use the built-in HTML escaping of Markup("…{var}…").format(), in order
to allow Semgrep to detect mistakes like Markup("…{var}…".format())
and Markup(f"…{var}…").
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Updates the logic for identifying the method to use to extend the
query for the given term from a narrow to use a dictionary that
maps the operator string to the by_* method in the NarrowBuilder
class.
Previously, the by_* method was determined by building a string
based on the operator string and replacing dashes with underscores.
This is a follow up to #24673, we want to modify every webhook
events to follow the same pattern and consistency where branch name
should only show on opened and merged events.
The RealmCount statistics will be empty if the realm was created since
the last daily aggregation. In cases where the daily stats have no
rows, it is likely fast enough to do the real count in the messages
table. This stops unduly penalizing folks who have actually sent
messages, and are just inviting people within the first day.
Prior to aa032bf62c, QOS prefetch was set on every `publish` and
before every `start_json_consumer` -- which had a large and
unnecessary effect on publishing rates, which don't care about the
prefetch QOS settings at all, much less re-setting them before every
publish.
Unfortunately, that change had the effect of causing prefetch settings
to almost never be respected -- since the configuration happened in
`ensure_queue`s re-check that the connection was still live. The
initial connection is established in `__init__` via `_connect`, and
the consumer only calls `ensure_queue` once, before setting up the
consumer.
Having no prefetch value set causes an unbounded prefetch; this
manifests itself as the server attempting to shove every event down to
the worker as soon as it starts consuming; if the client cannot keep
up, the server closes the connection. The worker observes the
connection has been shut down, and restarts. While this does make
forward progress, it causes large queues to make progress more slowly,
as they suffer from sporadic restarts.
Shift the QOS configuration to when the connection is set up, which is
a more sensible place for it in general -- and ensures that it is set
on consumers and producers alike, but only once per connection
establishment.
`render_markdown_path` renders Markdown, and also (since baff121115)
runs Jinja2 on the resulting HTML.
The `pure_markdown` flag was added in 0a99fa2fd6, and did two
things: retried the path directly in the filesystem if it wasn't found
by the Jinja2 resolver, and also skipped the subsequent Jinja2
templating step (regardless of where the content was found). In this
context, the name `pure_markdown` made some sense. The only two
callsites were the TOS and privacy policy renders, which might have
had user-supplied arbitrary paths, and we wished to handle absolute
paths in addition to ones inside `templates/`.
Unfortunately, the follow-up of 01bd55bbcb did not refactor the
logic -- it changed it, by making `pure_markdown` only do the former
of the two behaviors. Passing `pure_markdown=True` after that commit
still caused it to always run Jinja2, but allowed it to look elsewhere
in the filesystem.
This set the stage for calls, such as the one introduced in
dedea23745, which passed both a context for Jinja2, as well as
`pure_markdown=True` implying that Jinja2 was not to be used.
Split the two previous behaviors of the `pure_markdown` flag, and use
pre-existing data to control them, rather than an explicit flag. For
handling policy information which is stored at an absolute path
outside of the template root, we switch to using the template search
path if and only if the path is relative. This also closes the
potential inconsistency based on CWD when `pure_markdown=True` was
passed and the path was relative, not absolute.
Decide whether to run Jinja2 based on if a context is passed in at
all. This restores the behavior in the initial 0a99fa2fd6 where a
call to `rendar_markdown_path` could be made to just render markdown,
and not some other unmentioned and unrelated templating language as
well.
Previously, `QuerySet` does not support isinstance check since it is
defined to be generic in django-stubs. In a recent update, such check is
possible by using `QuerySetAny`, a non-generic alias of `QuerySet`.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This adds tests for more corner cases, in exchange for dropping the
query count tests, which were of dubious utility. It also adds the
time-machine library to mock the current time to test that the limits
do expire.
Previously when Github bot receives an update pull request event,it
will produce the following message:
user updated PR #1 Start writing unit tests from test to main
"from test to main" is improper and causes unnecessary confusion.
These changes will update the logic to remove the phrase from
update events. These changes will also include the org: prefix to
the branch names to keep it consistent with Github and further
reduce confusions on branch names.
Fixes#24536.
This commit updates 'set_user_topic_visibility_policy_in_database'
to not raise an error when deleting a UserTopic row and the user
doesn't have a visibility_policy for the topic yet, or when setting
the visibility_policy to its current value.
Also, it includes the changes to not send unnecessary events
in such cases.
The list of supported events for filtering itself does not document what
each of the events does. Adding a link to GitHub's documentation would
be a pointer to get people started. But ideally we need to establish a
better system to document the events in general.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Currently, there is a checkbox setting for whether to
"Include realm name in subject of message notification emails".
This commit replaces the checkbox setting with a dropdown
having values: Automatic [default], Always, Never.
The Automatic option includes the realm name if, and only if,
there are multiple Zulip realms associated with the user's email.
Tests are added and(or) modified.
Fixes: #19905.
Removes the notification message that was sent if a stream named
"signups" exists in the `settings.SYSTEM_BOT_REALM`. This was a
undocumented feature that would send a notification message when
a new user registered with a Zulip organization that was hosted
by an admin realm like Zulip Cloud.
This removes two database queries when a new user is created: one
to get the system bot realm and the other to get the notification
bot in said realm.
Note that there are still notification messages sent when a new
organization is registered with the admin realm if the "signups"
stream exists.
This commit refactors 'do_set_user_topic_visibility_policy'
to remove the if/else block and just have a single call to
'set_user_topic_visibility_policy_in_database'.
The branching out behaviour based on the user_topic
visibility_policy is reduced to one place, i.e.,
'set_user_topic_visibility_policy_in_database'.
Updated the title and description in the 'enable-emoticon-translation'
file and renamed the file accordingly. Added a new bullet point for
'time format' in the 'configure-new-user-settings.md' file and updated
the sidebar index by replacing the title 'Use 24-hour time' with
'Change the time format'.
The Django convention is for __repr__ to include the type and __str__
to omit it. In fact its default __repr__ implementation for models
automatically adds a type prefix to __str__, which has resulted in the
type being duplicated:
>>> UserProfile.objects.first()
<UserProfile: <UserProfile: emailgateway@zulip.com <Realm: zulipinternal 1>>>
Signed-off-by: Anders Kaseorg <anders@zulip.com>
While the function which processes the realm registration and
signup remains the same, we use different urls and functions to
call the process so that we can separately track them. This will
help us know the conversion rate of realm registration after
receiving the confirmation link.
This commit refactors the notify_created_user function to
call format_user_row twice with different parameters instead
of modifying the person object returned by format_user_row.
This change makes the code somewhat more easy to understand
than it was before.
dc1eeef30a made the column nullable, with the meaning for null of
"use the current `settings.INVITES_DEFAULT_REALM_DAILY_MAX`."
However, 8a95526ced switched to calling `do_change_plan_type` during
realm creation, which sets `realm.max_invites` based on the plan type,
thus ensuring that no new realms have their `_max_invites` set to
null.
Check `max_invites` instead of `_max_invites`. This requires test
adjustments for the fact that `apply_invite_realm_heuristics` is now
run.
Adds a page to the general api documentation about HTTP headers,
so that information about the special response headers for rate
limits have a more logical location in the docs and so that other
HTTP header information can be shared, such as `User-Agent`
conventions.
Adjusts some text and linking on the rest-error-handling page and
overview page for the REST API for the addition of the HTTP headers
page.
Zulip already has integrations for server-side Sentry integration;
however, it has historically used the Zulip-specific `blueslip`
library for monitoring browser-side errors. However, the latter sends
errors to email, as well optionally to an internal `#errors` stream.
While this is sufficient for low volumes of users, and useful in that
it does not rely on outside services, at higher volumes it is very
difficult to do any analysis or filtering of the errors. Client-side
errors are exceptionally noisy, with many false positives due to
browser extensions or similar, so determining real real errors from a
stream of un-grouped emails or messages in a stream is quite
difficult.
Add a client-side Javascript sentry integration. To provide useful
backtraces, this requires extending the pre-deploy hooks to upload the
source-maps to Sentry. Additional keys are added to the non-public
API of `page_params` to control the DSN, realm identifier, and sample
rates.
b4dd118aa1 changed how the `user_info_str` parsed information out of
the events it received -- but only changed the server errors, not the
browser errors, though both use the same codepath. As a result, all
browser errors since then have been incorrectly marked as being for
anonymous users.
Build and pass in the expected `user` dict into the event.
This commit adds 'visibility_policy' as a
parameter to user_allows_notifications_in_StreamTopic
function.
This adds logic inside the user_allows_notifications_in_StreamTopic
function, to not return False when a stream is muted
but the topic is UNMUTED.
Adds a method `user_id_to_visibility_policy_dict`
to 'StreamTopicTarget' class to fetch
(user_id => visibility_policy) in single db query.
Co-authored-by: Kartik Srivastava <kaushiksri0908@gmail.com>
Co-authored-by: Prakhar Pratyush <prakhar841301@gmail.com>
This commit replaces 'remove_topic_mute' with
'set_user_topic_visibility_policy_in_database' and
updates it to delete UserTopic row with any configured
visibility_policy and not just muting.
In order to support different types of topic visibility policies,
this renames 'add_topic_mute' to
'set_user_topic_visibility_policy_in_database'
and refactors it to accept a parameter 'visibility_policy'.
Create a corresponding UserTopic row for any visibility policy,
not just muting topics.
When a UserTopic row for (user_profile, stream, topic, recipient_id)
exists already, it updates the row with the new visibility_policy.
In the event of a duplicate request, raises a JsonableError.
i.e., new_visibility_policy == existing_visibility_policy.
There is an increase in the database query count in the message-edit
code path.
Reason:
Earlier, 'add_topic_mute' used 'bulk_create' which either
creates or raises IntegrityError -- 1 query.
Now, 'set_user_topic_visibility_policy' uses get_or_create
-- 2 queries in the case of creating new row.
We can't use the previous approach, because now we have to
handle the case of updating the visibility_policy too.
Also, using bulk_* for a single row is not the correct way.
Co-authored-by: Kartik Srivastava <kaushiksri0908@gmail.com>
Co-authored-by: Prakhar Pratyush <prakhar841301@gmail.com>
This commit refactors the existing pattern (real-time usage)
used to assert 'date_muted' in tests.
A fixed value is used at the start of the test to
assert 'date_muted', replacing the timedelta or real-time usage pattern.
Replaces 'do_unmute_topic' with 'do_set_user_topic_visibility_policy'
and associated minor changes.
This change is made to align with the plan to use a single function
'do_set_user_topic_visibility_policy' to manage
user_topic - visibility_policy changes and corresponding event
generation.
This commit is a step in the direction of having a common
function to handle visibility_policy changes and event
generation instead of separate functions for each
visibility policy.
In order to support different types of topic visibility policies,
this renames 'do_topic_mute' to 'do_set_user_topic_visibility_policy'
and refactors it to accept a parameter 'visibility_policy'.
The "add_topic_mute" and "remove_topic_mute" library functions
shouldn't be called directly from tests.
They should instead call "do_mute_topic" and "do_unmute_topic"
The reason being:
Library functions are meant to be internal interfaces
for just changing the database, and shouldn't generally be
called elsewhere.
Creates `MutableJsonResponse` as a subclass of Django's `HttpResponse`
that we can modify for ignored parameters in the response content.
Updates responses to include `ignored_parameters_unsupported` in
the response data through `has_request_variables`. Creates unit
test for this implementation in `test_decorators.py`.
The `method` parameter processed in `rest_dispatch` is not in the
`REQ` framework, so for any tests that pass that parameter, assert
for the ignored parameter with a comment.
Updates OpenAPI documentation for `ignored_parameters_unsupported`
being returned in the JSON success response for all endpoints.
Adds detailed documentation in the error handling article, and
links to that page in relevant locations throughout the API docs.
For the majority of endpoints, the documentation does not include
the array in any examples of return values, and instead links to
the error handling page. The exceptions are the three endpoints
that had previously supported this return value. The changes note
and example for these endpoints is also used in the error
handling page.
Adds `is_webhook_view` boolean field to the RequestNotes class so
that (when implemented) `ignored_parameters_unsupported` feature
is not something that is applied to webhooks.
In commit 8181ec4b56, we removed the `realm_str` as a parameter
for `send_message_backed`. This removes a missed test that included
this as a parameter for that endpoint/function.
Actions like deleting realms may leave unreferenced uploads in the
attachment storage backend.
Fix these by walking the complete contents of the attachment storage
backend, and removing files which are no longer present in the
database. This may take quite some time, as it is necessarily O(n) in
the number of files uploaded to the system.
Updates the Asana documentation, which was a detailed version
of the Zapier documentation with screenshots specifically for
Asana, to instead start with the basic incoming webhook steps
and then point to the general Zapier documentation to complete
the integration.
This will be easier to maintain moving forward in the short
term as ideally we'll migrate to a system that documents all
of the integrations with Zulip that are available via Zapier.
Also, updates the current Zapier documentation to mention
Asana as one of the apps that can be integrated with Zulip.
This commit renames reset_emails_in_zulip_realm function to
reset_email_visibility_to_everyone_in_zulip_realm which makes
it more clear to understand what the function actually does.
This commit also adds a comment explaining what this function
does.
The inital Welcome bot message has an extra section if the user is
joining a demo organization, but the link in that section was not
being formatted correctly. Fixes the formatting so that the link
works.
This already became useless in 6e11754642,
as detailed in the API changelog entry here. At this point, we should
eliminate this param and the weird code around it.
This commit also deletes the associated tests added in
6e11754642, since with realm_str removed,
they make no sense anymore (and actually fail with an OpenAPI error due
to using params not used in the API). Hypothetically they could be
translated to use the subdomain= kwarg, but that also doesn't make
sense, since at that point they'd be just testing the case of a user
making an API request on a different subdomain than their current one
and that's just redundant and already tested generally in
test_decorators.
This leftover variable, as a result of older changes, was just always
set to None. That was fine, because when realm=None reaches
check_message further down the codepath, it just infers from
sender.realm. We want to stop passing None like that though, so let's
just set this to user_profile.realm.
Updates the text and title used when the password reset done page
to work for situations where the user is resetting a forgotten
password and for situation where the user is setting a password
for the first time (e.g. SSO login, demo organizations).
This is the behaviour inherited from Django[^1]. While setting the
password to empty (`email_password = `) in
`/etc/zulip/zulip-secrets.conf` also would suffice, it's unclear what
the user would have been putting into `EMAIL_HOST_USER` in that
context.
Because we previously did not warn when `email_password` was not
present in `zulip-secrets.conf`, having the error message clarify the
correct configuration for disabling SMTP auth is important.
Fixes: #23938.
[^1]: https://docs.djangoproject.com/en/4.1/ref/settings/#std-setting-EMAIL_HOST_USER
`./manage.py import` does not take a tarball; it takes a directory.
Making a separate tarball is a waste of CPU time and disk, as it is
never used.
This was included in the commit of the initial Slack conversion code
in 5b37c5562b and propagated from there into every conversion tool.
Remove the unnecessary tarball creation.
c7d0192755 added the unique constraint on
`user_profile_id,message_id,reaction_type,emoji_code`, but left the
existing constraint on `user_profile_id,message_id,emoji_name`. As
explained in the comment added in 3cd543ee98, `emoji_name` cannot be
trusted to be unique, as it is possible to have an Unicode emoji
reaction and a custom emoji with the same name on a message.
Remove the overly-constraining unique index, now that c7d0192755 has
provided the correct one.
View that handled `PATCH user_groups/<int:user_group_id>` required
both name and description parameters to be passed. Due to this
clients had to pass values for both these parameters even if
one of them was changed.
To resolve this name description parameters to
`PATCH user_groups/<int:user_group_id>` are made optional.
We now allow user to change email_address_visibility during user
signup and it overrides the realm-level default and also overrides
the setting if user import settings from existing account.
We do not show UI to set email_address_visibility during realm
creation.
Fixes#24310.
This commit adds backend code to set email_address_visibility when
registering a new user. The realm-level default and the value of
source profile gets overridden by the value user selected during
signup.
This lets us simplify the long-ish ‘../../static/js’ paths, and will
remove the need for the ‘zrequire’ wrapper.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Ever since we started bundling the app with webpack, there’s been less
and less overlap between our ‘static’ directory (files belonging to
the frontend app) and Django’s interpretation of the ‘static’
directory (files served directly to the web).
Split the app out to its own ‘web’ directory outside of ‘static’, and
remove all the custom collectstatic --ignore rules. This makes it
much clearer what’s actually being served to the web, and what’s being
bundled by webpack. It also shrinks the release tarball by 3%.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This is quite a bit faster:
```
%timeit calendar.timegm(now.timetuple())
2.91 µs ± 361 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit int(now.timestamp())
539 ns ± 27 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```
This is particularly important for the presence endpoint, which is a
tight loop of serializing datetimes.
As written, the QOS parameters are (re)set every time ensure_queue is
called, which is every time a message is enqueued. This is wasteful --
particularly QOS parameters only apply for consumers, and setting them
takes a RTT to the server.
Switch to only setting the QOS once, when a connection
is (re)established. In profiling, this reduces the time to call
`queue_json_publish("noop", {})` from 878µs to 150µs.
In the case where a stream existed but had no subscribers, the error
message used to send to the owner always used `stream_name`, which
may have been None.
Switch to using `stream.name` rather than `stream_name` for this case.
This code is called in the hot path when Tornado is processing events.
As such, making this code performant is important. Profiling shows
that a significant portion of the time is spent calling asdict() to
serialize the UserMessageNotificationsData dataclass. In this case
`asdict` does several steps which we do not need, such as attempting
to recurse into its fields, and deepcopy'ing the values of the fields.
In our use case, these add a notable amount of overhead:
```py3
from zerver.tornado.event_queue import UserMessageNotificationsData
from dataclasses import asdict
from timeit import timeit
o = UserMessageNotificationsData(1, False, False, False, False, False, False, False, False, False, False, False)
%timeit asdict(o)
%timeit {**vars(o)}
```
Replace the `asdict` call with a direct access of the fields. We
perform a shallow copy because we do need to modify the resulting
fields.
This commit adds migration to fix extra_data field
of RealmAuditLog objects created on changing
can_remove_subscribers_group setting to add "property"
field since the same event type will now be used for
other group based stream settings that will be added
in future.
We add stream_permission_group_settings object which is
similar to property_types framework used for realm settings.
This commit also adds GroupPermissionSetting dataclass for
defining settings inside stream_permission_group_settings.
We add "do_change_stream_group_based_setting" function which
is called in loop to update all the group-based stream settings
and it is now used to update 'can_remove_subscribers_group'
setting instead of "do_change_can_remove_subscribers_group".
We also change the variable name for event_type field of
RealmAuditLog objects to STREAM_GROUP_BASED_SETTING_CHANGED
since this will be used for all group-based stream settings.
'property' field is also added to extra_data field to identify
the setting for which RealmAuditLog object was created.
We will add a migration in further commits which will add the
property field to existing RealmAuditLog objects created for
changing can_remove_subscribers_group setting.
This old 300s value was meaningfully used in 2 places:
1. In the do_change_user_settings presence_enabled codepath when turning
a user invisible. It doesn't matter there, 140s is just since the
point is to make clients see this user as offline. And 140s is the
threshold used by clients (see the presence.js constant).
2. For calculating whether to set "offline" "status" in
result["presence"]["aggregated"] in get_presence_backend. It's fine
for this to become 140s, since clients shouldn't be looking at the
status value anymore anyway and just do their calculation based on
the timestamps.
This makes use of the new case insensitive UNIQUE index added in the
earlier commit. With that index present, we can now rely solely on the
database to correctly identify duplicates and throw integrity errors as
required.
This will allow us to rely on the database to detect duplicate
`UserTopic`s (with the same `topic_name` with different cases)
and thus correctly throw IntegrityErrors when expected.
This is also important from a correctness point of view, since as
of now, when checking if topic is muted or requesting the backend for
muting a topic, the frontend does not check for case insensitivity.
There might exist duplicate UserTopics (in a case insensitive sense)
which need are removed before creating the new index.
The migration was tested manually using `./manage.py shell`.
In 141b0c4, we added code to handle races caused by duplicate muting
requests. That code can also handle the non-race condition, so we don't
require the first check.
Removes the initial check in `_internal_prep_message` of the length
of the message content because the `check_message` in the try block
will call `normalize_body` on the message content string, which
does a more robust check of the message content (empty string, null
bytes, length). If the message content length exceeds the value of
`settings.MAX_MESSAGE_LENGTH`, then it is truncated based on that
value. Updates associated backend test for these changes.
The removed length check would truncate the message content with a
hard coded value instead of using the value for
`settings.MAX_MESSAGE_LENGTH`.
Also, removes an extraneous comment about removing null bytes. If
there are null bytes in the message content, then `normalize_body`
will raise an error.
Note that the previous check had intentionally reduced any message over
the 10000 character limit to 3900 characters, with the code in
question dating to 2012's 100df7e349.
The 3900 character truncating rule was implemented for incoming emails
with the email gateway, and predated other features to help with
overly long messages (better stripping of email footers via Talon,
introduced in f1f48f305e, and
condensing, introduced in c92d664b44).
While we could preserve that logic if desired, it likely is no longer
a necessary or useful variation from our usual truncation rules.
Updates the descriptions of content parameters (optional and
required) to note that the maximum size of the message content
should be based on the `max_message_length` value returned by
the register endpoint.
Previously these descriptions had a hardcoded value of 10000
bytes as the maximum message size.
Also, updates the description of `max_message_length` to clarify
that the value represents Unicode code points.
The password parameter being passed in the `_do_test` helper
function for `TestAuthenticatedJsonPostViewDecorator` tests was
being ignored, as the user needs to be logged in. Removes the
parameter from the helper function and updates the success test
to use `assert_json_success` instead of just checking the status
code.
Also adds a test case for when a user is not logged in to confirm
that it returns an UnauthorizedError.
This reverts commit 851d68e0fc.
That commit widened how long the transaction is open, which made it
much more likely that after the user was created in the transaction,
and the memcached caches were flushed, some other request will fill
the `get_realm_user_dicts` cache with data which did not include the
new user (because it had not been committed yet).
If a user creation request lost this race, the user would, upon first
request to `/`, get a blank page and a Javascript error:
Unknown user_id in get_by_user_id: 12345
...where 12345 was their own user-id. This error would persist until
the cache expired (in 7 days) or something else expunged it.
Reverting this does not prevent the race, as the post_save hook's call
to flush_user_profile is still in a transaction (and has been since
168f241ff0), and thus leaves the potential race window open.
However, it much shortens the potential window of opportunity, and is
a reasonable short-term stopgap.
The Client.name field is only 30 characters long, but there is no
limit to the length of parsed User-Agent value which we may attempt to
store in it. This can cause requests with long user-agents to 500
when the creation of the Client row fails.
Truncate the name at 30 characters for the cache key, and passing
`name` to `get_or_create`.
This will allow us to re-use this logic later, when we add support for
re-checking notification settings just before sending email/push
notifications to the user.
Also, since this is essentially part of the notifiability logic,
this better belongs to `notification_data.py` and this change will
hopefully reduce the reading complexity of the message-send codepath.
This commits update the code to use user-level email_address_visibility
setting instead of realm-level to set or update the value of UserProfile.email
field and to send the emails to clients.
Major changes are -
- UserProfile.email field is set while creating the user according to
RealmUserDefault.email_address_visbility.
- UserProfile.email field is updated according to change in the setting.
- 'email_address_visibility' is added to person objects in user add event
and in avatar change event.
- client_gravatar can be different for different users when computing
avatar_url for messages and user objects since email available to clients
is dependent on user-level setting.
- For bots, email_address_visibility is set to EVERYONE while creating
them irrespective of realm-default value.
- Test changes are basically setting user-level setting instead of realm
setting and modifying the checks accordingly.
Previously, user objects contained delivery_email field
only when user had access to real email. Also, delivery_email
was not present if visibility setting is set to "everyone"
as email field was itself set to real email.
This commit changes the code to pass "delivery_email" field
always in the user objects with its value being "None" if
user does not have access to real email and real email otherwise.
The "delivery_email" field value is None for logged-out users.
For bots, the "delivery_email" is always set to real email
irrespective of email_address_visibility setting.
Also, since user has access to real email if visibility is set
to "everyone", "delivery_email" field is passed in that case
too.
There is no change in email field and it is same as before.
This commit also adds code to send event to update delivery_email
field when email_address_visibility setting changes to all the
users whose access to emails changes and also changes the code to
send event on changing delivery_email to users who have access
to email.
This is helpful for debugging -- generally these tasks are in a worker
queue because they take a long time to run, so knowing what long task
is about to start before it does, rather than just after, is useful.
This commit adds time restriction on moving messages between streams
using the move_messages_between_streams_limit_seconds setting in the
backend. There is no time limit for admins and moderators.
We now use the newly added move_messages_within_stream_limit_seconds
setting to check for how long the user can edit the topic replacing
the previously used 3-day limit. As it was previously, there is no
time limit for admins and moderators.
This commit renames parse_message_content_edit_or_delete_limit
to parse_message_time_limit_setting and also renames
MESSAGE_CONTENT_EDIT_OR_DELETE_LIMIT_SPECIAL_VALUES_MAP to
MESSAGE_TIME_LIMIT_SETTING_SPECIAL_VALUES_MAP.
We do this change since this function and object will also be
used for message move limit and it makes sense to have a more
generic name.
This commit extracts a function to parse message time limit type settings
and to set it if the new setting value is None.
This function is currently used for message_content_edit_limit_seconds and
message_content_delete_limit_seconds settings and will be used for
message_move_limit_seconds setting to be added in further commits.
In Zulip, message topics are case-insensitive but case-preserving.
The `get_context_for_message` function erroneously did a
case-sensitive search, and thus only messages whose topic matched
exactly were pulled in as context.
Make the missed-message pipeline aware that message topics are not
case-sensitive. This means that, when collapsing adjacent messages,
we merge messages with topic headers which are "different"; create a
separate explicit "grouping" to know which to collapse.
Similar to the previous commit, Django was responsible for setting the
Content-Disposition based on the filename, whereas the Content-Type
was set by nginx based on the filename. This difference is not
exploitable, as even if they somehow disagreed with Django's expected
Content-Type, nginx will only ever respond with Content-Types found in
`uploads.types` -- none of which are unsafe for user-supplied content.
However, for consistency, have Django provide both Content-Type and
Content-Disposition headers.
The Content-Type of user-provided uploads was provided by the browser
at initial upload time, and stored in S3; however, 04cf68b45e
switched to determining the Content-Disposition merely from the
filename. This makes uploads vulnerable to a stored XSS, wherein a
file uploaded with a content-type of `text/html` and an extension of
`.png` would be served to browsers as `Content-Disposition: inline`,
which is unsafe.
The `Content-Security-Policy` headers in the previous commit mitigate
this, but only for browsers which support them.
Revert parts of 04cf68b45e, specifically by allowing S3 to provide
the Content-Disposition header, and using the
`ResponseContentDisposition` argument when necessary to override it to
`attachment`. Because we expect S3 responses to vary based on this
argument, we include it in the cache key; since the query parameter
has dashes in it, we can't use use the helper `$arg_` variables, and
must parse it from the query parameters manually.
Adding the disposition may decrease the cache hit rate somewhat, but
downloads are infrequent enough that it is unlikely to have a
noticeable effect. We take care to not adjust the cache key for
requests which do not specify the disposition.
In nginx, `location` blocks operate on the _decoded_ URI[^1]:
> The matching is performed against a normalized URI, after decoding
> the text encoded in the “%XX” form
This means that if a user-uploaded file contains characters that are
not URI-safe, the browser encodes them in UTF-8 and then URI-encodes
them -- and nginx decodes them and reassembles the original character
before running the `location ~ ^/...` match. This means that the `$2`
_is not URI-encoded_ and _may contain non-ASCII characters.
When `proxy_pass` is passed a value containing one or more variables,
it does no encoding on that expanded value, assuming that the bytes
are exactly as they should be passed to the upstream. This means that
directly calling `proxy_pass https://$1/$2` would result in sending
high-bit characters to the S3 upstream, which would rightly balk.
However, a longstanding bug in nginx's `set` directive[^2] means that
the following line:
```nginx
set $download_url https://$1/$2;
```
...results in nginx accidentally URI-encoding $1 and $2 when they are
inserted, resulting in a `$download_url` which is suitable to pass to
`proxy_pass`. This bug is only present with numeric capture
variables, not named captures; this is particularly relevant because
numeric captures are easily overridden by additional regexes
elsewhere, as subsequent commits will add.
Fixing this is complicated; nginx does not supply any way to escape
values[^3], besides a third-party module[^4] which is an undue
complication to begin using. The only variable which nginx exposes
which is _not_ un-escaped already is `$request_uri`, which contains
the very original URL sent by the browser -- and thus can't respect
any work done in Django to generate the `X-Accel-Redirect` (e.g., for
`/user_uploads/temporary/` URLs). We also cannot pass these URLs to
nginx via query-parameters, since `$arg_foo` values are not
URI-decoded by nginx, there is no function to do so[^3], and the
values must be URI-encoded because they themselves are URLs with query
parameters.
Extra-URI-encode the path that we pass to the `X-Accel-Redirect`
location, for S3 redirects. We rely on the `location` block
un-escaping that layer, leaving `$s3_hostname` and `$s3_path` as they
were intended in Django.
This works around the nginx bug, with no behaviour change.
[^1]: http://nginx.org/en/docs/http/ngx_http_core_module.html#location
[^2]: https://trac.nginx.org/nginx/ticket/348
[^3]: https://trac.nginx.org/nginx/ticket/52
[^4]: https://github.com/openresty/set-misc-nginx-module#set_escape_uri
Fixes the documentation generated from the Markdown macros
{settings_tab|your-bots} and {settings_tab|bot-list-admin} to
match the text labels in the Zulip UI and improves the text of
relative links to explicitly say if we are referring to the Bots
tab of the Personal or Organization settings menu.
Follow-up to #23256.
This code needs to be more flexible to improve the documentation
of items in the Personal and Organization settings menu when
using the `{settings_tab|[setting-name]}` Markdownm macro that
provides relative links or step-by-step instructions.
This commit moves the Markdown formatting code to a new function that
receives tuples from `link_mapping` as input. This is a preliminary
step to offer more flexibility than the current approach.
Rename 'muting.py' to 'user_mutes.py' because it, now
, contains only user-mute related functions.
Includes minor refactoring needed after renaming the file.
This commit moves topic related stuff i.e. topic muting functions
to a separate file 'views/user_topics.py'.
'views/muting.py' contains functions related to user-mutes only.
This will help us track if users actually clicked on the
email confirmation link while creating a new organization.
Replaced all the `reder` calls in `accounts_register` with
`TemplateResponse` to comply with `add_google_analytics`
decorator.
When 'resolve|unresolve' and 'move stream' actions occurs in
the same api call, 'This topic was marked as resolved|unresolved'
notification is not sent.
Both 'topic moved' and 'topic resolved' notification should be generated.
This commit updates the logic of when and where to send
'topic resolve|unresolve' notification. Unlike previous logic, notification
may be sent even in the case 'new_stream' is not None.
In general, 'topic resolved|unresolved' notification is sent to
'stream_being_edited'. In this particular case ('new_stream' is not None),
notification is sent to the 'new_stream' after check.
Test case is included.
Fixes: #22973
This adds a new endpoint /jwt/fetch_api_key that accepts a JWT and can
be used to fetch API keys for a certain user. The target realm is
inferred from the request and the user email is part of the JWT.
A JSON containing an user API key, delivery email and (optionally)
raw user profile data is returned in response.
The profile data in the response is optional and can be retrieved by
setting the POST param "include_profile" to "true" (default=false).
Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
This will be useful for re-use for implementation of an endpoint for
obtaining the API by submitting a JWT in the next commits.
It's not a pure refactor, as it requires some tweaks to remote_user_jwt
behavior:
1. The expected format of the request is changed a bit. It used to
expect "user" and "realm" keys, from which the intended email was
just generated by joining with @. Now it just expects "email"
straight-up. The prior design was a bt strange to begin with, so this
might be an improvement actually.
2. In the case of the codepath of new user signup, this will no longer
pre-populate the Full Name in the registration form with the value
from the "user" key. This should be a very minor lost of
functionality, because the "user" value was not going to be a proper
Full Name anyway. This functionality can be restored in a future
commit if desired.
This is an API change, but this endpoint is nearly unused as far as
we're aware.
- Updates `.prettierignore` for the new directory.
- Updates any reference to the API documentation directory for
markdown files to be `api_docs/` instead of `zerver/api/`.
- Removes a reference link from `docs/documentation/api.md` that
hasn't referenced anything in the text since commit 0542c60.
- Update rendering of API documentation for new directory.
Moves the check for calling the `api-doc-template.md` directly,
so that we don't return a 500 error from the server, to happen
earlier with other checks for returning a 404 / missing page.
Also adds a specific test to `zerver/tests/test_urls` for this
template.
Prep commit for moving API documentation directory to be a top
level directory.
- Clean up the language.
- Add a prominent "Go to organization" button.
- Link to guides for new users and admins.
- Fix duplication bug in text email version.
Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.
(This does not actually upgrade our Python environment to Black 23
yet.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Removes `base_path` argument when making the markdown extension for
parameters in documentation for API endpoints.
This seems to have been originally included for API parameters that
were documented in JSON files, which is no longer in use. Now all
API endpoints in the documentation are documented in
`zerver/openapi/zulip.yaml`.
Removes `base_path` argument when making the markdown extension for
return values in documentation for API endpoints.
This seems to have been a copy and paste error in commit d2ee99a2fd
when `zerver/lib/markdown/api_return_values_generator.py` was created.
Until now, custom emojis with "periods" in their name were allowed, even though
they don't really fit the pattern of how we name them, and in fact the Markdown
processor would not render such custom emoji. Fix this by just disallowing the
character.
Also update the error strings accordingly.
Note that this does not include a migration to eliminate any existing custom emoji with this
character in their name.
Fixes#24066.
This commit refactors the template code for source-realm
select element to have same structure as other inputs
and select element in the page. Thus this change also
makes the styling of source-realm select element consistent
with other select element in the page.
We change the do_create_user function to use transaction.atomic
decorator instead of using with block. Due to this change, all
send_event calls are made inside transaction.on_commit.
Some other changes -
- Remove transaction.atomic decorator from send_inital_realm_messages
since it is now called inside a transaction.
- Made changes in tests which tests message events and notifications
to make sure on_commit callbacks are executed.
This commit changes the do_reactivate_user such that the complete function
is called inside an atomic transaction and events are called after the
transaction is commited using on_commit helper. This is a prep commit
for unsubscribing the bots of unaccessible private streams when reactivating
them.
These files are not Jinja2 templates, so there's no reason that they needed
to be inside `templates/zerver`. Moving them to the top level reflects their
importance and also makes it feel nicer to work on editing the help center content,
without it being unnecessary buried deep in the codebase.
The content of a message is truncated to `MAX_MESSAGE_LENGTH`, which
is 1000 characters. Since the email gateway places attachments at the
very end of the extracted body, that means that they are the first
thing to get truncated off.
That is, if an incoming email message contains 1000 `a`s and an image
attachment, the link that attaches the attachment to the message will
get truncated off, leaving it dangling in the database.
Truncate the message body content separately from the attachment links
which are included at the end of the body.
Changes the check for whether the documentation page is a policy
center page to be the `self.policies_view` boolean instead of the
`path_template` value as it reads much more clearly.
Moves a comment in the code to be contextually relevant.
Because of the overlap with the `DocumentationArticle` dataclass
field `article_path`, we rename the `article_path` variable used
in `MarkdownDirectoryView.get_context_data` for the absolute path
to be `article_absolute_path`.
In commit bbecd41, we added "not_index_page" to the context for
some documentation articles, but use of that context key/value was
removed when the help documentation was removed in commit 1cf7ee9.
Changes `not_index_page` to be a boolean value that's used to set
the page title, but is not then passed on as a context key/value.
Also removes an irrelevant comment about disabling "Back to home"
on the homepage.
Since we want to use `accounts/new/send_confirm` to know how many
users actually register after visiting the register page, we
added it to Google Tag Manager, but GTM tracks every user
registration separately due <email> in the URL
making it harder to track.
To solve this, we want to pass <email> as a GET parameter which
can be easily filtered inside GTM using a RegEx and all the
registrations can be tracked as one.
A missed message email notification, where the message is the welcome
message sent by the welcome bot on account creation, get sent when
the user somehow not focuses the browser tab during account creation.
No missed message email or push notifications should be sent for the
messages generated by the welcome bot.
'internal_send_private_message' accepts a parameter
'disable_external_notifications' and is set to 'True' when the sender
is 'welcome bot'.
A check is introduced in `trivially_should_not_notify`, not to notify
if `disable_external_notifications` is true.
TestCases are updated to include the `disable_external_notifications`
check in the early (False) return patterns of `is_push_notifiable` and
`is_email_notifiable`.
One query reduced for both `test_create_user_with_multiple_streams`
and `test_register`.
Reason: When welcome bot sends message after user creation
`do_send_messages` calls `get_active_presence_idle_user_ids`,
`user_ids` in `get_active_presence_idle_user_ids` remains empty if
`disable_external_notifications` is true because `is_notifiable` returns
false.
`get_active_presence_idle_user_ids` calls `filter_presence_idle_user_ids`
and since the `user_ids` is empty, the query inside the function doesn't
get executed.
MissedMessageHookTest updated.
Fixes: #22884
This commit makes all the parameters after 'content' in
'internal_send_*', 'internal_prep_*' and '_internal_prep_*'
a mandatory keyword argument to increase code readability.
A separate function named `trivially_should_not_notify` is added which
extracts the common checks from `get_push_notification_trigger` and
`get_email_notification_trigger` which are users' notification settings
independent and thus don't depend on what type of notification (email/push)
it is.
608c787c52 fixed a bug where messages sent by the email gateway "as"
a user failed to properly attribute ownership of their attachments,
leaving the attachments orphaned and thus with nobody with permissions
to view them.
These orphaned attachments only remain longer than a few weeks if the
`delete_unclaimed_attachments` script has not been run reliably.
Since there is currently no shipped cron job for this, that is most
likely all deployments.
Add a migration to find such orphaned attachments, and re-attach them
to their original message. While theoretically the attachments
could have been later referenced in other messages -- which would be
very difficult to find and determine if they had access to the
attachment -- we only fix the original message.
In order to make this somewhat performant, we assume that the Message
rows associated with an Attachment made by the email gateway happened
within 5 minutes, since they must have been made during one HTTP
request.
This is complicated by the message potentially having been deleted; in
this case, the Attachment is moved into ArchivedAttachment, so it can
relate to the ArchivedMessage. The many-to-many
`zerver_archivedattachment_messages` relationship table cannot use its
own `id` sequence for the value, since the `id` is re-used when the
row is inserted into the `zerver_attachment_messages` table -- we
instead consume a value from the `id` sequence of the
`zerver_attachment_messages` table.
Documents link to the bot's user card from the bot's name in
Organization settings > Bots, and information in the bot's user card.
Fixes part of #23970.
When the email mirror gateway is sending messages "as" a user (as
triggered by having access to the missed-message email address),
attachments were still created as the Email Gateway bot. Since the
sender (the end-user) was not the owner of those attachments (the
gateway bot), nor were they referenced yet anywhere, this resulted in
the attachments being "orphaned" and not allowed to be accessed by
anyone -- despite the attachment links being embedded in the message.
This was accompanied by the error:
```
WARN [] User 12345 tried to share upload 123/3LkSA4OcoG6OpAknS2I0SFAQ/example.jpf in message 123456, but lacks permission
INFO [zerver.lib.email_mirror] Successfully processed email from user 12345 to example-stream
```
We solve this by creating attachment objects as the users the message
will be sent from.
The intention was to continue the outer ‘for’ loop, not the inner one
(but Python doesn’t have labelled ‘continue’).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The max inline preview limit was previously increased to 10 by #20789.
However, as issue #23624 shows, it's still causing confusion for users
when they include more than 10 links.
Bump this limit up to 24, which is a multiple of the 4 image preview
per line logic.
Overrides the default context `allow_search_engine_indexing` to
always be `False` for `templates/corporate/attribution.html` so
that it does not appear in Google / search engine indexes.
Updates test of documentation pages in `test_docs.py` to have an
option for corporate pages to set this value in the template and
verifies that the meta tag for robots noindex, nofollow is
always in the response.
For descriptive endpoints, such as `/register`, that might raise
Schema Validation errors via `validate_against_openapi_schema`,
omits the OpenAPI schema definition in the error output.
Also omits the error instance definition in the error output
when it is a jsonschema object with over 100 properties. This
means that the test instance for objects, like user settings,
will be printed in the error output, but the test instance for
the entire endpoint will not be printed to the console.
The omitted output can be thousands of lines long making it
difficult to find the initial console output that actually helps
the contributor with debugging.
Adds a section in "Documenting REST API endpoints" about
debugging and understanding these errors that is linked to
in the error console output.
Previously, we got the directory path for all documentation pages
before checking for API method and path information in the OpenAPI
documentation. Instead, we now check the `path_template` is the
API documentation view template before getting the directory path.
Also, changes the confusingly named `article_path` variable, which
overlapped with the DocumentationArticle dataclass `article_path`
field, to now be `api_documentation_path`.
Prep commit for moving the help center documentation to a top level
directory.
Accessing .realm will cause a fetch query from the database if the
attribute hasn't been fetched already earlier in the codepath. That's
completely redundant if we're just comparing realms, and we should only
access .realm_id attribute. This seems to eliminate a query in some
codepaths, which is nice in this performance-sensitive function.
Adds links to the documentation about management commands in the
API documentation for creating users, as well as the `/devtools`
documentation, the GDPR compliance article and the incoming
webhooks tutorial.
When file uploads are stored in S3, this means that Zulip serves as a
302 to S3. Because browsers do not cache redirects, this means that
no image contents can be cached -- and upon every page load or reload,
every recently-posted image must be re-fetched. This incurs extra
load on the Zulip server, as well as potentially excessive bandwidth
usage from S3, and on the client's connection.
Switch to fetching the content from S3 in nginx, and serving the
content from nginx. These have `Cache-control: private, immutable`
headers set on the response, allowing browsers to cache them locally.
Because nginx fetching from S3 can be slow, and requests for uploads
will generally be bunched around when a message containing them are
first posted, we instruct nginx to cache the contents locally. This
is safe because uploaded file contents are immutable; access control
is still mediated by Django. The nginx cache key is the URL without
query parameters, as those parameters include a time-limited signed
authentication parameter which lets nginx fetch the non-public file.
This adds a number of nginx-level configuration parameters to control
the caching which nginx performs, including the amount of in-memory
index for he cache, the maximum storage of the cache on disk, and how
long data is retained in the cache. The currently-chosen figures are
reasonable for small to medium deployments.
The most notable effect of this change is in allowing browsers to
cache uploaded image content; however, while there will be many fewer
requests, it also has an improvement on request latency. The
following tests were done with a non-AWS client in SFO, a server and
S3 storage in us-east-1, and with 100 requests after 10 requests of
warm-up (to fill the nginx cache). The mean and standard deviation
are shown.
| | Redirect to S3 | Caching proxy, hot | Caching proxy, cold |
| ----------------- | ------------------- | ------------------- | ------------------- |
| Time in Django | 263.0 ms ± 28.3 ms | 258.0 ms ± 12.3 ms | 258.0 ms ± 12.3 ms |
| Small file (842b) | 586.1 ms ± 21.1 ms | 266.1 ms ± 67.4 ms | 288.6 ms ± 17.7 ms |
| Large file (660k) | 959.6 ms ± 137.9 ms | 609.5 ms ± 13.0 ms | 648.1 ms ± 43.2 ms |
The hot-cache performance is faster for both large and small files,
since it saves the client the time having to make a second request to
a separate host. This performance improvement remains at least 100ms
even if the client is on the same coast as the server.
Cold nginx caches are only slightly slower than hot caches, because
VPC access to S3 endpoints is extremely fast (assuming it is in the
same region as the host), and nginx can pool connections to S3 and
reuse them.
However, all of the 648ms taken to serve a cold-cache large file is
occupied in nginx, as opposed to the only 263ms which was spent in
nginx when using redirects to S3. This means that to overall spend
less time responding to uploaded-file requests in nginx, clients will
need to find files in their local cache, and skip making an
uploaded-file request, at least 60% of the time. Modeling shows a
reduction in the number of client requests by about 70% - 80%.
The `Content-Disposition` header logic can now also be entirely shared
with the local-file codepath, as can the `url_only` path used by
mobile clients. While we could provide the direct-to-S3 temporary
signed URL to mobile clients, we choose to provide the
served-from-Zulip signed URL, to better control caching headers on it,
and greater consistency. In doing so, we adjust the salt used for the
URL; since these URLs are only valid for 60s, the effect of this salt
change is minimal.