Commit Graph

17216 Commits

Author SHA1 Message Date
Anders Kaseorg 6988622fe8 ruff: Enable B023 Function definition does not bind loop variable.
Python’s loop scoping is misdesigned, resulting in a very common
gotcha for functions that close over loop variables [1].  The general
problem is so bad that even the Go developers plan to break
compatibility in order to fix the same design mistake in their
language [2].

Enable the Ruff rule function-uses-loop-variable (B023) [3], which
conservatively prohibits functions from binding loop variables at all.

[1] https://docs.python-guide.org/writing/gotchas/#late-binding-closures
[2] https://go.dev/s/loopvar-design
[3] https://beta.ruff.rs/docs/rules/function-uses-loop-variable/
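
As a minimal sketch of the gotcha described above (the function names
here are made up for illustration and are not from the Zulip codebase),
a closure created in a loop sees the final value of the loop variable
unless the value is bound explicitly, for example as a default
argument:
```python
def make_callbacks_late_binding():
    callbacks = []
    for i in range(3):
        # B023 flags this: `i` is looked up when the lambda is called,
        # not when it is defined.
        callbacks.append(lambda: i)
    return callbacks


def make_callbacks_bound():
    callbacks = []
    for i in range(3):
        # Binding the current value as a default argument avoids the gotcha.
        callbacks.append(lambda i=i: i)
    return callbacks


late = [f() for f in make_callbacks_late_binding()]
bound = [f() for f in make_callbacks_bound()]
print(late)   # [2, 2, 2]
print(bound)  # [0, 1, 2]
```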

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-11 18:03:45 -07:00
Anders Kaseorg cf4791264c python: Replace functools.partial with type-safe returns.curry.partial.
The type annotation for functools.partial uses unchecked Any for all
the function parameters (both early and late).  returns.curry.partial
uses a mypy plugin to check the parameters safely.

https://returns.readthedocs.io/en/latest/pages/curry.html
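
A rough illustration of the motivation, using only the standard
library (the `repeat` function is hypothetical, and
returns.curry.partial itself is deliberately not shown here): with
functools.partial, a wrongly typed argument is accepted when the
partial is built and only fails once the function is called.
```python
from functools import partial


def repeat(text: str, times: int) -> str:
    return text * times


# Wrong type for `times`; building the partial still succeeds.
mistyped = partial(repeat, times="3")

try:
    mistyped("ab")
    failed_at_call = False
except TypeError:
    # The mistake only surfaces here, when the function finally runs.
    failed_at_call = True

well_typed = partial(repeat, times=3)
print(failed_at_call, well_typed("ab"))
```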

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-11 18:03:45 -07:00
Alex Vandiver 5874c6542f migrations: Remove indexes on Message without realm_id.
These indexes should no longer be necessary after the changes in the
previous commit.
2023-09-11 15:00:37 -07:00
Alex Vandiver b94402152d models: Always search Messages with a realm_id or id limit.
Unless there is a limit on `id`, always provide a `realm_id` limit as
well.  We also notate which index is expected to be used in each
query.
2023-09-11 15:00:37 -07:00
Alex Vandiver f9dd2549eb narrow: Set a realm_id limit on messages in user searches. 2023-09-11 15:00:37 -07:00
Alex Vandiver 3518d31797 migrations: Add indexes with realm_id.
This is designed to help PostgreSQL have better specificity and
locality in its indexes.  Subsequent commits will adjust the code to
make sure that we use these indexes rather than the `realm_id`-less
versions.

We do not add a `realm_id` variation to the full-text index, since
it is a GIN index; multi-column GIN indexes are not terribly
performant, require the `btree_gin` extension for `int` types (which
requires superuser privileges on PostgreSQL 12 and earlier), and
cannot be consistently added concurrently on running instances.

After all indexes have been made, we also run `CREATE STATISTICS` in
order to give PostgreSQL the opportunity to realize that recipient and
sender are highly correlated with message realm, allowing it to
estimate that `(realm_id, recipient_id)` is likely as specific as
matching a given `recipient_id`, instead of as likely as matching
`realm_id` times matching a `recipient_id`.  Finally, those statistics
must be filled by `ANALYZE zerver_message`, which is run last.
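
The estimation argument above can be sketched as plain arithmetic. All
numbers below are invented for illustration and are not measurements
from any real installation:
```python
from fractions import Fraction

# Hypothetical table size and selectivities.
total_rows = 10_000_000
realm_selectivity = Fraction(1, 100)         # share of rows in one realm
recipient_selectivity = Fraction(1, 10_000)  # share of rows for one recipient

# Treating the columns as independent multiplies the selectivities.
independent_estimate = total_rows * realm_selectivity * recipient_selectivity

# Knowing that a recipient implies its realm, the realm adds no filtering.
correlated_estimate = total_rows * recipient_selectivity

print(independent_estimate)  # 10
print(correlated_estimate)   # 1000
```
In this made-up case the independence assumption underestimates the
matching rows by a factor of 100, which is the kind of gap the
statistics are meant to close.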
2023-09-11 15:00:37 -07:00
Alex Vandiver 067de6f948 coverage: Skip zerver.lib.migrate coverage.
It is only covered when we run migration tests, which we are not
guaranteed to always be able to do.
2023-09-11 15:00:37 -07:00
Alex Vandiver d6745209f2 django: Use .exists() instead of .count() when possible. 2023-09-11 15:00:37 -07:00
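
A sketch of the idea behind preferring an existence check over a
count, written against sqlite3 rather than the Django ORM so that it
is self-contained. The table and helper names are hypothetical:
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, realm_id INTEGER)")
conn.executemany("INSERT INTO message (realm_id) VALUES (?)", [(1,), (1,), (2,)])


def has_messages_via_count(realm_id: int) -> bool:
    # Counts every matching row, even though only "any at all?" matters.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM message WHERE realm_id = ?", (realm_id,)
    ).fetchone()
    return count > 0


def has_messages_via_exists(realm_id: int) -> bool:
    # Lets the database stop at the first matching row.
    (found,) = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM message WHERE realm_id = ?)", (realm_id,)
    ).fetchone()
    return bool(found)
```
Both helpers return the same answer; the second simply asks the
narrower question.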
Alex Vandiver 9d3d57e786 message_send: Inline single use of filter_by_exact_message_topic.
Matching the topic exactly, as opposed to case-insensitively, is not a
common operation, and one that we want to make difficult to do
accidentally.  Inline the single use case of it.
2023-09-11 15:00:37 -07:00
Alex Vandiver 710524465a messages: Switch limits from sender__realm to realm.
We now have a `realm_id` on Message; use it, rather than having to
check the sender's realm.  This is theoretically different for
cross-realm bots, but these changes are all in tests where that does
not apply.
2023-09-11 15:00:37 -07:00
Alex Vandiver 5a0f4a1a22 messages: Limit to "id" column for max-message-id computation.
This lets PostgreSQL use an "Index Only Scan" which is slightly faster
than an "Index scan".
2023-09-11 15:00:37 -07:00
Alex Vandiver 631868a05b users: Refactor and optimize max_message_id_for_user by removing a join.
This algorithm existed in multiple places, with different queries.
Since we only access properties in the UserMessage table, we
standardize on the much simpler and faster Index Only Scan, rather
than a merge join.
2023-09-11 15:00:37 -07:00
Anders Kaseorg 1905df2342 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-09 12:53:39 -07:00
Adrián Oliva 732ad89f3d markdown: Fix URL link topic skipping query.
When searching for links inside a topic name, the question mark (?)
was used to split the topic. If a URL had a query after the URL
(e.g., "?foo=bar"), then the query was trimmed from the URL.

Removing the question mark from `basic_link_splitter` is sufficient
to fix this issue. The `get_web_link_regex` function then removes
the trailing punctuation if any, including literal question marks.
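
A simplified sketch of the behavior described above. The patterns and
helpers below are hypothetical stand-ins, not the actual
`basic_link_splitter` or `get_web_link_regex` implementations:
```python
import re

# Hypothetical splitters: one treats "?" as a separator, one does not.
SPLIT_WITH_QUESTION_MARK = re.compile(r"[ !;?(),']")
SPLIT_WITHOUT_QUESTION_MARK = re.compile(r"[ !;(),']")


def first_token(splitter: "re.Pattern[str]", text: str) -> str:
    return splitter.split(text, maxsplit=1)[0]


def strip_trailing_punctuation(url: str) -> str:
    # Stand-in for the later step that trims trailing punctuation.
    return url.rstrip("?!.,;")


url = "https://example.com/page?foo=bar"
print(first_token(SPLIT_WITH_QUESTION_MARK, url))     # query is cut off
print(first_token(SPLIT_WITHOUT_QUESTION_MARK, url))  # query is kept
print(strip_trailing_punctuation("https://example.com/page?"))
```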

Fixes #26368.
2023-09-08 16:17:11 -07:00
N-Shar-ma 8c91c91d86 widgets: Fix bug where a new line right after /todo broke rendering.
When there was no space right after `/todo` but there was content on a
new line, the message would be rendered plainly, not as a todo widget.
This was because we split on only the space character to then check if
the first token was a valid widget.

Now we split on both spaces and newlines to extract the widget name,
irrespective of whether it is followed by a space or a newline. This
results in the message being rendered as a todo widget as expected.
2023-09-08 15:39:07 -07:00
Lauryn Menard 11adc0f37d demo-organizations: Rename shortened versions of 'demo organization'.
Rename existing shortened references to demo organizations, like
`is_demo_org` or `demo-org-warning`, that have been used in the
codebase so far and replace them to be like the `models.py`
variable: `Realm.demo_organization_scheduled_deletion_date`.
2023-09-08 15:17:23 -07:00
Alex Vandiver 61262c7b9a tabbed_sections: Fix a backtrack-able regex.
This REDOS was not exploitable, as its content is only read from
checked-in files; regardless, simplify it to not backtrack.  We also
do not actually have any locations which use leading or trailing
whitespace, so remove those optional bits.
2023-09-08 14:51:51 -07:00
Tim Abbott bcff5580d1 makemessages: Fix handling of handlebars whitespace control.
Our logic for extracting strings from templates did not properly
handle the syntax for code containing whitespace control characters,
resulting in a couple strings from subscribe_to_more_streams.hbs not
being processed.
2023-09-08 09:09:46 -07:00
Zixuan James Li 7d683018bd webhooks: Migrate travis to use @typed_endpoint.
To perform the same check, we define a Pydantic model. This includes
the keys "build_url" and "type", which we did not check for previously.
2023-09-08 08:20:17 -07:00
Zixuan James Li 4037196fb2 webhooks: Migrate librato to use @typed_endpoint.
The Librato webhook requires a mapping (which should be considered
immutable) with a default value. Ruff reports a false-positive due to
the Json wrapper.
2023-09-08 08:20:17 -07:00
Zixuan James Li a33607d8ad webhooks: Convert gitlab to use @typed_endpoint.
The GitLab webhook has a mix of different types of parameters each
requiring a unique set of configurations.
2023-09-08 08:20:17 -07:00
Zixuan James Li b163f2fe4e webhooks: Convert non-body payload webhooks to use @typed_endpoint.
These webhooks do not use argument_type_is_body, so they are parsing the
payload from a query parameter directly into WildValue.
2023-09-08 08:20:17 -07:00
Zixuan James Li 318a9316a7 webhooks: Migrate webhooks with special payload types to use @typed_endpoint.
Instead of a WildValue, the JSON and Sentry webhooks expect the request body
to be a dict.

For the JSON webhook, json.dumps accepts other types of input as well, so the
constraint is not necessary, but this serves as a good example of an alternative
use of WebhookPayload to describe a payload that is intended to be parsed from
the entire request body as JSON, into a type other than WildValue.
2023-09-08 08:20:17 -07:00
Zixuan James Li ece6b98699 webhooks: Migrate helloworld to use @typed_endpoint with WildValue.
We owe more documentation on the use of WildValue. A follow-up on
updating it with examples of WildValue and endpoint will be desirable.
2023-09-08 08:20:17 -07:00
Zixuan James Li 9fef12950a webhooks: Migrate transifex to use @typed_endpoint.
Transifex has parameters that need to be parsed from JSON and converted
to int. Note that we use Optional[Json[int]] instead of
Json[Optional[int]] to replicate the behavior of json_validator. This
caveat is explained in a new test called test_json_optional.
2023-09-08 08:20:17 -07:00
Zixuan James Li 1329284848 webhooks: Migrate webhooks with str parameters to use @typed_endpoint.
These webhooks have some URL query params that do not need additional
validation or parsing from JSON. So WebhookPayload is not applicable to
these webhooks.
2023-09-08 08:20:17 -07:00
Zixuan James Li 9377080f1f webhooks: Migrate most webhooks to use @typed_endpoint.
This converts most webhook integration views to use @typed_endpoint instead
of @has_request_variables, rewriting REQ parameters. For these
webhooks, it simply requires switching the decorator, rewriting the
type annotation of payload/message to WebhookPayload[WildValue], and
removing the REQ default that defines the to_wild_value converter.
2023-09-08 08:20:17 -07:00
Zixuan James Li 574740dda4 webhooks: Migrate check_send_webhook_message to use @typed_endpoint.
This function is used by almost all webhooks.

To support it, we use the "api_ignore_parameter" flag so that positional
arguments like topic and body that are not intended to be parsed from
the request can be ignored.
2023-09-08 08:20:17 -07:00
Zixuan James Li 910f69465c drafts: Migrate drafts to use @typed_endpoint.
This demonstrates the use of BaseModel to replace a check_dict_only
validator.

We also add support for referring to $defs in the OpenAPI tests. In the
future, we can descend down each object instead of mapping them to dict
for more accurate checks.
2023-09-08 08:20:17 -07:00
Zixuan James Li 4701f290f7 presence: Migrate presence to use @typed_endpoint.
This demonstrates a use of StringConstraints.
2023-09-08 08:20:17 -07:00
Zixuan James Li 6201914fd3 message_edit: Migrate message_edit to use @typed_endpoint.
This demonstrates how an alias is created and when one is suitable, as
well as the use of PathOnly, NonNegativeInt, and Literal.
2023-09-08 08:20:17 -07:00
Zixuan James Li 9c53995830 alert_words: Migrate alert_words to use @typed_endpoint.
This demonstrates some basic use cases of the Json[...] wrapper with
@typed_endpoint.

Along with this change we extend test_openapi so that schema checking
based on function signatures will still work with this new decorator.
Pydantic's TypeAdapter supports dumping the JSON schema of any given type,
which is leveraged here to validate against our own OpenAPI definitions.
Parts of the implementation will be covered in later commits as we
migrate more functions to use @typed_endpoint.

See also:
https://docs.pydantic.dev/latest/api/type_adapter/#pydantic.type_adapter.TypeAdapter.json_schema

For the OpenAPI schema, we preprocess it mostly the same way. For the
parameter types though, we no longer need to use
get_standardized_argument_type to normalize type annotation, because
Pydantic dumps a JSON schema that is compliant with OpenAPI schema
already, which makes it a lot more convenient for us to compare the types
with our OpenAPI definitions.

Do note that there are some exceptions where our definitions do not match
the generated one. For example, we use JSON to parse int and bool parameters,
but we don't mark them to use "application/json" in our definitions.
2023-09-08 08:20:17 -07:00
Zixuan James Li c336bf0398 api: Avoid programming errors due to nested Annotated types.
We want to reject ambiguous type annotations that set ApiParamConfig
inside a Union. If a parameter is Optional and has a default of None, we
prefer Annotated[Optional[T], ...] over Optional[Annotated[T, ...]].

This implements a check that detects Optional[Annotated[T, ...]] and
raises an assertion error if ApiParamConfig is in the annotation. It also
checks if the type annotation contains any ApiParamConfig objects that
are ignored, which can happen if the Annotated type is nested inside
another type like List, Union, etc.

Note that because
param: Annotated[Optional[T], ...] = None
and
param: Optional[Annotated[Optional[T], ...]] = None
are equivalent at runtime prior to Python 3.11, there is no way for us
to distinguish the two. So we cannot detect that at runtime.
See also: https://github.com/python/cpython/issues/90353
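
A minimal sketch of such a check, using only the typing module.
`ParamConfig` is a hypothetical stand-in for ApiParamConfig, and the
real implementation may differ in how it walks the annotation:
```python
from dataclasses import dataclass
from typing import Annotated, List, Optional, get_args, get_origin


@dataclass(frozen=True)
class ParamConfig:
    # Hypothetical stand-in for ApiParamConfig.
    whence: str


def config_is_nested(tp: object, top_level: bool = True) -> bool:
    """Return True if a ParamConfig appears below the outermost Annotated."""
    if get_origin(tp) is Annotated:
        inner, *metadata = get_args(tp)
        if not top_level and any(isinstance(m, ParamConfig) for m in metadata):
            return True
        return config_is_nested(inner, top_level=False)
    # Descend into Union, List, and other parameterized types.
    return any(config_is_nested(arg, top_level=False) for arg in get_args(tp))


preferred = Annotated[Optional[int], ParamConfig("lorem")]
rejected = Optional[Annotated[int, ParamConfig("lorem")]]
also_rejected = List[Annotated[int, ParamConfig("lorem")]]
```
Here `config_is_nested` is false for `preferred` and true for the
other two, where the configuration would otherwise be silently
ignored.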
2023-09-08 08:20:17 -07:00
Zixuan James Li 5a7b1065e5 api: Rewrite argument type test for clarity.
We refactor HostRequestMock so that it now properly populates the request
body given the post data, assuming that the request is JSON encoded.
2023-09-08 08:20:17 -07:00
Zixuan James Li f4caf9dd79 api: Add new typed_endpoint decorators.
The goal of typed_endpoint is to replicate most features supported by
has_request_variables, and to improve on it. There are some
unresolved issues that we don't plan to work on currently. For example,
typed_endpoint does not support ignored_parameters_unsupported for 400
responses, and it does not run validators on path-only arguments.

Unlike has_request_variables, typed_endpoint supports error handling by
processing validation errors from Pydantic.

Most features supported by has_request_variables are supported by
typed_endpoint in various ways.

To define a function, use a syntax like this, with Annotated if there is
any metadata you want to associate with a parameter. Do note that
parameters that are not keyword-only are not parsed from the request:
```
@typed_endpoint
def view(
    request: HttpRequest,
    user_profile: UserProfile,
    *,
    foo: Annotated[int, ApiParamConfig(path_only=True)],
    bar: Json[int],
    other: Annotated[
        Json[int],
        ApiParamConfig(
            whence="lorem",
            documentation_status=INTENTIONALLY_UNDOCUMENTED
        )
    ] = 10,
) -> HttpResponse:
    ....
```

There are also some shorthands for the commonly used annotated types,
which are encouraged when applicable for better readability and less
typing:
```
WebhookPayload = Annotated[Json[T], ApiParamConfig(argument_type_is_body=True)]
PathOnly = Annotated[T, ApiParamConfig(path_only=True)]
```

Then the view function above can be rewritten as:
```
@typed_endpoint
def view(
    request: HttpRequest,
    user_profile: UserProfile,
    *,
    foo: PathOnly[int],
    bar: Json[int],
    other: Annotated[
        Json[int],
        ApiParamConfig(
            whence="lorem",
            documentation_status=INTENTIONALLY_UNDOCUMENTED
        )
    ] = 10,
) -> HttpResponse:
    ....
```

There are some intentional restrictions:
- A single parameter cannot have more than one ApiParamConfig
- Path-only parameters cannot have default values
- argument_type_is_body is incompatible with whence
- Arguments named "request", "user_profile", "args", "kwargs", and so on
  are ignored by typed_endpoint.
- Positional-only arguments are not supported by typed_endpoint. Only
  keyword-only parameters are expected to be parsed from the request.
- Pydantic's strict mode is always enabled, because we don't want to
  coerce input parsed from JSON into other types unnecessarily.
- Using strict mode all the time also means that we should always use
  Json[int] instead of int, because it is only possible for the request
  to have data of type str, and a type annotation of int will always
  reject such data.
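
The keyword-only and JSON-parsing rules above can be sketched with the
standard library alone. This is not how typed_endpoint is implemented;
`parse_keyword_only` is a hypothetical helper that treats every
keyword-only parameter as if it were annotated with Json[...]:
```python
import inspect
import json
from typing import Any, Callable, Dict


def parse_keyword_only(view: Callable[..., Any], raw: Dict[str, str]) -> Dict[str, Any]:
    """JSON-decode request strings for the view's keyword-only parameters."""
    parsed: Dict[str, Any] = {}
    for name, param in inspect.signature(view).parameters.items():
        if param.kind is not inspect.Parameter.KEYWORD_ONLY:
            # e.g. request and user_profile never come from the request data.
            continue
        if name in raw:
            # Request data arrives as str; decoding "3" as JSON yields int 3.
            parsed[name] = json.loads(raw[name])
        elif param.default is inspect.Parameter.empty:
            raise ValueError(f"missing required parameter: {name}")
    return parsed


def view(request: object, user_profile: object, *, bar: int, other: int = 10) -> int:
    return bar + other


kwargs = parse_keyword_only(view, {"bar": "3", "request": "ignored"})
print(kwargs)                      # {'bar': 3}
print(view(None, None, **kwargs))  # 13
```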

typed_endpoint's handling of ignored_parameters_unsupported is mostly
identical to that of has_request_variables.
2023-09-08 08:20:17 -07:00
Anders Kaseorg 0ce6dcb905 mypy: Upgrade mypy from 1.4.1 to 1.5.1.
_default_manager is the same as objects on most of our models. But
when a model class is stored in a variable, the type system doesn’t
know which model the variable is referring to, so it can’t know that
objects even exists (Django doesn’t add it if the user added a custom
manager of a different name). django-stubs used to incorrectly assume
it exists unconditionally, but it no longer does.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-07 17:51:42 -07:00
Anders Kaseorg 2cd018ce57 models: Remove duplicate index definition for date_sent.
Commit cf0eb46afc added this to let
Django understand the CREATE INDEX CONCURRENTLY statement that had
been hidden in a RunSQL query in migration 0244.  However, migration
0245 explained that same index to Django in a different way by setting
db_index=True.  Move that to 0244 where the index is actually created,
using SeparateDatabaseAndState.

Also remove the part of the SQL in 0245 that was mirrored by dummy
state_operations, and replace it with real operations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-07 16:44:43 -07:00
Tim Abbott 6c83bbcbdb settings: Disallow everyone group for new setting.
This is important because the "guests" value isn't one that we'd
expect anyone to pick intentionally, and in particular isn't an
available option for the similar/adjacent "email invitations" setting.
2023-09-07 14:21:01 -07:00
Ujjawal Modi 88ec312b21 events: Send invites changes event to non-admin users also.
Earlier, whenever a new invitation was created, an event was sent
only to admin users. So, if invites created by a non-admin user
changed, the invite panel did not live update.

This commit also sends the event to non-admin users when invites
created by them are changed.
2023-09-07 14:21:01 -07:00
Ujjawal Modi 5e31a6b1c0 invites: Make it possible for non-admins to revoke multiuse invites.
This commit makes changes to allow non-admins to revoke multiuse
invitations created by them.
2023-09-07 14:21:01 -07:00
Ujjawal Modi ec49c3acc8 invites: Rename `can_invite_others_to_realm` local variables.
This commit renames the existing setting `Who can invite users to this
organization` to `Who can send email invitations to new users` and
also renames all the variables related to this setting that do not
require a change to the API.

This was done for better code readability as a new setting
`Who can create invite links` will be added in future commits.
2023-09-07 14:21:01 -07:00
Ujjawal Modi f67cef8885 invite: Add new setting for "Who can create multiuse invite links".
This commit does the backend changes required for adding a realm
setting based on the group permission model and does the API changes
required for the new setting `Who can create multiuse invite link`.
2023-09-07 14:21:01 -07:00
Ujjawal Modi 9eccb4336e types: Add id_field_name field to GroupPermissionSetting type.
This commit adds id_field_name field to GroupPermissionSetting
type which will be used to store the string formed by concatenation
of setting_name and `_id`.
2023-09-07 14:21:01 -07:00
Tim Abbott 5f8bbfa652 invite: Explicitly mark REALM_OWNER as requiring an admin.
This was already enforced via separate logic that requires an owner to
invite an owner, but it makes the intent of the code a lot more clear
if we don't have this value mysteriously absent.
2023-09-07 14:21:01 -07:00
Ujjawal Modi a0b16e550e invites: Add a function to check if owner or admin is required.
Earlier, there was a function to check whether an owner
is required to create invitations for the role specified
in the invite, while the check for an administrator was
done inline, without any function call.

This commit adds a new function to check whether
owner or administrator is required for creating
invitations for the specified role and
refactors the code to use that new function.
2023-09-07 14:21:01 -07:00
Ujjawal Modi 2e59b1f30e tests: Use function to create realm rather than django ORM.
This commit makes changes in backend tests to use
`do_create_realm` function to create realm.
2023-09-07 14:21:01 -07:00
Ujjawal Modi 72b099524d internal_realm: Single transaction for changes while creating realm.
This commit makes the database changes performed while creating
internal_realm happen in a single transaction.
This is needed for deferring the foreign key constraints
to the end of the transaction.
2023-09-07 14:21:01 -07:00
Anders Kaseorg 48a3588cdb docs: Fix typos caught by ‘typos’.
https://github.com/crate-ci/typos

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-06 18:59:05 -07:00
Anders Kaseorg 6c76bad65a middleware: Fix exception logging format on JSON views.
Previously (with ERROR_REPORTING = True), we’d stuff the entire
traceback of the initial exception into the subject line of an error
email, and then also send a separate email for the JSON 500 response.
Instead, log one error with the standard Django format.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-06 09:14:49 -07:00
Zixuan James Li 1e1f98edb2 transaction_tests: Remove testing URL.
Rewrite the test so that we don't have a dedicated URL for testing.
dev_update_subgroups is called directly from the tests without using the
test client.
2023-09-06 09:13:02 -07:00