Commit Graph

248 Commits

Author SHA1 Message Date
Mateusz Mandera 1ec0d5bd9d requests: Add SELF_HOSTING_MANAGEMENT_SUBDOMAIN. 2023-11-22 14:22:26 -08:00
Anders Kaseorg a50eb2e809 mypy: Enable new error explicit-override.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-12 12:28:41 -07:00
Alex Vandiver 05d8c9d49e middleware: Don't double-log AnomalousWebhookPayloadError.
These are `WebhookError`s which have a 500 status code; we have logged
them already in the webhook decorator, so just return the content directly.
2023-10-12 10:06:31 -07:00
Anders Kaseorg f99cce91bf middleware: Send got_request_exception signal for JSON 500 errors.
This is ordinarily emitted by Django at
  https://github.com/django/django/blob/4.2.6/django/core/handlers/exception.py#L139
and received by Sentry at
  https://github.com/getsentry/sentry-python/blob/1.31.0/sentry_sdk/integrations/django/__init__.py#L166

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-04 17:25:20 -07:00
Alex Vandiver 5ee4b642ad views: Add a /health healthcheck endpoint.
This endpoint verifies that the services that Zulip needs to function
are running, and Django can talk to them.  It is designed to be used
as a readiness probe[^1] for Zulip, either by Kubernetes, or some other
reverse-proxy load-balancer in front of Zulip.  Because of this, it
limits access to only localhost and the IP addresses of configured
reverse proxies.

Tests are limited because we cannot stop running services (which would
impact other concurrent tests) and there would be extremely limited
utility to mocking the very specific methods we're calling to raising
the exceptions that we're looking for.

[^1]: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
2023-09-20 09:53:59 -07:00
Alex Vandiver 135acfea93 nginx: Suppress proxy warnings when the proxy itself sent the request.
This is common in cases where the reverse proxy itself is making
health-check requests to the Zulip server; these requests have no
X-Forwarded-* headers, so would normally hit the error case of
"request through the proxy, but no X-Forwarded-Proto header".

Add an additional special-case for when the request's originating IP
address is resolved to be the reverse proxy itself; in these cases,
HTTP requests with no X-Forwarded-Proto are acceptable.
2023-09-12 10:10:58 -07:00
Anders Kaseorg 6c76bad65a middleware: Fix exception logging format on JSON views.
Previously (with ERROR_REPORTING = True), we’d stuff the entire
traceback of the initial exception into the subject line of an error
email, and then also send a separate email for the JSON 500 response.
Instead, log one error with the standard Django format.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-06 09:14:49 -07:00
Alex Vandiver 5368d1bd4c middleware: Allow HTTP from localhost, not through a reverse proxy.
In servers with `application_server.http_only = true` and
`loadbalancer.ips` set, the DetectProxyMisconfiguration middleware
prevents access over HTTP from IP addresses other than the
loadbalancer.

However, this misses the case of access from localhost over HTTP,
which is safe and expected -- for instance, the `email-mirror-postfix`
script used in the email gateway[^1] will post to `http://localhost/`
by default in such configurations.  With the
DetectProxyMisconfiguration installed, this will result in a 403
response.

Make an exception for requests from `127.0.0.1` and `::1` from
proxy-misconfiguration rejections.

[^1]: https://zulip.readthedocs.io/en/latest/production/email-gateway.html
2023-08-17 12:07:37 -07:00
Steve Howell 51db22c86c per-request caches: Add per_request_cache library.
We have historically cached two types of values
on a per-request basis inside of memory:

    * linkifiers
    * display recipients

Both of these caches were hand-written, and they
both actually cache values that are also in memcached,
so the per-request cache essentially only saves us
from a few memcached hits.

I think the linkifier per-request cache is a necessary
evil. It's an important part of message rendering, and
it's not super easy to structure the code to just get
a single value up front and pass it down the stack.

I'm not so sure we even need the display recipient
per-request cache any more, as we are generally pretty
smart now about hydrating recipient data in terms of
how the code is organized. But I haven't done thorough
research on that hypotheseis.

Fortunately, it's not rocket science to just write
a glorified memoize decorator and tie it into key
places in the code:

    * middleware
    * tests (e.g. asserting db counts)
    * queue processors

That's what I did in this commit.

This commit definitely reduces the amount of code
to maintain. I think it also gets us closer to
possibly phasing out this whole technique, but that
effort is beyond the scope of this PR. We could
add some instrumentation to the decorator to see
how often we get a non-trivial number of saved
round trips to memcached.

Note that when we flush linkifiers, we just use
a big hammer and flush the entire per-request
cache for linkifiers, since there is only ever
one realm in the cache.
2023-08-11 11:09:34 -07:00
Anders Kaseorg 63be67af80 logging_util: Remove dependence on get_current_request.
Pass the HttpRequest explicitly through the two webhooks that log to
the webhook loggers.

get_current_request is now unused, so remove it (in the same commit
for test coverage reasons).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-11 22:23:47 -07:00
Anders Kaseorg f66e2c3112 sentry: Remove dependence on get_current_request.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-11 22:23:47 -07:00
Alex Vandiver 8a77cca341 middleware: Detect reverse proxy misconfigurations.
Combine nginx and Django middlware to stop putting misleading warnings
about `CSRF_TRUSTED_ORIGINS` when the issue is untrusted proxies.
This attempts to, in the error logs, diagnose and suggest next steps
to fix common proxy misconfigurations.

See also #24599 and zulip/docker-zulip#403.
2023-07-02 16:20:21 -07:00
Alex Vandiver cf0b803d50 zproject: Prevent having exactly 17/18 middlewares, for Python 3.11 bug.
Having exactly 17 or 18 middlewares, on Python 3.11.0 and above,
causes python to segfault when running tests with coverage; see
https://github.com/python/cpython/issues/106092

Work around this by adding one or two no-op middlewares if we would
hit those unlucky numbers.  We only add them in testing, since
coverage is a requirement to trigger it, and there is no reason to
burden production with additional wrapping.
2023-07-02 16:20:21 -07:00
Anders Kaseorg c09e7d6407 codespell: Correct “requestor” to “requester”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-06-20 16:17:55 -07:00
Anders Kaseorg 98310f269b middleware: Do not consume StreamingHttpResponse.streaming_content.
streaming_content is an iterator. Consuming it within middleware
prevents it from being sent to the browser.

https://docs.djangoproject.com/en/4.2/ref/request-response/#streaminghttpresponse-objects

“The StreamingHttpResponse … has no content attribute. Instead, it has
a streaming_content attribute. This can be used in middleware to wrap
the response iterable, but should not be consumed.”

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-06-19 15:04:18 -07:00
Alex Vandiver 24c3e25f86 middleware: Redirect non-canonical realm domain names.
If a host is in REALM_HOSTS, it has its own domain name.  Redirect
access from other domain names to that name.
2023-05-16 15:13:51 -07:00
Anders Kaseorg d0481be3e5 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-05-10 19:44:47 -07:00
Anders Kaseorg 9db3451333 Remove statsd support.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-04-25 19:58:16 -07:00
Alex Vandiver e536a14b61 report_error: Remove API endpoint for client error reporting. 2023-04-13 14:59:58 -07:00
Alex Vandiver cb7bc1b7b9 report_error: Remove reference to old non-existant path. 2023-04-13 14:59:58 -07:00
Anders Kaseorg d2af06f4df middleware: Remove ZulipCommonMiddleware patch.
My fix for the relevant performance bug was upstreamed in Django 4.2.

https://code.djangoproject.com/ticket/33700

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-04-07 09:13:20 -07:00
Anders Kaseorg 6992d3297a ruff: Fix PIE810 Call `startswith` once with a `tuple`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-08 16:40:35 -08:00
Anders Kaseorg df001db1a9 black: Reformat with Black 23.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.

(This does not actually upgrade our Python environment to Black 23
yet.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-02 10:40:13 -08:00
Anders Kaseorg 7e3a681f80 ruff: Fix S108 Probable insecure usage of temporary file.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-26 10:14:56 -08:00
Anders Kaseorg d1bb100a2d Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-04 11:08:56 -08:00
Anders Kaseorg 2cc3fa4fba scim: Check SCIM tokens using constant-time comparison.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-16 10:52:48 -05:00
Anders Kaseorg 70dbeb197f middleware: Set the correct options on the django_language cookie.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-09 14:24:22 -08:00
Anders Kaseorg 55342efd33 scim: Upgrade django-scim2; remove request.user monkey patching.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-05 15:13:50 -07:00
Anders Kaseorg 1385a827c2 python: Clean up getattr, setattr, delattr calls with literal names.
These were useful as a transitional workaround to ignore type errors
that only show up with django-stubs, while avoiding errors about
unused type: ignore comments without django-stubs.  Now that the
django-stubs transition is complete, switch to type: ignore comments
so that mypy will tell us if they become unnecessary.  Many already
have.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-10 08:40:28 -07:00
Anders Kaseorg 92ad4455ed requirements: Upgrade Django to 4.1.
zerver/migrations/0240_usermessage_migrate_bigint_id_into_id.py needs
to be updated to account for Django 4.1 creating AutoField as an
identity column rather than a serial column.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-06 15:59:07 -07:00
Zixuan James Li e16de8d9e7 scim: Further slim down SCIMClient removing unused attributes.
This removes everything from SCIMClient except the "is_authenticated`
method. Previously, "realm" and "name" were only needed for logging
purposes. It is the best to keep SCIMClient as minimal as possible, as
it is only intended to be used for authenticating requests to SCIM
views.

This change also gurantees that the "LogRequests" middleware will not
rely on the type unsafe access of the format_requestor_for_logs method
on SCIMClient.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-09-30 15:34:50 +02:00
Zixuan James Li a20c9ea28d scim: Use setattr to set request.user as scim_client.
This is a type-unsafe workaround before we can fix the problem that
django_scim2 relies on request.user being present to authenticate
requests.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-09-27 10:19:32 -07:00
Anders Kaseorg 9198fe4fac scim: Downgrade SCIMClient from a model to an ephemeral dataclass.
SCIMClient is a type-unsafe workaround for django-scim2’s conflation
of SCIM users with Django users.  Given that a SCIMClient is not a
UserProfile, it might as well not be a model at all, since it’s only
used to satisfy django-scim2’s request.user.is_authenticated queries.

This doesn’t solve the type safety issue with assigning a SCIMClient
to request.user, nor the performance issue with running the SCIM
middleware on non-SCIM requests.  But it reduces the risk of potential
consequences worse than crashing, since there’s no longer a
request.user.id for Django to confuse with the ID of an actual
UserProfile.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-09-26 11:36:48 -07:00
Mateusz Mandera d21a1fe47f middleware: Log 5xx json_errors in JsonErrorHandler.
django.request logs responses with 5xx response codes (our configuration
of the logger prevents it from logging 4xx as well which it normally
does too). However, it does it without the traceback which results in
quite unhelpful log message that look like
"Bad Gateway:/api/v1/users/me/apns_device_token" - particularly
confusing when sent via email to server admins.

The solution here is to do the logging ourselves, using Django's
log_response() (which is meant for this purpose), and including the
traceback. Django tracks (via response._has_been_logged attribute) that
the response has already been logged, and knows to not duplicate that
action. See log_response() in django's codebase for these details.

Fixes #19596.
2022-08-31 14:43:15 -07:00
Zixuan James Li 21fd62427d typing: Remove ViewFuncT.
This removes ViewFuncT and all the associated type casts with ParamSpec
and Concatenate. This provides more accurate type annotation for
decorators at the cost of making the concatenated parameters
positional-only. This change does not intend to introduce any other
behavioral difference. Note that we retype args in process_view as
List[object] because the view functions can not only be called with
arguments of type str.

Note that the first argument of rest_dispatch needs to be made
positional-only because of the presence of **kwargs.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-08-22 15:46:16 -07:00
Zixuan James Li b02779c005 request: Refactor remote_server into RequestNotes.
This eliminates the possibility of having `request.user` as
`RemoteZulipServer` by refactoring it as an attribute of `RequestNotes`.

So we can effectively narrow the type of `request.user` by testing
`user.is_authenticated` in most cases (except that of `SCIMClient`) in
code paths that require access to `.format_requestor_for_logs` where we
previously expect either `UserProfile` or `RemoteZulipServer` backed by
the implied polymorphism.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-28 09:38:40 -07:00
Zixuan James Li e665ec8ae2 middleware: Add isinstance check before retrieving content.
StreamingHttpResponse is inferred without the isinstance check in the
else branch. We refactor this is shorten the code and also type narrow
it appropriately.
2022-07-15 14:00:56 -07:00
Zixuan James Li 2095258aa5 middleware: Assert request.method is not None.
`request.method` is not `None` in normal use cases, unless an
`HttpRequest` is directly instantiated without the method being set.
This situation does not apply to `WSGIRequest` at all.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-15 14:00:56 -07:00
Zixuan James Li 7d86d291d4 middleware: Remove inappropriate StreamingHttpResponse annotation.
Asserting response.stream is False is just suggesting the response being
an `HttpResponse`. This removes `StreamingHttpResponse` with the more
generic `HttpResponseBase` with an isinstance-check.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-15 14:00:56 -07:00
Zixuan James Li 75925fe059 middleware: Reorder middleware to clean up LogRequests hasattr checks.
Similar to the previous commit, we should access request.user only
after it has been initialized, rather than having awkward hasattr
checks.

With updates to the settings comments about LogRequests by tabbott.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-14 17:28:50 -07:00
Anders Kaseorg 345ed1d09d middleware: Pass unhandled API exceptions through to the test suite.
This results in more useful stack traces in failing tests.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-23 19:23:08 -07:00
Anders Kaseorg bb6bd900cd response: Replace response.asynchronous attribute with new class.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-05-27 14:27:34 -07:00
Anders Kaseorg 0043c0b6b2 django: Use HttpRequest.headers.
Fixes #14769.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-05-13 20:42:20 -07:00
Anders Kaseorg cce142c61a middleware: Fix URL encoding of next parameter.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-05-12 17:51:51 -07:00
Lauryn Menard 8154b4a9af middleware: Add `client` REQ parameter to `parse_client`.
If an API request specified a `client` parameter, we were
already prioritizing that value over parsing the UserAgent.

In order to have these parameters logged in the `RequestNotes`
as processed parameters instead of ignored parameters, we add
the `has_request_variables` decorator to `parse_client` and
then process the potential `client` parameter through the REQ
framework.

Co-authored by: Tim Abbott <tabbott@zulip.com>
2022-04-08 11:29:33 -07:00
Anders Kaseorg b0ce4f1bce docs: Fix many spelling mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-02-07 18:51:06 -08:00
rht bb8504d925 lint: Fix typos found by codespell. 2021-10-19 16:51:13 -07:00
Mateusz Mandera 73a6f2a1a7 auth: Add support for using SCIM for account management. 2021-10-14 12:29:10 -07:00
Mateusz Mandera 8b906b5d2f request_notes: Set the realm appropriately for the root subdomain.
Requests to the root subdomain weren't getting request_notes.realm set
even if a realm exists on the root subdomain - which is actually a
common scenario, because simply having one organization, on the root
subdomain, is the simplest and common way for self-hosted deployments.
2021-09-28 10:02:52 -07:00
Mateusz Mandera fb3864ea3c auth: Change the look of SOCIAL_AUTH_SUBDOMAIN when directly opened.
SOCIAL_AUTH_SUBDOMAIN was potentially very confusing when opened by a
user, as it had various Login/Signup buttons as if there was a realm on
it. Instead, we want to display a more informative page to the user
telling them they shouldn't even be there. If possible, we just redirect
them to the realm they most likely came from.
To make this possible, we have to exclude the subdomain from
ROOT_SUBDOMAIN_ALIASES - so that we can give it special behavior.
2021-09-10 10:47:15 -07:00