Commit Graph

190 Commits

Author SHA1 Message Date
PIG208 c03b9c95ad request: Store client information using ZulipRequestNotes.
This concludes the HttpRequest migration to eliminate arbitrary
attributes (except private ones that are belong to django) attached
to the request object during runtime and migrated them to a
separate data structure dedicated for the purpose of adding
information (so called notes) to a HttpRequest.
2021-07-14 12:01:07 -07:00
PIG208 8eb2c3ffdb request: Move realm from the request to ZulipRequestNotes. 2021-07-14 12:01:07 -07:00
PIG208 742c17399e request: Move miscellaneous attributes to ZulipRequestNotes.
This includes the migration of fields that require trivial changes
to be migrated to be stored with ZulipRequestNotes.

Specifically _requestor_for_logs, _set_language, _query, error_format,
placeholder_open_graph_description, saveed_response, which were all
previously set on the HttpRequest object at some point. This migration
allows them to be typed.
2021-07-14 12:01:07 -07:00
PIG208 5475334b16 request: Refactor to store requestor_for_logs in ZulipRequestNotes. 2021-07-14 12:01:07 -07:00
PIG208 3f9a5e1e17 request: Refactor to record rate limit data using ZulipRequestNotes.
We will no longer use the HttpRequest to store the rate limit data.
Using ZulipRequestNotes, we can access rate_limit and ratelimits_applied
with type hints support. We also save the process of initializing
ratelimits_applied by giving it a default value.
2021-07-14 12:01:07 -07:00
PIG208 da6e5ddcae request: Move log_data from HttpRequest to ZulipRequestNotes. 2021-07-14 12:01:05 -07:00
PIG208 8b9011dff8 json_error: Completely remove json_error.
This completes the migration from `return json_error` to
`raise JsonableError`.
2021-07-06 15:34:33 -07:00
Anders Kaseorg 544bbd5398 docs: Fix capitalization mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-10 09:57:26 -07:00
Tim Abbott 615ad2d5d8 middleware: Simplify logic for parsing user-agent.
This avoids calling parse_user_agent twice when dealing with official
Zulip clients, and also makes the logical flow hopefully easier to read.

We move get_client_name out of decorator.py, since it no longer
belongs there, and give it a nicer name.
2021-04-29 17:47:41 -07:00
orientor fe260fb892 middleware: Show client version in logging if available.
Fixes #14067.
2021-04-29 17:07:37 -07:00
orientor ac203cd9f1 middleware: Add client_version attribute to request. 2021-04-29 17:03:40 -07:00
orientor 6224d83dea middleware: Get client name in LogRequests instead of process_client.
This ensures it is present for all requests; while that was already
essentially true via process_client being called from every standard
decorator, this allows middleware and other code to rely on this
having been set.
2021-04-29 17:03:05 -07:00
Anders Kaseorg e7ed907cf6 python: Convert deprecated Django ugettext alias to gettext.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-15 18:01:34 -07:00
Alex Vandiver 07779ea879 middleware: Do not trust X-Forwarded-For; use X-Real-Ip, set from nginx.
The `X-Forwarded-For` header is a list of proxies' IP addresses; each
proxy appends the remote address of the host it received its request
from to the list, as it passes the request down.  A naïve parsing, as
SetRemoteAddrFromForwardedFor did, would thus interpret the first
address in the list as the client's IP.

However, clients can pass in arbitrary `X-Forwarded-For` headers,
which would allow them to spoof their IP address.  `nginx`'s behavior
is to treat the addresses as untrusted unless they match an allowlist
of known proxies.  By setting `real_ip_recursive on`, it also allows
this behavior to be applied repeatedly, moving from right to left down
the `X-Forwarded-For` list, stopping at the right-most that is
untrusted.

Rather than re-implement this logic in Django, pass the first
untrusted value that `nginx` computer down into Django via `X-Real-Ip`
header.  This allows consistent IP addresses in logs between `nginx`
and Django.

Proxied calls into Tornado (which don't use UWSGI) already passed this
header, as Tornado logging respects it.
2021-03-31 14:19:38 -07:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 9773c0f1a8 python: Fix string literal concatenation mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 08:02:51 -05:00
Mateusz Mandera f76202dd59 django3: Save language preference in a cookie rather than the session.
Support for saving it in the session is dropped in django3, the cookie
is the mechanism that needs to be used. The relevant i18n code doesn't
have access to the response objects and thus needs to delegate setting
the cookie to LocaleMiddleware.

Fixes the LocaleMiddleware point of #16030.
2021-01-17 10:38:58 -08:00
Mateusz Mandera 43a0c60e96 exceptions: Make RateLimited into a subclass of JsonableError.
This simplifies the code, as it allows using the mechanism of converting
JsonableErrors into a response instead of having separate, but
ultimately similar, logic in RateLimitMiddleware.
We don't touch tests here because "rate limited" error responses are
already verified in test_external.py.
2020-12-01 13:40:56 -08:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Alex Vandiver 536bd3188e middleware: Move locale-setting before domain checking.
Calling `render()` in a middleware before LocaleMiddleware has run
will pick up the most-recently-set locale.  This may be from the
_previous_ request, since the current language is thread-local.  This
results in the "Organization does not exist" page occasionally being
in not-English, depending on the preferences of the request which that
thread just finished serving.

Move HostDomainMiddleware below LocaleMiddleware; none of the earlier
middlewares call `render()`, so are safe.  This will also allow the
"Organization does not exist" page to be localized based on the user's
browser preferences.

Unfortunately, it also means that the default LocaleMiddleware catches
the 404 from the HostDomainMiddlware and helpfully tries to check if
the failure is because the URL lacks a language component (e.g.
`/en/`) by turning it into a 304 to that new URL.  We must subclass
the default LocaleMiddleware to remove this unwanted functionality.

Doing so exposes a two places in tests that relied (directly or
indirectly) upon the redirection: '/confirmation_key'
was redirected to '/en/confirmation_key', since the non-i18n version
did not exist; and requests to `/stats/realm/not_existing_realm/`
incorrectly were expecting a 302, not a 404.

This regression likely came in during f00ff1ef62, since prior to
that, the HostDomainMiddleware ran _after_ the rest of the request had
completed.
2020-09-14 22:16:09 -07:00
Alex Vandiver 6323218a0e request: Maintain a thread-local of the current request.
This allows logging (to Sentry, or disk) to be annotated with richer
data about the request.
2020-09-11 16:43:29 -07:00
Anders Kaseorg f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg a276eefcfe python: Rewrite dict() as {}.
Suggested by the flake8-comprehensions plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg ab120a03bc python: Replace unnecessary intermediate lists with generators.
Mostly suggested by the flake8-comprehension plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Aman fd5423a8f9 exceptions: Extract json_unauths into MissingAuthenticationError.
We raise two types of json_unauthorized when
MissingAuthenticationError is raised. Raising the one
with www_authenticate let's the client know that user needs
to be logged in to access the requested content.

Sending `www_authenticate='session'` header with the response
also stops modern web-browsers from showing a login form to the
user and let's the client handle it completely.

Structurally, this moves the handling of common authentication errors
to a single shared middleware exception handler.
2020-08-30 14:51:50 -07:00
Tim Abbott 1fddf16b73 Revert "exceptions: Extract json_unauths into MissingAuthenticationError."
This reverts commit c355f6b8d8.
2020-08-25 17:42:07 -07:00
Aman c355f6b8d8 exceptions: Extract json_unauths into MissingAuthenticationError.
We raise two types of json_unauthorized when
MissingAuthenticationError is raised. Raising the one
with www_authenticate let's the client know that user needs
to be logged in to access the requested content.

Sending `www_authenticate='session'` header with the response
also stops modern web-browsers from showing a login form to the
user and let's the client handle it completely.

Structurally, this moves the handling of common authentication errors
to a single shared middleware exception handler.
2020-08-25 16:52:21 -07:00
Alex Vandiver 596cf2580b sentry: Ignore all SuspiciousOperation loggers.
django.security.DisallowedHost is only one of a set of exceptions that
are "SuspiciousOperation" exceptions; all return a 400 to the user
when they bubble up[1]; all of them are uninteresting to Sentry.
While they may, in bulk, show a mis-configuration of some sort of the
application, such a failure should be detected via the increase in
400's, not via these, which are uninteresting individually.

While all of these are subclasses of SuspiciousOperation, we enumerate
them explicitly for a number of reasons:

 - There is no one logger we can ignore that captures all of them.
   Each of the errors uses its own logger, and django does not supply
   a `django.security` logger that all of them feed into.

 - Nor can we catch this by examining the exception object.  The
   SuspiciousOperation exception is raised too early in the stack for
   us to catch the exception by way of middleware and check
   `isinstance`.  But at the Sentry level, in `add_context`, it is no
   longer an exception but a log entry, and as such we have no
   `isinstance` that can be applied; we only know the logger name.

 - Finally, there is the semantic argument that while we have decided
   to ignore this set of security warnings, we _may_ wish to log new
   ones that may be added at some point in the future.  It is better
   to opt into those ignores than to blanket ignore all messages from
   the security logger.

This moves the DisallowedHost `ignore_logger` to be adjacent to its
kin, and not on the middleware that may trigger it.  Consistency is
more important than locality in this case.

Of these, the DisallowedHost logger if left as the only one that is
explicitly ignored in the LOGGING configuration in
`computed_settings.py`; it is by far the most frequent, and the least
likely to be malicious or impactful (unlike, say, RequestDataTooBig).

[1] https://docs.djangoproject.com/en/3.0/ref/exceptions/#suspiciousoperation
2020-08-12 16:08:38 -07:00
Alex Vandiver 28c627452f sentry: Ignore DisallowedHost messages.
This is a misconfiguration of the client, not the server.
2020-08-11 10:38:14 -07:00
Alex Vandiver f00ff1ef62 middleware: Make HostDomain into a process_request, not process_response.
It is more suited for `process_request`, since it should stop
execution of the request if the domain is invalid.  This code was
likely added as a process_response (in ea39fb2556) because there was
already a process_response at the time (added 7e786d5426, and no
longer necessary since dce6b4a40f).

It quiets an unnecessary warning when logging in at a non-existent
realm.

This stops performing unnecessary work when we are going to throw it
away and return a 404.  The edge case to this is if the request
_creates_ a realm, and is made using the URL of the new realm; this
change would prevent the request before it occurs. While this does
arise in tests, the tests do not reflect reality -- real requests to
/accounts/register/ are made via POST to the same (default) realm,
redirected there from `confirm-preregistrationuser`.  The tests are
adjusted to reflect real behavior.

Tweaked by tabbott to add a block comment in HostDomainMiddleware.
2020-08-11 10:37:55 -07:00
Alex Vandiver 9266315a1f middleware: Stop shadowing top-level logger definition on line 33. 2020-07-27 16:46:13 -07:00
Alex Vandiver 1b2d0271af sentry: Prevent double-logging of JSON-formatted errors.
Capture and report the initial exception, not the formatted text-only
message traceback.
2020-07-27 11:07:55 -07:00
Mohit Gupta 44d68c1840 refactor: Rename bugdown words to markdown in stats related functions.
This commit is part of series of commits aimed at renaming bugdown to
markdown.
2020-06-26 17:20:40 -07:00
Mohit Gupta 3f5fc13491 refactor: Rename zerver.lib.bugdown to zerver.lib.markdown .
This commit is first of few commita which aim to change all the
bugdown references to markdown. This commits rename the files,
file path mentions and change the imports.
Variables and other references to bugdown will be renamed in susequent
commits.
2020-06-26 17:08:37 -07:00
Anders Kaseorg 5dc9b55c43 python: Manually convert more percent-formatting to f-strings.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Mateusz Mandera dd40649e04 queue_processors: Remove the slow_queries queue.
While this functionality to post slow queries to a Zulip stream was
very useful in the early days of Zulip, when there were only a few
hundred accounts, it's long since been useless since (1) the total
request volume on larger Zulip servers run by Zulip developers, and
(2) other server operators don't want real-time notifications of slow
backend queries.  The right structure for this is just a log file.

We get rid of the queue and replace it with a "zulip.slow_queries"
logger, which will still log to /var/log/zulip/slow_queries.log for
ease of access to this information and propagate to the other logging
handlers.  Reducing the amount of queues is good for lowering zulip's
memory footprint and restart performance, since we run at least one
dedicated queue worker process for each one in most configurations.
2020-05-11 00:45:13 -07:00
Tim Abbott a702894e0e middleware: Stop using X_REAL_IP.
The comment was wrong, in that REMOTE_ADDR is where the real external
IP was; X_REAL_IP was the loadbalancer's IP.
2020-05-08 11:40:54 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Anders Kaseorg 1cf63eb5bf python: Whitespace fixes from autopep8.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-21 17:58:09 -07:00
Anders Kaseorg dce6b4a40f middleware: Remove unused cookie_domain setting.
Since commit 1d72629dc4, we have been
maintaining a patched copy of Django’s
SessionMiddleware.process_response in order to unconditionally ignore
our own optional cookie_domain setting that we don’t set.

Instead, let’s not do that.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-12 11:55:55 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Mateusz Mandera 0155193140 rate_limiter: Change type of the RateLimitResult.remaining to int.
This is cleaner than it being Optional[int], as the value of None for
this object has been synonymous to 0.
2020-04-08 10:29:18 -07:00
Mateusz Mandera e86cfbdbd7 rate_limiter: Store data in request._ratelimits_applied list.
The information used to be stored in a request._ratelimit dict, but
there's no need for that, and a list is a simpler structure, so this
allows us to simplify the plumbing somewhat.
2020-04-08 10:29:18 -07:00
Mateusz Mandera 9911c6a0f0 rate_limiter: Put secs_to_freedom as message when raising RateLimited.
That's the value that matters to the code that catches the exception,
and this change allows simplifying the plumbing somewhat, and gets rid
of the get_rate_limit_result_from_request function.
2020-04-08 10:29:18 -07:00
Mateusz Mandera eb0216c5a8 middleware: Log <user.id>@subdomain instead of subdomain/<user.id>.
It was decided that the new format is preferable.
2020-03-24 10:25:01 -07:00