Commit Graph

253 Commits

Author SHA1 Message Date
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Mateusz Mandera dd40649e04 queue_processors: Remove the slow_queries queue.
While this functionality to post slow queries to a Zulip stream was
very useful in the early days of Zulip, when there were only a few
hundred accounts, it's long since been useless since (1) the total
request volume on larger Zulip servers run by Zulip developers, and
(2) other server operators don't want real-time notifications of slow
backend queries.  The right structure for this is just a log file.

We get rid of the queue and replace it with a "zulip.slow_queries"
logger, which will still log to /var/log/zulip/slow_queries.log for
ease of access to this information and propagate to the other logging
handlers.  Reducing the amount of queues is good for lowering zulip's
memory footprint and restart performance, since we run at least one
dedicated queue worker process for each one in most configurations.
2020-05-11 00:45:13 -07:00
Tim Abbott a702894e0e middleware: Stop using X_REAL_IP.
The comment was wrong, in that REMOTE_ADDR is where the real external
IP was; X_REAL_IP was the loadbalancer's IP.
2020-05-08 11:40:54 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Anders Kaseorg 1cf63eb5bf python: Whitespace fixes from autopep8.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-21 17:58:09 -07:00
Anders Kaseorg dce6b4a40f middleware: Remove unused cookie_domain setting.
Since commit 1d72629dc4, we have been
maintaining a patched copy of Django’s
SessionMiddleware.process_response in order to unconditionally ignore
our own optional cookie_domain setting that we don’t set.

Instead, let’s not do that.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-12 11:55:55 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Mateusz Mandera 0155193140 rate_limiter: Change type of the RateLimitResult.remaining to int.
This is cleaner than it being Optional[int], as the value of None for
this object has been synonymous to 0.
2020-04-08 10:29:18 -07:00
Mateusz Mandera e86cfbdbd7 rate_limiter: Store data in request._ratelimits_applied list.
The information used to be stored in a request._ratelimit dict, but
there's no need for that, and a list is a simpler structure, so this
allows us to simplify the plumbing somewhat.
2020-04-08 10:29:18 -07:00
Mateusz Mandera 9911c6a0f0 rate_limiter: Put secs_to_freedom as message when raising RateLimited.
That's the value that matters to the code that catches the exception,
and this change allows simplifying the plumbing somewhat, and gets rid
of the get_rate_limit_result_from_request function.
2020-04-08 10:29:18 -07:00
Mateusz Mandera eb0216c5a8 middleware: Log <user.id>@subdomain instead of subdomain/<user.id>.
It was decided that the new format is preferable.
2020-03-24 10:25:01 -07:00
Mateusz Mandera 85df6201f6 rate_limit: Move functions called by external code to RateLimitedObject. 2020-03-22 18:42:35 -07:00
Mateusz Mandera 2b51b3c6c5 middleware: Also log request subdomain when logging "unauth" request.
This returns us to a consistent logging format regardless of whether
the request is authenticated.

We also update some log examples in docs to be consistent with the new
style.
2020-03-22 18:32:04 -07:00
Mateusz Mandera 89394fc1eb middleware: Use request.user for logging when possible.
Instead of trying to set the _requestor_for_logs attribute in all the
relevant places, we try to use request.user when possible (that will be
when it's a UserProfile or RemoteZulipServer as of now). In other
places, we set _requestor_for_logs to avoid manually editing the
request.user attribute, as it should mostly be left for Django to manage
it.
In places where we remove the "request._requestor_for_logs = ..." line,
it is clearly implied by the previous code (or the current surrounding
code) that request.user is of the correct type.
2020-03-09 13:54:58 -07:00
Mateusz Mandera 0255ca9b6a middleware: Log user.id/realm.string_id instead of _email. 2020-03-09 13:54:58 -07:00
Tim Abbott 229090a3a5 middleware: Avoid running APPEND_SLASH logic in Tornado.
Profiling suggests this saves about 600us in the runtime of every GET
/events request attempting to resolve URLs to determine whether we
need to do the APPEND_SLASH behavior.

It's possible that we end up doing the same URL resolution work later
and we're just moving around some runtime, but I think even if we do,
Django probably doesn't do any fancy caching that would mean doing
this query twice doesn't just do twice the work.

In any case, we probably want to extend this behavior to our whole API
because the APPEND_SLASH redirect behavior is essentially a bug there.
That is a more involved refactor, however.
2020-02-14 16:15:57 -08:00
rht 41e3db81be dependencies: Upgrade to Django 2.2.10.
Django 2.2.x is the next LTS release after Django 1.11.x; I expect
we'll be on it for a while, as Django 3.x won't have an LTS release
series out for a while.

Because of upstream API changes in Django, this commit includes
several changes beyond requirements and:

* urls: django.urls.resolvers.RegexURLPattern has been replaced by
  django.urls.resolvers.URLPattern; affects OpenAPI code and related
  features which re-parse Django's internals.
  https://code.djangoproject.com/ticket/28593
* test_runner: Change number to suffix. Django changed the name in this
  ticket: https://code.djangoproject.com/ticket/28578
* Delete now-unnecessary SameSite cookie code (it's now the default).
* forms: urlsafe_base64_encode returns string in Django 2.2.
  https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode
* upload: Django's File.size property replaces _get_size().
  https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/
* process_queue: Migrate to new autoreload API.
* test_messages: Add an extra query caused by .refresh_from_db() losing
  the .select_related() on the Realm object.
* session: Sync SessionHostDomainMiddleware with Django 2.2.

There's a lot more we can do to take advantage of the new release;
this is tracked in #11341.

Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera squashed
are squashed into this commit.

Fixes #10835.
2020-02-13 16:27:26 -08:00
Tim Abbott 1ea2f188ce tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.

The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).

As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
  to change a couple lines allowing us our Tornado code to not
  actually return the Django HttpResponse so we can long-poll.  The
  previous hack of returning None stopped being viable with the Django 2.2
  MiddlewareMixin.__call__ implementation.

But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:

* Replace RespondAsynchronously with a response.asynchronous attribute
  on the HttpResponse; this allows Django to run its normal plumbing
  happily in a way that should be stable over time, and then we
  proceed to discard the response inside the Tornado `get()` method to
  implement long-polling.  (Better yet might be raising an
  exception?).  This lets us eliminate maintaining a patched copy of
  _get_response.

* Removing the @asynchronous decorator, which didn't add anything now
  that we only have one API endpoint backend (with two frontend call
  points) that could call into this.  Combined with the last bullet,
  this lets us remove a significant hack from our
  never_cache_responses function.

* Calling the normal Django `get_response` method from zulip_finish
  after creating a duplicate request to process, rather than writing
  totally custom code to do that.  This lets us eliminate maintaining
  a patched copy of Django's load_middleware.

* Adding detailed comments explaining how this is supposed to work,
  what problems we encounter, and how we solve various problems, which
  is critical to being able to modify this code in the future.

A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.

There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard).  We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response.  Profiling will be important to understanding what's
worth doing here.
2020-02-13 16:13:11 -08:00
Mateusz Mandera 335b804510 exceptions: RateLimited shouldn't inherit from PermissionDenied.
We will want to raise RateLimited in authenticate() in rate limiting
code - Django's authenticate() mechanism catches PermissionDenied, which
we don't want for RateLimited. We want RateLimited to propagate to our
code that called the authenticate() function.
2020-02-02 19:15:00 -08:00
Mateusz Mandera a6a2d70320 rate_limiter: Handle multiple types of rate limiting in middleware.
As more types of rate limiting of requests are added, one request may
end up having various limits applied to it - and the middleware needs to
be able to handle that. We implement that through a set_response_headers
function, which sets the X-RateLimit-* headers in a sensible way based
on all the limits that were applied to the request.
2020-02-02 19:15:00 -08:00
Wyatt Hoodes b807c4273e middleware: Fix exception typing.
Mypy seems to have trouble understanding `Exception` inheritance
here, so we create a `Union` for the only `Exception` we are
looking for.
2019-07-31 12:23:20 -07:00
Anders Kaseorg 0bcae0be55 write_log_line: Fix logging of 4xx error data.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-07-25 14:42:52 -07:00
Wyatt Hoodes 5686821150 middleware: Change write_log_line to publish as a dict.
We were seeing errors when pubishing typical events in the form of
`Dict[str, Any]` as the expected type to be a `Union`.  So we instead
change the only non-dictionary call, to pass a dict instead of `str`.
2019-07-22 17:06:41 -07:00
Mateusz Mandera f73600c82c rate_limiter: Create a general rate_limit_request_by_entity function. 2019-05-30 16:50:11 -07:00
Anders Kaseorg 9efda71a4b get_realm: raise DoesNotExist instead of returning None.
This makes the implementation of `get_realm` consistent with its
declared return type of `Realm` rather than `Optional[Realm]`.

Fixes #12263.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-05-06 21:58:16 -07:00
Puneeth Chaganti a653fcca93 html_to_text: Escape text when using as description. 2019-04-25 15:29:16 -07:00
Puneeth Chaganti 7d7134d45d html_to_text: Extract code for html to plain text conversion. 2019-04-25 15:29:16 -07:00
Anders Kaseorg 21dc34cc52 open graph: HTML-escape og:description, twitter:description.
The entire idea of doing this operation with unchecked string
replacement in a middleware class is in my opinion extremely
ill-conceived, but this fixes the most pressing problem with it
generating invalid HTML.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:53:59 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Tim Abbott 983e24a7f5 auth: Use HTTP status 404 for invalid realms.
Apparently, our invalid realm error page had HTTP status 200, which
could be confusing and in particular broken our mobile app's error
handling for this case.
2019-03-14 13:50:09 -07:00
Tim Abbott de6f724bc5 middleware: Avoid doing work for statsd when not enabled.
This saves about 8% of the runtime of our total response middleware,
or equivalently close to 2% of the total Tornado response time.  Which
is pretty significant given that we're not sure anyone is using statsd
in production.

It's also useful outside Tornado, but the effect is particularly
significant because of how important Tornado performance is.
2019-02-27 17:53:15 -08:00
Tim Abbott c955b20131 middleware: Don't repreatedly regenerate open graph functions.
This avoids parsing these functions on every request, which was
adding roughly 350us to our per-request response times.

The overall impact was more than 10% of basic Tornado response
runtime.
2019-02-27 17:53:13 -08:00
Rishi Gupta 028874bab3 open graph: Remove extraneous spaces from descriptions.
Our html collects extra spaces in a couple of places. The most prominent is
paragraphs that look like the following in the .md file:
* some text
  continued

The html will have two spaces before "continued".
2019-02-11 12:05:19 -08:00
Rishi Gupta d3125f59e1 open graph: Omit .code-section navigation from open graph. 2019-02-11 12:05:19 -08:00
Rishi Gupta e1f02dc6f2 open graph: Include multiple paragraphs in description tags. 2019-02-11 12:05:19 -08:00
Anders Kaseorg f0ecb93515 zerver core: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:41:24 -08:00
Wyatt Hoodes 8eac361fb5 docs: Refactor BS work with use of cache_with_key.
Refactor the potentially expensive work done by Beautiful Soup into a
function that is called by the alter_content function, so that we can
cache the result.  Saves a significant portion of the runtime of
loading of all of our /help/ and /api/ documentation pages (e.g. 12ms
for /api).

Fixes #11088.

Tweaked by tabbott to use the URL path as the cache key, clean up
argument structure, and use a clearer name for the function.
2019-01-28 15:21:52 -08:00
Tim Abbott 9c3f38a564 docs: Automatically construct OpenAPI metadata for help center.
This is somewhat hacky, in that in order to do what we're doing, we
need to parse the HTML of the rendered page to extract the first
paragraph to include in the open graph description field.  But
BeautifulSoup does a good job of it.

This carries a nontrivial performance penalty for loading these pages,
but overall /help/ is a low-traffic site compared to the main app, so
it doesn't matter much.

(As a sidenote, it wouldn't be a bad idea to cache this stuff).

There's lots of things we can improve in this, largely through editing
the articles, but we can deal with that over time.

Thanks to Rishi for writing all the tests.
2018-12-19 10:18:20 -08:00
Tim Abbott ae6fc0a471 sessions: Resync session middleware from Django upstream.
Until we resolve https://github.com/zulip/zulip/issues/10832, we will
need to maintain our own forked copy of Django's SessionMiddleware.
We apparently let this get out of date.

This fixes a few subtle bugs involving the user logout experience that
were throwing occasional exceptions (e.g. the UpdateError fix you can
see).
2018-11-14 15:16:12 -08:00
Tim Abbott 10ac671cd4 middleware: Fix logging of query counts in websockets requests.
Apparently, we weren't resetting the query counters inside the
websockets codebase, resulting in broken log results like this:

SOCKET  403   2ms (db: 1ms/2q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   5ms (db: 2ms/3q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   2ms (db: 3ms/4q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   2ms (db: 3ms/5q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   2ms (db: 4ms/6q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   2ms (db: 5ms/7q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   2ms (db: 5ms/8q) /socket/auth [transport=websocket] (unknown via ?)
SOCKET  403   3ms (db: 6ms/9q) /socket/auth [transport=websocket] (unknown via ?)

The correct fix for this is to call reset_queries at the start of each
endpoint within the websockets system.  As it turns out, we're already
calling record_request_start_data there, and in fact should be calling
`reset_queries` in all code paths that use that function (the other
code paths, in zerver/middleware.py, do it manually with
connection.connection.queries = []).

So we can clean up the code in a way that reduces risk for similar
future issues and fix this logging bug with this simple refactor.
2018-10-31 16:22:17 -07:00
Tim Abbott e4813e462b tornado: Rename async_request_{restart,stop} to mention timer.
Previously, these timer accounting functions could be easily mistaken
for referring to starting/stopping the request.  By adding timer to
the name, we make the code easier for the casual observer to read and
understand.
2018-10-16 15:39:10 -07:00
Vishnu Ks d2e4417a72 urls: Separate endpoint for signup and new realm email confirm.
This is preparation for the next commit.
2018-08-26 22:53:57 -07:00
Aditya Bansal 993d50f5ab zerver: Change use of typing.Text to str. 2018-05-12 15:22:39 -07:00
neiljp (Neil Pilgrim) 2ed6da77c7 mypy: Rewrite some middleware annotations to use ViewFuncT. 2018-03-17 23:25:05 +00:00
Greg Price 53c57cf002 errors: Include request info on error mails for JSON routes too.
When our code raises an exception and Django converts it to a 500
response (in django.core.handlers.exception.handle_uncaught_exception),
it attaches the request to the log record, and we use this in our
AdminNotifyHandler to include data like the user and the URL path
in the error email sent to admins.

On this line, when our code raises an exception but we've decided (in
`TagRequests`) to format any errors as JSON errors, we suppress the
exception so we have to generate the log record ourselves.  Attach the
request here, just like Django does when we let it do the job.

This still isn't an awesome solution, in that there are lots of other
places where we call `logging.error` or `logging.exception` while
inside a request; this just covers one of them.  This is one of the
most common, though, so it's a start.
2018-03-01 15:12:32 -08:00
Callum Fraser aa9567ce37 mypy: Use Python 3 type syntax in zerver/middleware.py. 2017-12-11 18:43:24 -08:00
rht a1cc720860 zerver: Use Python 3 syntax for typing.
Tweaked by tabbott to fix some minor whitespace errors.
2017-11-28 16:49:36 -08:00
Greg Price b6cc21b438 debug: Add facility to dump tracemalloc snapshots.
Originally this used signals, namely SIGRTMIN.  But in prod, the
signal handler never fired; I debugged fruitlessly for a while, and
suspect uwsgi was foiling it in a mysterious way (which is kind of
the only way uwsgi does anything.)

So, we listen on a socket.  Bit more code, and a bit trickier to
invoke, but it works.

This was developed for the investigation of memory-bloating on
chat.zulip.org that led to a331b4f64 "Optimize query_all_subs_by_stream()".

For usage instructions, see docstring.
2017-11-28 15:52:07 -08:00
Robert Hönig 0e0a8a2b14 queue processor tests: Call consume by default.
This significantly improves the API for queue_json_publish to not be
overly focused on what the behavior of this function should be in our
unit tests.
2017-11-26 11:45:34 -08:00
Tim Abbott 10ab9410c9 python: Sort imports in easy files in zerver/. 2017-11-15 15:50:28 -08:00
derAnfaenger 3ac09b3e9b queue processors: Add coverage for SlowQueryWorker. 2017-11-09 15:20:40 -08:00
rht 5ee40bf718 Remove usage of six.moves.binary_type. 2017-11-09 10:00:00 -08:00
Felix Yan aea33fc738 Fix a comment typo in zerver/middleware.py. 2017-10-30 10:36:35 -07:00
derAnfaenger 1792dcbd09 tests: Call real consume method of queue processors.
This switches to more real tests for a first batch of
queue_json_publish() calls that don't cause trouble when
used with call_consume_tests=True.
2017-10-26 14:58:03 -07:00
Greg Price 093bae4bc5 subdomains: Fix some implicit uses of "" for the root subdomain.
These are just instances that jumped out at me while working on the
subdomains code, mostly while grepping for get_subdomain call sites.
I haven't attempted a comprehensive search, and there are likely
still others left.
2017-10-26 10:29:17 -07:00
Tim Abbott 1ab2ca5986 subdomains: Extract zerver.lib.subdomains library.
These never really belonged with the rest of zerver.lib.utils.py, and
having a separate library makes it easier to enforce full test
coverage.
2017-10-18 22:27:48 -07:00
Alena Volkova 5515a075ec urls: Move the report endpoints to be API-style routes. 2017-10-17 22:05:56 -07:00
Tim Abbott 46485322eb middleware: Remove logic for redirecting to zulipdev.com domains.
We originally wrote this because when testing subdomains, you wanted
to be sure you were actually testing subdomains.  Now that subdomains
is the default, doesn't seem to actually be a good reason why we
should need this.
2017-10-05 23:21:02 -07:00
Tim Abbott 40c59f2878 middleware: Fix losing sub-URL when pushing to zulipdev.com.
Previously, this would always send one to homepage, making visiting
the /help/ documentation in the development environment using the
localhost URL unpleasant.

While this fixes the proximal bug, it's not clear to me that we need
this redirect logic at all, so I'm going to try removing it soon.
2017-10-05 16:36:34 -07:00
Tim Abbott 1d72629dc4 subdomains: Hardcode REALMS_HAVE_SUBDOMAINS=True. 2017-10-02 16:42:43 -07:00
rht 2949d1c1e8 zerver: Remove the rest of absolute_import. 2017-09-27 10:02:39 -07:00
neiljp (Neil Pilgrim) bb83742906 mypy: Correct 2 type annotations in zerver/middleware.py. 2017-08-15 17:50:18 -07:00
neiljp (Neil Pilgrim) e772e89fe2 mypy: Add None return path for RateLimitMiddleware.process_exception(). 2017-08-03 11:03:14 -07:00
Umair Khan 9e33917d25 rate_limiter: Upgrade max_api_calls to generic API. 2017-08-02 18:01:39 -07:00
Greg Price 192ec7c0f6 middleware: Use a proper error code on CSRF failure.
This allows us to reliably parse the error in code, rather than
attempt to parse the error text.  Because the error text gets
translated into the user's language, this error-handling path
wasn't functioning at all for users using Zulip in any of the
seven non-English languages for which we had a translation for
this string.

Together with 709c3b50f which fixed a similar issue in a
different error-handling path, this fixes #5598.
2017-07-25 14:02:12 -07:00
Greg Price 5de7c4f2af JsonErrorHandler: Take advantage of the new JsonableError structured data. 2017-07-24 16:41:22 -07:00
Greg Price 9faa44af60 JsonableError: Optionally carry error codes and structured data.
This provides the main infrastructure for fixing #5598.  From here,
it's a matter of on the one hand upgrading exception handlers -- the
many except-blocks in the codebase that look for JsonableError -- to
look beyond the string `msg` and pass on the machine-readable full
error information to their various downstream recipients, and on the
other hand adjusting places where we raise errors to take advantage
of this mechanism to give the errors structured details.

In an ideal future, I think all exception handlers that look (or
should look) for a JsonableError would use its contents in structured
form, never mentioning `msg`; but the majority of error sites might
continue to just instantiate JsonableError with a string message.  The
latter is the simplest thing to do, and probably most error types will
never have code looking for them specifically.

Because the new API refactors the `to_json_error_msg` method which was
designed for subclasses to override, update the 4 subclasses that did
so to take full advantage of the new API instead.
2017-07-24 16:41:22 -07:00
Greg Price 6dfb46dc08 JsonableError: Rename `status_code` and rely more on its default.
With #5598 there will soon be an application-level error code
optionally associated with a `JsonableError`, so rename this
field to make clear that it specifically refers to an
HTTP status code.

Also take this opportunity to eliminate most of the places
that refer to it, which only do so to repeat the default value.
2017-07-24 16:41:22 -07:00
Greg Price a5597e91a1 exceptions: Move zerver/exceptions.py under zerver/lib/.
Seems like a more appropriate place for it.  Preparation for
moving a bit more into that file.
2017-07-24 16:41:22 -07:00
Greg Price ff5013c619 JsonableError: Add types, and eliminate duck-typing.
In order to benefit from the modern conveniences of type-checking,
add concrete, non-Any types to the interface for JsonableError.

Relatedly, there's no need at this point to duck-type things at
the places where we receive a JsonableError and try to use it.
Simplify those by using straightforward standard typing.
2017-07-24 16:41:22 -07:00
Umair Khan 95fc16d90d Django 1.11: MIDDLEWARE_CLASSES setting is deprecated.
Django provides MiddlewareMixin to upgrade old-style middlewares. See
https://docs.djangoproject.com/en/1.11/topics/http/middleware/#upgrading-middleware
2017-06-13 15:04:04 -07:00
Tim Abbott 8e978df957 mypy: Fix missing import from recent mypy merge. 2017-05-25 16:22:19 -07:00
Ethan d2e72b0082 mypy: correct process response type. 2017-05-25 15:41:51 -07:00
Ethan b6e7e36c86 mypy: correct from int to str in rate limiting. 2017-05-25 15:41:49 -07:00
Tim Abbott 17f87cfc9e mypy: Fix missing Optional on process_exception. 2017-05-23 17:44:30 -07:00
Lukasz Prasol 5eaccc550a rate_limit: Make retry-after data machine-readable.
Fixes #4831.
2017-05-22 17:35:12 -07:00
hollywoodno 75d9630258 Add notifications on new logins to Zulip.
This adds helpful email notifications for users who just logged into a
Zulip server, as a security protection against accounts being hacked.

Text tweaked by tabbott.

Fixes #2182.
2017-03-25 16:50:52 -07:00
Tim Abbott c2c02ea4da middleware: Fix typo in render_to_response migration.
This fixes a 500 on the invalid realm page.
2017-03-21 07:30:28 -07:00
Umair Khan 4442703011 jinja2: No need for custom render_to_response.
Django 1.10 has changed the implementation of this function to
match our custom implementation; in addition to this, we prefer
render().

Fixes #1914 via #4093.
2017-03-17 13:57:34 -07:00
Umair Khan 97639e5e48 middleware: Change render_to_response to render.
Related to #4093
2017-03-17 13:52:59 -07:00
Raghav Jajodia a3a03bd6a5 mypy: Added Dict, List and Set imports.
Fixed mypy errors associated with the upgrade.
2017-03-04 14:33:44 -08:00
Tim Abbott 32bfebeb7a mypy: Fix inconsistencies in use of *args/**kwargs. 2017-02-18 18:39:44 -08:00
Tim Abbott b81fd407e8 mypy: Fix several Optional typing errors. 2017-02-10 23:53:44 -08:00
Tim Abbott de3e96162e middleware: Fix recursive DisallowedHost exceptions. 2017-01-29 20:26:58 -08:00
Tim Abbott 22d1aa396b lint: Clean up W503 PEP-8 warning. 2017-01-23 20:50:04 -08:00
Rishi Gupta 2b0a7fd0ba Rename models.get_realm_by_string_id to get_realm.
Finishes the refactoring started in c1bbd8d. The goal of the refactoring is
to change the argument to get_realm from a Realm.domain to a
Realm.string_id. The steps were

* Add a new function, get_realm_by_string_id.

* Change all calls to get_realm to use get_realm_by_string_id instead.

* Remove get_realm.

* (This commit) Rename get_realm_by_string_id to get_realm.

Part of a larger migration to remove the Realm.domain field entirely.
2017-01-04 17:12:23 -08:00
Juan Verhook cfa9c2eaf2 mypy: Update zerver directory to use Text 2016-12-29 09:12:15 -08:00
nikolay abc2ff4a06 pep8: Fix many rule E128 violations.
[Tweaked by tabbott to adjust some approaches used in wrapping]
2016-12-03 13:33:31 -08:00
bulat22101 a6f91064a2 pep8: Fix E129 violations 2016-12-03 10:56:36 -08:00
Rafid Aslam c5316b4002 lint: Fix E127 pep8 violations.
Fix pep8: E127 continuation line over-indented for visual indent
style issue.
2016-12-01 10:23:55 -08:00
Rafid Aslam 41bd88d5ed pep8: Fix E301 pep8 violations.
Fix "E301: expected (1 or 2) blank line" pep8 violations.
2016-11-29 08:51:44 -08:00
Rishi Gupta 4a74301a62 models.py: Replace resolve_subdomain_to_realm with get_realm_by_string_id.
No change in functionality.
2016-11-03 13:59:11 -07:00
hackerkid ea39fb2556 Add option for hosting each realm on its own subdomain.
This adds support for running a Zulip production server with each
realm on its own unique subdomain, e.g. https://realm_name.example.com.

This patch includes a ton of important features:
* Configuring the Zulip sesion middleware to issue cookier correctly
  for the subdomains case.
* Throwing an error if the user tries to visit an invalid subdomain.
* Runs a portion of the Casper tests with REALMS_HAVE_SUBDOMAINS
  enabled to test the subdomain signup process.
* Updating our integrations documentation to refer to the current subdomain.
* Enforces that users can only login to the subdomain of their realm
  (but does not restrict the API; that will be tightened in a future commit).

Note that toggling settings.REALMS_HAVE_SUBDOMAINS on a live server is
not supported without manual intervention (the main problem will be
adding "subdomain" values for all the existing realms).

[substantially modified by tabbott as part of merging]
2016-09-27 23:24:14 -07:00
Tim Abbott 647cead0d1 slow queries: Include full log line in slow query log.
The extra data is useful, and I think this won't make the lines annoying long.
2016-07-12 19:12:49 -07:00
Eklavya Sharma 9161ddaee0 zerver/middleware.py: Handle binary data in errors.
In write_log_line, error_content can be binary_type and
error_content_iter can be a Sequence of binary_type.  Handle
this this in a python 3 compatible way.  Also change annotations
to reflect this fact.
2016-07-10 11:30:13 -07:00
Taranjeet a8a4caf2c0 zerver: Fix lines with length greater than 120. 2016-07-08 11:41:43 -07:00
Eklavya Sharma 4761cc27dd zerver/middleware.py: Fix annotations.
* Use abstract types where relevant.
* Fix string types.
* Fix annotation of args and kwargs.
2016-07-04 02:14:42 +05:30
medullaskyline c5f0d5b40a Annotate zerver.middleware. 2016-06-04 18:32:06 -07:00
Umair Khan 08fbd57245 [i18n] Make error messages translatable.
Make all strings passing through `json_error` and `JsonableError`
translatable.

Fixes #727
2016-05-31 07:40:42 -07:00
Tim Abbott 92bec8cfea Merge Zulip 1.3.12 security release. 2016-05-10 11:32:26 -07:00
Tim Abbott 3cde06ea33 Add support for setting HTTP status codes in JsonableError. 2016-05-10 09:50:48 -07:00
Tim Abbott 54022ac204 Fix unnecessary whitespace between , and ). 2016-05-04 14:16:53 -07:00
Ryan Moore beac606ce6 switch output stats memcached -> remote_cache 2016-03-31 12:54:29 -07:00
Ryan Moore 85b05d4e2b s/memcached_output/remote_cache_output/g 2016-03-31 12:54:29 -07:00
Ryan Moore 5346e2ac23 s/memcached_count_delta/remote_cache_count_delta/g 2016-03-31 12:54:29 -07:00
Ryan Moore 1a2117292f s/memcached_requests/remote_cache_requests/g 2016-03-31 12:54:28 -07:00
Ryan Moore 16c936f638 s/memcached_time/remote_cache_time/g 2016-03-31 12:54:28 -07:00
Tim Abbott df0d2a726d python3: Add missing utf-8 encoding/decoding in various places. 2016-03-08 09:14:15 -08:00
Tim Abbott 10f15a2d00 middleware: Fix str/unicode type mismatch in statsd_path. 2016-02-03 19:29:07 -08:00
Tim Abbott b879b7ff42 Use logger.debug when logging 200/304 output on static assets. 2015-12-25 16:23:57 -08:00
Tim Abbott 06f6ee6566 Apply Python 3 futurize transform lib2to3.fixes.fix_idioms. 2015-11-01 09:25:47 -08:00
Tim Abbott 1f2aa2fcab Fix write_log_line for real.
(imported from commit cbb5c38b8e6c31822b28c478463978aa6cab33d4)
2015-08-22 14:40:47 -07:00
Tim Abbott f1bf5ba24f Fix write_log_line breakage for websockets.
(imported from commit 43bf24822329cf9729654ba58e9ffb0bff3403da)
2015-08-22 14:19:35 -07:00
Reid Barton ab9539cffe Remove OpenID authentication
(imported from commit 70a859041a851ed10dc40cfc068330e472d2ed09)
2015-08-20 23:52:48 -07:00
Reid Barton dfdc34603e Django 1.7 compatibility: handle both response.content and response.streaming_content
(imported from commit faaaff96819731a334d52b7d715c8ddb7c0d4293)
2015-08-20 23:01:26 -07:00
Tim Abbott eb1631f78d Set session cookie domain for *.e.zulip.com hostnames.
(imported from commit 42b15de3b4576341304041588ffaceac6f40baaf)
2015-01-15 21:09:52 -08:00
Tim Abbott 7e786d5426 Import default session middleware as start for custom session middleware.
(imported from commit 76aae367ab6ea5c2a7b0d98368482a3cb312b217)
2015-01-15 21:09:52 -08:00
Luke Faraone 2d3a7e5418 Use a different status code and include seconds remaining header in ratelimits
This will make it slightly easier to consume the data from our clients.

Ref:
    RFC 6585 §4

(imported from commit 6d323dc25db78a6d84a163add950f039e03e73d3)
2014-03-11 13:06:19 -04:00
Leo Franchi c504435bc3 Blacklist more paths, and fix paths with / to use . instead
(imported from commit 7e1840b7efb5d4f6e27307c3f7c95a9c822c8086)
2014-02-03 14:06:58 -05:00
Leo Franchi 30ae1c3463 Blacklist a few more statsd paths
(imported from commit 893b3d6c25e3a626b2948e69566fe5bd0db59813)
2014-01-22 10:49:49 -05:00
Zev Benjamin db23674749 Do query time tracking at the psycopg2 level instead of the Django level
This allows us to track the query time of SQLAlchemy and raw queries.

(imported from commit 818a4ee41786ffc57b80d7ed1cfba075f29b6ee5)
2014-01-14 11:47:12 -05:00
Steve Howell 5dc3d9abce Log internal queries if they are >= 5s.
(imported from commit ee88fcd6292a177e02bfe5e5bca5480b0e474030)
2013-12-26 09:23:18 -05:00
Steve Howell eb6868704f Give higher threshold for webathena kerberos queries.
These are mostly out of our control, so they are not very actionable.

(imported from commit ef342ec1edbff0fa1a934413a7f19ed14817a502)
2013-12-26 09:20:59 -05:00
Steve Howell f61740551c Bump slow query threshold to 1.2s
(imported from commit 8d97fc22d208274bc57b884828957dacf396348a)
2013-12-26 09:16:49 -05:00
Steve Howell 89f3a7c72f Break up conditional in is_slow_query().
(imported from commit c7ca42965e917a0386069c915c0225cefc218c3e)
2013-12-26 09:13:00 -05:00
Steve Howell e0a1841b1c Extract middleware.is_slow_query() and add tests.
(imported from commit 60902244a420800f558fdf2f1c38b4ed736c1286)
2013-12-26 09:09:15 -05:00
Tim Abbott b5754758a7 Don't alert on slow queries for /user_activity and /realm_activity.
(imported from commit 8b08ad47138477068f432ff15c56b2ba93ac2db6)
2013-12-19 17:41:36 -05:00
Tim Abbott 66e72d4705 Ignore slow queries on certain non-customer-facing URLs.
(imported from commit c3f53883d93a75d805b15f617f391ccd1a492ce8)
2013-12-19 17:20:43 -05:00
Tim Abbott 2ca5f43f05 Report json format 500 errors from all json format views.
Previously, we only did this via rest_dispatch.

(imported from commit b0edfdccea294378292b64677a64d5b01f936b08)
2013-12-19 16:48:51 -05:00
Tim Abbott b30afe432e Return a nice JSON error when CSRF errors happen in JSON views.
(imported from commit 916166c115f9b3ba0fdc93f8d917ff37ae22c2ae)
2013-12-19 16:48:51 -05:00
Tim Abbott c91415f318 Improve names for per-request display recipient cache.
It incorrectly advertises itself as per-process.

(imported from commit faf7ca7374d020058e80249bb16a4c6afbcb3e44)
2013-12-19 14:04:55 -05:00
Tim Abbott cc0ce6b437 Finish event handlers when disconnecting from an event queue.
This should help prevent timeouts from clients whose requests have
been supplanted.

(imported from commit fdb3a89c4ec02bb23d0fba50ea558d48cb786916)
2013-12-12 16:52:25 -05:00
Tim Abbott 878e07a5d8 Fix doubled query count logging when DEBUG=True.
(imported from commit 32dfc4c4e366ca80e659ca3d6ab4959b0235e861)
2013-12-03 12:21:06 -05:00
Tim Abbott 26f7b783a2 logging: Fix request processing time to not count initialization.
Previously, we counted not just the time required to process a
particular request, but also the time required to import+find the view
function via urls.py.  The latter is usually fast, but when a new
Django thread starts up, it can take significant time, resulting in us
flagging slow requests on each server restart and also when a new
Django thread starts up on prod to handle requests.

This change means that we no longer include that startup time as part
of request processing time -- but we still log it in the case that it
was more than 5ms, so that we can identify why a particular request
was slower than expected if high startup times become a problem.  We
annotate the time in log lines as "+start" rather than just "start" to
make clear that it's not counted in the total.

(imported from commit c677682e23b26005060390d85d386234f4f463ad)
2013-11-18 18:05:19 -05:00
Tim Abbott d44c6636c6 Add setting to enable profiling of all requests.
This is useful for the occasional case where we cannot figure out what
is causing a particular problem, but it can be easily reproduced on
staging.

(imported from commit 8b51184a8b686814f2c6ff103ba355538463ceb0)
2013-11-18 18:05:19 -05:00
Zev Benjamin eec195f421 socket: Use our full logging infrastructure
We also now separate out the times for the socket overhead, the
request service time, and the queuing delays.

(imported from commit e1683f7f28b968b86ebb701b0ac29b00ac6d67c3)
2013-11-12 15:24:30 -05:00
Zev Benjamin 6ff8ff0f3f Make all the work of the logging middleware not depend on Request or Response objects
(imported from commit ce13909d338230a931cb09405d54bb872b2ff0a6)
2013-11-12 15:24:30 -05:00
Zev Benjamin 32ed5f9f42 Move flushing the display recipient cache to its own middleware
(imported from commit 27a6935a5830ef986b18de169d66dd86d273d064)
2013-11-12 15:24:30 -05:00
Zev Benjamin 53ec292022 Store logging data in a dict instead of individual attributes
(imported from commit f7d76670428d5d8298c572a23cbfffc1d9695f23)
2013-11-12 15:24:30 -05:00
Tim Abbott 2e833613c3 [Django 1.6] update monkeypatching for CursorDebugWrapper.
(imported from commit cb4b44a2bb6ba6efbd1fb5710da2df7c1dec7f81)
2013-11-08 08:22:04 -05:00
Leo Franchi 980dc8345a Strip unicode chars from statsd path
(imported from commit 1c5829be7f89ad1e26be52b416a51da63e2a94cf)
2013-11-05 17:47:15 -05:00
Tim Abbott 5806a6e508 logging: Log the type of the narrow for get_old_messages.
(imported from commit 9e6471c10602242c924f29d29bc780667ac17672)
2013-10-21 14:33:24 -04:00
Tim Abbott 1d4f39f55a logging: Update slow query logger to not display client string.
I believe with this change the log lines will fit much better into
Zulip, and the Client string was I suspect rarely important for
responding to slow queries (and is always available in the main log
anyway).

(imported from commit ad56f446bf3fb96a14a56b825f46c1dad9b6babe)
2013-10-21 14:33:24 -04:00
Tim Abbott f3fee40d1d logging: Don't log query parameters for GET requests.
(imported from commit 4f764ae0168f6b0ad73d13a15ca7e0895514951c)
2013-10-21 14:33:24 -04:00
Leo Franchi 724f7e54e7 Clean up slow log displays
(imported from commit cab99eeaef14e1f24a6b09b0cdbcdcc45bcb620a)
2013-10-02 11:26:58 -04:00
Leo Franchi 2614716fca Log slow queries to zulip so we notice them
(imported from commit 23f311ad881edda4c4495089ea3b55213470a059)
2013-09-30 17:41:56 -04:00
Leo Franchi 0a1a63c87f Blacklist some webreq urls that we don't want to store for metrics
(imported from commit 60def99010e5c55e77acc65aa516459c6e4ffaae)
2013-09-18 11:56:25 -04:00
Jessica McKellar 2be988e6b7 Give recipient_cache and associated functions a clearer name.
(imported from commit 14b12b196e33dfc4eeca5ab0f12c130ddaef1065)
2013-08-28 10:23:40 -04:00