zulip

Commit Graph

Author	SHA1	Message	Date
Mateusz Mandera	0255ca9b6a	middleware: Log user.id/realm.string_id instead of _email.	2020-03-09 13:54:58 -07:00
Tim Abbott	229090a3a5	middleware: Avoid running APPEND_SLASH logic in Tornado. Profiling suggests this saves about 600us in the runtime of every GET /events request attempting to resolve URLs to determine whether we need to do the APPEND_SLASH behavior. It's possible that we end up doing the same URL resolution work later and we're just moving around some runtime, but I think even if we do, Django probably doesn't do any fancy caching that would mean doing this query twice doesn't just do twice the work. In any case, we probably want to extend this behavior to our whole API because the APPEND_SLASH redirect behavior is essentially a bug there. That is a more involved refactor, however.	2020-02-14 16:15:57 -08:00
rht	41e3db81be	dependencies: Upgrade to Django 2.2.10. Django 2.2.x is the next LTS release after Django 1.11.x; I expect we'll be on it for a while, as Django 3.x won't have an LTS release series out for a while. Because of upstream API changes in Django, this commit includes several changes beyond requirements: * urls: django.urls.resolvers.RegexURLPattern has been replaced by django.urls.resolvers.URLPattern; affects OpenAPI code and related features which re-parse Django's internals. https://code.djangoproject.com/ticket/28593 * test_runner: Change number to suffix. Django changed the name in this ticket: https://code.djangoproject.com/ticket/28578 * Delete now-unnecessary SameSite cookie code (it's now the default). * forms: urlsafe_base64_encode returns string in Django 2.2. https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode * upload: Django's File.size property replaces _get_size(). https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/ * process_queue: Migrate to new autoreload API. * test_messages: Add an extra query caused by .refresh_from_db() losing the .select_related() on the Realm object. * session: Sync SessionHostDomainMiddleware with Django 2.2. There's a lot more we can do to take advantage of the new release; this is tracked in #11341. Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera are squashed into this commit. Fixes #10835.	2020-02-13 16:27:26 -08:00
Tim Abbott	1ea2f188ce	tornado: Rewrite Django integration to duplicate less code. Since essentially the first use of Tornado in Zulip, we've been maintaining our Tornado+Django system, AsyncDjangoHandler, with several hundred lines of Django code copied into it. The goal for that code was simple: We wanted a way to use our Django middleware (for code sharing reasons) inside a Tornado process (since we wanted to use Tornado for our async events system). As part of the Django 2.2.x upgrade, I looked at upgrading this implementation to be based off modern Django, and it's definitely possible to do that: * Continue forking load_middleware to save response middleware. * Continue manually running the Django response middleware. * Continue working out a hack involving copying all of _get_response to change a couple lines allowing our Tornado code to not actually return the Django HttpResponse so we can long-poll. The previous hack of returning None stopped being viable with the Django 2.2 MiddlewareMixin.__call__ implementation. But I decided to take this opportunity to look at trying to avoid copying material Django code, and there is a way to do it: * Replace RespondAsynchronously with a response.asynchronous attribute on the HttpResponse; this allows Django to run its normal plumbing happily in a way that should be stable over time, and then we proceed to discard the response inside the Tornado `get()` method to implement long-polling. (Better yet might be raising an exception?). This lets us eliminate maintaining a patched copy of _get_response. * Removing the @asynchronous decorator, which didn't add anything now that we only have one API endpoint backend (with two frontend call points) that could call into this. Combined with the last bullet, this lets us remove a significant hack from our never_cache_responses function. * Calling the normal Django `get_response` method from zulip_finish after creating a duplicate request to process, rather than writing totally custom code to do that. This lets us eliminate maintaining a patched copy of Django's load_middleware. * Adding detailed comments explaining how this is supposed to work, what problems we encounter, and how we solve various problems, which is critical to being able to modify this code in the future. A key advantage of these changes is that the exact same code should work on Django 1.11, Django 2.2, and Django 3.x, because we're no longer copying large blocks of core Django code and thus should be much less vulnerable to refactors. There may be a modest performance downside, in that we now run both request and response middleware twice when longpolling (once for the request we discard). We may be able to avoid the expensive part of it, Zulip's own request/response middleware, with a bit of additional custom code to save work for requests where we're planning to discard the response. Profiling will be important to understanding what's worth doing here.	2020-02-13 16:13:11 -08:00
Mateusz Mandera	335b804510	exceptions: RateLimited shouldn't inherit from PermissionDenied. We will want to raise RateLimited in authenticate() in rate limiting code - Django's authenticate() mechanism catches PermissionDenied, which we don't want for RateLimited. We want RateLimited to propagate to our code that called the authenticate() function.	2020-02-02 19:15:00 -08:00
Mateusz Mandera	a6a2d70320	rate_limiter: Handle multiple types of rate limiting in middleware. As more types of rate limiting of requests are added, one request may end up having various limits applied to it - and the middleware needs to be able to handle that. We implement that through a set_response_headers function, which sets the X-RateLimit-* headers in a sensible way based on all the limits that were applied to the request.	2020-02-02 19:15:00 -08:00
Wyatt Hoodes	b807c4273e	middleware: Fix exception typing. Mypy seems to have trouble understanding `Exception` inheritance here, so we create a `Union` for the only `Exception` we are looking for.	2019-07-31 12:23:20 -07:00
Anders Kaseorg	0bcae0be55	write_log_line: Fix logging of 4xx error data. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-25 14:42:52 -07:00
Wyatt Hoodes	5686821150	middleware: Change write_log_line to publish as a dict. We were seeing errors when publishing typical events in the form of `Dict[str, Any]`, as the expected type was a `Union`. So we instead change the only non-dictionary call to pass a dict instead of `str`.	2019-07-22 17:06:41 -07:00
Mateusz Mandera	f73600c82c	rate_limiter: Create a general rate_limit_request_by_entity function.	2019-05-30 16:50:11 -07:00
Anders Kaseorg	9efda71a4b	get_realm: raise DoesNotExist instead of returning None. This makes the implementation of `get_realm` consistent with its declared return type of `Realm` rather than `Optional[Realm]`. Fixes #12263. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-05-06 21:58:16 -07:00
Puneeth Chaganti	a653fcca93	html_to_text: Escape text when using as description.	2019-04-25 15:29:16 -07:00
Puneeth Chaganti	7d7134d45d	html_to_text: Extract code for html to plain text conversion.	2019-04-25 15:29:16 -07:00
Anders Kaseorg	21dc34cc52	open graph: HTML-escape og:description, twitter:description. The entire idea of doing this operation with unchecked string replacement in a middleware class is in my opinion extremely ill-conceived, but this fixes the most pressing problem with it generating invalid HTML. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-23 15:53:59 -07:00
Anders Kaseorg	643bd18b9f	lint: Fix code that evaded our lint checks for string % non-tuple. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-23 15:21:37 -07:00
Tim Abbott	983e24a7f5	auth: Use HTTP status 404 for invalid realms. Apparently, our invalid realm error page had HTTP status 200, which could be confusing and in particular broke our mobile app's error handling for this case.	2019-03-14 13:50:09 -07:00
Tim Abbott	de6f724bc5	middleware: Avoid doing work for statsd when not enabled. This saves about 8% of the runtime of our total response middleware, or equivalently close to 2% of the total Tornado response time. Which is pretty significant given that we're not sure anyone is using statsd in production. It's also useful outside Tornado, but the effect is particularly significant because of how important Tornado performance is.	2019-02-27 17:53:15 -08:00
Tim Abbott	c955b20131	middleware: Don't repeatedly regenerate open graph functions. This avoids parsing these functions on every request, which was adding roughly 350us to our per-request response times. The overall impact was more than 10% of basic Tornado response runtime.	2019-02-27 17:53:13 -08:00
Rishi Gupta	028874bab3	open graph: Remove extraneous spaces from descriptions. Our html collects extra spaces in a couple of places. The most prominent is paragraphs that look like the following in the .md file: * some text continued The html will have two spaces before "continued".	2019-02-11 12:05:19 -08:00
Rishi Gupta	d3125f59e1	open graph: Omit .code-section navigation from open graph.	2019-02-11 12:05:19 -08:00
Rishi Gupta	e1f02dc6f2	open graph: Include multiple paragraphs in description tags.	2019-02-11 12:05:19 -08:00
Anders Kaseorg	f0ecb93515	zerver core: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:41:24 -08:00
Wyatt Hoodes	8eac361fb5	docs: Refactor BS work with use of cache_with_key. Refactor the potentially expensive work done by Beautiful Soup into a function that is called by the alter_content function, so that we can cache the result. Saves a significant portion of the runtime of loading of all of our /help/ and /api/ documentation pages (e.g. 12ms for /api). Fixes #11088. Tweaked by tabbott to use the URL path as the cache key, clean up argument structure, and use a clearer name for the function.	2019-01-28 15:21:52 -08:00
Tim Abbott	9c3f38a564	docs: Automatically construct OpenAPI metadata for help center. This is somewhat hacky, in that in order to do what we're doing, we need to parse the HTML of the rendered page to extract the first paragraph to include in the open graph description field. But BeautifulSoup does a good job of it. This carries a nontrivial performance penalty for loading these pages, but overall /help/ is a low-traffic site compared to the main app, so it doesn't matter much. (As a sidenote, it wouldn't be a bad idea to cache this stuff). There's lots of things we can improve in this, largely through editing the articles, but we can deal with that over time. Thanks to Rishi for writing all the tests.	2018-12-19 10:18:20 -08:00
Tim Abbott	ae6fc0a471	sessions: Resync session middleware from Django upstream. Until we resolve https://github.com/zulip/zulip/issues/10832, we will need to maintain our own forked copy of Django's SessionMiddleware. We apparently let this get out of date. This fixes a few subtle bugs involving the user logout experience that were throwing occasional exceptions (e.g. the UpdateError fix you can see).	2018-11-14 15:16:12 -08:00
Tim Abbott	10ac671cd4	middleware: Fix logging of query counts in websockets requests. Apparently, we weren't resetting the query counters inside the websockets codebase, resulting in broken log results like this: SOCKET 403 2ms (db: 1ms/2q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 5ms (db: 2ms/3q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 3ms/4q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 3ms/5q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 4ms/6q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 5ms/7q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 5ms/8q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 3ms (db: 6ms/9q) /socket/auth [transport=websocket] (unknown via ?) The correct fix for this is to call reset_queries at the start of each endpoint within the websockets system. As it turns out, we're already calling record_request_start_data there, and in fact should be calling `reset_queries` in all code paths that use that function (the other code paths, in zerver/middleware.py, do it manually with connection.connection.queries = []). So we can clean up the code in a way that reduces risk for similar future issues and fix this logging bug with this simple refactor.	2018-10-31 16:22:17 -07:00
Tim Abbott	e4813e462b	tornado: Rename async_request_{restart,stop} to mention timer. Previously, these timer accounting functions could be easily mistaken for referring to starting/stopping the request. By adding timer to the name, we make the code easier for the casual observer to read and understand.	2018-10-16 15:39:10 -07:00
Vishnu Ks	d2e4417a72	urls: Separate endpoint for signup and new realm email confirm. This is preparation for the next commit.	2018-08-26 22:53:57 -07:00
Aditya Bansal	993d50f5ab	zerver: Change use of typing.Text to str.	2018-05-12 15:22:39 -07:00
neiljp (Neil Pilgrim)	2ed6da77c7	mypy: Rewrite some middleware annotations to use ViewFuncT.	2018-03-17 23:25:05 +00:00
Greg Price	53c57cf002	errors: Include request info on error mails for JSON routes too. When our code raises an exception and Django converts it to a 500 response (in django.core.handlers.exception.handle_uncaught_exception), it attaches the request to the log record, and we use this in our AdminNotifyHandler to include data like the user and the URL path in the error email sent to admins. On this line, when our code raises an exception but we've decided (in `TagRequests`) to format any errors as JSON errors, we suppress the exception so we have to generate the log record ourselves. Attach the request here, just like Django does when we let it do the job. This still isn't an awesome solution, in that there are lots of other places where we call `logging.error` or `logging.exception` while inside a request; this just covers one of them. This is one of the most common, though, so it's a start.	2018-03-01 15:12:32 -08:00
Callum Fraser	aa9567ce37	mypy: Use Python 3 type syntax in zerver/middleware.py.	2017-12-11 18:43:24 -08:00
rht	a1cc720860	zerver: Use Python 3 syntax for typing. Tweaked by tabbott to fix some minor whitespace errors.	2017-11-28 16:49:36 -08:00
Greg Price	b6cc21b438	debug: Add facility to dump tracemalloc snapshots. Originally this used signals, namely SIGRTMIN. But in prod, the signal handler never fired; I debugged fruitlessly for a while, and suspect uwsgi was foiling it in a mysterious way (which is kind of the only way uwsgi does anything.) So, we listen on a socket. Bit more code, and a bit trickier to invoke, but it works. This was developed for the investigation of memory-bloating on chat.zulip.org that led to `a331b4f64` "Optimize query_all_subs_by_stream()". For usage instructions, see docstring.	2017-11-28 15:52:07 -08:00
Robert Hönig	0e0a8a2b14	queue processor tests: Call consume by default. This significantly improves the API for queue_json_publish to not be overly focused on what the behavior of this function should be in our unit tests.	2017-11-26 11:45:34 -08:00
Tim Abbott	10ab9410c9	python: Sort imports in easy files in zerver/.	2017-11-15 15:50:28 -08:00
derAnfaenger	3ac09b3e9b	queue processors: Add coverage for SlowQueryWorker.	2017-11-09 15:20:40 -08:00
rht	5ee40bf718	Remove usage of six.moves.binary_type.	2017-11-09 10:00:00 -08:00
Felix Yan	aea33fc738	Fix a comment typo in zerver/middleware.py.	2017-10-30 10:36:35 -07:00
derAnfaenger	1792dcbd09	tests: Call real consume method of queue processors. This switches to more real tests for a first batch of queue_json_publish() calls that don't cause trouble when used with call_consume_tests=True.	2017-10-26 14:58:03 -07:00
Greg Price	093bae4bc5	subdomains: Fix some implicit uses of "" for the root subdomain. These are just instances that jumped out at me while working on the subdomains code, mostly while grepping for get_subdomain call sites. I haven't attempted a comprehensive search, and there are likely still others left.	2017-10-26 10:29:17 -07:00
Tim Abbott	1ab2ca5986	subdomains: Extract zerver.lib.subdomains library. These never really belonged with the rest of zerver.lib.utils.py, and having a separate library makes it easier to enforce full test coverage.	2017-10-18 22:27:48 -07:00
Alena Volkova	5515a075ec	urls: Move the report endpoints to be API-style routes.	2017-10-17 22:05:56 -07:00
Tim Abbott	46485322eb	middleware: Remove logic for redirecting to zulipdev.com domains. We originally wrote this because when testing subdomains, you wanted to be sure you were actually testing subdomains. Now that subdomains is the default, there doesn't seem to actually be a good reason why we should need this.	2017-10-05 23:21:02 -07:00
Tim Abbott	40c59f2878	middleware: Fix losing sub-URL when pushing to zulipdev.com. Previously, this would always send one to homepage, making visiting the /help/ documentation in the development environment using the localhost URL unpleasant. While this fixes the proximal bug, it's not clear to me that we need this redirect logic at all, so I'm going to try removing it soon.	2017-10-05 16:36:34 -07:00
Tim Abbott	1d72629dc4	subdomains: Hardcode REALMS_HAVE_SUBDOMAINS=True.	2017-10-02 16:42:43 -07:00
rht	2949d1c1e8	zerver: Remove the rest of absolute_import.	2017-09-27 10:02:39 -07:00
neiljp (Neil Pilgrim)	bb83742906	mypy: Correct 2 type annotations in zerver/middleware.py.	2017-08-15 17:50:18 -07:00
neiljp (Neil Pilgrim)	e772e89fe2	mypy: Add None return path for RateLimitMiddleware.process_exception().	2017-08-03 11:03:14 -07:00
Umair Khan	9e33917d25	rate_limiter: Upgrade max_api_calls to generic API.	2017-08-02 18:01:39 -07:00

137 Commits