zulip

Commit Graph

Author	SHA1	Message	Date
rht	bb8504d925	lint: Fix typos found by codespell.	2021-10-19 16:51:13 -07:00
Mateusz Mandera	73a6f2a1a7	auth: Add support for using SCIM for account management.	2021-10-14 12:29:10 -07:00
Mateusz Mandera	8b906b5d2f	request_notes: Set the realm appropriately for the root subdomain. Requests to the root subdomain weren't getting request_notes.realm set even if a realm exists on the root subdomain - which is actually a common scenario, because simply having one organization, on the root subdomain, is the simplest and most common setup for self-hosted deployments.	2021-09-28 10:02:52 -07:00
Mateusz Mandera	fb3864ea3c	auth: Change the look of SOCIAL_AUTH_SUBDOMAIN when directly opened. SOCIAL_AUTH_SUBDOMAIN was potentially very confusing when opened by a user, as it had various Login/Signup buttons as if there was a realm on it. Instead, we want to display a more informative page to the user telling them they shouldn't even be there. If possible, we just redirect them to the realm they most likely came from. To make this possible, we have to exclude the subdomain from ROOT_SUBDOMAIN_ALIASES - so that we can give it special behavior.	2021-09-10 10:47:15 -07:00
PIG208	53888e5a26	request: Refactor ZulipRequestNotes to RequestNotes. This utilizes the generic `BaseNotes` we added for multipurpose patching. With this migration as an example, we can further support more types of notes to replace the monkey-patching approach we have used throughout the codebase for type safety.	2021-09-03 08:48:45 -07:00
PIG208	fa09404dd7	typing: Use assertions for responses when appropriate. This is part of #18777.	2021-08-20 06:02:56 -07:00
PIG208	f9644c8cf3	typing: Fix function signatures with django-stubs.	2021-08-20 06:02:55 -07:00
Anders Kaseorg	ad5f0c05b5	python: Remove default "utf8" argument for encode(), decode(). Partially generated by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-08-02 15:53:52 -07:00
PIG208	8121d2d58d	typing: Fix misuse of HttpResponse. Amend usage of HttpResponse when appropriate.	2021-07-27 14:31:19 +08:00
Tim Abbott	01ce58319d	mypy: Fix most AnonymousUser type errors. This commit fixes several mypy errors with Django stubs, by telling mypy that we know in a given code path that the user is authenticated.	2021-07-24 14:55:46 -07:00
Anders Kaseorg	7c32134fb5	Revert "Revert "request: Refactor to record rate limit data using ZulipRequestNotes."" This reverts commit `49eab4efef`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-07-19 16:48:23 -07:00
PIG208	49eab4efef	Revert "request: Refactor to record rate limit data using ZulipRequestNotes." This reverts commit `3f9a5e1e17`.	2021-07-16 09:01:20 -07:00
PIG208	c03b9c95ad	request: Store client information using ZulipRequestNotes. This concludes the HttpRequest migration, which eliminates arbitrary attributes (except private ones that belong to Django) attached to the request object at runtime and moves them to a separate data structure dedicated to adding information (so-called notes) to an HttpRequest.	2021-07-14 12:01:07 -07:00
PIG208	8eb2c3ffdb	request: Move realm from the request to ZulipRequestNotes.	2021-07-14 12:01:07 -07:00
PIG208	742c17399e	request: Move miscellaneous attributes to ZulipRequestNotes. This migrates the fields that require only trivial changes to be stored with ZulipRequestNotes. Specifically, these are _requestor_for_logs, _set_language, _query, error_format, placeholder_open_graph_description, and saved_response, all of which were previously set on the HttpRequest object at some point. This migration allows them to be typed.	2021-07-14 12:01:07 -07:00
PIG208	5475334b16	request: Refactor to store requestor_for_logs in ZulipRequestNotes.	2021-07-14 12:01:07 -07:00
PIG208	3f9a5e1e17	request: Refactor to record rate limit data using ZulipRequestNotes. We will no longer use the HttpRequest to store the rate limit data. Using ZulipRequestNotes, we can access rate_limit and ratelimits_applied with type hints support. We also save the process of initializing ratelimits_applied by giving it a default value.	2021-07-14 12:01:07 -07:00
PIG208	da6e5ddcae	request: Move log_data from HttpRequest to ZulipRequestNotes.	2021-07-14 12:01:05 -07:00
PIG208	8b9011dff8	json_error: Completely remove json_error. This completes the migration from `return json_error` to `raise JsonableError`.	2021-07-06 15:34:33 -07:00
Anders Kaseorg	544bbd5398	docs: Fix capitalization mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-10 09:57:26 -07:00
Tim Abbott	615ad2d5d8	middleware: Simplify logic for parsing user-agent. This avoids calling parse_user_agent twice when dealing with official Zulip clients, and also makes the logical flow hopefully easier to read. We move get_client_name out of decorator.py, since it no longer belongs there, and give it a nicer name.	2021-04-29 17:47:41 -07:00
orientor	fe260fb892	middleware: Show client version in logging if available. Fixes #14067.	2021-04-29 17:07:37 -07:00
orientor	ac203cd9f1	middleware: Add client_version attribute to request.	2021-04-29 17:03:40 -07:00
orientor	6224d83dea	middleware: Get client name in LogRequests instead of process_client. This ensures it is present for all requests; while that was already essentially true via process_client being called from every standard decorator, this allows middleware and other code to rely on this having been set.	2021-04-29 17:03:05 -07:00
Anders Kaseorg	e7ed907cf6	python: Convert deprecated Django ugettext alias to gettext. django.utils.translation.ugettext is a deprecated alias of django.utils.translation.gettext as of Django 3.0, and will be removed in Django 4.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-15 18:01:34 -07:00
Alex Vandiver	07779ea879	middleware: Do not trust X-Forwarded-For; use X-Real-Ip, set from nginx. The `X-Forwarded-For` header is a list of proxies' IP addresses; each proxy appends the remote address of the host it received its request from to the list, as it passes the request down. A naïve parsing, as SetRemoteAddrFromForwardedFor did, would thus interpret the first address in the list as the client's IP. However, clients can pass in arbitrary `X-Forwarded-For` headers, which would allow them to spoof their IP address. `nginx`'s behavior is to treat the addresses as untrusted unless they match an allowlist of known proxies. By setting `real_ip_recursive on`, it also allows this behavior to be applied repeatedly, moving from right to left down the `X-Forwarded-For` list, stopping at the right-most that is untrusted. Rather than re-implement this logic in Django, pass the first untrusted value that `nginx` computes down into Django via the `X-Real-Ip` header. This allows consistent IP addresses in logs between `nginx` and Django. Proxied calls into Tornado (which don't use UWSGI) already passed this header, as Tornado logging respects it.	2021-03-31 14:19:38 -07:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	9773c0f1a8	python: Fix string literal concatenation mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 08:02:51 -05:00
Mateusz Mandera	f76202dd59	django3: Save language preference in a cookie rather than the session. Support for saving it in the session is dropped in django3, the cookie is the mechanism that needs to be used. The relevant i18n code doesn't have access to the response objects and thus needs to delegate setting the cookie to LocaleMiddleware. Fixes the LocaleMiddleware point of #16030.	2021-01-17 10:38:58 -08:00
Mateusz Mandera	43a0c60e96	exceptions: Make RateLimited into a subclass of JsonableError. This simplifies the code, as it allows using the mechanism of converting JsonableErrors into a response instead of having separate, but ultimately similar, logic in RateLimitMiddleware. We don't touch tests here because "rate limited" error responses are already verified in test_external.py.	2020-12-01 13:40:56 -08:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Alex Vandiver	536bd3188e	middleware: Move locale-setting before domain checking. Calling `render()` in a middleware before LocaleMiddleware has run will pick up the most-recently-set locale. This may be from the _previous_ request, since the current language is thread-local. This results in the "Organization does not exist" page occasionally being in a language other than English, depending on the preferences of the request which that thread just finished serving. Move HostDomainMiddleware below LocaleMiddleware; none of the earlier middlewares call `render()`, so they are safe. This will also allow the "Organization does not exist" page to be localized based on the user's browser preferences. Unfortunately, it also means that the default LocaleMiddleware catches the 404 from the HostDomainMiddleware and helpfully tries to check if the failure is because the URL lacks a language component (e.g. `/en/`) by turning it into a 302 to that new URL. We must subclass the default LocaleMiddleware to remove this unwanted functionality. Doing so exposes two places in tests that relied (directly or indirectly) upon the redirection: '/confirmation_key' was redirected to '/en/confirmation_key', since the non-i18n version did not exist; and requests to `/stats/realm/not_existing_realm/` incorrectly were expecting a 302, not a 404. This regression likely came in during `f00ff1ef62`, since prior to that, the HostDomainMiddleware ran _after_ the rest of the request had completed.	2020-09-14 22:16:09 -07:00
Alex Vandiver	6323218a0e	request: Maintain a thread-local of the current request. This allows logging (to Sentry, or disk) to be annotated with richer data about the request.	2020-09-11 16:43:29 -07:00
Anders Kaseorg	f91d287447	python: Pre-fix a few spots for better Black formatting. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 17:51:09 -07:00
Anders Kaseorg	a276eefcfe	python: Rewrite dict() as {}. Suggested by the flake8-comprehensions plugin. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:15:41 -07:00
Anders Kaseorg	ab120a03bc	python: Replace unnecessary intermediate lists with generators. Mostly suggested by the flake8-comprehensions plugin. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:15:41 -07:00
Aman	fd5423a8f9	exceptions: Extract json_unauths into MissingAuthenticationError. We raise two types of json_unauthorized when MissingAuthenticationError is raised. Raising the one with www_authenticate lets the client know that the user needs to be logged in to access the requested content. Sending the `www_authenticate='session'` header with the response also stops modern web browsers from showing a login form to the user and lets the client handle it completely. Structurally, this moves the handling of common authentication errors to a single shared middleware exception handler.	2020-08-30 14:51:50 -07:00
Tim Abbott	1fddf16b73	Revert "exceptions: Extract json_unauths into MissingAuthenticationError." This reverts commit `c355f6b8d8`.	2020-08-25 17:42:07 -07:00
Aman	c355f6b8d8	exceptions: Extract json_unauths into MissingAuthenticationError. We raise two types of json_unauthorized when MissingAuthenticationError is raised. Raising the one with www_authenticate lets the client know that the user needs to be logged in to access the requested content. Sending the `www_authenticate='session'` header with the response also stops modern web browsers from showing a login form to the user and lets the client handle it completely. Structurally, this moves the handling of common authentication errors to a single shared middleware exception handler.	2020-08-25 16:52:21 -07:00
Alex Vandiver	596cf2580b	sentry: Ignore all SuspiciousOperation loggers. django.security.DisallowedHost is only one of a set of exceptions that are "SuspiciousOperation" exceptions; all return a 400 to the user when they bubble up[1]; all of them are uninteresting to Sentry. While they may, in bulk, show a mis-configuration of some sort of the application, such a failure should be detected via the increase in 400's, not via these, which are uninteresting individually. While all of these are subclasses of SuspiciousOperation, we enumerate them explicitly for a number of reasons: - There is no one logger we can ignore that captures all of them. Each of the errors uses its own logger, and django does not supply a `django.security` logger that all of them feed into. - Nor can we catch this by examining the exception object. The SuspiciousOperation exception is raised too early in the stack for us to catch the exception by way of middleware and check `isinstance`. But at the Sentry level, in `add_context`, it is no longer an exception but a log entry, and as such we have no `isinstance` that can be applied; we only know the logger name. - Finally, there is the semantic argument that while we have decided to ignore this set of security warnings, we _may_ wish to log new ones that may be added at some point in the future. It is better to opt into those ignores than to blanket ignore all messages from the security logger. This moves the DisallowedHost `ignore_logger` to be adjacent to its kin, and not on the middleware that may trigger it. Consistency is more important than locality in this case. Of these, the DisallowedHost logger is left as the only one that is explicitly ignored in the LOGGING configuration in `computed_settings.py`; it is by far the most frequent, and the least likely to be malicious or impactful (unlike, say, RequestDataTooBig). [1] https://docs.djangoproject.com/en/3.0/ref/exceptions/#suspiciousoperation	2020-08-12 16:08:38 -07:00
Alex Vandiver	28c627452f	sentry: Ignore DisallowedHost messages. This is a misconfiguration of the client, not the server.	2020-08-11 10:38:14 -07:00
Alex Vandiver	f00ff1ef62	middleware: Make HostDomain into a process_request, not process_response. It is more suited for `process_request`, since it should stop execution of the request if the domain is invalid. This code was likely added as a process_response (in `ea39fb2556`) because there was already a process_response at the time (added `7e786d5426`, and no longer necessary since `dce6b4a40f`). It quiets an unnecessary warning when logging in at a non-existent realm. This stops performing unnecessary work when we are going to throw it away and return a 404. The edge case to this is if the request _creates_ a realm, and is made using the URL of the new realm; this change would prevent the request before it occurs. While this does arise in tests, the tests do not reflect reality -- real requests to /accounts/register/ are made via POST to the same (default) realm, redirected there from `confirm-preregistrationuser`. The tests are adjusted to reflect real behavior. Tweaked by tabbott to add a block comment in HostDomainMiddleware.	2020-08-11 10:37:55 -07:00
Alex Vandiver	9266315a1f	middleware: Stop shadowing top-level logger definition on line 33.	2020-07-27 16:46:13 -07:00
Alex Vandiver	1b2d0271af	sentry: Prevent double-logging of JSON-formatted errors. Capture and report the initial exception, not the formatted text-only message traceback.	2020-07-27 11:07:55 -07:00
Mohit Gupta	44d68c1840	refactor: Rename bugdown words to markdown in stats related functions. This commit is part of a series of commits aimed at renaming bugdown to markdown.	2020-06-26 17:20:40 -07:00
Mohit Gupta	3f5fc13491	refactor: Rename zerver.lib.bugdown to zerver.lib.markdown. This commit is the first of a few commits which aim to change all the bugdown references to markdown. This commit renames the files and file path mentions and changes the imports. Variables and other references to bugdown will be renamed in subsequent commits.	2020-06-26 17:08:37 -07:00
Anders Kaseorg	5dc9b55c43	python: Manually convert more percent-formatting to f-strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \\|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
Mateusz Mandera	dd40649e04	queue_processors: Remove the slow_queries queue. While this functionality to post slow queries to a Zulip stream was very useful in the early days of Zulip, when there were only a few hundred accounts, it has long since stopped being useful, because of (1) the total request volume on larger Zulip servers run by Zulip developers, and (2) the fact that other server operators don't want real-time notifications of slow backend queries. The right structure for this is just a log file. We get rid of the queue and replace it with a "zulip.slow_queries" logger, which will still log to /var/log/zulip/slow_queries.log for ease of access to this information and propagate to the other logging handlers. Reducing the number of queues is good for Zulip's memory footprint and restart performance, since we run at least one dedicated queue worker process for each one in most configurations.	2020-05-11 00:45:13 -07:00
Tim Abbott	a702894e0e	middleware: Stop using X_REAL_IP. The comment was wrong, in that REMOTE_ADDR is where the real external IP was; X_REAL_IP was the loadbalancer's IP.	2020-05-08 11:40:54 -07:00
Anders Kaseorg	bdc365d0fe	logging: Pass format arguments to logging. https://docs.python.org/3/howto/logging.html#optimization Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-02 10:18:02 -07:00
Anders Kaseorg	fead14951c	python: Convert assignment type annotations to Python 3.6 style. This commit was split by tabbott; this piece covers the vast majority of files in Zulip, but excludes scripts/, tools/, and puppet/ to help ensure we at least show the right error messages for Xenial systems. We can likely further refine the remaining pieces with some testing. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: - invoiced_through: Optional[LicenseLedger] = models.ForeignKey( + invoiced_through: Optional["LicenseLedger"] = models.ForeignKey( -_apns_client: Optional[APNsClient] = None +_apns_client: Optional["APNsClient"] = None - notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) + author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) - bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) + bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) - default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) - default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) -descriptors_by_handler_id: Dict[int, ClientDescriptor] = {} +descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {} -worker_classes: Dict[str, Type[QueueProcessingWorker]] = {} -queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {} +worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {} +queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {} -AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None +AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-22 11:02:32 -07:00
Anders Kaseorg	1cf63eb5bf	python: Whitespace fixes from autopep8. Generated by autopep8, with the setup.cfg configuration from #14532. I’m not sure why pycodestyle didn’t already flag these. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-21 17:58:09 -07:00
Anders Kaseorg	dce6b4a40f	middleware: Remove unused cookie_domain setting. Since commit `1d72629dc4`, we have been maintaining a patched copy of Django’s SessionMiddleware.process_response in order to unconditionally ignore our own optional cookie_domain setting that we don’t set. Instead, let’s not do that. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-12 11:55:55 -07:00
Anders Kaseorg	c734bbd95d	python: Modernize legacy Python 2 syntax with pyupgrade. Generated by `pyupgrade --py3-plus --keep-percent-format` on all our Python code except `zthumbor` and `zulip-ec2-configure-interfaces`, followed by manual indentation fixes. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-09 16:43:22 -07:00
Mateusz Mandera	0155193140	rate_limiter: Change type of the RateLimitResult.remaining to int. This is cleaner than it being Optional[int], as the value of None for this object has been synonymous with 0.	2020-04-08 10:29:18 -07:00
Mateusz Mandera	e86cfbdbd7	rate_limiter: Store data in request._ratelimits_applied list. The information used to be stored in a request._ratelimit dict, but there's no need for that, and a list is a simpler structure, so this allows us to simplify the plumbing somewhat.	2020-04-08 10:29:18 -07:00
Mateusz Mandera	9911c6a0f0	rate_limiter: Put secs_to_freedom as message when raising RateLimited. That's the value that matters to the code that catches the exception, and this change allows simplifying the plumbing somewhat, and gets rid of the get_rate_limit_result_from_request function.	2020-04-08 10:29:18 -07:00
Mateusz Mandera	eb0216c5a8	middleware: Log <user.id>@subdomain instead of subdomain/<user.id>. It was decided that the new format is preferable.	2020-03-24 10:25:01 -07:00
Mateusz Mandera	85df6201f6	rate_limit: Move functions called by external code to RateLimitedObject.	2020-03-22 18:42:35 -07:00
Mateusz Mandera	2b51b3c6c5	middleware: Also log request subdomain when logging "unauth" request. This returns us to a consistent logging format regardless of whether the request is authenticated. We also update some log examples in docs to be consistent with the new style.	2020-03-22 18:32:04 -07:00
Mateusz Mandera	89394fc1eb	middleware: Use request.user for logging when possible. Instead of trying to set the _requestor_for_logs attribute in all the relevant places, we try to use request.user when possible (that will be when it's a UserProfile or RemoteZulipServer as of now). In other places, we set _requestor_for_logs to avoid manually editing the request.user attribute, as it should mostly be left for Django to manage it. In places where we remove the "request._requestor_for_logs = ..." line, it is clearly implied by the previous code (or the current surrounding code) that request.user is of the correct type.	2020-03-09 13:54:58 -07:00
Mateusz Mandera	0255ca9b6a	middleware: Log user.id/realm.string_id instead of _email.	2020-03-09 13:54:58 -07:00
Tim Abbott	229090a3a5	middleware: Avoid running APPEND_SLASH logic in Tornado. Profiling suggests this saves about 600us in the runtime of every GET /events request attempting to resolve URLs to determine whether we need to do the APPEND_SLASH behavior. It's possible that we end up doing the same URL resolution work later and we're just moving around some runtime, but I think even if we do, Django probably doesn't do any fancy caching that would mean doing this query twice doesn't just do twice the work. In any case, we probably want to extend this behavior to our whole API because the APPEND_SLASH redirect behavior is essentially a bug there. That is a more involved refactor, however.	2020-02-14 16:15:57 -08:00
rht	41e3db81be	dependencies: Upgrade to Django 2.2.10. Django 2.2.x is the next LTS release after Django 1.11.x; I expect we'll be on it for a while, as Django 3.x won't have an LTS release series out for a while. Because of upstream API changes in Django, this commit includes several changes beyond requirements and: * urls: django.urls.resolvers.RegexURLPattern has been replaced by django.urls.resolvers.URLPattern; affects OpenAPI code and related features which re-parse Django's internals. https://code.djangoproject.com/ticket/28593 * test_runner: Change number to suffix. Django changed the name in this ticket: https://code.djangoproject.com/ticket/28578 * Delete now-unnecessary SameSite cookie code (it's now the default). * forms: urlsafe_base64_encode returns string in Django 2.2. https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode * upload: Django's File.size property replaces _get_size(). https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/ * process_queue: Migrate to new autoreload API. * test_messages: Add an extra query caused by .refresh_from_db() losing the .select_related() on the Realm object. * session: Sync SessionHostDomainMiddleware with Django 2.2. There's a lot more we can do to take advantage of the new release; this is tracked in #11341. Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera are squashed into this commit. Fixes #10835.	2020-02-13 16:27:26 -08:00
Tim Abbott	1ea2f188ce	tornado: Rewrite Django integration to duplicate less code. Since essentially the first use of Tornado in Zulip, we've been maintaining our Tornado+Django system, AsyncDjangoHandler, with several hundred lines of Django code copied into it. The goal for that code was simple: We wanted a way to use our Django middleware (for code sharing reasons) inside a Tornado process (since we wanted to use Tornado for our async events system). As part of the Django 2.2.x upgrade, I looked at upgrading this implementation to be based off modern Django, and it's definitely possible to do that: * Continue forking load_middleware to save response middleware. * Continue manually running the Django response middleware. * Continue working out a hack involving copying all of _get_response to change a couple lines allowing our Tornado code to not actually return the Django HttpResponse so we can long-poll. The previous hack of returning None stopped being viable with the Django 2.2 MiddlewareMixin.__call__ implementation. But I decided to take this opportunity to look at trying to avoid copying material Django code, and there is a way to do it: * Replace RespondAsynchronously with a response.asynchronous attribute on the HttpResponse; this allows Django to run its normal plumbing happily in a way that should be stable over time, and then we proceed to discard the response inside the Tornado `get()` method to implement long-polling. (Better yet might be raising an exception?). This lets us eliminate maintaining a patched copy of _get_response. * Removing the @asynchronous decorator, which didn't add anything now that we only have one API endpoint backend (with two frontend call points) that could call into this. Combined with the last bullet, this lets us remove a significant hack from our never_cache_responses function. * Calling the normal Django `get_response` method from zulip_finish after creating a duplicate request to process, rather than writing totally custom code to do that. This lets us eliminate maintaining a patched copy of Django's load_middleware. * Adding detailed comments explaining how this is supposed to work, what problems we encounter, and how we solve various problems, which is critical to being able to modify this code in the future. A key advantage of these changes is that the exact same code should work on Django 1.11, Django 2.2, and Django 3.x, because we're no longer copying large blocks of core Django code and thus should be much less vulnerable to refactors. There may be a modest performance downside, in that we now run both request and response middleware twice when longpolling (once for the request we discard). We may be able to avoid the expensive part of it, Zulip's own request/response middleware, with a bit of additional custom code to save work for requests where we're planning to discard the response. Profiling will be important to understanding what's worth doing here.	2020-02-13 16:13:11 -08:00
Mateusz Mandera	335b804510	exceptions: RateLimited shouldn't inherit from PermissionDenied. We will want to raise RateLimited in authenticate() in rate limiting code - Django's authenticate() mechanism catches PermissionDenied, which we don't want for RateLimited. We want RateLimited to propagate to our code that called the authenticate() function.	2020-02-02 19:15:00 -08:00
Mateusz Mandera	a6a2d70320	rate_limiter: Handle multiple types of rate limiting in middleware. As more types of rate limiting of requests are added, one request may end up having various limits applied to it - and the middleware needs to be able to handle that. We implement that through a set_response_headers function, which sets the X-RateLimit-* headers in a sensible way based on all the limits that were applied to the request.	2020-02-02 19:15:00 -08:00
Wyatt Hoodes	b807c4273e	middleware: Fix exception typing. Mypy seems to have trouble understanding `Exception` inheritance here, so we create a `Union` for the only `Exception` we are looking for.	2019-07-31 12:23:20 -07:00
Anders Kaseorg	0bcae0be55	write_log_line: Fix logging of 4xx error data. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-25 14:42:52 -07:00
Wyatt Hoodes	5686821150	middleware: Change write_log_line to publish as a dict. We were seeing errors when publishing typical events in the form of `Dict[str, Any]`, as the expected type was a `Union`. So we instead change the only non-dictionary call to pass a dict instead of `str`.	2019-07-22 17:06:41 -07:00
Mateusz Mandera	f73600c82c	rate_limiter: Create a general rate_limit_request_by_entity function.	2019-05-30 16:50:11 -07:00
Anders Kaseorg	9efda71a4b	get_realm: raise DoesNotExist instead of returning None. This makes the implementation of `get_realm` consistent with its declared return type of `Realm` rather than `Optional[Realm]`. Fixes #12263. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-05-06 21:58:16 -07:00
Puneeth Chaganti	a653fcca93	html_to_text: Escape text when using as description.	2019-04-25 15:29:16 -07:00
Puneeth Chaganti	7d7134d45d	html_to_text: Extract code for html to plain text conversion.	2019-04-25 15:29:16 -07:00
Anders Kaseorg	21dc34cc52	open graph: HTML-escape og:description, twitter:description. The entire idea of doing this operation with unchecked string replacement in a middleware class is in my opinion extremely ill-conceived, but this fixes the most pressing problem with it generating invalid HTML. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-23 15:53:59 -07:00
Anders Kaseorg	643bd18b9f	lint: Fix code that evaded our lint checks for string % non-tuple. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-23 15:21:37 -07:00
Tim Abbott	983e24a7f5	auth: Use HTTP status 404 for invalid realms. Apparently, our invalid realm error page had HTTP status 200, which could be confusing and in particular broke our mobile app's error handling for this case.	2019-03-14 13:50:09 -07:00
Tim Abbott	de6f724bc5	middleware: Avoid doing work for statsd when not enabled. This saves about 8% of the runtime of our total response middleware, or equivalently close to 2% of the total Tornado response time. Which is pretty significant given that we're not sure anyone is using statsd in production. It's also useful outside Tornado, but the effect is particularly significant because of how important Tornado performance is.	2019-02-27 17:53:15 -08:00
Tim Abbott	c955b20131	middleware: Don't repeatedly regenerate open graph functions. This avoids parsing these functions on every request, which was adding roughly 350us to our per-request response times. The overall impact was more than 10% of basic Tornado response runtime.	2019-02-27 17:53:13 -08:00
Rishi Gupta	028874bab3	open graph: Remove extraneous spaces from descriptions. Our html collects extra spaces in a couple of places. The most prominent is paragraphs that look like the following in the .md file: * some text continued The html will have two spaces before "continued".	2019-02-11 12:05:19 -08:00
Rishi Gupta	d3125f59e1	open graph: Omit .code-section navigation from open graph.	2019-02-11 12:05:19 -08:00
Rishi Gupta	e1f02dc6f2	open graph: Include multiple paragraphs in description tags.	2019-02-11 12:05:19 -08:00
Anders Kaseorg	f0ecb93515	zerver core: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:41:24 -08:00
Wyatt Hoodes	8eac361fb5	docs: Refactor BS work with use of cache_with_key. Refactor the potentially expensive work done by Beautiful Soup into a function that is called by the alter_content function, so that we can cache the result. Saves a significant portion of the runtime of loading of all of our /help/ and /api/ documentation pages (e.g. 12ms for /api). Fixes #11088. Tweaked by tabbott to use the URL path as the cache key, clean up argument structure, and use a clearer name for the function.	2019-01-28 15:21:52 -08:00
Tim Abbott	9c3f38a564	docs: Automatically construct open graph metadata for help center. This is somewhat hacky, in that in order to do what we're doing, we need to parse the HTML of the rendered page to extract the first paragraph to include in the open graph description field. But BeautifulSoup does a good job of it. This carries a nontrivial performance penalty for loading these pages, but overall /help/ is a low-traffic site compared to the main app, so it doesn't matter much. (As a sidenote, it wouldn't be a bad idea to cache this stuff). There's lots of things we can improve in this, largely through editing the articles, but we can deal with that over time. Thanks to Rishi for writing all the tests.	2018-12-19 10:18:20 -08:00
Tim Abbott	ae6fc0a471	sessions: Resync session middleware from Django upstream. Until we resolve https://github.com/zulip/zulip/issues/10832, we will need to maintain our own forked copy of Django's SessionMiddleware. We apparently let this get out of date. This fixes a few subtle bugs involving the user logout experience that were throwing occasional exceptions (e.g. the UpdateError fix you can see).	2018-11-14 15:16:12 -08:00
Tim Abbott	10ac671cd4	middleware: Fix logging of query counts in websockets requests. Apparently, we weren't resetting the query counters inside the websockets codebase, resulting in broken log results like this: SOCKET 403 2ms (db: 1ms/2q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 5ms (db: 2ms/3q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 3ms/4q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 3ms/5q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 4ms/6q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 5ms/7q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 2ms (db: 5ms/8q) /socket/auth [transport=websocket] (unknown via ?) SOCKET 403 3ms (db: 6ms/9q) /socket/auth [transport=websocket] (unknown via ?) The correct fix for this is to call reset_queries at the start of each endpoint within the websockets system. As it turns out, we're already calling record_request_start_data there, and in fact should be calling `reset_queries` in all code paths that use that function (the other code paths, in zerver/middleware.py, do it manually with connection.connection.queries = []). So we can clean up the code in a way that reduces risk for similar future issues and fix this logging bug with this simple refactor.	2018-10-31 16:22:17 -07:00
Tim Abbott	e4813e462b	tornado: Rename async_request_{restart,stop} to mention timer. Previously, these timer accounting functions could be easily mistaken for referring to starting/stopping the request. By adding timer to the name, we make the code easier for the casual observer to read and understand.	2018-10-16 15:39:10 -07:00
Vishnu Ks	d2e4417a72	urls: Separate endpoint for signup and new realm email confirm. This is preparation for the next commit.	2018-08-26 22:53:57 -07:00
Aditya Bansal	993d50f5ab	zerver: Change use of typing.Text to str.	2018-05-12 15:22:39 -07:00
neiljp (Neil Pilgrim)	2ed6da77c7	mypy: Rewrite some middleware annotations to use ViewFuncT.	2018-03-17 23:25:05 +00:00
Greg Price	53c57cf002	errors: Include request info on error mails for JSON routes too. When our code raises an exception and Django converts it to a 500 response (in django.core.handlers.exception.handle_uncaught_exception), it attaches the request to the log record, and we use this in our AdminNotifyHandler to include data like the user and the URL path in the error email sent to admins. On this line, when our code raises an exception but we've decided (in `TagRequests`) to format any errors as JSON errors, we suppress the exception so we have to generate the log record ourselves. Attach the request here, just like Django does when we let it do the job. This still isn't an awesome solution, in that there are lots of other places where we call `logging.error` or `logging.exception` while inside a request; this just covers one of them. This is one of the most common, though, so it's a start.	2018-03-01 15:12:32 -08:00
Callum Fraser	aa9567ce37	mypy: Use Python 3 type syntax in zerver/middleware.py.	2017-12-11 18:43:24 -08:00
rht	a1cc720860	zerver: Use Python 3 syntax for typing. Tweaked by tabbott to fix some minor whitespace errors.	2017-11-28 16:49:36 -08:00
Greg Price	b6cc21b438	debug: Add facility to dump tracemalloc snapshots. Originally this used signals, namely SIGRTMIN. But in prod, the signal handler never fired; I debugged fruitlessly for a while, and suspect uwsgi was foiling it in a mysterious way (which is kind of the only way uwsgi does anything.) So, we listen on a socket. Bit more code, and a bit trickier to invoke, but it works. This was developed for the investigation of memory-bloating on chat.zulip.org that led to `a331b4f64` "Optimize query_all_subs_by_stream()". For usage instructions, see docstring.	2017-11-28 15:52:07 -08:00
Robert Hönig	0e0a8a2b14	queue processor tests: Call consume by default. This significantly improves the API for queue_json_publish to not be overly focused on what the behavior of this function should be in our unit tests.	2017-11-26 11:45:34 -08:00


252 Commits