Previously, several of our URL patterns accidentally did not end with
`$`, and thus ended up controlling just the stated URL, but actually a
much broader set of URLs starting with it.
I did an audit and fixed what I believe are all instances of this URL
pattern behavior. In the process, I fixed a few tests that were
unintentionally relying on the behavior.
Fixes#13082.
An endpoint was created in zerver/views. Basic rate-limiting was
implemented using RealmAuditLog. The idea here is to simply log each
export event as a realm_exported event. The number of events
occurring in the time delta is checked to ensure that the weekly
limit is not exceeded.
The event is published to the 'deferred_work' queue processor to
prevent the export process from being killed after 60s.
Upon completion of the export the realm admin(s) are notified.
While this would never happen for a real article, this prevents a 500
in this case for a situation which is definitely user error and should
be a 40x (in this case, 404).
As part of this, we refactor the main view code to do validation in a
single code path, since the semi-duplicated-in-3-places logic was
getting pretty buggy.
This works around a bug in Django in handling the error case of a
client sending an inappropriate HTTP `Host:` header. Various
internal Django machinery expects to be able to casually call
`request.get_host()`, which will attempt to parse that header, so an
exception will be raised. The exception-handling machinery attempts
to catch that exception and just turn it into a 400 response... but
in a certain case, that machinery itself ends up trying to call
`request.get_host()`, and we end up with an uncaught exception that
causes a 500 response, a chain of tracebacks in the logs, and an email
to the server admins. See example below.
That `request.get_host` call comes in the midst of some CSRF-related
middleware, which doesn't even serve any function unless you have a
form in your 400 response page that you want CSRF protection for.
We use the default 400 response page, which is a 26-byte static
HTML error message. So, just send that with no further ado.
Example exception from server logs (lightly edited):
2017-10-08 09:51:50.835 ERR [django.security.DisallowedHost] Invalid HTTP_HOST header: 'example.com'. You may need to add 'example.com' to ALLOWED_HOSTS.
2017-10-08 09:51:50.835 ERR [django.request] Internal Server Error: /loginWithSetCookie
Traceback (most recent call last):
File ".../django/core/handlers/exception.py", line 41, in inner
response = get_response(request)
File ".../django/utils/deprecation.py", line 138, in __call__
response = self.process_request(request)
File ".../django/middleware/common.py", line 57, in process_request
host = request.get_host()
File ".../django/http/request.py", line 113, in get_host
raise DisallowedHost(msg)
django.core.exceptions.DisallowedHost: Invalid HTTP_HOST header: 'example.com'. You may need to add 'example.com' to ALLOWED_HOSTS.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File ".../django/core/handlers/exception.py", line 109, in get_exception_response
response = callback(request, **dict(param_dict, exception=exception))
File ".../django/utils/decorators.py", line 145, in _wrapped_view
result = middleware.process_view(request, view_func, args, kwargs)
File ".../django/middleware/csrf.py", line 276, in process_view
good_referer = request.get_host()
File ".../django/http/request.py", line 113, in get_host
raise DisallowedHost(msg)
django.core.exceptions.DisallowedHost: Invalid HTTP_HOST header: 'example.com'. You may need to add 'example.com' to ALLOWED_HOSTS.
This is one of the last major endpoints that were still done in the
pre-REST style.
While we're at it, we change the endpoint to expect a stream ID, not a
stream name.
For a long time, rest_dispatch has had this hack where we have to
create a copy of it in each views file using it, in order to directly
access the globals list in that file. This removes that hack, instead
making rest_dispatch just use Django's import_string to access the
target method to use.
[tweaked and reorganized from acrefoot's original branch in various
ways by tabbott]