2017-11-16 00:53:11 +01:00
|
|
|
import logging
|
2024-07-12 02:30:25 +02:00
|
|
|
from collections.abc import Collection
|
2023-01-18 05:25:49 +01:00
|
|
|
from contextlib import suppress
|
2024-07-12 02:30:25 +02:00
|
|
|
from typing import Any, Optional
|
2023-12-05 21:14:17 +01:00
|
|
|
from urllib.parse import unquote
|
2017-11-16 00:53:11 +01:00
|
|
|
|
2016-11-27 04:56:26 +01:00
|
|
|
import tornado.web
|
2022-03-18 08:34:10 +01:00
|
|
|
from asgiref.sync import sync_to_async
|
2016-11-27 04:56:26 +01:00
|
|
|
from django import http
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
from django.core import signals
|
2016-11-27 04:56:26 +01:00
|
|
|
from django.core.handlers import base
|
2017-11-16 00:53:11 +01:00
|
|
|
from django.core.handlers.wsgi import WSGIRequest, get_script_name
|
2016-11-27 04:56:26 +01:00
|
|
|
from django.http import HttpRequest, HttpResponse
|
2020-06-11 00:54:34 +02:00
|
|
|
from django.urls import set_script_prefix
|
2021-04-02 23:05:07 +02:00
|
|
|
from django.utils.cache import patch_vary_headers
|
2022-06-27 23:53:09 +02:00
|
|
|
from tornado.iostream import StreamClosedError
|
2016-11-27 04:56:26 +01:00
|
|
|
from tornado.wsgi import WSGIContainer
|
2023-10-12 19:43:45 +02:00
|
|
|
from typing_extensions import override
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2022-05-27 01:06:39 +02:00
|
|
|
from zerver.lib.response import AsynchronousResponse, json_response
|
2016-11-27 06:14:48 +01:00
|
|
|
from zerver.tornado.descriptors import get_descriptor_by_handler_id
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2016-11-27 06:36:06 +01:00
|
|
|
current_handler_id = 0
|
2024-07-12 02:30:17 +02:00
|
|
|
handlers: dict[int, "AsyncDjangoHandler"] = {}
|
2023-04-26 02:52:20 +02:00
|
|
|
fake_wsgi_container = WSGIContainer(lambda environ, start_response: [])
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2023-12-12 19:02:18 +01:00
|
|
|
def get_handler_by_id(handler_id: int) -> Optional["AsyncDjangoHandler"]:
|
|
|
|
return handlers.get(handler_id)
|
2016-11-27 06:36:06 +01:00
|
|
|
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2021-02-12 08:20:45 +01:00
|
|
|
def allocate_handler_id(handler: "AsyncDjangoHandler") -> int:
|
2016-11-27 06:36:06 +01:00
|
|
|
global current_handler_id
|
|
|
|
handlers[current_handler_id] = handler
|
2022-06-23 23:46:27 +02:00
|
|
|
handler_id = current_handler_id
|
2016-11-27 06:36:06 +01:00
|
|
|
current_handler_id += 1
|
2022-06-23 23:46:27 +02:00
|
|
|
return handler_id
|
2016-11-27 06:36:06 +01:00
|
|
|
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2017-10-26 11:38:28 +02:00
|
|
|
def clear_handler_by_id(handler_id: int) -> None:
|
2023-12-12 05:32:05 +01:00
|
|
|
if handler_id in handlers:
|
|
|
|
del handlers[handler_id]
|
2016-11-27 06:36:06 +01:00
|
|
|
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2017-10-26 11:38:28 +02:00
|
|
|
def handler_stats_string() -> str:
|
2020-06-10 06:41:04 +02:00
|
|
|
return f"{len(handlers)} handlers, latest ID {current_handler_id}"
|
2016-11-27 06:36:06 +01:00
|
|
|
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2024-07-12 02:30:17 +02:00
|
|
|
def finish_handler(handler_id: int, event_queue_id: str, contents: list[dict[str, Any]]) -> None:
|
2016-11-27 06:36:06 +01:00
|
|
|
try:
|
2021-07-09 10:06:04 +02:00
|
|
|
# We do the import during runtime to avoid cyclic dependency
|
|
|
|
# with zerver.lib.request
|
2021-08-21 19:24:20 +02:00
|
|
|
from zerver.lib.request import RequestNotes
|
2021-07-08 16:10:54 +02:00
|
|
|
from zerver.middleware import async_request_timer_restart
|
|
|
|
|
2023-12-12 19:02:18 +01:00
|
|
|
# The request handler may have been GC'd by a
|
|
|
|
# on_connection_close elsewhere already, so we have to check
|
|
|
|
# it is still around.
|
2016-11-27 06:36:06 +01:00
|
|
|
handler = get_handler_by_id(handler_id)
|
2023-12-12 19:02:18 +01:00
|
|
|
if handler is None:
|
|
|
|
return
|
2016-11-27 06:36:06 +01:00
|
|
|
request = handler._request
|
2022-06-23 23:49:32 +02:00
|
|
|
assert request is not None
|
2023-12-12 19:02:18 +01:00
|
|
|
|
|
|
|
# We call async_request_timer_restart here in case we are
|
|
|
|
# being finished without any events (because another
|
|
|
|
# get_events request has supplanted this request)
|
2018-10-17 00:39:10 +02:00
|
|
|
async_request_timer_restart(request)
|
2021-08-21 19:24:20 +02:00
|
|
|
log_data = RequestNotes.get_notes(request).log_data
|
2021-07-09 10:06:04 +02:00
|
|
|
assert log_data is not None
|
2016-11-27 06:36:06 +01:00
|
|
|
if len(contents) != 1:
|
2021-07-09 10:06:04 +02:00
|
|
|
log_data["extra"] = f"[{event_queue_id}/1]"
|
2016-11-27 06:36:06 +01:00
|
|
|
else:
|
2021-07-09 10:06:04 +02:00
|
|
|
log_data["extra"] = "[{}/1/{}]".format(event_queue_id, contents[0]["type"])
|
2016-11-27 06:36:06 +01:00
|
|
|
|
2022-03-18 08:34:10 +01:00
|
|
|
tornado.ioloop.IOLoop.current().add_callback(
|
|
|
|
handler.zulip_finish,
|
2021-02-12 08:20:45 +01:00
|
|
|
dict(result="success", msg="", events=contents, queue_id=event_queue_id),
|
2021-02-12 08:19:30 +01:00
|
|
|
request,
|
|
|
|
)
|
2022-06-03 05:51:16 +02:00
|
|
|
except Exception as e:
|
|
|
|
if not (
|
|
|
|
(isinstance(e, OSError) and str(e) == "Stream is closed")
|
|
|
|
or (isinstance(e, AssertionError) and str(e) == "Request closed")
|
|
|
|
):
|
|
|
|
logging.exception(
|
|
|
|
"Got error finishing handler for queue %s", event_queue_id, stack_info=True
|
|
|
|
)
|
2016-11-27 06:36:06 +01:00
|
|
|
|
|
|
|
|
2022-09-30 03:08:23 +02:00
|
|
|
class AsyncDjangoHandler(tornado.web.RequestHandler):
|
|
|
|
handler_id: int
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2023-12-07 22:15:32 +01:00
|
|
|
SUPPORTED_METHODS: Collection[str] = {"GET", "POST", "DELETE"} # type: ignore[assignment] # https://github.com/tornadoweb/tornado/pull/3354
|
2023-12-07 19:47:14 +01:00
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2022-09-30 03:08:23 +02:00
|
|
|
def initialize(self, django_handler: base.BaseHandler) -> None:
|
|
|
|
self.django_handler = django_handler
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
|
|
|
|
# Prevent Tornado from automatically finishing the request
|
2016-11-27 04:56:26 +01:00
|
|
|
self._auto_finish = False
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
|
2016-11-27 04:56:26 +01:00
|
|
|
# Handler IDs are allocated here, and the handler ID map must
|
2023-12-11 21:02:31 +01:00
|
|
|
# be cleared when the handler finishes its response. See
|
|
|
|
# on_finish and on_connection_close.
|
2022-06-23 23:46:27 +02:00
|
|
|
self.handler_id = allocate_handler_id(self)
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2024-07-12 02:30:23 +02:00
|
|
|
self._request: HttpRequest | None = None
|
2022-06-23 23:49:32 +02:00
|
|
|
|
2023-12-07 20:20:54 +01:00
|
|
|
@override
|
|
|
|
def on_finish(self) -> None:
|
2023-12-11 21:02:31 +01:00
|
|
|
# Note that this only runs on _successful_ requests. If the
|
|
|
|
# client closes the connection, see on_connection_close,
|
|
|
|
# below.
|
2023-12-07 20:20:54 +01:00
|
|
|
clear_handler_by_id(self.handler_id)
|
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2017-10-26 11:38:28 +02:00
|
|
|
def __repr__(self) -> str:
|
2016-11-27 06:02:45 +01:00
|
|
|
descriptor = get_descriptor_by_handler_id(self.handler_id)
|
2020-06-10 06:41:04 +02:00
|
|
|
return f"AsyncDjangoHandler<{self.handler_id}, {descriptor}>"
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2022-06-27 02:33:55 +02:00
|
|
|
async def convert_tornado_request_to_django_request(self) -> HttpRequest:
|
2020-02-08 01:02:23 +01:00
|
|
|
# This takes the WSGI environment that Tornado received (which
|
|
|
|
# fully describes the HTTP request that was sent to Tornado)
|
|
|
|
# and pass it to Django's WSGIRequest to generate a Django
|
|
|
|
# HttpRequest object with the original Tornado request's HTTP
|
|
|
|
# headers, parameters, etc.
|
2023-04-26 02:52:20 +02:00
|
|
|
environ = fake_wsgi_container.environ(self.request)
|
2023-12-05 21:14:17 +01:00
|
|
|
environ["PATH_INFO"] = unquote(environ["PATH_INFO"])
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2020-02-08 01:02:23 +01:00
|
|
|
# Django WSGIRequest setup code that should match logic from
|
|
|
|
# Django's WSGIHandler.__call__ before the call to
|
|
|
|
# `get_response()`.
|
2016-11-27 04:56:26 +01:00
|
|
|
set_script_prefix(get_script_name(environ))
|
2022-06-27 02:33:55 +02:00
|
|
|
await sync_to_async(
|
2022-10-08 07:35:48 +02:00
|
|
|
lambda: signals.request_started.send(sender=type(self.django_handler)),
|
2022-09-30 03:08:23 +02:00
|
|
|
thread_sensitive=True,
|
2022-06-27 02:33:55 +02:00
|
|
|
)()
|
2022-06-24 10:20:46 +02:00
|
|
|
self._request = WSGIRequest(environ)
|
2020-02-08 01:02:23 +01:00
|
|
|
|
2021-07-09 15:30:06 +02:00
|
|
|
# We do the import during runtime to avoid cyclic dependency
|
2021-08-21 19:24:20 +02:00
|
|
|
from zerver.lib.request import RequestNotes
|
2021-07-09 15:30:06 +02:00
|
|
|
|
2020-02-08 01:02:23 +01:00
|
|
|
# Provide a way for application code to access this handler
|
|
|
|
# given the HttpRequest object.
|
2022-06-24 10:20:46 +02:00
|
|
|
RequestNotes.get_notes(self._request).tornado_handler_id = self.handler_id
|
2020-02-08 01:02:23 +01:00
|
|
|
|
2022-06-24 10:20:46 +02:00
|
|
|
return self._request
|
2020-02-08 01:02:23 +01:00
|
|
|
|
2022-06-01 01:35:08 +02:00
|
|
|
async def write_django_response_as_tornado_response(self, response: HttpResponse) -> None:
|
2020-02-08 01:10:35 +01:00
|
|
|
# This takes a Django HttpResponse and copies its HTTP status
|
|
|
|
# code, headers, cookies, and content onto this
|
|
|
|
# tornado.web.RequestHandler (which is how Tornado prepares a
|
|
|
|
# response to write).
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2020-02-08 01:10:35 +01:00
|
|
|
# Copy the HTTP status code.
|
2016-11-27 04:56:26 +01:00
|
|
|
self.set_status(response.status_code)
|
2020-02-08 01:10:35 +01:00
|
|
|
|
|
|
|
# Copy the HTTP headers (iterating through a Django
|
|
|
|
# HttpResponse is the way to access its headers as key/value pairs)
|
2016-11-27 04:56:26 +01:00
|
|
|
for h in response.items():
|
|
|
|
self.set_header(h[0], h[1])
|
|
|
|
|
2020-02-08 01:10:35 +01:00
|
|
|
# Copy any cookies
|
2016-11-27 04:56:26 +01:00
|
|
|
if not hasattr(self, "_new_cookies"):
|
2024-07-12 02:30:17 +02:00
|
|
|
self._new_cookies: list[http.cookie.SimpleCookie] = []
|
2016-11-27 04:56:26 +01:00
|
|
|
self._new_cookies.append(response.cookies)
|
|
|
|
|
2020-02-08 01:10:35 +01:00
|
|
|
# Copy the response content
|
2016-11-27 04:56:26 +01:00
|
|
|
self.write(response.content)
|
2020-02-08 01:10:35 +01:00
|
|
|
|
|
|
|
# Close the connection.
|
2023-01-18 05:25:49 +01:00
|
|
|
# While writing the response, we might realize that the
|
|
|
|
# user already closed the connection; that is fine.
|
|
|
|
with suppress(StreamClosedError):
|
2022-06-27 23:53:09 +02:00
|
|
|
await self.finish()
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2022-03-18 08:34:10 +01:00
|
|
|
async def get(self, *args: Any, **kwargs: Any) -> None:
|
2022-06-27 02:33:55 +02:00
|
|
|
request = await self.convert_tornado_request_to_django_request()
|
2022-09-30 03:08:23 +02:00
|
|
|
response = await sync_to_async(
|
|
|
|
lambda: self.django_handler.get_response(request), thread_sensitive=True
|
|
|
|
)()
|
2020-02-08 01:10:35 +01:00
|
|
|
|
|
|
|
try:
|
2022-05-27 01:06:39 +02:00
|
|
|
if isinstance(response, AsynchronousResponse):
|
2021-07-08 16:10:54 +02:00
|
|
|
# We import async_request_timer_restart during runtime
|
|
|
|
# to avoid cyclic dependency with zerver.lib.request
|
|
|
|
from zerver.middleware import async_request_timer_stop
|
|
|
|
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
# For asynchronous requests, this is where we exit
|
|
|
|
# without returning the HttpResponse that Django
|
|
|
|
# generated back to the user in order to long-poll the
|
|
|
|
# connection. We save some timers here in order to
|
|
|
|
# support accurate accounting of the total resources
|
|
|
|
# consumed by the request when it eventually returns a
|
|
|
|
# response and is logged.
|
|
|
|
async_request_timer_stop(request)
|
2021-07-20 07:49:38 +02:00
|
|
|
else:
|
|
|
|
# For normal/synchronous requests that don't end up
|
|
|
|
# long-polling, we just need to write the HTTP
|
|
|
|
# response that Django prepared for us via Tornado.
|
2021-08-18 17:54:22 +02:00
|
|
|
assert isinstance(response, HttpResponse)
|
2022-06-01 01:35:08 +02:00
|
|
|
await self.write_django_response_as_tornado_response(response)
|
2020-02-08 01:10:35 +01:00
|
|
|
finally:
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
# Tell Django that we're done processing this request on
|
|
|
|
# the Django side; this triggers cleanup work like
|
|
|
|
# resetting the urlconf and any cache/database
|
|
|
|
# connections.
|
2022-03-17 21:42:25 +01:00
|
|
|
await sync_to_async(response.close, thread_sensitive=True)()
|
2020-02-08 01:10:35 +01:00
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2022-03-18 08:34:10 +01:00
|
|
|
async def post(self, *args: Any, **kwargs: Any) -> None:
|
|
|
|
await self.get(*args, **kwargs)
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2022-03-18 08:34:10 +01:00
|
|
|
async def delete(self, *args: Any, **kwargs: Any) -> None:
|
|
|
|
await self.get(*args, **kwargs)
|
2016-11-27 04:56:26 +01:00
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2017-10-26 11:38:28 +02:00
|
|
|
def on_connection_close(self) -> None:
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
# Register a Tornado handler that runs when client-side
|
|
|
|
# connections are closed to notify the events system.
|
2023-12-11 21:02:31 +01:00
|
|
|
|
|
|
|
# If the client goes away, garbage collect the handler (with
|
|
|
|
# attached request information).
|
|
|
|
clear_handler_by_id(self.handler_id)
|
2016-11-27 04:56:26 +01:00
|
|
|
client_descriptor = get_descriptor_by_handler_id(self.handler_id)
|
|
|
|
if client_descriptor is not None:
|
|
|
|
client_descriptor.disconnect_handler(client_closed=True)
|
|
|
|
|
2024-07-12 02:30:17 +02:00
|
|
|
async def zulip_finish(self, result_dict: dict[str, Any], old_request: HttpRequest) -> None:
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
# Function called when we want to break a long-polled
|
|
|
|
# get_events request and return a response to the client.
|
|
|
|
|
|
|
|
# Marshall the response data from result_dict.
|
2021-02-12 08:20:45 +01:00
|
|
|
if result_dict["result"] == "error":
|
2016-11-27 04:56:26 +01:00
|
|
|
self.set_status(400)
|
|
|
|
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
# The `result` dictionary contains the data we want to return
|
|
|
|
# to the client. We want to do so in a proper Tornado HTTP
|
|
|
|
# response after running the Django response middleware (which
|
|
|
|
# does things like log the request, add rate-limit headers,
|
|
|
|
# etc.). The Django middleware API expects to receive a fresh
|
|
|
|
# HttpRequest object, and so to minimize hacks, our strategy
|
|
|
|
# is to create a duplicate Django HttpRequest object, tagged
|
|
|
|
# to automatically return our data in its response, and call
|
|
|
|
# Django's main self.get_response() handler to generate an
|
|
|
|
# HttpResponse with all Django middleware run.
|
2022-06-27 02:33:55 +02:00
|
|
|
request = await self.convert_tornado_request_to_django_request()
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
|
2021-08-21 19:24:20 +02:00
|
|
|
# We import RequestNotes during runtime to avoid
|
2021-07-09 10:06:04 +02:00
|
|
|
# cyclic import
|
2021-08-21 19:24:20 +02:00
|
|
|
from zerver.lib.request import RequestNotes
|
2021-07-09 10:06:04 +02:00
|
|
|
|
2021-08-21 19:24:20 +02:00
|
|
|
request_notes = RequestNotes.get_notes(request)
|
|
|
|
old_request_notes = RequestNotes.get_notes(old_request)
|
2021-07-09 10:06:04 +02:00
|
|
|
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
# Add to this new HttpRequest logging data from the processing of
|
|
|
|
# the original request; we will need these for logging.
|
2021-07-09 10:06:04 +02:00
|
|
|
request_notes.log_data = old_request_notes.log_data
|
2021-07-19 23:27:29 +02:00
|
|
|
if request_notes.rate_limit is not None:
|
|
|
|
request_notes.rate_limit = old_request_notes.rate_limit
|
2023-06-20 22:52:31 +02:00
|
|
|
if request_notes.requester_for_logs is not None:
|
|
|
|
request_notes.requester_for_logs = old_request_notes.requester_for_logs
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
request.user = old_request.user
|
2021-07-09 18:10:51 +02:00
|
|
|
request_notes.client = old_request_notes.client
|
|
|
|
request_notes.client_name = old_request_notes.client_name
|
|
|
|
request_notes.client_version = old_request_notes.client_version
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
|
|
|
|
# The saved_response attribute, if present, causes
|
|
|
|
# rest_dispatch to return the response immediately before
|
|
|
|
# doing any work. This arrangement allows Django's full
|
|
|
|
# request/middleware system to run unmodified while avoiding
|
|
|
|
# running expensive things like Zulip's authentication code a
|
|
|
|
# second time.
|
2021-07-09 15:17:33 +02:00
|
|
|
request_notes.saved_response = json_response(
|
2021-02-12 08:20:45 +01:00
|
|
|
res_type=result_dict["result"], data=result_dict, status=self.get_status()
|
2021-02-12 08:19:30 +01:00
|
|
|
)
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
|
2022-09-30 03:08:23 +02:00
|
|
|
response = await sync_to_async(
|
|
|
|
lambda: self.django_handler.get_response(request), thread_sensitive=True
|
|
|
|
)()
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
try:
|
2021-04-02 23:05:07 +02:00
|
|
|
# Explicitly mark requests as varying by cookie, since the
|
|
|
|
# middleware will not have seen a session access
|
|
|
|
patch_vary_headers(response, ("Cookie",))
|
2021-08-18 17:54:22 +02:00
|
|
|
assert isinstance(response, HttpResponse)
|
2022-06-01 01:35:08 +02:00
|
|
|
await self.write_django_response_as_tornado_response(response)
|
tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
2020-02-06 22:09:10 +01:00
|
|
|
finally:
|
|
|
|
# Tell Django we're done processing this request
|
2022-03-17 21:42:25 +01:00
|
|
|
await sync_to_async(response.close, thread_sensitive=True)()
|