mirror of https://github.com/zulip/zulip.git
docs: Update some notes about Tornado scalability.
parent 5050b595a0
commit f9b60b4803
@@ -95,16 +95,20 @@ by the Supervisor configuration (which explains how to start the server
 processes; see "Supervisor" below) and the nginx configuration (which
 explains which HTTP requests get sent to which app server).

-Tornado is an asynchronous server and is meant specifically to hold open
-tens of thousands of long-lived (long-polling or websocket) connections
--- that is to say, routes that maintain a persistent connection from
-every running client. For this reason, it's responsible for event
-(message) delivery, but not much else. We try to avoid any blocking
-calls in Tornado because we don't want to delay delivery to thousands of
-other connections (as this would make Zulip very much not real-time).
-For instance, we avoid doing cache or database queries inside the
-Tornado code paths, since those blocking requests carry a very high
-performance penalty for a single-threaded, asynchronous server.
+Tornado is an asynchronous server and is meant specifically to hold
+open tens of thousands of long-lived (long-polling or websocket)
+connections -- that is to say, routes that maintain a persistent
+connection from every running client. For this reason, it's
+responsible for event (message) delivery, but not much else. We try to
+avoid any blocking calls in Tornado because we don't want to delay
+delivery to thousands of other connections (as this would make Zulip
+very much not real-time). For instance, we avoid doing cache or
+database queries inside the Tornado code paths, since those blocking
+requests carry a very high performance penalty for a single-threaded,
+asynchronous server system. (In principle, we could do non-blocking
+requests to those services, but the Django-based database libraries we
+use in most of our codebase don't support that, and in any case,
+our architecture doesn't require Tornado to do that.)

 The parts that are activated relatively rarely (e.g. when people type or
 click on something) are processed by the Django application server. One
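To illustrate why the paragraph above warns against blocking calls in a single-threaded asynchronous server, here is a minimal sketch. It uses Python's standard `asyncio` event loop as a stand-in and is not Zulip's or Tornado's code; `heartbeat`, `blocking_handler`, `cooperative_handler`, `max_gap`, and `measure` are hypothetical names chosen for this example, and `time.sleep` stands in for a synchronous database or cache query.

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    """Stand-in for event delivery: record when each tick actually runs."""
    for _ in range(count):
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_handler(delay: float) -> None:
    """Simulates a synchronous query made inside the event loop."""
    time.sleep(delay)  # blocks the whole loop; no other coroutine can run


async def cooperative_handler(delay: float) -> None:
    """Simulates a non-blocking wait that yields control back to the loop."""
    await asyncio.sleep(delay)


async def max_gap(handler, delay: float) -> float:
    """Run `handler` alongside the heartbeat; return the longest tick gap."""
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks, 0.01, 10))
    await asyncio.sleep(0.03)  # let the heartbeat record a few ticks first
    await handler(delay)
    await beat
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


def measure(handler, delay: float = 0.2) -> float:
    """Convenience wrapper that runs the experiment in a fresh event loop."""
    return asyncio.run(max_gap(handler, delay))


if __name__ == "__main__":
    print("longest stall with blocking call:   ", measure(blocking_handler))
    print("longest stall with cooperative call:", measure(cooperative_handler))
```

With the blocking handler, the longest gap between heartbeat ticks is at least as long as the blocking call itself, which is the kind of delay every other open connection would see. With the cooperative handler, the ticks keep their usual spacing.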
@@ -436,18 +436,23 @@ running Zulip with larger teams (especially >1000 users).
 S3 backend for storing user-uploaded files and avatars and will want
 to make sure secrets are available on the hot spare.

-* Zulip does not support dividing traffic for a given Zulip realm
-  between multiple application servers. There are two issues: you
-  need to share the memcached/Redis/RabbitMQ instance (these can
-  be moved to a network service shared by multiple servers with a
-  bit of configuration) and the Tornado event system for pushing to
-  browsers currently has no mechanism for multiple frontend servers
-  (or event processes) talking to each other. One can probably get a
-  factor of 10 in a single server's scalability by [supporting
-  multiple Tornado processes on a single
-  server](https://github.com/zulip/zulip/issues/372), which is also
-  likely the first part of any project to support exchanging events
-  amongst multiple servers.
+* Zulip 2.0 and later supports running multiple Tornado servers
+  sharded by realm/organization, which is how we scale Zulip Cloud.
+
+* However, Zulip does not yet support dividing traffic for a single
+  Zulip realm between multiple application servers. There are two
+  issues: you need to share the memcached/Redis/RabbitMQ instance
+  (these can be moved to a network service shared by multiple
+  servers with a bit of configuration) and the Tornado event system
+  for pushing to browsers currently has no mechanism for multiple
+  frontend servers (or event processes) talking to each other. One
+  can probably get a factor of 10 in a single server's scalability by
+  [supporting multiple Tornado processes on a single server](https://github.com/zulip/zulip/issues/372),
+  which is also likely the first part of any project to support
+  exchanging events amongst multiple servers. The work for changing
+  this is pretty far along, though, and thus while not generally
+  available yet, we can set it up for users with an enterprise support
+  contract.

 Questions, concerns, and bug reports about this area of Zulip are very
 welcome! This is an area we are hoping to improve.
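As a rough illustration of what "sharded by realm/organization" in the hunk above can mean, the sketch below routes each realm to exactly one event process. It is not Zulip's implementation: the port numbers, the `REALM_TO_PORT` table, and the `tornado_port_for_realm` helper are all hypothetical names and values invented for this example.

```python
# Hypothetical port used for any realm without an explicit shard assignment.
DEFAULT_PORT = 9800

# Hypothetical shard table: realm subdomain -> port of the process serving it.
REALM_TO_PORT = {
    "bigcorp": 9801,
    "community": 9802,
}


def tornado_port_for_realm(
    realm_subdomain: str,
    shard_map: dict = REALM_TO_PORT,
    default_port: int = DEFAULT_PORT,
) -> int:
    """Return the single port that handles all events for this realm."""
    return shard_map.get(realm_subdomain, default_port)


if __name__ == "__main__":
    for realm in ("bigcorp", "community", "smallteam"):
        print(realm, "->", tornado_port_for_realm(realm))
```

Because the lookup is keyed on the realm alone, every client of a given realm lands on the same process. That matches the limitation described above: different realms can be spread across processes, but traffic for a single realm is not divided.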