If running on a stand-alone PostgreSQL server, then supervisor does
exist -- but `stop-server` is useless, and in fact cannot run because
the Zulip directory may not be readable by the `zulip` user.
Detect if this is an application front-end server by looking for
`/home/zulip/deployments`, and use the stop-server and flush-memcached
from there if it exists. The `create-db.sql` and
`terminate-psql-sessions` files are still read from the local
directory, but those already have precautions from being from a
non-world-readable directory, and are more obviously important to keep
in sync with the `create-database` script.
This package is replaced by libmagic1t64 1:5.45-3 for the Ubuntu
64-bit time_t transition, but hasn’t been deleted from the archive
yet.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Replace a separate call to subprocess, starting `node` from scratch,
with an optional standalone node Express service which performs the
rendering. In benchmarking, this reduces the overhead of a KaTeX call
from 120ms to 2.8ms. This is notable because enough calls to KaTeX in
a single message would previously time out the whole message
rendering.
The service is optional because he majority of deployments do not use
enough LaTeX to merit the additional memory usage (60Mb).
Fixes: #17425.
While this could be done previously by calling
`upgrade-zulip-from-git --remote-url /srv/zulip.git`, the explicit
argument makes this more straightforward, and avoids churning the
`refs/remotes/origin/` namespace.
A user who somehow got an empty `zulip` database, but without a
`zerver_messages` table in it, would get stuck in the installer at:
```
++ su postgres -c 'cd / && psql -v ON_ERROR_STOP=1 -Atc '\''SELECT COUNT(*) FROM zulip.zerver_message;'\'' zulip'
ERROR: relation "zulip.zerver_message" does not exist
LINE 1: SELECT COUNT(*) FROM zulip.zerver_message;
^
+ records=
```
Treat a failure to select from `zerver_messages` as having 0 messages,
and continue with the `DROP DATABASE IF EXISTS` / `CREATE DATABASE`
that `create-db.sql` usually does.
Fixes: #29110.
This leaves `args.filter_terms` as the raw values the user specified,
before they may have been transformed to the shape that we use for
substring matching.
Decouple the sending of client restart events from the restarting of
the servers. Restarts use the new Tornado restart-clients endpoint to
inject "restart" events into queues of clients which were loaded from
the previous Tornado process. The rate is controlled by the
`application_server.client_restart_rate`, in clients per minute, or a
flag to `restart-clients` which overrides it. Note that a web client
will also spread its restart over 5 minutes, so artificially-slow
client restarts are generally not very necessary.
Restarts of clients are deferred to until after post-deploy hooks are
run, such that the pre- and post- deploy hooks are around the actual
server restarts, even if pushing restart events to clients takes
significant time.
This flag was generally used not because we wanted to avoid
restarting Tornado, but because we wanted to avoid increasing load
the server when all of the clients were told to reload.
Since we have laid the groundwork for separately telling Tornado to
tell clients to restart, we remove the --skip-tornado flag; the next
commit will add the ability to skip client restarts.
This queue is used to things which definitionally may take longer than
a request, so paging after 60s is rather aggressive. This is
especially true because this queue has a very long tail of very slow
tasks -- p99 of task time in this queue is 8.5s, while p99.9 is 197s.
Raise the paging threshold to 15 minutes. While there are
semi-user-facing tasks which use this queue (primarily marking
messages as read), those being delayed for minutes is already a real
possibility if they are stuck behind a large realm export -- and this
is not a situation which should necessarily page, since it is not
solvable by the administrator.
Filling caches needs to happen close to when the server is restarted,
as the gap opens us up to race conditions with user modifications. If
there are migrations, however, it must happen within the critical
period after the migrations are applied.
Move the call to fill the caches to within the `shutdown_server`
function, so that we push it as close to the server shutdown as
possible.
This can happen if `machine.pgroonga` is set during initial
installation. We cannot run `CREATE EXTENSION PGROONGA` because the
database that we need to run that statement in does not exist yet;
make the command a silent no-op that does not create the
`pgroonga_setup.sql.applied` flag file, such that a later
`zulip-puppet-apply` once the database exists can pick up and install
the extension.
Tweaked provision script to run successfully in Fedora 38 and
included a script to build the groonga libs from source because
the packages in Fedora repos are outdated.
There is a major version jump from the last supported version (F34)
which is EOL so references and support for older versions were
removed.
Fixes: #20635
nginx sets the value of the `$http_host` variable to the empty string
when using http/3, as there is technically no `Host:` header sent:
https://github.com/nginx-quic/nginx-quic/issues/3
Users with a browser that support http/3 will send their first request
to nginx with http/2, and get an expected HTTP 200 -- but any
subsequent requests will fail with am HTTP 400, since the browser will
have upgraded to http/3, which has an empty `Host` header, which Zulip
rejects.
Switch to the `$host` variable, which works for all HTTP versions.
Co-authored-by: Alex Vandiver <alexmv@zulip.com>
Restore the default django.utils.log.AdminEmailHandler when
ERROR_REPORTING is enabled. Those with more sophisticated needs can
turn it off and use Sentry or a Sentry-compatible system.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The claim in the comment from c8ec3dfcf6, that we can and should use
the current deploy's venv, misses one key case -- when upgrading the
operating system, the current deploy's venv is unworkable, since it
was configured for a previous version of Python. As such, any attempt
to load Django to verify the version of PostgreSQL it is talking to
must happen after the venv is configured.
Move the database version check into
`scripts/lib/check-database-compatibility`, which also moves it after
the new venv is configured.
Because we no longer reliably know, at `apt-get upgrade` time, what
version of PostgreSQL is installed, we hold all versions of the
pgroonga packages.
This ensures that the next `upgrade-zulip-from-git` has access to the
commit history of the initial install, if it was from a forked
repository. `/home/zulip/deployments/current` and `/srv/zulip.git`
are not quite organized into the steady-state that they will have
after one `upgrade-zulip-from-git`:
- `/home/zulip/deployments/current` is its own clone, not a worktree
- `/srv/zulip.git` has an origin of `/home/zulip/deployments/current`
- `remote.origin.mirror` is set on `/srv/zulip.git`
- `remote.origin.fetch` is `+refs/*:refs/*`
All but the first are automatically cleaned up by
`upgrade-zulip-from-git` when it is next run, using the code added in
30457ecd02. The additional complexity of making an existing
independent clone into a worktree seem not worth solving the first
point.
Updating the pgroonga package is not sufficient to upgrade the
extension in PostgreSQL -- an `ALTER EXTENSION pgroonga UPDATE` must
explicitly be run[^1]. Failure to do so can lead to unexpected behavior,
including crashes of PostgreSQL.
Expand on the existing `pgroonga_setup.sql.applied` file, to track
which version of the PostgreSQL extension has been configured. If the
file exists but is empty, we run `ALTER EXTENSION pgroonga UPDATE`
regardless -- if it is a no-op, it still succeeds with a `NOTICE`:
```
zulip=# ALTER EXTENSION pgroonga UPDATE;
NOTICE: version "3.0.8" of extension "pgroonga" is already installed
ALTER EXTENSION
```
The simple `ALTER EXTENSION` is sufficient for the
backwards-compatible case[^1] -- which, for our usage, is every
upgrade since 0.9 -> 1.0. Since version 1.0 was released in 2015,
before pgroonga support was added to Zulip in 2016, we can assume for
the moment that all pgroonga upgrades are backwards-compatible, and
not bother regenerating indexes.
Fixes: #25989.
[^1]: https://pgroonga.github.io/upgrade/
This was only necessary for PGroonga 1.x, and the `pgroonga` schema
will most likely be removed at some point inthe future, which will
make this statement error out.
Drop the unnecessary statement.
If the `postgresql.version` in `/etc/zulip/zulip.conf` is out of date
or wrong, upgrading to the actual current version would drop your
production database without prompting. While we do document taking a
Zulip backup (which includes a database backup) before running
`upgrade-postgresql`[^1], not everyone does so, with possibly
catastrophic consequences.
Do a true end-to-end check of the version in `/etc/zulip/zulip.conf`
by asking Django to query the database for its version, checking that
against the configured value, and aborting if there is any
disagreement.
[^1]: https://zulip.readthedocs.io/en/latest/production/upgrade.html#upgrading-postgresql
If `zulip-puppet-apply` is run during an upgrade, it will immediately
try to re-`stop-server` before running migrations; if the last step in
the puppet application was to restart `supervisor`, it may not be
listening on its UNIX socket yet. In such cases, `socket.connect()`
throws a `FileNotFoundError`:
```
Traceback (most recent call last):
File "./scripts/stop-server", line 53, in <module>
services = list_supervisor_processes(services, only_running=True)
File "./scripts/lib/supervisor.py", line 34, in list_supervisor_processes
processes = rpc().supervisor.getAllProcessInfo()
File "/usr/lib/python3.9/xmlrpc/client.py", line 1116, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python3.9/xmlrpc/client.py", line 1456, in __request
response = self.__transport.request(
File "/usr/lib/python3.9/xmlrpc/client.py", line 1160, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib/python3.9/xmlrpc/client.py", line 1172, in single_request
http_conn = self.send_request(host, handler, request_body, verbose)
File "/usr/lib/python3.9/xmlrpc/client.py", line 1285, in send_request
self.send_content(connection, request_body)
File "/usr/lib/python3.9/xmlrpc/client.py", line 1315, in send_content
connection.endheaders(request_body)
File "/usr/lib/python3.9/http/client.py", line 1250, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.9/http/client.py", line 1010, in _send_output
self.send(msg)
File "/usr/lib/python3.9/http/client.py", line 950, in send
self.connect()
File "./scripts/lib/supervisor.py", line 10, in connect
self.sock.connect(self.host)
FileNotFoundError: [Errno 2] No such file or directory
```
Catch the `FileNotFoundError` and retry twice more, with backoff. If
it fails repeatedly, point to `service supervisor status` for further
debugging, as `FileNotFoundError` is rather misleading -- the file
exists, it simply is not accepting connections.
Instead of copying over a mostly-unchanged `postgresql.conf`, we
transition to deploying a `conf.d/zulip.conf` which contains the
only material changes we made to the file, which were previously
appended to the end.
While shipping separate while `postgresql.conf` files for each
supported version is useful if there is large variety in supported
options between versions, there is not no such variation at current,
and the burden of overriding the entire default configuration is that
it must be keep up to date wit the package's version.