Commit Graph

1146 Commits

Author SHA1 Message Date
Anders Kaseorg 21548ff7c0 install-node: Upgrade Node.js from 16.13.1 to 16.13.2.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-24 15:55:38 -08:00
Alex Vandiver a3adaf4aa3 puppet: Fix standalone certbot configurations.
This addresses the problems mentioned in the previous commit, but for
existing installations which have `authenticator = standalone` in
their configurations.

This reconfigures all hostnames in certbot to use the webroot
authenticator, and attempts to force-renew their certificates.
Force-renewal is necessary because certbot contains no way to merely
update the configuration.  Let's Encrypt allows for multiple extra
renewals per week, so this is a reasonable cost.

Because the certbot configuration is `configobj`, and not
`configparser`, we have no way to easily parse to determine if webroot
is in use; additionally, `certbot certificates` does not provide this
information.  We use `grep`, on the assumption that this will catch
nearly all cases.

It is possible that this will find `authenticator = standalone`
certificates which are managed by Certbot, but not Zulip certificates.
These certificates would also fail to renew while Zulip is running, so
switching them to use the Zulip webroot would still be an improvement.

Fixes #20593.
2022-01-24 12:13:44 -08:00
Alex Vandiver 76ce8631c0 setup: Install a temporary certificate, before certbot runs.
Installing certbot with --method=standalone means that the
configuration file will be written to assume that the standalone
method will be used going forward.  Since nginx will be running,
attempts to renew the certificate will fail.

Install a temporary self-signed certificate, just to allow nginx to
start, and then follow up (after applying puppet to start nginx) with
the call to setup-certbot, which will use the webroot authenticator.

The `setup-certbot --method=standalone` option is left intact, for use
in development environments.

Fixes part of #20593; it does not address installs which were
previously improperly configured with `authenticator = standalone`.
2022-01-24 12:13:44 -08:00
Anders Kaseorg 97e4e9886c python: Replace universal_newlines with text.
This is supported in Python ≥ 3.7.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-23 22:16:01 -08:00
Anders Kaseorg a58a71ef43 Remove Ubuntu 18.04 support.
As a consequence:

• Bump minimum supported Python version to 3.7.
• Move Vagrant environment to Debian 10, which has Python 3.7.
• Move CI frontend tests to Debian 10.
• Move production build test to Debian 10.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-21 17:26:14 -08:00
Alex Vandiver 677467f040 upgrade-zulip-from-git: Fix upstream URL for existing deploys. 2022-01-18 21:10:38 -08:00
Alex Vandiver bad58cdca6 upgrade-zulip-from-git: Fix the upstream URL not be the custom remote. 2022-01-18 21:10:38 -08:00
Alex Vandiver 6bc5849ea8 puppet: Remove now-unused debathena apt repository. 2022-01-18 14:13:28 -08:00
Anders Kaseorg e2cc554077 zulip_tools: Rename may_be_perform_purging to maybe_perform_purging.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-12 13:21:35 -08:00
Alex Vandiver b31658482b upgrade-zulip: Pass any arguments down to upgrade-zulip-stage-2.
This is the equivalent of 93f3da4c05 but
for the tarball codepath.
2022-01-11 14:26:54 -08:00
Alex Vandiver 06e115bb00 zulip_tools: Switch get_deploy_options to use shlex.split.
This makes it honor quoting in the config file.
2022-01-11 14:26:54 -08:00
Anders Kaseorg 1cc1de82cd reindex-textual-data: Reindex textual functional indexes too.
This catches nine functional indexes that the previous query didn’t:

upper_preregistration_email_idx
upper_stream_name_idx
upper_subject_idx
upper_userprofile_email_idx
zerver_message_recipient_upper_subject
zerver_mutedtopic_stream_topic
zerver_stream_realm_id_name_uniq
zerver_userprofile_realm_id_delivery_email_uniq
zerver_userprofile_realm_id_email_uniq

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-07 10:37:04 -08:00
Alex Vandiver 6218ed91c2 puppet: Use lazy-apps and uwsgi control sockets for rolling reloads.
Restarting the uwsgi processes by way of supervisor opens a window
during which nginx 502's all responses.  uwsgi has a configuration
called "chain reloading" which allows for rolling restart of the uwsgi
processes, such that only one process at once in unavailable; see
uwsgi documentation ([1]).

The tradeoff is that this requires that the uwsgi processes load the
libraries after forking, rather than before ("lazy apps"); in theory
this can lead to larger memory footprints, since they are not shared.
In practice, as Django defers much of the loading, this is not as much
of an issue.  In a very basic test of memory consumption (measured by
total memory - free - caches - buffers; 6 uwsgi workers), both
immediately after restarting Django, and after requesting `/` 60 times
with 6 concurrent requests:

                      |  Non-lazy  |  Lazy app  | Difference
    ------------------+------------+------------+-------------
    Fresh             |  2,827,216 |  2,870,480 |   +43,264
    After 60 requests |  3,332,284 |  3,409,608 |   +77,324
    ..................|............|............|.............
    Difference        |   +505,068 |   +539,128 |   +34,060

That is, "lazy app" loading increased the footprint pre-requests by
43MB, and after 60 requests grew the memory footprint by 539MB, as
opposed to non-lazy loading, which grew it by 505MB.  Using wsgi "lazy
app" loading does increase the memory footprint, but not by a large
percentage.

The other effect is that processes may be served by either old or new
code during the restart window.  This may cause transient failures
when new frontend code talks to old backend code.

Enable chain-reloading during graceful, puppetless restarts, but only
if enabled via a zulip.conf configuration flag.

Fixes #2559.

[1]: https://uwsgi-docs.readthedocs.io/en/latest/articles/TheArtOfGracefulReloading.html#chain-reloading-lazy-apps
2022-01-05 14:48:52 -08:00
Alex Vandiver 4aaa250623 zulip_tools: Fix a typo in a comment. 2022-01-05 14:48:52 -08:00
Alex Vandiver 9d85f64e5a upgrade-zulip-stage-2: Pass through --skip-tornado and --less-graceful.
These restart-server arguments are useful to be able to provide to
`upgrade-zulip`.
2021-12-31 11:17:14 -08:00
Alex Vandiver fb3368b482 restart-server: Factor out argparser, to allow reuse. 2021-12-31 11:17:14 -08:00
Alex Vandiver 93f3da4c05 upgrade-from-git: Pass unknown options through to the upgrade process. 2021-12-31 11:17:14 -08:00
Anders Kaseorg 82748d45d8 install-yarn: Use test -ef in case /srv is a symlink.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-30 13:42:07 -08:00
Anders Kaseorg 0b454dda12 install: Try apt-get update if the Ubuntu universe check fails.
On a system where ‘apt-get update’ has never been run, ‘apt-cache
policy’ may show no repositories at all.  Try to correct this with
‘apt-get update’ before giving up.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-16 17:56:23 -08:00
Alex Vandiver f6520a97cd setup-certbot: Reinstate nginx reload after installation.
If nginx was already installed, and we're using the webroot method of
initializing certbot, nginx needs to be reloaded.  Hooks in
`/etc/letsencrypt/renewal-hooks/deploy/` do not run during initial
`certbot certonly`, so an explicit reload is required.
2021-12-10 16:43:53 -08:00
Alex Vandiver 01e8f752a8 puppet: Use certbot package timer, not our own cron job.
The certbot package installs its own systemd timer (and cron job,
which disabled itself if systemd is enabled) which updates
certificates.  This process races with the cron job which Zulip
installs -- the only difference being that Zulip respects the
`certbot.auto_renew` setting, and that it passes the deploy hook.
This means that occasionally nginx would not be reloaded, when the
systemd timer caught the expiration first.

Remove the custom cron job and `certbot-maybe-renew` script, and
reconfigure certbot to always reload nginx after deploying, using
certbot directory hooks.

Since `certbot.auto_renew` can't have an effect, remove the setting.
In turn, this removes the need for `--no-zulip-conf` to
`setup-certbot`.  `--deploy-hook` is similarly removed, as running
deploy hooks to restart nginx is now the default; pass
`--no-directory-hooks` in standalone mode to not attempt to reload
nginx.  The other property of `--deploy-hook`, of skipping symlinking
into place, is given its own flog.
2021-12-09 13:47:33 -08:00
Tim Abbott 9aa2e0ad45 upgrade-zulip-from-git: Improve webpack failure error handling.
We've had a number of unhappy reports of upgrades failing due to
webpack requiring too much memory.  While the previous commit will
likely fix this issue for everyone, it's worth improving the error
message for failures here.

We avoid doing the stop+retry ourselves, because that could cause an
outage in a production system if webpack fails for another reason.

Fixes #20105.
2021-12-09 12:26:34 -08:00
Tim Abbott 72b381d749 upgrade-zulip-from-git: Require more memory to run webpack.
Since the upgrade to Webpack 5, we've been seeing occasional reports
that servers with roughly 4GiB of RAM were getting OOM kills while
running webpack.

Since we can't readily optimize the memory requirements for webpack
itself, we should raise the RAM requirements for doing the
lower-downtime upgrade strategy.

Fixes #20231.
2021-12-09 12:23:25 -08:00
Alex Vandiver 939d2e2705 scripts: Only stop/start existing tornado processes.
Stopping both `zulip-tornado` and `zulip-tornado:*` causes errors on
deploys with tornado sharding, as the plain `zulip-tornado` service
does not exist.

Pass `zulip-tornado:*`, which matches both plain `zulip-tornado`, as
well as the sharded `zulip-tornado:zulip-tornado-port-9800` cases.
2021-12-08 14:06:06 -08:00
Tim Abbott 73d503995a scripts: Fix running compare-settings-to-template from any CWD.
This matches the number of dirname() calls for other files in its
directory.

Fixes #20489.
2021-12-07 14:45:53 -08:00
Anders Kaseorg 2e5af073b7 install-node: Upgrade Node.js from 16.13.0 to 16.13.1.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-03 14:33:53 -08:00
Anders Kaseorg 2e1a8ff632 configure-rabbitmq: Increase startup timeout.
Starting RabbitMQ at boot seems to have gotten slower, which broke
‘vagrant up --provision’.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-03 14:32:23 -08:00
Alex Vandiver 3455fc137a upgrade-postgresql: Check for extension upgrade steps. 2021-11-20 07:13:50 -08:00
Alex Vandiver 544e8c569e install: Switch default to PostgreSQL 14. 2021-11-08 18:21:46 -08:00
Alex Vandiver f77bbd3323 upgrade-postgresql: Switch to vacuumdb --all --analzyze-only --jobs 10.
The `analyze_new_cluster.sh` script output by `pg_upgrade` just runs
`vacuumdb --all --analyze-in-stages`, which runs three passes over the
database, getting better stats each time.  Each of these passes is
independent; the third pass does not require the first two.
`--analyze-in-stages` is only provided to get "something" into the
database, on the theory that it could then be started and used.  Since
we wait for all three passes to complete before starting the database,
the first two passes add no value.

Additionally, PosttgreSQL 14 and up stop writing the
`analyze_new_cluster.sh` script as part of `pg_upgrade`, suggesting
the equivalent `vacuumdb --all --analyze-in-stages` call instead.

Switch to explicitly call `vacuumdb --all --analyze-only`, since we do
not gain any benefit from `--analyze-in-stages`.  We also enable
parallelism, with `--jobs 10`, in order to analyze up to 10 tables in
parallel.  This may increase load, but will accelerate the upgrade
process.
2021-11-08 18:21:46 -08:00
Anders Kaseorg f2a443a736 install-node: Upgrade Node.js from 14.18.1 to 16.13.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-11-05 17:34:13 -07:00
Anders Kaseorg 458844a2f5 install-yarn: Verify that the install location is /srv/zulip-yarn.
scripts.lib.node_cache expects Yarn to be in /srv/zulip-yarn, so if
it’s installed somewhere else, even if it’s the right version, we need
to reinstall it.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-11-03 16:49:58 -07:00
rht bb8504d925 lint: Fix typos found by codespell. 2021-10-19 16:51:13 -07:00
Anders Kaseorg 291087d70c install-yarn: Upgrade Yarn from 1.22.11 to 1.22.17.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-17 07:15:09 -07:00
Anders Kaseorg 7df96b78c6 install-node: Upgrade Node.js from 14.17.6 to 14.18.1.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-17 07:15:09 -07:00
Anders Kaseorg 2f993f1a79 install-node: Stop using NVM.
NVM doesn’t check hashes or signatures and really just adds
complexity we don’t need.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-24 06:58:32 -07:00
Anders Kaseorg 902883d818 setup_venv: Skip virtualenv’s automatic download of setuptools.
It recently started failing on Debian 10 (buster).  We immediately
follow this by replacing these packages with our own versions from
pip.txt, anyway.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-23 14:29:04 -07:00
Anders Kaseorg 08e459b393 zulip_tools: Convert "".format to Python 3.6 f-strings.
Generated automatically by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-22 13:58:46 -07:00
Anders Kaseorg 9bed17e0ab install-node: Upgrade Node.js from 14.17.5 to 14.17.6.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-13 10:12:43 -07:00
Gaurav Pandey 502697d239 docs: Add documentation for bullseye support.
The support for bullseye was added in #17951
but it was not documented as bullseye was
frozen and did not have proper configuration
files, hence wasn't documented.

Since now bullseye is released as a stable
version, it's support can be documented.
2021-09-09 11:05:16 -07:00
Anders Kaseorg 915884bff7 docs: Apply bullet style changes from Prettier.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-08 12:06:24 -07:00
Anders Kaseorg 02582c6956 upgrade-zulip-from-git: Run git fetch with --prune.
This prevents upgrading to an obsolete version of a branch that has
been deleted or renamed.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-01 05:34:57 -07:00
Anders Kaseorg 3cb66d59ac install: Remove /dev/null redirect for zulip-puppet-apply.
The usual output from this command looks like

Notice: Compiled catalog for localhost in environment production in 2.33 seconds
Notice: /Stage[main]/Zulip::Apt_repository/Exec[setup_apt_repo]/returns: current_value 'notrun', should be ['0'] (noop)
Notice: Class[Zulip::Apt_repository]: Would have triggered 'refresh' from 1 event
Notice: Stage[main]: Would have triggered 'refresh' from 1 event
Notice: Applied catalog in 1.20 seconds

which doesn’t seem abnormally alarming, and hiding it makes failures
harder to diagnose.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-31 16:30:53 -07:00
Alex Vandiver faf71eea41 upgrade-postgresql: Do not remove other supervisor configs.
We previously used `zulip-puppet-apply` with a custom config file,
with an updated PostgreSQL version but more limited set of
`puppet_classes`, to pre-create the basic settings for the new cluster
before running `pg_upgradecluster`.

Unfortunately, the supervisor config uses `purge => true` to remove
all SUPERVISOR configuration files that are not included in the puppet
configuration; this leads to it removing all other supervisor
processes during the upgrade, only to add them back and start them
during the second `zulip-puppet-apply`.

It also leads to `process-fts-updates` not being started after the
upgrade completes; this is the one supervisor config file which was
not removed and re-added, and thus the one that is not re-started due
to having been re-added.  This was not detected in CI because CI added
a `start-server` command which was not in the upgrade documentation.

Set a custom facter fact that prevents the `purge` behaviour of the
supervisor configuration.  We want to preserve that behaviour in
general, and using `zulip-puppet-apply` continues to be the best way
to pre-set-up the PostgreSQL configuration -- but we wish to avoid
that behaviour when we know we are applying a subset of the puppet
classes.

Since supervisor configs are no longer removed and re-added, this
requires an explicit start-server step in the instructions after the
upgrades complete.  This brings the documentation into alignment with
what CI is testing.
2021-08-24 19:00:58 -07:00
Anders Kaseorg 7b2e585213 install-yarn: Upgrade Yarn from 1.22.10 to 1.22.11.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-23 12:33:27 -07:00
Anders Kaseorg ebb8e9109c install-node: Upgrade Node.js from 14.17.3 to 14.17.5.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-23 12:29:04 -07:00
Anders Kaseorg 4206e5f00b python: Remove locally dead code.
These changes are all independent of each other; I just didn’t feel
like making dozens of commits for them.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-19 01:51:37 -07:00
Alex Vandiver c9bb2c16cc restart-server: Add a --skip-tornado.
Tornado restarts are the most user-visible; provide a means to restart
everything but them, for changes which are known to not affect
Tornado.
2021-08-04 10:57:53 -07:00
Tim Abbott d439a2a53e emails: Create wider marketing email base template.
For our marketing emails, we want a width that's more appropriate for
newsletter context, vs. the narrow emails we use for transactional
content.

I haven't figured out a cleaner way to do this than duplicating most
of email_base_default.source.html. But it's not a big deal to
duplicate, since we've been changing that base template only about
once a year.
2021-08-03 11:57:31 -07:00
Anders Kaseorg 5483ebae37 python: Convert "".format to Python 3.6 f-strings.
Generated automatically by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00