There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" the name of the binary and colloquial name. This is minor
cleanup, but enabled by the other renames in the previous commit.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes. Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.
Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.
[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
Because the command is part of a pipe sequence, the exitcode defaults
to the last in the sequence, which is not the most important one here.
Set pipefail, which sets the exit status to the exit code of the last
program in the sequence to exit non-zero, or 0 if all succeeded. This
prevents the upgrade from barreling onward and setting
`postgres.version` improperly if the database upgrade step failed.
The contents in the database are unchanged across the PostgreSQL
restart; as such, there is no reason to invalidate the caches.
This step was inherited from the general operating system upgrade
documentation. When Python versions change, such as during OS
upgrades, we must ensure that memcached is cleared. However, the
`do-release-upgrade` process uninstalled and upgraded to a new
memcached, as well as likely restarted the system; a separate step for
OS upgrades to restart memcached is thus unnecessary.
pg_upgradecluster has two possibilities for `--method`: `dump`, and
`upgrade`. The former is the default, and does a `pg_dump` of all of
the databases in the old cluster and feeds them into the new cluster.
This is a sure-fire way of getting the same information in both
databases, but may be extremely slow on large databases, and is
guaranteed to fail on servers whose databases take up >50% of their
disk.
The `--method=upgrade` method, by contrast, uses pg_upgrade to copy
the raw database data file over to the new cluster, and then fiddles
with their internal structure as needed by the upgrade to let them be
correct for the new version[1]. This is slightly faster than the
dump/load method, since it skips the serialization step, but still
requires that there be enough space on disk for both old and new
versions at once. `pg_upgrade` is currently supported for all
versions of PostgreSQL from 8.4 to 12.
Using `pg_upgrade` incurs slightly more risk, but since the it is
widely used by now, using it in the relatively-controlled Zulip server
environment is reasonable. The expected worst failure is failure to
upgrade, not corruption or data loss.
Additionally passing `--link` uses hardlinks to link the data files
into both the old and new directories simultaneously. This resolve
both the runtime of the operation, as well as the disk space usage.
The only potential downside to this is that as soon as writes have
occurred on the upgraded cluster, the old cluster can no longer be
started. Since this tooling intends to remove the old cluster
immediately after the upgrade completes successfully, this is not a
significant drawback.
Switch to using `--method=upgrade --link`. This technique spits out
two shell scripts which are expected to be run after completion of the
upgrade; one re-analyzes the statistics, the other does an `rm -rf` of
the data where it is still hardlinked in the old cluster. Extract the
location of these scripts from parsing the `pg_upgradecluster` output;
since the path is not static, we must rely on it being relatively easy
to parse. The risk of the path changing is lower, and has more
obvious failure modes, than inserting the current contents of these
upgrade steps into the overall `upgrade-postgres`.
[1] https://www.postgresql.org/docs/12/pgupgrade.html
Running `pg-upgradecluster` runs the `CREATE TEXT SEARCH DICTIONARY`
and `CREATE TEXT SEARCH CONFIGURATION` from
`zerver/migrations/0001_initial.py` on the new PostgreSQL cluster;
this requires that the stopwords file and dictionary exist _prior_
to `pg_upgradecluster` being run.
This causes a minor dependency conflict -- we do not wish to duplicate
the functionality from `zulip::postgres_appdb_base` which configures
those files, but installing all of `zulip::postgres_appdb_tuned` will
attempt to restart PostgreSQL -- which has not configured the cluster
for the new version yet.
In order to split out configuration of the prerequisites for the
application database, and the steps required to run it, we need to be
able to apply only part of the puppet configuration. Use the
newly-added `--config` argument to provide a more limited `zulip.conf`
which only applies `zulip::postgres_appdb_base` to the new version of
Postgres, creating the required tsearch data files.
This also preserves the property that a failure at any point prior to
the `pg_upgradecluster` is easily recoverable, by re-running
`zulip-puppet-apply`.
This is based on the existing steps in the documentation, with
additional changes now that the PostgreSQL version is stored in
`/etc/zulip/zulip.conf`.