This provides a single reference point for all zulip.conf settings;
these mostly link out to the more complete documentation about each
setting, elsewhere.
Fixes#12490.
There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" the name of the binary and colloquial name. This is minor
cleanup, but enabled by the other renames in the previous commit.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes. Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.
Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.
[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
This also removes direct includes of `zulip::common`, making
`zulip::base` gatekeep the inclusion of it. This helps enforce that
any top-level deploy only needs include a single class, and that any
configuration which is not meant to be deployed by itself will not
apply, due to lack of `zulip::common` include.
The following commit will better differentiate these top-level deploys
by moving them into a subdirectory.
d2aa81858c replaced the `apt::source` to set up debathena with
`Exec['setup-apt-repo-debathena']`, but mistakenly left the
`apt::source` in place in `zmirror` (but not `zmirror_personals`).
The `apt::source` resource type was later removed in c9d54f7854,
making the manifest to apply on `zmirror`.
Remove the broken and unnecessary `apt::source` resource.
This property is not related to the base zulip install; move it to
zulip::postgres_common, which is already used as a namespace for
various postgres variables.
There was likely more dependency complexity prior to 97766102df, but
there is now no reason to require that consumers explicitly include
zulip::apt_repository.
Use https://github.com/stripe/smokescreen to provide a server for an
outgoing proxy, run under supervisor. This will allow centralized
blocking of internal metadata IPs, localhost, and so forth, as well as
providing default request timeouts (10s by default).
Disabled on webservers in 047817b6b0, it has since lingered in
configuration, as well as running (to no effect) every minute on the
loadbalancer.
Remove the vestiges of its configuration.
This banner shows on lb1, advertising itself as lb0. There is no
compelling reason for a custom motd, especially one which needs to
be reconfigured for each host.
The rabbitmq cron jobs exist in order to call rabbitmqctl as root and
write the output to files that nagios can consume, since nagios is not
allowed to run rabbitmqctl.
In systems which do not have nagios configured, these every-minute
cron jobs add non-insignificant load, to no effect. Move their
installation into `zulip_ops`. In doing so, also combine the cron.d
files into a single file; this allows us to `ensure => absent` the old
filenames, removing them from existing systems. Leave the resulting
combined cron.d file in `zulip`, since it is still of general utility
and note.
wal-g was used in `puppet/zulip` by env-wal-g, but only installed in
`puppet/zulip_ops`.
Merge all of the dependencies of doing backups using wal-g (wal-g
installation, the pg_backup_and_purge job, the nagios plugin that
verifies it happens) into a common base class in `puppet/zulip`, since
it is generally useful.
No plugins are installed inside the /usr/local/munin/lib this creates
in munin-node, nor are they symlinked into /etc/munin/plugins, so
non-default plugins are added by this.
The one complexity is that hosts_fullstack are treated differently, as
they are not currently found in the manual `hosts` list, and as such
do not get munin monitoring.
In Bionic, nagios-plugins-basic is a transitional package which
depends on monitoring-plugins-basic. In Focal, it is a virtual
package, which means that every time puppet runs, it tries to
re-install the nagios-plugins-basic package.
Switch all instances to referring to `$zulip::common::nagios_plugins`,
and repoint that to monitoring-plugins-basic.
Support for Xenial and Stretch was removed (5154ddafca, 0f4b1076ad,
8944e0ad53, 79acd5ae40, 1219a2e854), but not all codepaths were
updated to remove their conditionals on it.
Remove all code predicated on Xenial or Stretch. debathena support
was migrated to Bionic, since that appears to be the current state of
existing debathena servers.
Since 9e8f1aacb3, zulip_ops machines
might have two Package declarations for `certbot`, which doesn't work
in puppet.
The fix is, as usual, to use our `zulip::safepackage` wrapper instead.
The style guide for Zulip is to always use "primary" and "replica"
when describing database replication. Adjust a few comments under
`puppet/` that do not adhere to this.
Unfortunately, some references still remain to the insensitive and
inaccurate "master" / "slave" terminology. However, these are only in
files which we are attempting to preserve as close to the upstream
versions they are derived from (e.g. postgresql.conf,
postfix/master.cf).
All differences between the primary and replica roles having been
merged, fold the `postgres_common`, `postgres_master`, and
`postgres_slave` roles into just `postgres_appdb`.
These values differed between the primary and secondary database
hosts, for unclear reasons. The differences date back to their
introduction in 387f63deaa. As the comment in the replica
confguration notes, settings of `vm.dirty_ratio = 10` and
`vm.dirty_background_ratio = 5` matched the kernel defaults for
"newer" kernels; however, kernel 2.6.30 bumped those to 20 and 10,
respectively[1], as a fix for underlying logic now being more correct.
Remove these overrides; they should at very least be consistent across
roles, and the previous values look to be an attempt to tune for a
very much older version of the Linux kernel, which was using an
different, buggier, algorithm under the hood.
[1] 1b5e62b42b
This file controls streaming replication, and recovery using wal-g on
the secondary. The `primary_conninfo` data needs to change on short
notice when database failover happens, in a way that is not suitable
for being controlled by puppet.
PostgreSQL 12, in fact, removes the use of the `recovery.conf` file[1];
the `primary_conninfo` and `restore_command` information goes into the
main `postgresql.conf` file, and the standby status is controlled by
the presence of absence of an empty `standby.signal` file.
Remove the puppet control of the `recovery.conf` file.
[1] https://pgstef.github.io/2018/11/26/postgresql12_preview_recovery_conf_disappears.html
Since the nagios authentication is stored _in the database_, it is
unnecessary to run if the database is simply a replica of the
production database. The only case in which this statement would have
an effect is if the postgres node contains a _different_ (or empty)
database, which `setup_disks` now effectively prevents.
Remove the unnecessary step.
481613a344 updated the `setup_disks` script to no longer reference
`mdadm`, since we no longer set up RAID on servers.
Update the puppet that would call it to remove the `mdadm` dependency,
and run only if the state is not what it produces -- namely, a symlink
for `/var/lib/postgresql`, which must point to an existent
`/srv/postgresql` directory.
As the previous commit, this is currently only used in tuning, but is
a property of the whole postgres configuration; move it there, as just
the directory, not the file.
Use this directory consistently in the erb templates. Since we
produce a `pg_hba.conf`, it makes sense that we point to the path that we
know that we explicitly wrote to, for instance.