143 Commits

Author SHA1 Message Date
Tim Abbott ab3cb2b3bf puppet: Fix internal redis puppet configuration.
The inherits rule is required for overriding existing configuration
files, while the `::profile` piece was missed in the recent ::profile
migration.
2020-10-29 11:53:43 -07:00
Alex Vandiver 45f6c79c4a puppet: Rename postgres_ variables to postgresql_. 2020-10-28 11:51:52 -07:00
Alex Vandiver a155430eb5 docs: Document all zulip.conf settings.
This provides a single reference point for all zulip.conf settings;
these mostly link out to the more complete documentation about each
setting, elsewhere.

Fixes #12490.
2020-10-27 13:31:57 -07:00
Alex Vandiver d24c571bab puppet: Automatically back up the database if we have the secrets.
This avoids folks having to manually add to the puppet_classes.
2020-10-27 13:29:19 -07:00
Alex Vandiver e7798d2797 puppet: Move zulip_ops::profile::postgres_appdb to postgresql. 2020-10-27 13:29:19 -07:00
Alex Vandiver 9f25389bff puppet: Move top-level zulip_ops deployments to zulip_ops::profile. 2020-10-27 13:29:19 -07:00
Alex Vandiver 5365af544a puppet: Rename zulip::profile::rabbit to ::rabbitmq. 2020-10-27 13:29:19 -07:00
Alex Vandiver 188af57296 puppet: Rename postgres_appdb to postgresql.
There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" is the name of the binary and the colloquial name.  This is minor
cleanup, but enabled by the other renames in the previous commit.
2020-10-27 13:29:19 -07:00
Alex Vandiver c2185a81d6 puppet: Move top-level zulip deployments into "profile" directory.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes.  Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.

Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.

[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
2020-10-27 13:29:19 -07:00
Alex Vandiver 27cfb14d92 puppet: Only include zulip::base for top-level deploys.
This also removes direct includes of `zulip::common`, making
`zulip::base` gatekeep the inclusion of it.  This helps enforce that
any top-level deploy only needs to include a single class, and that any
configuration which is not meant to be deployed by itself will not
apply, due to lack of `zulip::common` include.

The following commit will better differentiate these top-level deploys
by moving them into a subdirectory.
2020-10-27 13:29:19 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Alex Vandiver a7d1fd9ffb puppet: Remove non-working apt::source.
d2aa81858c replaced the `apt::source` to set up debathena with
`Exec['setup-apt-repo-debathena']`, but mistakenly left the
`apt::source` in place in `zmirror` (but not `zmirror_personals`).
The `apt::source` resource type was later removed in c9d54f7854,
making the manifest fail to apply on `zmirror`.

Remove the broken and unnecessary `apt::source` resource.
2020-10-23 11:31:20 -07:00
Alex Vandiver 0ea20bd7d8 puppet: Move postgres_version into postgres_common.
This property is not related to the base zulip install; move it to
zulip::postgres_common, which is already used as a namespace for
various postgres variables.
2020-10-22 11:32:25 -07:00
Alex Vandiver ca971ebc59 puppet: Remove empty zulip_ops class. 2020-10-22 11:30:53 -07:00
Alex Vandiver 16af05758d puppet: Move zulip_org into zulip_ops.
This class is not of general interest.
2020-10-22 11:30:53 -07:00
Alex Vandiver ad566c491d puppet: Drop now-unused zulip_ops::git class. 2020-10-22 11:30:53 -07:00
Alex Vandiver 50e9e2ed20 puppet: Make zulip::base include zulip::apt_repository.
There was likely more dependency complexity prior to 97766102df, but
there is now no reason to require that consumers explicitly include
zulip::apt_repository.
2020-10-22 11:30:53 -07:00
Alex Vandiver 2dc6d26ec6 puppet: Fix included monitoring class name. 2020-10-19 22:30:20 -07:00
Alex Vandiver 7a1132d605 puppet: Switch golang and smokescreen to use /srv.
/srv and /opt have very similar usages; but we should be internally
consistent.

Move these two (the only usages of /opt) to match the rest in /srv.
2020-10-16 13:00:06 -07:00
Alex Vandiver fffea9612b puppet: Add an outgoing HTTP/HTTPS proxy server.
Use https://github.com/stripe/smokescreen to provide a server for an
outgoing proxy, run under supervisor.  This will allow centralized
blocking of internal metadata IPs, localhost, and so forth, as well as
providing default request timeouts (10s by default).
2020-10-15 15:18:35 -07:00
Alex Vandiver f61ac4a28d puppet: Move frontend monitoring into its own file.
This allows it to be pulled in for deploys like czo, which don't use
the full `zulip_ops::app_frontend`, but we wish to monitor.
2020-10-13 17:37:32 -07:00
Alex Vandiver c8df9a150e puppet: Drop all log2zulip configuration.
Disabled on webservers in 047817b6b0, it has since lingered in
configuration, as well as running (to no effect) every minute on the
loadbalancer.

Remove the vestiges of its configuration.
2020-10-13 11:00:50 -07:00
Alex Vandiver b431b1b021 puppet: Remove misleading motd.
This banner shows on lb1, advertising itself as lb0.  There is no
compelling reason for a custom motd, especially one which needs to
be reconfigured for each host.
2020-10-13 11:00:36 -07:00
Alex Vandiver 4fd7df4e8c puppet: Remove absent of check-apns-tokens.
This was marked as ensure absent in d02101a401, in v1.7.0 in 2017.
2020-09-29 18:17:08 -07:00
Alex Vandiver 872a349508 puppet: Remove absent of log2zulip.
This was marked as ensure absent in 047817b6b0, in v2.0.0 in 2018.
2020-09-29 18:17:08 -07:00
Alex Vandiver 57d88eedd8 puppet: Only install rabbitmq cron jobs via zulip_ops.
The rabbitmq cron jobs exist in order to call rabbitmqctl as root and
write the output to files that nagios can consume, since nagios is not
allowed to run rabbitmqctl.

In systems which do not have nagios configured, these every-minute
cron jobs add non-insignificant load, to no effect.  Move their
installation into `zulip_ops`.  In doing so, also combine the cron.d
files into a single file; this allows us to `ensure => absent` the old
filenames, removing them from existing systems.  Leave the resulting
combined cron.d file in `zulip`, since it is still of general utility
and note.
2020-09-29 17:44:44 -07:00
Tim Abbott 5a1243db3c puppet: Use correct scope for zulip_ops::munin_plugin. 2020-07-15 21:49:45 -07:00
Alex Vandiver 48c3c33d10 puppet: Fully-qualify the munin-plugin name 2020-07-14 17:58:51 -07:00
Alex Vandiver 6c27f07c1d puppet: Move PostgreSQL backups to their own class.
wal-g was used in `puppet/zulip` by env-wal-g, but only installed in
`puppet/zulip_ops`.

Merge all of the dependencies of doing backups using wal-g (wal-g
installation, the pg_backup_and_purge job, the nagios plugin that
verifies it happens) into a common base class in `puppet/zulip`, since
it is generally useful.
2020-07-14 00:40:25 -07:00
Anders Kaseorg 15483c09cb puppet: Add missing trailing commas.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-07-13 15:36:06 -07:00
Alex Vandiver 3691a94efe puppet: Configure munin and nagios under apache with puppet.
This swaps in the actually-in-use munin configuration file;
otherwise, it is an implementation of the configuration as it exists
on the machine.
2020-07-13 13:23:11 -07:00
Alex Vandiver 4e42164b4a munin: Add plugins to prod hosts. 2020-07-13 13:23:11 -07:00
Alex Vandiver 2a14212b27 munin: Add a helper resource definition for munin plugins. 2020-07-13 12:49:28 -07:00
Alex Vandiver eda2c4b8e2 puppet: Split munin-node from munin-server.
No plugins are installed inside the /usr/local/munin/lib this creates
in munin-node, nor are they symlinked into /etc/munin/plugins, so
no non-default plugins are added by this.
2020-07-13 12:49:28 -07:00
Alex Vandiver 8cff27f67d puppet: Pull hosts from zulip.conf, not hardcoded list.
The one complexity is that hosts_fullstack are treated differently, as
they are not currently found in the manual `hosts` list, and as such
do not get munin monitoring.
2020-07-10 00:14:09 -07:00
Alex Vandiver 24383a5082 puppet: Rename hosts_domain so hosts_prefix can be grepped for. 2020-07-10 00:14:09 -07:00
Anders Kaseorg 9900298315 zthumbor: Remove Python 2 residue.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-07-06 18:44:58 -07:00
Alex Vandiver a21a086f5c puppet: nagios-plugins-basic is replaced by monitoring-plugins-basic.
In Bionic, nagios-plugins-basic is a transitional package which
depends on monitoring-plugins-basic.  In Focal, it is a virtual
package, which means that every time puppet runs, it tries to
re-install the nagios-plugins-basic package.

Switch all instances to referring to `$zulip::common::nagios_plugins`,
and repoint that to monitoring-plugins-basic.
2020-06-29 14:58:01 -07:00
Alex Vandiver 876ee4a8ed installer: Remove code specific to stretch or xenial.
Support for Xenial and Stretch was removed (5154ddafca, 0f4b1076ad,
8944e0ad53, 79acd5ae40, 1219a2e854), but not all codepaths were
updated to remove their conditionals on it.

Remove all code predicated on Xenial or Stretch.  debathena support
was migrated to Bionic, since that appears to be the current state of
existing debathena servers.
2020-06-24 12:57:38 -07:00
Alex Vandiver 7250d41bf7 puppet: Fix the path to install-wal-g 2020-06-17 15:23:18 -07:00
Tim Abbott 26396c5e25 puppet: Fix exceptions with multiple certbot declarations.
Since 9e8f1aacb3, zulip_ops machines
might have two Package declarations for `certbot`, which doesn't work
in puppet.

The fix is, as usual, to use our `zulip::safepackage` wrapper instead.
2020-06-15 18:21:33 -07:00
Alex Vandiver f8fc3a16eb puppet: Use "primary" / "replica" consistently in comments.
The style guide for Zulip is to always use "primary" and "replica"
when describing database replication.  Adjust a few comments under
`puppet/` that do not adhere to this.

Unfortunately, some references still remain to the insensitive and
inaccurate "master" / "slave" terminology.  However, these are only in
files which we are attempting to preserve as close to the upstream
versions they are derived from (e.g. postgresql.conf,
postfix/master.cf).
2020-06-15 16:18:07 -07:00
Alex Vandiver 97b9308781 puppet: Merge multiple postgres roles in `zulip_ops`.
All differences between the primary and replica roles having been
merged, fold the `postgres_common`, `postgres_master`, and
`postgres_slave` roles into just `postgres_appdb`.
2020-06-12 14:57:46 -07:00
Alex Vandiver 55bd31721d puppet: Remove custom `vm.dirty_ratio` and `vm.dirty_background_ratio`.
These values differed between the primary and secondary database
hosts, for unclear reasons.  The differences date back to their
introduction in 387f63deaa.  As the comment in the replica
configuration notes, settings of `vm.dirty_ratio = 10` and
`vm.dirty_background_ratio = 5` matched the kernel defaults for
"newer" kernels; however, kernel 2.6.30 bumped those to 20 and 10,
respectively[1], as a fix for underlying logic now being more correct.

Remove these overrides; they should at very least be consistent across
roles, and the previous values look to be an attempt to tune for a
very much older version of the Linux kernel, which was using a
different, buggier algorithm under the hood.

[1] 1b5e62b42b
2020-06-12 14:57:46 -07:00
Alex Vandiver f39816e768 puppet: Stop distributing recovery.conf file.
This file controls streaming replication, and recovery using wal-g on
the secondary.  The `primary_conninfo` data needs to change on short
notice when database failover happens, in a way that is not suitable
for being controlled by puppet.

PostgreSQL 12, in fact, removes the use of the `recovery.conf` file[1];
the `primary_conninfo` and `restore_command` information goes into the
main `postgresql.conf` file, and the standby status is controlled by
the presence or absence of an empty `standby.signal` file.

Remove the puppet control of the `recovery.conf` file.

[1] https://pgstef.github.io/2018/11/26/postgresql12_preview_recovery_conf_disappears.html
2020-06-12 14:57:46 -07:00
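The PostgreSQL 12 behavior described in the commit above (standby status decided by whether a `standby.signal` file is present) can be illustrated with a minimal sketch. This is not code from the repository: the function name `is_standby` and the idea of passing the data directory as an argument are assumptions made for illustration.

```python
from pathlib import Path


def is_standby(data_dir):
    """Report whether a PostgreSQL 12+ data directory is marked as a standby.

    Hypothetical helper: it only checks for the (empty) standby.signal
    marker file; it does not inspect postgresql.conf or contact the server.
    """
    return (Path(data_dir) / "standby.signal").exists()
```

Because the marker is a plain file, a failover tool could create or remove it without any configuration management involved, which fits the commit's reasoning for taking this state out of puppet.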
Alex Vandiver 316498a169 puppet: Remove unnecessary nagios authentication setup.
Since the nagios authentication is stored _in the database_, it is
unnecessary to run if the database is simply a replica of the
production database.  The only case in which this statement would have
an effect is if the postgres node contains a _different_ (or empty)
database, which `setup_disks` now effectively prevents.

Remove the unnecessary step.
2020-06-11 21:01:49 -07:00
Alex Vandiver 0774f54c1b puppet: Move `setup_disks` to postgres_common.
The tooling should now be run no matter if the node is a primary or
replica.
2020-06-11 21:01:49 -07:00
Alex Vandiver 6f6a0e890a puppet: Run setup_disks based on symlink; remove mdadm dependency.
481613a344 updated the `setup_disks` script to no longer reference
`mdadm`, since we no longer set up RAID on servers.

Update the puppet that would call it to remove the `mdadm` dependency,
and run only if the state is not what it produces -- namely, a symlink
for `/var/lib/postgresql`, which must point to an existent
`/srv/postgresql` directory.
2020-06-11 21:01:49 -07:00
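The guard condition described in the commit above (run `setup_disks` only when `/var/lib/postgresql` is not already a symlink to an existing `/srv/postgresql` directory) could be expressed roughly as follows. This is a sketch of the check only, not the puppet resource or the script from the repository; the function name `disks_already_set_up` and its parameters are hypothetical.

```python
import os


def disks_already_set_up(link="/var/lib/postgresql", target="/srv/postgresql"):
    """Return True when `link` is a symlink resolving to the existing directory `target`.

    Hypothetical helper: a caller would skip the disk setup step when this
    returns True, and run it otherwise.
    """
    if not os.path.islink(link):
        # A real directory, a regular file, or a missing path all mean the
        # expected end state has not been reached.
        return False
    if not os.path.isdir(target):
        # A dangling symlink, or one pointing at a non-directory, does not count.
        return False
    return os.path.realpath(link) == os.path.realpath(target)
```

Checking the end state rather than a dependency such as `mdadm` keeps the step idempotent: it becomes a no-op once the symlink is in place.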
Alex Vandiver 16c4cea951 puppet: Pull postgres config directory into postgres_appdb_base.
As with the previous commit, this is currently only used in tuning, but is
a property of the whole postgres configuration; move it there, as just
the directory, not the file.

Use this directory consistently in the erb templates.  Since we
produce a `pg_hba.conf`, it makes sense that we point to the path that we
know that we explicitly wrote to, for instance.
2020-06-11 20:56:55 -07:00
Alex Vandiver 4fe0444108 puppet: Install wal-g, not wal-e. 2020-06-11 15:52:43 -07:00