zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	15483c09cb	puppet: Add missing trailing commas. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-13 15:36:06 -07:00
Alex Vandiver	3691a94efe	puppet: Configure munin and nagios under apache with puppet. This swaps in the actually-in-use munin configuration file; otherwise, it is an implementation of the configuration as it exists on the machine.	2020-07-13 13:23:11 -07:00
Alex Vandiver	4e42164b4a	munin: Add plugins to prod hosts.	2020-07-13 13:23:11 -07:00
Alex Vandiver	2a14212b27	munin: Add a helper resource definition for munin plugins.	2020-07-13 12:49:28 -07:00
Alex Vandiver	7c7b5fcd6f	munin: Deal with spaces in the channel names.	2020-07-13 12:49:28 -07:00
Alex Vandiver	eda2c4b8e2	puppet: Split munin-node from munin-server. No plugins are installed inside the /usr/local/munin/lib this creates in munin-node, nor are they symlinked into /etc/munin/plugins, so no non-default plugins are added by this.	2020-07-13 12:49:28 -07:00
Alex Vandiver	ddc7bb5a45	munin: Fix the path to check_send_receive_time.	2020-07-13 12:49:28 -07:00
Alex Vandiver	8be544e7eb	munin: Rename monitoring plugin to use zulip name, not humbug.	2020-07-13 12:49:28 -07:00
Alex Vandiver	1b3560af94	nagios: Stop assuming /api is where zulip client is. The api/ directory was removed in f9ba3cb60c; as that commit notes, we use the python-zulip-api module for that, added in `938597c5da`.	2020-07-13 12:49:28 -07:00
Mateusz Mandera	57d3ef42b8	puppet: Don't run thumbor services in production. Fixes #15649. Currently, no production services use thumbor; so, it makes sense to not run them in production systems.	2020-07-10 14:22:17 -07:00
Alex Vandiver	f0f29584aa	puppet: Add an arity count ("at least two") to zulipconf function.	2020-07-10 00:14:09 -07:00
Alex Vandiver	8cff27f67d	puppet: Pull hosts from zulip.conf, not hardcoded list. The one complexity is that hosts_fullstack are treated differently, as they are not currently found in the manual `hosts` list, and as such do not get munin monitoring.	2020-07-10 00:14:09 -07:00
Alex Vandiver	24383a5082	puppet: Rename hosts_domain so hosts_prefix can be grepped for.	2020-07-10 00:14:09 -07:00
Alex Vandiver	a4e7c7a27e	nagios: Remove check_memcached. check_memcached does not support memcached authentication even in its latest release (it’s in a TODO item comment, and that’s it), and was never particularly useful.	2020-07-10 00:12:48 -07:00
Anders Kaseorg	ebf7f4d0f6	zthumbor: Rename thumbor.conf to thumbor_settings.py. So we can apply all our lint checks to it. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Anders Kaseorg	9900298315	zthumbor: Remove Python 2 residue. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Alex Vandiver	17002f2a0e	puppet: Allow passing an alternate config path to zulip-puppet-apply. When temporary configuration changes are desired, this lets one set up an alternate `zulip.conf` to apply while leaving the true one in place.	2020-07-06 18:30:16 -07:00
Alex Vandiver	64b44a12f5	puppet: Add an exec rule to reload the whole supervisor config. When supervisor is first installed, it is started automatically, and creates the socket, owned by root. Subsequent reconfiguration in puppet only calls `reread + update`, which is insufficient to apply the `chown = zulip:zulip` line in `supervisord.conf`, leaving the socket owned by `root` and the last part of the installation unable to restart `supervisor` services as the `zulip` user. The `chown` line in `scripts/lib/install` exists to paper over this. Add a separate exec target for changes to `supervisord.conf` itself, which restarts the full service. This leaves the default `restart` action on the service for the lightweight `reread + update` action, which is more common. We use `systemctl` only on redhat-esque builds, because CI runs Ubuntu, but init is not systemd in that context. `systemctl reload` is sufficient to re-apply the socket ownership, but a full `restart` and not `reload` is necessary under `/etc/init.d/supervisor`.	2020-07-01 10:40:54 -07:00
Alex Vandiver	dd91f8edba	puppet: Move supervisor start command into zulip::common. Move this command alongside the rest of the distro-dependent supervisor paths.	2020-07-01 10:40:53 -07:00
Alex Vandiver	a5d63cfedf	wal-g: Update pg_backup_and_purge for wal-g format. wal-g has a slightly different format than wal-e in its `backup-list` output; it only contains three columns (`name`, `last_modified`, `wal_segment_backup_start`) rather than wal-e's plethora, most of which were blank (`name`, `last_modified`, `expanded_size_bytes`, `wal_segment_backup_start`, `wal_segment_offset_backup_start`, `wal_segment_backup_stop`, `wal_segment_offset_backup_stop`). Remove one argument from the split.	2020-06-29 17:17:26 -07:00
Alex Vandiver	a21a086f5c	puppet: nagios-plugins-basic is replaced by monitoring-plugins-basic. In Bionic, nagios-plugins-basic is a transitional package which depends on monitoring-plugins-basic. In Focal, it is a virtual package, which means that every time puppet runs, it tries to re-install the nagios-plugins-basic package. Switch all instances to referring to `$zulip::common::nagios_plugins`, and repoint that to monitoring-plugins-basic.	2020-06-29 14:58:01 -07:00
Alex Vandiver	6fdcb4aa17	puppet: Move supervisor conf file path into zulip::common. Move this config file alongside the rest of the distro-dependent paths.	2020-06-29 13:41:05 -07:00
Alex Vandiver	93401448b9	puppet: Explain value of reload && update trick for supervisor. While the stock reload works just fine, it causes too much disruption.	2020-06-29 13:39:09 -07:00
Alex Vandiver	d2de5aced8	puppet: Remove unnecessary supervisor service name variable.	2020-06-29 13:39:09 -07:00
Alex Vandiver	73805f8279	puppet: Stop removing file that contains only comments. In modern PostgreSQL, this file, provided by `postgresql-common`, has no non-comment, non-blank lines. There's hence no reason to remove it.	2020-06-29 13:37:42 -07:00
Alex Vandiver	6e3a424921	puppet: Install the latest postgresql-client on frontend hosts. Frontend hosts in multiple-host configurations (including docker hosts) need a `psql` binary installed. `ca9d27175b` switched to not setting `postgresql.version` in `zulip.conf`, which in turn means that `$zulip::base::postgres_version` is unset. This, in turn, led to the frontend hosts installing `postgresql-client-`, whose trailing dash causes apt to _uninstall_ that package. Unconditionally install `postgresql-client` with no explicit version attached. This is a metapackage which depends on the latest client package, which currently means it will install `postgresql-client-12`. On single-host installs which have configured `postgresql.version` in `zulip.conf` to be a lower version, this will result in `postgresql-client-12` existing alongside another version (e.g. `postgresql-client-10`); `psql` will give the most recent. This is acceptable because the semantic meaning of the postgresql version in `zulip.conf` is about the database engine itself, not the command-line client.	2020-06-29 13:37:16 -07:00
Alex Vandiver	2c36bb19b2	puppet: Pull out `unzip` package which is identical in both cases.	2020-06-29 13:37:16 -07:00
Alex Vandiver	876ee4a8ed	installer: Remove code specific to stretch or xenial. Support for Xenial and Stretch was removed (`5154ddafca`, `0f4b1076ad`, `8944e0ad53`, `79acd5ae40`, `1219a2e854`), but not all codepaths were updated to remove their conditionals on it. Remove all code predicated on Xenial or Stretch. debathena support was migrated to Bionic, since that appears to be the current state of existing debathena servers.	2020-06-24 12:57:38 -07:00
Anders Kaseorg	a9e59b6bd3	memcached: Change the default MEMCACHED_USERNAME to zulip@localhost. This prevents memcached from automatically appending the hostname to the username, which was a source of problems on servers where the hostname was changed. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-19 21:22:30 -07:00
Alex Vandiver	7250d41bf7	puppet: Fix the path to install-wal-g	2020-06-17 15:23:18 -07:00
Alex Vandiver	03bffd3938	upgrade-zulip: Pin the postgres version to the OS default. We would prefer to use the postgres packages from Postgres themselves, if available. However, this requires ensuring that, for existing installs, we preserve the same version of postgres as their base distribution installed. Move the version-determination logic from being computed at puppet interpolation time, to being computed at install time and pinned into zulip.conf.	2020-06-16 17:05:46 -07:00
Tim Abbott	26396c5e25	puppet: Fix exceptions with multiple certbot declarations. Since `9e8f1aacb3`, zulip_ops machines might have two Package declarations for `certbot`, which doesn't work in puppet. The fix is, as usual, to use our `zulip::safepackage` wrapper instead.	2020-06-15 18:21:33 -07:00
Alex Vandiver	bff3b540b1	puppet: Postgres replication should always switch to latest timeline. Omission of this setting makes resuming after a primary switchover difficult-to-impossible. It is the default in PostgreSQL 12.	2020-06-15 16:18:07 -07:00
Alex Vandiver	f8fc3a16eb	puppet: Use "primary" / "replica" consistently in comments. The style guide for Zulip is to always use "primary" and "replica" when describing database replication. Adjust a few comments under `puppet/` that do not adhere to this. Unfortunately, some references still remain to the insensitive and inaccurate "master" / "slave" terminology. However, these are only in files which we are attempting to preserve as close as possible to the upstream versions they are derived from (e.g. postgresql.conf, postfix/master.cf).	2020-06-15 16:18:07 -07:00
Alex Vandiver	5f433d6eeb	puppet: Remove vestigial check_postgres.pl. `65774e1c4f` switched from using the bundled check_postgres.pl to using the version from packages; the file itself remained, however. Remove it, and clean up references to it. Fixes #15389.	2020-06-15 16:18:07 -07:00
Alex Vandiver	7d4a370a57	puppet: Move monitoring of pg replication to the pg hosts. Instead of SSH'ing around to them, run directly on the database hosts. This means that the replicas do not know how many bytes behind they are in _receiving_ the WAL logs; thus, the monitoring also extends to the primary database, which knows that information for each replica. This also allows for detecting when there are too few active replicas.	2020-06-15 16:18:07 -07:00
Anders Kaseorg	5dc9b55c43	python: Manually convert more percent-formatting to f-strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	74c17bf94a	python: Convert more percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Now including %d, %i, %u, and multi-line strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	1ed2d9b4a0	logging: Use logging.exception and exc_info for unexpected exceptions. logging.exception() and logging.debug(exc_info=True), etc. automatically include a traceback. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Tim Abbott	80589099d8	puppet: Fix typo in logic for whether to install certbot. Fixes #15372.	2020-06-14 16:04:39 -07:00
rht	89af2f381d	puppet: Link postgres dict symlinks to hunspell files on CentOS. This is a temporary measure until we can find the directory of postgresql dicts on CentOS.	2020-06-13 17:53:38 -07:00
rht	36a5ca5015	puppet: Add cyrus-sasl to memcached_packages on RedHat. This is to mirror the sasl2-bin package on Debian.	2020-06-13 17:49:51 -07:00
rht	e776d2d159	puppet: Abstract out owner:group of memcached-sasldb2.	2020-06-13 17:49:51 -07:00
Anders Kaseorg	91a86c24f5	python: Replace None defaults with empty collections where appropriate. Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦ AbstractSet) to guard against accidental mutation of the default value. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-13 15:31:27 -07:00
Alex Vandiver	97b9308781	puppet: Merge multiple postgres roles in `zulip_ops`. All differences between the primary and replica roles having been merged, fold the `postgres_common`, `postgres_master`, and `postgres_slave` roles into just `postgres_appdb`.	2020-06-12 14:57:46 -07:00
Alex Vandiver	55bd31721d	puppet: Remove custom `vm.dirty_ratio` and `vm.dirty_background_ratio`. These values differed between the primary and secondary database hosts, for unclear reasons. The differences date back to their introduction in `387f63deaa`. As the comment in the replica configuration notes, settings of `vm.dirty_ratio = 10` and `vm.dirty_background_ratio = 5` matched the kernel defaults for "newer" kernels; however, kernel 2.6.30 bumped those to 20 and 10, respectively[1], as a fix for underlying logic now being more correct. Remove these overrides; they should at the very least be consistent across roles, and the previous values look to be an attempt to tune for a very much older version of the Linux kernel, which was using a different, buggier, algorithm under the hood. [1] `1b5e62b42b`	2020-06-12 14:57:46 -07:00
Alex Vandiver	f39816e768	puppet: Stop distributing recovery.conf file. This file controls streaming replication, and recovery using wal-g on the secondary. The `primary_conninfo` data needs to change on short notice when database failover happens, in a way that is not suitable for being controlled by puppet. PostgreSQL 12, in fact, removes the use of the `recovery.conf` file[1]; the `primary_conninfo` and `restore_command` information goes into the main `postgresql.conf` file, and the standby status is controlled by the presence or absence of an empty `standby.signal` file. Remove the puppet control of the `recovery.conf` file. [1] https://pgstef.github.io/2018/11/26/postgresql12_preview_recovery_conf_disappears.html	2020-06-12 14:57:46 -07:00
Alex Vandiver	316498a169	puppet: Remove unnecessary nagios authentication setup. Since the nagios authentication is stored _in the database_, it is unnecessary to run if the database is simply a replica of the production database. The only case in which this statement would have an effect is if the postgres node contains a _different_ (or empty) database, which `setup_disks` now effectively prevents. Remove the unnecessary step.	2020-06-11 21:01:49 -07:00
Alex Vandiver	0774f54c1b	puppet: Move `setup_disks` to postgres_common. The tooling should now be run no matter if the node is a primary or replica.	2020-06-11 21:01:49 -07:00
Alex Vandiver	6f6a0e890a	puppet: Run setup_disks based on symlink; remove mdadm dependency. `481613a344` updated the `setup_disks` script to no longer reference `mdadm`, since we no longer set up RAID on servers. Update the puppet that would call it to remove the `mdadm` dependency, and run only if the state is not what it produces -- namely, a symlink for `/var/lib/postgresql`, which must point to an existent `/srv/postgresql` directory.	2020-06-11 21:01:49 -07:00

1000 Commits