zulip

Commit Graph

Author	SHA1	Message	Date
Alex Vandiver	9799a03d79	puppet: Expose Smokescreen prometheus metrics on :9810.	2023-07-13 11:47:34 -07:00
Alex Vandiver	3aba2789d3	prometheus: Add an exporter for wal-g backup properties. Since backups may now be taken on arbitrary hosts, we need a blackbox monitor that _some_ backup was produced. Add a Prometheus exporter which calls `wal-g backup-list` and reports statistics about the backups. This could be extended to include `wal-g wal-verify`, but that requires a connection to the PostgreSQL server.	2023-04-26 15:41:39 -07:00
Alex Vandiver	cace8858f9	puppet: Move logrotate config into app_frontend_base. `7c023042cf` moved the logrotate configuration to being a templated file, from a static file, but missed that the static file was still referenced from `zulip_ops::app_frontend`; it only updated `zulip::profile::app_frontend`. This caused errors in applying puppet on any `zulip_ops::app_frontend` host. Prior to `7c023042cf`, the Puppet role was identical between those two classes; deduplicate the rule by moving the updated template definition into `zulip::app_frontend_base` which is common to those two classes and not used in any other classes.	2023-04-19 09:34:37 -07:00
Alex Vandiver	d0fc3f1c2e	puppet: Add prod hooks to push zulip-cloud-current and notify CZO.	2023-04-12 11:36:33 -07:00
Tim Abbott	561daee2a1	puppet: Update declared zmirror dependencies. Following zulip/python-zulip-api/pull/758/, we're no longer using python-zephyr, and don't need to build it from source. Additionally, we no longer need to build a forked Zephyr package, since ZLoadSession and ZDumpSession were merged in `e6a545e759`.	2023-04-06 09:45:06 -07:00
Alex Vandiver	6975417acf	puppet: Create zmirror supervisor subdirectory. To not change the `supervisor.conf` file, which requires a restart of supervisor (and thus all services running under it, which is extremely disruptive) we carefully leave the contents unchanged for most installs, and append a new piece to the file, only for the zmirror configuration, using `concat`.	2023-04-06 09:45:06 -07:00
Alex Vandiver	8a771c7ac0	hooks: Add a hook to send a Zulip before/after the deploy.	2023-04-05 18:51:55 -04:00
Alex Vandiver	89e366771a	prometheus: Add a postgres exporter.	2023-03-30 16:16:18 -07:00
Alex Vandiver	c2beb64a79	prometheus: Consistently import the base class and supervisor, if needed.	2023-03-30 16:16:18 -07:00
Alex Vandiver	f2a20b56bc	puppet: Enable sentry hooks for production and staging.	2023-03-17 08:10:31 -07:00
Alex Vandiver	1a65315566	puppet: Switch teleport to running under systemd, not supervisord. There is no reason that the base node access method should be run under supervisor, which exists primarily to give access to the `zulip` user to restart its managed services. This access is unnecessary for Teleport, and also causes unwanted restarts of Teleport services when the `supervisor` base configuration changes. Additionally, supervisor does not support the in-place upgrade process that Teleport uses, as it replaces its core process with a new one. Switch to installing a systemd configuration file (as generated by `teleport install systemd`) for each part of Teleport, customized to pass a `--config` path. As such, we explicitly disable the `teleport` service provided by the package. The supervisor process is shut down by dint of no longer installing the file, which purges it from the managed directory, and reloads Supervisor to pick up the removed service.	2023-03-15 17:23:42 -04:00
Alex Vandiver	044ccdb334	chat.zulip.org: Enable Sentry hook.	2023-02-14 17:20:35 -05:00
Alex Vandiver	e8123dfeea	puppet: Match the `x` bits on directories to what puppet actually does. Puppet _always_ sets the `+x` bit on directories if they have the `r` bit set for that slot[^1]: > When specifying numeric permissions for directories, Puppet sets the > search permission wherever the read permission is set. As such, for instance, `0640` is actually applied as `0750`. Fix what we "want" to match what puppet is applying, by adding the `x` bit. In none of these cases did we actually intend the directory to not be executable. [1] https://www.puppet.com/docs/puppet/5.5/types/file.html#file-attribute-mode	2023-01-26 15:06:01 -08:00
Alex Vandiver	d0de66b273	puppet: Remove "ensure => absent" rules which have all been applied.	2023-01-24 13:05:24 -08:00
Alex Vandiver	42f84a8cc7	puppet: Use existing autossh tunnels as OpenSSH "master" sockets. A number of autossh connections are already left open for port-forwarding Munin ports; autossh starts the connections and ensures that they are automatically restarted if they are severed. However, this represents a missed opportunity. Nagios's monitoring uses a large number of SSH connections to the remote hosts to run commands on them; each of these connections requires doing a complete SSH handshake and authentication, which can have non-trivial network latency, particularly for hosts which may be located far away, in a network topology sense (up to 1s for a no-op command!). Use OpenSSH's ability to multiplex multiple connections over a single socket, to reuse the already-established connection. We leave an explicit `ControlMaster no` in the general configuration, and not `auto`, as we do not wish any of the short-lived Nagios connections to get promoted to being a control socket if the autossh is not running for some reason. We enable protocol-level keepalives, to give a better chance of the socket being kept open.	2022-11-01 22:24:40 -07:00
Alex Vandiver	9bd88a93e2	puppet: Tell needrestart to not default to restarting core services. The `needrestart` tool added in 22.04 is useful in terms of listing which services may need to be restarted to pick up updated libraries. However, it prompts about the current state of services needing restart for every subsequent `apt-get upgrade`, and defaulting core services to restarting requires carefully manually excluding them every time, at risk of causing an unscheduled outage. Build a list of default-off services based on the list in unattended-upgrades.	2022-07-19 17:51:18 -07:00
Alex Vandiver	8bc26aab08	nagios: Switch check_user_zephyr_mirror_liveness to run via cron. This check loads Django, and as such must be run as the zulip user. Repeat the same pattern used elsewhere in nagios, of writing a state file, which is read by `check_cron_file`.	2022-06-22 12:07:38 -07:00
Alex Vandiver	775a084d0f	nagios: Add a catchall "other" set.	2022-06-22 12:07:38 -07:00
Alex Vandiver	33472ee9ff	nagios: Remove unused stats host set.	2022-06-22 12:07:38 -07:00
Alex Vandiver	7f6a77da31	puppet: Add a redis exporter.	2022-05-03 17:13:44 -07:00
Anders Kaseorg	e9ba9b0e0d	zulip-ec2-configure-interfaces: Remove. Our current EC2 systems don’t have an interface named ‘eth0’, and if they did, this script would do nothing but crash with ImportError because we have never installed boto.utils for Python 3. (The message of commit `2a4d851a7c` made an effort to document for future researchers why this script should not have been blindly converted to Python 3. However, commit `2dc6d09c2a` (#14278) was evidently unresearched and untested.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-05-03 02:25:59 -07:00
Anders Kaseorg	646a4d19a3	puppet: Remove quotes for enumerable values. https://puppet.com/docs/puppet/7/style_guide.html#style_guide_module_design-quoting “If a string is a value from an enumerable set of options, such as present and absent, it SHOULD NOT be enclosed in quotes at all.” Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-29 22:06:46 -07:00
Alex Vandiver	35db1ee435	puppet: Only include "app_service" section if there are apps. This works around gravitational/teleport#12256, but also produces config files that are slightly cleaner.	2022-04-26 16:36:13 -07:00
Alex Vandiver	f6d27562fa	puppet: Configure chrony to use AWS-local NTP sources. This prevents hosts from spewing traffic to random hosts across the Internet.	2022-03-25 17:07:53 -07:00
Alex Vandiver	1bd5723cd2	puppet: Add a prometheus monitor for tornado processes.	2022-03-20 16:12:11 -07:00
Alex Vandiver	6b91652d9a	puppet: Open the grok_exporter port. The complete grok_exporter configuration is not ready to be committed, but this at least prepares the way for it.	2022-03-20 16:12:11 -07:00
Alex Vandiver	6558655fc6	puppet: Add rabbitmq prometheus plugin, and open the firewall.	2022-03-20 16:12:11 -07:00
Alex Vandiver	bdd2f35d05	puppet: Switch czo to using zulip_ops::app_frontend_monitoring. This was clearly intended in `f61ac4a28d` but never executed.	2022-03-20 16:12:11 -07:00
Alex Vandiver	17699bea44	puppet: postgresql_backups is auto-included if s3_backups_bucket is set. Since `6496d43148`.	2022-03-20 16:12:11 -07:00
Alex Vandiver	bedc7c2986	puppet: Smokescreen is now auto-included in standalone. Since `c33562f0a8`.	2022-03-20 16:12:11 -07:00
Alex Vandiver	788daa953b	puppet: Factor out $::architecture case statement for golang.	2022-02-15 12:04:37 -08:00
Alex Vandiver	e032b38661	puppet: Fix typo in uwsgi exporter dependency.	2022-02-08 15:17:17 -08:00
Alex Vandiver	3bbe5c1110	puppet: Put comments on iptables lines. In addition to documenting the rules.v4 and rules.v6 files slightly, these comments show up in `iptables -L`: ``` root@hostname:~# iptables -L INPUT Chain INPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere LOGDROP all -- anywhere localhost/8 ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED ACCEPT tcp -- anywhere anywhere tcp dpt:ssh /* ssh */ ACCEPT tcp -- anywhere anywhere tcp dpt:3000 /* grafana */ ACCEPT tcp -- anywhere anywhere tcp dpt:9100 /* node_exporter */ LOGDROP all -- anywhere anywhere ```	2022-01-21 16:46:14 -08:00
Alex Vandiver	6bc5849ea8	puppet: Remove now-unused debathena apt repository.	2022-01-18 14:13:28 -08:00
Alex Vandiver	b3f07cc98d	puppet: Replace debathena zephyr package with equivalent puppet file.	2022-01-18 14:13:28 -08:00
Alex Vandiver	a6d7539571	puppet: Replace debathena krb5 package with equivalent puppet file.	2022-01-18 14:13:28 -08:00
Alex Vandiver	75224ea5de	puppet: python-dev is now purely virtual; install python2.7-dev.	2022-01-18 14:13:28 -08:00
Alex Vandiver	0b8a6a51b8	puppet: Remove all parts of AWS kernels. Otherwise, we just uninstall the meta-package, and still restart into the installed AWS kernel.	2022-01-12 15:52:19 -08:00
Alex Vandiver	1e80b844f4	puppet: Disable apparmor profile for msmtp. As the nagios user, we want to read the msmtp configuration from ~nagios, which apparmor's profile does not allow msmtp to do.	2022-01-11 09:38:31 -08:00
Alex Vandiver	3c95ad82c6	puppet: Upgrade to nagios4. This updates the puppeted nagios configuration file for the Nagios4 defaults.	2022-01-11 09:38:31 -08:00
Alex Vandiver	4a95967a33	puppet: Gather uwsgi stats from chat.zulip.org.	2022-01-03 21:26:57 -08:00
Alex Vandiver	8a5be972d2	puppet: Add a uwsgi exporter for monitoring. This allows investigation of how many workers are busy, and to track "harakiri" terminations.	2022-01-03 15:25:58 -08:00
Anders Kaseorg	82748d45d8	install-yarn: Use test -ef in case /srv is a symlink. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-12-30 13:42:07 -08:00
Alex Vandiver	c094867a74	puppet: Add aarch64 build hashes to external dependencies. wal-g does not ship aarch64 binaries, currently; the compilation process([1]) is somewhat complicated, so we defer the decision about how to support wal-g for aarch64 until a later date. [1]: https://github.com/wal-g/wal-g/blob/master/docs/PostgreSQL.md#installing	2021-12-29 16:35:15 -08:00
Alex Vandiver	f166f9f7d6	puppet: Centralize versions and sha256 hashes of external dependencies. This will make it easier to update versions of these dependencies.	2021-12-29 16:35:15 -08:00
Alex Vandiver	57662689a9	puppet: Provide a constant homedir for grafana user. The homedir of a user cannot be changed if any processes are running as them, so having it change over time as upgrades happen will break puppet application, as the old grafana process under supervisor will effectively lock changes to the user's homedir. Unfortunately, that means that this change will thus fail to puppet-apply unless `supervisorctl stop grafana` is run first, but there's no way around that.	2021-12-29 16:35:15 -08:00
Alex Vandiver	6e55e52694	puppet: Pull out grafana $data_dir.	2021-12-29 16:35:15 -08:00
Alex Vandiver	1e4e6a09af	puppet: Stop making resources for external binaries and directories. In the event that extracting doesn't produce the binary we expected it to, all this will do is create an _empty_ file where we expect the binary to be. This will likely muddle debugging. Since the only reason the resource was made in the first place was to make dependencies clear, switch to depending on the External_Dep itself, when such a dependency is needed.	2021-12-29 16:35:15 -08:00
Alex Vandiver	3c163a7d5e	puppet: Move slash out of $dir by convention.	2021-12-29 16:35:15 -08:00
Alex Vandiver	bb5a2c8138	puppet: Move prometheus to external_dep.	2021-12-29 16:35:15 -08:00
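
Commit `e8123dfeea` above relies on the rule that Puppet adds the search (`x`) bit to a directory wherever the read (`r`) bit is set, so that `0640` is applied as `0750`. The following is a minimal sketch of that rule only, not Puppet's own implementation; the function name `effective_dir_mode` is hypothetical and chosen for illustration.

```python
def effective_dir_mode(mode: int) -> int:
    """Return the directory mode after adding the search bit next to each read bit.

    Hypothetical helper illustrating the rule quoted in commit e8123dfeea;
    it does not call or reproduce any Puppet code.
    """
    result = mode
    for shift in (6, 3, 0):  # owner, group, other permission slots
        read_bit = 0o4 << shift
        search_bit = 0o1 << shift
        if mode & read_bit:
            # Read permission is set for this slot, so search permission is added.
            result |= search_bit
    return result


if __name__ == "__main__":
    for requested in (0o640, 0o644, 0o600):
        print(f"{requested:04o} -> {effective_dir_mode(requested):04o}")
```

Under this rule, a manifest that declares a directory with a mode such as `0640` describes a state Puppet never actually applies, which is why the commit changes the declared modes to match.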

241 Commits