Alex Vandiver
1e81775fa0
nagios: Drop unhelpful hostgroup comment.
2022-06-22 12:07:38 -07:00
Alex Vandiver
7b584401ac
nagios: Reformat hostgroups.
2022-06-22 12:07:38 -07:00
Alex Vandiver
93bcb86345
nagios: Reorder service checks.
2022-06-22 12:07:38 -07:00
Alex Vandiver
eaaa2fbff8
nagios: Use canonical "hostgroup_name" consistently.
2022-06-22 12:07:38 -07:00
Alex Vandiver
e8996b53a5
nagios: Remove unused has_swap hostgroup.
2022-06-22 12:07:38 -07:00
Alex Vandiver
33472ee9ff
nagios: Remove unused stats host set.
2022-06-22 12:07:38 -07:00
Alex Vandiver
bc4f4b4862
nagios: Make the pageable/not/flaky tri-state clearer.
2022-06-22 12:07:38 -07:00
Alex Vandiver
c74f195fba
nagios: Split AWS and non-AWS hosts, for ntp checks.
...
The non-AWS hosts cannot use the AWS ntp server for their check.
2022-06-22 12:07:38 -07:00
Alex Vandiver
872efdee58
nagios: Fold single- and multitornado_frontends back into frontends.
...
5abf4dee92 made this distinction, then multitornado_frontends was
never used; the singletornado_frontends alerting worked even for the
multiple-Tornado instances.
Remove the useless and misleading distinction.
2022-06-22 12:07:38 -07:00
Alex Vandiver
3741c1c034
puppet: Switch to checking time against the AWS timeserver.
...
Since this is what chrony is sync'ing to, it lessens the chance of
spurious firings of this alert.
See https://aws.amazon.com/blogs/aws/keeping-time-with-amazon-time-sync-service/
2022-05-31 22:57:32 -07:00
Alex Vandiver
7f6a77da31
puppet: Add a redis exporter.
2022-05-03 17:13:44 -07:00
Anders Kaseorg
e9ba9b0e0d
zulip-ec2-configure-interfaces: Remove.
...
Our current EC2 systems don’t have an interface named ‘eth0’, and if
they did, this script would do nothing but crash with ImportError
because we have never installed boto.utils for Python 3.
(The message of commit 2a4d851a7c made an effort to document for future
researchers why this script should not have been blindly converted to
Python 3. However, commit 2dc6d09c2a (#14278) was evidently
unresearched and untested.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-05-03 02:25:59 -07:00
Anders Kaseorg
646a4d19a3
puppet: Remove quotes for enumerable values.
...
https://puppet.com/docs/puppet/7/style_guide.html#style_guide_module_design-quoting
“If a string is a value from an enumerable set of options, such as
present and absent, it SHOULD NOT be enclosed in quotes at all.”
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-29 22:06:46 -07:00
Alex Vandiver
35db1ee435
puppet: Only include "app_service" section if there are apps.
...
This works around gravitational/teleport#12256, but also produces config
files that are slightly cleaner.
2022-04-26 16:36:13 -07:00
Alex Vandiver
f6d27562fa
puppet: Configure chrony to use AWS-local NTP sources.
...
This prevents hosts from spewing traffic to random hosts across the
Internet.
2022-03-25 17:07:53 -07:00
Alex Vandiver
1bd5723cd2
puppet: Add a prometheus monitor for tornado processes.
2022-03-20 16:12:11 -07:00
Alex Vandiver
6b91652d9a
puppet: Open the grok_exporter port.
...
The complete grok_exporter configuration is not ready to be committed,
but this at least prepares the way for it.
2022-03-20 16:12:11 -07:00
Alex Vandiver
6558655fc6
puppet: Add rabbitmq prometheus plugin, and open the firewall.
2022-03-20 16:12:11 -07:00
Alex Vandiver
bdd2f35d05
puppet: Switch czo to using zulip_ops::app_frontend_monitoring.
...
This was clearly intended in f61ac4a28d but never executed.
2022-03-20 16:12:11 -07:00
Alex Vandiver
17699bea44
puppet: postgresql_backups is auto-included if s3_backups_bucket is set.
...
Since 6496d43148.
2022-03-20 16:12:11 -07:00
Alex Vandiver
bedc7c2986
puppet: Smokescreen is now auto-included in standalone.
...
Since c33562f0a8.
2022-03-20 16:12:11 -07:00
Anders Kaseorg
b3260bd610
docs: Use Debian and Ubuntu version numbers over development codenames.
...
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-02-23 12:04:24 -08:00
Alex Vandiver
788daa953b
puppet: Factor out $::architecture case statement for golang.
2022-02-15 12:04:37 -08:00
Anders Kaseorg
f6a701090c
setup-apt-repos: Don’t install lsb_release.
...
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-02-14 16:38:53 -08:00
Alex Vandiver
e032b38661
puppet: Fix typo in uwsgi exporter dependency.
2022-02-08 15:17:17 -08:00
Alex Vandiver
3bbe5c1110
puppet: Put comments on iptables lines.
...
In addition to documenting the rules.v4 and rules.v6 files slightly,
these comments show up in `iptables -L`:
```
root@hostname:~# iptables -L INPUT
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere
LOGDROP    all  --  anywhere             localhost/8
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh /* ssh */
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:3000 /* grafana */
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:9100 /* node_exporter */
LOGDROP    all  --  anywhere             anywhere
```
2022-01-21 16:46:14 -08:00
Alex Vandiver
6bc5849ea8
puppet: Remove now-unused debathena apt repository.
2022-01-18 14:13:28 -08:00
Alex Vandiver
b3f07cc98d
puppet: Replace debathena zephyr package with equivalent puppet file.
2022-01-18 14:13:28 -08:00
Alex Vandiver
a6d7539571
puppet: Replace debathena krb5 package with equivalent puppet file.
2022-01-18 14:13:28 -08:00
Alex Vandiver
75224ea5de
puppet: python-dev is now purely virtual; install python2.7-dev.
2022-01-18 14:13:28 -08:00
Alex Vandiver
fc1adef28a
puppet: Fix server_name of internal staging server.
2022-01-18 12:36:56 -08:00
Alex Vandiver
7e630b81f8
puppet: Switch to using snakeoil certs for staging.
...
This parallels ba3b88c81b, but for the staging host.
2022-01-18 12:36:56 -08:00
Alex Vandiver
0b8a6a51b8
puppet: Remove all parts of AWS kernels.
...
Otherwise, we just uninstall the meta-package, and still restart into
the installed AWS kernel.
2022-01-12 15:52:19 -08:00
Alex Vandiver
4d7e6b26df
puppet: Provide more attributes to teleport on ssh nodes.
2022-01-12 14:15:45 -08:00
Alex Vandiver
339e70671c
puppet: Switch Grafana to Grafana 8 Unified Alerting.
2022-01-11 14:27:11 -08:00
Alex Vandiver
6a7eecee9a
puppet: Increase load paging thresholds.
2022-01-11 09:38:31 -08:00
Alex Vandiver
1e80b844f4
puppet: Disable apparmor profile for msmtp.
...
As the nagios user, we want to read the msmtp configuration from
~nagios, which apparmor's profile does not allow msmtp to do.
2022-01-11 09:38:31 -08:00
Alex Vandiver
3c95ad82c6
puppet: Upgrade to nagios4.
...
This updates the puppeted nagios configuration file for the Nagios4
defaults.
2022-01-11 09:38:31 -08:00
Alex Vandiver
4a95967a33
puppet: Gather uwsgi stats from chat.zulip.org.
2022-01-03 21:26:57 -08:00
Alex Vandiver
8a5be972d2
puppet: Add a uwsgi exporter for monitoring.
...
This allows investigation of how many workers are busy, and to track
"harakiri" terminations.
2022-01-03 15:25:58 -08:00
Anders Kaseorg
82748d45d8
install-yarn: Use test -ef in case /srv is a symlink.
...
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-30 13:42:07 -08:00
Alex Vandiver
c094867a74
puppet: Add aarch64 build hashes to external dependencies.
...
wal-g does not currently ship aarch64 binaries; the compilation
process [1] is somewhat complicated, so we defer the decision about
how to support wal-g for aarch64 until a later date.
[1]: https://github.com/wal-g/wal-g/blob/master/docs/PostgreSQL.md#installing
2021-12-29 16:35:15 -08:00
Alex Vandiver
f166f9f7d6
puppet: Centralize versions and sha256 hashes of external dependencies.
...
This will make it easier to update versions of these dependencies.
2021-12-29 16:35:15 -08:00
Alex Vandiver
57662689a9
puppet: Provide a constant homedir for grafana user.
...
The homedir of a user cannot be changed if any processes are running
as them, so having it change over time as upgrades happen will break
puppet application, as the old grafana process under supervisor will
effectively lock changes to the user's homedir.
Unfortunately, that means that this change will fail to
puppet-apply unless `supervisorctl stop grafana` is run first, but
there's no way around that.
2021-12-29 16:35:15 -08:00
Alex Vandiver
6e55e52694
puppet: Pull out grafana $data_dir.
2021-12-29 16:35:15 -08:00
Alex Vandiver
1e4e6a09af
puppet: Stop making resources for external binaries and directories.
...
In the event that extracting doesn't produce the binary we expected it
to, all this will do is create an _empty_ file where we expect the
binary to be. This will likely muddle debugging.
Since the only reason the resource was made in the first place was to
make dependencies clear, switch to depending on the External_Dep
itself, when such a dependency is needed.
2021-12-29 16:35:15 -08:00
Alex Vandiver
3c163a7d5e
puppet: Move slash out of $dir by convention.
2021-12-29 16:35:15 -08:00
Alex Vandiver
bb5a2c8138
puppet: Move prometheus to external_dep.
2021-12-29 16:35:15 -08:00
Alex Vandiver
2d6c096904
puppet: Move node_exporter to external_dep.
2021-12-29 16:35:15 -08:00
Alex Vandiver
e4b23daad7
puppet: Upgrade to Grafana 8.3.2, for CVE-2021-43813.
2021-12-10 14:00:11 -08:00