Commit Graph

334 Commits

Author SHA1 Message Date
Alex Vandiver 033a96aa5d puppet: Fix check_ssl_certificate check to check named host, not self. 2021-05-17 18:38:30 -07:00
Alex Vandiver feb7870db7 puppet: Adjust thresholds on autovac_freeze.
These thresholds are in relationship to the
`autovacuum_freeze_max_age`, *not* the XID wraparound, which happens
at 2^31-1.  As such, it is *perfectly normal* that they hit 100%, and
then autovacuum kicks in and brings it back down.  The unusual
condition is that PostgreSQL pushes past the point where an autovacuum
would be triggered -- therein lies the XID wraparound danger.

With the `autovacuum_freeze_max_age` set to 2000000000 in
`postgresql.conf`, XID wraparound happens at 107.3%.  Set the warning
and error thresholds to below this, but above 100% so this does not
trigger constantly.
2021-05-11 17:11:47 -07:00
Anders Kaseorg 544bbd5398 docs: Fix capitalization mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-10 09:57:26 -07:00
Anders Kaseorg 9d57fa9759 puppet: Use pgrep -x to avoid accidental matches.
Matching the full process name (-x without -f) or full command
line (-xf) is less prone to mistakes like matching a random substring
of some other command line or pgrep matching itself.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-07 08:54:41 -07:00
Alex Vandiver 6ee74b3433 puppet: Check health of APT repository. 2021-03-23 19:27:42 -07:00
Alex Vandiver c01345d20c puppet: Add nagios check for long-lived certs that do not auto-renew. 2021-03-23 19:27:27 -07:00
Alex Vandiver 9ea86c861b puppet: Add a nagios alert configuration for smokescreen.
This verifies that the proxy is working by accessing a
highly-available website through it.  Since failure of this equates to
failures of Sentry notifications and Android mobile push
notifications, this is a paging service.
2021-03-18 10:11:15 -07:00
Anders Kaseorg 129ea6dd11 nginx: Consistently listen on IPv6 and with HTTP/2.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-03-17 17:46:32 -07:00
Alex Vandiver 06c07109e4 puppet: Add missing semicolons left off in ba3b88c81b. 2021-03-12 15:48:53 -08:00
Alex Vandiver ba3b88c81b puppet: Explicitly use the snakeoil certificates for nginx.
In production, the `wildcard-zulipchat.com.combined-chain.crt` file is
just a symlink to the snakeoil certificates; but we do not puppet that
symlink, which makes new hosts fail to start cleanly.  Instead, point
explicitly to the snakeoil certificate, and explain why.
2021-03-12 13:31:54 -08:00
Alex Vandiver 306bf930f5 puppet: Add a warning if ksplice is enabled but has no key set. 2021-03-10 17:57:20 -08:00
Alex Vandiver a215c83c2d puppet: Switch to more explicit variable rather than reuse a nagios one.
Redis is not nagios, and this only leads to confusion as to why there
is a nagios domain setting on frontend servers; it also leaves the
`redis0` part of the name buried in the template.

Switch to an explicit variable for the redis hostname.
2021-03-10 11:44:54 -08:00
Alex Vandiver a5b29398fc puppet: Only install ksplice uptrack if there is an access key. 2021-03-10 11:44:11 -08:00
Alex Vandiver d938dd9d4a puppet: Document smokescreen installation, and move to puppet/zulip/.
This is more broadly useful than for just Kandra; provide
documentation and means to install Smokescreen for stand-alone
servers, and motivate its use somewhat more.
2021-03-02 17:16:38 -08:00
Alex Vandiver 2f5eae5c68 puppet: Minor formatting. 2021-02-28 17:03:29 -08:00
Alex Vandiver a759d26a32 puppet: Make ksplice config not world-readable, use 'adm' group.
This matches the configuration that ksplice itself creates the file
and directory with.
2021-02-28 17:03:29 -08:00
Tim Abbott 957c16aa77 nagios: Tweak prod load monitoring parameters.
Ultimately this monitoring isn't that helpful, but we're mainly
interested in when it spikes to very high numbers.
2021-02-26 08:39:52 -08:00
Alex Vandiver 32149c6a1c puppet: Add ksplice uptrack for kernel hotpatches. 2021-02-25 18:05:47 -08:00
Alex Vandiver 173d2dec3d puppet: Check in defensive restart-camo cron job.
This was found on lb1; add it to the camo install on smokescreen.
2021-02-24 16:42:21 -08:00
Alex Vandiver 0b736ef4cf puppet: Remove puppet_ops configuration for separate loadbalancer host. 2021-02-22 16:05:13 -08:00
Alex Vandiver e30b524896 iptables: Limit smokescreen port 4750, add camo port.
Limit incoming connections to port 4750 to only the smokescreen host,
and also allow access to the Camo server on that host, on port 9292.
2021-02-17 13:52:38 -08:00
Alex Vandiver a88af1b5a2 camo: Install on smokescreen host. 2021-02-16 08:12:31 -08:00
Alex Vandiver 29f60bad20 smokescreen: Put the version into the supervisorctl command.
This makes it reload correctly if the version is changed.
2021-02-16 08:12:31 -08:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Alex Vandiver 559cdf7317 puppet: Set APT::Periodic::Unattended-Upgrade in apt config.
This is required for unattended upgrades to actually run regularly.
In some distributions, it may be found in 20auto-upgrades, but placing
it here makes it more discoverable.
2021-02-12 08:59:19 -08:00
Tim Abbott fd8504e06b munin: Update to use NAGIOS_BOT_HOST.
We haven't actively used this plugin in years, and so it was never
converted from the 2014-era monitoring to detect the hostname.

This seems worth fixing since we may want to migrate this logic to a
more modern monitoring system, and it's helpful to have it correct.
2021-01-27 12:07:09 -08:00
Alex Vandiver c2526844e9 worker: Remove SignupWorker and friends.
ZULIP_FRIENDS_LIST_ID and MAILCHIMP_API_KEY are not currently used in
production.

This removes the unused 'signups' queue and worker.
2021-01-17 11:16:35 -08:00
Alex Vandiver 90ca06d873 puppet: Allow unattended upgrades of -updates in addition to -security.
This ensures that software will be fully up-to-date, not just with
security patches.
2020-11-13 16:45:05 -08:00
Tim Abbott 494a685827 puppet: Fix typo in name of missedmessage_emails consumer.
This has been present since this check was introduced in
45c9c3cc30.
2020-10-29 12:28:54 -07:00
Tim Abbott ab3cb2b3bf puppet: Fix internal redis puppet configuration.
The inherits rule is required for overriding existing configuration
files; while the `::profile` piece was missed in the recent ::profile
migration.
2020-10-29 11:53:43 -07:00
Alex Vandiver b9797770d3 provision: Rename backup directory to postgresql. 2020-10-28 11:57:03 -07:00
Alex Vandiver 1f7132f50d docs: Standardize on PostgreSQL, not Postgres. 2020-10-28 11:55:16 -07:00
Alex Vandiver eaa99359b1 puppet: Rename to check_postgresql_replication_lag. 2020-10-28 11:51:52 -07:00
Alex Vandiver 53e59a0a13 puppet: Rename check_postgres_backup to check_postgresql_backup. 2020-10-28 11:51:52 -07:00
Alex Vandiver 45f6c79c4a puppet: Rename postgres_ variables to postgresql_. 2020-10-28 11:51:52 -07:00
Alex Vandiver e124324050 puppet: Rename postgres_appdb in nagios to postgresql. 2020-10-28 11:51:52 -07:00
Alex Vandiver a155430eb5 docs: Document all zulip.conf settings.
This provides a single reference point for all zulip.conf settings;
these mostly link out to the more complete documentation about each
setting, elsewhere.

Fixes #12490.
2020-10-27 13:31:57 -07:00
Alex Vandiver d24c571bab puppet: Automatically back up the database if we have the secrets.
This avoids folks having to manually add to the puppet_classes.
2020-10-27 13:29:19 -07:00
Alex Vandiver e7798d2797 puppet: Move zulip_ops::profile::postgres_appdb to postgresql. 2020-10-27 13:29:19 -07:00
Alex Vandiver 9f25389bff puppet: Move top-level zulip_ops deployments to zulip_ops::profile. 2020-10-27 13:29:19 -07:00
Alex Vandiver 5365af544a puppet: Rename zulip::profile::rabbit to ::rabbitmq. 2020-10-27 13:29:19 -07:00
Alex Vandiver 188af57296 puppet: Rename postgres_appdb to postgresql.
There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" the name of the binary and colloquial name.  This is minor
cleanup, but enabled by the other renames in the previous commit.
2020-10-27 13:29:19 -07:00
Alex Vandiver c2185a81d6 puppet: Move top-level zulip deployments into "profile" directory.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes.  Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.

Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.

[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
2020-10-27 13:29:19 -07:00
Alex Vandiver 27cfb14d92 puppet: Only include zulip::base for top-level deploys.
This also removes direct includes of `zulip::common`, making
`zulip::base` gatekeep the inclusion of it.  This helps enforce that
any top-level deploy only needs include a single class, and that any
configuration which is not meant to be deployed by itself will not
apply, due to lack of `zulip::common` include.

The following commit will better differentiate these top-level deploys
by moving them into a subdirectory.
2020-10-27 13:29:19 -07:00
Alex Vandiver c296b5d819 puppet: Allow unattended-upgrades for all but servers.
Restarting servers is what can cause service interruptions, and
increase risk.  Add all of the servers that we use to the list of
ignored packages, and uncomment the default allowed-origins in order
to enable unattended upgrades.
2020-10-23 16:46:06 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Alex Vandiver a7d1fd9ffb puppet: Remove non-working apt::source.
d2aa81858c replaced the `apt::source` to set up debathena with
`Exec['setup-apt-repo-debathena']`, but mistakenly left the
`apt::source` in place in `zmirror` (but not `zmirror_personals`).
The `apt::source` resource type was later removed in c9d54f7854,
making the manifest to apply on `zmirror`.

Remove the broken and unnecessary `apt::source` resource.
2020-10-23 11:31:20 -07:00
Alex Vandiver 48e06c25ba puppet: Switch nagios SSH checks to id_ed25519 key.
The ssh-rsa algorithm was deprecated[1] in OpenSSH 8.2 (2020-02-14) and
will be removed in a future release.

[1] https://www.openssh.com/txt/release-8.4
2020-10-22 16:42:30 -07:00
Alex Vandiver 0ea20bd7d8 puppet: Move postgres_version into postgres_common.
This property is not related to the base zulip install; move it to
zulip::postgres_common, which is already used as a namespace for
various postgres variables.
2020-10-22 11:32:25 -07:00