Commit Graph

1142 Commits

Author SHA1 Message Date
Alex Vandiver 4868a4fe48 puppet: Set a long timeout on wal-g wal-push, to prevent stalls.
`wal-g wal-push` has a known bug with occasionally hanging after file
upload to S3[1]; set a rather long timeout on the upload process, so
that we don't simply stall forever when archiving WAL segments.

[1] https://github.com/wal-g/wal-g/issues/656
2020-11-20 11:32:36 -08:00
Sourabh Rana 419f163906 nginx: Increase file upload size from 25mb to 80mb. 2020-11-19 00:49:49 -08:00
Alex Vandiver 90ca06d873 puppet: Allow unattended upgrades of -updates in addition to -security.
This ensures that software will be fully up-to-date, not just with
security patches.
2020-11-13 16:45:05 -08:00
Alex Vandiver 2e20ab1658 puppet: Log the "Host" header and total response time.
Logging `Host` is useful for determining access patterns to realms,
especially if ROOT_DOMAIN_LANDING_PAGE is set.  Total response time is
useful in debugging access and performance patterns.
2020-11-13 16:42:32 -08:00
Tim Abbott 494a685827 puppet: Fix typo in name of missedmessage_emails consumer.
This has been present since this check was introduced in
45c9c3cc30.
2020-10-29 12:28:54 -07:00
Tim Abbott ab3cb2b3bf puppet: Fix internal redis puppet configuration.
The inherits rule is required for overriding existing configuration
files; while the `::profile` piece was missed in the recent ::profile
migration.
2020-10-29 11:53:43 -07:00
Alex Vandiver 6b9d7000b5 puppet: Set proxy environment variables.
These are respected by `urllib`, and thus also `requests`.  We set
`HTTP_proxy`, not `HTTP_PROXY`, because the latter is ignored in
situations which might be running under CGI -- in such cases it may be
coming from the `Proxy:` header in the request.
2020-10-28 12:17:35 -07:00
Alex Vandiver 8b0f32ee07 puppet: Move environment-setting into configuration, not command. 2020-10-28 12:13:04 -07:00
Alex Vandiver b9797770d3 provision: Rename backup directory to postgresql. 2020-10-28 11:57:03 -07:00
Alex Vandiver 1f7132f50d docs: Standardize on PostgreSQL, not Postgres. 2020-10-28 11:55:16 -07:00
Alex Vandiver eaa99359b1 puppet: Rename to check_postgresql_replication_lag. 2020-10-28 11:51:52 -07:00
Alex Vandiver 53e59a0a13 puppet: Rename check_postgres_backup to check_postgresql_backup. 2020-10-28 11:51:52 -07:00
Alex Vandiver 45f6c79c4a puppet: Rename postgres_ variables to postgresql_. 2020-10-28 11:51:52 -07:00
Alex Vandiver e124324050 puppet: Rename postgres_appdb in nagios to postgresql. 2020-10-28 11:51:52 -07:00
Alex Vandiver a155430eb5 docs: Document all zulip.conf settings.
This provides a single reference point for all zulip.conf settings;
these mostly link out to the more complete documentation about each
setting, elsewhere.

Fixes #12490.
2020-10-27 13:31:57 -07:00
Alex Vandiver e81bc19e45 puppet: Remove shims for old classes, except dockervoyager.
The upgrade mechanism in the previous commit negates the need for
them -- with the exception of dockervoyager.
2020-10-27 13:29:19 -07:00
Alex Vandiver d24c571bab puppet: Automatically back up the database if we have the secrets.
This avoids folks having to manually add to the puppet_classes.
2020-10-27 13:29:19 -07:00
Alex Vandiver e7798d2797 puppet: Move zulip_ops::profile::postgres_appdb to postgresql. 2020-10-27 13:29:19 -07:00
Alex Vandiver 9f25389bff puppet: Move top-level zulip_ops deployments to zulip_ops::profile. 2020-10-27 13:29:19 -07:00
Alex Vandiver 5365af544a puppet: Rename zulip::profile::rabbit to ::rabbitmq. 2020-10-27 13:29:19 -07:00
Alex Vandiver 188af57296 puppet: Rename postgres_appdb to postgresql.
There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" the name of the binary and colloquial name.  This is minor
cleanup, but enabled by the other renames in the previous commit.
2020-10-27 13:29:19 -07:00
Alex Vandiver 91cb0988e1 puppet: Generalize docker detection.
This also has the benefit of detecting zulip::dockervoyager as well as
zulip::profile::docker.
2020-10-27 13:29:19 -07:00
Alex Vandiver 0f25acc7b3 puppet: Rename "voyager"/"dockervoyager" to "standalone"/"docker".
The "voyager" name is non-intuitive and not significant.
`zulip::voyager` and `zulip::dockervoyager` stubs are kept for
back-compatibility with existing `zulip.conf` files.
2020-10-27 13:29:19 -07:00
Alex Vandiver c2185a81d6 puppet: Move top-level zulip deployments into "profile" directory.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes.  Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.

Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.

[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
2020-10-27 13:29:19 -07:00
Alex Vandiver 27cfb14d92 puppet: Only include zulip::base for top-level deploys.
This also removes direct includes of `zulip::common`, making
`zulip::base` gatekeep the inclusion of it.  This helps enforce that
any top-level deploy only needs include a single class, and that any
configuration which is not meant to be deployed by itself will not
apply, due to lack of `zulip::common` include.

The following commit will better differentiate these top-level deploys
by moving them into a subdirectory.
2020-10-27 13:29:19 -07:00
Alex Vandiver 34e8c2c61e puppet: Move total_memory_mb from zulip::base into zulip::common.
This makes `zulip::common` used only for variable-setting, and
`zulip::base` used only for resource creation.
2020-10-27 13:29:19 -07:00
Alex Vandiver 7bb888c2ec puppet: Template supervisor.conf for redhat paths. 2020-10-27 13:29:19 -07:00
Alex Vandiver 3ab9b31d2f puppet: Purge all un-managed supervisor configuration files.
Relying on `defined(Class['...'])` makes the class sensitive to
resource evaluation ordering, and thus brittle.  It is also only
functional for a single service (thumbor).

Generalize by using `purge => true` for the directory to automatically
remove all un-managed files.  This is more general than the previous
form, and may result in additional not-managed services being removed.
2020-10-27 13:29:19 -07:00
Alex Vandiver 1d54630b4e log: Rename email-deliverer.log to match other files. 2020-10-25 14:56:37 -07:00
Alex Vandiver 93d661d119 puppet: Configure logrotate for all logger files.
This adds log rotation to all /var/log/zulip files.
2020-10-25 14:56:37 -07:00
Alex Vandiver c296b5d819 puppet: Allow unattended-upgrades for all but servers.
Restarting servers is what can cause service interruptions, and
increase risk.  Add all of the servers that we use to the list of
ignored packages, and uncomment the default allowed-origins in order
to enable unattended upgrades.
2020-10-23 16:46:06 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Alex Vandiver a7d1fd9ffb puppet: Remove non-working apt::source.
d2aa81858c replaced the `apt::source` to set up debathena with
`Exec['setup-apt-repo-debathena']`, but mistakenly left the
`apt::source` in place in `zmirror` (but not `zmirror_personals`).
The `apt::source` resource type was later removed in c9d54f7854,
making the manifest to apply on `zmirror`.

Remove the broken and unnecessary `apt::source` resource.
2020-10-23 11:31:20 -07:00
Alex Vandiver 48e06c25ba puppet: Switch nagios SSH checks to id_ed25519 key.
The ssh-rsa algorithm was deprecated[1] in OpenSSH 8.2 (2020-02-14) and
will be removed in a future release.

[1] https://www.openssh.com/txt/release-8.4
2020-10-22 16:42:30 -07:00
Alex Vandiver 0ea20bd7d8 puppet: Move postgres_version into postgres_common.
This property is not related to the base zulip install; move it to
zulip::postgres_common, which is already used as a namespace for
various postgres variables.
2020-10-22 11:32:25 -07:00
Alex Vandiver 25e995b677 puppet: Move normal_queues to the one place that uses it. 2020-10-22 11:32:25 -07:00
Alex Vandiver 423b5c2be2 puppet: Move queue error and stats directories to just the app host. 2020-10-22 11:31:05 -07:00
Alex Vandiver 4d4c21499a puppet: Move supervisor dependency into process_fts_updates.
PostgreSQL itself has no dependency on supervisor; rather, the FTS
updates do.
2020-10-22 11:30:53 -07:00
Alex Vandiver ca971ebc59 puppet: Remove empty zulip_ops class. 2020-10-22 11:30:53 -07:00
Alex Vandiver 16af05758d puppet: Move zulip_org into zulip_ops.
This class is not of general interest.
2020-10-22 11:30:53 -07:00
Alex Vandiver ad566c491d puppet: Drop now-unused zulip_ops:::git class. 2020-10-22 11:30:53 -07:00
Alex Vandiver 50e9e2ed20 puppet: Make zulip::base include zulip::apt_repository.
There was likely more dependency complexity prior to 97766102df, but
there is now no reason to require that consumers explicitly include
zulip::apt_repository.
2020-10-22 11:30:53 -07:00
Alex Vandiver 2dc6d26ec6 puppet: Fix included monitoring class name. 2020-10-19 22:30:20 -07:00
Alex Vandiver 7a1132d605 puppet: Switch golang and smokescreen to use /srv.
/srv and /opt have very similar usages; but we should be internally
consistent.

Move these two (the only usages of /opt) to match the rest in /srv.
2020-10-16 13:00:06 -07:00
Alex Vandiver 78b92a51cc puppet: Allow access to smokescreen port via iptables. 2020-10-15 15:18:35 -07:00
Alex Vandiver 0d5356969e puppet: Reformat ipv4 iptables rules comments. 2020-10-15 15:18:35 -07:00
Alex Vandiver fffea9612b puppet: Add an outgoing HTTP/HTTPS proxy server.
Use https://github.com/stripe/smokescreen to provide a server for an
outgoing proxy, run under supervisor.  This will allow centralized
blocking of internal metadata IPs, localhost, and so forth, as well as
providing default request timeouts (10s by default).
2020-10-15 15:18:35 -07:00
Anders Kaseorg dfaea9df65 shfmt: Reformat shell scripts with shfmt.
https://github.com/mvdan/sh

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-15 15:16:00 -07:00
Alex Vandiver f61ac4a28d puppet: Move frontend monitoring into its own file.
This allows it to be pulled in for deploys like czo, which don't use
the full `zulip_ops::app_frontend`, but we wish to monitor.
2020-10-13 17:37:32 -07:00
Tim Abbott 7c2c82b190 nginx: Update nginx configuration for fhir/hl7 organization.
We should eventually add templating for the set of hosts here, but
it's worth merging this change to remove the deleted hostname and
replace it with the current one.
2020-10-13 16:50:26 -07:00
Anders Kaseorg 723d285e46 nginx: Redirect {www.,}zulipchat.com, www.zulip.com to zulip.com.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-13 16:49:23 -07:00
Alex Vandiver c8df9a150e puppet: Drop all log2zulip configuration.
Disabled on webservers in 047817b6b0, it has since lingered in
configuration, as well as running (to no effect) every minute on the
loadbalancer.

Remove the vestiges of its configuration.
2020-10-13 11:00:50 -07:00
Alex Vandiver b431b1b021 puppet: Remove misleading motd.
This banner shows on lb1, advertising itself as lb0.  There is no
compelling reason for a custom motd, especially one which needs to
be reconfigured for each host.
2020-10-13 11:00:36 -07:00
Alex Vandiver 45c9c3cc30 queue: Monitor user_activity queue, now that it has a consumer.
Since this was using repead individual get() calls previously, it
could not be monitored for having a consumer.  Add it in, by marking
it of queue type "consumer" (the default), and adding Nagios lines for
it.

Also adjust missedmessage_emails to be monitored; it stopped using
LoopQueueProcessingWorker in 5cec566cb9, but was never added back
into the set of monitored consumers.
2020-10-11 14:19:42 -07:00
Alex Vandiver 4fd7df4e8c puppet: Remove absent of check-apns-tokens.
This was marked as ensure absent in d02101a401, in v1.7.0 in 2017.
2020-09-29 18:17:08 -07:00
Alex Vandiver 872a349508 puppet: Remove absent of log2zulip.
This was marked as ensure absent in 047817b6b0, in v2.0.0 in 2018.
2020-09-29 18:17:08 -07:00
Alex Vandiver 0137772fdb puppet: Remove absent of calculate-first-visible-message-id.
This was marked as ensure absent in dc7d44a245, in v1.9.0 in 2018.
2020-09-29 18:17:08 -07:00
Alex Vandiver 966c8dc23d puppet: Remove absent of email-mirror cron job.
This was marked as ensure absent in 24f8492236, in v1.3.0 in 2014.
2020-09-29 18:17:08 -07:00
Alex Vandiver 430d3b8554 puppet: Remove absent of libapache2-mod-wsgi.
This was marked as ensure absent in 89b97e7480, in v1.7.0 in 2017,
though it did not take effect until 6e55aa2ce6, in v1.9.0 in 2018.
2020-09-29 18:17:08 -07:00
Alex Vandiver 12085552d5 puppet: Tidy indentation. 2020-09-29 17:44:44 -07:00
Alex Vandiver 57d88eedd8 puppet: Only install rabbitmq cron jobs via zulip_ops.
The rabbitmq cron jobs exist in order to call rabbitmqctl as root and
write the output to files that nagios can consume, since nagios is not
allowed to run rabbitmqctl.

In systems which do not have nagios configured, these every-minute
cron jobs add non-insignificant load, to no effect.  Move their
installation into `zulip_ops`.  In doing so, also combine the cron.d
files into a single file; this allows us to `ensure => absent` the old
filenames, removing them from existing systems.  Leave the resulting
combined cron.d file in `zulip`, since it is still of general utility
and note.
2020-09-29 17:44:44 -07:00
Alex Vandiver 79931051bd puppet: Permit outgoing mail from postfix.
The configuration change made in 1c17583ad5 only allowed delivery to
those specific Zulip addresses.  However, they also prevent the
mailserver from being used as an outgoing email relay from Zulip,
since all mail that passed through the mailserver (from any
originator) was required to have a `RCPT TO` that matched those
regexes.

Allow mail originating from `mynetworks` to have an arbitrary
addresses in `RCPT TO`.
2020-09-25 15:09:27 -07:00
Alex Vandiver 36ea307fbf puppet: Depend other changes on sharding.py validation.
Use the validation of the tornado sharding config that
`stage_updated_sharding` does, by depending on it.  This ensures that
we don't write out a supervisor or nginx config based on a
bad (e.g. non-sequential) list of tornado ports.
2020-09-25 10:52:40 -07:00
Alex Vandiver c0e240277b tornado: Remove fingerprinting, write out .tmp files always.
Fingerprinting the config is somewhat brittle -- it requires either
custom bootstrapping for old (fingerprint-less) configs, and may have
false-positives.

Since generating the config is lightweight, do so into the .tmp files,
and compare the output to the originals to determine if there are
changes to apply.

In order to both surface errors, as well as notify the user in case a
restart is necessary, we must run it twice.  The `onlyif`
functionality cannot show configuration errors to the user, only
determine if the command runs or not.  We thus run the command once,
judging errors as "interesting" enough to run the actual command,
whose failure will be verbose in Puppet and halt any steps that depend
on it.

Removing the `onlyif` would result in `stage_updated_sharding` showing
up in the output of every Puppet run, which obscures the important
messages it displays when an update to sharding is necessary.
Removing the `command` (e.g. making it an `echo`) would result in
removing the ability to report configuration errors.  We thus have no
choice but to run it twice; this is thankfully low-overhead.
2020-09-25 10:52:40 -07:00
Alex Vandiver 2a12fedcf1 tornado: Remove explicit tornado_processes setting; compute it.
We can compute the intended number of processes from the sharding
configuration.  In doing so, also validate that all of the ports are
contiguous.

This removes a discrepancy between `scripts/lib/sharding.py` and other
parts of the codebase about if merely having a `[tornado_sharding]`
section is sufficient to enable sharding.  Having behaviour which
changes merely based on if an empty section exists is surprising.

This does require that a (presumably empty) `9800` configuration line
exist, but making that default explicit is useful.

After this commit, configuring sharding can be done by adding to
`zulip.conf`:

```
[tornado_sharding]
9800 =              # default
9801 = other_realm
```

Followed by running `./scripts/refresh-sharding-and-restart`.
2020-09-18 15:13:40 -07:00
Alex Vandiver f638518722 tornado: Move default production port to 9800.
In development and test, we keep the Tornado port at 9993 and 9983,
respectively; this allows tests to run while a dev instance is
running.

In production, moving to port 9800 consistently removes an odd edge
case, when just one worker is on an entirely different port than if
two workers are used.
2020-09-18 15:13:40 -07:00
Alex Vandiver ff94254598 tornado: Log to files by port number.
Without an explicit port number, the `stdout_logfile` values for each
port are identical.  Supervisor apparently decides that it will
de-conflict this by appending an arbitrary number to the end:

```
/var/log/zulip/tornado.log
/var/log/zulip/tornado.log.1
/var/log/zulip/tornado.log.10
/var/log/zulip/tornado.log.2
/var/log/zulip/tornado.log.3
/var/log/zulip/tornado.log.7
/var/log/zulip/tornado.log.8
/var/log/zulip/tornado.log.9
```

This is quite confusing, since most other files in `/var/log/zulip/`
use `.1` to mean logrotate was used.  Also note that these are not all
sequential -- 4, 5, and 6 are mysteriously missing, though they were
used in previous restarts.  This can make it extremely hard to debug
logs from a particular Tornado shard.

Give the logfiles a consistent name, and set them up to logrotate.
2020-09-14 22:17:51 -07:00
Alex Vandiver efdaa58c24 supervisor: Use more specific process_name than "port-9800".
Making this include "zulip-tornado" makes it clearer in supervisor
logs.  Without this, one only sees:
```
2020-09-14 03:43:13,788 INFO waiting for port-9807 to stop
2020-09-14 03:43:14,466 INFO stopped: port-9807 (exit status 1)
2020-09-14 03:43:14,469 INFO spawned: 'port-9807' with pid 24289
2020-09-14 03:43:15,470 INFO success: port-9807 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
```
2020-09-14 22:17:51 -07:00
Alex Vandiver e9d0bdea65 puppet: Coerce uwsgi_listen_backlog_limit into an int before doing math. 2020-09-14 21:22:13 -07:00
Alex Vandiver 8adf530400 puppet: Generate sharding in puppet, then refresh-sharding-and-restart.
This supports running puppet to pick up new sharding changes, which
will warn of the need to finalize them via
`refresh-sharding-and-restart`, or simply running that directly.
2020-09-14 16:27:15 -07:00
Alex Vandiver 0de356c2df puppet: Move generation of tornado nginx upstreams into tornado_sharding.
This puts the creation of the upstreams referenced by
`nginx_sharding.conf` adjacent to their use.
2020-09-14 16:27:15 -07:00
Alex Vandiver bf029d99f1 sharding: Also mark sharding.json 644 for consistency.
There is no reason to limit this to 640; mark it 644 for consistency
with the other file.
2020-09-14 16:27:15 -07:00
Alex Vandiver 1c17583ad5 puppet: Restrict postfix incoming addresses to postmaster and zulip.
This removes the possibility of local user enumeration via RCPT TO.
2020-09-11 18:49:22 -07:00
Alex Vandiver 482c964dd3 puppet: Logrotate for webhook exceptions. 2020-09-10 17:47:21 -07:00
Alex Vandiver e38051736d puppet: Wrap and sort logrotate config. 2020-09-10 17:47:21 -07:00
Anders Kaseorg 75c59a820d python: Convert subprocess.Popen.communicate to run or check_output.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:42:35 -07:00
Anders Kaseorg fbfd4b399d python: Elide action="store" for argparse arguments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 16:17:14 -07:00
Anders Kaseorg 1f2ac1962f python: Elide default=None for argparse arguments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 16:17:14 -07:00
Anders Kaseorg d751e0cece puppet: Don’t install netcat.
It’s been unused since commit 0af22dad18
(#13239).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 10:33:47 -07:00
Anders Kaseorg ab120a03bc python: Replace unnecessary intermediate lists with generators.
Mostly suggested by the flake8-comprehension plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg a5dbab8fb0 python: Remove redundant dest for argparse arguments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:04:10 -07:00
Anders Kaseorg dbdf67301b memcached: Switch from pylibmc to python-binary-memcached.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-06 12:51:14 -07:00
Casper Kvan Clausen ed7a6d5e4d puppet: Support nginx_listen_port with http_only 2020-08-03 12:58:12 -07:00
Alex Vandiver cd530d627b uwsgi: Stop generating IOError and SIGPIPE on client close.
Clients that close their socket to nginx suddenly also cause nginx to close
its connection to uwsgi.  When uwsgi finishes computing the response,
it thus tries to write to a closed socket, and generates either
IOError or SIGPIPE failures.

Since these are caused by the _client_ closing the connection
suddenly, they are not actionable by the server.  At particularly high
volumes, this could represent some sort of server-side failure;
however, this is better detected by examining status codes at the
loadbalancer.  nginx uses the error code 499 for this occurrence:
https://httpstatuses.com/499

Stop uwsgi from generating this family of exception entirely, using
configuration for uwsgi[1]; it documents these errors as "(annoying),"
hinting at their general utility."

[1] https://uwsgi-docs.readthedocs.io/en/latest/Options.html#ignore-sigpipe
2020-07-31 10:40:09 -07:00
Alex Vandiver ceb909dbc5 puppet: Increase backlogged socket count based on uwsgi backlog.
Increasing the uwsgi listen backlog is intended to allow it to handle
higher connection rates during server restart, when many clients may
be trying to connect.  The kernel, in turn, needs to have a
proportionally increased somaxconn soas to not refuse the connection.

Set somaxconn to 2x the uwsgi backlog, but no lower than the
default (128).
2020-07-28 21:16:26 -07:00
Alex Vandiver 38d01cd4db puppet: Generalize install-wal-g to be arbitrary tarballs. 2020-07-24 17:24:57 -07:00
Tim Abbott 5a1243db3c puppet: Use correct scope for zulip_ops::munin_plugin. 2020-07-15 21:49:45 -07:00
Alex Vandiver 48c3c33d10 puppet: Fully-qualify the munin-plugin name 2020-07-14 17:58:51 -07:00
Alex Vandiver c68333040b
puppet: Revert PostgreSQL setting of recovery_target_timeline.
Prior to PostgreSQL 12, the `recovery_target_timeline` setting is only
valid in a `recovery.conf` file, as that file has its own
configuration parser.  As such, including it in `postgresql.conf`
results in an error, and PostgreSQL will fail to start.

Remove the setting, reverting bff3b540b1.  This fixes PostgreSQL 9.5,
9.6, 10, and 11; while the setting is not an error in a PostgreSQL 12
configuration file, it is unnecessary since `latest` is the default.
2020-07-14 16:28:20 -07:00
Alex Vandiver 31d80a77d4 puppet: Update nagios check_postgres_replication_lag to be on DB hosts
7d4a370a57 attempted to move the replication check to on the
PostgreSQL hosts.  While it updated the _check_ to assume it was
running and talking to a local PostgreSQL instance, the configuration
and installation for the check were not updated.  As such, the check
ran on the nagios host for each DB host, and produced no output.

Start distributing the check to all apopdb hosts, and configure nagios
to use the SSH tunnel to get there.
2020-07-14 16:27:18 -07:00
Alex Vandiver 2174db27db puppet: Put the dependencies on pg_backup_and_purge itself, and ensure them. 2020-07-14 00:40:25 -07:00
Alex Vandiver 6c27f07c1d puppet: Move PostgreSQL backups to their own class.
wal-g was used in `puppet/zulip` by env-wal-g, but only installed in
`puppet/zulip_ops`.

Merge all of the dependencies of doing backups using wal-g (wal-g
installation, the pg_backup_and_purge job, the nagios plugin that
verifies it happens) into a common base class in `puppet/zulip`, since
it is generally useful.
2020-07-14 00:40:25 -07:00
Anders Kaseorg 15483c09cb puppet: Add missing trailing commas.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-07-13 15:36:06 -07:00
Alex Vandiver 3691a94efe puppet: Configure munin and nagios under apache with puppet.
This swaps in the actually-in-use munin configuiration file;
otherwise, it is an implementation of the configuration as it exists
on the machine.
2020-07-13 13:23:11 -07:00
Alex Vandiver 4e42164b4a munin: Add plugins to prod hosts. 2020-07-13 13:23:11 -07:00
Alex Vandiver 2a14212b27 munin: Add a helper resource definition for munin plugins. 2020-07-13 12:49:28 -07:00
Alex Vandiver 7c7b5fcd6f munin: Deal with spaces in the channel names. 2020-07-13 12:49:28 -07:00
Alex Vandiver eda2c4b8e2 puppet: Split munin-node from munin-server.
No plugins are installed inside the /usr/local/munin/lib this creates
in munin-node, nor are they symlinked into /etc/munin/plugins, so
non-default plugins are added by this.
2020-07-13 12:49:28 -07:00
Alex Vandiver ddc7bb5a45 munin: Fix the path to check_send_receive_time. 2020-07-13 12:49:28 -07:00
Alex Vandiver 8be544e7eb munin: Rename monitoring plugin to use zulip name, not humbug. 2020-07-13 12:49:28 -07:00