zulip

Commit Graph

Author	SHA1	Message	Date
Alex Vandiver	459f37f041	puppet: Add prometheus server.	2021-06-08 22:21:00 -07:00
Alex Vandiver	19fb58e845	puppet: Add prometheus node exporter.	2021-06-08 22:21:00 -07:00
Alex Vandiver	a2b1009ed5	puppet: Turn on "authentication" which defaults to user with all rights. Nagios refuses to allow any modifications with use_authentication off; re-enabled "authentication" but set a default user, which (by way of the `*` permissions in `359f37389a`) is allowed to take all actions.	2021-06-08 15:19:28 -07:00
Alex Vandiver	61b6fc865c	puppet: Add a label to teleport applications, to allow RBAC. Roles can only grant or deny access based on labels; set one based on the application name.	2021-06-08 15:19:04 -07:00
Alex Vandiver	4aff5b1d22	puppet: Allow access to `/` in nagios. This was a regression in `51b985b40d`.	2021-06-07 22:40:58 -07:00
Alex Vandiver	54768c2210	puppet: Remove now-unused basic auth support files. `51b985b40d` made these unnecessary.	2021-06-07 16:17:45 -07:00
Alex Vandiver	359f37389a	puppet: Remove in-nagios auth restrictions. `51b985b40d` made nagios only accessible from localhost, or as proxied via teleport. Remove the HTTP-level auth requirements.	2021-06-07 16:17:45 -07:00
Alex Vandiver	2352fac6b5	puppet: Fix indentation.	2021-06-02 18:38:38 -07:00
Alex Vandiver	51b985b40d	puppet: Move nagios to behind teleport. This makes the server only accessible via localhost, by way of the Teleport application service.	2021-06-02 18:38:38 -07:00
Alex Vandiver	4f51d32676	puppet: Add a teleport application server. This requires switching to a reverse tunnel for the auth connection, with the side effect that the `zulip_ops::teleport::node` manifest can be applied on servers anywhere on the Internet; they do not need to have any publicly-available open ports.	2021-06-02 18:38:38 -07:00
Alex Vandiver	c59421682f	puppet: Add a teleport node on every host. Teleport nodes[1] are the equivalent to SSH servers. In addition to this config, joining the teleport cluster will require presenting a one-time "join token" from the proxy server[2], which may either be short-lived or static. [1] https://goteleport.com/docs/architecture/nodes/ [2] https://goteleport.com/docs/admin-guide/#adding-nodes-to-the-cluster	2021-06-02 18:38:38 -07:00
Alex Vandiver	1cdf14d195	puppet: Add a teleport server. See https://goteleport.com/docs/architecture/overview/ for the general architecture of a Teleport cluster. This commit adds a Teleport auth[1] and proxy[2] server. The auth server serves as a CA for granting time-bounded access to users and authenticating nodes on the cluster; the proxy provides access and a management UI. [1] https://goteleport.com/docs/architecture/authentication/ [2] https://goteleport.com/docs/architecture/proxy/	2021-06-02 18:38:38 -07:00
Alex Vandiver	3ebd627c50	puppet: Fix "import" -> "include" in chat_zulip_org.	2021-06-02 11:02:34 -07:00
Alex Vandiver	2130fc0645	puppet: Add an explicit class for czo.	2021-06-01 22:18:50 -07:00
Alex Vandiver	c9141785fd	puppet: Use concat fragments to place port allows next to services. This means that services will only open their ports if they are actually run, without having to clutter rules.v4 with a lot of `if` statements. This does not go as far as using `puppetlabs/firewall`[1] because that would represent an additional DSL to learn; raw IPtables sections can easily be inserted into the generated iptables file via `concat::fragment` (either inline, or as a separate file), but config can be centralized next to the appropriate service. [1] https://forge.puppet.com/modules/puppetlabs/firewall	2021-05-27 21:14:48 -07:00
Alex Vandiver	4f79b53825	puppet: Factor out firewall config.	2021-05-27 21:14:48 -07:00
Alex Vandiver	87a109e3e0	puppet: Pull in pinned puppet modules. Using puppet modules from the puppet forge judiciously will allow us to simplify the configuration somewhat; this specifically pulls in the stdlib module, which we were already using parts of.	2021-05-27 21:14:48 -07:00
Alex Vandiver	f3eea72c2a	setup: Merge multiple setup-apt-repo scripts into one. This moves the `.asc` files into subdirectories, and writes out the according `.list` files into them. It moves from templates to written-out `.list` files for clarity and ease of implementation (Debian and Ubuntu need different templates for `zulip`), and as a way of making explicit which releases are supported for each list. For the special-case of the PGroonga signing key, we source an additional file within the directory. This simplifies the process for adding another class of `.list` file.	2021-05-26 14:42:29 -07:00
Alex Vandiver	4f017614c5	nagios: Replace check_fts_update_log with a process_fts_updates flag. This avoids having to duplicate the connection logic from process_fts_updates. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:56:05 -07:00
Alex Vandiver	ab130ceb35	nagios: Support arbitrary database user and dbname in replication check. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:56:05 -07:00
Alex Vandiver	c17f502bb0	process_fts_updates: Support arbitrary database user and dbname. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:56:05 -07:00
Alex Vandiver	02fc0d3e1d	db: Drop None and empty-string checking in arguments. psycopg2 treats None and "" the same as not-provided: ``` assert connect(user="zulip", dbname="zulip") assert connect(user="zulip", dbname="zulip", host="") assert connect(user="zulip", dbname="zulip", host=None) with Raises("no password supplied"): connect(user="zulip", dbname="zulip", host="localhost") assert connect(user="zulip", dbname="zulip", port="") assert connect(user="zulip", dbname="zulip", port=None) assert connect(user="zulip", dbname="zulip", port=5432) with Raises("could not connect to server"): connect(user="zulip", dbname="zulip", port=5000) assert connect(dbname="zulip", host="localhost", password="right-password") with Raises("no password supplied"): connect(dbname="zulip", host="localhost", password="") with Raises("no password supplied"): connect(dbname="zulip", host="localhost", password=None) with Raises("password authentication failed"): connect(dbname="zulip", host="localhost", password="wrong") ``` Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	9c652eb16b	db: Use the pre-computed values from settings. Rather than duplicate logic from `computed_settings`, use the values that were computed therein. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	94d7c29d92	db: Use the same codepath for cases (2) and (3). Using the second branch _only_ for case (3), of a PostgreSQL server on a different host, leaves it untested in CI. It also brings in an unnecessary Django dependency. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	add6971ad9	db: Make USING_PGROONGA logic clearer. We only need to read the `zulip.conf` file to determine if we're using PGROONGA if we are on the PostgreSQL machine, with no access to Django. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	75bf19c9d9	db: Combine the `if "host" in pg_args` stanza with earlier clause. The only way in which "host" could be set is in cases (1) or (2), when it was potentially read from Django's settings. In case (3), we already know we are on the same host as the PostgreSQL server. This unifies the two separated checks, which are actually the same check. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	67fc8e84ea	db: Clarify the 3 different cases that process_fts_updates must support. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	116e41f1da	puppet: Move files out and back when mounting /srv. Specifically, this affects /srv/zulip-aws-tools.	2021-05-23 13:29:23 -07:00
Alex Vandiver	ea98549e88	puppet: Always install linux-image-virtual, for ksplice support.	2021-05-23 13:29:23 -07:00
Alex Vandiver	0b1dd27841	puppet: AWS mounts its extra disks with inconsistent names. It is now /dev/nvme1n1, not /dev/nvme0n1; but it always has a consistent major/minor node. Source the file that defines these.	2021-05-23 13:29:23 -07:00
Alex Vandiver	82797dd53c	settings: Standardize the name of the deliver_scheduled_messages logs. This makes it match its command name, and other logfile names.	2021-05-18 12:39:28 -07:00
Alex Vandiver	343a1396af	puppet: Rename logfile for deliver_scheduled_messages to be consistent.	2021-05-18 12:39:28 -07:00
Alex Vandiver	ef6d0ec5ca	puppet: Only run deliver_scheduled_messages and _emails on one server. `deliver_scheduled_emails` and `deliver_scheduled_messages` use the `ScheduledEmail` and `ScheduledMessage` tables as a queue, effectively, pulling values off of them. As noted in their comments, this is not safe to run on multiple hosts at once. As such, split out the supervisor files for them.	2021-05-18 12:39:28 -07:00
Alex Vandiver	033a96aa5d	puppet: Fix check_ssl_certificate check to check named host, not self.	2021-05-17 18:38:30 -07:00
Alex Vandiver	a2b7a5ef4b	puppet: Clarify 20m keepalive time from the LB is a max; it can be less.	2021-05-17 14:56:51 -07:00
Alex Vandiver	66a232e303	smokescreen: Bump version of Go and Smokescreen. Move version pins to the latest versions of Go and Smokescreen.	2021-05-12 10:08:42 -10:00
Alex Vandiver	feb7870db7	puppet: Adjust thresholds on autovac_freeze. These thresholds are in relationship to the `autovacuum_freeze_max_age`, not the XID wraparound, which happens at 2^31-1. As such, it is perfectly normal that they hit 100%, and then autovacuum kicks in and brings it back down. The unusual condition is that PostgreSQL pushes past the point where an autovacuum would be triggered -- therein lies the XID wraparound danger. With the `autovacuum_freeze_max_age` set to 2000000000 in `postgresql.conf`, XID wraparound happens at 107.3%. Set the warning and error thresholds to below this, but above 100% so this does not trigger constantly.	2021-05-11 17:11:47 -07:00
Alex Vandiver	0f1611286d	management: Rename the deliver_email command to deliver_scheduled_email. This makes it parallel with deliver_scheduled_messages, and clarifies that it is not used for simply sending outgoing emails (e.g. the `email_senders` queue). This also renames the supervisor job to match.	2021-05-11 13:07:29 -07:00
Anders Kaseorg	544bbd5398	docs: Fix capitalization mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-10 09:57:26 -07:00
Tim Abbott	ad0be6cea1	puppet: Remove thumbor.conf nginx configuration. This was missing in `405bc8dabf`.	2021-05-07 16:57:29 -07:00
Anders Kaseorg	9d57fa9759	puppet: Use pgrep -x to avoid accidental matches. Matching the full process name (-x without -f) or full command line (-xf) is less prone to mistakes like matching a random substring of some other command line or pgrep matching itself. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-07 08:54:41 -07:00
Anders Kaseorg	405bc8dabf	requirements: Remove Thumbor. Thumbor and tc-aws have been dragging their feet on Python 3 support for years, and even the alphas and unofficial forks we’ve been running don’t seem to be maintained anymore. Depending on these projects is no longer viable for us. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-06 20:07:32 -07:00
Alex Vandiver	eda9ce2364	locale: Use `C.UTF-8` rather than `en_US.UTF-8`. The `en_US.UTF-8` locale may not be configured or generated on all installs; it also requires that the `locales` package be installed. If users generate the `en_US.UTF-8` locale without adding it to the permanent set of system locales, the generated `en_US.UTF-8` stops working when the `locales` package is updated. Switch to using `C.UTF-8` in all cases, which is guaranteed to be installed. Fixes #15819.	2021-05-04 08:51:46 -07:00
Alex Vandiver	ddb9d16132	puppet: Install procps, for pgrep. In puppet, we use pgrep in the collection stage, to see if rabbitmq is running. Sufficiently bare-bones systems will not have `procps` (which provides `pgrep`) installed yet, which makes the install abort when running `puppet` for the first time. Just installing the `procps` package in Puppet is insufficient, because the check in the `unless` block runs when Puppet is determining which resources it needs to instantiate, and in what order; any package installation has yet to happen. As `erlang-base` (which provides `epmd`) happens to have a dependency of `procps`, any system without `pgrep` will also not have `epmd` installed or running. Regardless, it is safe to run `epmd -daemon` even if one is already running, as the comment above notes.	2021-05-03 14:48:52 -07:00
Alex Vandiver	3577c6dbd4	puppet: `pgrep -f something` can match itself. Using `pgrep -f epmd` to determine if `epmd` is running is a race condition with itself, since the pgrep is attempting to match the "full process name" and its own full process name contains "epmd". This leads to epmd not being started when it should be, which in turn leads to rabbitmq-server failing to start. Use the standard trick for this, namely a one-character character class, to prevent self-matching.	2021-05-03 14:48:52 -07:00
Jennifer Hwang	c9f5946239	puppet: Add override for queue_workers_multiprocess. With tweaks to the documentation by tabbott. This uses the following configuration option: [application_server] queue_workers_multiprocess = false	2021-04-20 14:37:15 -07:00
Tim Abbott	bb676f1143	smokescreen: Move supervisor configuration to managed directory. We've established the conf.d/zulip directory as the recommended path for Zulip-managed configuration files, so this belongs there.	2021-04-16 14:05:42 -07:00
Gaurav Pandey	303e7b9701	ci: Add Debian bullseye to production test suite.	2021-04-15 21:38:31 -07:00
Gaurav Pandey	feb720b463	install: Add beta support for debian bullseye for production. This won't work on a real bullseye system until Bullseye actually officially releases. Fixes part of #17863.	2021-04-15 21:38:31 -07:00
Alex Vandiver	9de35d98d3	puppet: Ensure a snakeoil certificate, for Postfix and PostgreSQL. We use the snakeoil TLS certificate for PostgreSQL and Postfix; some VMs install the `ssl-cert` package but (reasonably) don't build the snakeoil certs into the image. Build them as needed. Fixes #14955.	2021-04-15 21:37:55 -07:00
Anders Kaseorg	b01d43f339	mypy: Fix strict_equality violations. puppet/zulip/files/nagios_plugins/zulip_postgresql/check_postgresql_replication_lag:98: error: Non-overlapping equality check (left operand type: "List[List[str]]", right operand type: "Literal[0]") [comparison-overlap] zerver/tests/test_realm.py:650: error: Non-overlapping container check (element type: "Dict[str, Any]", container item type: "str") [comparison-overlap] Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-13 09:18:18 -07:00
Alex Vandiver	93f3b41811	puppet: Also move avatars to the same nginx include file.	2021-04-09 08:28:42 -07:00
Alex Vandiver	aae8f454ce	puppet: Simplify uploads handling. `uploads-route.noserve` and `uploads-route.internal` contained identical location blocks for `/upload`, since differentiation was necessary for Trusty until 33c941407b72; move the now-common sections into `app`. This leaves the only difference between internal and S3 serving as a single block, which should be included or not based on config; move it to a file which may or may not be placed in `app.d/`.	2021-04-09 08:28:42 -07:00
Alex Vandiver	fb26c6b7ca	puppet: Move uwsgi_pass setting into uwsgi_params. We only ever call `uwsgi_pass django` in association with `include uwsgi_params`; refactor it in.	2021-04-09 08:28:42 -07:00
Alex Vandiver	9cf9d5f2cf	puppet: Move HTTP_X_REAL_IP setting into uwsgi_params. This effectively also adds it to serving `/user_uploads`, where its lack would cause failures to list the actual IP address.	2021-04-09 08:28:42 -07:00
Alex Vandiver	795517bd52	puppet: Only set X-Real-IP once. `07779ea879` added an additional `proxy_set_header` of `X-Real-IP` to `puppet/zulip/files/nginx/zulip-include-common/proxy`; as noted in that commit, Tornado longpoll proxies already included such a line. Unfortunately, this equates to setting that header _twice_ for Tornado ports, like so: ``` X-Real-Ip: 198.199.116.58 X-Real-Ip: 198.199.116.58 ``` ...which is represented, once parsed by Django, as an IP of `198.199.116.58, 198.199.116.58`. For IPv4, this odd "IP address" has no problems, and appears in the access logs accordingly; for IPv6 addresses, however, its length is such that it overflows a call to `getaddrinfo` when attempting to determine the validity of the IP. Remove the now-duplicated inclusion of the header.	2021-04-09 08:28:42 -07:00
Alex Vandiver	07779ea879	middleware: Do not trust X-Forwarded-For; use X-Real-Ip, set from nginx. The `X-Forwarded-For` header is a list of proxies' IP addresses; each proxy appends the remote address of the host it received its request from to the list, as it passes the request down. A naïve parsing, as SetRemoteAddrFromForwardedFor did, would thus interpret the first address in the list as the client's IP. However, clients can pass in arbitrary `X-Forwarded-For` headers, which would allow them to spoof their IP address. `nginx`'s behavior is to treat the addresses as untrusted unless they match an allowlist of known proxies. By setting `real_ip_recursive on`, it also allows this behavior to be applied repeatedly, moving from right to left down the `X-Forwarded-For` list, stopping at the right-most that is untrusted. Rather than re-implement this logic in Django, pass the first untrusted value that `nginx` computes down into Django via the `X-Real-Ip` header. This allows consistent IP addresses in logs between `nginx` and Django. Proxied calls into Tornado (which don't use UWSGI) already passed this header, as Tornado logging respects it.	2021-03-31 14:19:38 -07:00
Anders Kaseorg	29e4c71ec4	puppet: Reformat custom Ruby modules with Rufo. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-24 12:12:04 -07:00
Alex Vandiver	6ee74b3433	puppet: Check health of APT repository.	2021-03-23 19:27:42 -07:00
Alex Vandiver	c01345d20c	puppet: Add nagios check for long-lived certs that do not auto-renew.	2021-03-23 19:27:27 -07:00
Alex Vandiver	9ea86c861b	puppet: Add a nagios alert configuration for smokescreen. This verifies that the proxy is working by accessing a highly-available website through it. Since failure of this equates to failures of Sentry notifications and Android mobile push notifications, this is a paging service.	2021-03-18 10:11:15 -07:00
Anders Kaseorg	129ea6dd11	nginx: Consistently listen on IPv6 and with HTTP/2. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-17 17:46:32 -07:00
Alex Vandiver	15c58cce5a	puppet: Create new nginx logfiles as the zulip user, not as www-data. All of `/var/log/nginx/` is chown'd to `zulip` and the nginx processes themselves run as `zulip`, and would thus (on their own) create new logfiles as `zulip`. Having `logrotate` create them as the package default of `www-data` means that they are momentarily unreadable by the `zulip` user just after rotation, which can cause problems with logtail scripts. Commit the standard `nginx` logrotate configuration, but with the `zulip` user instead of the `www-data` user.	2021-03-16 14:45:13 -07:00
Alex Vandiver	3314fefaec	puppet: Do not require a venv for zulip-puppet-apply. `0663b23d54` changed zulip-puppet-apply to use the venv, because it began using `yaml` to parse the output of puppet to determine if changes would happen. However, not every install ends with a venv; notably, non-frontend servers do not have one. Attempting to run zulip-puppet-apply on them hence now fails. Remove this dependency on the venv, by installing a system python3-yaml package -- though in reality, this package is already an indirect dependency of the system. Especially since pyyaml is quite stable, we're not using it in any interesting way, and it does not actually add to the dependencies, it is preferable to parsing the YAML by hand in this instance.	2021-03-14 17:50:57 -07:00
Alex Vandiver	52f155873f	puppet: Ensure that all `scripts/lib/install` packages are installed. These have all been required packages for some time, but this helps keep the install-time list more clearly a subset of the upgrade-time list.	2021-03-14 17:50:57 -07:00
Alex Vandiver	06c07109e4	puppet: Add missing semicolons left off in `ba3b88c81b`.	2021-03-12 15:48:53 -08:00
Alex Vandiver	024282b51e	Revert "puppet: Use rabbitmq as the user for its config files." This reverts commit `211232978f`. The `rabbitmq` user does not exist yet on first install, and the goal is to create the `rabbitmq-env.conf` file before the package is installed.	2021-03-12 15:37:19 -08:00
Alex Vandiver	ba3b88c81b	puppet: Explicitly use the snakeoil certificates for nginx. In production, the `wildcard-zulipchat.com.combined-chain.crt` file is just a symlink to the snakeoil certificates; but we do not puppet that symlink, which makes new hosts fail to start cleanly. Instead, point explicitly to the snakeoil certificate, and explain why.	2021-03-12 13:31:54 -08:00
Alex Vandiver	211232978f	puppet: Use rabbitmq as the user for its config files. This matches the initial ownership by the `rabbitmq-server` package.	2021-03-12 13:31:03 -08:00
Alex Vandiver	ef188af82d	puppet: Use two location blocks, instead of nesting them. Directives in `location` blocks may or may not inherit from surrounding `location` blocks; specifically, `add_header` directives do not[1]: > There could be several add_header directives. These directives are > inherited from the previous configuration level if and only if there > are no add_header directives defined on the current level. In order to maintain the same headers (including, critically, `Access-Control-Allow-Origin`) as the surrounding block, all `add_header` directives must thus be repeated (which includes the `include`). For clarity, un-nest and repeat the entire `location` block as was used for `/static/`, but with the additional `add_header`. This is preferred to the use of an `if $request_uri` statement to add the header, as those can have unexpected or undefined results[2]. [1] http://nginx.org/en/docs/http/ngx_http_headers_module.html#add_header [2] https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/	2021-03-11 21:09:15 -08:00
Alex Vandiver	306bf930f5	puppet: Add a warning if ksplice is enabled but has no key set.	2021-03-10 17:57:20 -08:00
Alex Vandiver	a215c83c2d	puppet: Switch to more explicit variable rather than reuse a nagios one. Redis is not nagios, and this only leads to confusion as to why there is a nagios domain setting on frontend servers; it also leaves the `redis0` part of the name buried in the template. Switch to an explicit variable for the redis hostname.	2021-03-10 11:44:54 -08:00
Alex Vandiver	a5b29398fc	puppet: Only install ksplice uptrack if there is an access key.	2021-03-10 11:44:11 -08:00
Alex Vandiver	189e86e18e	puppet: Set aggressive caching headers on immutable webpack files. A partial fix for #3470.	2021-03-07 22:00:32 -08:00
Alex Vandiver	e63f170027	puppet: Add access time and host to nginx access logs. `2e20ab1658` attempted to add this; but there are multiple locations that access logs are set, and the most specific wins.	2021-03-04 18:06:47 -08:00
Alex Vandiver	8961885b0f	puppet: Add smokescreen to logrotate.	2021-03-02 17:16:38 -08:00
Alex Vandiver	d938dd9d4a	puppet: Document smokescreen installation, and move to puppet/zulip/. This is more broadly useful than for just Kandra; provide documentation and means to install Smokescreen for stand-alone servers, and motivate its use somewhat more.	2021-03-02 17:16:38 -08:00
Alex Vandiver	2f5eae5c68	puppet: Minor formatting.	2021-02-28 17:03:29 -08:00
Alex Vandiver	a759d26a32	puppet: Make ksplice config not world-readable, use 'adm' group. This matches the configuration that ksplice itself creates the file and directory with.	2021-02-28 17:03:29 -08:00
Tim Abbott	957c16aa77	nagios: Tweak prod load monitoring parameters. Ultimately this monitoring isn't that helpful, but we're mainly interested in when it spikes to very high numbers.	2021-02-26 08:39:52 -08:00
Alex Vandiver	32149c6a1c	puppet: Add ksplice uptrack for kernel hotpatches.	2021-02-25 18:05:47 -08:00
Alex Vandiver	173d2dec3d	puppet: Check in defensive restart-camo cron job. This was found on lb1; add it to the camo install on smokescreen.	2021-02-24 16:42:21 -08:00
Alex Vandiver	d15e6990e5	puppet: Only execute setup-apt-repo if necessary. This means that in steady-state, `zulip-puppet-apply` is expected to produce no changes or commands to execute. The verification step of `setup-apt-repo` is quite fast, so this cleans up the output for very little cost.	2021-02-23 18:16:02 -08:00
Alex Vandiver	0b736ef4cf	puppet: Remove puppet_ops configuration for separate loadbalancer host.	2021-02-22 16:05:13 -08:00
Alex Vandiver	e30b524896	iptables: Limit smokescreen port 4750, add camo port. Limit incoming connections to port 4750 to only the smokescreen host, and also allow access to the Camo server on that host, on port 9292.	2021-02-17 13:52:38 -08:00
Alex Vandiver	1caff01463	puppet: Configure nginx for long keep-alives when behind a loadbalancer. These optimizations only make sense when all connections at a TCP level are coming from the same host or set of hosts; as such, they are only enabled if `loadbalancer.ips` is set in the `zulip.conf`.	2021-02-17 10:25:33 -08:00
Alex Vandiver	a88af1b5a2	camo: Install on smokescreen host.	2021-02-16 08:12:31 -08:00
Alex Vandiver	29f60bad20	smokescreen: Put the version into the supervisorctl command. This makes it reload correctly if the version is changed.	2021-02-16 08:12:31 -08:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	5028c081cb	python: Merge concatenated string literals that Black would uglify. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Alex Vandiver	559cdf7317	puppet: Set APT::Periodic::Unattended-Upgrade in apt config. This is required for unattended upgrades to actually run regularly. In some distributions, it may be found in 20auto-upgrades, but placing it here makes it more discoverable.	2021-02-12 08:59:19 -08:00
Ganesh Pawar	65e23dd713	puppet: Add Zulip specific postgresql configuration for 13. Based on the work done in `a03e4784c7`.	2021-02-05 09:30:34 -08:00
Ganesh Pawar	90a3dc8a91	puppet: Add upstream version of postgresql 13 config. This is a prep commit to add provision support for Ubuntu 20.10 Groovy.	2021-02-05 09:30:34 -08:00
Tim Abbott	fd8504e06b	munin: Update to use NAGIOS_BOT_HOST. We haven't actively used this plugin in years, and so it was never converted from the 2014-era monitoring to detect the hostname. This seems worth fixing since we may want to migrate this logic to a more modern monitoring system, and it's helpful to have it correct.	2021-01-27 12:07:09 -08:00
Alex Vandiver	ab035f76de	puppet: Be more restrictive about mm addresses. These will always have only 32 characters after the `mm`.	2021-01-26 10:13:58 -08:00
Alex Vandiver	a53092687e	puppet: Only match incoming gateway address on our mail domain. `79931051bd` allows outgoing emails from localhost, but outgoing recipients are still subjected to virtualmaps. This caused all outgoing email from Zulip with destination addresses containing `.`, `+`, or starting with `mm`, to be redirected back through the email gateway. Bracket the virtualmap addresses used for local delivery to the mail gateway with a restriction on the domain matching the `postfix.mailname` configuration, regex-escaped, so those only apply to email destined for that domain. The hostname is _not_ moved from `mydestination` to `virtual_alias_domains`, as that would preclude delivery to actually-local addresses, like `postmaster@`.	2021-01-26 10:13:58 -08:00
Alex Vandiver	c2526844e9	worker: Remove SignupWorker and friends. ZULIP_FRIENDS_LIST_ID and MAILCHIMP_API_KEY are not currently used in production. This removes the unused 'signups' queue and worker.	2021-01-17 11:16:35 -08:00
Tim Abbott	4ee58f408b	process_fts_updates: Make normal development startup silent. We run this tool at DEBUG log level in production, so we will still see the notice on startup there; this avoids a spammy line in the development environment output.	2020-12-20 12:19:49 -08:00
Sutou Kouhei	0d3f9fc855	install: Use PGroonga packages built for PostgreSQL packages by PGDG. We have always used PostgreSQL packages by PGDG since Zulip 3.0. Fixes #16058.	2020-12-18 15:38:21 -08:00
Alex Vandiver	4868a4fe48	puppet: Set a long timeout on wal-g wal-push, to prevent stalls. `wal-g wal-push` has a known bug with occasionally hanging after file upload to S3[1]; set a rather long timeout on the upload process, so that we don't simply stall forever when archiving WAL segments. [1] https://github.com/wal-g/wal-g/issues/656	2020-11-20 11:32:36 -08:00
Sourabh Rana	419f163906	nginx: Increase file upload size from 25mb to 80mb.	2020-11-19 00:49:49 -08:00
Alex Vandiver	90ca06d873	puppet: Allow unattended upgrades of -updates in addition to -security. This ensures that software will be fully up-to-date, not just with security patches.	2020-11-13 16:45:05 -08:00
Alex Vandiver	2e20ab1658	puppet: Log the "Host" header and total response time. Logging `Host` is useful for determining access patterns to realms, especially if ROOT_DOMAIN_LANDING_PAGE is set. Total response time is useful in debugging access and performance patterns.	2020-11-13 16:42:32 -08:00
Tim Abbott	494a685827	puppet: Fix typo in name of missedmessage_emails consumer. This has been present since this check was introduced in `45c9c3cc30`.	2020-10-29 12:28:54 -07:00
Tim Abbott	ab3cb2b3bf	puppet: Fix internal redis puppet configuration. The inherits rule is required for overriding existing configuration files; while the `::profile` piece was missed in the recent ::profile migration.	2020-10-29 11:53:43 -07:00
Alex Vandiver	6b9d7000b5	puppet: Set proxy environment variables. These are respected by `urllib`, and thus also `requests`. We set `HTTP_proxy`, not `HTTP_PROXY`, because the latter is ignored in situations which might be running under CGI -- in such cases it may be coming from the `Proxy:` header in the request.	2020-10-28 12:17:35 -07:00
Alex Vandiver	8b0f32ee07	puppet: Move environment-setting into configuration, not command.	2020-10-28 12:13:04 -07:00
Alex Vandiver	b9797770d3	provision: Rename backup directory to postgresql.	2020-10-28 11:57:03 -07:00
Alex Vandiver	1f7132f50d	docs: Standardize on PostgreSQL, not Postgres.	2020-10-28 11:55:16 -07:00
Alex Vandiver	eaa99359b1	puppet: Rename to check_postgresql_replication_lag.	2020-10-28 11:51:52 -07:00
Alex Vandiver	53e59a0a13	puppet: Rename check_postgres_backup to check_postgresql_backup.	2020-10-28 11:51:52 -07:00
Alex Vandiver	45f6c79c4a	puppet: Rename postgres_ variables to postgresql_.	2020-10-28 11:51:52 -07:00
Alex Vandiver	e124324050	puppet: Rename postgres_appdb in nagios to postgresql.	2020-10-28 11:51:52 -07:00
Alex Vandiver	a155430eb5	docs: Document all zulip.conf settings. This provides a single reference point for all zulip.conf settings; these mostly link out to the more complete documentation about each setting, elsewhere. Fixes #12490.	2020-10-27 13:31:57 -07:00
Alex Vandiver	e81bc19e45	puppet: Remove shims for old classes, except dockervoyager. The upgrade mechanism in the previous commit negates the need for them -- with the exception of dockervoyager.	2020-10-27 13:29:19 -07:00
Alex Vandiver	d24c571bab	puppet: Automatically back up the database if we have the secrets. This avoids folks having to manually add to the puppet_classes.	2020-10-27 13:29:19 -07:00
Alex Vandiver	e7798d2797	puppet: Move zulip_ops::profile::postgres_appdb to postgresql.	2020-10-27 13:29:19 -07:00
Alex Vandiver	9f25389bff	puppet: Move top-level zulip_ops deployments to zulip_ops::profile.	2020-10-27 13:29:19 -07:00
Alex Vandiver	5365af544a	puppet: Rename zulip::profile::rabbit to ::rabbitmq.	2020-10-27 13:29:19 -07:00
Alex Vandiver	188af57296	puppet: Rename postgres_appdb to postgresql. There is only one PostgreSQL database; the "appdb" is irrelevant. Also use "postgresql," as it is the name of the software, whereas "postgres" is the name of the binary and the colloquial name. This is minor cleanup, but enabled by the other renames in the previous commit.	2020-10-27 13:29:19 -07:00
Alex Vandiver	91cb0988e1	puppet: Generalize docker detection. This also has the benefit of detecting zulip::dockervoyager as well as zulip::profile::docker.	2020-10-27 13:29:19 -07:00
Alex Vandiver	0f25acc7b3	puppet: Rename "voyager"/"dockervoyager" to "standalone"/"docker". The "voyager" name is non-intuitive and not significant. `zulip::voyager` and `zulip::dockervoyager` stubs are kept for back-compatibility with existing `zulip.conf` files.	2020-10-27 13:29:19 -07:00
Alex Vandiver	c2185a81d6	puppet: Move top-level zulip deployments into "profile" directory. This moves the puppet configuration closer to the "roles and profiles method"[1] which is suggested for organizing puppet classes. Notably, here it makes clear which classes are meant to be able to stand alone as deployments. Shims are left behind at the previous names, for compatibility with existing `zulip.conf` files when upgrading. [1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method	2020-10-27 13:29:19 -07:00
Alex Vandiver	27cfb14d92	puppet: Only include zulip::base for top-level deploys. This also removes direct includes of `zulip::common`, making `zulip::base` gatekeep the inclusion of it. This helps enforce that any top-level deploy only needs to include a single class, and that any configuration which is not meant to be deployed by itself will not apply, due to lack of `zulip::common` include. The following commit will better differentiate these top-level deploys by moving them into a subdirectory.	2020-10-27 13:29:19 -07:00
Alex Vandiver	34e8c2c61e	puppet: Move total_memory_mb from zulip::base into zulip::common. This makes `zulip::common` used only for variable-setting, and `zulip::base` used only for resource creation.	2020-10-27 13:29:19 -07:00
Alex Vandiver	7bb888c2ec	puppet: Template supervisor.conf for redhat paths.	2020-10-27 13:29:19 -07:00
Alex Vandiver	3ab9b31d2f	puppet: Purge all un-managed supervisor configuration files. Relying on `defined(Class['...'])` makes the class sensitive to resource evaluation ordering, and thus brittle. It is also only functional for a single service (thumbor). Generalize by using `purge => true` for the directory to automatically remove all un-managed files. This is more general than the previous form, and may result in additional not-managed services being removed.	2020-10-27 13:29:19 -07:00
Alex Vandiver	1d54630b4e	log: Rename email-deliverer.log to match other files.	2020-10-25 14:56:37 -07:00
Alex Vandiver	93d661d119	puppet: Configure logrotate for all logger files. This adds log rotation to all /var/log/zulip files.	2020-10-25 14:56:37 -07:00
Alex Vandiver	c296b5d819	puppet: Allow unattended-upgrades for all but servers. Restarting servers is what can cause service interruptions, and increase risk. Add all of the servers that we use to the list of ignored packages, and uncomment the default allowed-origins in order to enable unattended upgrades.	2020-10-23 16:46:06 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Alex Vandiver	a7d1fd9ffb	puppet: Remove non-working apt::source. `d2aa81858c` replaced the `apt::source` to set up debathena with `Exec['setup-apt-repo-debathena']`, but mistakenly left the `apt::source` in place in `zmirror` (but not `zmirror_personals`). The `apt::source` resource type was later removed in `c9d54f7854`, making the manifest fail to apply on `zmirror`. Remove the broken and unnecessary `apt::source` resource.	2020-10-23 11:31:20 -07:00
Alex Vandiver	48e06c25ba	puppet: Switch nagios SSH checks to id_ed25519 key. The ssh-rsa algorithm was deprecated[1] in OpenSSH 8.2 (2020-02-14) and will be removed in a future release. [1] https://www.openssh.com/txt/release-8.4	2020-10-22 16:42:30 -07:00
Alex Vandiver	0ea20bd7d8	puppet: Move postgres_version into postgres_common. This property is not related to the base zulip install; move it to zulip::postgres_common, which is already used as a namespace for various postgres variables.	2020-10-22 11:32:25 -07:00
Alex Vandiver	25e995b677	puppet: Move normal_queues to the one place that uses it.	2020-10-22 11:32:25 -07:00
Alex Vandiver	423b5c2be2	puppet: Move queue error and stats directories to just the app host.	2020-10-22 11:31:05 -07:00
Alex Vandiver	4d4c21499a	puppet: Move supervisor dependency into process_fts_updates. PostgreSQL itself has no dependency on supervisor; rather, the FTS updates do.	2020-10-22 11:30:53 -07:00
Alex Vandiver	ca971ebc59	puppet: Remove empty zulip_ops class.	2020-10-22 11:30:53 -07:00
Alex Vandiver	16af05758d	puppet: Move zulip_org into zulip_ops. This class is not of general interest.	2020-10-22 11:30:53 -07:00
Alex Vandiver	ad566c491d	puppet: Drop now-unused zulip_ops::git class.	2020-10-22 11:30:53 -07:00
Alex Vandiver	50e9e2ed20	puppet: Make zulip::base include zulip::apt_repository. There was likely more dependency complexity prior to `97766102df`, but there is now no reason to require that consumers explicitly include zulip::apt_repository.	2020-10-22 11:30:53 -07:00
Alex Vandiver	2dc6d26ec6	puppet: Fix included monitoring class name.	2020-10-19 22:30:20 -07:00
Alex Vandiver	7a1132d605	puppet: Switch golang and smokescreen to use /srv. /srv and /opt have very similar usages; but we should be internally consistent. Move these two (the only usages of /opt) to match the rest in /srv.	2020-10-16 13:00:06 -07:00
Alex Vandiver	78b92a51cc	puppet: Allow access to smokescreen port via iptables.	2020-10-15 15:18:35 -07:00
Alex Vandiver	0d5356969e	puppet: Reformat ipv4 iptables rules comments.	2020-10-15 15:18:35 -07:00
Alex Vandiver	fffea9612b	puppet: Add an outgoing HTTP/HTTPS proxy server. Use https://github.com/stripe/smokescreen to provide a server for an outgoing proxy, run under supervisor. This will allow centralized blocking of internal metadata IPs, localhost, and so forth, as well as providing default request timeouts (10s by default).	2020-10-15 15:18:35 -07:00
Anders Kaseorg	dfaea9df65	shfmt: Reformat shell scripts with shfmt. https://github.com/mvdan/sh Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-15 15:16:00 -07:00
Alex Vandiver	f61ac4a28d	puppet: Move frontend monitoring into its own file. This allows it to be pulled in for deploys like czo, which don't use the full `zulip_ops::app_frontend`, but we wish to monitor.	2020-10-13 17:37:32 -07:00
Tim Abbott	7c2c82b190	nginx: Update nginx configuration for fhir/hl7 organization. We should eventually add templating for the set of hosts here, but it's worth merging this change to remove the deleted hostname and replace it with the current one.	2020-10-13 16:50:26 -07:00
Anders Kaseorg	723d285e46	nginx: Redirect {www.,}zulipchat.com, www.zulip.com to zulip.com. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-13 16:49:23 -07:00
Alex Vandiver	c8df9a150e	puppet: Drop all log2zulip configuration. Disabled on webservers in `047817b6b0`, it has since lingered in configuration, as well as running (to no effect) every minute on the loadbalancer. Remove the vestiges of its configuration.	2020-10-13 11:00:50 -07:00
Alex Vandiver	b431b1b021	puppet: Remove misleading motd. This banner shows on lb1, advertising itself as lb0. There is no compelling reason for a custom motd, especially one which needs to be reconfigured for each host.	2020-10-13 11:00:36 -07:00
Alex Vandiver	45c9c3cc30	queue: Monitor user_activity queue, now that it has a consumer. Since this was using repeated individual get() calls previously, it could not be monitored for having a consumer. Add it in, by marking it of queue type "consumer" (the default), and adding Nagios lines for it. Also adjust missedmessage_emails to be monitored; it stopped using LoopQueueProcessingWorker in `5cec566cb9`, but was never added back into the set of monitored consumers.	2020-10-11 14:19:42 -07:00
Alex Vandiver	4fd7df4e8c	puppet: Remove absent of check-apns-tokens. This was marked as ensure absent in `d02101a401`, in v1.7.0 in 2017.	2020-09-29 18:17:08 -07:00
Alex Vandiver	872a349508	puppet: Remove absent of log2zulip. This was marked as ensure absent in `047817b6b0`, in v2.0.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	0137772fdb	puppet: Remove absent of calculate-first-visible-message-id. This was marked as ensure absent in `dc7d44a245`, in v1.9.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	966c8dc23d	puppet: Remove absent of email-mirror cron job. This was marked as ensure absent in `24f8492236`, in v1.3.0 in 2014.	2020-09-29 18:17:08 -07:00
Alex Vandiver	430d3b8554	puppet: Remove absent of libapache2-mod-wsgi. This was marked as ensure absent in `89b97e7480`, in v1.7.0 in 2017, though it did not take effect until `6e55aa2ce6`, in v1.9.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	12085552d5	puppet: Tidy indentation.	2020-09-29 17:44:44 -07:00
Alex Vandiver	57d88eedd8	puppet: Only install rabbitmq cron jobs via zulip_ops. The rabbitmq cron jobs exist in order to call rabbitmqctl as root and write the output to files that nagios can consume, since nagios is not allowed to run rabbitmqctl. In systems which do not have nagios configured, these every-minute cron jobs add non-insignificant load, to no effect. Move their installation into `zulip_ops`. In doing so, also combine the cron.d files into a single file; this allows us to `ensure => absent` the old filenames, removing them from existing systems. Leave the resulting combined cron.d file in `zulip`, since it is still of general utility and note.	2020-09-29 17:44:44 -07:00
Alex Vandiver	79931051bd	puppet: Permit outgoing mail from postfix. The configuration change made in `1c17583ad5` only allowed delivery to those specific Zulip addresses. However, it also prevented the mailserver from being used as an outgoing email relay from Zulip, since all mail that passed through the mailserver (from any originator) was required to have a `RCPT TO` that matched those regexes. Allow mail originating from `mynetworks` to have arbitrary addresses in `RCPT TO`.	2020-09-25 15:09:27 -07:00
Alex Vandiver	36ea307fbf	puppet: Depend other changes on sharding.py validation. Use the validation of the tornado sharding config that `stage_updated_sharding` does, by depending on it. This ensures that we don't write out a supervisor or nginx config based on a bad (e.g. non-sequential) list of tornado ports.	2020-09-25 10:52:40 -07:00
Alex Vandiver	c0e240277b	tornado: Remove fingerprinting, write out .tmp files always. Fingerprinting the config is somewhat brittle -- it requires custom bootstrapping for old (fingerprint-less) configs, and may have false-positives. Since generating the config is lightweight, do so into the .tmp files, and compare the output to the originals to determine if there are changes to apply. In order to both surface errors, as well as notify the user in case a restart is necessary, we must run it twice. The `onlyif` functionality cannot show configuration errors to the user, only determine if the command runs or not. We thus run the command once, judging errors as "interesting" enough to run the actual command, whose failure will be verbose in Puppet and halt any steps that depend on it. Removing the `onlyif` would result in `stage_updated_sharding` showing up in the output of every Puppet run, which obscures the important messages it displays when an update to sharding is necessary. Removing the `command` (e.g. making it an `echo`) would result in removing the ability to report configuration errors. We thus have no choice but to run it twice; this is thankfully low-overhead.	2020-09-25 10:52:40 -07:00
Alex Vandiver	2a12fedcf1	tornado: Remove explicit tornado_processes setting; compute it. We can compute the intended number of processes from the sharding configuration. In doing so, also validate that all of the ports are contiguous. This removes a discrepancy between `scripts/lib/sharding.py` and other parts of the codebase about if merely having a `[tornado_sharding]` section is sufficient to enable sharding. Having behaviour which changes merely based on if an empty section exists is surprising. This does require that a (presumably empty) `9800` configuration line exist, but making that default explicit is useful. After this commit, configuring sharding can be done by adding to `zulip.conf`: ``` [tornado_sharding] 9800 = # default 9801 = other_realm ``` Followed by running `./scripts/refresh-sharding-and-restart`.	2020-09-18 15:13:40 -07:00
Alex Vandiver	f638518722	tornado: Move default production port to 9800. In development and test, we keep the Tornado port at 9993 and 9983, respectively; this allows tests to run while a dev instance is running. In production, moving to port 9800 consistently removes an odd edge case, when just one worker is on an entirely different port than if two workers are used.	2020-09-18 15:13:40 -07:00
Alex Vandiver	ff94254598	tornado: Log to files by port number. Without an explicit port number, the `stdout_logfile` values for each port are identical. Supervisor apparently decides that it will de-conflict this by appending an arbitrary number to the end: ``` /var/log/zulip/tornado.log /var/log/zulip/tornado.log.1 /var/log/zulip/tornado.log.10 /var/log/zulip/tornado.log.2 /var/log/zulip/tornado.log.3 /var/log/zulip/tornado.log.7 /var/log/zulip/tornado.log.8 /var/log/zulip/tornado.log.9 ``` This is quite confusing, since most other files in `/var/log/zulip/` use `.1` to mean logrotate was used. Also note that these are not all sequential -- 4, 5, and 6 are mysteriously missing, though they were used in previous restarts. This can make it extremely hard to debug logs from a particular Tornado shard. Give the logfiles a consistent name, and set them up to logrotate.	2020-09-14 22:17:51 -07:00
Alex Vandiver	efdaa58c24	supervisor: Use more specific process_name than "port-9800". Making this include "zulip-tornado" makes it clearer in supervisor logs. Without this, one only sees: ``` 2020-09-14 03:43:13,788 INFO waiting for port-9807 to stop 2020-09-14 03:43:14,466 INFO stopped: port-9807 (exit status 1) 2020-09-14 03:43:14,469 INFO spawned: 'port-9807' with pid 24289 2020-09-14 03:43:15,470 INFO success: port-9807 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs) ```	2020-09-14 22:17:51 -07:00
Alex Vandiver	e9d0bdea65	puppet: Coerce uwsgi_listen_backlog_limit into an int before doing math.	2020-09-14 21:22:13 -07:00
Alex Vandiver	8adf530400	puppet: Generate sharding in puppet, then refresh-sharding-and-restart. This supports running puppet to pick up new sharding changes, which will warn of the need to finalize them via `refresh-sharding-and-restart`, or simply running that directly.	2020-09-14 16:27:15 -07:00
Alex Vandiver	0de356c2df	puppet: Move generation of tornado nginx upstreams into tornado_sharding. This puts the creation of the upstreams referenced by `nginx_sharding.conf` adjacent to their use.	2020-09-14 16:27:15 -07:00
Alex Vandiver	bf029d99f1	sharding: Also mark sharding.json 644 for consistency. There is no reason to limit this to 640; mark it 644 for consistency with the other file.	2020-09-14 16:27:15 -07:00
Alex Vandiver	1c17583ad5	puppet: Restrict postfix incoming addresses to postmaster and zulip. This removes the possibility of local user enumeration via RCPT TO.	2020-09-11 18:49:22 -07:00
Alex Vandiver	482c964dd3	puppet: Logrotate for webhook exceptions.	2020-09-10 17:47:21 -07:00
Alex Vandiver	e38051736d	puppet: Wrap and sort logrotate config.	2020-09-10 17:47:21 -07:00
Anders Kaseorg	75c59a820d	python: Convert subprocess.Popen.communicate to run or check_output. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 17:42:35 -07:00
Anders Kaseorg	fbfd4b399d	python: Elide action="store" for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 16:17:14 -07:00
Anders Kaseorg	1f2ac1962f	python: Elide default=None for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 16:17:14 -07:00
Anders Kaseorg	d751e0cece	puppet: Don’t install netcat. It’s been unused since commit `0af22dad18` (#13239). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 10:33:47 -07:00
Anders Kaseorg	ab120a03bc	python: Replace unnecessary intermediate lists with generators. Mostly suggested by the flake8-comprehension plugin. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:15:41 -07:00
Anders Kaseorg	a5dbab8fb0	python: Remove redundant dest for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:04:10 -07:00
Anders Kaseorg	dbdf67301b	memcached: Switch from pylibmc to python-binary-memcached. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-08-06 12:51:14 -07:00
Casper Kvan Clausen	ed7a6d5e4d	puppet: Support nginx_listen_port with http_only	2020-08-03 12:58:12 -07:00
Alex Vandiver	cd530d627b	uwsgi: Stop generating IOError and SIGPIPE on client close. Clients that close their socket to nginx suddenly also cause nginx to close its connection to uwsgi. When uwsgi finishes computing the response, it thus tries to write to a closed socket, and generates either IOError or SIGPIPE failures. Since these are caused by the _client_ closing the connection suddenly, they are not actionable by the server. At particularly high volumes, this could represent some sort of server-side failure; however, this is better detected by examining status codes at the loadbalancer. nginx uses the error code 499 for this occurrence: https://httpstatuses.com/499 Stop uwsgi from generating this family of exception entirely, using configuration for uwsgi[1]; it documents these errors as "(annoying)," hinting at their general utility. [1] https://uwsgi-docs.readthedocs.io/en/latest/Options.html#ignore-sigpipe	2020-07-31 10:40:09 -07:00
Alex Vandiver	ceb909dbc5	puppet: Increase backlogged socket count based on uwsgi backlog. Increasing the uwsgi listen backlog is intended to allow it to handle higher connection rates during server restart, when many clients may be trying to connect. The kernel, in turn, needs to have a proportionally increased somaxconn so as to not refuse the connection. Set somaxconn to 2x the uwsgi backlog, but no lower than the default (128).	2020-07-28 21:16:26 -07:00
Alex Vandiver	38d01cd4db	puppet: Generalize install-wal-g to be arbitrary tarballs.	2020-07-24 17:24:57 -07:00
Tim Abbott	5a1243db3c	puppet: Use correct scope for zulip_ops::munin_plugin.	2020-07-15 21:49:45 -07:00
Alex Vandiver	48c3c33d10	puppet: Fully-qualify the munin-plugin name	2020-07-14 17:58:51 -07:00
Alex Vandiver	c68333040b	puppet: Revert PostgreSQL setting of recovery_target_timeline. Prior to PostgreSQL 12, the `recovery_target_timeline` setting is only valid in a `recovery.conf` file, as that file has its own configuration parser. As such, including it in `postgresql.conf` results in an error, and PostgreSQL will fail to start. Remove the setting, reverting `bff3b540b1`. This fixes PostgreSQL 9.5, 9.6, 10, and 11; while the setting is not an error in a PostgreSQL 12 configuration file, it is unnecessary since `latest` is the default.	2020-07-14 16:28:20 -07:00
Alex Vandiver	31d80a77d4	puppet: Update nagios check_postgres_replication_lag to be on DB hosts. `7d4a370a57` attempted to move the replication check to on the PostgreSQL hosts. While it updated the _check_ to assume it was running and talking to a local PostgreSQL instance, the configuration and installation for the check were not updated. As such, the check ran on the nagios host for each DB host, and produced no output. Start distributing the check to all appdb hosts, and configure nagios to use the SSH tunnel to get there.	2020-07-14 16:27:18 -07:00
Alex Vandiver	2174db27db	puppet: Put the dependencies on pg_backup_and_purge itself, and ensure them.	2020-07-14 00:40:25 -07:00
Alex Vandiver	6c27f07c1d	puppet: Move PostgreSQL backups to their own class. wal-g was used in `puppet/zulip` by env-wal-g, but only installed in `puppet/zulip_ops`. Merge all of the dependencies of doing backups using wal-g (wal-g installation, the pg_backup_and_purge job, the nagios plugin that verifies it happens) into a common base class in `puppet/zulip`, since it is generally useful.	2020-07-14 00:40:25 -07:00
Anders Kaseorg	15483c09cb	puppet: Add missing trailing commas. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-13 15:36:06 -07:00
Alex Vandiver	3691a94efe	puppet: Configure munin and nagios under apache with puppet. This swaps in the actually-in-use munin configuration file; otherwise, it is an implementation of the configuration as it exists on the machine.	2020-07-13 13:23:11 -07:00
Alex Vandiver	4e42164b4a	munin: Add plugins to prod hosts.	2020-07-13 13:23:11 -07:00
Alex Vandiver	2a14212b27	munin: Add a helper resource definition for munin plugins.	2020-07-13 12:49:28 -07:00
Alex Vandiver	7c7b5fcd6f	munin: Deal with spaces in the channel names.	2020-07-13 12:49:28 -07:00
Alex Vandiver	eda2c4b8e2	puppet: Split munin-node from munin-server. No plugins are installed inside the /usr/local/munin/lib this creates in munin-node, nor are they symlinked into /etc/munin/plugins, so no non-default plugins are added by this.	2020-07-13 12:49:28 -07:00
Alex Vandiver	ddc7bb5a45	munin: Fix the path to check_send_receive_time.	2020-07-13 12:49:28 -07:00
Alex Vandiver	8be544e7eb	munin: Rename monitoring plugin to use zulip name, not humbug.	2020-07-13 12:49:28 -07:00

1342 Commits