zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	b01d43f339	mypy: Fix strict_equality violations. puppet/zulip/files/nagios_plugins/zulip_postgresql/check_postgresql_replication_lag:98: error: Non-overlapping equality check (left operand type: "List[List[str]]", right operand type: "Literal[0]") [comparison-overlap] zerver/tests/test_realm.py:650: error: Non-overlapping container check (element type: "Dict[str, Any]", container item type: "str") [comparison-overlap] Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-13 09:18:18 -07:00
Alex Vandiver	93f3b41811	puppet: Also move avatars to the same nginx include file.	2021-04-09 08:28:42 -07:00
Alex Vandiver	aae8f454ce	puppet: Simplify uploads handling. `uploads-route.noserve` and `uploads-route.internal` contained identical location blocks for `/upload`, since differentiation was necessary for Trusty until 33c941407b72; move the now-common sections into `app`. This leaves the only differences between internal and S3 serving as a single block which should be included or not based on config; move it to a file which may or may not be placed in `app.d/`.	2021-04-09 08:28:42 -07:00
Alex Vandiver	fb26c6b7ca	puppet: Move uwsgi_pass setting into uwsgi_params. We only ever call `uwsgi_pass django` in association with `include uwsgi_params`; refactor it in.	2021-04-09 08:28:42 -07:00
Alex Vandiver	9cf9d5f2cf	puppet: Move HTTP_X_REAL_IP setting into uwsgi_params. This effectively also adds it to serving `/user_uploads`, where its lack would cause failures to list the actual IP address.	2021-04-09 08:28:42 -07:00
Alex Vandiver	795517bd52	puppet: Only set X-Real-IP once. `07779ea879` added an additional `proxy_set_header` of `X-Real-IP` to `puppet/zulip/files/nginx/zulip-include-common/proxy`; as noted in that commit, Tornado longpoll proxies already included such a line. Unfortunately, this equates to setting that header _twice_ for Tornado ports, like so: ``` X-Real-Ip: 198.199.116.58 X-Real-Ip: 198.199.116.58 ``` ...which is represented, once parsed by Django, as an IP of `198.199.116.58, 198.199.116.58`. For IPv4, this odd "IP address" has no problems, and appears in the access logs accordingly; for IPv6 addresses, however, its length is such that it overflows a call to `getaddrinfo` when attempting to determine the validity of the IP. Remove the now-duplicated inclusion of the header.	2021-04-09 08:28:42 -07:00
Alex Vandiver	07779ea879	middleware: Do not trust X-Forwarded-For; use X-Real-Ip, set from nginx. The `X-Forwarded-For` header is a list of proxies' IP addresses; each proxy appends the remote address of the host it received its request from to the list, as it passes the request down. A naïve parsing, as SetRemoteAddrFromForwardedFor did, would thus interpret the first address in the list as the client's IP. However, clients can pass in arbitrary `X-Forwarded-For` headers, which would allow them to spoof their IP address. `nginx`'s behavior is to treat the addresses as untrusted unless they match an allowlist of known proxies. By setting `real_ip_recursive on`, it also allows this behavior to be applied repeatedly, moving from right to left down the `X-Forwarded-For` list, stopping at the right-most that is untrusted. Rather than re-implement this logic in Django, pass the first untrusted value that `nginx` computes down into Django via `X-Real-Ip` header. This allows consistent IP addresses in logs between `nginx` and Django. Proxied calls into Tornado (which don't use UWSGI) already passed this header, as Tornado logging respects it.	2021-03-31 14:19:38 -07:00
Anders Kaseorg	29e4c71ec4	puppet: Reformat custom Ruby modules with Rufo. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-24 12:12:04 -07:00
Alex Vandiver	6ee74b3433	puppet: Check health of APT repository.	2021-03-23 19:27:42 -07:00
Alex Vandiver	c01345d20c	puppet: Add nagios check for long-lived certs that do not auto-renew.	2021-03-23 19:27:27 -07:00
Alex Vandiver	9ea86c861b	puppet: Add a nagios alert configuration for smokescreen. This verifies that the proxy is working by accessing a highly-available website through it. Since failure of this equates to failures of Sentry notifications and Android mobile push notifications, this is a paging service.	2021-03-18 10:11:15 -07:00
Anders Kaseorg	129ea6dd11	nginx: Consistently listen on IPv6 and with HTTP/2. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-17 17:46:32 -07:00
Alex Vandiver	15c58cce5a	puppet: Create new nginx logfiles as the zulip user, not as www-data. All of `/var/log/nginx/` is chown'd to `zulip` and the nginx processes themselves run as `zulip`, and would thus (on their own) create new logfiles as `zulip`. Having `logrotate` create them as the package default of `www-data` means that they are momentarily unreadable by the `zulip` user just after rotation, which can cause problems with logtail scripts. Commit the standard `nginx` logrotate configuration, but with the `zulip` user instead of the `www-data` user.	2021-03-16 14:45:13 -07:00
Alex Vandiver	3314fefaec	puppet: Do not require a venv for zulip-puppet-apply. `0663b23d54` changed zulip-puppet-apply to use the venv, because it began using `yaml` to parse the output of puppet to determine if changes would happen. However, not every install ends with a venv; notably, non-frontend servers do not have one. Attempting to run zulip-puppet-apply on them hence now fails. Remove this dependency on the venv, by installing a system python3-yaml package -- though in reality, this package is already an indirect dependency of the system. Especially since pyyaml is quite stable, we're not using it in any interesting way, and it does not actually add to the dependencies, it is preferable to parsing the YAML by hand in this instance.	2021-03-14 17:50:57 -07:00
Alex Vandiver	52f155873f	puppet: Ensure that all `scripts/lib/install` packages are installed. These have all been required packages for some time, but this helps keep the install-time list more clearly a subset of the upgrade-time list.	2021-03-14 17:50:57 -07:00
Alex Vandiver	06c07109e4	puppet: Add missing semicolons left off in `ba3b88c81b`.	2021-03-12 15:48:53 -08:00
Alex Vandiver	024282b51e	Revert "puppet: Use rabbitmq as the user for its config files." This reverts commit `211232978f`. The `rabbitmq` user does not exist yet on first install, and the goal is to create the `rabbitmq-env.conf` file before the package is installed.	2021-03-12 15:37:19 -08:00
Alex Vandiver	ba3b88c81b	puppet: Explicitly use the snakeoil certificates for nginx. In production, the `wildcard-zulipchat.com.combined-chain.crt` file is just a symlink to the snakeoil certificates; but we do not puppet that symlink, which makes new hosts fail to start cleanly. Instead, point explicitly to the snakeoil certificate, and explain why.	2021-03-12 13:31:54 -08:00
Alex Vandiver	211232978f	puppet: Use rabbitmq as the user for its config files. This matches the initial ownership by the `rabbitmq-server` package.	2021-03-12 13:31:03 -08:00
Alex Vandiver	ef188af82d	puppet: Use two location blocks, instead of nesting them. Directives in `location` blocks may or may not inherit from surrounding `location` blocks; specifically, `add_header` directives do not[1]: > There could be several add_header directives. These directives are > inherited from the previous configuration level if and only if there > are no add_header directives defined on the current level. In order to maintain the same headers (including, critically, `Access-Control-Allow-Origin`) as the surrounding block, all `add_header` directives must thus be repeated (which includes the `include`). For clarity, un-nest and repeat the entire `location` block as was used for `/static/`, but with the additional `add_header`. This is preferred to the use of an `if $request_uri` statement to add the header, as those can have unexpected or undefined results[2]. [1] http://nginx.org/en/docs/http/ngx_http_headers_module.html#add_header [2] https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/	2021-03-11 21:09:15 -08:00
Alex Vandiver	306bf930f5	puppet: Add a warning if ksplice is enabled but has no key set.	2021-03-10 17:57:20 -08:00
Alex Vandiver	a215c83c2d	puppet: Switch to more explicit variable rather than reuse a nagios one. Redis is not nagios, and this only leads to confusion as to why there is a nagios domain setting on frontend servers; it also leaves the `redis0` part of the name buried in the template. Switch to an explicit variable for the redis hostname.	2021-03-10 11:44:54 -08:00
Alex Vandiver	a5b29398fc	puppet: Only install ksplice uptrack if there is an access key.	2021-03-10 11:44:11 -08:00
Alex Vandiver	189e86e18e	puppet: Set aggressive caching headers on immutable webpack files. A partial fix for #3470.	2021-03-07 22:00:32 -08:00
Alex Vandiver	e63f170027	puppet: Add access time and host to nginx access logs. `2e20ab1658` attempted to add this; but there are multiple locations that access logs are set, and the most specific wins.	2021-03-04 18:06:47 -08:00
Alex Vandiver	8961885b0f	puppet: Add smokescreen to logrotate.	2021-03-02 17:16:38 -08:00
Alex Vandiver	d938dd9d4a	puppet: Document smokescreen installation, and move to puppet/zulip/. This is more broadly useful than for just Kandra; provide documentation and means to install Smokescreen for stand-alone servers, and motivate its use somewhat more.	2021-03-02 17:16:38 -08:00
Alex Vandiver	2f5eae5c68	puppet: Minor formatting.	2021-02-28 17:03:29 -08:00
Alex Vandiver	a759d26a32	puppet: Make ksplice config not world-readable, use 'adm' group. This matches the configuration that ksplice itself creates the file and directory with.	2021-02-28 17:03:29 -08:00
Tim Abbott	957c16aa77	nagios: Tweak prod load monitoring parameters. Ultimately this monitoring isn't that helpful, but we're mainly interested in when it spikes to very high numbers.	2021-02-26 08:39:52 -08:00
Alex Vandiver	32149c6a1c	puppet: Add ksplice uptrack for kernel hotpatches.	2021-02-25 18:05:47 -08:00
Alex Vandiver	173d2dec3d	puppet: Check in defensive restart-camo cron job. This was found on lb1; add it to the camo install on smokescreen.	2021-02-24 16:42:21 -08:00
Alex Vandiver	d15e6990e5	puppet: Only execute setup-apt-repo if necessary. This means that in steady-state, `zulip-puppet-apply` is expected to produce no changes or commands to execute. The verification step of `setup-apt-repo` is quite fast, so this cleans up the output for very little cost.	2021-02-23 18:16:02 -08:00
Alex Vandiver	0b736ef4cf	puppet: Remove puppet_ops configuration for separate loadbalancer host.	2021-02-22 16:05:13 -08:00
Alex Vandiver	e30b524896	iptables: Limit smokescreen port 4750, add camo port. Limit incoming connections to port 4750 to only the smokescreen host, and also allow access to the Camo server on that host, on port 9292.	2021-02-17 13:52:38 -08:00
Alex Vandiver	1caff01463	puppet: Configure nginx for long keep-alives when behind a loadbalancer. These optimizations only make sense when all connections at a TCP level are coming from the same host or set of hosts; as such, they are only enabled if `loadbalancer.ips` is set in the `zulip.conf`.	2021-02-17 10:25:33 -08:00
Alex Vandiver	a88af1b5a2	camo: Install on smokescreen host.	2021-02-16 08:12:31 -08:00
Alex Vandiver	29f60bad20	smokescreen: Put the version into the supervisorctl command. This makes it reload correctly if the version is changed.	2021-02-16 08:12:31 -08:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	5028c081cb	python: Merge concatenated string literals that Black would uglify. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Alex Vandiver	559cdf7317	puppet: Set APT::Periodic::Unattended-Upgrade in apt config. This is required for unattended upgrades to actually run regularly. In some distributions, it may be found in 20auto-upgrades, but placing it here makes it more discoverable.	2021-02-12 08:59:19 -08:00
Ganesh Pawar	65e23dd713	puppet: Add Zulip specific postgresql configuration for 13. Based on the work done in `a03e4784c7`.	2021-02-05 09:30:34 -08:00
Ganesh Pawar	90a3dc8a91	puppet: Add upstream version of postgresql 13 config. This is a prep commit to add provision support for Ubuntu 20.10 Groovy.	2021-02-05 09:30:34 -08:00
Tim Abbott	fd8504e06b	munin: Update to use NAGIOS_BOT_HOST. We haven't actively used this plugin in years, and so it was never converted from the 2014-era monitoring to detect the hostname. This seems worth fixing since we may want to migrate this logic to a more modern monitoring system, and it's helpful to have it correct.	2021-01-27 12:07:09 -08:00
Alex Vandiver	ab035f76de	puppet: Be more restrictive about mm addresses. These will always have only 32 characters after the `mm`.	2021-01-26 10:13:58 -08:00
Alex Vandiver	a53092687e	puppet: Only match incoming gateway address on our mail domain. `79931051bd` allows outgoing emails from localhost, but outgoing recipients are still subjected to virtualmaps. This caused all outgoing email from Zulip with destination addresses containing `.`, `+`, or starting with `mm`, to be redirected back through the email gateway. Bracket the virtualmap addresses used for local delivery to the mail gateway with a restriction on the domain matching the `postfix.mailname` configuration, regex-escaped, so those only apply to email destined for that domain. The hostname is _not_ moved from `mydestination` to `virtual_alias_domains`, as that would preclude delivery to actually-local addresses, like `postmaster@`.	2021-01-26 10:13:58 -08:00
Alex Vandiver	c2526844e9	worker: Remove SignupWorker and friends. ZULIP_FRIENDS_LIST_ID and MAILCHIMP_API_KEY are not currently used in production. This removes the unused 'signups' queue and worker.	2021-01-17 11:16:35 -08:00
Tim Abbott	4ee58f408b	process_fts_updates: Make normal development startup silent. We run this tool at DEBUG log level in production, so we will still see the notice on startup there; this avoids a spammy line in the development environment output.	2020-12-20 12:19:49 -08:00
Sutou Kouhei	0d3f9fc855	install: Use PGroonga packages built for PostgreSQL packages by PGDG. Because we always use PostgreSQL packages by PGDG since Zulip 3.0. Fixes #16058.	2020-12-18 15:38:21 -08:00
Alex Vandiver	4868a4fe48	puppet: Set a long timeout on wal-g wal-push, to prevent stalls. `wal-g wal-push` has a known bug with occasionally hanging after file upload to S3[1]; set a rather long timeout on the upload process, so that we don't simply stall forever when archiving WAL segments. [1] https://github.com/wal-g/wal-g/issues/656	2020-11-20 11:32:36 -08:00
Sourabh Rana	419f163906	nginx: Increase file upload size from 25mb to 80mb.	2020-11-19 00:49:49 -08:00
Alex Vandiver	90ca06d873	puppet: Allow unattended upgrades of -updates in addition to -security. This ensures that software will be fully up-to-date, not just with security patches.	2020-11-13 16:45:05 -08:00
Alex Vandiver	2e20ab1658	puppet: Log the "Host" header and total response time. Logging `Host` is useful for determining access patterns to realms, especially if ROOT_DOMAIN_LANDING_PAGE is set. Total response time is useful in debugging access and performance patterns.	2020-11-13 16:42:32 -08:00
Tim Abbott	494a685827	puppet: Fix typo in name of missedmessage_emails consumer. This has been present since this check was introduced in `45c9c3cc30`.	2020-10-29 12:28:54 -07:00
Tim Abbott	ab3cb2b3bf	puppet: Fix internal redis puppet configuration. The inherits rule is required for overriding existing configuration files; while the `::profile` piece was missed in the recent ::profile migration.	2020-10-29 11:53:43 -07:00
Alex Vandiver	6b9d7000b5	puppet: Set proxy environment variables. These are respected by `urllib`, and thus also `requests`. We set `HTTP_proxy`, not `HTTP_PROXY`, because the latter is ignored in situations which might be running under CGI -- in such cases it may be coming from the `Proxy:` header in the request.	2020-10-28 12:17:35 -07:00
Alex Vandiver	8b0f32ee07	puppet: Move environment-setting into configuration, not command.	2020-10-28 12:13:04 -07:00
Alex Vandiver	b9797770d3	provision: Rename backup directory to postgresql.	2020-10-28 11:57:03 -07:00
Alex Vandiver	1f7132f50d	docs: Standardize on PostgreSQL, not Postgres.	2020-10-28 11:55:16 -07:00
Alex Vandiver	eaa99359b1	puppet: Rename to check_postgresql_replication_lag.	2020-10-28 11:51:52 -07:00
Alex Vandiver	53e59a0a13	puppet: Rename check_postgres_backup to check_postgresql_backup.	2020-10-28 11:51:52 -07:00
Alex Vandiver	45f6c79c4a	puppet: Rename postgres_ variables to postgresql_.	2020-10-28 11:51:52 -07:00
Alex Vandiver	e124324050	puppet: Rename postgres_appdb in nagios to postgresql.	2020-10-28 11:51:52 -07:00
Alex Vandiver	a155430eb5	docs: Document all zulip.conf settings. This provides a single reference point for all zulip.conf settings; these mostly link out to the more complete documentation about each setting, elsewhere. Fixes #12490.	2020-10-27 13:31:57 -07:00
Alex Vandiver	e81bc19e45	puppet: Remove shims for old classes, except dockervoyager. The upgrade mechanism in the previous commit negates the need for them -- with the exception of dockervoyager.	2020-10-27 13:29:19 -07:00
Alex Vandiver	d24c571bab	puppet: Automatically back up the database if we have the secrets. This avoids folks having to manually add to the puppet_classes.	2020-10-27 13:29:19 -07:00
Alex Vandiver	e7798d2797	puppet: Move zulip_ops::profile::postgres_appdb to postgresql.	2020-10-27 13:29:19 -07:00
Alex Vandiver	9f25389bff	puppet: Move top-level zulip_ops deployments to zulip_ops::profile.	2020-10-27 13:29:19 -07:00
Alex Vandiver	5365af544a	puppet: Rename zulip::profile::rabbit to ::rabbitmq.	2020-10-27 13:29:19 -07:00
Alex Vandiver	188af57296	puppet: Rename postgres_appdb to postgresql. There is only one PostgreSQL database; the "appdb" is irrelevant. Also use "postgresql," as it is the name of the software, whereas "postgres" is the name of the binary and colloquial name. This is minor cleanup, but enabled by the other renames in the previous commit.	2020-10-27 13:29:19 -07:00
Alex Vandiver	91cb0988e1	puppet: Generalize docker detection. This also has the benefit of detecting zulip::dockervoyager as well as zulip::profile::docker.	2020-10-27 13:29:19 -07:00
Alex Vandiver	0f25acc7b3	puppet: Rename "voyager"/"dockervoyager" to "standalone"/"docker". The "voyager" name is non-intuitive and not significant. `zulip::voyager` and `zulip::dockervoyager` stubs are kept for back-compatibility with existing `zulip.conf` files.	2020-10-27 13:29:19 -07:00
Alex Vandiver	c2185a81d6	puppet: Move top-level zulip deployments into "profile" directory. This moves the puppet configuration closer to the "roles and profiles method"[1] which is suggested for organizing puppet classes. Notably, here it makes clear which classes are meant to be able to stand alone as deployments. Shims are left behind at the previous names, for compatibility with existing `zulip.conf` files when upgrading. [1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method	2020-10-27 13:29:19 -07:00
Alex Vandiver	27cfb14d92	puppet: Only include zulip::base for top-level deploys. This also removes direct includes of `zulip::common`, making `zulip::base` gatekeep the inclusion of it. This helps enforce that any top-level deploy only needs to include a single class, and that any configuration which is not meant to be deployed by itself will not apply, due to lack of `zulip::common` include. The following commit will better differentiate these top-level deploys by moving them into a subdirectory.	2020-10-27 13:29:19 -07:00
Alex Vandiver	34e8c2c61e	puppet: Move total_memory_mb from zulip::base into zulip::common. This makes `zulip::common` used only for variable-setting, and `zulip::base` used only for resource creation.	2020-10-27 13:29:19 -07:00
Alex Vandiver	7bb888c2ec	puppet: Template supervisor.conf for redhat paths.	2020-10-27 13:29:19 -07:00
Alex Vandiver	3ab9b31d2f	puppet: Purge all un-managed supervisor configuration files. Relying on `defined(Class['...'])` makes the class sensitive to resource evaluation ordering, and thus brittle. It is also only functional for a single service (thumbor). Generalize by using `purge => true` for the directory to automatically remove all un-managed files. This is more general than the previous form, and may result in additional not-managed services being removed.	2020-10-27 13:29:19 -07:00
Alex Vandiver	1d54630b4e	log: Rename email-deliverer.log to match other files.	2020-10-25 14:56:37 -07:00
Alex Vandiver	93d661d119	puppet: Configure logrotate for all logger files. This adds log rotation to all /var/log/zulip files.	2020-10-25 14:56:37 -07:00
Alex Vandiver	c296b5d819	puppet: Allow unattended-upgrades for all but servers. Restarting servers is what can cause service interruptions, and increase risk. Add all of the servers that we use to the list of ignored packages, and uncomment the default allowed-origins in order to enable unattended upgrades.	2020-10-23 16:46:06 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Alex Vandiver	a7d1fd9ffb	puppet: Remove non-working apt::source. `d2aa81858c` replaced the `apt::source` to set up debathena with `Exec['setup-apt-repo-debathena']`, but mistakenly left the `apt::source` in place in `zmirror` (but not `zmirror_personals`). The `apt::source` resource type was later removed in `c9d54f7854`, making the manifest fail to apply on `zmirror`. Remove the broken and unnecessary `apt::source` resource.	2020-10-23 11:31:20 -07:00
Alex Vandiver	48e06c25ba	puppet: Switch nagios SSH checks to id_ed25519 key. The ssh-rsa algorithm was deprecated[1] in OpenSSH 8.2 (2020-02-14) and will be removed in a future release. [1] https://www.openssh.com/txt/release-8.4	2020-10-22 16:42:30 -07:00
Alex Vandiver	0ea20bd7d8	puppet: Move postgres_version into postgres_common. This property is not related to the base zulip install; move it to zulip::postgres_common, which is already used as a namespace for various postgres variables.	2020-10-22 11:32:25 -07:00
Alex Vandiver	25e995b677	puppet: Move normal_queues to the one place that uses it.	2020-10-22 11:32:25 -07:00
Alex Vandiver	423b5c2be2	puppet: Move queue error and stats directories to just the app host.	2020-10-22 11:31:05 -07:00
Alex Vandiver	4d4c21499a	puppet: Move supervisor dependency into process_fts_updates. PostgreSQL itself has no dependency on supervisor; rather, the FTS updates do.	2020-10-22 11:30:53 -07:00
Alex Vandiver	ca971ebc59	puppet: Remove empty zulip_ops class.	2020-10-22 11:30:53 -07:00
Alex Vandiver	16af05758d	puppet: Move zulip_org into zulip_ops. This class is not of general interest.	2020-10-22 11:30:53 -07:00
Alex Vandiver	ad566c491d	puppet: Drop now-unused zulip_ops::git class.	2020-10-22 11:30:53 -07:00
Alex Vandiver	50e9e2ed20	puppet: Make zulip::base include zulip::apt_repository. There was likely more dependency complexity prior to `97766102df`, but there is now no reason to require that consumers explicitly include zulip::apt_repository.	2020-10-22 11:30:53 -07:00
Alex Vandiver	2dc6d26ec6	puppet: Fix included monitoring class name.	2020-10-19 22:30:20 -07:00
Alex Vandiver	7a1132d605	puppet: Switch golang and smokescreen to use /srv. /srv and /opt have very similar usages; but we should be internally consistent. Move these two (the only usages of /opt) to match the rest in /srv.	2020-10-16 13:00:06 -07:00
Alex Vandiver	78b92a51cc	puppet: Allow access to smokescreen port via iptables.	2020-10-15 15:18:35 -07:00
Alex Vandiver	0d5356969e	puppet: Reformat ipv4 iptables rules comments.	2020-10-15 15:18:35 -07:00
Alex Vandiver	fffea9612b	puppet: Add an outgoing HTTP/HTTPS proxy server. Use https://github.com/stripe/smokescreen to provide a server for an outgoing proxy, run under supervisor. This will allow centralized blocking of internal metadata IPs, localhost, and so forth, as well as providing default request timeouts (10s by default).	2020-10-15 15:18:35 -07:00
Anders Kaseorg	dfaea9df65	shfmt: Reformat shell scripts with shfmt. https://github.com/mvdan/sh Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-15 15:16:00 -07:00
Alex Vandiver	f61ac4a28d	puppet: Move frontend monitoring into its own file. This allows it to be pulled in for deploys like czo, which don't use the full `zulip_ops::app_frontend`, but we wish to monitor.	2020-10-13 17:37:32 -07:00
Tim Abbott	7c2c82b190	nginx: Update nginx configuration for fhir/hl7 organization. We should eventually add templating for the set of hosts here, but it's worth merging this change to remove the deleted hostname and replace it with the current one.	2020-10-13 16:50:26 -07:00
Anders Kaseorg	723d285e46	nginx: Redirect {www.,}zulipchat.com, www.zulip.com to zulip.com. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-13 16:49:23 -07:00
Alex Vandiver	c8df9a150e	puppet: Drop all log2zulip configuration. Disabled on webservers in `047817b6b0`, it has since lingered in configuration, as well as running (to no effect) every minute on the loadbalancer. Remove the vestiges of its configuration.	2020-10-13 11:00:50 -07:00
Alex Vandiver	b431b1b021	puppet: Remove misleading motd. This banner shows on lb1, advertising itself as lb0. There is no compelling reason for a custom motd, especially one which needs to be reconfigured for each host.	2020-10-13 11:00:36 -07:00
Alex Vandiver	45c9c3cc30	queue: Monitor user_activity queue, now that it has a consumer. Since this was using repeated individual get() calls previously, it could not be monitored for having a consumer. Add it in, by marking it of queue type "consumer" (the default), and adding Nagios lines for it. Also adjust missedmessage_emails to be monitored; it stopped using LoopQueueProcessingWorker in `5cec566cb9`, but was never added back into the set of monitored consumers.	2020-10-11 14:19:42 -07:00
Alex Vandiver	4fd7df4e8c	puppet: Remove absent of check-apns-tokens. This was marked as ensure absent in `d02101a401`, in v1.7.0 in 2017.	2020-09-29 18:17:08 -07:00
Alex Vandiver	872a349508	puppet: Remove absent of log2zulip. This was marked as ensure absent in `047817b6b0`, in v2.0.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	0137772fdb	puppet: Remove absent of calculate-first-visible-message-id. This was marked as ensure absent in `dc7d44a245`, in v1.9.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	966c8dc23d	puppet: Remove absent of email-mirror cron job. This was marked as ensure absent in `24f8492236`, in v1.3.0 in 2014.	2020-09-29 18:17:08 -07:00
Alex Vandiver	430d3b8554	puppet: Remove absent of libapache2-mod-wsgi. This was marked as ensure absent in `89b97e7480`, in v1.7.0 in 2017, though it did not take effect until `6e55aa2ce6`, in v1.9.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	12085552d5	puppet: Tidy indentation.	2020-09-29 17:44:44 -07:00
Alex Vandiver	57d88eedd8	puppet: Only install rabbitmq cron jobs via zulip_ops. The rabbitmq cron jobs exist in order to call rabbitmqctl as root and write the output to files that nagios can consume, since nagios is not allowed to run rabbitmqctl. In systems which do not have nagios configured, these every-minute cron jobs add non-insignificant load, to no effect. Move their installation into `zulip_ops`. In doing so, also combine the cron.d files into a single file; this allows us to `ensure => absent` the old filenames, removing them from existing systems. Leave the resulting combined cron.d file in `zulip`, since it is still of general utility and note.	2020-09-29 17:44:44 -07:00
Alex Vandiver	79931051bd	puppet: Permit outgoing mail from postfix. The configuration change made in `1c17583ad5` only allowed delivery to those specific Zulip addresses. However, they also prevent the mailserver from being used as an outgoing email relay from Zulip, since all mail that passed through the mailserver (from any originator) was required to have a `RCPT TO` that matched those regexes. Allow mail originating from `mynetworks` to have arbitrary addresses in `RCPT TO`.	2020-09-25 15:09:27 -07:00
Alex Vandiver	36ea307fbf	puppet: Depend other changes on sharding.py validation. Use the validation of the tornado sharding config that `stage_updated_sharding` does, by depending on it. This ensures that we don't write out a supervisor or nginx config based on a bad (e.g. non-sequential) list of tornado ports.	2020-09-25 10:52:40 -07:00
Alex Vandiver	c0e240277b	tornado: Remove fingerprinting, write out .tmp files always. Fingerprinting the config is somewhat brittle -- it requires custom bootstrapping for old (fingerprint-less) configs, and may have false-positives. Since generating the config is lightweight, do so into the .tmp files, and compare the output to the originals to determine if there are changes to apply. In order to both surface errors, as well as notify the user in case a restart is necessary, we must run it twice. The `onlyif` functionality cannot show configuration errors to the user, only determine if the command runs or not. We thus run the command once, judging errors as "interesting" enough to run the actual command, whose failure will be verbose in Puppet and halt any steps that depend on it. Removing the `onlyif` would result in `stage_updated_sharding` showing up in the output of every Puppet run, which obscures the important messages it displays when an update to sharding is necessary. Removing the `command` (e.g. making it an `echo`) would result in removing the ability to report configuration errors. We thus have no choice but to run it twice; this is thankfully low-overhead.	2020-09-25 10:52:40 -07:00
Alex Vandiver	2a12fedcf1	tornado: Remove explicit tornado_processes setting; compute it. We can compute the intended number of processes from the sharding configuration. In doing so, also validate that all of the ports are contiguous. This removes a discrepancy between `scripts/lib/sharding.py` and other parts of the codebase about if merely having a `[tornado_sharding]` section is sufficient to enable sharding. Having behaviour which changes merely based on if an empty section exists is surprising. This does require that a (presumably empty) `9800` configuration line exist, but making that default explicit is useful. After this commit, configuring sharding can be done by adding to `zulip.conf`: ``` [tornado_sharding] 9800 = # default 9801 = other_realm ``` Followed by running `./scripts/refresh-sharding-and-restart`.	2020-09-18 15:13:40 -07:00
Alex Vandiver	f638518722	tornado: Move default production port to 9800. In development and test, we keep the Tornado port at 9993 and 9983, respectively; this allows tests to run while a dev instance is running. In production, moving to port 9800 consistently removes an odd edge case, when just one worker is on an entirely different port than if two workers are used.	2020-09-18 15:13:40 -07:00
Alex Vandiver	ff94254598	tornado: Log to files by port number. Without an explicit port number, the `stdout_logfile` values for each port are identical. Supervisor apparently decides that it will de-conflict this by appending an arbitrary number to the end: ``` /var/log/zulip/tornado.log /var/log/zulip/tornado.log.1 /var/log/zulip/tornado.log.10 /var/log/zulip/tornado.log.2 /var/log/zulip/tornado.log.3 /var/log/zulip/tornado.log.7 /var/log/zulip/tornado.log.8 /var/log/zulip/tornado.log.9 ``` This is quite confusing, since most other files in `/var/log/zulip/` use `.1` to mean logrotate was used. Also note that these are not all sequential -- 4, 5, and 6 are mysteriously missing, though they were used in previous restarts. This can make it extremely hard to debug logs from a particular Tornado shard. Give the logfiles a consistent name, and set them up to logrotate.	2020-09-14 22:17:51 -07:00
Alex Vandiver	efdaa58c24	supervisor: Use more specific process_name than "port-9800". Making this include "zulip-tornado" makes it clearer in supervisor logs. Without this, one only sees: ``` 2020-09-14 03:43:13,788 INFO waiting for port-9807 to stop 2020-09-14 03:43:14,466 INFO stopped: port-9807 (exit status 1) 2020-09-14 03:43:14,469 INFO spawned: 'port-9807' with pid 24289 2020-09-14 03:43:15,470 INFO success: port-9807 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs) ```	2020-09-14 22:17:51 -07:00
Alex Vandiver	e9d0bdea65	puppet: Coerce uwsgi_listen_backlog_limit into an int before doing math.	2020-09-14 21:22:13 -07:00
Alex Vandiver	8adf530400	puppet: Generate sharding in puppet, then refresh-sharding-and-restart. This supports running puppet to pick up new sharding changes, which will warn of the need to finalize them via `refresh-sharding-and-restart`, or simply running that directly.	2020-09-14 16:27:15 -07:00
Alex Vandiver	0de356c2df	puppet: Move generation of tornado nginx upstreams into tornado_sharding. This puts the creation of the upstreams referenced by `nginx_sharding.conf` adjacent to their use.	2020-09-14 16:27:15 -07:00
Alex Vandiver	bf029d99f1	sharding: Also mark sharding.json 644 for consistency. There is no reason to limit this to 640; mark it 644 for consistency with the other file.	2020-09-14 16:27:15 -07:00
Alex Vandiver	1c17583ad5	puppet: Restrict postfix incoming addresses to postmaster and zulip. This removes the possibility of local user enumeration via RCPT TO.	2020-09-11 18:49:22 -07:00
Alex Vandiver	482c964dd3	puppet: Logrotate for webhook exceptions.	2020-09-10 17:47:21 -07:00
Alex Vandiver	e38051736d	puppet: Wrap and sort logrotate config.	2020-09-10 17:47:21 -07:00
Anders Kaseorg	75c59a820d	python: Convert subprocess.Popen.communicate to run or check_output. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 17:42:35 -07:00
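The kind of conversion named in `75c59a820d` looks roughly like this. The command here is a stand-in chosen so the sketch runs anywhere; it is not taken from the Zulip codebase.

```python
import subprocess
import sys

# Stand-in command: run the current interpreter and print one line.
cmd = [sys.executable, "-c", "print('hello')"]

# Before: manual Popen plus communicate().
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
old_output, _ = proc.communicate()

# After: check_output() returns stdout and raises CalledProcessError
# on a non-zero exit status.
new_output = subprocess.check_output(cmd, universal_newlines=True)

# Or run(), when the completed-process object is also wanted.
result = subprocess.run(
    cmd, stdout=subprocess.PIPE, universal_newlines=True, check=True
)
```

All three forms capture the same output; the latter two also fail loudly when the command exits with an error.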
Anders Kaseorg	fbfd4b399d	python: Elide action="store" for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 16:17:14 -07:00
Anders Kaseorg	1f2ac1962f	python: Elide default=None for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 16:17:14 -07:00
Anders Kaseorg	d751e0cece	puppet: Don’t install netcat. It’s been unused since commit `0af22dad18` (#13239). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 10:33:47 -07:00
Anders Kaseorg	ab120a03bc	python: Replace unnecessary intermediate lists with generators. Mostly suggested by the flake8-comprehension plugin. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:15:41 -07:00
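A small illustration of the pattern in `ab120a03bc`, with made-up values rather than code from the repository:

```python
values = [3, 1, 2]

# Before: the list comprehension builds a throwaway list for sum().
total_before = sum([v * v for v in values])

# After: a generator expression feeds sum() one item at a time.
total_after = sum(v * v for v in values)
```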
Anders Kaseorg	a5dbab8fb0	python: Remove redundant dest for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:04:10 -07:00
Anders Kaseorg	dbdf67301b	memcached: Switch from pylibmc to python-binary-memcached. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-08-06 12:51:14 -07:00
Casper Kvan Clausen	ed7a6d5e4d	puppet: Support nginx_listen_port with http_only	2020-08-03 12:58:12 -07:00
Alex Vandiver	cd530d627b	uwsgi: Stop generating IOError and SIGPIPE on client close. Clients that close their socket to nginx suddenly also cause nginx to close its connection to uwsgi. When uwsgi finishes computing the response, it thus tries to write to a closed socket, and generates either IOError or SIGPIPE failures. Since these are caused by the _client_ closing the connection suddenly, they are not actionable by the server. At particularly high volumes, this could represent some sort of server-side failure; however, this is better detected by examining status codes at the loadbalancer. nginx uses the error code 499 for this occurrence: https://httpstatuses.com/499 Stop uwsgi from generating this family of exception entirely, using configuration for uwsgi[1]; it documents these errors as "(annoying)," hinting at their general utility. [1] https://uwsgi-docs.readthedocs.io/en/latest/Options.html#ignore-sigpipe	2020-07-31 10:40:09 -07:00
Alex Vandiver	ceb909dbc5	puppet: Increase backlogged socket count based on uwsgi backlog. Increasing the uwsgi listen backlog is intended to allow it to handle higher connection rates during server restart, when many clients may be trying to connect. The kernel, in turn, needs to have a proportionally increased somaxconn so as to not refuse the connection. Set somaxconn to 2x the uwsgi backlog, but no lower than the default (128).	2020-07-28 21:16:26 -07:00
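The sizing rule in `ceb909dbc5` is simple arithmetic. The actual change lives in a Puppet manifest; the Python below only illustrates the rule, and the function name `somaxconn_for` is hypothetical.

```python
DEFAULT_SOMAXCONN = 128  # kernel default cited in the commit message


def somaxconn_for(uwsgi_backlog: int) -> int:
    """Twice the uwsgi listen backlog, but never below the default."""
    return max(DEFAULT_SOMAXCONN, 2 * uwsgi_backlog)
```

A backlog of 1024 would give a somaxconn of 2048, while a small backlog of 32 would leave it at the default of 128.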
Alex Vandiver	38d01cd4db	puppet: Generalize install-wal-g to be arbitrary tarballs.	2020-07-24 17:24:57 -07:00
Tim Abbott	5a1243db3c	puppet: Use correct scope for zulip_ops::munin_plugin.	2020-07-15 21:49:45 -07:00
Alex Vandiver	48c3c33d10	puppet: Fully-qualify the munin-plugin name	2020-07-14 17:58:51 -07:00
Alex Vandiver	c68333040b	puppet: Revert PostgreSQL setting of recovery_target_timeline. Prior to PostgreSQL 12, the `recovery_target_timeline` setting is only valid in a `recovery.conf` file, as that file has its own configuration parser. As such, including it in `postgresql.conf` results in an error, and PostgreSQL will fail to start. Remove the setting, reverting `bff3b540b1`. This fixes PostgreSQL 9.5, 9.6, 10, and 11; while the setting is not an error in a PostgreSQL 12 configuration file, it is unnecessary since `latest` is the default.	2020-07-14 16:28:20 -07:00
Alex Vandiver	31d80a77d4	puppet: Update nagios check_postgres_replication_lag to be on DB hosts. `7d4a370a57` attempted to move the replication check onto the PostgreSQL hosts. While it updated the _check_ to assume it was running and talking to a local PostgreSQL instance, the configuration and installation for the check were not updated. As such, the check ran on the nagios host for each DB host, and produced no output. Start distributing the check to all appdb hosts, and configure nagios to use the SSH tunnel to get there.	2020-07-14 16:27:18 -07:00
Alex Vandiver	2174db27db	puppet: Put the dependencies on pg_backup_and_purge itself, and ensure them.	2020-07-14 00:40:25 -07:00
Alex Vandiver	6c27f07c1d	puppet: Move PostgreSQL backups to their own class. wal-g was used in `puppet/zulip` by env-wal-g, but only installed in `puppet/zulip_ops`. Merge all of the dependencies of doing backups using wal-g (wal-g installation, the pg_backup_and_purge job, the nagios plugin that verifies it happens) into a common base class in `puppet/zulip`, since it is generally useful.	2020-07-14 00:40:25 -07:00
Anders Kaseorg	15483c09cb	puppet: Add missing trailing commas. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-13 15:36:06 -07:00
Alex Vandiver	3691a94efe	puppet: Configure munin and nagios under apache with puppet. This swaps in the actually-in-use munin configuration file; otherwise, it is an implementation of the configuration as it exists on the machine.	2020-07-13 13:23:11 -07:00
Alex Vandiver	4e42164b4a	munin: Add plugins to prod hosts.	2020-07-13 13:23:11 -07:00
Alex Vandiver	2a14212b27	munin: Add a helper resource definition for munin plugins.	2020-07-13 12:49:28 -07:00
Alex Vandiver	7c7b5fcd6f	munin: Deal with spaces in the channel names.	2020-07-13 12:49:28 -07:00
Alex Vandiver	eda2c4b8e2	puppet: Split munin-node from munin-server. No plugins are installed inside the /usr/local/munin/lib this creates in munin-node, nor are they symlinked into /etc/munin/plugins, so no non-default plugins are added by this.	2020-07-13 12:49:28 -07:00
Alex Vandiver	ddc7bb5a45	munin: Fix the path to check_send_receive_time.	2020-07-13 12:49:28 -07:00
Alex Vandiver	8be544e7eb	munin: Rename monitoring plugin to use zulip name, not humbug.	2020-07-13 12:49:28 -07:00
Alex Vandiver	1b3560af94	nagios: Stop assuming /api is where zulip client is. The api/ directory was removed in f9ba3cb60c; as that commit notes, we use the python-zulip-api module for that, added in `938597c5da`.	2020-07-13 12:49:28 -07:00
Mateusz Mandera	57d3ef42b8	puppet: Don't run thumbor services in production. Fixes #15649. Currently, no production services use thumbor; so, it makes sense to not run them in production systems.	2020-07-10 14:22:17 -07:00
Alex Vandiver	f0f29584aa	puppet: Add an arity count ("at least two") to zulipconf function.	2020-07-10 00:14:09 -07:00
Alex Vandiver	8cff27f67d	puppet: Pull hosts from zulip.conf, not hardcoded list. The one complexity is that hosts_fullstack are treated differently, as they are not currently found in the manual `hosts` list, and as such do not get munin monitoring.	2020-07-10 00:14:09 -07:00
Alex Vandiver	24383a5082	puppet: Rename hosts_domain so hosts_prefix can be grepped for.	2020-07-10 00:14:09 -07:00
Alex Vandiver	a4e7c7a27e	nagios: Remove check_memcached. check_memcached does not support memcached authentication even in its latest release (it’s in a TODO item comment, and that’s it), and was never particularly useful.	2020-07-10 00:12:48 -07:00
Anders Kaseorg	ebf7f4d0f6	zthumbor: Rename thumbor.conf to thumbor_settings.py. So we can apply all our lint checks to it. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Anders Kaseorg	9900298315	zthumbor: Remove Python 2 residue. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Alex Vandiver	17002f2a0e	puppet: Allow passing an alternate config path to zulip-puppet-apply. When temporary configuration changes are desired, this lets one set up an alternate `zulip.conf` to apply while leaving the true one in place.	2020-07-06 18:30:16 -07:00
Alex Vandiver	64b44a12f5	puppet: Add an exec rule to reload the whole supervisor config. When supervisor is first installed, it is started automatically, and creates the socket, owned by root. Subsequent reconfiguration in puppet only calls `reread + update`, which is insufficient to apply the `chown = zulip:zulip` line in `supervisord.conf`, leaving the socket owned by `root` and the last part of the installation unable to restart `supervisor` services as the `zulip` user. The `chown` line in `scripts/lib/install` exists to paper over this. Add a separate exec target for changes to `supervisord.conf` itself, which restarts the full service. This leaves the default `restart` action on the service for the lightweight `reread + update` action, which is more common. We use `systemctl` only on redhat-esque builds, because CI runs Ubuntu, but init is not systemd in that context. `systemctl reload` is sufficient to re-apply the socket ownership, but a full `restart` and not `reload` is necessary under `/etc/init.d/supervisor`.	2020-07-01 10:40:54 -07:00
Alex Vandiver	dd91f8edba	puppet: Move supervisor start command into zulip::common. Move this command alongside the rest of the distro-dependent supervisor paths.	2020-07-01 10:40:53 -07:00
Alex Vandiver	a5d63cfedf	wal-g: Update pg_backup_and_purge for wal-g format. wal-g has a slightly different format than wal-e in its `backup-list` output; it only contains three columns: - `name` - `last_modified` - `wal_segment_backup_start` ...rather than wal-e's plethora, most of which were blank: - `name` - `last_modified` - `expanded_size_bytes` - `wal_segment_backup_start` - `wal_segment_offset_backup_start` - `wal_segment_backup_stop` - `wal_segment_offset_backup_stop` Remove one argument from the split.	2020-06-29 17:17:26 -07:00
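Parsing the three-column output described in `a5d63cfedf` might look like the sketch below. It is not the code in `pg_backup_and_purge`: the function name `parse_backup_list` is hypothetical, and the sample backup name and timestamp format are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import List, Tuple

# Sample output: the header uses the three columns named in the commit,
# but the backup name and timestamp format are assumed, not captured.
SAMPLE = """\
name last_modified wal_segment_backup_start
base_000000010000000000000002 2020-06-29T17:17:26Z 000000010000000000000002
"""


def parse_backup_list(output: str) -> List[Tuple[str, datetime]]:
    """Hypothetical parser: return (name, last_modified) for each backup."""
    backups = []
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        # Three whitespace-separated columns, so unpack exactly three.
        name, last_modified, _wal_start = line.split()
        when = datetime.strptime(last_modified, "%Y-%m-%dT%H:%M:%SZ")
        backups.append((name, when.replace(tzinfo=timezone.utc)))
    return backups
```

Unpacking exactly three values means a line in the older, wider wal-e format would raise a `ValueError` rather than be silently misread.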
Alex Vandiver	a21a086f5c	puppet: nagios-plugins-basic is replaced by monitoring-plugins-basic. In Bionic, nagios-plugins-basic is a transitional package which depends on monitoring-plugins-basic. In Focal, it is a virtual package, which means that every time puppet runs, it tries to re-install the nagios-plugins-basic package. Switch all instances to referring to `$zulip::common::nagios_plugins`, and repoint that to monitoring-plugins-basic.	2020-06-29 14:58:01 -07:00
Alex Vandiver	6fdcb4aa17	puppet: Move supervisor conf file path into zulip::common. Move this config file alongside the rest of the distro-dependent paths.	2020-06-29 13:41:05 -07:00
Alex Vandiver	93401448b9	puppet: Explain value of reload && update trick for supervisor. While the stock reload works just fine, it causes too much disruption.	2020-06-29 13:39:09 -07:00
Alex Vandiver	d2de5aced8	puppet: Remove unnecessary supervisor service name variable.	2020-06-29 13:39:09 -07:00
Alex Vandiver	73805f8279	puppet: Stop removing file that contains only comments. In modern PostgreSQL, this file, provided by `postgresql-common`, has no non-comment, non-blank lines. There's hence no reason to remove it.	2020-06-29 13:37:42 -07:00
Alex Vandiver	6e3a424921	puppet: Install the latest postgresql-client on frontend hosts. Frontend hosts in multiple-host configurations (including docker hosts) need a `psql` binary installed. `ca9d27175b` switched to not setting `postgresql.version` in `zulip.conf`, which in turn means that `$zulip::base::postgres_version` is unset. This, in turn, led to the frontend hosts installing `postgresql-client-`, whose trailing dash causes apt to _uninstall_ that package. Unconditionally install `postgresql-client` with no explicit version attached. This is a metapackage which depends on the latest client package, which currently means it will install `postgresql-client-12`. On single-host installs which have configured `postgresql.version` in `zulip.conf` to be a lower version, this will result in `postgresql-client-12` existing alongside another version (e.g. `postgresql-client-10`); `psql` will give the most recent. This is acceptable because the semantic meaning of the postgresql version in `zulip.conf` is about the database engine itself, not the command-line client.	2020-06-29 13:37:16 -07:00
Alex Vandiver	2c36bb19b2	puppet: Pull out `unzip` package which is identical in both cases.	2020-06-29 13:37:16 -07:00
Alex Vandiver	876ee4a8ed	installer: Remove code specific to stretch or xenial. Support for Xenial and Stretch was removed (`5154ddafca`, `0f4b1076ad`, `8944e0ad53`, `79acd5ae40`, `1219a2e854`), but not all codepaths were updated to remove their conditionals on it. Remove all code predicated on Xenial or Stretch. debathena support was migrated to Bionic, since that appears to be the current state of existing debathena servers.	2020-06-24 12:57:38 -07:00
Anders Kaseorg	a9e59b6bd3	memcached: Change the default MEMCACHED_USERNAME to zulip@localhost. This prevents memcached from automatically appending the hostname to the username, which was a source of problems on servers where the hostname was changed. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-19 21:22:30 -07:00
Alex Vandiver	7250d41bf7	puppet: Fix the path to install-wal-g	2020-06-17 15:23:18 -07:00
Alex Vandiver	03bffd3938	upgrade-zulip: Pin the postgres version to the OS default. We would prefer to use the postgres packages from Postgres themselves, if available. However, this requires ensuring that, for existing installs, we preserve the same version of postgres as their base distribution installed. Move the version-determination logic from being computed at puppet interpolation time, to being computed at install time and pinned into zulip.conf.	2020-06-16 17:05:46 -07:00
Tim Abbott	26396c5e25	puppet: Fix exceptions with multiple certbot declarations. Since `9e8f1aacb3`, zulip_ops machines might have two Package declarations for `certbot`, which doesn't work in puppet. The fix is, as usual, to use our `zulip::safepackage` wrapper instead.	2020-06-15 18:21:33 -07:00
Alex Vandiver	bff3b540b1	puppet: Postgres replication should always switch to latest timeline. Omission of this setting makes resuming after a primary switchover difficult-to-impossible. It is the default in PostgreSQL 12.	2020-06-15 16:18:07 -07:00
Alex Vandiver	f8fc3a16eb	puppet: Use "primary" / "replica" consistently in comments. The style guide for Zulip is to always use "primary" and "replica" when describing database replication. Adjust a few comments under `puppet/` that do not adhere to this. Unfortunately, some references still remain to the insensitive and inaccurate "master" / "slave" terminology. However, these are only in files which we are attempting to preserve as close to the upstream versions they are derived from (e.g. postgresql.conf, postfix/master.cf).	2020-06-15 16:18:07 -07:00
Alex Vandiver	5f433d6eeb	puppet: Remove vestigial check_postgres.pl. `65774e1c4f` switched from using the bundled check_postgres.pl to using the version from packages; the file itself remained, however. Remove it, and clean up references to it. Fixes #15389.	2020-06-15 16:18:07 -07:00
Alex Vandiver	7d4a370a57	puppet: Move monitoring of pg replication to the pg hosts. Instead of SSH'ing around to them, run directly on the database hosts. This means that the replicas do not know how many bytes behind they are in _receiving_ the WAL logs; thus, the monitoring also extends to the primary database, which knows that information for each replica. This also allows for detecting when there are too few active replicas.	2020-06-15 16:18:07 -07:00
Anders Kaseorg	5dc9b55c43	python: Manually convert more percent-formatting to f-strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	74c17bf94a	python: Convert more percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Now including %d, %i, %u, and multi-line strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	1ed2d9b4a0	logging: Use logging.exception and exc_info for unexpected exceptions. logging.exception() and logging.debug(exc_info=True), etc. automatically include a traceback. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
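The behaviour that `1ed2d9b4a0` relies on can be seen in a short sketch. The logger name and messages are made up, and the output is captured in a string only so the example is self-contained.

```python
import io
import logging

# Capture log output in memory so the example has no side effects.
stream = io.StringIO()
logger = logging.getLogger("example")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.DEBUG)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    # Logs at ERROR level and appends the current traceback.
    logger.exception("Unexpected failure")
    # Any level can attach the traceback with exc_info=True.
    logger.debug("Same traceback at debug level", exc_info=True)

log_text = stream.getvalue()
```

Both calls attach the traceback automatically, so there is no need to format the exception into the message by hand.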
Tim Abbott	80589099d8	puppet: Fix typo in logic for whether to install certbot. Fixes #15372.	2020-06-14 16:04:39 -07:00
rht	89af2f381d	puppet: Link postgres dict symlinks to hunspell files on CentOS. This is a temporary measure until we can find the directory of postgresql dicts on CentOS.	2020-06-13 17:53:38 -07:00
rht	36a5ca5015	puppet: Add cyrus-sasl to memcached_packages on RedHat. This is to mirror the sasl2-bin package on Debian.	2020-06-13 17:49:51 -07:00
rht	e776d2d159	puppet: Abstract out owner:group of memcached-sasldb2.	2020-06-13 17:49:51 -07:00
Anders Kaseorg	91a86c24f5	python: Replace None defaults with empty collections where appropriate. Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦ AbstractSet) to guard against accidental mutation of the default value. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-13 15:31:27 -07:00
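The pattern in `91a86c24f5` can be illustrated with a pair of made-up functions; the names `count_before` and `count_after` are hypothetical.

```python
from typing import List, Optional, Sequence


def count_before(names: Optional[List[str]] = None) -> int:
    # Old style: None as a sentinel, replaced by an empty list at runtime.
    if names is None:
        names = []
    return len(names)


def count_after(names: Sequence[str] = []) -> int:
    # New style: an empty default, typed as read-only. Sequence declares
    # no append() or extend(), so a type checker flags accidental
    # mutation of the shared default value.
    return len(names)
```

The read-only annotation is what makes the shared default safe: it is enforced by the type checker rather than at runtime.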
Alex Vandiver	97b9308781	puppet: Merge multiple postgres roles in `zulip_ops`. All differences between the primary and replica roles having been merged, fold the `postgres_common`, `postgres_master`, and `postgres_slave` roles into just `postgres_appdb`.	2020-06-12 14:57:46 -07:00
Alex Vandiver	55bd31721d	puppet: Remove custom `vm.dirty_ratio` and `vm.dirty_background_ratio`. These values differed between the primary and secondary database hosts, for unclear reasons. The differences date back to their introduction in `387f63deaa`. As the comment in the replica configuration notes, settings of `vm.dirty_ratio = 10` and `vm.dirty_background_ratio = 5` matched the kernel defaults for "newer" kernels; however, kernel 2.6.30 bumped those to 20 and 10, respectively[1], as a fix for underlying logic now being more correct. Remove these overrides; they should at very least be consistent across roles, and the previous values look to be an attempt to tune for a very much older version of the Linux kernel, which was using a different, buggier, algorithm under the hood. [1] `1b5e62b42b`	2020-06-12 14:57:46 -07:00
Alex Vandiver	f39816e768	puppet: Stop distributing recovery.conf file. This file controls streaming replication, and recovery using wal-g on the secondary. The `primary_conninfo` data needs to change on short notice when database failover happens, in a way that is not suitable for being controlled by puppet. PostgreSQL 12, in fact, removes the use of the `recovery.conf` file[1]; the `primary_conninfo` and `restore_command` information goes into the main `postgresql.conf` file, and the standby status is controlled by the presence or absence of an empty `standby.signal` file. Remove the puppet control of the `recovery.conf` file. [1] https://pgstef.github.io/2018/11/26/postgresql12_preview_recovery_conf_disappears.html	2020-06-12 14:57:46 -07:00
Alex Vandiver	316498a169	puppet: Remove unnecessary nagios authentication setup. Since the nagios authentication is stored _in the database_, it is unnecessary to run if the database is simply a replica of the production database. The only case in which this statement would have an effect is if the postgres node contains a _different_ (or empty) database, which `setup_disks` now effectively prevents. Remove the unnecessary step.	2020-06-11 21:01:49 -07:00
Alex Vandiver	0774f54c1b	puppet: Move `setup_disks` to postgres_common. The tooling should now be run no matter if the node is a primary or replica.	2020-06-11 21:01:49 -07:00
Alex Vandiver	6f6a0e890a	puppet: Run setup_disks based on symlink; remove mdadm dependency. `481613a344` updated the `setup_disks` script to no longer reference `mdadm`, since we no longer set up RAID on servers. Update the puppet that would call it to remove the `mdadm` dependency, and run only if the state is not what it produces -- namely, a symlink for `/var/lib/postgresql`, which must point to an existent `/srv/postgresql` directory.	2020-06-11 21:01:49 -07:00
Alex Vandiver	1dc2de5026	puppet: Update setup-disks to be idempotent. The end state it produces is _either_: - `/srv/postgresql` already existed, which was symlinked into `/var/lib/postgresql`; postgres is left untouched. This is the situation if `setup_disks` is run on the database primary, or a replica which was correctly configured. - An empty `/srv/postgresql` now exists, symlinked into `/var/lib/postgresql`, and postgres is stopped. This is the situation if `puppet` was just run on a new host, or a previously-configured host was rebooted (clearing the temporary disk in `/dev/nvme0`) In the latter case, where `/srv/postgresql` is now empty, any previous contents of `/var/lib/postgresql` are placed under `/root`, timestamped for uniqueness. In either case, the tool should now be idempotent.	2020-06-11 21:01:49 -07:00
Alex Vandiver	8373f5f4b9	puppet: Make parent directories of postgresql.conf. This fixes errors when provisioning a new system (or version of postgres) when the configuration file cannot be written because its parent directories do not exist. Files inherently depend on their containing directories, so no explicit dependencies are necessary.	2020-06-11 20:56:55 -07:00
Alex Vandiver	9fd7a026ad	puppet: Pull postgres data directory into postgres_appdb_base. The `pg_datadir` variable was only used, and accurate, for CentOS. Pull it out into `postgres_app_base`, broaden it to being accurate on Debian-based systems as well, and use it consistently in the templates.	2020-06-11 20:56:55 -07:00
Alex Vandiver	16c4cea951	puppet: Pull postgres config directory into postgres_appdb_base. As the previous commit, this is currently only used in tuning, but is a property of the whole postgres configuration; move it there, as just the directory, not the file. Use this directory consistently in the erb templates. Since we produce a `pg_hba.conf`, it makes sense that we point to the path that we know that we explicitly wrote to, for instance.	2020-06-11 20:56:55 -07:00
Alex Vandiver	2a7373b602	puppet: Pull postgres restart config into postgres_appdb_base. While it is only currently used in the tuning configuration, it is a property of the base configuration, and fits more clearly into the case block there.	2020-06-11 20:56:55 -07:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \\|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Alex Vandiver	b114eb2f10	puppet: Rename env-wal-e to env-wal-g. It runs wal-g now, not wal-e; make its name respect that.	2020-06-11 15:52:43 -07:00
Alex Vandiver	4fe0444108	puppet: Install wal-g, not wal-e.	2020-06-11 15:52:43 -07:00
Alex Vandiver	39d6185ce7	puppet: Remove python-dateutil requirement from pg_backup_and_purge. `1f565a9f41` removed the `package` lines which install `python-dateutil`, but not the line in `puppet_ops` that reference it; as such, Puppet manifests in puppet_ops fail to compile. Remove the stale reference to `python-dateutil`, which is unnecessary since the code is python3, not python2.	2020-06-11 14:28:55 -07:00
Anders Kaseorg	ca4357fd64	python: Use standard NoReturn (Python ≥ 3.6). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 12:56:52 -07:00
Mateusz Mandera	fbc96d56d5	sharding: Fix permissions on the nginx_sharding.conf file. The zulip user needs to be able to read the file, when running the backup tool. We put root:root as owner on other nginx config files, so it's probably correct to keep the ownership as it is, and set the mode to 0644.	2020-06-11 12:56:06 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
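The rewrite that `pyupgrade --py36-plus` performs in `67e7a3631d` is of this general shape. The variables are made up for illustration.

```python
host, port = "localhost", 9800

# Before: percent formatting.
address_before = "%s:%s" % (host, port)

# After: an equivalent Python 3.6 f-string.
address_after = f"{host}:{port}"
```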
arpit551	03d563ce0f	postgres: Changed max_connections in postgres 12 config template. Value of max_connections is now 1000, as in the templates for other postgres versions.	2020-06-08 21:59:57 -07:00
arpit551	9e8f1aacb3	certbot: Switch to use certbot from apt. certbot-auto doesn’t work on Ubuntu 20.04, and won’t be updated; we migrate to using the certbot package shipped with the OS instead. Also made sure that certbot gets installed when running zulip-puppet-apply, to handle existing systems.	2020-06-08 21:59:29 -07:00
arpit551	7e75a7e336	postgres: Fix syntax error in postgres 12 config. The `<%` used as an example in the postgres 12 config was being confused with erb syntax; added an extra `%`, since `<%%` means a literal `<%`.	2020-06-08 21:57:54 -07:00
arpit551	7d11be5ca5	puppet: Add Zulip specific postgres configuration for 12. Based on the work done in `a03e478`.	2020-06-08 21:57:54 -07:00
arpit551	4e52f1bc53	puppet: Commit an upstream version of postgres 12 config. In preparation for adding production support for Ubuntu Focal.	2020-06-08 21:57:54 -07:00
Tim Abbott	71078adc50	docs: Update URLs to use https://zulip.com . We're migrating to using the cleaner zulip.com domain, which involves changing all of our links from ReadTheDocs and other places to point to the cleaner URL.	2020-06-08 18:10:45 -07:00
Anders Kaseorg	1f565a9f41	timezone: Use standard library datetime.timezone.utc consistently. datetime.timezone is available in Python ≥ 3.2. This also lets us remove a pytz dependency from the PostgreSQL scripts. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-05 09:34:17 -07:00
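The standard-library replacement named in `1f565a9f41` needs no third-party package. A brief sketch, with made-up variable names:

```python
from datetime import datetime, timezone

# Timezone-aware "now" in UTC, without importing pytz.
now = datetime.now(tz=timezone.utc)

# Converting a Unix timestamp to an aware UTC datetime.
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
```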
Alex Vandiver	8b1d49dbc7	puppet: Rename "wiki" realm to "monitoring". This is vestigial. It requires manually altering the `htdigest` file (not stored in this repo) to change the digest realm from `wiki` to `monitoring`, and will re-prompt users for their passwords if the browsers currently store them.	2020-05-30 12:26:21 -07:00
Alex Vandiver	b33aa8da7f	postgresql: Update setup-disks to use `service postgresql`. Using `service postgresql` makes it no longer linked to the specific version/cluster that is on the host.	2020-05-30 12:14:24 -07:00
Alex Vandiver	4e370cda75	postgresql: Update setup-disks to drop /mnt disabling. Hosts do not start out with a `/mnt`; there is no need to disable it.	2020-05-30 12:14:24 -07:00
Alex Vandiver	a7d85b7e69	postgresql: Update setup-disks to not move /tmp. Drop the change to move `/tmp` onto the local disk. Doing this move confuses `resolved` until there is a restart, and has no clear benefits. The change came in during `bf82fadc95`, but does not describe the reasoning; it is particularly puzzling, since postgresql stores its temporary files under `$PGDATA/base/pgsql_tmp`.	2020-05-30 12:14:24 -07:00
Alex Vandiver	481613a344	postgresql: Update setup-disks to not use RAID. Do not RAID the disks together. This was previously done when they were spinning media, for reliability; running them on an SSD obviates this sufficiently. This means that updating the initramfs is also not necessary.	2020-05-30 12:14:24 -07:00
Alex Vandiver	b537563bc1	postgresql: Set the current primary host.	2020-05-30 12:14:24 -07:00
Alex Vandiver	ad2918ea51	puppet: Remove `postgres_other` nagios hostgroup. This no longer has any rules specific to it. We leave the `postgres` munin group (which now only contains `postgres_appdb`) as future-proofing, and so that `postgres_appdb` matches to the puppet manifest of the same name.	2020-05-28 17:24:35 -07:00
Alex Vandiver	2c73fbdcb6	puppet: Remove munin monitoring for no-longer-used "postgres_other". The `wiki` and `trac` products are no longer used.	2020-05-28 17:24:35 -07:00
Tim Abbott	0b93e09e72	puppet: Add nginx configuration for blog.zulip.org move.	2020-05-26 14:47:05 -07:00
Anders Kaseorg	f5b33f9398	python: Further pyupgrade changes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:43:40 -07:00
Anders Kaseorg	333f7d16c9	logging: Pass more format arguments to logging. Commit `bdc365d0fe` (#14852) missed this because of https://github.com/returntocorp/semgrep/issues/831. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:42:23 -07:00
Anders Kaseorg	824d97987b	process_fts_updates: Use cursor.execute correctly. Commit `b501d04f6a` (#14841) missed this because of https://github.com/returntocorp/semgrep/issues/831. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:42:23 -07:00
arpit551	439f0d3004	install: Add production support for Zulip on Ubuntu Focal. Install script now runs on Focal. Python 2 is now installed via the `python2` package in Focal.	2020-05-25 16:58:42 -07:00
Tim Abbott	220620e7cf	sharding: Add basic sharding configuration for Tornado. This allows straight-forward configuration of realm-based Tornado sharding through simply editing /etc/zulip/zulip.conf to configure shards and running scripts/refresh-sharding-and-restart. Co-Author-By: Mateusz Mandera <mateusz.mandera@zulip.com>	2020-05-20 13:47:20 -07:00
Tim Abbott	cdd3b7efbc	tornado: Configure upstreams for TORNADO_PROCESSES.	2020-05-20 13:43:48 -07:00
Tim Abbott	c3d3324295	puppet: Add link to the sources for Zephyr patches.	2020-05-19 20:54:11 -07:00
Tim Abbott	a35e71ebbc	puppet: Update package name for boto-on-python3. The python3-boto3 package is the maintained fork that supports Python 3; it was renamed in Ubuntu Bionic from the original Ubuntu Xenial name.	2020-05-19 20:25:11 -07:00
Tim Abbott	1c28770810	puppet: Fix apt_repo_debathena setup_file path. There was a typo introduced here when scripts_path was added.	2020-05-19 20:21:30 -07:00
Tim Abbott	c43b3d95e2	puppet: Switch env-wal-e to use wal-g rather than wal-e. wal-g is the modern reimplementation of wal-e that supports current postgres. It requires a bit of extra configuration to specify the AWS region.	2020-05-15 16:45:36 -07:00
Anders Kaseorg	fcca4a38b6	puppet: Work around memcached SASL configuration path bug. memcached 1.5.22 in Ubuntu 20.04 has a bug where it looks for its SASL configuration at /etc/sasl2/memcached.conf/memcached.conf instead of /etc/sasl2/memcached.conf. https://bugs.launchpad.net/ubuntu/+source/memcached/+bug/1878721 Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-14 23:25:24 -07:00
Tim Abbott	b3c5f2c13e	puppet: Remove check_postgres_replication_lag hostname hardcoding. Since this runs on the Nagios server, which already has the relevant hostnames defined in zulip.conf, we can just read it from there.	2020-05-11 23:42:36 -07:00
Tim Abbott	225bbf3633	puppet: Update check_postgres_replication_lag for postgres 10. These functions were renamed in postgres 10.	2020-05-11 15:59:23 -07:00
Tim Abbott	d8ea649869	puppet: Cast tornado_processes to Integer. This is the latest mechanism in puppet for turning a string into an integer. We update an adjacent comment while we're at it.	2020-05-11 00:54:48 -07:00
Tim Abbott	6319c181eb	puppet: Use actual name for the bind9-host package. Using the `host` virtual package confused Puppet into reporting it was doing work every time one did a puppet run, resulting in unnecessarily spammy output.	2020-05-11 00:51:53 -07:00
Mateusz Mandera	dd40649e04	queue_processors: Remove the slow_queries queue. While this functionality to post slow queries to a Zulip stream was very useful in the early days of Zulip, when there were only a few hundred accounts, it's long since been useless because of (1) the total request volume on larger Zulip servers run by Zulip developers, and (2) the fact that other server operators don't want real-time notifications of slow backend queries. The right structure for this is just a log file. We get rid of the queue and replace it with a "zulip.slow_queries" logger, which will still log to /var/log/zulip/slow_queries.log for ease of access to this information and propagate to the other logging handlers. Reducing the number of queues is good for lowering Zulip's memory footprint and restart performance, since we run at least one dedicated queue worker process for each one in most configurations.	2020-05-11 00:45:13 -07:00
Tim Abbott	21a04e2dbc	puppet: Use nice to deprioritize various processes. Our priority hierarchy is: (1) Tornado and base services like memcached, redis, etc. (2) Django and message sender queue workers. (3) Everything else. Ideally, we'd have something a bit more fine-grained (e.g. some queue workers are potentially in the sending path, while others aren't), but this should have a big impact on ensuring Tornado gets the resources it needs during load spikes. I think this has a good chance of causing some load spikes that would previously have resulted in user-facing delivery delays to no longer have any significant user-facing impact.	2020-05-10 23:28:25 -07:00
shubhamgupta2956	9cd8644c7c	uploads: Add support for ".jpe" file extension. Currently when the user uploads files with ".jpe" file extension, the markdown is converted to link but the image is not embedded. This commit adds the support for ".jpe" file extension. Fixes #14863	2020-05-10 22:55:52 -07:00
Anders Kaseorg	8cdf2801f7	python: Convert more variable type annotations to Python 3.6 style. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-08 16:42:43 -07:00
Anders Kaseorg	708c6f4f11	puppet: Finally vanquish the cursed integer conversion conditional. We no longer support Puppet 3. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-08 16:42:43 -07:00
Tim Abbott	50d8d61d3c	puppet: Remove unnecessary/broken ;. This breaks the Xenial build, which we're removing soon, but it's unnecessary in any case.	2020-05-07 16:23:37 -07:00
Tim Abbott	03991d098a	puppet: Add optional postgres version override. This makes it convenient to run an alternative postgres version.	2020-05-07 09:33:24 -07:00
Mateusz Mandera	4643e48f60	retention: Add a daily cron job. This will run archive_messages management command at 6am every day, 1 hour after soft_deactivate_users (which runs at 5am).	2020-05-05 10:11:38 -07:00
Tim Abbott	4034f6f99e	nagios: Fix check_postgres_replication_lag. This expects to be run outside a virtualenv and thus without typing_extensions available.	2020-05-03 00:14:54 -07:00
Tim Abbott	4f3976b917	process_fts_updates: Clean up logging output. This saves a couple lines of spammy output in the run-dev.py startup experience, and will be better output in production as well.	2020-05-01 11:51:20 -07:00
Anders Kaseorg	c0ffa71fa9	nginx: Replace unanchored regexes in location directives. We could anchor the regexes, but there’s no need for the power (and responsibility) of regexes here. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 16:58:19 -07:00
Anders Kaseorg	5e01a0ae8b	zulip-ec2-configure-interfaces: Convert function type annotations. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 13:06:54 -07:00
Anders Kaseorg	f8339f019d	python: Convert assignment type annotations to Python 3.6 style. Commit split by tabbott; this has changes to scripts/, tools/, and puppet/. scripts/lib/hash_reqs.py, scripts/lib/setup_venv.py, scripts/lib/zulip_tools.py, and tools/lib/provision.py are excluded so tools/provision still gives the right error message on Ubuntu 16.04 with Python 3.5. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: -shebang_rules: List[Rule] = [ +shebang_rules: List["Rule"] = [ -trailing_whitespace_rule: Rule = { +trailing_whitespace_rule: "Rule" = { -whitespace_rules: List[Rule] = [ +whitespace_rules: List["Rule"] = [ -comma_whitespace_rule: List[Rule] = [ +comma_whitespace_rule: List["Rule"] = [ -prose_style_rules: List[Rule] = [ +prose_style_rules: List["Rule"] = [ -html_rules: List[Rule] = whitespace_rules + prose_style_rules + [ +html_rules: List["Rule"] = whitespace_rules + prose_style_rules + [ - target_port: int = None + target_port: int Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 13:06:54 -07:00
Anders Kaseorg	09ea778db1	nginx: Listen for ACME challenges on port 80 too. This should make Certbot renewals more reliable. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-23 16:22:04 -07:00

1342 Commits