zulip

Commit Graph

Author	SHA1	Message	Date
Alex Vandiver	dd90083ed7	puppet: Provide FQDN of self as URI, so the certificate validates. Failure to do this results in: ``` psql: error: failed to connect to `host=localhost user=zulip database=zulip`: failed to write startup message (x509: certificate is valid for [redacted], not localhost) ```	2021-06-14 00:14:48 -07:00
Alex Vandiver	c90ff80084	puppet: Bump grafana version to 8.0.1. Most notably, this fixes an annoying bug with CloudWatch metrics being repeated in graphs.	2021-06-10 15:49:08 -07:00
Alex Vandiver	d905eb6131	puppet: Add a database teleport server. Host-based md5 auth for 127.0.0.1 must be removed from `pg_hba.conf`, otherwise password authentication is preferred over certificate-based authentication for localhost.	2021-06-08 22:21:21 -07:00
Alex Vandiver	100a899d5d	puppet: Add grafana server.	2021-06-08 22:21:00 -07:00
Alex Vandiver	459f37f041	puppet: Add prometheus server.	2021-06-08 22:21:00 -07:00
Alex Vandiver	19fb58e845	puppet: Add prometheus node exporter.	2021-06-08 22:21:00 -07:00
Alex Vandiver	a2b1009ed5	puppet: Turn on "authentication" which defaults to user with all rights. Nagios refuses to allow any modifications with use_authentication off; re-enabled "authentication" but set a default user, which (by way of the `*` permissions in `359f37389a`) is allowed to take all actions.	2021-06-08 15:19:28 -07:00
Alex Vandiver	61b6fc865c	puppet: Add a label to teleport applications, to allow RBAC. Roles can only grant or deny access based on labels; set one based on the application name.	2021-06-08 15:19:04 -07:00
Alex Vandiver	4aff5b1d22	puppet: Allow access to `/` in nagios. This was a regression in `51b985b40d`.	2021-06-07 22:40:58 -07:00
Alex Vandiver	54768c2210	puppet: Remove now-unused basic auth support files. `51b985b40d` made these unnecessary.	2021-06-07 16:17:45 -07:00
Alex Vandiver	359f37389a	puppet: Remove in-nagios auth restrictions. `51b985b40d` made nagios only accessible from localhost, or as proxied via teleport. Remove the HTTP-level auth requirements.	2021-06-07 16:17:45 -07:00
Alex Vandiver	2352fac6b5	puppet: Fix indentation.	2021-06-02 18:38:38 -07:00
Alex Vandiver	51b985b40d	puppet: Move nagios to behind teleport. This makes the server only accessible via localhost, by way of the Teleport application service.	2021-06-02 18:38:38 -07:00
Alex Vandiver	4f51d32676	puppet: Add a teleport application server. This requires switching to a reverse tunnel for the auth connection, with the side effect that the `zulip_ops::teleport::node` manifest can be applied on servers anywhere in the Internet; they do not need to have any publicly-available open ports.	2021-06-02 18:38:38 -07:00
Alex Vandiver	c59421682f	puppet: Add a teleport node on every host. Teleport nodes[1] are the equivalent to SSH servers. In addition to this config, joining the teleport cluster will require presenting a one-time "join token" from the proxy server[2], which may either be short-lived or static. [1] https://goteleport.com/docs/architecture/nodes/ [2] https://goteleport.com/docs/admin-guide/#adding-nodes-to-the-cluster	2021-06-02 18:38:38 -07:00
Alex Vandiver	1cdf14d195	puppet: Add a teleport server. See https://goteleport.com/docs/architecture/overview/ for the general architecture of a Teleport cluster. This commit adds a Teleport auth[1] and proxy[2] server. The auth server serves as a CA for granting time-bounded access to users and authenticating nodes on the cluster; the proxy provides access and a management UI. [1] https://goteleport.com/docs/architecture/authentication/ [2] https://goteleport.com/docs/architecture/proxy/	2021-06-02 18:38:38 -07:00
Alex Vandiver	3ebd627c50	puppet: Fix "import" -> "include" in chat_zulip_org.	2021-06-02 11:02:34 -07:00
Alex Vandiver	2130fc0645	puppet: Add an explicit class for czo.	2021-06-01 22:18:50 -07:00
Alex Vandiver	c9141785fd	puppet: Use concat fragments to place port allows next to services. This means that services will only open their ports if they are actually run, without having to clutter rules.v4 with a lot of `if` statements. This does not go as far as using `puppetlabs/firewall`[1] because that would represent an additional DSL to learn; raw IPtables sections can easily be inserted into the generated iptables file via `concat::fragment` (either inline, or as a separate file), but config can be centralized next to the appropriate service. [1] https://forge.puppet.com/modules/puppetlabs/firewall	2021-05-27 21:14:48 -07:00
Alex Vandiver	4f79b53825	puppet: Factor out firewall config.	2021-05-27 21:14:48 -07:00
Alex Vandiver	87a109e3e0	puppet: Pull in pinned puppet modules. Using puppet modules from the puppet forge judiciously will allow us to simplify the configuration somewhat; this specifically pulls in the stdlib module, which we were already using parts of.	2021-05-27 21:14:48 -07:00
Alex Vandiver	f3eea72c2a	setup: Merge multiple setup-apt-repo scripts into one. This moves the `.asc` files into subdirectories, and writes out the according `.list` files into them. It moves from templates to written-out `.list` files for clarity and ease of implementation (Debian and Ubuntu need different templates for `zulip`), and as a way of making explicit which releases are supported for each list. For the special-case of the PGroonga signing key, we source an additional file within the directory. This simplifies the process for adding another class of `.list` file.	2021-05-26 14:42:29 -07:00
Alex Vandiver	4f017614c5	nagios: Replace check_fts_update_log with a process_fts_updates flag. This avoids having to duplicate the connection logic from process_fts_updates. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:56:05 -07:00
Alex Vandiver	ab130ceb35	nagios: Support arbitrary database user and dbname in replication check. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:56:05 -07:00
Alex Vandiver	c17f502bb0	process_fts_updates: Support arbitrary database user and dbname. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:56:05 -07:00
Alex Vandiver	02fc0d3e1d	db: Drop None and empty-string checking in arguments. psycopg2 treats None and "" the same as not-provided: ``` assert connect(user="zulip", dbname="zulip") assert connect(user="zulip", dbname="zulip", host="") assert connect(user="zulip", dbname="zulip", host=None) with Raises("no password supplied"): connect(user="zulip", dbname="zulip", host="localhost") assert connect(user="zulip", dbname="zulip", port="") assert connect(user="zulip", dbname="zulip", port=None) assert connect(user="zulip", dbname="zulip", port=5432) with Raises("could not connect to server"): connect(user="zulip", dbname="zulip", port=5000) assert connect(dbname="zulip", host="localhost", password="right-password") with Raises("no password supplied"): connect(dbname="zulip", host="localhost", password="") with Raises("no password supplied"): connect(dbname="zulip", host="localhost", password=None) with Raises("password authentication failed"): connect(dbname="zulip", host="localhost", password="wrong") ``` Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	9c652eb16b	db: Use the pre-computed values from settings. Rather than duplicate logic from `computed_settings`, use the values that were computed therein. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	94d7c29d92	db: Use the same codepath for cases (2) and (3). Using the second branch _only_ for case (3), of a PostgreSQL server on a different host, leaves it untested in CI. It also brings in an unnecessary Django dependency. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	add6971ad9	db: Make USING_PGROONGA logic clearer. We only need to read the `zulip.conf` file to determine if we're using PGROONGA if we are on the PostgreSQL machine, with no access to Django. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	75bf19c9d9	db: Combine the `if "host" in pg_args` stanza with earlier clause. The only way in which "host" could be set is in cases (1) or (2), when it was potentially read from Django's settings. In case (3), we already know we are on the same host as the PostgreSQL server. This unifies the two separated checks, which are actually the same check. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	67fc8e84ea	db: Clarify the 3 different cases that process_fts_updates must support. Co-authored-by: Adam Birds <adam.birds@adbwebdesigns.co.uk>	2021-05-25 13:46:58 -07:00
Alex Vandiver	116e41f1da	puppet: Move files out and back when mounting /srv. Specifically, this affects /srv/zulip-aws-tools.	2021-05-23 13:29:23 -07:00
Alex Vandiver	ea98549e88	puppet: Always install linux-image-virtual, for ksplice support.	2021-05-23 13:29:23 -07:00
Alex Vandiver	0b1dd27841	puppet: AWS mounts its extra disks with inconsistent names. It is now /dev/nvme1n1, not /dev/nvme0n1; but it always has a consistent major/minor node. Source the file that defines these.	2021-05-23 13:29:23 -07:00
Alex Vandiver	82797dd53c	settings: Standardize the name of the deliver_scheduled_messages logs. This makes it match its command name, and other logfile name.	2021-05-18 12:39:28 -07:00
Alex Vandiver	343a1396af	puppet: Rename logfile for deliver_scheduled_messages to be consistent.	2021-05-18 12:39:28 -07:00
Alex Vandiver	ef6d0ec5ca	puppet: Only run deliver_scheduled_messages and _emails on one server. `deliver_scheduled_emails` and `deliver_scheduled_messages` use the `ScheduledEmail` and `ScheduledMessage` tables as a queue, effectively, pulling values off of them. As noted in their comments, this is not safe to run on multiple hosts at once. As such, split out the supervisor files for them.	2021-05-18 12:39:28 -07:00
Alex Vandiver	033a96aa5d	puppet: Fix check_ssl_certificate check to check named host, not self.	2021-05-17 18:38:30 -07:00
Alex Vandiver	a2b7a5ef4b	puppet: Clarify 20m keepalive time from the LB is a max; it can be less.	2021-05-17 14:56:51 -07:00
Alex Vandiver	66a232e303	smokescreen: Bump version of Go and Smokescreen. Move version pins to the latest versions of Go and Smokescreen.	2021-05-12 10:08:42 -10:00
Alex Vandiver	feb7870db7	puppet: Adjust thresholds on autovac_freeze. These thresholds are in relationship to the `autovacuum_freeze_max_age`, not the XID wraparound, which happens at 2^31-1. As such, it is perfectly normal that they hit 100%, and then autovacuum kicks in and brings it back down. The unusual condition is that PostgreSQL pushes past the point where an autovacuum would be triggered -- therein lies the XID wraparound danger. With the `autovacuum_freeze_max_age` set to 2000000000 in `postgresql.conf`, XID wraparound happens at 107.3%. Set the warning and error thresholds to below this, but above 100% so this does not trigger constantly.	2021-05-11 17:11:47 -07:00
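The 107.3% figure in the commit above comes from dividing the XID wraparound point by the configured `autovacuum_freeze_max_age`. A minimal Python sketch of that arithmetic follows; it uses only the two numbers quoted in the message, and the constant and function names are illustrative rather than taken from the repository.

```python
# Values quoted in the commit message above.
XID_WRAPAROUND = 2**31 - 1                  # 2147483647
AUTOVACUUM_FREEZE_MAX_AGE = 2_000_000_000   # as set in postgresql.conf


def wraparound_percent(max_age: int = AUTOVACUUM_FREEZE_MAX_AGE) -> float:
    """Return the XID wraparound point as a percentage of max_age."""
    return 100 * XID_WRAPAROUND / max_age


if __name__ == "__main__":
    # Prints the ratio to two decimals; the commit message quotes it as 107.3%.
    print(f"{wraparound_percent():.2f}%")
```

Any warning or error threshold placed between 100 and this value fires only after autovacuum should already have been triggered, but before wraparound.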
Alex Vandiver	0f1611286d	management: Rename the deliver_email command to deliver_scheduled_email. This makes it parallel with deliver_scheduled_messages, and clarifies that it is not used for simply sending outgoing emails (e.g. the `email_senders` queue). This also renames the supervisor job to match.	2021-05-11 13:07:29 -07:00
Anders Kaseorg	544bbd5398	docs: Fix capitalization mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-10 09:57:26 -07:00
Tim Abbott	ad0be6cea1	puppet: Remove thumbor.conf nginx configuration. This was missing in `405bc8dabf`.	2021-05-07 16:57:29 -07:00
Anders Kaseorg	9d57fa9759	puppet: Use pgrep -x to avoid accidental matches. Matching the full process name (-x without -f) or full command line (-xf) is less prone to mistakes like matching a random substring of some other command line or pgrep matching itself. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-07 08:54:41 -07:00
Anders Kaseorg	405bc8dabf	requirements: Remove Thumbor. Thumbor and tc-aws have been dragging their feet on Python 3 support for years, and even the alphas and unofficial forks we’ve been running don’t seem to be maintained anymore. Depending on these projects is no longer viable for us. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-06 20:07:32 -07:00
Alex Vandiver	eda9ce2364	locale: Use `C.UTF-8` rather than `en_US.UTF-8`. The `en_US.UTF-8` locale may not be configured or generated on all installs; it also requires that the `locales` package be installed. If users generate the `en_US.UTF-8` locale without adding it to the permanent set of system locales, the generated `en_US.UTF-8` stops working when the `locales` package is updated. Switch to using `C.UTF-8` in all cases, which is guaranteed to be installed. Fixes #15819.	2021-05-04 08:51:46 -07:00
Alex Vandiver	ddb9d16132	puppet: Install procps, for pgrep. In puppet, we use pgrep in the collection stage, to see if rabbitmq is running. Sufficiently bare-bones systems will not have `procps` (which provides `pgrep`) installed yet, which makes the install abort when running `puppet` for the first time. Just installing the `procps` package in Puppet is insufficient, because the check in the `unless` block runs when Puppet is determining which resources it needs to instantiate, and in what order; any package installation has yet to happen. As `erlang-base` (which provides `epmd`) happens to have a dependency of `procps`, any system without `pgrep` will also not have `epmd` installed or running. Regardless, it is safe to run `epmd -daemon` even if one is already running, as the comment above notes.	2021-05-03 14:48:52 -07:00
Alex Vandiver	3577c6dbd4	puppet: `pgrep -f something` can match itself. Using `pgrep -f epmd` to determine if `epmd` is running is a race condition with itself, since the pgrep is attempting to match the "full process name" and its own full process name contains "epmd". This leads to epmd not being started when it should be, which in turn leads to rabbitmq-server failing to start. Use the standard trick for this, namely a one-character character class, to prevent self-matching.	2021-05-03 14:48:52 -07:00
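The one-character character class trick described above can be illustrated without running `pgrep` at all. The sketch below uses Python's `re.search` as a stand-in for how `pgrep -f` searches a pattern within a full command line; the command-line strings are assumed examples, not captured from a real system.

```python
import re


def matches(pattern: str, command_line: str) -> bool:
    """Mimic `pgrep -f`: search for the pattern anywhere in the command line."""
    return re.search(pattern, command_line) is not None


# Illustrative command lines; the real ones depend on the system.
EPMD_CMDLINE = "/usr/bin/epmd -daemon"

# A plain pattern also matches the command line of the check itself.
plain_self = matches("epmd", "pgrep -f epmd")

# With a one-character class, the check's own command line contains the
# literal text "[e]pmd", which the regex "[e]pmd" (meaning "epmd") does
# not match; the real daemon's command line still matches.
class_self = matches("[e]pmd", "pgrep -f [e]pmd")
class_real = matches("[e]pmd", EPMD_CMDLINE)

if __name__ == "__main__":
    print(plain_self, class_self, class_real)
```

The pattern `[e]pmd` describes the same set of strings as `epmd`, so the check for the real daemon is unchanged; only the accidental self-match disappears.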
Jennifer Hwang	c9f5946239	puppet: Add override for queue_workers_multiprocess. With tweaks to the documentation by tabbott. This uses the following configuration option: [application_server] queue_workers_multiprocess = false	2021-04-20 14:37:15 -07:00
Tim Abbott	bb676f1143	smokescreen: Move supervisor configuration to managed directory. We've established the conf.d/zulip directory as the recommended path for Zulip-managed configuration files, so this belongs there.	2021-04-16 14:05:42 -07:00
Gaurav Pandey	303e7b9701	ci: Add Debian bullseye to production test suite.	2021-04-15 21:38:31 -07:00
Gaurav Pandey	feb720b463	install: Add beta support for debian bullseye for production. This won't work on a real bullseye system until Bullseye actually officially releases. Fixes part of #17863.	2021-04-15 21:38:31 -07:00
Alex Vandiver	9de35d98d3	puppet: Ensure a snakeoil certificate, for Postfix and PostgreSQL. We use the snakeoil TLS certificate for PostgreSQL and Postfix; some VMs install the `ssl-cert` package but (reasonably) don't build the snakeoil certs into the image. Build them as needed. Fixes #14955.	2021-04-15 21:37:55 -07:00
Anders Kaseorg	b01d43f339	mypy: Fix strict_equality violations. puppet/zulip/files/nagios_plugins/zulip_postgresql/check_postgresql_replication_lag:98: error: Non-overlapping equality check (left operand type: "List[List[str]]", right operand type: "Literal[0]") [comparison-overlap] zerver/tests/test_realm.py:650: error: Non-overlapping container check (element type: "Dict[str, Any]", container item type: "str") [comparison-overlap] Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-04-13 09:18:18 -07:00
Alex Vandiver	93f3b41811	puppet: Also move avatars to the same nginx include file.	2021-04-09 08:28:42 -07:00
Alex Vandiver	aae8f454ce	puppet: Simplify uploads handling. `uploads-route.noserve` and `uploads-route.internal` contained identical location blocks for `/upload`, since differentiation was necessary for Trusty until 33c941407b72; move the now-common sections into `app`. This leaves the only differences between internal and S3 serving as a single block, which should be included or not based on config; move it to a file which may or may not be placed in `app.d/`.	2021-04-09 08:28:42 -07:00
Alex Vandiver	fb26c6b7ca	puppet: Move uwsgi_pass setting into uwsgi_params. We only ever call `uwsgi_pass django` in association with `include uwsgi_params`; refactor it in.	2021-04-09 08:28:42 -07:00
Alex Vandiver	9cf9d5f2cf	puppet: Move HTTP_X_REAL_IP setting into uwsgi_params. This effectively also adds it to serving `/user_uploads`, where its lack would cause failures to list the actual IP address.	2021-04-09 08:28:42 -07:00
Alex Vandiver	795517bd52	puppet: Only set X-Real-IP once. `07779ea879` added an additional `proxy_set_header` of `X-Real-IP` to `puppet/zulip/files/nginx/zulip-include-common/proxy`; as noted in that commit, Tornado longpoll proxies already included such a line. Unfortunately, this equates to setting that header _twice_ for Tornado ports, like so: ``` X-Real-Ip: 198.199.116.58 X-Real-Ip: 198.199.116.58 ``` ...which is represented, once parsed by Django, as an IP of `198.199.116.58, 198.199.116.58`. For IPv4, this odd "IP address" has no problems, and appears in the access logs accordingly; for IPv6 addresses, however, its length is such that it overflows a call to `getaddrinfo` when attempting to determine the validity of the IP. Remove the now-duplicated inclusion of the header.	2021-04-09 08:28:42 -07:00
Alex Vandiver	07779ea879	middleware: Do not trust X-Forwarded-For; use X-Real-Ip, set from nginx. The `X-Forwarded-For` header is a list of proxies' IP addresses; each proxy appends the remote address of the host it received its request from to the list, as it passes the request down. A naïve parsing, as SetRemoteAddrFromForwardedFor did, would thus interpret the first address in the list as the client's IP. However, clients can pass in arbitrary `X-Forwarded-For` headers, which would allow them to spoof their IP address. `nginx`'s behavior is to treat the addresses as untrusted unless they match an allowlist of known proxies. By setting `real_ip_recursive on`, it also allows this behavior to be applied repeatedly, moving from right to left down the `X-Forwarded-For` list, stopping at the right-most that is untrusted. Rather than re-implement this logic in Django, pass the first untrusted value that `nginx` computed down into Django via the `X-Real-Ip` header. This allows consistent IP addresses in logs between `nginx` and Django. Proxied calls into Tornado (which don't use UWSGI) already passed this header, as Tornado logging respects it.	2021-03-31 14:19:38 -07:00
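The right-to-left walk described in the commit above can be sketched in a few lines of Python. This is an illustration of the idea only, not nginx's implementation and not code from the repository: the function name, the example addresses, and the fallback when every hop is trusted are assumptions of this sketch, and it assumes every entry in the header is a well-formed IP address.

```python
from ipaddress import ip_address, ip_network


def real_ip(remote_addr: str, x_forwarded_for: str, trusted_proxies: list) -> str:
    """Walk X-Forwarded-For right to left; return the right-most untrusted address."""
    trusted = [ip_network(net) for net in trusted_proxies]

    def is_trusted(addr: str) -> bool:
        return any(ip_address(addr) in net for net in trusted)

    # If the direct peer is not a known proxy, ignore the header entirely:
    # anything in it could have been supplied by the client.
    if not is_trusted(remote_addr):
        return remote_addr

    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    for addr in reversed(hops):
        if not is_trusted(addr):
            return addr

    # Assumption of this sketch: if every hop is trusted, use the left-most one.
    return hops[0] if hops else remote_addr


if __name__ == "__main__":
    # The client spoofed "6.6.6.6"; the proxies appended the later entries.
    print(real_ip("10.0.0.1", "6.6.6.6, 198.51.100.7, 10.0.0.2", ["10.0.0.0/8"]))
```

In the usage example, a naive parser would report the spoofed left-most entry, whereas the right-to-left walk stops at `198.51.100.7`, the first address that a trusted proxy did not vouch for.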
Anders Kaseorg	29e4c71ec4	puppet: Reformat custom Ruby modules with Rufo. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-24 12:12:04 -07:00
Alex Vandiver	6ee74b3433	puppet: Check health of APT repository.	2021-03-23 19:27:42 -07:00
Alex Vandiver	c01345d20c	puppet: Add nagios check for long-lived certs that do not auto-renew.	2021-03-23 19:27:27 -07:00
Alex Vandiver	9ea86c861b	puppet: Add a nagios alert configuration for smokescreen. This verifies that the proxy is working by accessing a highly-available website through it. Since failure of this equates to failures of Sentry notifications and Android mobile push notifications, this is a paging service.	2021-03-18 10:11:15 -07:00
Anders Kaseorg	129ea6dd11	nginx: Consistently listen on IPv6 and with HTTP/2. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-03-17 17:46:32 -07:00
Alex Vandiver	15c58cce5a	puppet: Create new nginx logfiles as the zulip user, not as www-data. All of `/var/log/nginx/` is chown'd to `zulip` and the nginx processes themselves run as `zulip`, and would thus (on their own) create new logfiles as `zulip`. Having `logrotate` create them as the package default of `www-data` means that they are momentarily unreadable by the `zulip` user just after rotation, which can cause problems with logtail scripts. Commit the standard `nginx` logrotate configuration, but with the `zulip` user instead of the `www-data` user.	2021-03-16 14:45:13 -07:00
Alex Vandiver	3314fefaec	puppet: Do not require a venv for zulip-puppet-apply. `0663b23d54` changed zulip-puppet-apply to use the venv, because it began using `yaml` to parse the output of puppet to determine if changes would happen. However, not every install ends with a venv; notably, non-frontend servers do not have one. Attempting to run zulip-puppet-apply on them hence now fails. Remove this dependency on the venv, by installing a system python3-yaml package -- though in reality, this package is already an indirect dependency of the system. Especially since pyyaml is quite stable, we're not using it in any interesting way, and it does not actually add to the dependencies, it is preferable to parsing the YAML by hand in this instance.	2021-03-14 17:50:57 -07:00
Alex Vandiver	52f155873f	puppet: Ensure that all `scripts/lib/install` packages are installed. These have all been required packages for some time, but this helps keep the install-time list more clearly a subset of the upgrade-time list.	2021-03-14 17:50:57 -07:00
Alex Vandiver	06c07109e4	puppet: Add missing semicolons left off in `ba3b88c81b`.	2021-03-12 15:48:53 -08:00
Alex Vandiver	024282b51e	Revert "puppet: Use rabbitmq as the user for its config files." This reverts commit `211232978f`. The `rabbitmq` user does not exist yet on first install, and the goal is to create the `rabbitmq-env.conf` file before the package is installed.	2021-03-12 15:37:19 -08:00
Alex Vandiver	ba3b88c81b	puppet: Explicitly use the snakeoil certificates for nginx. In production, the `wildcard-zulipchat.com.combined-chain.crt` file is just a symlink to the snakeoil certificates; but we do not puppet that symlink, which makes new hosts fail to start cleanly. Instead, point explicitly to the snakeoil certificate, and explain why.	2021-03-12 13:31:54 -08:00
Alex Vandiver	211232978f	puppet: Use rabbitmq as the user for its config files. This matches the initial ownership by the `rabbitmq-server` package.	2021-03-12 13:31:03 -08:00
Alex Vandiver	ef188af82d	puppet: Use two location blocks, instead of nesting them. Directives in `location` blocks may or may not inherit from surrounding `location` blocks; specifically, `add_header` directives do not[1]: > There could be several add_header directives. These directives are > inherited from the previous configuration level if and only if there > are no add_header directives defined on the current level. In order to maintain the same headers (including, critically, `Access-Control-Allow-Origin`) as the surrounding block, all `add_header` directives must thus be repeated (which includes the `include`). For clarity, un-nest and repeat the entire `location` block as was used for `/static/`, but with the additional `add_header`. This is preferred to the use of an `if $request_uri` statement to add the header, as those can have unexpected or undefined results[2]. [1] http://nginx.org/en/docs/http/ngx_http_headers_module.html#add_header [2] https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/	2021-03-11 21:09:15 -08:00
Alex Vandiver	306bf930f5	puppet: Add a warning if ksplice is enabled but has no key set.	2021-03-10 17:57:20 -08:00
Alex Vandiver	a215c83c2d	puppet: Switch to more explicit variable rather than reuse a nagios one. Redis is not nagios, and this only leads to confusion as to why there is a nagios domain setting on frontend servers; it also leaves the `redis0` part of the name buried in the template. Switch to an explicit variable for the redis hostname.	2021-03-10 11:44:54 -08:00
Alex Vandiver	a5b29398fc	puppet: Only install ksplice uptrack if there is an access key.	2021-03-10 11:44:11 -08:00
Alex Vandiver	189e86e18e	puppet: Set aggressive caching headers on immutable webpack files. A partial fix for #3470.	2021-03-07 22:00:32 -08:00
Alex Vandiver	e63f170027	puppet: Add access time and host to nginx access logs. `2e20ab1658` attempted to add this; but there are multiple locations that access logs are set, and the most specific wins.	2021-03-04 18:06:47 -08:00
Alex Vandiver	8961885b0f	puppet: Add smokescreen to logrotate.	2021-03-02 17:16:38 -08:00
Alex Vandiver	d938dd9d4a	puppet: Document smokescreen installation, and move to puppet/zulip/. This is more broadly useful than for just Kandra; provide documentation and means to install Smokescreen for stand-alone servers, and motivate its use somewhat more.	2021-03-02 17:16:38 -08:00
Alex Vandiver	2f5eae5c68	puppet: Minor formatting.	2021-02-28 17:03:29 -08:00
Alex Vandiver	a759d26a32	puppet: Make ksplice config not world-readable, use 'adm' group. This matches the configuration that ksplice itself creates the file and directory with.	2021-02-28 17:03:29 -08:00
Tim Abbott	957c16aa77	nagios: Tweak prod load monitoring parameters. Ultimately this monitoring isn't that helpful, but we're mainly interested in when it spikes to very high numbers.	2021-02-26 08:39:52 -08:00
Alex Vandiver	32149c6a1c	puppet: Add ksplice uptrack for kernel hotpatches.	2021-02-25 18:05:47 -08:00
Alex Vandiver	173d2dec3d	puppet: Check in defensive restart-camo cron job. This was found on lb1; add it to the camo install on smokescreen.	2021-02-24 16:42:21 -08:00
Alex Vandiver	d15e6990e5	puppet: Only execute setup-apt-repo if necessary. This means that in steady-state, `zulip-puppet-apply` is expected to produce no changes or commands to execute. The verification step of `setup-apt-repo` is quite fast, so this cleans up the output for very little cost.	2021-02-23 18:16:02 -08:00
Alex Vandiver	0b736ef4cf	puppet: Remove puppet_ops configuration for separate loadbalancer host.	2021-02-22 16:05:13 -08:00
Alex Vandiver	e30b524896	iptables: Limit smokescreen port 4750, add camo port. Limit incoming connections to port 4750 to only the smokescreen host, and also allow access to the Camo server on that host, on port 9292.	2021-02-17 13:52:38 -08:00
Alex Vandiver	1caff01463	puppet: Configure nginx for long keep-alives when behind a loadbalancer. These optimizations only make sense when all connections at a TCP level are coming from the same host or set of hosts; as such, they are only enabled if `loadbalancer.ips` is set in the `zulip.conf`.	2021-02-17 10:25:33 -08:00
Alex Vandiver	a88af1b5a2	camo: Install on smokescreen host.	2021-02-16 08:12:31 -08:00
Alex Vandiver	29f60bad20	smokescreen: Put the version into the supervisorctl command. This makes it reload correctly if the version is changed.	2021-02-16 08:12:31 -08:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	5028c081cb	python: Merge concatenated string literals that Black would uglify. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Alex Vandiver	559cdf7317	puppet: Set APT::Periodic::Unattended-Upgrade in apt config. This is required for unattended upgrades to actually run regularly. In some distributions, it may be found in 20auto-upgrades, but placing it here makes it more discoverable.	2021-02-12 08:59:19 -08:00
Ganesh Pawar	65e23dd713	puppet: Add Zulip specific postgresql configuration for 13. Based on the work done in `a03e4784c7`.	2021-02-05 09:30:34 -08:00
Ganesh Pawar	90a3dc8a91	puppet: Add upstream version of postgresql 13 config. This is a prep commit to add provision support for Ubuntu 20.10 Groovy.	2021-02-05 09:30:34 -08:00
Tim Abbott	fd8504e06b	munin: Update to use NAGIOS_BOT_HOST. We haven't actively used this plugin in years, and so it was never converted from the 2014-era monitoring to detect the hostname. This seems worth fixing since we may want to migrate this logic to a more modern monitoring system, and it's helpful to have it correct.	2021-01-27 12:07:09 -08:00
Alex Vandiver	ab035f76de	puppet: Be more restrictive about mm addresses. These will always have only 32 characters after the `mm`.	2021-01-26 10:13:58 -08:00
Alex Vandiver	a53092687e	puppet: Only match incoming gateway address on our mail domain. `79931051bd` allows outgoing emails from localhost, but outgoing recipients are still subjected to virtualmaps. This caused all outgoing email from Zulip with destination addresses containing `.`, `+`, or starting with `mm`, to be redirected back through the email gateway. Bracket the virtualmap addresses used for local delivery to the mail gateway with a restriction on the domain matching the `postfix.mailname` configuration, regex-escaped, so those only apply to email destined for that domain. The hostname is _not_ moved from `mydestination` to `virtual_alias_domains`, as that would preclude delivery to actually-local addresses, like `postmaster@`.	2021-01-26 10:13:58 -08:00
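The effect of restricting the gateway patterns to a regex-escaped mail domain, together with the 32-character `mm` rule from the commit listed just before this one, can be sketched with Python's `re` module. The pattern below is a hypothetical reconstruction for illustration; the actual Postfix virtual map entries are not shown in this listing, and the function name, the domain, and the addresses are assumptions.

```python
import re


def gateway_pattern(mailname: str) -> re.Pattern:
    """Build a hypothetical pattern for addresses handled by the email gateway."""
    # Escape the domain so that "." matches only a literal dot.
    domain = re.escape(mailname)
    # Local part either contains "." or "+", or is "mm" plus exactly 32 characters.
    return re.compile(rf"^(?:[^@]*[.+][^@]*|mm[^@]{{32}})@{domain}$")


if __name__ == "__main__":
    pattern = gateway_pattern("zulip.example.com")  # hypothetical mailname
    for address in (
        "stream.token@zulip.example.com",   # gateway address on our domain
        "first.last@mail.example.org",      # ordinary outgoing recipient
    ):
        print(address, bool(pattern.match(address)))
```

Without the domain restriction, the second address would match on its local part alone and be redirected back through the gateway, which is the failure the commit describes.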
Alex Vandiver	c2526844e9	worker: Remove SignupWorker and friends. ZULIP_FRIENDS_LIST_ID and MAILCHIMP_API_KEY are not currently used in production. This removes the unused 'signups' queue and worker.	2021-01-17 11:16:35 -08:00
Tim Abbott	4ee58f408b	process_fts_updates: Make normal development startup silent. We run this tool at DEBUG log level in production, so we will still see the notice on startup there; this avoids a spammy line in the development environment output.	2020-12-20 12:19:49 -08:00
Sutou Kouhei	0d3f9fc855	install: Use PGroonga packages built for PostgreSQL packages by PGDG. This is because we have always used PostgreSQL packages by PGDG since Zulip 3.0. Fixes #16058.	2020-12-18 15:38:21 -08:00
Alex Vandiver	4868a4fe48	puppet: Set a long timeout on wal-g wal-push, to prevent stalls. `wal-g wal-push` has a known bug with occasionally hanging after file upload to S3[1]; set a rather long timeout on the upload process, so that we don't simply stall forever when archiving WAL segments. [1] https://github.com/wal-g/wal-g/issues/656	2020-11-20 11:32:36 -08:00
Sourabh Rana	419f163906	nginx: Increase file upload size from 25mb to 80mb.	2020-11-19 00:49:49 -08:00
Alex Vandiver	90ca06d873	puppet: Allow unattended upgrades of -updates in addition to -security. This ensures that software will be fully up-to-date, not just with security patches.	2020-11-13 16:45:05 -08:00
Alex Vandiver	2e20ab1658	puppet: Log the "Host" header and total response time. Logging `Host` is useful for determining access patterns to realms, especially if ROOT_DOMAIN_LANDING_PAGE is set. Total response time is useful in debugging access and performance patterns.	2020-11-13 16:42:32 -08:00
Tim Abbott	494a685827	puppet: Fix typo in name of missedmessage_emails consumer. This has been present since this check was introduced in `45c9c3cc30`.	2020-10-29 12:28:54 -07:00
Tim Abbott	ab3cb2b3bf	puppet: Fix internal redis puppet configuration. The inherits rule is required for overriding existing configuration files; while the `::profile` piece was missed in the recent ::profile migration.	2020-10-29 11:53:43 -07:00
Alex Vandiver	6b9d7000b5	puppet: Set proxy environment variables. These are respected by `urllib`, and thus also `requests`. We set `HTTP_proxy`, not `HTTP_PROXY`, because the latter is ignored in situations which might be running under CGI -- in such cases it may be coming from the `Proxy:` header in the request.	2020-10-28 12:17:35 -07:00
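The difference between `HTTP_proxy` and `HTTP_PROXY` noted above can be observed with the standard library's `urllib.request.getproxies_environment()`. The sketch below temporarily replaces the process environment to simulate a CGI context, which `urllib` detects through the `REQUEST_METHOD` variable. It assumes a POSIX system where environment variable names are case-sensitive; the helper name and the proxy URL are hypothetical.

```python
import os
import urllib.request
from unittest import mock

PROXY = "http://proxy.example.invalid:4750"  # hypothetical proxy URL


def proxies_for(env: dict) -> dict:
    """Return the proxies urllib would pick up from the given environment."""
    # clear=True hides the real environment for the duration of the call.
    with mock.patch.dict(os.environ, env, clear=True):
        return urllib.request.getproxies_environment()


# REQUEST_METHOD makes urllib assume it may be running under CGI.
mixed_case = proxies_for({"REQUEST_METHOD": "GET", "HTTP_proxy": PROXY})
upper_case = proxies_for({"REQUEST_METHOD": "GET", "HTTP_PROXY": PROXY})

if __name__ == "__main__":
    print("HTTP_proxy:", mixed_case)
    print("HTTP_PROXY:", upper_case)
```

Under these assumptions the variable with the lowercase `_proxy` suffix is still honored, while the all-uppercase form is dropped because it could have been derived from a client-supplied `Proxy:` request header.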
Alex Vandiver	8b0f32ee07	puppet: Move environment-setting into configuration, not command.	2020-10-28 12:13:04 -07:00
Alex Vandiver	b9797770d3	provision: Rename backup directory to postgresql.	2020-10-28 11:57:03 -07:00
Alex Vandiver	1f7132f50d	docs: Standardize on PostgreSQL, not Postgres.	2020-10-28 11:55:16 -07:00
Alex Vandiver	eaa99359b1	puppet: Rename to check_postgresql_replication_lag.	2020-10-28 11:51:52 -07:00
Alex Vandiver	53e59a0a13	puppet: Rename check_postgres_backup to check_postgresql_backup.	2020-10-28 11:51:52 -07:00
Alex Vandiver	45f6c79c4a	puppet: Rename postgres_ variables to postgresql_.	2020-10-28 11:51:52 -07:00
Alex Vandiver	e124324050	puppet: Rename postgres_appdb in nagios to postgresql.	2020-10-28 11:51:52 -07:00
Alex Vandiver	a155430eb5	docs: Document all zulip.conf settings. This provides a single reference point for all zulip.conf settings; these mostly link out to the more complete documentation about each setting, elsewhere. Fixes #12490.	2020-10-27 13:31:57 -07:00
Alex Vandiver	e81bc19e45	puppet: Remove shims for old classes, except dockervoyager. The upgrade mechanism in the previous commit negates the need for them -- with the exception of dockervoyager.	2020-10-27 13:29:19 -07:00
Alex Vandiver	d24c571bab	puppet: Automatically back up the database if we have the secrets. This avoids folks having to manually add to the puppet_classes.	2020-10-27 13:29:19 -07:00
Alex Vandiver	e7798d2797	puppet: Move zulip_ops::profile::postgres_appdb to postgresql.	2020-10-27 13:29:19 -07:00
Alex Vandiver	9f25389bff	puppet: Move top-level zulip_ops deployments to zulip_ops::profile.	2020-10-27 13:29:19 -07:00
Alex Vandiver	5365af544a	puppet: Rename zulip::profile::rabbit to ::rabbitmq.	2020-10-27 13:29:19 -07:00
Alex Vandiver	188af57296	puppet: Rename postgres_appdb to postgresql. There is only one PostgreSQL database; the "appdb" is irrelevant. Also use "postgresql," as it is the name of the software, whereas "postgres" is the name of the binary and colloquial name. This is minor cleanup, but enabled by the other renames in the previous commit.	2020-10-27 13:29:19 -07:00
Alex Vandiver	91cb0988e1	puppet: Generalize docker detection. This also has the benefit of detecting zulip::dockervoyager as well as zulip::profile::docker.	2020-10-27 13:29:19 -07:00
Alex Vandiver	0f25acc7b3	puppet: Rename "voyager"/"dockervoyager" to "standalone"/"docker". The "voyager" name is non-intuitive and not significant. `zulip::voyager` and `zulip::dockervoyager` stubs are kept for back-compatibility with existing `zulip.conf` files.	2020-10-27 13:29:19 -07:00
Alex Vandiver	c2185a81d6	puppet: Move top-level zulip deployments into "profile" directory. This moves the puppet configuration closer to the "roles and profiles method"[1] which is suggested for organizing puppet classes. Notably, here it makes clear which classes are meant to be able to stand alone as deployments. Shims are left behind at the previous names, for compatibility with existing `zulip.conf` files when upgrading. [1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method	2020-10-27 13:29:19 -07:00
Alex Vandiver	27cfb14d92	puppet: Only include zulip::base for top-level deploys. This also removes direct includes of `zulip::common`, making `zulip::base` gatekeep the inclusion of it. This helps enforce that any top-level deploy only needs to include a single class, and that any configuration which is not meant to be deployed by itself will not apply, due to lack of `zulip::common` include. The following commit will better differentiate these top-level deploys by moving them into a subdirectory.	2020-10-27 13:29:19 -07:00
Alex Vandiver	34e8c2c61e	puppet: Move total_memory_mb from zulip::base into zulip::common. This makes `zulip::common` used only for variable-setting, and `zulip::base` used only for resource creation.	2020-10-27 13:29:19 -07:00
Alex Vandiver	7bb888c2ec	puppet: Template supervisor.conf for redhat paths.	2020-10-27 13:29:19 -07:00
Alex Vandiver	3ab9b31d2f	puppet: Purge all un-managed supervisor configuration files. Relying on `defined(Class['...'])` makes the class sensitive to resource evaluation ordering, and thus brittle. It is also only functional for a single service (thumbor). Generalize by using `purge => true` for the directory to automatically remove all un-managed files. This is more general than the previous form, and may result in additional not-managed services being removed.	2020-10-27 13:29:19 -07:00
Alex Vandiver	1d54630b4e	log: Rename email-deliverer.log to match other files.	2020-10-25 14:56:37 -07:00
Alex Vandiver	93d661d119	puppet: Configure logrotate for all logger files. This adds log rotation to all /var/log/zulip files.	2020-10-25 14:56:37 -07:00
Alex Vandiver	c296b5d819	puppet: Allow unattended-upgrades for all but servers. Restarting servers is what can cause service interruptions, and increase risk. Add all of the servers that we use to the list of ignored packages, and uncomment the default allowed-origins in order to enable unattended upgrades.	2020-10-23 16:46:06 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Alex Vandiver	a7d1fd9ffb	puppet: Remove non-working apt::source. `d2aa81858c` replaced the `apt::source` to set up debathena with `Exec['setup-apt-repo-debathena']`, but mistakenly left the `apt::source` in place in `zmirror` (but not `zmirror_personals`). The `apt::source` resource type was later removed in `c9d54f7854`, making the manifest fail to apply on `zmirror`. Remove the broken and unnecessary `apt::source` resource.	2020-10-23 11:31:20 -07:00
Alex Vandiver	48e06c25ba	puppet: Switch nagios SSH checks to id_ed25519 key. The ssh-rsa algorithm was deprecated[1] in OpenSSH 8.2 (2020-02-14) and will be removed in a future release. [1] https://www.openssh.com/txt/release-8.4	2020-10-22 16:42:30 -07:00
Alex Vandiver	0ea20bd7d8	puppet: Move postgres_version into postgres_common. This property is not related to the base zulip install; move it to zulip::postgres_common, which is already used as a namespace for various postgres variables.	2020-10-22 11:32:25 -07:00
Alex Vandiver	25e995b677	puppet: Move normal_queues to the one place that uses it.	2020-10-22 11:32:25 -07:00
Alex Vandiver	423b5c2be2	puppet: Move queue error and stats directories to just the app host.	2020-10-22 11:31:05 -07:00
Alex Vandiver	4d4c21499a	puppet: Move supervisor dependency into process_fts_updates. PostgreSQL itself has no dependency on supervisor; rather, the FTS updates do.	2020-10-22 11:30:53 -07:00
Alex Vandiver	ca971ebc59	puppet: Remove empty zulip_ops class.	2020-10-22 11:30:53 -07:00
Alex Vandiver	16af05758d	puppet: Move zulip_org into zulip_ops. This class is not of general interest.	2020-10-22 11:30:53 -07:00
Alex Vandiver	ad566c491d	puppet: Drop now-unused zulip_ops::git class.	2020-10-22 11:30:53 -07:00
Alex Vandiver	50e9e2ed20	puppet: Make zulip::base include zulip::apt_repository. There was likely more dependency complexity prior to `97766102df`, but there is now no reason to require that consumers explicitly include zulip::apt_repository.	2020-10-22 11:30:53 -07:00
Alex Vandiver	2dc6d26ec6	puppet: Fix included monitoring class name.	2020-10-19 22:30:20 -07:00
Alex Vandiver	7a1132d605	puppet: Switch golang and smokescreen to use /srv. /srv and /opt have very similar usages; but we should be internally consistent. Move these two (the only usages of /opt) to match the rest in /srv.	2020-10-16 13:00:06 -07:00
Alex Vandiver	78b92a51cc	puppet: Allow access to smokescreen port via iptables.	2020-10-15 15:18:35 -07:00
Alex Vandiver	0d5356969e	puppet: Reformat ipv4 iptables rules comments.	2020-10-15 15:18:35 -07:00
Alex Vandiver	fffea9612b	puppet: Add an outgoing HTTP/HTTPS proxy server. Use https://github.com/stripe/smokescreen to provide a server for an outgoing proxy, run under supervisor. This will allow centralized blocking of internal metadata IPs, localhost, and so forth, as well as providing default request timeouts (10s by default).	2020-10-15 15:18:35 -07:00
Anders Kaseorg	dfaea9df65	shfmt: Reformat shell scripts with shfmt. https://github.com/mvdan/sh Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-15 15:16:00 -07:00
Alex Vandiver	f61ac4a28d	puppet: Move frontend monitoring into its own file. This allows it to be pulled in for deploys like czo, which don't use the full `zulip_ops::app_frontend`, but we wish to monitor.	2020-10-13 17:37:32 -07:00
Tim Abbott	7c2c82b190	nginx: Update nginx configuration for fhir/hl7 organization. We should eventually add templating for the set of hosts here, but it's worth merging this change to remove the deleted hostname and replace it with the current one.	2020-10-13 16:50:26 -07:00
Anders Kaseorg	723d285e46	nginx: Redirect {www.,}zulipchat.com, www.zulip.com to zulip.com. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-13 16:49:23 -07:00
Alex Vandiver	c8df9a150e	puppet: Drop all log2zulip configuration. Disabled on webservers in `047817b6b0`, it has since lingered in configuration, as well as running (to no effect) every minute on the loadbalancer. Remove the vestiges of its configuration.	2020-10-13 11:00:50 -07:00
Alex Vandiver	b431b1b021	puppet: Remove misleading motd. This banner shows on lb1, advertising itself as lb0. There is no compelling reason for a custom motd, especially one which needs to be reconfigured for each host.	2020-10-13 11:00:36 -07:00
Alex Vandiver	45c9c3cc30	queue: Monitor user_activity queue, now that it has a consumer. Since this was using repeated individual get() calls previously, it could not be monitored for having a consumer. Add it in, by marking it of queue type "consumer" (the default), and adding Nagios lines for it. Also adjust missedmessage_emails to be monitored; it stopped using LoopQueueProcessingWorker in `5cec566cb9`, but was never added back into the set of monitored consumers.	2020-10-11 14:19:42 -07:00
Alex Vandiver	4fd7df4e8c	puppet: Remove absent of check-apns-tokens. This was marked as ensure absent in `d02101a401`, in v1.7.0 in 2017.	2020-09-29 18:17:08 -07:00
Alex Vandiver	872a349508	puppet: Remove absent of log2zulip. This was marked as ensure absent in `047817b6b0`, in v2.0.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	0137772fdb	puppet: Remove absent of calculate-first-visible-message-id. This was marked as ensure absent in `dc7d44a245`, in v1.9.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	966c8dc23d	puppet: Remove absent of email-mirror cron job. This was marked as ensure absent in `24f8492236`, in v1.3.0 in 2014.	2020-09-29 18:17:08 -07:00
Alex Vandiver	430d3b8554	puppet: Remove absent of libapache2-mod-wsgi. This was marked as ensure absent in `89b97e7480`, in v1.7.0 in 2017, though it did not take effect until `6e55aa2ce6`, in v1.9.0 in 2018.	2020-09-29 18:17:08 -07:00
Alex Vandiver	12085552d5	puppet: Tidy indentation.	2020-09-29 17:44:44 -07:00
Alex Vandiver	57d88eedd8	puppet: Only install rabbitmq cron jobs via zulip_ops. The rabbitmq cron jobs exist in order to call rabbitmqctl as root and write the output to files that nagios can consume, since nagios is not allowed to run rabbitmqctl. In systems which do not have nagios configured, these every-minute cron jobs add non-insignificant load, to no effect. Move their installation into `zulip_ops`. In doing so, also combine the cron.d files into a single file; this allows us to `ensure => absent` the old filenames, removing them from existing systems. Leave the resulting combined cron.d file in `zulip`, since it is still of general utility and note.	2020-09-29 17:44:44 -07:00
Alex Vandiver	79931051bd	puppet: Permit outgoing mail from postfix. The configuration change made in `1c17583ad5` only allowed delivery to those specific Zulip addresses. However, it also prevented the mailserver from being used as an outgoing email relay from Zulip, since all mail that passed through the mailserver (from any originator) was required to have a `RCPT TO` that matched those regexes. Allow mail originating from `mynetworks` to have arbitrary addresses in `RCPT TO`.	2020-09-25 15:09:27 -07:00
Alex Vandiver	36ea307fbf	puppet: Depend other changes on sharding.py validation. Use the validation of the tornado sharding config that `stage_updated_sharding` does, by depending on it. This ensures that we don't write out a supervisor or nginx config based on a bad (e.g. non-sequential) list of tornado ports.	2020-09-25 10:52:40 -07:00
Alex Vandiver	c0e240277b	tornado: Remove fingerprinting, write out .tmp files always. Fingerprinting the config is somewhat brittle -- it requires custom bootstrapping for old (fingerprint-less) configs, and may have false-positives. Since generating the config is lightweight, do so into the .tmp files, and compare the output to the originals to determine if there are changes to apply. In order to both surface errors, as well as notify the user in case a restart is necessary, we must run it twice. The `onlyif` functionality cannot show configuration errors to the user, only determine if the command runs or not. We thus run the command once, judging errors as "interesting" enough to run the actual command, whose failure will be verbose in Puppet and halt any steps that depend on it. Removing the `onlyif` would result in `stage_updated_sharding` showing up in the output of every Puppet run, which obscures the important messages it displays when an update to sharding is necessary. Removing the `command` (e.g. making it an `echo`) would result in removing the ability to report configuration errors. We thus have no choice but to run it twice; this is thankfully low-overhead.	2020-09-25 10:52:40 -07:00
Alex Vandiver	2a12fedcf1	tornado: Remove explicit tornado_processes setting; compute it. We can compute the intended number of processes from the sharding configuration. In doing so, also validate that all of the ports are contiguous. This removes a discrepancy between `scripts/lib/sharding.py` and other parts of the codebase about if merely having a `[tornado_sharding]` section is sufficient to enable sharding. Having behaviour which changes merely based on if an empty section exists is surprising. This does require that a (presumably empty) `9800` configuration line exist, but making that default explicit is useful. After this commit, configuring sharding can be done by adding to `zulip.conf`: ``` [tornado_sharding] 9800 = # default 9801 = other_realm ``` Followed by running `./scripts/refresh-sharding-and-restart`.	2020-09-18 15:13:40 -07:00
Alex Vandiver	f638518722	tornado: Move default production port to 9800. In development and test, we keep the Tornado port at 9993 and 9983, respectively; this allows tests to run while a dev instance is running. In production, moving to port 9800 consistently removes an odd edge case, when just one worker is on an entirely different port than if two workers are used.	2020-09-18 15:13:40 -07:00
Alex Vandiver	ff94254598	tornado: Log to files by port number. Without an explicit port number, the `stdout_logfile` values for each port are identical. Supervisor apparently decides that it will de-conflict this by appending an arbitrary number to the end: ``` /var/log/zulip/tornado.log /var/log/zulip/tornado.log.1 /var/log/zulip/tornado.log.10 /var/log/zulip/tornado.log.2 /var/log/zulip/tornado.log.3 /var/log/zulip/tornado.log.7 /var/log/zulip/tornado.log.8 /var/log/zulip/tornado.log.9 ``` This is quite confusing, since most other files in `/var/log/zulip/` use `.1` to mean logrotate was used. Also note that these are not all sequential -- 4, 5, and 6 are mysteriously missing, though they were used in previous restarts. This can make it extremely hard to debug logs from a particular Tornado shard. Give the logfiles a consistent name, and set them up to logrotate.	2020-09-14 22:17:51 -07:00
Alex Vandiver	efdaa58c24	supervisor: Use more specific process_name than "port-9800". Making this include "zulip-tornado" makes it clearer in supervisor logs. Without this, one only sees: ``` 2020-09-14 03:43:13,788 INFO waiting for port-9807 to stop 2020-09-14 03:43:14,466 INFO stopped: port-9807 (exit status 1) 2020-09-14 03:43:14,469 INFO spawned: 'port-9807' with pid 24289 2020-09-14 03:43:15,470 INFO success: port-9807 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs) ```	2020-09-14 22:17:51 -07:00
Alex Vandiver	e9d0bdea65	puppet: Coerce uwsgi_listen_backlog_limit into an int before doing math.	2020-09-14 21:22:13 -07:00
Alex Vandiver	8adf530400	puppet: Generate sharding in puppet, then refresh-sharding-and-restart. This supports running puppet to pick up new sharding changes, which will warn of the need to finalize them via `refresh-sharding-and-restart`, or simply running that directly.	2020-09-14 16:27:15 -07:00
Alex Vandiver	0de356c2df	puppet: Move generation of tornado nginx upstreams into tornado_sharding. This puts the creation of the upstreams referenced by `nginx_sharding.conf` adjacent to their use.	2020-09-14 16:27:15 -07:00
Alex Vandiver	bf029d99f1	sharding: Also mark sharding.json 644 for consistency. There is no reason to limit this to 640; mark it 644 for consistency with the other file.	2020-09-14 16:27:15 -07:00
Alex Vandiver	1c17583ad5	puppet: Restrict postfix incoming addresses to postmaster and zulip. This removes the possibility of local user enumeration via RCPT TO.	2020-09-11 18:49:22 -07:00
Alex Vandiver	482c964dd3	puppet: Logrotate for webhook exceptions.	2020-09-10 17:47:21 -07:00
Alex Vandiver	e38051736d	puppet: Wrap and sort logrotate config.	2020-09-10 17:47:21 -07:00
Anders Kaseorg	75c59a820d	python: Convert subprocess.Popen.communicate to run or check_output. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 17:42:35 -07:00
Anders Kaseorg	fbfd4b399d	python: Elide action="store" for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 16:17:14 -07:00
Anders Kaseorg	1f2ac1962f	python: Elide default=None for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 16:17:14 -07:00
Anders Kaseorg	d751e0cece	puppet: Don’t install netcat. It’s been unused since commit `0af22dad18` (#13239). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-03 10:33:47 -07:00
Anders Kaseorg	ab120a03bc	python: Replace unnecessary intermediate lists with generators. Mostly suggested by the flake8-comprehension plugin. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:15:41 -07:00
Anders Kaseorg	a5dbab8fb0	python: Remove redundant dest for argparse arguments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:04:10 -07:00
Anders Kaseorg	dbdf67301b	memcached: Switch from pylibmc to python-binary-memcached. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-08-06 12:51:14 -07:00
Casper Kvan Clausen	ed7a6d5e4d	puppet: Support nginx_listen_port with http_only	2020-08-03 12:58:12 -07:00
Alex Vandiver	cd530d627b	uwsgi: Stop generating IOError and SIGPIPE on client close. Clients that close their socket to nginx suddenly also cause nginx to close its connection to uwsgi. When uwsgi finishes computing the response, it thus tries to write to a closed socket, and generates either IOError or SIGPIPE failures. Since these are caused by the _client_ closing the connection suddenly, they are not actionable by the server. At particularly high volumes, this could represent some sort of server-side failure; however, this is better detected by examining status codes at the loadbalancer. nginx uses the error code 499 for this occurrence: https://httpstatuses.com/499 Stop uwsgi from generating this family of exception entirely, using configuration for uwsgi[1]; it documents these errors as "(annoying)," hinting at their general utility. [1] https://uwsgi-docs.readthedocs.io/en/latest/Options.html#ignore-sigpipe	2020-07-31 10:40:09 -07:00
Alex Vandiver	ceb909dbc5	puppet: Increase backlogged socket count based on uwsgi backlog. Increasing the uwsgi listen backlog is intended to allow it to handle higher connection rates during server restart, when many clients may be trying to connect. The kernel, in turn, needs to have a proportionally increased somaxconn so as to not refuse the connection. Set somaxconn to 2x the uwsgi backlog, but no lower than the default (128).	2020-07-28 21:16:26 -07:00
Alex Vandiver	38d01cd4db	puppet: Generalize install-wal-g to be arbitrary tarballs.	2020-07-24 17:24:57 -07:00
Tim Abbott	5a1243db3c	puppet: Use correct scope for zulip_ops::munin_plugin.	2020-07-15 21:49:45 -07:00
Alex Vandiver	48c3c33d10	puppet: Fully-qualify the munin-plugin name	2020-07-14 17:58:51 -07:00
Alex Vandiver	c68333040b	puppet: Revert PostgreSQL setting of recovery_target_timeline. Prior to PostgreSQL 12, the `recovery_target_timeline` setting is only valid in a `recovery.conf` file, as that file has its own configuration parser. As such, including it in `postgresql.conf` results in an error, and PostgreSQL will fail to start. Remove the setting, reverting `bff3b540b1`. This fixes PostgreSQL 9.5, 9.6, 10, and 11; while the setting is not an error in a PostgreSQL 12 configuration file, it is unnecessary since `latest` is the default.	2020-07-14 16:28:20 -07:00
Alex Vandiver	31d80a77d4	puppet: Update nagios check_postgres_replication_lag to be on DB hosts. `7d4a370a57` attempted to move the replication check to run on the PostgreSQL hosts. While it updated the _check_ to assume it was running and talking to a local PostgreSQL instance, the configuration and installation for the check were not updated. As such, the check ran on the nagios host for each DB host, and produced no output. Start distributing the check to all appdb hosts, and configure nagios to use the SSH tunnel to get there.	2020-07-14 16:27:18 -07:00
Alex Vandiver	2174db27db	puppet: Put the dependencies on pg_backup_and_purge itself, and ensure them.	2020-07-14 00:40:25 -07:00
Alex Vandiver	6c27f07c1d	puppet: Move PostgreSQL backups to their own class. wal-g was used in `puppet/zulip` by env-wal-g, but only installed in `puppet/zulip_ops`. Merge all of the dependencies of doing backups using wal-g (wal-g installation, the pg_backup_and_purge job, the nagios plugin that verifies it happens) into a common base class in `puppet/zulip`, since it is generally useful.	2020-07-14 00:40:25 -07:00
Anders Kaseorg	15483c09cb	puppet: Add missing trailing commas. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-13 15:36:06 -07:00
Alex Vandiver	3691a94efe	puppet: Configure munin and nagios under apache with puppet. This swaps in the actually-in-use munin configuration file; otherwise, it is an implementation of the configuration as it exists on the machine.	2020-07-13 13:23:11 -07:00
Alex Vandiver	4e42164b4a	munin: Add plugins to prod hosts.	2020-07-13 13:23:11 -07:00
Alex Vandiver	2a14212b27	munin: Add a helper resource definition for munin plugins.	2020-07-13 12:49:28 -07:00
Alex Vandiver	7c7b5fcd6f	munin: Deal with spaces in the channel names.	2020-07-13 12:49:28 -07:00
Alex Vandiver	eda2c4b8e2	puppet: Split munin-node from munin-server. No plugins are installed inside the /usr/local/munin/lib this creates in munin-node, nor are they symlinked into /etc/munin/plugins, so no non-default plugins are added by this.	2020-07-13 12:49:28 -07:00
Alex Vandiver	ddc7bb5a45	munin: Fix the path to check_send_receive_time.	2020-07-13 12:49:28 -07:00
Alex Vandiver	8be544e7eb	munin: Rename monitoring plugin to use zulip name, not humbug.	2020-07-13 12:49:28 -07:00
Alex Vandiver	1b3560af94	nagios: Stop assuming /api is where zulip client is. The api/ directory was removed in f9ba3cb60c; as that commit notes, we use the python-zulip-api module for that, added in `938597c5da`.	2020-07-13 12:49:28 -07:00
Mateusz Mandera	57d3ef42b8	puppet: Don't run thumbor services in production. Fixes #15649. Currently, no production services use thumbor; so, it makes sense to not run them in production systems.	2020-07-10 14:22:17 -07:00
Alex Vandiver	f0f29584aa	puppet: Add an arity count ("at least two") to zulipconf function.	2020-07-10 00:14:09 -07:00
Alex Vandiver	8cff27f67d	puppet: Pull hosts from zulip.conf, not hardcoded list. The one complexity is that hosts_fullstack are treated differently, as they are not currently found in the manual `hosts` list, and as such do not get munin monitoring.	2020-07-10 00:14:09 -07:00
Alex Vandiver	24383a5082	puppet: Rename hosts_domain so hosts_prefix can be grepped for.	2020-07-10 00:14:09 -07:00
Alex Vandiver	a4e7c7a27e	nagios: Remove check_memcached. check_memcached does not support memcached authentication even in its latest release (it’s in a TODO item comment, and that’s it), and was never particularly useful.	2020-07-10 00:12:48 -07:00
Anders Kaseorg	ebf7f4d0f6	zthumbor: Rename thumbor.conf to thumbor_settings.py. So we can apply all our lint checks to it. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Anders Kaseorg	9900298315	zthumbor: Remove Python 2 residue. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Alex Vandiver	17002f2a0e	puppet: Allow passing an alternate config path to zulip-puppet-apply. When temporary configuration changes are desired, this lets one set up an alternate `zulip.conf` to apply while leaving the true one in place.	2020-07-06 18:30:16 -07:00
Alex Vandiver	64b44a12f5	puppet: Add an exec rule to reload the whole supervisor config. When supervisor is first installed, it is started automatically, and creates the socket, owned by root. Subsequent reconfiguration in puppet only calls `reread + update`, which is insufficient to apply the `chown = zulip:zulip` line in `supervisord.conf`, leaving the socket owned by `root` and the last part of the installation unable to restart `supervisor` services as the `zulip` user. The `chown` line in `scripts/lib/install` exists to paper over this. Add a separate exec target for changes to `supervisord.conf` itself, which restarts the full service. This leaves the default `restart` action on the service for the lightweight `reread + update` action, which is more common. We use `systemctl` only on redhat-esque builds, because CI runs Ubuntu, but init is not systemd in that context. `systemctl reload` is sufficient to re-apply the socket ownership, but a full `restart` and not `reload` is necessary under `/etc/init.d/supervisor`.	2020-07-01 10:40:54 -07:00
Alex Vandiver	dd91f8edba	puppet: Move supervisor start command into zulip::common. Move this command alongside the rest of the distro-dependent supervisor paths.	2020-07-01 10:40:53 -07:00
Alex Vandiver	a5d63cfedf	wal-g: Update pg_backup_and_purge for wal-g format. wal-g has a slightly different format than wal-e in its `backup-list` output; it only contains three columns: - `name` - `last_modified` - `wal_segment_backup_start` ..rather than wal-e's plethora, most of which were blank: - `name` - `last_modified` - `expanded_size_bytes` - `wal_segment_backup_start` - `wal_segment_offset_backup_start` - `wal_segment_backup_stop` - `wal_segment_offset_backup_stop` Remove one argument from the split.	2020-06-29 17:17:26 -07:00
Alex Vandiver	a21a086f5c	puppet: nagios-plugins-basic is replaced by monitoring-plugins-basic. In Bionic, nagios-plugins-basic is a transitional package which depends on monitoring-plugins-basic. In Focal, it is a virtual package, which means that every time puppet runs, it tries to re-install the nagios-plugins-basic package. Switch all instances to referring to `$zulip::common::nagios_plugins`, and repoint that to monitoring-plugins-basic.	2020-06-29 14:58:01 -07:00
Alex Vandiver	6fdcb4aa17	puppet: Move supervisor conf file path into zulip::common. Move this config file alongside the rest of the distro-dependent paths.	2020-06-29 13:41:05 -07:00
Alex Vandiver	93401448b9	puppet: Explain value of reload && update trick for supervisor. While the stock reload works just fine, it causes too much disruption.	2020-06-29 13:39:09 -07:00
Alex Vandiver	d2de5aced8	puppet: Remove unnecessary supervisor service name variable.	2020-06-29 13:39:09 -07:00
Alex Vandiver	73805f8279	puppet: Stop removing file that contains only comments. In modern PostgreSQL, this file, provided by `postgresql-common`, has no non-comment, non-blank lines. There's hence no reason to remove it.	2020-06-29 13:37:42 -07:00
Alex Vandiver	6e3a424921	puppet: Install the latest postgresql-client on frontend hosts. Frontend hosts in multiple-host configurations (including docker hosts) need a `psql` binary installed. `ca9d27175b` switched to not setting `postgresql.version` in `zulip.conf`, which in turn means that `$zulip::base::postgres_version` is unset. This, in turn, led to the frontend hosts installing `postgresql-client-`, whose trailing dash causes apt to _uninstall_ that package. Unconditionally install `postgresql-client` with no explicit version attached. This is a metapackage which depends on the latest client package, which currently means it will install `postgresql-client-12`. On single-host installs which have configured `postgresql.version` in `zulip.conf` to be a lower version, this will result in `postgresql-client-12` existing alongside another version (e.g. `postgresql-client-10`); `psql` will give the most recent. This is acceptable because the semantic meaning of the postgresql version in `zulip.conf` is about the database engine itself, not the command-line client.	2020-06-29 13:37:16 -07:00
Alex Vandiver	2c36bb19b2	puppet: Pull out `unzip` package which is identical in both cases.	2020-06-29 13:37:16 -07:00
Alex Vandiver	876ee4a8ed	installer: Remove code specific to stretch or xenial. Support for Xenial and Stretch was removed (`5154ddafca`, `0f4b1076ad`, `8944e0ad53`, `79acd5ae40`, `1219a2e854`), but not all codepaths were updated to remove their conditionals on it. Remove all code predicated on Xenial or Stretch. debathena support was migrated to Bionic, since that appears to be the current state of existing debathena servers.	2020-06-24 12:57:38 -07:00
Anders Kaseorg	a9e59b6bd3	memcached: Change the default MEMCACHED_USERNAME to zulip@localhost. This prevents memcached from automatically appending the hostname to the username, which was a source of problems on servers where the hostname was changed. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-19 21:22:30 -07:00
Alex Vandiver	7250d41bf7	puppet: Fix the path to install-wal-g	2020-06-17 15:23:18 -07:00
Alex Vandiver	03bffd3938	upgrade-zulip: Pin the postgres version to the OS default. We would prefer to use the postgres packages from Postgres themselves, if available. However, this requires ensuring that, for existing installs, we preserve the same version of postgres as their base distribution installed. Move the version-determination logic from being computed at puppet interpolation time, to being computed at install time and pinned into zulip.conf.	2020-06-16 17:05:46 -07:00
Tim Abbott	26396c5e25	puppet: Fix exceptions with multiple certbot declarations. Since `9e8f1aacb3`, zulip_ops machines might have two Package declarations for `certbot`, which doesn't work in puppet. The fix is, as usual, to use our `zulip::safepackage` wrapper instead.	2020-06-15 18:21:33 -07:00
Alex Vandiver	bff3b540b1	puppet: Postgres replication should always switch to latest timeline. Omission of this setting makes resuming after a primary switchover difficult-to-impossible. It is the default in PostgreSQL 12.	2020-06-15 16:18:07 -07:00
Alex Vandiver	f8fc3a16eb	puppet: Use "primary" / "replica" consistently in comments. The style guide for Zulip is to always use "primary" and "replica" when describing database replication. Adjust a few comments under `puppet/` that do not adhere to this. Unfortunately, some references still remain to the insensitive and inaccurate "master" / "slave" terminology. However, these are only in files which we are attempting to preserve as close to the upstream versions they are derived from (e.g. postgresql.conf, postfix/master.cf).	2020-06-15 16:18:07 -07:00
Alex Vandiver	5f433d6eeb	puppet: Remove vestigial check_postgres.pl. `65774e1c4f` switched from using the bundled check_postgres.pl to using the version from packages; the file itself remained, however. Remove it, and clean up references to it. Fixes #15389.	2020-06-15 16:18:07 -07:00
Alex Vandiver	7d4a370a57	puppet: Move monitoring of pg replication to the pg hosts. Instead of SSH'ing around to them, run directly on the database hosts. This means that the replicas do not know how many bytes behind they are in _receiving_ the WAL logs; thus, the monitoring also extends to the primary database, which knows that information for each replica. This also allows for detecting when there are too few active replicas.	2020-06-15 16:18:07 -07:00
Anders Kaseorg	5dc9b55c43	python: Manually convert more percent-formatting to f-strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	74c17bf94a	python: Convert more percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Now including %d, %i, %u, and multi-line strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	1ed2d9b4a0	logging: Use logging.exception and exc_info for unexpected exceptions. logging.exception() and logging.debug(exc_info=True), etc. automatically include a traceback. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Tim Abbott	80589099d8	puppet: Fix typo in logic for whether to install certbot. Fixes #15372.	2020-06-14 16:04:39 -07:00
rht	89af2f381d	puppet: Link postgres dict symlinks to hunspell files on CentOS. This is a temporary measure until we can find the directory of postgresql dicts on CentOS.	2020-06-13 17:53:38 -07:00
rht	36a5ca5015	puppet: Add cyrus-sasl to memcached_packages on RedHat. This is to mirror the sasl2-bin package on Debian.	2020-06-13 17:49:51 -07:00
rht	e776d2d159	puppet: Abstract out owner:group of memcached-sasldb2.	2020-06-13 17:49:51 -07:00
Anders Kaseorg	91a86c24f5	python: Replace None defaults with empty collections where appropriate. Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦ AbstractSet) to guard against accidental mutation of the default value. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-13 15:31:27 -07:00
Alex Vandiver	97b9308781	puppet: Merge multiple postgres roles in `zulip_ops`. All differences between the primary and replica roles having been merged, fold the `postgres_common`, `postgres_master`, and `postgres_slave` roles into just `postgres_appdb`.	2020-06-12 14:57:46 -07:00
Alex Vandiver	55bd31721d	puppet: Remove custom `vm.dirty_ratio` and `vm.dirty_background_ratio`. These values differed between the primary and secondary database hosts, for unclear reasons. The differences date back to their introduction in `387f63deaa`. As the comment in the replica configuration notes, settings of `vm.dirty_ratio = 10` and `vm.dirty_background_ratio = 5` matched the kernel defaults for "newer" kernels; however, kernel 2.6.30 bumped those to 20 and 10, respectively[1], as a fix for underlying logic now being more correct. Remove these overrides; they should at the very least be consistent across roles, and the previous values look to be an attempt to tune for a very much older version of the Linux kernel, which was using a different, buggier, algorithm under the hood. [1] `1b5e62b42b`	2020-06-12 14:57:46 -07:00
Alex Vandiver	f39816e768	puppet: Stop distributing recovery.conf file. This file controls streaming replication, and recovery using wal-g on the secondary. The `primary_conninfo` data needs to change on short notice when database failover happens, in a way that is not suitable for being controlled by puppet. PostgreSQL 12, in fact, removes the use of the `recovery.conf` file[1]; the `primary_conninfo` and `restore_command` information goes into the main `postgresql.conf` file, and the standby status is controlled by the presence or absence of an empty `standby.signal` file. Remove the puppet control of the `recovery.conf` file. [1] https://pgstef.github.io/2018/11/26/postgresql12_preview_recovery_conf_disappears.html	2020-06-12 14:57:46 -07:00
Alex Vandiver	316498a169	puppet: Remove unnecessary nagios authentication setup. Since the nagios authentication is stored _in the database_, it is unnecessary to run if the database is simply a replica of the production database. The only case in which this statement would have an effect is if the postgres node contains a _different_ (or empty) database, which `setup_disks` now effectively prevents. Remove the unnecessary step.	2020-06-11 21:01:49 -07:00
Alex Vandiver	0774f54c1b	puppet: Move `setup_disks` to postgres_common. The tooling should now be run no matter if the node is a primary or replica.	2020-06-11 21:01:49 -07:00
Alex Vandiver	6f6a0e890a	puppet: Run setup_disks based on symlink; remove mdadm dependency. `481613a344` updated the `setup_disks` script to no longer reference `mdadm`, since we no longer set up RAID on servers. Update the puppet that would call it to remove the `mdadm` dependency, and run only if the state is not what it produces -- namely, a symlink for `/var/lib/postgresql`, which must point to an existent `/srv/postgresql` directory.	2020-06-11 21:01:49 -07:00
Alex Vandiver	1dc2de5026	puppet: Update setup-disks to be idempotent. The end state it produces is _either_: - `/srv/postgresql` already existed, which was symlinked into `/var/lib/postgresql`; postgres is left untouched. This is the situation if `setup_disks` is run on the database primary, or a replica which was correctly configured. - An empty `/srv/postgresql` now exists, symlinked into `/var/lib/postgresql`, and postgres is stopped. This is the situation if `puppet` was just run on a new host, or a previously-configured host was rebooted (clearing the temporary disk in `/dev/nvme0`) In the latter case, where `/srv/postgresql` is now empty, any previous contents of `/var/lib/postgresql` are placed under `/root`, timestamped for uniqueness. In either case, the tool should now be idempotent.	2020-06-11 21:01:49 -07:00
Alex Vandiver	8373f5f4b9	puppet: Make parent directories of postgresql.conf. This fixes errors when provisioning a new system (or version of postgres) when the configuration file cannot be written because its parent directories do not exist. Files inherently depend on their containing directories, so no explicit dependencies are necessary.	2020-06-11 20:56:55 -07:00
Alex Vandiver	9fd7a026ad	puppet: Pull postgres data directory into postgres_appdb_base. The `pg_datadir` variable was only used, and accurate, for CentOS. Pull it out into `postgres_appdb_base`, broaden it to being accurate on Debian-based systems as well, and use it consistently in the templates.	2020-06-11 20:56:55 -07:00
Alex Vandiver	16c4cea951	puppet: Pull postgres config directory into postgres_appdb_base. As the previous commit, this is currently only used in tuning, but is a property of the whole postgres configuration; move it there, as just the directory, not the file. Use this directory consistently in the erb templates. Since we produce a `pg_hba.conf`, it makes sense that we point to the path that we know that we explicitly wrote to, for instance.	2020-06-11 20:56:55 -07:00
Alex Vandiver	2a7373b602	puppet: Pull postgres restart config into postgres_appdb_base. While it is only currently used in the tuning configuration, it is a property of the base configuration, and fits more clearly into the case block there.	2020-06-11 20:56:55 -07:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \\|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Alex Vandiver	b114eb2f10	puppet: Rename env-wal-e to env-wal-g. It runs wal-g now, not wal-e; make its name respect that.	2020-06-11 15:52:43 -07:00
Alex Vandiver	4fe0444108	puppet: Install wal-g, not wal-e.	2020-06-11 15:52:43 -07:00
Alex Vandiver	39d6185ce7	puppet: Remove python-dateutil requirement from pg_backup_and_purge. `1f565a9f41` removed the `package` lines which install `python-dateutil`, but not the line in `puppet_ops` that references it; as such, Puppet manifests in puppet_ops fail to compile. Remove the stale reference to `python-dateutil`, which is unnecessary since the code is python3, not python2.	2020-06-11 14:28:55 -07:00
Anders Kaseorg	ca4357fd64	python: Use standard NoReturn (Python ≥ 3.6). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 12:56:52 -07:00
Mateusz Mandera	fbc96d56d5	sharding: Fix permissions on the nginx_sharding.conf file. The zulip user needs to be able to read the file, when running the backup tool. We put root:root as owner on other nginx config files, so it's probably correct to keep the ownership as it is, and set the mode to 0644.	2020-06-11 12:56:06 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
arpit551	03d563ce0f	postgres: Changed max_connections in postgres 12 config template. The value of max_connections is now 1000, as in the templates for other postgres versions.	2020-06-08 21:59:57 -07:00
arpit551	9e8f1aacb3	certbot: Switch to use certbot from apt. certbot-auto doesn’t work on Ubuntu 20.04, and won’t be updated; we migrate to using the certbot package shipped with the OS instead. Also made sure that certbot gets installed when running zulip-puppet-apply, to handle existing systems.	2020-06-08 21:59:29 -07:00
arpit551	7e75a7e336	postgres: Fix syntax error in postgres 12 config. The <% used as an example in the postgres 12 config was being confused with erb syntax, so added an extra %, as <%% means a literal <%.	2020-06-08 21:57:54 -07:00
arpit551	7d11be5ca5	puppet: Add Zulip specific postgres configuration for 12. Based on the work done in `a03e478`.	2020-06-08 21:57:54 -07:00
arpit551	4e52f1bc53	puppet: Commit an upstream version of postgres 12 config. In preparation for adding production support for Ubuntu Focal.	2020-06-08 21:57:54 -07:00
Tim Abbott	71078adc50	docs: Update URLs to use https://zulip.com . We're migrating to using the cleaner zulip.com domain, which involves changing all of our links from ReadTheDocs and other places to point to the cleaner URL.	2020-06-08 18:10:45 -07:00
Anders Kaseorg	1f565a9f41	timezone: Use standard library datetime.timezone.utc consistently. datetime.timezone is available in Python ≥ 3.2. This also lets us remove a pytz dependency from the PostgreSQL scripts. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-05 09:34:17 -07:00
Alex Vandiver	8b1d49dbc7	puppet: Rename "wiki" realm to "monitoring". This is vestigial. It requires manually altering the `htdigest` file (not stored in this repo) to change the digest realm from `wiki` to `monitoring`, and will re-prompt users for their passwords if the browsers currently store them.	2020-05-30 12:26:21 -07:00
Alex Vandiver	b33aa8da7f	postgresql: Update setup-disks to use `service postgresql`. Using `service postgresql` makes it no longer linked to the specific version/cluster that is on the host.	2020-05-30 12:14:24 -07:00
Alex Vandiver	4e370cda75	postgresql: Update setup-disks to drop /mnt disabling. Hosts do not start out with a `/mnt`; there is no need to disable it.	2020-05-30 12:14:24 -07:00
Alex Vandiver	a7d85b7e69	postgresql: Update setup-disks to not move /tmp. Drop the change to move `/tmp` onto the local disk. Doing this move confuses `resolved` until there is a restart, and has no clear benefits. The change came in during `bf82fadc95`, but does not describe the reasoning; it is particularly puzzling, since postgresql stores its temporary files under `$PGDATA/base/pgsql_tmp`.	2020-05-30 12:14:24 -07:00
Alex Vandiver	481613a344	postgresql: Update setup-disks to not use RAID. Do not RAID the disks together. This was previously done when they were spinning media, for reliability; running them on an SSD obviates this sufficiently. This means that updating the initramfs is also not necessary.	2020-05-30 12:14:24 -07:00
Alex Vandiver	b537563bc1	postgresql: Set the current primary host.	2020-05-30 12:14:24 -07:00
Alex Vandiver	ad2918ea51	puppet: Remove `postgres_other` nagios hostgroup. This no longer has any rules specific to it. We leave the `postgres` munin group (which now only contains `postgres_appdb`) as future-proofing, and so that `postgres_appdb` matches to the puppet manifest of the same name.	2020-05-28 17:24:35 -07:00
Alex Vandiver	2c73fbdcb6	puppet: Remove munin monitoring for no-longer-used "postgres_other". The `wiki` and `trac` products are no longer used.	2020-05-28 17:24:35 -07:00
Tim Abbott	0b93e09e72	puppet: Add nginx configuration for blog.zulip.org move.	2020-05-26 14:47:05 -07:00
Anders Kaseorg	f5b33f9398	python: Further pyupgrade changes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:43:40 -07:00
Anders Kaseorg	333f7d16c9	logging: Pass more format arguments to logging. Commit `bdc365d0fe` (#14852) missed this because of https://github.com/returntocorp/semgrep/issues/831. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:42:23 -07:00
Anders Kaseorg	824d97987b	process_fts_updates: Use cursor.execute correctly. Commit `b501d04f6a` (#14841) missed this because of https://github.com/returntocorp/semgrep/issues/831. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:42:23 -07:00
arpit551	439f0d3004	install: Add production support for Zulip on Ubuntu Focal. Install script now runs on Focal. Python 2 is now installed via the `python2` package in Focal.	2020-05-25 16:58:42 -07:00
Tim Abbott	220620e7cf	sharding: Add basic sharding configuration for Tornado. This allows straightforward configuration of realm-based Tornado sharding through simply editing /etc/zulip/zulip.conf to configure shards and running scripts/refresh-sharding-and-restart. Co-Author-By: Mateusz Mandera <mateusz.mandera@zulip.com>	2020-05-20 13:47:20 -07:00
Tim Abbott	cdd3b7efbc	tornado: Configure upstreams for TORNADO_PROCESSES.	2020-05-20 13:43:48 -07:00
Tim Abbott	c3d3324295	puppet: Add link to the sources for Zephyr patches.	2020-05-19 20:54:11 -07:00
Tim Abbott	a35e71ebbc	puppet: Update package name for boto-on-python3. The python3-boto3 package is the maintained fork that supports Python 3; it was renamed in Ubuntu Bionic from the original Ubuntu Xenial name.	2020-05-19 20:25:11 -07:00
Tim Abbott	1c28770810	puppet: Fix apt_repo_debathena setup_file path. There was a typo introduced here when scripts_path was added.	2020-05-19 20:21:30 -07:00
Tim Abbott	c43b3d95e2	puppet: Switch env-wal-e to use wal-g rather than wal-e. wal-g is the modern reimplementation of wal-e that supports current postgres. It requires a bit of extra configuration to specify the AWS region.	2020-05-15 16:45:36 -07:00
Anders Kaseorg	fcca4a38b6	puppet: Work around memcached SASL configuration path bug. memcached 1.5.22 in Ubuntu 20.04 has a bug where it looks for its SASL configuration at /etc/sasl2/memcached.conf/memcached.conf instead of /etc/sasl2/memcached.conf. https://bugs.launchpad.net/ubuntu/+source/memcached/+bug/1878721 Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-14 23:25:24 -07:00
Tim Abbott	b3c5f2c13e	puppet: Remove check_postgres_replication_lag hostname hardcoding. Since this runs on the Nagios server, which already has the relevant hostnames defined in zulip.conf, we can just read it from there.	2020-05-11 23:42:36 -07:00
Tim Abbott	225bbf3633	puppet: Update check_postgres_replication_lag for postgres 10. These functions were renamed in postgres 10.	2020-05-11 15:59:23 -07:00
Tim Abbott	d8ea649869	puppet: Cast tornado_processes to Integer. This is the latest mechanism in puppet for turning a string into an integer. We update an adjacent comment while we're at it.	2020-05-11 00:54:48 -07:00
Tim Abbott	6319c181eb	puppet: Use actual name for the bind9-host package. Using the `host` virtual package confused Puppet into reporting it was doing work every time one did a puppet run, resulting in unnecessarily spammy output.	2020-05-11 00:51:53 -07:00
Mateusz Mandera	dd40649e04	queue_processors: Remove the slow_queries queue. While this functionality to post slow queries to a Zulip stream was very useful in the early days of Zulip, when there were only a few hundred accounts, it's long since been useless, given (1) the total request volume on larger Zulip servers run by Zulip developers, and (2) that other server operators don't want real-time notifications of slow backend queries. The right structure for this is just a log file. We get rid of the queue and replace it with a "zulip.slow_queries" logger, which will still log to /var/log/zulip/slow_queries.log for ease of access to this information and propagate to the other logging handlers. Reducing the number of queues is good for lowering Zulip's memory footprint and restart performance, since we run at least one dedicated queue worker process for each one in most configurations.	2020-05-11 00:45:13 -07:00
Tim Abbott	21a04e2dbc	puppet: Use nice to deprioritize various processes. Our priority hierarchy is: (1) Tornado and base services like memcached, redis, etc. (2) Django and message sender queue workers. (3) Everything else. Ideally, we'd have something a bit more fine-grained (e.g. some queue workers are potentially in the sending path, while others aren't), but this should have a big impact on ensuring Tornado gets the resources it needs during load spikes. I think this has a good chance of causing some load spikes that would previously have resulted in user-facing delivery delays to no longer have any significant user-facing impact.	2020-05-10 23:28:25 -07:00
shubhamgupta2956	9cd8644c7c	uploads: Add support for ".jpe" file extension. Currently when the user uploads files with ".jpe" file extension, the markdown is converted to link but the image is not embedded. This commit adds the support for ".jpe" file extension. Fixes #14863	2020-05-10 22:55:52 -07:00
Anders Kaseorg	8cdf2801f7	python: Convert more variable type annotations to Python 3.6 style. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-08 16:42:43 -07:00
Anders Kaseorg	708c6f4f11	puppet: Finally vanquish the cursed integer conversion conditional. We no longer support Puppet 3. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-08 16:42:43 -07:00
Tim Abbott	50d8d61d3c	puppet: Remove unnecessary/broken ;. This breaks the Xenial build, which we're removing soon, but it's unnecessary in any case.	2020-05-07 16:23:37 -07:00
Tim Abbott	03991d098a	puppet: Add optional postgres version override. This makes it convenient to run an alternative postgres version.	2020-05-07 09:33:24 -07:00
Mateusz Mandera	4643e48f60	retention: Add a daily cron job. This will run archive_messages management command at 6am every day, 1 hour after soft_deactivate_users (which runs at 5am).	2020-05-05 10:11:38 -07:00
Tim Abbott	4034f6f99e	nagios: Fix check_postgres_replication_lag. This expects to be run outside a virtualenv and thus without typing_extensions available.	2020-05-03 00:14:54 -07:00
Tim Abbott	4f3976b917	process_fts_updates: Clean up logging output. This saves a couple lines of spammy output in the run-dev.py startup experience, and will be better output in production as well.	2020-05-01 11:51:20 -07:00
Anders Kaseorg	c0ffa71fa9	nginx: Replace unanchored regexes in location directives. We could anchor the regexes, but there’s no need for the power (and responsibility) of regexes here. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 16:58:19 -07:00
Anders Kaseorg	5e01a0ae8b	zulip-ec2-configure-interfaces: Convert function type annotations. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 13:06:54 -07:00
Anders Kaseorg	f8339f019d	python: Convert assignment type annotations to Python 3.6 style. Commit split by tabbott; this has changes to scripts/, tools/, and puppet/. scripts/lib/hash_reqs.py, scripts/lib/setup_venv.py, scripts/lib/zulip_tools.py, and tools/lib/provision.py are excluded so tools/provision still gives the right error message on Ubuntu 16.04 with Python 3.5. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: -shebang_rules: List[Rule] = [ +shebang_rules: List["Rule"] = [ -trailing_whitespace_rule: Rule = { +trailing_whitespace_rule: "Rule" = { -whitespace_rules: List[Rule] = [ +whitespace_rules: List["Rule"] = [ -comma_whitespace_rule: List[Rule] = [ +comma_whitespace_rule: List["Rule"] = [ -prose_style_rules: List[Rule] = [ +prose_style_rules: List["Rule"] = [ -html_rules: List[Rule] = whitespace_rules + prose_style_rules + [ +html_rules: List["Rule"] = whitespace_rules + prose_style_rules + [ - target_port: int = None + target_port: int Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 13:06:54 -07:00
Anders Kaseorg	09ea778db1	nginx: Listen for ACME challenges on port 80 too. This should make Certbot renewals more reliable. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-23 16:22:04 -07:00
Aman Agrawal	2dc6d09c2a	python3-upgrade: Move python2 scripts to run on python3.	2020-04-22 16:13:15 -07:00
Anders Kaseorg	5901e7ba7e	python: Convert function type annotations to Python 3 style. Generated by com2ann (slightly patched to avoid also converting assignment type annotations, which require Python 3.6), followed by some manual whitespace adjustment, and six fixes for runtime issues: - def __init__(self, token: Token, parent: Optional[Node]) -> None: + def __init__(self, token: Token, parent: "Optional[Node]") -> None: -def main(options: argparse.Namespace) -> NoReturn: +def main(options: argparse.Namespace) -> "NoReturn": -def fetch_request(url: str, callback: Any, kwargs: Any) -> Generator[Callable[..., Any], Any, None]: +def fetch_request(url: str, callback: Any, kwargs: Any) -> "Generator[Callable[..., Any], Any, None]": -def assert_server_running(server: subprocess.Popen[bytes], log_file: Optional[str]) -> None: +def assert_server_running(server: "subprocess.Popen[bytes]", log_file: Optional[str]) -> None: -def server_is_up(server: subprocess.Popen[bytes], log_file: Optional[str]) -> bool: +def server_is_up(server: "subprocess.Popen[bytes]", log_file: Optional[str]) -> bool: - method_kwarg_pairs: List[FuncKwargPair], + method_kwarg_pairs: "List[FuncKwargPair]", Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-18 20:42:48 -07:00
pemontto	fd34bc5161	puppet: Allow /etc/zulip to be a symlink. This PR updates the puppet manifest to allow /etc/zulip to be a symlink. The current behaviour overwrites /etc/zulip if it is a link to another directory, which is problematic with docker-zulip and in particular the `LINK_SETTINGS_TO_DATA` setting.	2020-04-17 12:45:05 -07:00
Tim Abbott	777a3b6c18	puppet: Fix nagios check to not require typing_extensions.	2020-04-16 17:56:05 -07:00
Tim Abbott	e1ce53ac46	puppet: Update nagios checks for disk to exclude kernel filesystems. The fact that we have to explicitly list these is almost certainly a bug in check_disk, but at least this works.	2020-04-16 17:49:29 -07:00
Tim Abbott	cfbb617f5c	puppet: Update nagios configuration for checking local disk.	2020-04-16 17:48:36 -07:00
Tim Abbott	9821dfa9fc	puppet: The letsencrypt package in Debian is now certbot. It was an alias starting with Ubuntu Xenial, and will eventually be removed.	2020-04-16 17:30:01 -07:00
Tim Abbott	8e5a866122	puppet: Update tuning for load average monitoring.	2020-04-16 16:47:05 -07:00
Tim Abbott	b1ff823798	puppet: Remove old zulipbot configuration. We haven't used zulipbot hosted here for years.	2020-04-16 16:18:48 -07:00
Anders Kaseorg	99242138a7	static: Serve webpack bundles from the root domain. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-10 00:48:02 -07:00
Anders Kaseorg	c734bbd95d	python: Modernize legacy Python 2 syntax with pyupgrade. Generated by `pyupgrade --py3-plus --keep-percent-format` on all our Python code except `zthumbor` and `zulip-ec2-configure-interfaces`, followed by manual indentation fixes. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-09 16:43:22 -07:00
Vishnu KS	449f7e2d4b	team: Generate team page data using cron job. This eliminates the contributors data as a possible source of flakiness when installing Zulip from Git. Fixes #14351.	2020-04-08 12:52:31 -07:00
Anders Kaseorg	15d68c40dd	nginx: Set X-XSS-Protection: 1; mode=block. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-05 16:13:53 -07:00
Anders Kaseorg	79c215626e	nginx: Set X-Content-Type-Options: nosniff globally. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-05 16:13:53 -07:00
Anders Kaseorg	06e7d4ec19	nginx: Don’t override HSTS, X-Frame-Options with other ‘add_header’s. The nginx ‘add_header’ directive doesn’t inherit the way you’d want (https://trac.nginx.org/nginx/ticket/854), so we need to manually simulate inheritance using ‘include’, like we previously did with api_headers. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-05 16:13:53 -07:00
Mateusz Mandera	5252b081bd	queue_processors: Gather statistics on queue worker operations.	2020-04-01 16:44:06 -07:00
Stefan Weil	d2fa058cc1	text: Fix some typos (most of them found and fixed by codespell). Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-03-27 17:25:56 -07:00
Anders Kaseorg	7ff9b22500	docs: Convert many http URLs to https. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-03-26 21:35:32 -07:00
Anders Kaseorg	687553a661	setup_path_on_import: Replace with setup_path function. isort 5 knows not to reorder imports across function calls, so this will stop isort from breaking our code. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-25 15:40:21 -08:00
Anders Kaseorg	9d598d95a6	puppet: Fix puppet-lint warning. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-20 19:51:48 -08:00
Anders Kaseorg	91edb7dc43	puppet: Fix regeneration of memcached-sasldb2 on password changes. Puppet doesn’t re-run an exec block that’s declared as creating an existing file, even if it’s notified. Remove the creates declaration. Fixes #13730. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-19 14:20:43 -08:00
Mateusz Mandera	4c5a8e6f0c	queue: Remove missedmessage_email_senders.	2020-01-31 12:13:51 -08:00
Tim Abbott	dd969b5339	install: Remove references to "Zulip Voyager". "Zulip Voyager" was a name invented during the Hack Week to open source Zulip for what a single-system Zulip server might be called, as a Star Trek pun on the code it was based on, "Zulip Enterprise". At the time, we just needed a name quickly, but it was never a good name, just a placeholder. This removes that placeholder name from much of the codebase. A bit more work will be required to transition the `zulip::voyager` Puppet class, as that has some migration work involved.	2020-01-30 12:40:41 -08:00
Tim Abbott	d70e799466	bots: Remove FEEDBACK_BOT implementation. This legacy cross-realm bot hasn't been used in several years, as far as I know. If we wanted to re-introduce it, I'd want to implement it as an embedded bot using those common APIs, rather than the totally custom hacky code used for it that involves unnecessary queue workers and similar details. Fixes #13533.	2020-01-25 22:41:39 -08:00
Anders Kaseorg	3360df7ad1	generate_secrets: Enable memcached authentication in production. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-15 17:35:15 -08:00
Anders Kaseorg	cdda983e90	settings: Support optional memcached authentication. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-15 17:35:15 -08:00
Anders Kaseorg	ea6934c26d	dependencies: Remove WebSockets system for sending messages. Zulip has had a small use of WebSockets (specifically, for the code path of sending messages, via the webapp only) since ~2013. We originally added this use of WebSockets in the hope that the latency benefits of doing so would allow us to avoid implementing a markdown local echo; they were not. Further, HTTP/2 may have eliminated the latency difference we hoped to exploit by using WebSockets in any case. While we’d originally imagined using WebSockets for other endpoints, there was never a good justification for moving more components to the WebSockets system. This WebSockets code path had a lot of downsides/complexity, including: * The messy hack involving constructing an emulated request object to hook into doing Django requests. * The `message_senders` queue processor system (which increases RAM needs and must be provisioned independently from the rest of the server). * A duplicate check_send_receive_time Nagios test specific to WebSockets. * The requirement for users to have their firewalls/NATs allow WebSocket connections, and a setting to disable them for networks where WebSockets don’t work. * Dependencies on the SockJS family of libraries, which has at times been poorly maintained, and periodically throws random JavaScript exceptions in our production environments without a deep enough traceback to effectively investigate. * A total of about 1600 lines of our code related to the feature. * Increased load on the Tornado system, especially around a Zulip server restart, and especially for large installations like zulipchat.com, resulting in extra delay before messages can be sent again. 
As detailed in https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it appears that removing WebSockets moderately increases the time it takes for the `send_message` API query to return from the server, but does not significantly change the time between when a message is sent and when it is received by clients. We don’t understand the reason for that change (suggesting the possibility of a measurement error), and even if it is a real change, we consider that potential small latency regression to be acceptable. If we later want WebSockets, we’ll likely want to just use Django Channels. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-14 22:34:00 -08:00
Anders Kaseorg	6749810c2e	puppet: Fix zuli-redis.conf path typo. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-13 17:37:09 -08:00
Anders Kaseorg	79cae1e7e0	puppet: Delete legacy rediscleanup code. It was added in commit `9afb1c7a71` from before 1.4.0. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-13 17:37:09 -08:00
Anders Kaseorg	5526af32f3	puppet: Switch double quoted strings to single quoted. Resolves these warnings from puppet-lint. puppet-lint\| puppet/zulip/manifests/app_frontend_base.pp - WARNING: double quoted string containing no variables on line 14 puppet-lint\| puppet/zulip/manifests/app_frontend_base.pp - WARNING: double quoted string containing no variables on line 19 Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-07 15:10:17 -08:00
rht	70dfb423e4	puppet: Specify CentOS specific path to ca certificates for nginx.	2020-01-07 13:25:25 -08:00
rht	d5284b177e	puppet: Convert memorysize_mb to integer depending on Puppet version.	2020-01-07 13:25:25 -08:00
rht	dccfb0ebe9	puppet: Remove duplicate postgresql-client safepackage check on CentOS.	2020-01-07 13:25:25 -08:00
Anders Kaseorg	a78f8647d8	install: Run generate_secrets.py before zulip-puppet-apply. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-05 22:48:08 -08:00
Vishnu KS	8b57e39c7e	settings: Add option to set remote postgres port.	2019-12-12 12:17:11 -08:00
Anders Kaseorg	0d20145b93	mypy: Upgrade from 0.730 to 0.740. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-11-13 12:38:45 -08:00
Anders Kaseorg	0ae2c5c96e	nginx: Enable TLS 1.3 if supported. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-10-30 13:09:57 -07:00
Anders Kaseorg	ee9a6071fd	5xx.html: Build with webpack. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-10-28 15:53:15 -07:00
Tim Abbott	f84c037225	puppet: Tune check_postgres_locks parameters. This has been a spurious alert for a long time. It's unclear that this check is useful at all, but if it spikes dramatically above what's normal, there's perhaps still utility in being alerted.	2019-10-23 15:04:38 -07:00
Tim Abbott	e4dee9532c	nagios: Update configuration for user_activity worker change. Since LoopQueueProcessingWorker jobs cannot be monitored by checking for connected consumers (since they poll, rather than consuming as events arrive), they can't be monitored with check_consumers. It's OK, because that monitoring was redundant with monitoring for potential growth in their queue that we have as well. Also clean up the block comments for the two other similar queue processors.	2019-09-23 11:49:46 -07:00
Anders Kaseorg	b72bb8171b	nginx: Add CORS, HSTS, and X-Frame-Options headers to error responses. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-09-19 12:10:18 -07:00
Anders Kaseorg	6701c4463c	search: Remove now unnecessary tsearch_extra dependency. Now that we've implemented tsearch_extras in pure postgres, we no longer need a custom extension. This should help us considerably, as it means we no longer need to ship custom apt packages at all. Fixes #467. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-29 12:49:26 -07:00
Anders Kaseorg	b2e1af90fc	process_fts_updates: Reconnect on OperationalError. This allows process_fts_updates to recover if Postgres is restarted. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-21 11:00:58 -07:00
Anders Kaseorg	fb42cd3af9	process_fts_updates: Fix log message. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-21 11:00:58 -07:00
Anders Kaseorg	473c4abca5	process_fts_updates: Use psycopg2.connect kwargs. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-21 11:00:58 -07:00
Anders Kaseorg	fa11b2d806	nginx: Don’t gzip files that are already compressed. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-21 10:51:37 -07:00
Anders Kaseorg	4e620ed43c	nginx: Enable http2 in on-premise configuration. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-21 10:51:37 -07:00
Hemanth V. Alluri	dac068df31	production: Finish adding production support for Zulip on Debian Buster. This commit finishes adding end-to-end support for the install script on Debian Buster (making it production ready). Some support for this was already added in prior commits such as `99414e2d96`. We plan to revert the postgres hunks of this once we've built tsearch_extras for our packagecloud archive. Fixes #9828.	2019-08-17 12:22:32 -07:00
Hemanth V. Alluri	083723b6a9	puppet: Add Zulip specific postgres configuration for 11. Based on the work done in `a03e4784c7`.	2019-08-17 11:41:11 -07:00
Hemanth V. Alluri	792283c441	puppet: Commit an upstream version of postgres 11 config. In preparation for adding production support for Debian Buster. Based on the work done in commit `964a1ac8a7`.	2019-08-17 11:41:11 -07:00
Hemanth V. Alluri	5dd45b4b2e	puppet: Fix the release detection regex patterns in base.pp. The issue here was that the '.' character was unescaped and the regex was not anchored with a terminal '$'. This was detected by Anders Kaseorg. Co-authored-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-17 11:41:11 -07:00
Anders Kaseorg	66649d84cb	puppet: Reload postfix on /etc/postfix/virtual changes. `/etc/postfix/virtual` is of `regexp:` type, not `hash:` type, so running `postmap` on it has no effect; we need to reload Postfix when it changes. http://www.postfix.org/DATABASE_README.html#detect In the interest of forcing a reload now, optimize the regexes by eliding the unanchored `.*`s at the beginnings and ends. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-15 22:45:26 -07:00
rht	61be9fb4bd	puppet: Add Zulip-specific postgres configuration for 10 on CentOS.	2019-08-14 14:31:16 -07:00
rht	03fb4b5f90	puppet: Commit an upstream CentOS version of postgres 10 sample config.	2019-08-14 14:31:16 -07:00
Anders Kaseorg	263d71bf2b	nginx: Add CORS headers to /user_uploads. Fixes: #12980. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-12 15:35:35 -07:00
Anders Kaseorg	2e57f3ffae	puppet: “Resolve” puppet-lint warnings. Introduced by #12966. puppet/zulip/manifests/base.pp - WARNING: double quoted string containing no variables on line 93 puppet/zulip/manifests/base.pp - WARNING: string containing only a variable on line 93 scanf doesn’t accept a number as input, so uh, add a dummy space character. What. You can’t give me a bad language and then complain when I write bad programs in it. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-12 15:29:23 -07:00
Anders Kaseorg	820165e4da	Merge pull request #12968 from andersk/ffdhe2048 nginx: Use fixed ffdhe2048 DH parameter (RFC 7919)	2019-08-09 16:29:10 -07:00
Anders Kaseorg	4e9fb05c4f	puppet: Use built-in memorysize_mb fact. Fixes this warning: Warning: The string '8167976' was automatically coerced to the numerical value 8167976 (file: /root/zulip/puppet/zulip/manifests/base.pp, line: 93, column: 19) Fixes #9682. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-09 16:28:09 -07:00
Tim Abbott	de0a41bc9c	provision: Fix missing dependency on unzip. Because this is often installed by default, we hadn't noticed that our Slack importer doesn't run without it. Thanks to Ray Kraesig for the report.	2019-08-08 10:49:20 -07:00
Anders Kaseorg	0962393933	cleanup: Delete trailing newlines. Delete trailing newlines from all files, except tools/ci/success-http-headers.txt and tools/setup/dev-motd, where they are significant, and static/third, where we want to stay close to upstream. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-06 23:29:11 -07:00
Anders Kaseorg	becef760bf	cleanup: Delete leading newlines. Previous cleanups (mostly the removals of Python __future__ imports) were done in a way that introduced leading newlines. Delete leading newlines from all files, except static/assets/zulip-emoji/NOTICE, which is a verbatim copy of the Apache 2.0 license. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-06 23:29:11 -07:00
Anders Kaseorg	68dd8e4ec8	mypy: Migrate from mypy_extensions to typing_extensions. This gives us access to typing_extensions.Deque, which was not added to typing until 3.5.4. (PROVISION_VERSION is not bumped because the transitive dependency set in dev.txt hasn’t changed.) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-05 17:24:09 -07:00
Wyatt Hoodes	a109508e34	typing: Remove now-unnecessary conditional import. As a result of dropping support for trusty, we can remove our old pattern of putting `if False` before importing the typing module, which was essential for Python 3.4 support, but not required and maybe harmful on newer versions. cron_file_helper check_rabbitmq_consumers hash_reqs check_zephyr_mirror check_personal_zephyr_mirrors check_cron_file zulip_tools check_postgres_replication_lag api_test_helpers purge-old-deployments setup_venv node_cache clean_venv_cache clean_node_cache clean_emoji_cache pg_backup_and_purge restore-backup generate_secrets zulip-ec2-configure-interfaces diagnose check_user_zephyr_mirror_liveness	2019-07-29 15:18:22 -07:00
Anders Kaseorg	b758ed5ac1	nginx: Remove invalid extra headers for OPTIONS /api/v1/events. Since 204 responses don’t contain a payload body, Content-Type is neither required nor encouraged (RFC 7231 §3.1.1.5), and ours was missing a semicolon to boot; Content-Length is expressly forbidden (RFC 7230 §3.3.2). Furthermore, these add_header directives were silencing the CORS headers set in api_headers, because add_header inheritance doesn’t work the way you think it does, as was known before commit `5614d51afc`. Fixes: #12902. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-29 14:58:35 -07:00
Anders Kaseorg	6d5a20ac62	requirements: Remove django-pipeline. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-24 17:40:31 -07:00
Tim Abbott	2eb855b302	puppet: Include . separator in email mirror rules. This is required for the postfix-localmail integration to use the new `.` format email addresses.	2019-07-22 11:13:36 -07:00
Wyatt Hoodes	e331a758c3	python: Migrate open statements to use with. This is low priority, but it's nice to be consistently using the best practice pattern. Fixes: #12419.	2019-07-20 15:48:52 -07:00
Anders Kaseorg	c97ca677c9	nginx: Update TLS settings based on Mozilla recommendations 5.0. Disable TLS 1.0 and TLS 1.1. (We no longer need to support IE8 on Windows XP.) Prefer client-selected cipher order. (Now that all enabled ciphers provide good security, this allows mobile clients lacking AES hardware acceleration to pick ChaCha20 for better performance.) Disable session tickets. (Mozilla discourages them based on https://www.imperialviolet.org/2013/06/27/botchingpfs.html.) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-08 15:51:02 -07:00
Anders Kaseorg	079ddae4c8	minify-js: Remove; everything has been migrated to Webpack. min/sockjs-0.3.4.min.js is not used. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-07-03 13:58:21 -07:00
Tim Abbott	aea1279e8c	puppet: Remove trusty configuration for static_asset_compiler. Trusty is desupported.	2019-06-26 11:32:06 -07:00
Tim Abbott	8fbd965ab5	puppet: Remove legacy pgtune related configuration for trusty. Since we no longer support Ubuntu Trusty, we no longer need this backwards-compatibility cruft (which we only kept around to avoid randomizing configuration for existing systems).	2019-06-26 11:32:06 -07:00
Anders Kaseorg	33c941407b	puppet: Remove legacy unauthenticated local uploads backend. This was only used in Ubuntu 14.04 Trusty. Removing this also finally lets us simplify our security model discussion of uploaded files. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-06-26 11:31:46 -07:00
Tim Abbott	271319fb13	puppet: Fix hacky release test for whether we're in EC2. The result is still a bit hacky, but guaranteed to be correct if we adjust the OS version of our systems, which we of course will do over time.	2019-06-25 22:19:04 -07:00
Tim Abbott	8d8cfb314b	puppet: Remove zulip_ops configuration for trusty. There are no longer any zulip_ops systems using trusty.	2019-06-25 22:09:06 -07:00
Tim Abbott	88b77af54f	puppet: Add support for changing the nginx port directly. This provides a clean process for changing Zulip's nginx port.	2019-06-17 12:24:22 -07:00
Fabian Stanke	51ba9ddd89	postfix: Inserted compulsory setting for postfix ≥ 2.10. One of smtpd_relay_restrictions or smtpd_recipient_restrictions is required by postfix ≥ 2.10 (see http://www.postfix.org/SMTPD_ACCESS_README.html). This is important for using the email mirror on Ubuntu Bionic.	2019-06-16 18:48:39 -07:00
Tim Abbott	b41c2d93d1	puppet: Exclude squashfs filesystems from nagios disk checks. These generally aren't being written to.	2019-06-16 16:22:23 -07:00
Tim Abbott	0ec1b4e82c	puppet: Move check_send_receive_time to the _once ruleset. We don't actually want this bundle of message-sending Nagios checks to run on every single server.	2019-06-16 15:48:35 -07:00
Tim Abbott	df83979c76	zulip_ops: Extract a prod_app_frontend_once ruleset.	2019-06-16 15:48:35 -07:00
Tim Abbott	738cfe54c3	puppet: Move app_frontend_once out of prod configuration. That logic made it inconvenient to run multiple prod servers with the same top-level puppet configuration.	2019-06-16 15:24:20 -07:00
Tim Abbott	e85250941d	puppet: Fix quoting of commented-out python3-boto. This will avoid a linter error if/when we uncomment it.	2019-06-13 14:39:24 -07:00
Tim Abbott	337efe0fb7	puppet: Remove puppet-el, which no longer exists. This package was only ever available on Ubuntu Xenial.	2019-06-13 14:39:24 -07:00
Tim Abbott	afb0d1ccce	Revert "puppet: Use nice to deprioritize various processes." This reverts commit `d959de7a89`. This broke our Travis CI, so I'm pulling it off while we investigate.	2019-06-05 12:55:56 -07:00
Tim Abbott	d959de7a89	puppet: Use nice to deprioritize various processes. Our priority hierarchy is: (1) Tornado and base services like memcached, redis, etc. (2) Django and message sender queue workers. (3) Everything else. Ideally, we'd have something a bit more fine-grained (e.g. some queue workers are potentially in the sending path, while others aren't), but this should have a big impact on ensuring Tornado gets the resources it needs during load spikes. I think this has a good chance of causing some load spikes that would previously have resulted in user-facing delivery delays no longer having any significant user-facing impact.	2019-06-05 11:56:48 -07:00
Tim Abbott	cd1ec37404	puppet: Make uwsgi listen backlog limit configurable. This can be useful for busy servers to limit the risk of bursts of traffic causing them to reject requests.	2019-05-17 12:38:56 -07:00
Tim Abbott	ca48b4ec9f	puppet: Set postgres max_connections to 1000. There isn't much legitimate reason to have a limit as low as 100, given how few resources a connection consumes.	2019-05-13 17:19:31 -07:00
Tim Abbott	b7d50190b7	process_fts_updates: Batch updates when catching up. Previously, if process_fts_updates ended up very far behind (e.g. 100,000s of messages), it was unable to recover without doing some very expensive database operations to fetch and then delete the list of message IDs needing updates. This change fixes that issue by doing the catch-up work in batches.	2019-05-09 22:44:07 -07:00
Vishnu Ks	ecdd3bea43	billing: Add cron job to run invoice_plans once a day. Fixes #11960	2019-04-29 11:23:17 -07:00
Anders Kaseorg	643bd18b9f	lint: Fix code that evaded our lint checks for string % non-tuple. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-23 15:21:37 -07:00
Anders Kaseorg	5290519a62	scripts: Always use ON_ERROR_STOP=1 when running psql. Also use psql -e (--echo-queries) in scripts that use ‘set -x’, so errors can be traced to a specific query from the output. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-22 14:54:19 -07:00
Puneeth Chaganti	9876f1b14e	check_rabbitmq_queue: Fix the time period when we ignore long queues. The commit `87d1809657` changed the time when digests are sent by 3 hours to account for moving from the US East Coast to the West Coast, but didn't change the time period exception in the `check-rabbitmq-queue` script. Closes #5415	2019-04-13 20:43:07 -07:00
Anders Kaseorg	9f7c0b7e65	postgres_master.pp: Fix wacky su command line. The construction `su postgres -c -- bash -c 'psql …'` didn’t behave the way it reads, and only worked by accident: 1. `-c --` sets the command to `--`. 2. `bash` sets the first argument to `bash`. 3. `-c 'psql …'` replaces the command with `psql …`. Thus, `su` ended up executing `<shell> -c 'psql …' bash`, where `<shell>` is the `postgres` user’s login shell, usually also `bash`, which then executed 'psql …' and ignored the extra `bash`. Unconfuse this construction. Note from tabbott: The old code didn't even work by accident, it was just broken. The right fix is to move the quoting around properly. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-12 17:27:23 -07:00
Tim Abbott	b1da797955	puppet: Make uwsgi buffer size configurable.	2019-03-18 22:43:59 -07:00
Anders Kaseorg	fd6f18f7cf	nginx: Improve TLS settings based on Mozilla config generator. Lengthen the session timeout and enlarge the session cache. Upgrade Diffie-Hellman parameters from fixed 1024-bit to custom 2048-bit. Enable OCSP stapling. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-03-11 23:40:34 -07:00
Tim Abbott	e0c6136ce1	puppet: Fix nginx configuration logic for S3 backend. Apparently, our testing environment for this configuration was broken and didn't test the code we thought it did; as a result, a variable redefinition bug slipped through. Fixes #11786.	2019-03-06 13:17:11 -08:00
Tim Abbott	5614d51afc	nginx: Restructure how we manage uploaded file routes. The overall goal of this change is to fix an issue where on Ubuntu Trusty, we were accidentally overriding the configuration to serve uploads from disk with the regular expressions for adding access control headers. However, while investigating this, it became clear that we could considerably simplify the mental energy required to understand this system by making the uploads-route file be unconditionally available and included from `zulip-include/app` (which means the zulip_ops code can share behavior here). We also move the Access-Control-Allow-* headers to a separate include file, to avoid duplicating it in 5 places. Fixing this duplication discovered a potential bug in the settings used for Tornado, where DELETE was not allowed on a route that definitely expects DELETE. Fixes #11758.	2019-03-02 12:14:28 -08:00
Anders Kaseorg	649235cfec	python: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-22 16:54:36 -08:00
Tim Abbott	a0add8f651	puppet: Add IPv6 support to standard nginx listen directives. This should save some setup work for anyone wanting to set up nginx on their Zulip server.	2019-02-13 15:00:21 -08:00
Tim Abbott	ab18dbfde5	uwsgi: Increase buffer-size to 8192. For users putting Zulip behind certain proxies (and potentially some third-party API clients), buffer sizes can exceed the uwsgi default of 4096. Since we aren't doing such high-throughput APIs that a small buffer size is valuable, we should just raise this for everyone.	2019-02-13 11:17:55 -08:00
Anders Kaseorg	c109690cf8	puppet: Remove unused Python imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:02:12 -08:00
Tim Abbott	68552c31cb	Revert "puppet: Increase process listening count for uwsgi." This reverts commit `ccce83d0f0`. This needs sysctl changes as well.	2019-01-23 11:02:14 -08:00
Tim Abbott	ccce83d0f0	puppet: Increase process listening count for uwsgi. The default limit is too low for situations right around a server restart when there might be a large burst of connections.	2019-01-23 10:34:01 -08:00
Harshit Bansal	50ef91bb08	scripts: Add argparse option to `restart-zerver` for `--fill-cache`. Now, unless you specify `--fill-cache`, memcached caches will not be pre-filled after a server restart. This is helpful when someone is in a hurry (e.g. if the server is down right now, or if they are testing a configuration change on a newly set-up server); in those cases it's best to just restart without pre-filling the cache. Fixes: #10900.	2019-01-14 15:20:01 -08:00
Tom Daff	fbffbf8ef0	puppet/nginx: Update to recommended SSL ciphers. Update the list of ciphers that nginx will use to the current Mozilla recommended ones. These are Intermediate compatibility ones suitable for clients running anything newer than Firefox 1, Chrome 1, IE 7, Opera 5 and Safari 1. Modern compatibility is not suitable as it excludes Android 4 which is still seen on ~1% of traffic. More info: https://wiki.mozilla.org/Security/Server_Side_TLS	2019-01-08 14:19:49 -08:00
rht	3f0bae8c38	puppet: Disable camo when not on Debian.	2019-01-07 18:52:45 -08:00
rht	bf65f86a0b	puppet: Abstract out ssl certs and private keys dirs.	2019-01-07 18:52:45 -08:00
rht	d9ef3fd505	puppet: Manually create ssl-cert group on CentOS to access ssl private key.	2019-01-07 18:51:39 -08:00
rht	6c3bb507b0	puppet: Ensure nginx sites-available & sites-enabled dirs exist on CentOS. These are automatically created on Debian.	2019-01-07 17:09:42 -08:00
rht	f2b6a2c68a	puppet: Add CentOS version of the command to start supervisor.	2019-01-05 15:57:53 -08:00
rht	39f28a0d0f	puppet: Abstract out supervisor service name.	2019-01-05 15:57:53 -08:00
rht	d2069f7720	puppet: Include yum repository for CentOS voyager.	2019-01-05 15:57:45 -08:00
rht	1da17be52a	puppet: Ensure supervisord conf.d directory is created on CentOS.	2019-01-05 15:55:43 -08:00
rht	902bb7a80c	puppet: Add CentOS version of supervisor conf.d path.	2019-01-05 15:54:21 -08:00
rht	6b0bf828f7	puppet: Add CentOS version of supervisord.conf path.	2019-01-05 15:49:03 -08:00
rht	9ee2ee046a	puppet: Use systemctl instead of pg_ctlcluster on CentOS.	2019-01-05 15:49:03 -08:00
rht	2bcf83d940	puppet: Add CentOS packages to static_asset_compiler.pp.	2019-01-05 15:49:03 -08:00
rht	071e32985c	puppet: Generalize redis.conf path to CentOS.	2019-01-05 15:49:03 -08:00
rht	acaf001cdd	puppet: Group commonly reused variables into zulip::common.	2019-01-05 15:49:03 -08:00
rht	766ff38586	puppet: Abstract out nagios plugins directory.	2019-01-05 15:49:03 -08:00
rht	b22f6c6a99	puppet: Abstract out postgresql package.	2019-01-05 15:49:03 -08:00
rht	43fdb00fc7	puppet: Abstract out nginx package.	2019-01-05 15:49:03 -08:00
rht	5424fca168	puppet: Add CentOS packages to postgres_appdb_base.pp.	2019-01-05 15:49:03 -08:00
rht	21c71a0c52	puppet: Use generic erlang package variable for all dependencies.	2019-01-05 15:49:02 -08:00
Tim Abbott	047817b6b0	puppet: Disable log2zulip cron job. It hasn't been working for years, but more importantly, it spams up root's mail queue so that one can't find important things in there (e.g. the fact that the long-term-idle cron job was failing).	2019-01-05 10:56:44 -08:00
rht	801b04c057	puppet: Abstract out nagios-plugins package.	2019-01-04 15:27:03 -08:00
rht	04372e3300	puppet: Add CentOS packages to postgres_common.pp.	2019-01-04 15:24:42 -08:00
rht	bdf36bdc3d	puppet: Use pip to install python dependencies on CentOS.	2019-01-04 15:23:45 -08:00
rht	008879eb22	puppet: Add postgresql.conf path for CentOS.	2019-01-03 14:36:43 -08:00
rht	dce43e1a0e	puppet: Add CentOS-version of pg data path at pg_backup_and_purge.	2019-01-03 14:36:43 -08:00
rht	59993aea80	puppet: Abstract out path to postgresql.conf.	2019-01-03 14:36:43 -08:00
rht	c189409ffd	puppet: Initialize yum_repository.pp to wrap setup-yum-repo.	2019-01-03 14:36:43 -08:00
rht	1b02fb6d6d	puppet: Add CentOS packages to rabbit.pp.	2019-01-03 14:36:43 -08:00
rht	a3d67e52fe	puppet: Add CentOS packages to redis.pp.	2019-01-03 14:36:43 -08:00
rht	788128f05c	puppet: Add CentOS packages to nginx.pp.	2019-01-03 14:36:43 -08:00
rht	1965cc1491	puppet: Add CentOS packages to base.pp.	2019-01-03 14:36:42 -08:00
rht	0caaed4e1f	puppet: Add CentOS packages to apache_sso.pp.	2019-01-03 14:36:41 -08:00
rht	9fd18a2c7f	puppet: Detect CentOS for release_name.	2019-01-03 14:35:09 -08:00
Fabian Tribrunner	013b169469	check_email_deliverer_backlog: Switch down from root if required. This now checks if the user is zulip, and if not, switches to the zulip user, making it possible to run it as root. Significantly modified by tabbott to not break existing behavior.	2019-01-02 10:16:56 -08:00
Fabian Tribrunner	4612a84a67	check_email_deliverer_process: Fix typo in process name. This nagios check has never worked, since it had the wrong process name from the beginning.	2018-12-30 09:41:42 -08:00
Anders Kaseorg	392175d6e8	Use #!/usr/bin/env for bash shebangs. /bin/sh and /usr/bin/env are the only two binaries that NixOS provides at a fixed path (outside a buildFHSUserEnv sandbox). This discussion was split from #11004. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-12-17 17:21:08 -08:00
Tim Abbott	2558f101af	docs: Add documentation for `if False` mypy pattern in scripts. This should help make it clear what's going on with these scripts.	2018-12-17 11:12:53 -08:00
Tim Abbott	bce90a3340	lint: Add lint rule for scripts importing typing improperly. This is a common bug that users might be tempted to introduce. And also fix two instances of this bug that were present in our codebase, including an important one in our upgrade code path.	2018-12-17 10:46:37 -08:00
rht	43bedc0909	provision: Use vendored pg_hba.conf on CentOS.	2018-12-16 13:21:54 -08:00
rht	c9d54f7854	puppet: Remove vendored puppetlabs apt and stdlibs dependencies. This commit works by vendoring the couple functions we still use from puppetlabs stdlib (join and range), but removing the rest of the puppetlabs codebase, and of course cleaning up our linter rules in the process. Fixes #7423.	2018-12-11 13:03:26 -08:00
rht	d2aa81858c	puppet/zulip_ops: Replace apt::source with setup-apt-repo-debathena. Tweaked by tabbott to use a clearer name.	2018-12-11 13:02:56 -08:00
rht	97766102df	puppet/zulip: Replace apt::source and apt::ppa with setup-apt-repo.	2018-12-11 13:01:26 -08:00
Tim Abbott	b218c2a70e	loadbalancer: Use same certbot cert for zulipstaging.com. This is a simple configuration improvement.	2018-12-07 13:43:21 -08:00
Tim Abbott	467694c1fa	nginx: Enable http2 in external nginx configuration. This should be a nice performance improvement for browsers that support it. We can't yet enable this in the Zulip on-premise nginx configuration, because that still has to support Trusty.	2018-12-07 13:43:02 -08:00
Tim Abbott	e609e10229	puppet: Fix missing dependency of tsearch-extras on apt repository. This isn't super required, in that we add these repositories via `setup-apt-repo` in any case, but the previous code was wrong and worth fixing in any case.	2018-11-30 10:45:04 -08:00
Tim Abbott	ededdc512b	nginx: Fix missing API authentication configuration. This fixes a bug where our API routes for uploaded files (where we need to use a consistent URL between session auth and API auth) were not properly configured to pass through the API authentication headers (and otherwise provide REST endpoint settings). In particular, this prevented the Zulip mobile apps from being able to access authenticated image files using these URLs.	2018-11-16 11:25:54 -08:00
Tim Abbott	f62050212b	tornado: Fix supervisord configuration for multiple processes. Apparently, we can use the process group naming style of having dashes in the names without using the explicit numprocs feature of supervisord configuration. The new configuration is perfectly satisfactory, so there's no real reason to prefer the old approach.	2018-11-06 17:56:06 -08:00
Tim Abbott	5abf4dee92	nagios: Add new host groups for Tornado processes. We also move all the existing Tornado monitoring rules to the singletornado_frontends rule.	2018-11-06 16:33:18 -08:00
Tim Abbott	5f3b79c9e7	nagios: Fix tab-based whitespace.	2018-11-06 16:30:29 -08:00
Tim Abbott	5e7aa27c29	puppet: Add supervisord support for multiple tornado processes.	2018-11-02 16:55:33 -07:00
Tim Abbott	a5acbd51c3	settings: Add new zulip.conf setting for number of Tornado processes. This will eventually be used to support Tornado sharding; for now, it's just used to contain the code intended to support that feature.	2018-11-02 16:47:26 -07:00
Tim Abbott	dc7d44a245	puppet: Don't run calculate-first-visible-message-id on most systems. This should only be run on systems that are running zilencer, because the cron job is part of the zilencer project.	2018-10-30 11:40:24 -07:00
Tim Abbott	a4df001cef	check_queue_worker_errors: Add support for running unprivileged. Previously, this script needed access to Django settings, which in turn required access to /etc/zulip/zulip-secrets.conf. Since that isn't world-readable, this meant that this couldn't run as an unprivileged `nagios` user. Fix that by just hardcoding the appropriate path under /var/log/.	2018-10-18 15:03:17 -07:00
Tim Abbott	98d89b676d	pg_backup_and_purge: Fix incorrect conversion to use python3 types. When using the Python 3 typing style, Python scripts can't import from typing inside an `if False` (in contrast, one needs to import inside an `if False` to support the Python 3 syntax without needing python-typing installed). So this was just incorrectly half-converted from the Python 2 style to the Python 3 style.	2018-10-16 11:12:52 -07:00
Tim Abbott	2c7f9ce0fc	puppet: Fix puppet-lint warnings in various manifests. Apparently, `puppet-lint` on Ubuntu trusty throws warnings for certain quoting patterns that are OK in modern `puppet-lint`. I believe the old Zulip code was actually correct (i.e. the old `puppet-lint` implementation was the problem), but it seems worth changing anyway to suppress the warnings. We also exclude more of puppet-apt from linting, since it's third-party code.	2018-08-28 13:46:31 -07:00
Tim Abbott	b53a712856	nginx: Update configuration for using certbot certs everywhere.	2018-08-22 11:59:15 -07:00
Abhilash Verma	0e2322a322	logging: Show timestamp in UTC in non-django production scripts. Done in pair programming with @aero31aero. Fixes #9678.	2018-08-20 12:52:40 -07:00
Tim Abbott	5021f7b76f	puppet: Fix accidental conflict on apache2 package. Apparently, the work to force installation of the Python 3 version of mod_wsgi was buggy and tried to force uninstall apache2. Fixes #10318.	2018-08-16 14:15:35 -07:00
Tim Abbott	90828297e4	puppet-lint: Enforce double_quoted_strings check. This makes our puppet codebase more consistent by using single-quoted strings consistently.	2018-08-13 12:31:19 -07:00
Tim Abbott	d0b51b70f4	puppet-lint: Enforce 2sp_soft_tables puppet-lint check. This cleans up the puppet codebase's whitespace formatting to be more consistent.	2018-08-13 12:31:16 -07:00
Tim Abbott	b26e0a957d	puppet-lint: Enforce arrow_alignment check. This fixes all exceptions in our puppet codebase to this lint rule.	2018-08-13 12:30:57 -07:00
Tim Abbott	054c07b585	mypy: Fix run types in pg_backup_and_purge.	2018-08-09 12:57:53 -07:00
Tim Abbott	db1f706d09	pg_backup_and_purge: Fix buggy recovery status parsing. This was converted to Python 3 incorrectly, in a way that actually completely broke the script (the .decode() that this adds is critical, since 'f' != b'f'). We fix this, and also add an assert that makes the parsing code safer against future refactors.	2018-08-09 11:48:48 -07:00
Aditya Bansal	5bfe24beef	puppet-lint: Fix an error with defined type safepackage in base.pp. We fix "ERROR: safepackage not in autoload module layout" error which was caused by a defined type "safepackage" definition lying in the wrong place. We refactor to create the defined type according to puppet guidelines. Link below: https://docs.puppet.com/puppet/2.7/lang_defined_types.html	2018-08-07 10:03:40 -07:00
Aditya Bansal	710d4507de	puppet-lint: Fix lines longer than 140 characters lint warnings. We fix these by adding ignore statements in a bunch of files where this error popped up. We target only specific lines using the ignore statements and not the entire files.	2018-08-07 10:03:40 -07:00
Anders Kaseorg	edfd5ef992	setup_disks.sh: Fix shellcheck warnings. In puppet/zulip_ops/files/postgresql/setup_disks.sh line 15: array_name=$(mdadm --examine --scan | sed 's/.*name=//') ^-- SC2034: array_name appears unused. Verify use (or export if used externally). Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 09:15:26 -07:00
Anders Kaseorg	5a0fecc2d5	munin_plugins: Fix shellcheck warnings. In puppet/zulip_ops/files/munin-plugins/rabbitmq_connections line 66: echo "connections.value $(HOME=$HOME rabbitmqctl list_connections | grep -v "^Listing" | grep -v "done.$" | wc -l)" ^-- SC2126: Consider using grep -c instead of grep|wc -l. In puppet/zulip_ops/files/munin-plugins/rabbitmq_consumers line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_unacknowledged line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_uncommitted line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_queue_memory line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 09:15:08 -07:00
Anders Kaseorg	16ed5d5e79	env-wal-e: Fix shellcheck warnings. In puppet/zulip/files/postgresql/env-wal-e line 6: export AWS_ACCESS_KEY_ID=$(crudini --get "$ZULIP_SECRETS_CONF" secrets s3_backups_key) ^-- SC2155: Declare and assign separately to avoid masking return values. In puppet/zulip/files/postgresql/env-wal-e line 7: export AWS_SECRET_ACCESS_KEY=$(crudini --get "$ZULIP_SECRETS_CONF" secrets s3_backups_secret_key) ^-- SC2155: Declare and assign separately to avoid masking return values. In puppet/zulip/files/postgresql/env-wal-e line 9: if [ $? -ne 0 ]; then ^-- SC2181: Check exit code directly with e.g. 'if mycmd;', not indirectly with $?. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 09:13:07 -07:00
Anders Kaseorg	d5bf4eed9a	check_worker_memory: Fix shellcheck warnings. In puppet/zulip/files/nagios_plugins/zulip_app_frontend/check_worker_memory line 12: ps -o vsize,size,pid,user,command --sort -vsize $processes > "$datafile" ^-- SC2086: Double quote to prevent globbing and word splitting. In puppet/zulip/files/nagios_plugins/zulip_app_frontend/check_worker_memory line 14: top_worker=$(cat "$datafile" | head -n2 | tail -n1) ^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 08:42:08 -07:00
Anders Kaseorg	eb4855b77b	check_email_deliverer_process: Fix shellcheck warnings. In puppet/zulip/files/nagios_plugins/zulip_app_frontend/check_email_deliverer_process line 16: elif [ "$(echo "$STATUS" | egrep '(STOPPED)|(STARTING)|(BACKOFF)|(STOPPING)|(EXITED)|(FATAL)|(UNKNOWN)$')" ] ^-- SC2143: Use egrep -q instead of comparing output with [ -n .. ]. ^-- SC2196: egrep is non-standard and deprecated. Use grep -E instead. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 08:42:08 -07:00
Anders Kaseorg	e3253a7a1b	check_email_deliverer_backlog: Fix shellcheck warnings. In puppet/zulip/files/nagios_plugins/zulip_app_frontend/check_email_deliverer_backlog line 8: cd /home/zulip/deployments/current ^-- SC2164: Use 'cd ... || exit' or 'cd ... || return' in case cd fails. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 08:42:08 -07:00
Anders Kaseorg	510c97d861	scripts: Use shell quoting when displaying commands to be run. This way, commands with arguments containing whitespace or metacharacters are unambiguously readable. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-07-30 22:39:08 -07:00
Tim Abbott	7853254df7	puppet: Run thumbor by default on voyager systems. With this change, all that one needs to do to start using thumbor in production is to set the `THUMBOR_URL` setting. Since without THUMBOR_URL enabled, the thumbor service doesn't actually do anything, this is pretty safe.	2018-07-30 16:16:52 -07:00
Tim Abbott	02ae71f27f	api: Stop using API keys for Django->Tornado authentication. As part of our effort to change the data model away from each user having a single API key, we're eliminating the couple requests that were made from Django to Tornado (as part of a /register or home request) where we used the user's API key grabbed from the database for authentication. Instead, we use the (already existing) internal_notify_view authentication mechanism, which uses the SHARED_SECRET setting for security, for these requests, and just fetch the user object using get_user_profile_by_id directly. Tweaked by Yago to include the new /api/v1/events/internal endpoint in the exempt_patterns list in test_helpers, since it's an endpoint we call through Tornado. Also added a couple missing return type annotations.	2018-07-30 12:28:31 -07:00
Tim Abbott	07af59d4cc	tornado: Split get_events_backend into two functions. The lower-layer function, now called get_events_backend, is intended to be called by multiple code paths (including the upcoming get_events_internal).	2018-07-30 12:28:31 -07:00
Anders Kaseorg	dbe65231fc	puppet/zulip/files/nagios_plugins/zulip_app_frontend/check_send_receive_time: Avoid shelling out for mv. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-07-19 10:43:37 -07:00
Joshua Schmidlkofer	b1a57d144f	thumbor: Add production installer/puppet support. This commit adds the necessary puppet configuration and installer/upgrade code for installing and managing the thumbor service in production. This configuration is gated by the 'thumbor.pp' manifest being enabled (which is not yet the default), and so this commit should have no effect in a default Zulip production environment (or in the long term, in any Zulip production server that isn't using thumbor). Credit for this effort is shared by @TigorC (who initiated the work on this project), @joshland (who did a great deal of work on this and got it working during PyCon 2017) and @adnrs96, who completed the work.	2018-07-12 20:37:34 +05:30
Tim Abbott	afdfdf775c	nginx: Set X-Frame-Options header to DENY. While there are legitimate use cases for embedding Zulip in an iframe, they're rare, and it's more important to prevent this category of attack by default. Sysadmins can switch this to a whitelist when they want to use frames.	2018-05-30 09:24:17 -07:00
Sampriti Panda	250015a5d5	pgroonga: Fix issues with HTML escaping in queries.	2018-05-28 16:53:30 -07:00
Vishnu Ks	54a002c2e2	requirements: Upgrade pyflakes to 2.0.0. We fix a few errors that only the new version finds.	2018-05-24 11:31:36 -07:00
Tim Abbott	42da4522a9	puppet-apt: Fix buggy access to caller_module_name. New versions of Puppet on Ubuntu bionic don't like this.	2018-05-24 09:52:16 -07:00
Tim Abbott	b83ba85100	puppet: Switch memcached to using common total_memory_mb value. This just cuts a bit of unnecessary code duplication.	2018-05-24 09:49:43 -07:00
Tim Abbott	9b4b15cd0a	static_asset_compiler: Remove dependency on node packages. We no longer need or use these, since Zulip installs a pinned version of node directly with the scripts/setup/install-node tool. Noticed while adding Ubuntu bionic support, when the package names changed again.	2018-05-24 09:43:45 -07:00
Tim Abbott	c843276196	nginx: Fix accidental load-balancing between IPv4 and IPv6. Apparently, our nginx configuration's use of "localhost", combined with the default in modern Linux of having localhost resolve to both the IPv4 and IPv6 addresses on a given machine, resulted in `nginx` load-balancing requests to a given Zulip server between the IPv4 and IPv6 addresses. This, in turn, resulted in irrelevant 502 errors every few minutes on the /events endpoints for some clients. Disabling IPv6 on the server resolved the problem, as does simply spelling localhost as 127.0.0.1 for the `nginx` upstreams that we declare for proxying to non-Django services on localhost.	2018-05-22 11:56:59 -07:00
Tim Abbott	12dcabcdbd	docker: Remove need for static_asset_compiler. Now that installing from Git involves building a release tarball with a 2-stage build, we no longer need to do this.	2018-05-20 13:15:21 -07:00
Tim Abbott	61d6965634	puppet: Add option for controlling file upload nginx config. Now, one can just set `no_serve_uploads` in `zulip.conf` to prevent `nginx` from serving locally uploaded files. This should help simplify the S3 integration setup process.	2018-05-17 07:02:30 -07:00
Tim Abbott	6d74ba8271	puppet: Add zulip.conf option for HTTP only. This option is intended to support situations like a quick Docker setup where doing HTTPS adds more setup overhead than it's worth. It's not intended to be used in actual production environments.	2018-05-17 06:58:35 -07:00
Jason Michalski	3d8e424d84	puppet: Add cron package dependency. The Zulip puppet configuration installs various cron jobs and will fail if cron is not installed. This was found when installing Zulip in a minimal docker image.	2018-05-16 15:04:31 -07:00
Tim Abbott	f2efa122a6	puppet: Include static_asset_compiler in dockervoyager. This is required to build static assets from Git.	2018-05-15 18:27:01 -07:00
Tim Abbott	9498260516	puppet: Include process_fts_updates in dockervoyager manifest. This is preferred, since we don't currently have a way to run Django logic on the postgres hosts with the Docker implementation. This is a necessary part of removing the need for the docker-zulip package to patch this file to make Zulip work with Docker.	2018-05-15 15:37:12 -07:00
Tim Abbott	ee3cd95bd1	puppet: Remove python 2 psycopg2 package. We no longer need this, since we're a Python 3 project now.	2018-05-15 15:37:12 -07:00
Tim Abbott	bd5e2ddc74	puppet: Extract zulip::process_fts_updates. In theory, one might want to run this either on the postgres server or on an application server.	2018-05-15 15:37:12 -07:00

1646 Commits