zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	15483c09cb	puppet: Add missing trailing commas. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-13 15:36:06 -07:00
Alex Vandiver	3691a94efe	puppet: Configure munin and nagios under apache with puppet. This swaps in the actually-in-use munin configuration file; otherwise, it is an implementation of the configuration as it exists on the machine.	2020-07-13 13:23:11 -07:00
Alex Vandiver	4e42164b4a	munin: Add plugins to prod hosts.	2020-07-13 13:23:11 -07:00
Alex Vandiver	2a14212b27	munin: Add a helper resource definition for munin plugins.	2020-07-13 12:49:28 -07:00
Alex Vandiver	7c7b5fcd6f	munin: Deal with spaces in the channel names.	2020-07-13 12:49:28 -07:00
Alex Vandiver	eda2c4b8e2	puppet: Split munin-node from munin-server. No plugins are installed inside the /usr/local/munin/lib this creates in munin-node, nor are they symlinked into /etc/munin/plugins, so no non-default plugins are added by this.	2020-07-13 12:49:28 -07:00
Alex Vandiver	ddc7bb5a45	munin: Fix the path to check_send_receive_time.	2020-07-13 12:49:28 -07:00
Alex Vandiver	8be544e7eb	munin: Rename monitoring plugin to use zulip name, not humbug.	2020-07-13 12:49:28 -07:00
Alex Vandiver	8cff27f67d	puppet: Pull hosts from zulip.conf, not hardcoded list. The one complexity is that hosts_fullstack are treated differently, as they are not currently found in the manual `hosts` list, and as such do not get munin monitoring.	2020-07-10 00:14:09 -07:00
Alex Vandiver	24383a5082	puppet: Rename hosts_domain so hosts_prefix can be grepped for.	2020-07-10 00:14:09 -07:00
Alex Vandiver	a4e7c7a27e	nagios: Remove check_memcached. check_memcached does not support memcached authentication even in its latest release (it’s in a TODO item comment, and that’s it), and was never particularly useful.	2020-07-10 00:12:48 -07:00
Anders Kaseorg	9900298315	zthumbor: Remove Python 2 residue. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-07-06 18:44:58 -07:00
Alex Vandiver	a21a086f5c	puppet: nagios-plugins-basic is replaced by monitoring-plugins-basic. In Bionic, nagios-plugins-basic is a transitional package which depends on monitoring-plugins-basic. In Focal, it is a virtual package, which means that every time puppet runs, it tries to re-install the nagios-plugins-basic package. Switch all instances to referring to `$zulip::common::nagios_plugins`, and repoint that to monitoring-plugins-basic.	2020-06-29 14:58:01 -07:00
Alex Vandiver	876ee4a8ed	installer: Remove code specific to stretch or xenial. Support for Xenial and Stretch was removed (`5154ddafca`, `0f4b1076ad`, `8944e0ad53`, `79acd5ae40`, `1219a2e854`), but not all codepaths were updated to remove their conditionals on it. Remove all code predicated on Xenial or Stretch. debathena support was migrated to Bionic, since that appears to be the current state of existing debathena servers.	2020-06-24 12:57:38 -07:00
Alex Vandiver	7250d41bf7	puppet: Fix the path to install-wall-g	2020-06-17 15:23:18 -07:00
Tim Abbott	26396c5e25	puppet: Fix exceptions with multiple certbot declarations. Since `9e8f1aacb3`, zulip_ops machines might have two Package declarations for `certbot`, which doesn't work in puppet. The fix is, as usual, to use our `zulip::safepackage` wrapper instead.	2020-06-15 18:21:33 -07:00
Alex Vandiver	f8fc3a16eb	puppet: Use "primary" / "replica" consistently in comments. The style guide for Zulip is to always use "primary" and "replica" when describing database replication. Adjust a few comments under `puppet/` that do not adhere to this. Unfortunately, some references still remain to the insensitive and inaccurate "master" / "slave" terminology. However, these are only in files which we are attempting to preserve as close to the upstream versions they are derived from (e.g. postgresql.conf, postfix/master.cf).	2020-06-15 16:18:07 -07:00
Alex Vandiver	5f433d6eeb	puppet: Remove vestigial check_postgres.pl. `65774e1c4f` switched from using the bundled check_postgres.pl to using the version from packages; the file itself remained, however. Remove it, and clean up references to it. Fixes #15389.	2020-06-15 16:18:07 -07:00
Alex Vandiver	7d4a370a57	puppet: Move monitoring of pg replication to the pg hosts. Instead of SSH'ing around to them, run directly on the database hosts. This means that the replicas do not know how many bytes behind they are in _receiving_ the WAL logs; thus, the monitoring also extends to the primary database, which knows that information for each replica. This also allows for detecting when there are too few active replicas.	2020-06-15 16:18:07 -07:00
Anders Kaseorg	74c17bf94a	python: Convert more percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Now including %d, %i, %u, and multi-line strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
Anders Kaseorg	91a86c24f5	python: Replace None defaults with empty collections where appropriate. Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦ AbstractSet) to guard against accidental mutation of the default value. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-13 15:31:27 -07:00
Alex Vandiver	97b9308781	puppet: Merge multiple postgres roles in `zulip_ops`. All differences between the primary and replica roles having been merged, fold the `postgres_common`, `postgres_master`, and `postgres_slave` roles into just `postgres_appdb`.	2020-06-12 14:57:46 -07:00
Alex Vandiver	55bd31721d	puppet: Remove custom `vm.dirty_ratio` and `vm.dirty_background_ratio`. These values differed between the primary and secondary database hosts, for unclear reasons. The differences date back to their introduction in `387f63deaa`. As the comment in the replica configuration notes, settings of `vm.dirty_ratio = 10` and `vm.dirty_background_ratio = 5` matched the kernel defaults for "newer" kernels; however, kernel 2.6.30 bumped those to 20 and 10, respectively[1], as a fix for underlying logic now being more correct. Remove these overrides; they should at the very least be consistent across roles, and the previous values look to be an attempt to tune for a very much older version of the Linux kernel, which was using a different, buggier, algorithm under the hood. [1] `1b5e62b42b`	2020-06-12 14:57:46 -07:00
Alex Vandiver	f39816e768	puppet: Stop distributing recovery.conf file. This file controls streaming replication, and recovery using wal-g on the secondary. The `primary_conninfo` data needs to change on short notice when database failover happens, in a way that is not suitable for being controlled by puppet. PostgreSQL 12, in fact, removes the use of the `recovery.conf` file[1]; the `primary_conninfo` and `restore_command` information goes into the main `postgresql.conf` file, and the standby status is controlled by the presence or absence of an empty `standby.signal` file. Remove the puppet control of the `recovery.conf` file. [1] https://pgstef.github.io/2018/11/26/postgresql12_preview_recovery_conf_disappears.html	2020-06-12 14:57:46 -07:00
Alex Vandiver	316498a169	puppet: Remove unnecessary nagios authentication setup. Since the nagios authentication is stored _in the database_, it is unnecessary to run if the database is simply a replica of the production database. The only case in which this statement would have an effect is if the postgres node contains a _different_ (or empty) database, which `setup_disks` now effectively prevents. Remove the unnecessary step.	2020-06-11 21:01:49 -07:00
Alex Vandiver	0774f54c1b	puppet: Move `setup_disks` to postgres_common. The tooling should now be run no matter if the node is a primary or replica.	2020-06-11 21:01:49 -07:00
Alex Vandiver	6f6a0e890a	puppet: Run setup_disks based on symlink; remove mdadm dependency. `481613a344` updated the `setup_disks` script to no longer reference `mdadm`, since we no longer set up RAID on servers. Update the puppet that would call it to remove the `mdadm` dependency, and run only if the state is not what it produces -- namely, a symlink for `/var/lib/postgresql`, which must point to an existent `/srv/postgresql` directory.	2020-06-11 21:01:49 -07:00
Alex Vandiver	1dc2de5026	puppet: Update setup-disks to be idempotent. The end state it produces is _either_: - `/srv/postgresql` already existed, which was symlinked into `/var/lib/postgresql`; postgres is left untouched. This is the situation if `setup_disks` is run on the database primary, or a replica which was correctly configured. - An empty `/srv/postgresql` now exists, symlinked into `/var/lib/postgresql`, and postgres is stopped. This is the situation if `puppet` was just run on a new host, or a previously-configured host was rebooted (clearing the temporary disk in `/dev/nvme0`) In the latter case, where `/srv/postgresql` is now empty, any previous contents of `/var/lib/postgresql` are placed under `/root`, timestamped for uniqueness. In either case, the tool should now be idempotent.	2020-06-11 21:01:49 -07:00
Alex Vandiver	16c4cea951	puppet: Pull postgres config directory into postgres_appdb_base. As the previous commit, this is currently only used in tuning, but is a property of the whole postgres configuration; move it there, as just the directory, not the file. Use this directory consistently in the erb templates. Since we produce a `pg_hba.conf`, it makes sense that we point to the path that we know that we explicitly wrote to, for instance.	2020-06-11 20:56:55 -07:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Alex Vandiver	b114eb2f10	puppet: Rename env-wal-e to env-wal-g. It runs wal-g now, not wal-e; make its name respect that.	2020-06-11 15:52:43 -07:00
Alex Vandiver	4fe0444108	puppet: Install wal-g, not wal-e.	2020-06-11 15:52:43 -07:00
Alex Vandiver	39d6185ce7	puppet: Remove python-dateutil requirement from pg_backup_and_purge. `1f565a9f41` removed the `package` lines which install `python-dateutil`, but not the line in `puppet_ops` that references it; as such, Puppet manifests in puppet_ops fail to compile. Remove the stale reference to `python-dateutil`, which is unnecessary since the code is python3, not python2.	2020-06-11 14:28:55 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
Alex Vandiver	8b1d49dbc7	puppet: Rename "wiki" realm to "monitoring". This is vestigial. It requires manually altering the `htdigest` file (not stored in this repo) to change the digest realm from `wiki` to `monitoring`, and will re-prompt users for their passwords if the browsers currently store them.	2020-05-30 12:26:21 -07:00
Alex Vandiver	b33aa8da7f	postgresql: Update setup-disks to use `service postgresql`. Using `service postgresql` makes it no longer linked to the specific version/cluster that is on the host.	2020-05-30 12:14:24 -07:00
Alex Vandiver	4e370cda75	postgresql: Update setup-disks to drop /mnt disabling. Hosts do not start out with a `/mnt`; there is no need to disable it.	2020-05-30 12:14:24 -07:00
Alex Vandiver	a7d85b7e69	postgresql: Update setup-disks to not move /tmp. Drop the change to move `/tmp` onto the local disk. Doing this move confuses `resolved` until there is a restart, and has no clear benefits. The change came in during `bf82fadc95`, but does not describe the reasoning; it is particularly puzzling, since postgresql stores its temporary files under `$PGDATA/base/pgsql_tmp`.	2020-05-30 12:14:24 -07:00
Alex Vandiver	481613a344	postgresql: Update setup-disks to not use RAID. Do not RAID the disks together. This was previously done when they were spinning media, for reliability; running them on an SSD obviates this sufficiently. This means that updating the initramfs is also not necessary.	2020-05-30 12:14:24 -07:00
Alex Vandiver	b537563bc1	postgresql: Set the current primary host.	2020-05-30 12:14:24 -07:00
Alex Vandiver	ad2918ea51	puppet: Remove `postgres_other` nagios hostgroup. This no longer has any rules specific to it. We leave the `postgres` munin group (which now only contains `postgres_appdb`) as future-proofing, and so that `postgres_appdb` matches to the puppet manifest of the same name.	2020-05-28 17:24:35 -07:00
Alex Vandiver	2c73fbdcb6	puppet: Remove munin monitoring for no-longer-used "postgres_other". The `wiki` and `trac` products are no longer used.	2020-05-28 17:24:35 -07:00
Anders Kaseorg	f5b33f9398	python: Further pyupgrade changes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-05-26 11:43:40 -07:00
Tim Abbott	220620e7cf	sharding: Add basic sharding configuration for Tornado. This allows straight-forward configuration of realm-based Tornado sharding through simply editing /etc/zulip/zulip.conf to configure shards and running scripts/refresh-sharding-and-restart. Co-Author-By: Mateusz Mandera <mateusz.mandera@zulip.com>	2020-05-20 13:47:20 -07:00
Tim Abbott	c3d3324295	puppet: Add link to the sources for Zephyr patches.	2020-05-19 20:54:11 -07:00
Tim Abbott	a35e71ebbc	puppet: Update package name for boto-on-python3. The python3-boto3 package is the maintained fork that supports Python 3; it was renamed in Ubuntu Bionic from the original Ubuntu Xenial name.	2020-05-19 20:25:11 -07:00
Tim Abbott	1c28770810	puppet: Fix apt_repo_debathena setup_file path. There was a typo introduced here when scripts_path was added.	2020-05-19 20:21:30 -07:00
Tim Abbott	6319c181eb	puppet: Use actual name for the bind9-host package. Using the `host` virtual package confused Puppet into reporting it was doing work every time one did a puppet run, resulting in unnecessarily spammy output.	2020-05-11 00:51:53 -07:00
Mateusz Mandera	dd40649e04	queue_processors: Remove the slow_queries queue. While this functionality to post slow queries to a Zulip stream was very useful in the early days of Zulip, when there were only a few hundred accounts, it's long since been useless since (1) the total request volume on larger Zulip servers run by Zulip developers, and (2) other server operators don't want real-time notifications of slow backend queries. The right structure for this is just a log file. We get rid of the queue and replace it with a "zulip.slow_queries" logger, which will still log to /var/log/zulip/slow_queries.log for ease of access to this information and propagate to the other logging handlers. Reducing the number of queues is good for lowering Zulip's memory footprint and restart performance, since we run at least one dedicated queue worker process for each one in most configurations.	2020-05-11 00:45:13 -07:00
Anders Kaseorg	c0ffa71fa9	nginx: Replace unanchored regexes in location directives. We could anchor the regexes, but there’s no need for the power (and responsibility) of regexes here. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 16:58:19 -07:00
Anders Kaseorg	5e01a0ae8b	zulip-ec2-configure-interfaces: Convert function type annotations. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 13:06:54 -07:00
Anders Kaseorg	f8339f019d	python: Convert assignment type annotations to Python 3.6 style. Commit split by tabbott; this has changes to scripts/, tools/, and puppet/. scripts/lib/hash_reqs.py, scripts/lib/setup_venv.py, scripts/lib/zulip_tools.py, and tools/lib/provision.py are excluded so tools/provision still gives the right error message on Ubuntu 16.04 with Python 3.5. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: -shebang_rules: List[Rule] = [ +shebang_rules: List["Rule"] = [ -trailing_whitespace_rule: Rule = { +trailing_whitespace_rule: "Rule" = { -whitespace_rules: List[Rule] = [ +whitespace_rules: List["Rule"] = [ -comma_whitespace_rule: List[Rule] = [ +comma_whitespace_rule: List["Rule"] = [ -prose_style_rules: List[Rule] = [ +prose_style_rules: List["Rule"] = [ -html_rules: List[Rule] = whitespace_rules + prose_style_rules + [ +html_rules: List["Rule"] = whitespace_rules + prose_style_rules + [ - target_port: int = None + target_port: int Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-24 13:06:54 -07:00
Aman Agrawal	2dc6d09c2a	python3-upgrade: Move python2 scripts to run on python3.	2020-04-22 16:13:15 -07:00
Anders Kaseorg	5901e7ba7e	python: Convert function type annotations to Python 3 style. Generated by com2ann (slightly patched to avoid also converting assignment type annotations, which require Python 3.6), followed by some manual whitespace adjustment, and six fixes for runtime issues: - def __init__(self, token: Token, parent: Optional[Node]) -> None: + def __init__(self, token: Token, parent: "Optional[Node]") -> None: -def main(options: argparse.Namespace) -> NoReturn: +def main(options: argparse.Namespace) -> "NoReturn": -def fetch_request(url: str, callback: Any, kwargs: Any) -> Generator[Callable[..., Any], Any, None]: +def fetch_request(url: str, callback: Any, kwargs: Any) -> "Generator[Callable[..., Any], Any, None]": -def assert_server_running(server: subprocess.Popen[bytes], log_file: Optional[str]) -> None: +def assert_server_running(server: "subprocess.Popen[bytes]", log_file: Optional[str]) -> None: -def server_is_up(server: subprocess.Popen[bytes], log_file: Optional[str]) -> bool: +def server_is_up(server: "subprocess.Popen[bytes]", log_file: Optional[str]) -> bool: - method_kwarg_pairs: List[FuncKwargPair], + method_kwarg_pairs: "List[FuncKwargPair]", Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-18 20:42:48 -07:00
Tim Abbott	e1ce53ac46	puppet: Update nagios checks for disk to exclude kernel filesystems. The fact that we have to explicitly list these is almost certainly a bug in check_disk, but at least this works.	2020-04-16 17:49:29 -07:00
Tim Abbott	cfbb617f5c	puppet: Update nagios configuration for checking local disk.	2020-04-16 17:48:36 -07:00
Tim Abbott	9821dfa9fc	puppet: The letsencrypt package in Debian is now certbot. It was an alias starting with Ubuntu Xenial, and will eventually be removed.	2020-04-16 17:30:01 -07:00
Tim Abbott	8e5a866122	puppet: Update tuning for load average monitoring.	2020-04-16 16:47:05 -07:00
Anders Kaseorg	c734bbd95d	python: Modernize legacy Python 2 syntax with pyupgrade. Generated by `pyupgrade --py3-plus --keep-percent-format` on all our Python code except `zthumbor` and `zulip-ec2-configure-interfaces`, followed by manual indentation fixes. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-09 16:43:22 -07:00
Vishnu KS	449f7e2d4b	team: Generate team page data using cron job. This eliminates the contributors data as a possible source of flakiness when installing Zulip from Git. Fixes #14351.	2020-04-08 12:52:31 -07:00
Stefan Weil	d2fa058cc1	text: Fix some typos (most of them found and fixed by codespell). Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-03-27 17:25:56 -07:00
Anders Kaseorg	7ff9b22500	docs: Convert many http URLs to https. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-03-26 21:35:32 -07:00
Anders Kaseorg	687553a661	setup_path_on_import: Replace with setup_path function. isort 5 knows not to reorder imports across function calls, so this will stop isort from breaking our code. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-25 15:40:21 -08:00
Mateusz Mandera	4c5a8e6f0c	queue: Remove missedmessage_email_senders.	2020-01-31 12:13:51 -08:00
Tim Abbott	dd969b5339	install: Remove references to "Zulip Voyager". "Zulip Voyager" was a name invented during the Hack Week to open source Zulip for what a single-system Zulip server might be called, as a Star Trek pun on the code it was based on, "Zulip Enterprise". At the time, we just needed a name quickly, but it was never a good name, just a placeholder. This removes that placeholder name from much of the codebase. A bit more work will be required to transition the `zulip::voyager` Puppet class, as that has some migration work involved.	2020-01-30 12:40:41 -08:00
Tim Abbott	d70e799466	bots: Remove FEEDBACK_BOT implementation. This legacy cross-realm bot hasn't been used in several years, as far as I know. If we wanted to re-introduce it, I'd want to implement it as an embedded bot using those common APIs, rather than the totally custom hacky code used for it that involves unnecessary queue workers and similar details. Fixes #13533.	2020-01-25 22:41:39 -08:00
Anders Kaseorg	ea6934c26d	dependencies: Remove WebSockets system for sending messages. Zulip has had a small use of WebSockets (specifically, for the code path of sending messages, via the webapp only) since ~2013. We originally added this use of WebSockets in the hope that the latency benefits of doing so would allow us to avoid implementing a markdown local echo; they were not. Further, HTTP/2 may have eliminated the latency difference we hoped to exploit by using WebSockets in any case. While we’d originally imagined using WebSockets for other endpoints, there was never a good justification for moving more components to the WebSockets system. This WebSockets code path had a lot of downsides/complexity, including: * The messy hack involving constructing an emulated request object to hook into doing Django requests. * The `message_senders` queue processor system, which increases RAM needs and must be provisioned independently from the rest of the server. * A duplicate check_send_receive_time Nagios test specific to WebSockets. * The requirement for users to have their firewalls/NATs allow WebSocket connections, and a setting to disable them for networks where WebSockets don’t work. * Dependencies on the SockJS family of libraries, which has at times been poorly maintained, and periodically throws random JavaScript exceptions in our production environments without a deep enough traceback to effectively investigate. * A total of about 1600 lines of our code related to the feature. * Increased load on the Tornado system, especially around a Zulip server restart, and especially for large installations like zulipchat.com, resulting in extra delay before messages can be sent again. As detailed in https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it appears that removing WebSockets moderately increases the time it takes for the `send_message` API query to return from the server, but does not significantly change the time between when a message is sent and when it is received by clients. We don’t understand the reason for that change (suggesting the possibility of a measurement error), and even if it is a real change, we consider that potential small latency regression to be acceptable. If we later want WebSockets, we’ll likely want to just use Django Channels. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-14 22:34:00 -08:00
Tim Abbott	f84c037225	puppet: Tune check_postgres_locks parameters. This has been a spurious alert for a long time. It's unclear that this check is useful at all, but if it spikes dramatically above what's normal, there's perhaps still utility in being alerted.	2019-10-23 15:04:38 -07:00
Tim Abbott	e4dee9532c	nagios: Update configuration for user_activity worker change. Since LoopQueueProcessingWorker jobs cannot be monitored by checking for connected consumers (since they poll, rather than consuming as events arrive), they can't be monitored with check_consumers. It's OK, because that monitoring was redundant with monitoring for potential growth in their queue that we have as well. Also clean up the block comments for the two other similar queue processors.	2019-09-23 11:49:46 -07:00
Anders Kaseorg	0962393933	cleanup: Delete trailing newlines. Delete trailing newlines from all files, except tools/ci/success-http-headers.txt and tools/setup/dev-motd, where they are significant, and static/third, where we want to stay close to upstream. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-06 23:29:11 -07:00
Anders Kaseorg	becef760bf	cleanup: Delete leading newlines. Previous cleanups (mostly the removals of Python __future__ imports) were done in a way that introduced leading newlines. Delete leading newlines from all files, except static/assets/zulip-emoji/NOTICE, which is a verbatim copy of the Apache 2.0 license. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-08-06 23:29:11 -07:00
Wyatt Hoodes	a109508e34	typing: Remove now-unnecessary conditional import. As a result of dropping support for trusty, we can remove our old pattern of putting `if False` before importing the typing module, which was essential for Python 3.4 support, but not required and maybe harmful on newer versions. cron_file_helper check_rabbitmq_consumers hash_reqs check_zephyr_mirror check_personal_zephyr_mirrors check_cron_file zulip_tools check_postgres_replication_lag api_test_helpers purge-old-deployments setup_venv node_cache clean_venv_cache clean_node_cache clean_emoji_cache pg_backup_and_purge restore-backup generate_secrets zulip-ec2-configure-interfaces diagnose check_user_zephyr_mirror_liveness	2019-07-29 15:18:22 -07:00
Wyatt Hoodes	e331a758c3	python: Migrate open statements to use with. This is low priority, but it's nice to be consistently using the best practice pattern. Fixes: #12419.	2019-07-20 15:48:52 -07:00
Tim Abbott	271319fb13	puppet: Fix hacky release test for whether we're in EC2. The result is still a bit hacky, but guaranteed to be correct if we adjust the OS version of our systems, which we of course will do over time.	2019-06-25 22:19:04 -07:00
Tim Abbott	8d8cfb314b	puppet: Remove zulip_ops configuration for trusty. There are no longer any zulip_ops systems using trusty.	2019-06-25 22:09:06 -07:00
Tim Abbott	b41c2d93d1	puppet: Exclude squashfs filesystems from nagios disk checks. These generally aren't being written to.	2019-06-16 16:22:23 -07:00
Tim Abbott	0ec1b4e82c	puppet: Move check_send_receive_time to the _once ruleset. We don't actually want this bundle of message-sending Nagios checks to run on every single server.	2019-06-16 15:48:35 -07:00
Tim Abbott	df83979c76	zulip_ops: Extract a prod_app_frontend_once ruleset.	2019-06-16 15:48:35 -07:00
Tim Abbott	738cfe54c3	puppet: Move app_frontend_once out of prod configuration. That logic made it inconvenient to run multiple prod servers with the same top-level puppet configuration.	2019-06-16 15:24:20 -07:00
Tim Abbott	e85250941d	puppet: Fix quoting of commented-out python3-boto. This will avoid a linter error if/when we uncomment it.	2019-06-13 14:39:24 -07:00
Tim Abbott	337efe0fb7	puppet: Remove puppet-el, which no longer exists. This package was only ever available on Ubuntu Xenial.	2019-06-13 14:39:24 -07:00
Vishnu Ks	ecdd3bea43	billing: Add cron job to run invoice_plans once a day. Fixes #11960	2019-04-29 11:23:17 -07:00
Anders Kaseorg	643bd18b9f	lint: Fix code that evaded our lint checks for string % non-tuple. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-23 15:21:37 -07:00
Anders Kaseorg	9f7c0b7e65	postgres_master.pp: Fix wacky su command line. The construction `su postgres -c -- bash -c 'psql …'` didn’t behave the way it reads, and only worked by accident: 1. `-c --` sets the command to `--`. 2. `bash` sets the first argument to `bash`. 3. `-c 'psql …'` replaces the command with `psql …`. Thus, `su` ended up executing `<shell> -c 'psql …' bash`, where `<shell>` is the `postgres` user’s login shell, usually also `bash`, which then executed 'psql …' and ignored the extra `bash`. Unconfuse this construction. Note from tabbott: The old code didn't even work by accident, it was just broken. The right fix is to move the quoting around properly. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-04-12 17:27:23 -07:00
Anders Kaseorg	649235cfec	python: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-22 16:54:36 -08:00
Anders Kaseorg	c109690cf8	puppet: Remove unused Python imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:02:12 -08:00
rht	9ee2ee046a	puppet: Use systemctl instead of pg_ctlcluster on CentOS.	2019-01-05 15:49:03 -08:00
Tim Abbott	047817b6b0	puppet: Disable log2zulip cron job. It hasn't been working for years, but more importantly, it spams up root's mail queue so that one can't find important things in there (e.g. the fact that the long-term-idle cron job was failing).	2019-01-05 10:56:44 -08:00
Tim Abbott	2558f101af	docs: Add documentation for `if False` mypy pattern in scripts. This should help make it clear what's going on with these scripts.	2018-12-17 11:12:53 -08:00
rht	d2aa81858c	puppet/zulip_ops: Replace apt::source with setup-apt-repo-debathena. Tweaked by tabbott to use a clearer name.	2018-12-11 13:02:56 -08:00
Tim Abbott	b218c2a70e	loadbalancer: Use same certbot cert for zulipstaging.com. This is a simple configuration improvement.	2018-12-07 13:43:21 -08:00
Tim Abbott	467694c1fa	nginx: Enable http2 in external nginx configuration. This should be a nice performance improvement for browsers that support it. We can't yet enable this in the Zulip on-premise nginx configuration, because that still has to support Trusty.	2018-12-07 13:43:02 -08:00
Tim Abbott	5abf4dee92	nagios: Add new host groups for Tornado processes. We also move all the existing Tornado monitoring rules to the singletornado_frontends rule.	2018-11-06 16:33:18 -08:00
Tim Abbott	5f3b79c9e7	nagios: Fix tab-based whitespace.	2018-11-06 16:30:29 -08:00
Tim Abbott	dc7d44a245	puppet: Don't run calculate-first-visible-message-id on most systems. This should only be run on systems that are running zilencer, because the cron job is part of the zilencer project.	2018-10-30 11:40:24 -07:00
Tim Abbott	2c7f9ce0fc	puppet: Fix puppet-lint warnings in various manifests. Apparently, `puppet-lint` on Ubuntu trusty throws warnings for certain quoting patterns that are OK in modern `puppet-lint`. I believe the old Zulip code was actually correct (i.e. the old `puppet-lint` implementation was the problem), but it seems worth changing anyway to suppress the warnings. We also exclude more of puppet-apt from linting, since it's third-party code.	2018-08-28 13:46:31 -07:00
Tim Abbott	b53a712856	nginx: Update configuration for using certbot certs everywhere.	2018-08-22 11:59:15 -07:00
Tim Abbott	90828297e4	puppet-lint: Enforce double_quoted_strings check. This makes our puppet codebase more consistent by using single-quoted strings consistently.	2018-08-13 12:31:19 -07:00
Tim Abbott	d0b51b70f4	puppet-lint: Enforce 2sp_soft_tables puppet-lint check. This cleans up the puppet codebase's whitespace formatting to be more consistent.	2018-08-13 12:31:16 -07:00
Tim Abbott	b26e0a957d	puppet-lint: Enforce arrow_alignment check. This fixes all exceptions in our puppet codebase to this lint rule.	2018-08-13 12:30:57 -07:00
Aditya Bansal	710d4507de	puppet-lint: Fix lines longer than 140 characters lint warnings. We fix these by adding ignore statements in a bunch of files where this error popped up. We target only specific lines using the ignore statements and not the entire files.	2018-08-07 10:03:40 -07:00
Anders Kaseorg	edfd5ef992	setup_disks.sh: Fix shellcheck warnings. In puppet/zulip_ops/files/postgresql/setup_disks.sh line 15: array_name=$(mdadm --examine --scan | sed 's/.*name=//') ^-- SC2034: array_name appears unused. Verify use (or export if used externally). Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 09:15:26 -07:00
Anders Kaseorg	5a0fecc2d5	munin_plugins: Fix shellcheck warnings. In puppet/zulip_ops/files/munin-plugins/rabbitmq_connections line 66: echo "connections.value $(HOME=$HOME rabbitmqctl list_connections | grep -v "^Listing" | grep -v "done.$" | wc -l)" ^-- SC2126: Consider using grep -c instead of grep|wc -l. In puppet/zulip_ops/files/munin-plugins/rabbitmq_consumers line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_unacknowledged line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_uncommitted line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). In puppet/zulip_ops/files/munin-plugins/rabbitmq_queue_memory line 32: VHOST=${vhost:-"/"} ^-- SC2034: VHOST appears unused. Verify use (or export if used externally). Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-08-03 09:15:08 -07:00
Tim Abbott	02ae71f27f	api: Stop using API keys for Django->Tornado authentication. As part of our effort to change the data model away from each user having a single API key, we're eliminating the couple requests that were made from Django to Tornado (as part of a /register or home request) where we used the user's API key grabbed from the database for authentication. Instead, we use the (already existing) internal_notify_view authentication mechanism, which uses the SHARED_SECRET setting for security, for these requests, and just fetch the user object using get_user_profile_by_id directly. Tweaked by Yago to include the new /api/v1/events/internal endpoint in the exempt_patterns list in test_helpers, since it's an endpoint we call through Tornado. Also added a couple missing return type annotations.	2018-07-30 12:28:31 -07:00
Tim Abbott	07af59d4cc	tornado: Split get_events_backend into two functions. The lower-layer function, now called get_events_backend, is intended to be called by multiple code paths (including the upcoming get_events_internal).	2018-07-30 12:28:31 -07:00
Tim Abbott	63fe39e381	zulip_ops: Disable Ubuntu's built-in update-motd.d files. We can't really do this in the zulip manifests (since it's sorta a sysadmin policy decision), but these scripts can cause significant load when Nagios logs into a server (because many of them take 50ms or more of work to run). So we just get rid of them.	2018-05-06 18:47:40 -07:00
Tim Abbott	4e8487c886	nagios: Bump maximum processes limits. These seemed to be flapping for no good reason.	2018-05-02 11:12:47 -07:00
Tim Abbott	718492638b	puppet: Fix name for dhcpcd5 package. Apparently the name dhcpcd isn't installable.	2018-04-23 11:32:07 -07:00
Tim Abbott	35aa4f0377	puppet: Sort ensure attributes to be always first. This inconsistency was flagged by puppet-lint.	2018-04-22 23:41:49 -07:00
Tim Abbott	e103c2ff2d	puppet: Switch to modern quoted, octal file modes. This is one of the prerequisite tasks for Puppet 4 support. Constructed using puppet-lint.	2018-04-22 23:30:48 -07:00
Tim Abbott	62b12e0c34	zulip_ops: Add missing dependency on dhcpcd.	2018-04-19 14:27:48 -07:00
neiljp (Neil Pilgrim)	090b47ed19	mypy: Add explicit Optional for default=None parameters in various files.	2018-03-28 12:31:51 -07:00
neiljp (Neil Pilgrim)	f32f3cbf72	mypy: Amend zulip-ec2-configure-interfaces to avoid None.	2018-03-23 11:39:54 -07:00
Tim Abbott	d98be2f19f	puppet: Only run analytics Nagios checks on machine running cron. Running this on additional machines would be redundant; additionally, the FillState checker cron job runs only on cron systems, so this will crash on other app frontends.	2018-03-06 13:38:27 -08:00
Tim Abbott	8e8faab006	puppet: Move clearsessions cron job to app_frontend_once. While this is a different system than I'd written up in #8004, I think this is a better solution to the general problem of cron jobs to run on just one server. Fixes #8004.	2018-03-06 13:35:51 -08:00
Tim Abbott	3ae645ed12	puppet: Rename analytics.pp to app_frontend_once.pp.	2018-03-06 13:35:51 -08:00
Tim Abbott	24b6106c9c	puppet: Disable checking for evictions in memcached nagios. Zulip's caching model for message history is such that it is normal and healthy for there to eventually be a nontrivial volume of evictions.	2018-03-06 13:34:02 -08:00
Greg Price	4475950ddf	queue: Restore prematurely-cut upgrade path. Revert `c8f034e9a` "queue: Remove missedmessage_email_senders code." As the comment in the code says, it ensures a smooth upgrade path from 1.7.x; we can delete it in master after 1.8.0 is released. The removal commit was merged early due to a communication failure.	2018-02-28 11:15:53 -08:00
Umair Khan	c8f034e9a0	queue: Remove missedmessage_email_senders code. After `68513952fb`, all emails are sent through email_senders queue. This commit removes code related to the legacy queue.	2018-02-21 16:43:56 -08:00
Tim Abbott	005b0fb566	puppet: Clean up ssh authorized_keys configuration rules.	2018-02-09 16:37:03 -08:00
Tim Abbott	aca25b6f0a	puppet: Move ssh configuration to use notify. This handles more correctly the case where we're using the upstream sshd_config file.	2018-02-09 16:37:03 -08:00
Tim Abbott	486de8abfc	puppet: Edit some rules to support chat.zulip.org. This should make it possible to use the zulip_ops base rules successfully on chat.zulip.org. Many of the changes in this commit are hacks and probably can be cleaned up later, but given that we plan to drop trusty support soon, it's likely that most of them will simply be deleted then.	2018-02-09 16:37:03 -08:00
Rishi Gupta	1d581a9c6e	nagios: Add nagios check for analytics state. This should help us detect issues where the analytics cron jobs aren't running properly. The cron/nagios part of the implementation done by tabbott.	2018-02-09 16:36:05 -08:00
Tim Abbott	9ed2a94b8c	nagios: Add configuration designed for full-stack servers. This doesn't yet pass all Nagios checks correctly, and still has a few flaws: * The ideal setup code for the `nagios` user in the database isn't included. * Some of the other details are a bit off; we need to split some host roles. But it's better than nothing, and we can iterate from here.	2018-01-24 14:16:03 -08:00
Tim Abbott	2365b13b68	puppet: Move postgres Nagios plugin to main postgres-common. This plugins package is required in order to use Nagios checks to verify the Zulip postgres database, and thus belongs in the default package set.	2018-01-23 10:31:48 -08:00
Umair Khan	68513952fb	email-worker: Create EmailSendingWorker. This commit just copies all the code from MissedMessageSendingWorker class to a new EmailSendingWorker class. All the logic to send an email through a queue was already there. This commit only makes the logic generic. It does so by creating a special purpose queue called 'email_senders' to send any type of email. To make MissedMessageSendingWorker still work we derive it from EmailSendingWorker. All the tests that were testing MissedMessageSendingWorker now run against EmailSendingWorker.	2017-12-20 19:36:27 -08:00
Vishnu Ks	766511e519	actions: Mark all messages as read when user unsubscribes from stream. This fixes a bug where, when a user is unsubscribed from a stream, they might have unread messages on that stream leak. While it might seem to be a minor problem, it can cause significant problems for computing the `unread_msgs` data structures, since it means we need to add an extra filter for whether the user is still subscribed, either in the backend or in the UI. Fixes #7095.	2017-11-21 20:09:17 -08:00
Tim Abbott	94554c65da	certbot: Modify nginx configuration to support automated renewal.	2017-11-08 12:32:26 -08:00
Tim Abbott	62bb465896	puppet: Modify lb0 nginx configuration.	2017-11-08 12:32:26 -08:00
rht	549a26860f	refactor: Remove six.moves.range import.	2017-11-07 10:46:42 -08:00
Tim Abbott	0d1194811f	mypy: Remove ignores for a few typeshed bugs fixed upstream.	2017-10-27 17:09:00 -07:00
Tim Abbott	540cae19a8	puppet: Remove obsolete sparkle configuration. Sparkle was the auto-update system used by the legacy desktop app. We haven't been capable of using it for auto-update in years, so there's no reason to keep around the configuration. The new Electron app uses a different system anyway.	2017-10-19 16:35:55 -07:00
rht	b57289aacd	py3: Remove all `from __future__ import print_function`. Except for these files: - tools/linter_lib/* - tools/lib - tools/lister.py	2017-10-18 12:07:19 -07:00
rht	2f3ae84e5a	py3: Remove all `__future__ import division`.	2017-10-17 23:09:12 -07:00
Tim Abbott	6a5cb0e48c	puppet: Make problems with Zephyr mirroring pageable. Generally this indicates sending messages is completely broken.	2017-10-12 00:16:32 -07:00
rht	de30400fc5	pg_backup_and_purge.py: Remove .py extension.	2017-10-08 15:32:43 -07:00
Tim Abbott	47c5aae5b2	log2zulip: Enforce using python 3 in cron job. We aren't guaranteed to have the Zulip dependencies installed on Python 2.	2017-10-06 16:37:17 -07:00
Tim Abbott	f2055397c1	nagios: Update apache configuration to be generated. Since this is basically just stock Apache configuration for Nagios with a hostname put in, we can just fetch the hostname from our configuration.	2017-10-05 21:51:29 -07:00
Tim Abbott	3af01bed85	puppet: Simplify zulip_ops nginx configuration. Whatever dist/ functionality this had in 2014 is now served by zulip.org, and since this serves as a sample, it should be as simple as possible. Previously, this was more cluttered than it needed to be.	2017-10-05 21:17:57 -07:00
Tim Abbott	e6e7bcf6e1	nagios: Move camo_check_url into configuration.	2017-10-05 21:09:24 -07:00
Tim Abbott	1c453fdf2a	puppet: Add redis_password file for Nagios. This allows the Nagios user to access redis without having full access to the redis system. Ideally, this would eventually use a password that only has statistics read access, but I'm not sure redis supports that.	2017-10-05 20:42:07 -07:00
Tim Abbott	13a36d9af3	puppet: Make old redis_tunnel configuration usable. This old puppet configuration was never really used, and regardless hardcoded an ancient zulip.net hostname. We fix this to use the zulipconf system to get the host domain (though not, at present, the hostname).	2017-10-05 20:40:22 -07:00
Tim Abbott	96c3014da0	nagios: Automate configuration of outgoing email with msmtp. Now we no longer need to check in a bunch of hostnames in order to configure Nagios.	2017-10-05 20:29:47 -07:00
Tim Abbott	5b4c260c3f	puppet: Add munin apache auth configuration. This is completely stock configuration, and seems to be required for munin to run properly.	2017-10-05 20:17:12 -07:00
Tim Abbott	ba7be4102e	puppet: Update munin tunnels configuration to use zulipconf. This eliminates another old hardcoding of zulip.net.	2017-10-05 20:14:43 -07:00
Tim Abbott	162eaf8917	nagios: Modify check for swap to allow no swap. If a machine is configured with no swap intentionally, that shouldn't be a Nagios problem. This alert is intended to flag machines which are swapping.	2017-10-05 20:07:44 -07:00
Tim Abbott	80a16bf873	nagios: Fix path to source zulip_nagios.cfg. Arguably, we should make this a symlink, but it's probably a good idea to have every change in the production Nagios configuration go through the zulip-puppet-apply diff experience.	2017-10-05 20:06:50 -07:00
Tim Abbott	886a8853ac	nagios: Move server-specific config into hostgroups. These new hostgroups exist so we can eliminate explicit references to individual hosts in services.cfg.	2017-10-05 20:06:48 -07:00
Tim Abbott	b6ce9583a9	nagios: Fetch list of hosts from zulip.conf. This makes this much more configurable and much less hardcoded.	2017-10-05 20:06:30 -07:00


360 Commits