zulip


Author	SHA1	Message	Date
Anders Kaseorg	7021852627	install-node: Silence expected “node: command not found” on first run. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-03 12:11:08 -07:00
Anders Kaseorg	9d2d6c8eb7	ruff: Fix M001 Unused `noqa` directive. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-03 12:10:15 -07:00
Alex Vandiver	6662a3bac4	teleport: Switch to the new apt host for Teleport. The apt.releases.teleport.dev repository is deprecated as of the release of Teleport 11, and has been replaced with deb.releases.teleport.dev[1]. [1]: https://goteleport.com/docs/changelog/#deprecated-old-debrpm-repositories	2022-10-28 16:52:54 -07:00
Alex Vandiver	f5f6a3789b	restart-server: Default to running config and database checks. If there is a syntax error in `settings.py`, `restart-server` should provide a reasonable message about this. It did so prior to `af08bcdb3f`, because any invocation of `./manage.py` without `--skip-checks` will verify `settings.py`, among several other checks. After `af08bcdb3f`, there are no `./manage.py` calls in most restarts, which `fa77be6e6c` took further. Add an explicit `./manage.py check` in the default case. upgrade-zulip-stage-2 overrides this by passing `--skip-checks`, for performance. This also means that `upgrade-zulip-from-git` itself picks up the same `--skip-checks` flag, since it inherits the same flag parsing, though that is perhaps of dubious utility.	2022-10-14 13:10:46 -07:00
Anders Kaseorg	afccebc1ee	install-node: Upgrade Node.js from 16.17.0 to 18.10.0. Although Node.js 18 is not the active LTS release for another 3 weeks, the Node.js 16 end-of-life date was moved forward to September 2023, (https://nodejs.org/en/blog/announcements/nodejs16-eol/), so it seems prudent to switch now. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-10-11 10:50:57 -07:00
Anders Kaseorg	11a86ec328	install: Remove PostgreSQL 10 support. PostgreSQL 10 reaches its upstream end of life in November, and is not supported by Django 4.1. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-10-06 15:59:07 -07:00
Anders Kaseorg	b267b17677	python: Use ‘not in’ for more negated membership tests. Fixes “E713 Test for membership should be `not in`” found by ruff (now that I’ve fixed it not to ignore scripts lacking a .py extension). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-26 12:09:46 -07:00
Anders Kaseorg	83bd709562	Revert "zulip-puppet-apply: Work around broken Puppet on Ubuntu 22.04." This reverts commit `25c87cc7da` (#21328). This upstream Ubuntu bug was fixed. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-22 15:18:15 -07:00
Anders Kaseorg	403837e52d	python: Use ‘not in’ for negated membership tests. Fixes “E713 Test for membership should be `not in`” found by ruff (https://github.com/charliermarsh/ruff). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-17 11:48:33 -07:00
Anders Kaseorg	987ab741f9	sharding: Support Tornado sharding by regexes. One should now be able to configure a regex by appending _regex to the port number: [tornado_sharding] 9802_regex = ^[l-p].*\.zulipchat\.com$ Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-15 16:07:50 -07:00
Anders Kaseorg	7666ff603d	sharding: Configure Tornado sharding with nginx map. https://nginx.org/en/docs/http/ngx_http_map_module.html Since Puppet doesn’t manage the contents of nginx_sharding.conf after its initial creation, it needs to be renamed so we can give it different default contents. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-15 16:07:50 -07:00
Anders Kaseorg	ea6f18bb46	refresh-sharding-and-restart: Quote to prevent shell glob expansion. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-14 09:35:12 -07:00
Anders Kaseorg	5e4cec56cb	install-node: Upgrade Node.js from 16.16.0 to 16.17.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-09-06 15:02:29 -07:00
Anders Kaseorg	5d77d50423	scripts: Help mypy resolve the psycopg2.connect overload. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-08-30 17:36:21 -07:00
Zixuan James Li	059d0e7be8	settings: Make SHARED_SECRET mandatory. This implements get_mandatory_secret that ensures SHARED_SECRET is set when we hit zerver.decorator.authenticate_notify. To avoid getting ZulipSettingsError when setting up the secrets, we set an environment variable DISABLE_MANDATORY_SECRET_CHECK to skip the check and default its value to an empty string. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2022-08-25 12:13:03 -07:00
Anders Kaseorg	7da1586cbf	install-node: Upgrade Node.js from 16.15.1 to 16.16.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-08-04 13:51:51 -07:00
Anders Kaseorg	443b974b3e	python: Apply changes from pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-07-20 22:33:28 -07:00
Alex Vandiver	526a04b4e6	restore-backup: Provide flags to leave settings.py and zulip.conf as-is.	2022-07-20 12:35:51 -07:00
Alex Vandiver	d8ae270899	restore-backup: Only extract /etc/zulip once. This is already handled in the earlier block; there is no need to extract it twice.	2022-07-19 17:56:40 -07:00
Alex Vandiver	1b57669771	restore-backup: Switch to run() to check exit codes.	2022-07-19 17:56:40 -07:00
Alex Vandiver	c71c6187ea	restore-backup: Ensure it is run as root.	2022-07-19 17:56:40 -07:00
Anders Kaseorg	81892df176	requirements: Upgrade to Django 4.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-07-13 16:07:17 -07:00
Anders Kaseorg	463fe515b8	install-yarn: Upgrade Yarn from 1.22.18 to 1.22.19. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-07-06 17:23:16 -07:00
Anders Kaseorg	d104407531	log-search: Fix re.Match type annotations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-07-05 12:55:03 -07:00
Alex Vandiver	2b02722d16	log-search: Add a filter to exclude all lines not explicitly wanted.	2022-06-28 15:59:31 -07:00
Alex Vandiver	180565d8d6	log-search: Fix copy/paste-o in filtering for presence.	2022-06-28 15:59:31 -07:00
Anders Kaseorg	3bf8ee2156	python: Unquote some unnecessarily quoted type annotations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-06-26 17:37:41 -07:00
Alex Vandiver	41deef40cf	nagios: Switch to generic check_cron_file for queues and consumers. These share a common root; `91da4bd59b` duplicated the code, but didn't move the existing uses to the new utility.	2022-06-22 12:07:38 -07:00
Alex Vandiver	e01a4242aa	nagios: Sort queue consumer checks.	2022-06-22 12:07:38 -07:00
Alex Vandiver	27b63d0baf	check-rabbitmq-consumers: Fix a misleading comment.	2022-06-22 12:07:38 -07:00
Alex Vandiver	4e06ee45c7	check-rabbitmq-consumers: Remove unused --min-threshold. This has never actually been used -- and does not make sense with the check-all-queues-at-once model switched to in `88a123d5e0`. The Tornado processes are the only ones we expect to be non-1, and since they were added in `3f03dcdf5e` the right number has been read from config, not passed as an argument.	2022-06-22 12:07:38 -07:00
Alex Vandiver	53c01aa299	check-rabbitmq-consumers: Remove --queue argument from help. This has not been accepted since `88a123d5e0`.	2022-06-22 12:07:38 -07:00
Alex Vandiver	a35af3f38b	install/upgrade: Allow new packages during `apt-get upgrade`. `postgresql-14.4` is a notable upgrade in the PostgreSQL series, as it fixes potential database corruption from `CREATE INDEX CONCURRENTLY` statements which are run while rows are modified[1]. However, it also requires an upgrade from `libllvm9` to `libllvm10`, which means it is not installed by a mere `apt-get upgrade`. Add the `--with-new-pkgs` flag to all of the potentially relevant `apt-get upgrade` calls, so that this (and similar) packages are upgraded successfully. [1]: https://www.postgresql.org/docs/release/14.4/	2022-06-21 11:21:49 -07:00
Alex Vandiver	5bdc4b3562	upgrade-zulip-from-git: init, then add remote. `30457ecd02` removed the `--mirror` from initial clones, but did not add back `--bare`, which `--mirror` implies. This leads to `/srv/zulip.git` having a working tree in it, with a `/srv/zulip.git/.git` directory. This is mostly harmless, and since the bug was recent, not worth introducing additional complexity into the upgrade process to handle. Calling `git clone --bare`, however, would clone the refs into `refs/heads/`, not the `refs/remotes/origin/` we want. Instead, use `git init --bare`, followed by `git remote add origin`. The remote will be fetched by the usual `git fetch --all --prune` which is below.	2022-06-09 11:18:42 -07:00
Alex Vandiver	1639792e9e	upgrade-zulip-from-git: Check fetch refspecs, not mirror flag. While the `remote.origin.mirror` boolean being set is a very good proxy for having been cloned with `--mirror`, it is technically only used when pushing into the remote[1]. What we care about is if fetches from this remote will overwrite `refs/heads/`, or all of `refs/` -- the latter of which is most likely, from having run `git clone --bare`. Detect either of these fetch refspecs, and not the mirror flag. We let the upgrade process error out if `remote.origin.fetch` is unset, as that represents an unexpected state. We ignore failures to unset the `remote.origin.mirror` flag, in case it is not set already. [1]: https://git-scm.com/docs/git-config#Documentation/git-config.txt-remoteltnamegtmirror	2022-06-09 11:18:42 -07:00
Anders Kaseorg	61c9740bbd	install-yarn: Upgrade Yarn from 1.22.17 to 1.22.18. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-06-02 12:03:49 -07:00
Anders Kaseorg	2007c75061	install-node: Upgrade Node.js from 16.14.1 to 16.15.1. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-06-02 12:03:49 -07:00
Alex Vandiver	30457ecd02	upgrade-zulip-from-git: Stop mirroring the remote. The local `/srv/zulip.git` directory has been cloned with `--mirror` since it was first created as a local cache in `dc4b89fb08`. This made some sense at the time, since it was purely a cache of the remote, and not a home to local branches of its own. That changed in `3f83b843c2`, when we began using `git worktree`, which caused the `deployment-...` branches to begin being stored in `/srv/zulip.git`. This caused intermixing of local and remote branches. When `02582c6956` landed, the addition of `--prune` caused all but the most recent deployment branch to be deleted upon every fetch -- leaving previous deployments with non-existent branches checked out: ``` zulip@example-prod-host:~/deployments/last$ git status On branch deployment-2022-04-15-23-07-55 No commits yet Changes to be committed: (use "git rm --cached <file>..." to unstage) new file: .browserslistrc new file: .codecov.yml new file: .codespellignore new file: .editorconfig [...snip list of every file in repo...] ``` Switch `/srv/zulip.git` to no longer be a `--mirror` cache of the origin. We reconfigure the remote to drop `remote.origin.mirror`, and delete all refs under `refs/pulls/` and `refs/heads/`, while preserving any checked-out branches. `refs/pulls/`, if the remote is the canonical upstream, contains _tens of thousands_ of refs, so pruning those refs trims off 20% of the repository size. Those savings require a `git gc --prune=now`, otherwise the dangling objects are ejected from the packfiles, which would balloon the repository up to more than three times its previous size. Repacking the repository is reasonable, in general, after removing such a large number of refs -- and the `--prune=now` is safe and will not lose data, as the `--mirror` was good at ensuring that the repository could not be used for any local state. 
The refname in the upgrade process was previously resolved from the union of local and remote refs, since they were in the same namespace. We instead now only resolve arguments as tags, then origin branches; this means that stale local branches will be skipped. Users who want to deploy from local branches can use `--remote-url=.`. Because the `scripts/lib/upgrade-zulip-from-git` file is "stage 1" and run from the old version's code, this will take two invocations of `upgrade-zulip-from-git` to take effect. Fixes #21901.	2022-06-01 16:06:15 -07:00
Anders Kaseorg	98ed6248e3	apt-repos: Remove now-unneeded Ubuntu 21.10 repository on 22.04. Followup to commit `f8957863a2` (#22055). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-05-25 17:25:23 -07:00
Alex Vandiver	6337f17923	upgrade: Add --skip-restart which preps but does not restart. This adds a `--skip-restart` flag which puts `deployments/next` in a state where it can be restarted into, but holds off on conducting that restart. This requires many of the same guarantees as `--skip-tornado`, in terms of there being no Puppet or database schema changes between the versions. Enforce those with `--skip-restart`, and also broaden both flags to prevent other, less common changes which nonetheless potentially might affect the other deploy.	2022-05-22 15:07:37 -07:00
Alex Vandiver	86a4e64726	upgrade: Enforce that --skip-tornado does not have Puppet or DB changes.	2022-05-22 15:07:18 -07:00
Alex Vandiver	ef7c2ea0ea	upgrade: Copy cache prefix with --skip-tornado. Because Tornado and Django use memcached as a shared cache for checking session information, they must agree on the prefix used to store those values. Subsequent commits will work to ensure that it is always _safe_ to share that cache.	2022-05-22 14:52:38 -07:00
Alex Vandiver	fa77be6e6c	upgrade: Only run Django system checks once, explicitly. These are expensive, and moving them to one explicit call early has considerable time savings in the critical period: ``` $ hyperfine './manage.py fill_memcached_caches' './manage.py fill_memcached_caches --skip-checks' Benchmark #1: ./manage.py fill_memcached_caches Time (mean ± σ): 5.264 s ± 0.146 s [User: 4.885 s, System: 0.344 s] Range (min … max): 5.119 s … 5.569 s 10 runs Benchmark #2: ./manage.py fill_memcached_caches --skip-checks Time (mean ± σ): 3.090 s ± 0.089 s [User: 2.853 s, System: 0.214 s] Range (min … max): 2.950 s … 3.204 s 10 runs Summary './manage.py fill_memcached_caches --skip-checks' ran 1.70 ± 0.07 times faster than './manage.py fill_memcached_caches' ```	2022-05-22 14:52:38 -07:00
Alex Vandiver	3928606886	restart-server: Treat as a start if nothing is running. Treating the restart as a start is important in reducing the critical period during upgrades -- we call restart even when we suspect the services are stopped, because puppet has a small possibility of placing them in indeterminate state. However, restart orders the workers first, then tornado/django, which prolongs the outage. Recognize when no services are currently started, and switch to acting like a start, not a restart, which places tornado/django first.	2022-05-22 14:52:38 -07:00
Alex Vandiver	3717c329b8	stop-server: Only stop services if they exist and are running. This hides ugly output if the services were already stopped: ``` 2022-03-25 23:26:04,165 upgrade-zulip-stage-2: Stopping Zulip... process-fts-updates: ERROR (not running) zulip-django: ERROR (not running) zulip_deliver_scheduled_emails: ERROR (not running) zulip_deliver_scheduled_messages: ERROR (not running) Zulip stopped successfully! ``` Skipping the shell-out to `supervisorctl` when all services are already stopped is also a significant performance improvement.	2022-05-22 14:52:38 -07:00
Alex Vandiver	2e5a079ef4	upgrade: Check with zulip-puppet-apply to see if we can skip it.	2022-05-22 14:52:38 -07:00
Alex Vandiver	ecfc23bd0b	zulip-puppet-apply: Make --force --noop have an exit code.	2022-05-22 14:52:38 -07:00
Alex Vandiver	c91725bfb5	zulip-puppet-apply: Factor out the --noop returncode logic.	2022-05-22 14:52:38 -07:00
Alex Vandiver	b15d8e0118	upgrade: Skip the pre-work if the server is already stopped. This optimization makes sense if the server is already running, but if it is already stopped, it is just prolonging the downtime.	2022-05-22 14:52:38 -07:00
Alex Vandiver	05af4b0a11	upgrade: Fill caches before the critical period, if possible.	2022-05-22 14:52:38 -07:00
Alex Vandiver	2f7068ffbb	upgrade: Move puppet class renames earlier. These do not need to happen during the critical period when the server is stopped.	2022-05-22 14:52:38 -07:00
Anders Kaseorg	f8957863a2	Revert "apt-repos: Downgrade PostgreSQL to dodge PGroonga regression." This reverts commit `9c8d2b7be3` (#21115). The PostgreSQL fix was released 2022-05-12. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-05-17 15:07:37 -07:00
Alex Vandiver	258b658cc0	log-search: Allow multiple search terms. This allows AND'ing multiple terms together.	2022-05-06 17:45:46 -07:00
Alex Vandiver	bd73e7d411	log-search: Factor out argument parsing.	2022-05-06 17:45:46 -07:00
Alex Vandiver	8eab5f6931	log-search: Add status code search. This moves log filename parsing after the filter parsing, as that can now enable --nginx.	2022-05-06 17:45:46 -07:00
Alex Vandiver	0bad002c14	log-search: Factor out logfile name parsing.	2022-05-06 17:45:46 -07:00
Alex Vandiver	67e641f37d	log-search: Add a filter by path.	2022-05-06 17:45:46 -07:00
Alex Vandiver	df47c5a750	log-search: Update docs to include client-id as an option.	2022-05-06 17:45:46 -07:00
Alex Vandiver	b1749259d4	log-search: Fix URLs for non-zulipchat.com hosts.	2022-05-06 17:45:46 -07:00
Alex Vandiver	e3a65b1528	log-search: Some Django log lines do not include hostname.	2022-05-06 17:45:46 -07:00
Alex Vandiver	fe17a4d6d0	log-search: Handle ^C more gracefully.	2022-05-06 17:45:46 -07:00
Alex Vandiver	da4ae3ff24	log-search: Filter out user avatars.	2022-05-06 17:45:46 -07:00
Alex Vandiver	d3ae7480cc	log-search: Handle settings.LOGGING_SHOW_PID.	2022-05-06 17:45:46 -07:00
Alex Vandiver	bd298ba753	log-search: Not all servers are in UTC.	2022-05-06 17:45:46 -07:00
Anders Kaseorg	3cb7d3d1dc	node_cache: Remove node_modules/.cache when copying. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-05-04 09:56:07 -07:00
Alex Vandiver	65b99377d2	log-search: Show duration.	2022-05-03 13:44:29 -07:00
Alex Vandiver	056895cc33	log-search: Search for user-ids.	2022-05-03 13:44:29 -07:00
Alex Vandiver	b355a0a63e	log-search: Default to searching python logfiles. These have more accurate timestamps, and have user information -- but are harder to parse, and will not show requests when Django or Tornado is stopped.	2022-05-03 13:44:29 -07:00
Alex Vandiver	ba1237119c	log-search: Add a tool to search nginx logs by IP/hostname. This is a script to search nginx log files by server hostname or client IP address, and output matching lines, all while skipping common and less-interesting request lines.	2022-05-03 13:44:29 -07:00
Alex Vandiver	e13154f089	puppet: Add ksplice support for 22.04.	2022-05-03 12:36:19 -07:00
Alex Vandiver	cda55a40e7	puppet: Add teleport support for 22.04.	2022-05-03 12:36:19 -07:00
Anders Kaseorg	e952641013	install: Resupport Ubuntu 22.04. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-05-03 09:41:08 -07:00
Anders Kaseorg	25c87cc7da	zulip-puppet-apply: Work around broken Puppet on Ubuntu 22.04. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-05-03 09:41:08 -07:00
Anders Kaseorg	080a806d60	build-pgroonga: Update PGroonga to 2.3.6. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-29 16:02:45 -07:00
Alex Vandiver	3476f63dca	compare-settings-to-template: Handle prod_settings_template renaming.	2022-04-28 14:52:38 -07:00
Alex Vandiver	b6b6faa404	compare-settings-to-template: Simplify and dedent logic.	2022-04-28 14:52:38 -07:00
Alex Vandiver	d205050ab0	compare-settings-to-template: Fetch 100 per pagination.	2022-04-28 14:52:38 -07:00
Alex Vandiver	d79776f80d	compare-settings-to-template: Paginate through all tags. The default page size is 30, which means this only goes back to 4.6 at present, due to starting with `shared-...` and old `enterprise-...` tags.	2022-04-28 14:52:38 -07:00
Anders Kaseorg	098a514599	python: Use Python 3.8 shlex.join function. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-27 12:57:49 -07:00
Anders Kaseorg	0451d1e47f	zulip_tools: Replace universal_newlines with text. Generated by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-27 12:57:49 -07:00
Anders Kaseorg	a543dcc8e3	Remove Debian 10 support. As a consequence: • Bump minimum supported Python version to 3.8. • Move Vagrant environment to Ubuntu 20.04, which has Python 3.8. • Move CI frontend tests to Ubuntu 20.04. • Move production build test to Ubuntu 20.04. • Move 3.4 upgrade test to Ubuntu 20.04. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-26 16:32:02 -07:00
Anders Kaseorg	63a1ef0e91	configure-rabbitmq: Remove use of sudo. It already runs as root everywhere except in provision_inner, so move the sudo there. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-19 12:36:31 -07:00
Anders Kaseorg	cc30ed8ec7	actions: Delete zerver.lib.actions. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-14 17:14:38 -07:00
Alex Vandiver	09860dc284	check-database-compatibility: Sort and prettify output.	2022-04-06 14:10:46 -07:00
Alex Vandiver	eb31681934	check-database-compatibility: Ignore squashed and renamed migrations. Fixes: #21596.	2022-04-01 16:15:41 -07:00
Alex Vandiver	0af00a3233	upgrade: Mark puppet as having started the server. We previously used restart-server if puppet was run, as a nod to the fact that `supervisor reread && supervisor update` will _start_ service groups that were modified, even if they were previously stopped; this is because they are marked as `autostart=true`, which is honored on service change. However, upgrades want to run while there are no services running. If puppet is run, explicitly set the server as potentially being "up", so that a `shutdown_server()` before migrations, if they exist, will stop services.	2022-03-31 17:21:39 -07:00
Alex Vandiver	e9596637e7	upgrade: Move the shutdown_server calls to where they are relevant. shutdown_server is a noop if the server is already stopped; placing these in each block makes the logic more apparent.	2022-03-31 17:21:39 -07:00
Alex Vandiver	65e19c4fbd	supervisor: 'foo:' also matches 'foo'. `7c4293a7d3` switched to checking if the service was already running, and using `supervisorctl start` if it was not. Unfortunately, `list_supervisor_processes("zulip-tornado:")` did not include `zulip-tornado`, and as such a non-sharded process was always considered to _not_ be running, and was thus started, not restarted. Starting an already-started service is a no-op, and thus non-sharded tornado processes were never restarted. The observed behaviour is that requests to the tornado process attempt to load the user from the cache, with a different prefix from Django, and immediately invalidate the session and eject the user back to the login page. Fix the `list_supervisor_processes` logic to match without the trailing `:*`.	2022-03-31 10:41:41 -07:00
Anders Kaseorg	55882fb343	python: Use modern set comprehension syntax. Generated by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-25 10:45:12 -07:00
Anders Kaseorg	1f68c73e66	supervisor: Update superseded super(C, self) syntax to superior super(). Generated by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-25 10:45:12 -07:00
Anders Kaseorg	2762121162	python: Convert last type comments to annotations. We had skipped these in #14693 so we could keep generating a friendly error on Python 3.5, but we gave that up in #19801. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-24 20:32:39 -07:00
Alex Vandiver	d7b59c86ce	puppet: Build wal-g from source for aarch64. Since wal-g does not provide binaries for aarch64, build them from source. While building them from source for amd64 would better ensure that build process is tested, the build process takes 7min and 700M of temp files, which is an unacceptable cost; we thus only build on aarch64. Since the wal-g build process uses submodules, which are not in the GitHub export, we clone the full wal-g repository. Because the repository is relatively small, we clone it anew on each new version, rather than attempt to manage the remotes. Fixes #21070.	2022-03-22 15:02:35 -07:00
Alex Vandiver	a4d0f03319	scripts: Switch to stop-server/restart-server. stop-server and restart-server address all services which talk to the database, and are thus more correct than restarting or stopping everything in supervisor. This is possible now that the previous commit ensures that the zulip user can read the zulip installation directory during `create-database`; previously, that directory was still owned by root when `create-database` was run, whereas now it is in `~zulip/deployments/`.	2022-03-21 16:33:28 -07:00
Alex Vandiver	c0cc98c6a8	install: Re-order final steps. Move database creation to immediately before database initialization; this means it happens in a directory readable by the `zulip` user, as well as placing it alongside similar operations. It removes the check for the `zulip::postgresql_common` Puppet class; instead it keeps the check for `--no-init-db`, and switches to require `zulip::app_frontend_base`. This is a behavior change for any install of `zulip::postgresql_common`-only classes, but that is not a common form -- and such installs likely already pass `--no-init-db` because they are warm spare replicas. As a result, all non-`zulip::app_frontend_base` installs now skip database initialization, even without `--no-init-db`. This is clearly correct for, e.g. Redis-only hosts, and makes clearer that the frontend, not the database host, is responsible for database initialization.	2022-03-21 16:33:28 -07:00
Alex Vandiver	394f1eadde	setup: Rename postgresql-init-db to create-database. The old name was confusingly similar to initialize-database.	2022-03-21 16:33:28 -07:00
Anders Kaseorg	7d4b02738d	install-node: Upgrade Node.js from 16.14.0 to 16.14.1. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-17 15:24:46 -07:00
Anders Kaseorg	84e91a6e33	configure-rabbitmq: Use rabbitmqctl await_online_nodes. rabbitmqctl ping only checks that the Erlang process is registered with epmd. There’s a window after that where the rabbit app is still starting inside it. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-14 16:26:05 -07:00
Alex Vandiver	52d363cada	upgrade: Skip re-checking of new bots on upgrade. This was added in `c770bdaa3a`, and we have not added any realm-internal bots since `c770bdaa3a`. Speed up the critical period during upgrades by skipping this step.	2022-03-14 14:14:53 -07:00
Alex Vandiver	d26a15b14d	setup-apt-repo: Make hashes file not contain full path. Using an absolute `ZULIP_SCRIPTS` path when computing sha256sums results in a set of hashes which varies based on the path that the script is called as. This means that each deploy _always_ has `setup-apt-repo --verify` fail, since it is a different base path. Make all paths passed to sha256sum be relative to the repository root, ensuring they can be compared across runs.	2022-03-12 17:24:19 -08:00
Alex Vandiver	7c4293a7d3	restart-server: Check if service is running before restart, vs start. In some instances (e.g. during upgrades) we run `restart-server` and not `start-server`, even though we expect the server to most likely already be stopped. `supervisorctl restart servicename` if the service is stopped produces the perhaps-alarming message: ``` restart-server: Restarting servicename servicename: ERROR (not running) servicename: started ``` This may cause operators to worry that something is broken, when it is not. Check if the service is already running, and switch from "restart" to "start" in cases where it is not. The race condition here is safe -- if the service transitions from stopped to started between the check and the `start` call, it will merely output: ``` servicename: ERROR (already started) ``` ...and continue, as that has exit status 0. If the service transitions from started to stopped between the check and the `restart` call, we are merely back in the current case, where it outputs: ``` servicename: ERROR (not running) servicename: started ``` In none of these cases does a call to "restart" fail to result in the service being stopped and then started.	2022-03-09 14:42:15 -08:00
Anders Kaseorg	646e466341	install: Desupport Ubuntu 22.04 for now. Ubuntu 22.04 pushed a post-feature-freeze update to Python 3.10, breaking virtual environments in a Debian patch (https://bugs.launchpad.net/ubuntu/+source/python3.10/+bug/1962791). Also, our antique version of Tornado doesn’t work in 3.10, and we’ll need to do some work to upgrade that. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-07 11:46:07 -08:00
Anders Kaseorg	60e943b92e	install-node: Upgrade Node.js from 16.13.2 to 16.14.0. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-01 23:09:46 -08:00
Anders Kaseorg	de1fb2b8d0	check-database-compatibility: Ignore guardian, django.contrib.sites. We can safely ignore the presence of the extra tables that could be left behind in the database from when we had these installed (before Zulip 1.7.0 and 2.0.0, respectively). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-03-01 10:30:23 -08:00
Tim Abbott	98a05257ea	scripts: Print names of missing migrations in compatibility check. This will make it much easier to debug any situations where this happens.	2022-02-28 11:09:52 -08:00
Anders Kaseorg	894a50b5c9	install: Support Ubuntu 22.04. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-25 14:49:07 -08:00
Anders Kaseorg	f9997e311c	generate-self-signed-cert: Remove RANDFILE. This was not needed for OpenSSL ≥ 1.1.1 (all our supported platforms), and breaks with OpenSSL ≥ 3.0.0 (Ubuntu 22.04). It was removed from the upstream configuration file too: https://bugs.debian.org/990228. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-25 14:49:07 -08:00
Anders Kaseorg	f852af0709	upgrade-zulip-stage-2: Set default PostgreSQL version for Debian 11. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-25 14:49:07 -08:00
Anders Kaseorg	1fa2761790	upgrade-zulip-stage-2: Remove create_large_indexes optimization. This was only used for upgrading from Zulip < 1.9.0, which is no longer possible because Zulip < 2.1.0 had no common supported platforms with current main. If we ever want this optimization for a future migration, it would be better implemented using Django merge migrations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-23 11:59:45 -08:00
Anders Kaseorg	1629d6bfb3	python: Reformat with Black 22 (stable). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-18 18:03:13 -08:00
Alex Vandiver	1d2582c899	upgrade: Log the commit hash and directory when upgrading.	2022-02-16 12:33:58 -08:00
Anders Kaseorg	f6a701090c	setup-apt-repos: Don’t install lsb_release. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-14 16:38:53 -08:00
Anders Kaseorg	9c8d2b7be3	apt-repos: Downgrade PostgreSQL to dodge PGroonga regression. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-13 19:11:49 -08:00
Anders Kaseorg	43c4672deb	apt-repos: Remove groovy. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-13 19:11:49 -08:00
Anders Kaseorg	fdc1294993	setup-apt-repo: Support installing an APT preferences file. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-13 19:11:49 -08:00
Anders Kaseorg	7077a289ae	setup-apt-repo: Move supported release check earlier. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-13 19:11:49 -08:00
Anders Kaseorg	c8bb98554e	setup-apt-repo: Use /etc/os-release instead of lsb_release. But still install lsb-release for now since Puppet acts funny without it. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-13 19:11:49 -08:00
Anders Kaseorg	d1241be496	configure-rabbitmq: Use rabbitmqctl ping. Our supported distributions now all have RabbitMQ ≥ 3.7.8. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-13 19:09:41 -08:00
Tim Abbott	1a7c4a0276	scripts: Fix typo in logging statement.	2022-02-11 13:47:24 -08:00
Alex Vandiver	8da6098631	upgrade: Catch "upgrade" attempts which would downgrade the database. Attempting to "upgrade" from `main` to 4.x should abort; Django does not prevent running old code against the new database (though it likely errors at runtime), and `./manage.py migrate` from the old version during the "upgrade" does not downgrade the database, since the migrations are entirely missing in that directory, so don't get reversed. Compare the list of applied migrations to the list of on-disk migrations, and abort if there are applied migrations which are not found on disk. Fixes: #19284.	2022-02-10 16:02:49 -08:00
Alex Vandiver	71e02d7893	zulip_tools: Factor out ZULIP_VERSION parsing.	2022-02-10 16:02:49 -08:00
Anders Kaseorg	e1f42c1ac5	docs: Add missing space to compound verbs “back up”, “log in”, etc. Noun: backup, login, logout, lookup, setup. Verb: back up, log in, log out, look up, set up. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-07 19:20:54 -08:00
Anders Kaseorg	b0ce4f1bce	docs: Fix many spelling mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-07 18:51:06 -08:00
Alex Vandiver	2066860ab6	start-server: Start auxiliary services, if they exist. Services like go-camo and smokescreen are not stopped in stop-server, since they are upgraded and restarted by puppet application. As such, they also do not appear in start-server, despite the server relying on them to be running to function properly. Ensure those services are started, by starting them in start-server, if they are configured in supervisor on the host.	2022-01-26 12:39:54 -08:00
Alex Vandiver	88c3f560ae	supervisor: Add a filter for only(-not)-running.	2022-01-26 12:39:54 -08:00
Alex Vandiver	7243c3c73d	scripts: Re-implement list_supervisor_processes using API.	2022-01-26 12:39:54 -08:00
Alex Vandiver	8e35cdb3da	scripts: Add a supervisor package, to use the XMLRPC Supervisor API. For many uses, shelling out to `supervisorctl` is going to produce better error messages. However, for instances where we wish to parse the output of `supervisorctl`, using the API directly is less brittle.	2022-01-26 12:39:54 -08:00
Anders Kaseorg	aec6cd4cdb	reindex-textual-data: Find psycopg2 in the virtualenv. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-01-26 11:56:30 -08:00
Alex Vandiver	a5496f4098	CVE-2021-43799: Set a secure Erlang cookie. The RabbitMQ docs state ([1]): RabbitMQ nodes and CLI tools (e.g. rabbitmqctl) use a cookie to determine whether they are allowed to communicate with each other. [...] The cookie is just a string of alphanumeric characters up to 255 characters in size. It is usually stored in a local file. ...and goes on to state (emphasis ours): If the file does not exist, Erlang VM will try to create one with a randomly generated value when the RabbitMQ server starts up. Using such generated cookie files are appropriate in development environments only. The auto-generated cookie does not use cryptographic sources of randomness, and generates 20 characters of `[A-Z]`. Because of a semi-predictable seed, the entropy of this password is thus less than the idealized 26^20 = 94 bits of entropy; in actuality, it is 36 bits of entropy, or potentially as low as 20 if the performance of the server is known. These sizes are well within the scope of remote brute-force attacks. On provision, install, and upgrade, replace the default insecure 20-character Erlang cookie with a cryptographically secure 255-character string (the max length allowed). [1] https://www.rabbitmq.com/clustering.html#erlang-cookie	2022-01-25 02:13:53 +00:00
Alex Vandiver	93a344fc3c	configure-rabbitmq: Set -u, and not -x.	2022-01-25 01:52:36 +00:00
Alex Vandiver	ece96c9729	configure-rabbitmq: Factor out sudo, instead of rabbitmqctl.	2022-01-25 01:52:36 +00:00
Alex Vandiver	bd7deed691	upgrade: Show output from (re)starting zulip. `5c450afd2d`, in ancient history, switched from `check_call` to `check_output` and throwing away its result. Use check_call, so that we show the steps to (re)starting the server.	2022-01-25 01:52:34 +00:00
Alex Vandiver	e705883857	CVE-2021-43799: During upgrades, restart rabbitmq if necessary. Check if it is listening on a public interface on port 25672, and if so shut it down so it can pick up the new configuration.	2022-01-25 01:51:56 +00:00
Alex Vandiver	da5201b986	upgrade: Make calling shutdown_server twice, only try once.	2022-01-25 01:48:05 +00:00
Alex Vandiver	43d63bd5a1	puppet: Always set the RabbitMQ nodename to zulip@localhost. This is required in order to lock down the RabbitMQ port to only listen on localhost. If the nodename is `rabbit@hostname`, in most circumstances the hostname will resolve to an external IP, which the rabbitmq port will not be bound to. Installs which used `rabbit@hostname`, due to RabbitMQ having been installed before Zulip, would not have functioned if the host or RabbitMQ service was restarted, as the localhost restrictions in the RabbitMQ configuration would have made rabbitmqctl (and Zulip cron jobs that call it) unable to find the rabbitmq server. The previous commit ensures that configure-rabbitmq is re-run after the nodename has changed. However, rabbitmq needs to be stopped before `rabbitmq-env.conf` is changed; we use an `onlyif` on an `exec` to print the warning about the node change, and let the subsequent config change and notify of the service and configure-rabbitmq to complete the re-configuration.	2022-01-25 01:48:02 +00:00
Alex Vandiver	3bfcfeac24	puppet: Run configure-rabbitmq on nodename change. `/etc/rabbitmq/rabbitmq-env.conf` sets the nodename; anytime the nodename changes, the backing database changes, and this requires re-creating the rabbitmq users and permissions. Trigger this in puppet by running configure-rabbitmq after the file changes.	2022-01-25 01:46:51 +00:00
Alex Vandiver	b6cd89440e	setup: Remove unused RABBITMQ_NODE. This reverts commit `889547ff5e`. It is unused in the Docker container, as the configuration of the `zulip` user in the rabbitmq node is done via environment variables. The Zulip host in that context does not have `rabbitmqctl` installed, and would have needed to know the Erlang cookie to be able to run these commands.	2022-01-25 01:46:51 +00:00
Anders Kaseorg	21548ff7c0	install-node: Upgrade Node.js from 16.13.1 to 16.13.2. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-01-24 15:55:38 -08:00
Alex Vandiver	a3adaf4aa3	puppet: Fix standalone certbot configurations. This addresses the problems mentioned in the previous commit, but for existing installations which have `authenticator = standalone` in their configurations. This reconfigures all hostnames in certbot to use the webroot authenticator, and attempts to force-renew their certificates. Force-renewal is necessary because certbot contains no way to merely update the configuration. Let's Encrypt allows for multiple extra renewals per week, so this is a reasonable cost. Because the certbot configuration is `configobj`, and not `configparser`, we have no way to easily parse to determine if webroot is in use; additionally, `certbot certificates` does not provide this information. We use `grep`, on the assumption that this will catch nearly all cases. It is possible that this will find `authenticator = standalone` certificates which are managed by Certbot, but not Zulip certificates. These certificates would also fail to renew while Zulip is running, so switching them to use the Zulip webroot would still be an improvement. Fixes #20593.	2022-01-24 12:13:44 -08:00
Alex Vandiver	76ce8631c0	setup: Install a temporary certificate, before certbot runs. Installing certbot with --method=standalone means that the configuration file will be written to assume that the standalone method will be used going forward. Since nginx will be running, attempts to renew the certificate will fail. Install a temporary self-signed certificate, just to allow nginx to start, and then follow up (after applying puppet to start nginx) with the call to setup-certbot, which will use the webroot authenticator. The `setup-certbot --method=standalone` option is left intact, for use in development environments. Fixes part of #20593; it does not address installs which were previously improperly configured with `authenticator = standalone`.	2022-01-24 12:13:44 -08:00
Anders Kaseorg	97e4e9886c	python: Replace universal_newlines with text. This is supported in Python ≥ 3.7. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-01-23 22:16:01 -08:00
Anders Kaseorg	a58a71ef43	Remove Ubuntu 18.04 support. As a consequence: • Bump minimum supported Python version to 3.7. • Move Vagrant environment to Debian 10, which has Python 3.7. • Move CI frontend tests to Debian 10. • Move production build test to Debian 10. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-01-21 17:26:14 -08:00
Alex Vandiver	677467f040	upgrade-zulip-from-git: Fix upstream URL for existing deploys.	2022-01-18 21:10:38 -08:00
Alex Vandiver	bad58cdca6	upgrade-zulip-from-git: Fix the upstream URL to not be the custom remote.	2022-01-18 21:10:38 -08:00
Alex Vandiver	6bc5849ea8	puppet: Remove now-unused debathena apt repository.	2022-01-18 14:13:28 -08:00
Anders Kaseorg	e2cc554077	zulip_tools: Rename may_be_perform_purging to maybe_perform_purging. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-01-12 13:21:35 -08:00
Alex Vandiver	b31658482b	upgrade-zulip: Pass any arguments down to upgrade-zulip-stage-2. This is the equivalent of `93f3da4c05` but for the tarball codepath.	2022-01-11 14:26:54 -08:00
Alex Vandiver	06e115bb00	zulip_tools: Switch get_deploy_options to use shlex.split. This makes it honor quoting in the config file.	2022-01-11 14:26:54 -08:00
Anders Kaseorg	1cc1de82cd	reindex-textual-data: Reindex textual functional indexes too. This catches nine functional indexes that the previous query didn’t: upper_preregistration_email_idx upper_stream_name_idx upper_subject_idx upper_userprofile_email_idx zerver_message_recipient_upper_subject zerver_mutedtopic_stream_topic zerver_stream_realm_id_name_uniq zerver_userprofile_realm_id_delivery_email_uniq zerver_userprofile_realm_id_email_uniq Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-01-07 10:37:04 -08:00
Alex Vandiver	6218ed91c2	puppet: Use lazy-apps and uwsgi control sockets for rolling reloads. Restarting the uwsgi processes by way of supervisor opens a window during which nginx 502's all responses. uwsgi has a configuration called "chain reloading" which allows for rolling restart of the uwsgi processes, such that only one process at once is unavailable; see uwsgi documentation ([1]). The tradeoff is that this requires that the uwsgi processes load the libraries after forking, rather than before ("lazy apps"); in theory this can lead to larger memory footprints, since they are not shared. In practice, as Django defers much of the loading, this is not as much of an issue. In a very basic test of memory consumption (measured by total memory - free - caches - buffers; 6 uwsgi workers), both immediately after restarting Django, and after requesting `/` 60 times with 6 concurrent requests: Fresh: 2,827,216 non-lazy vs. 2,870,480 lazy app (difference +43,264). After 60 requests: 3,332,284 non-lazy vs. 3,409,608 lazy app (difference +77,324). Growth over the 60 requests: +505,068 non-lazy vs. +539,128 lazy app (difference +34,060). That is, "lazy app" loading increased the footprint pre-requests by 43MB, and after 60 requests grew the memory footprint by 539MB, as opposed to non-lazy loading, which grew it by 505MB. Using uwsgi "lazy app" loading does increase the memory footprint, but not by a large percentage. The other effect is that processes may be served by either old or new code during the restart window. This may cause transient failures when new frontend code talks to old backend code. Enable chain-reloading during graceful, puppetless restarts, but only if enabled via a zulip.conf configuration flag. Fixes #2559. [1]: https://uwsgi-docs.readthedocs.io/en/latest/articles/TheArtOfGracefulReloading.html#chain-reloading-lazy-apps	2022-01-05 14:48:52 -08:00
Alex Vandiver	4aaa250623	zulip_tools: Fix a typo in a comment.	2022-01-05 14:48:52 -08:00
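
Note on `06e115bb00` (get_deploy_options switching to `shlex.split`): the following is a minimal sketch of why that change honors quoting. It is not Zulip's actual code. The option string and the `--note` flag are hypothetical and chosen only for illustration; `--skip-checks` is the flag named in the log.

```python
import shlex

# Hypothetical deploy-options string, as it might be written in a config file.
# The --note flag is invented for this illustration.
raw = '--skip-checks --note "two words"'

# Plain whitespace splitting breaks the quoted value into two tokens
# and leaves the quote characters in place.
naive = raw.split()

# shlex.split applies shell-like quoting rules, so the quoted value
# stays together as a single argument with the quotes removed.
quoted = shlex.split(raw)

print(naive)
print(quoted)
```

With whitespace splitting, the quoted value is passed on as two separate arguments; with `shlex.split` it arrives as one.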

1382 Commits