Commit Graph

1001 Commits

Author SHA1 Message Date
Anders Kaseorg 77fdac3579 install-node: Upgrade Node.js to 14.15.1 and nvm to 0.37.2.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-09 23:07:40 -08:00
Vishnu KS eb008fc864 emails: Use macros for email tags in invitation email. 2020-10-30 11:50:30 -07:00
Anders Kaseorg aaa7b766d8 python: Use universal_newlines to get str from subprocess.
We can replace ‘universal_newlines’ with ‘text’ when we bump our
minimum Python version to 3.7.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg 86e8d81c7f python: Skip unnecessary decode before JSON parsing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Tim Abbott c537912a77 puppet: Migrate postgres_backups puppet manifest name. 2020-10-29 11:29:44 -07:00
Alex Vandiver 2332113c97 upgrade: Adjust puppet class names even with --skip-puppet.
The class names need to be renamed even if we are not about to run
puppet ourselves; otherwise, deployments which rely on running puppet
themselves will still have the wrong class names.
2020-10-28 17:49:14 -07:00
Alex Vandiver 6b9d7000b5 puppet: Set proxy environment variables.
These are respected by `urllib`, and thus also `requests`.  We set
`HTTP_proxy`, not `HTTP_PROXY`, because the latter is ignored in
situations which might be running under CGI -- in such cases it may be
coming from the `Proxy:` header in the request.
2020-10-28 12:17:35 -07:00
Alex Vandiver 97745688ca docs: Link to the new doc home of the email gateway. 2020-10-28 12:13:04 -07:00
Alex Vandiver f1cf730c5b restore-backup: Rename variables to postgresql. 2020-10-28 11:57:03 -07:00
Alex Vandiver 5ee3379ce0 upgrade: Rename variables to postgresql. 2020-10-28 11:57:03 -07:00
Alex Vandiver 2b0bbbb882 tools: Rename postgres to postgresql in tool names. 2020-10-28 11:57:02 -07:00
Alex Vandiver 5eb8064a1a install: Rename postgres options to postgresql. 2020-10-28 11:55:32 -07:00
Alex Vandiver 1f7132f50d docs: Standardize on PostgreSQL, not Postgres. 2020-10-28 11:55:16 -07:00
Anders Kaseorg 23a289ecd5 install-node: Upgrade Node.js to 12.19.0 and Yarn to 1.22.10.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-28 11:45:02 -07:00
Anders Kaseorg de5282d2cf install-node: Install npm and npx symlinks.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-28 11:45:02 -07:00
Alex Vandiver 5f3765b872 upgrade: Adjust puppet classes to new names. 2020-10-27 13:29:19 -07:00
Alex Vandiver 16d9dd84b8 upgrade: Switch to using crudini to update zulip.conf contents.
Using `config_file.write()` only writes out what python stored of the
file; as such, it strips all comments and whitespace.

Use `crudini --set`, which only modifies the line whose contents are
changed.
2020-10-27 13:29:19 -07:00
Alex Vandiver 5365af544a puppet: Rename zulip::profile::rabbit to ::rabbitmq. 2020-10-27 13:29:19 -07:00
Alex Vandiver 188af57296 puppet: Rename postgres_appdb to postgresql.
There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" the name of the binary and colloquial name.  This is minor
cleanup, but enabled by the other renames in the previous commit.
2020-10-27 13:29:19 -07:00
Alex Vandiver 0f25acc7b3 puppet: Rename "voyager"/"dockervoyager" to "standalone"/"docker".
The "voyager" name is non-intuitive and not significant.
`zulip::voyager` and `zulip::dockervoyager` stubs are kept for
back-compatibility with existing `zulip.conf` files.
2020-10-27 13:29:19 -07:00
Alex Vandiver c2185a81d6 puppet: Move top-level zulip deployments into "profile" directory.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes.  Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.

Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.

[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
2020-10-27 13:29:19 -07:00
Alex Vandiver 7cf737988d queue: Be more explicit about test/real queue division. 2020-10-26 12:32:47 -07:00
Anders Kaseorg 31d0141a30 python: Close opened files.
Fixes various instances of ‘ResourceWarning: unclosed file’ with
python -Wd.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-26 12:31:30 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Anders Kaseorg 16aa48d9b2 configure-rabbitmq: Wait for RabbitMQ to start up.
Fixes an occasional failure in ‘vagrant up --provision’.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-15 17:01:00 -07:00
Anders Kaseorg f16aa8f264 configure-rabbitmq: Put the command and flags in one array.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-15 17:01:00 -07:00
Alex Vandiver 1fa4ef0271 upgrade-postgres: Catch failed pg_upgradecluster exit code.
Because the command is part of a pipe sequence, the exitcode defaults
to the last in the sequence, which is not the most important one here.

Set pipefail, which sets the exit status to the exit code of the last
program in the sequence to exit non-zero, or 0 if all succeeded.  This
prevents the upgrade from barreling onward and setting
`postgres.version` improperly if the database upgrade step failed.
2020-10-15 15:21:30 -07:00
Anders Kaseorg dfaea9df65 shfmt: Reformat shell scripts with shfmt.
https://github.com/mvdan/sh

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-15 15:16:00 -07:00
Anders Kaseorg dd48dbd912 docs: Add spaces to “check out”, “log in”, “set up”, “sign up” as verbs.
“Checkout”, “login”, “setup”, and “signup” are nouns, not verbs.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-13 15:47:13 -07:00
Anders Kaseorg b7a94be152 python: Catch BaseException when we need to clean something up.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-11 16:16:16 -07:00
Tim Abbott 5de6f3523c upgrade-postgres: Pass the requested postgres explicitly. 2020-10-01 14:29:24 -07:00
Alex Vandiver 4d65ea256a rabbitmq: Consolidate check_rabbitmq_queue to call rabbitmqctl once.
`rabbitmqctl` tends to be slow; this shaves half a second off the time
to run `check-rabbitmq-consumers` in some cases.
2020-09-29 17:44:44 -07:00
Alex Vandiver c0e240277b tornado: Remove fingerprinting, write out .tmp files always.
Fingerprinting the config is somewhat brittle -- it requires either
custom bootstrapping for old (fingerprint-less) configs, and may have
false-positives.

Since generating the config is lightweight, do so into the .tmp files,
and compare the output to the originals to determine if there are
changes to apply.

In order to both surface errors, as well as notify the user in case a
restart is necessary, we must run it twice.  The `onlyif`
functionality cannot show configuration errors to the user, only
determine if the command runs or not.  We thus run the command once,
judging errors as "interesting" enough to run the actual command,
whose failure will be verbose in Puppet and halt any steps that depend
on it.

Removing the `onlyif` would result in `stage_updated_sharding` showing
up in the output of every Puppet run, which obscures the important
messages it displays when an update to sharding is necessary.
Removing the `command` (e.g. making it an `echo`) would result in
removing the ability to report configuration errors.  We thus have no
choice but to run it twice; this is thankfully low-overhead.
2020-09-25 10:52:40 -07:00
Alex Vandiver 4b3121db0b certbot: Explicitly apt-get update before installing certbot.
There is no guarantee that the apt data is up-to-date, unless we
explicitly update.

Fixes: zulip/docker-zulip#275
2020-09-21 15:26:28 -07:00
Mateusz Mandera e2dcdc2758 queue: Increase allowed expected_time_to_clear_backlog for embed_links.
It's okay for this queue to be a bit slow, and the default limits are
kind of too low for it.
2020-09-21 15:24:04 -07:00
Mateusz Mandera cd9b194d88 queue: Eliminate useless "burst" concept in monitoring.
The reason higher expected_time_to_clear_backlog were allowed for queues
during "bursts" was, in simpler terms, because those queues to which
this happens, intrinsically have a higher acceptable "time until cleared"
for new events. E.g. digests_email, where it's completely fine to take a
long time to send them out after putting in the queue. And that's
already configurable without a normal/burst distinction.
Thanks to this we can remove a bunch of overly complicated, and
ultimately useless, logic.
2020-09-21 15:24:04 -07:00
Mateusz Mandera 2365a53496 queue: Fix a race condition in monitoring after queue stops being idle.
The race condition is described in the comment block removed by this
commit. This leaves room for another, remaining race condition
that should be virtually impossible, but nevertheless it seems
worthwhile to have it documented in the code, so we put a new comment
describing it.
As a final note, this is not a new race condition,
it was hypothetically possible with the old code as well.
2020-09-21 15:22:56 -07:00
Alex Vandiver 2a12fedcf1 tornado: Remove explicit tornado_processes setting; compute it.
We can compute the intended number of processes from the sharding
configuration.  In doing so, also validate that all of the ports are
contiguous.

This removes a discrepancy between `scripts/lib/sharding.py` and other
parts of the codebase about if merely having a `[tornado_sharding]`
section is sufficient to enable sharding.  Having behaviour which
changes merely based on if an empty section exists is surprising.

This does require that a (presumably empty) `9800` configuration line
exist, but making that default explicit is useful.

After this commit, configuring sharding can be done by adding to
`zulip.conf`:

```
[tornado_sharding]
9800 =              # default
9801 = other_realm
```

Followed by running `./scripts/refresh-sharding-and-restart`.
2020-09-18 15:13:40 -07:00
Anders Kaseorg b7874ac82e install-node: Upgrade Node.js to 12.18.4 and Yarn to 1.22.5.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-15 16:33:28 -07:00
Alex Vandiver efdaa58c24 supervisor: Use more specific process_name than "port-9800".
Making this include "zulip-tornado" makes it clearer in supervisor
logs.  Without this, one only sees:
```
2020-09-14 03:43:13,788 INFO waiting for port-9807 to stop
2020-09-14 03:43:14,466 INFO stopped: port-9807 (exit status 1)
2020-09-14 03:43:14,469 INFO spawned: 'port-9807' with pid 24289
2020-09-14 03:43:15,470 INFO success: port-9807 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
```
2020-09-14 22:17:51 -07:00
Alex Vandiver 13fb7875e2 nagios: Remove an unnecessary path.append. 2020-09-14 18:20:12 -07:00
Alex Vandiver dd68cc98fd upgrade: Stop in the same order as restart-server.
restart-server explicitly stops the workers first, then the core
services.  Keep that ordering consistently.
2020-09-14 16:27:15 -07:00
Alex Vandiver dc58dec231 restart-server: Start services in opposite order from stop.
`supervisorctl` starts and stops its arguments sequentially, in the
order they are passed[1].  Start them in the opposite order from the
order in which they were stopped -- this puts the dependencies first,
and the most core services (`zulip-django`) last.

While the only "dependency" here is currently thumbor, this sets us up
in case others are added later.

[1] https://github.com/Supervisor/supervisor/blob/master/supervisor/supervisorctl.py#L782
2020-09-14 16:27:15 -07:00
Alex Vandiver 8adf530400 puppet: Generate sharding in puppet, then refresh-sharding-and-restart.
This supports running puppet to pick up new sharding changes, which
will warn of the need to finalize them via
`refresh-sharding-and-restart`, or simply running that directly.
2020-09-14 16:27:15 -07:00
Alex Vandiver bf029d99f1 sharding: Also mark sharding.json 644 for consistency.
There is no reason to limit this to 640; mark it 644 for consistency
with the other file.
2020-09-14 16:27:15 -07:00
Alex Vandiver b5bcff04e5 sharding: Consistent mode for nginx sharding file.
This disagreed between `tornado_sharding.pp` in puppet and
`scripts/refresh-sharding-and-restart`.
2020-09-14 16:27:15 -07:00
Mateusz Mandera aae84197e8 check-rabbitmq-queue: Use list_queues output for current backlog size.
The value in the stats file can get outdated if the queue hasn't done
enough iterations to update the stats file for a while. The queue size
output by rabbitmqctl list_queues is more up to date, and empirically
tends to agree with the value in the stats file (when the stats file is
fresh).
2020-09-11 15:51:07 -07:00
Anders Kaseorg b7b7475672 python: Use standard secrets module to generate random tokens.
There are three functional side effects:

• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.

• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.

• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.

(The base32 alphabet is A-Z 2-7.  We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-09 15:52:57 -07:00
Anders Kaseorg f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg bb4fc3c4c7 python: Prefer --flag=option over --flag option.
For less inflation by Black.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00