Commit Graph

2476 Commits

Author SHA1 Message Date
Alex Vandiver a5496f4098 CVE-2021-43799: Set a secure Erlang cookie.
The RabbitMQ docs state ([1]):

    RabbitMQ nodes and CLI tools (e.g. rabbitmqctl) use a cookie to
    determine whether they are allowed to communicate with each
    other. [...] The cookie is just a string of alphanumeric
    characters up to 255 characters in size. It is usually stored in a
    local file.

...and goes on to state (emphasis ours):

    If the file does not exist, Erlang VM will try to create one with
    a randomly generated value when the RabbitMQ server starts
    up. Using such generated cookie files are **appropriate in
    development environments only.**

The auto-generated cookie does not use cryptographic sources of
randomness, and generates 20 characters of `[A-Z]`.  Because of a
semi-predictable seed, the entropy of this password is thus less than
the idealized 26^20 = 94 bits of entropy; in actuality, it is 36 bits
of entropy, or potentially as low as 20 if the performance of the
server is known.

These sizes are well within the scope of remote brute-force attacks.

On provision, install, and upgrade, replace the default insecure
20-character Erlang cookie with a cryptographically secure
255-character string (the max length allowed).

[1] https://www.rabbitmq.com/clustering.html#erlang-cookie
2022-01-25 02:13:53 +00:00
Alex Vandiver a46f6df91e CVE-2021-43799: Write rabbitmq configuration before starting.
Zulip writes a `rabbitmq.config` configuration file which locks down
RabbitMQ to listen only on localhost:5672, as well as the RabbitMQ
distribution port, on localhost:25672.

The "distribution port" is part of Erlang's clustering configuration;
while it is documented that the protocol is fundamentally
insecure ([1], [2]) and can result in remote arbitrary execution of
code, by default the RabbitMQ configuration on Debian and Ubuntu
leaves it publicly accessible, with weak credentials.

The configuration file that Zulip writes, while effective, is only
written _after_ the package has been installed and the service
started, which leaves the port exposed until RabbitMQ or system
restart.

Ensure that rabbitmq's `/etc/rabbitmq/rabbitmq.config` is written
before rabbitmq is installed or starts, and that changes to that file
trigger a restart of the service, such that the ports are only ever
bound to localhost.  This does not mitigate existing installs, since
it does not force a rabbitmq restart.

[1] https://www.erlang.org/doc/apps/erts/erl_dist_protocol.html
[2] https://www.erlang.org/doc/reference_manual/distributed.html#distributed-erlang-system
2022-01-25 01:48:05 +00:00
Alex Vandiver 43d63bd5a1 puppet: Always set the RabbitMQ nodename to zulip@localhost.
This is required in order to lock down the RabbitMQ port to only
listen on localhost.  If the nodename is `rabbit@hostname`, in most
circumstances the hostname will resolve to an external IP, which the
rabbitmq port will not be bound to.

Installs which used `rabbit@hostname`, due to RabbitMQ having been
installed before Zulip, would not have functioned if the host or
RabbitMQ service was restarted, as the localhost restrictions in the
RabbitMQ configuration would have made rabbitmqctl (and Zulip cron
jobs that call it) unable to find the rabbitmq server.

The previous commit ensures that configure-rabbitmq is re-run after
the nodename has changed.  However, rabbitmq needs to be stopped
before `rabbitmq-env.conf` is changed; we use an `onlyif` on an `exec`
to print the warning about the node change, and let the subsequent
config change and notify of the service and configure-rabbitmq to
complete the re-configuration.
2022-01-25 01:48:02 +00:00
Alex Vandiver 694c4dfe8f puppet: Admit we leave epmd port 4369 open on all interfaces.
The Erlang `epmd` daemon listens on port 4369, and provides
information (without authentication) about which Erlang processes are
listening on what ports.  This information is not itself a
vulnerability, but may provide information for remote attackers about
what local Erlang services (such as `rabbitmq-server`) are running,
and where.

`epmd` supports an `ERL_EPMD_ADDRESS` environment variable to limit
which interfaces it binds on.  While this environment variable is set
in `/etc/default/rabbitmq-server`, Zulip unfortunately attempts to
start `epmd` using an explicit `exec` block, which ignores those
settings.

Regardless, this lack of `ERL_EPMD_ADDRESS` variable only controls
`epmd`'s startup upon first installation.  Upon reboot, there are two
ways in which `epmd` might be started, neither of which respect
`ERL_EPMD_ADDRESS`:

 - On Focal, an `epmd` service exists and is activated, which uses
   systemd's configuration to choose which interfaces to bind on, and
   thus `ERL_EPMD_ADDRESS` is irrelevant.

 - On Bionic (and Focal, due to a broken dependency from
   `rabbitmq-server` to `epmd@` instead of `epmd`, which may lead to
   the explicit `epmd` service losing a race), `epmd` is started by
   `rabbitmq-server` when it does not detect a running instance.
   Unfortunately, only `/etc/init.d/rabbitmq-server` would respects
   `/etc/default/rabbitmq-server` -- and it defers the actual startup
   to using systemd, which does not pass the environment variable
   down.  Thus, `ERL_EPMD_ADDRESS` is also irrelevant here.

We unfortunately cannot limit `epmd` to only listening on localhost,
due to a number of overlapping bugs and limitations:

 - Manually starting `epmd` with `-address 127.0.0.1` silently fails
   to start on hosts with IPv6 disabled, due to an Erlang bug ([1],
   [2]).

 - The dependencies of the systemd `rabbitmq-server` service can be
   fixed to include the `epmd` service, and systemd can be made to
   bind to `127.0.0.1:4369` and pass that socket to `epmd`, bypassing
   the above bug.  However, the startup of this service is not
   guaranteed, because it races with other sources of `epmd` (see
   below).

 - Any process that runs `rabbitmqctl` results in `epmd` being started
   if one is not currently running; these instances do not respect any
   environment variables as to which addresses to bind on.  This is
   also triggered by `service rabbitmq-server status`, as well as
   various Zulip cron jobs which inspect the rabbitmq queues.  As
   such, it is difficult-to-impossible to ensure that some other
   `epmd` process will not win the race and open the port on all
   interfaces.

Since the only known exposure from leaving port 4369 open is
information that rabbitmq is running on the host, and the complexity
of adjusting this to only bind on localhost is high, we remove the
setting which does not address the problem, and document that the port
is left open, and should be protected via system-level or
network-level firewalls.

[1]: https://bugs.launchpad.net/ubuntu/+source/erlang/+bug/1374109
[2]: https://github.com/erlang/otp/issues/4820
2022-01-25 01:46:51 +00:00
Alya Abbott 669010494e portico: Update contributor count from 700 to 1000.
Note: I did not check whether we have numbers other than 700 that also
need to be updated.
2022-01-24 12:41:49 -08:00
Anders Kaseorg a58a71ef43 Remove Ubuntu 18.04 support.
As a consequence:

• Bump minimum supported Python version to 3.7.
• Move Vagrant environment to Debian 10, which has Python 3.7.
• Move CI frontend tests to Debian 10.
• Move production build test to Debian 10.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-21 17:26:14 -08:00
Alex Vandiver be1c4c2bd8 docs: Mention Camo does not use a local Smokescreen in the proxies docs.
This documents the new behaviour in d328d3dd4d.
2022-01-21 15:57:27 -08:00
Eeshan Garg 9e1fd26125 docs: Make general improvements to our billing docs.
With a few wording tweaks from tabbott to the "Upgrading a Zulip
organization" section.
2022-01-21 14:21:02 -08:00
rht 42f46a78e9 docs: Fix grammar problems found by LanguageTool.
With tweaks to security-model.md by tabbott to expand the SSO acronym.

Ignored, but still needs discussion on whether we should exclude this
rule:

```
The word ‘install’ is not a noun.
  ✗ ...ble to connect to the client during the install process:  So you'll need to shut down a...
                                               ^^^^^^^
  ✓ ...ble to connect to the client during the installation process:  So you'll need to shut down a...
  A_INSTALL: a/the + install

The word ‘install’ is not a noun.
  ✗ ...detected at install time will cause the install to abort. If you already have PostgreSQ...
                                               ^^^^^^^
  ✓ ...detected at install time will cause the installation to abort. If you already have PostgreSQ...
  A_INSTALL: a/the + install
```
2022-01-21 14:02:14 -08:00
Alya Abbott ca311e83c8 docs: Fix typos in GSoC guide. 2022-01-21 13:38:30 -08:00
Alya Abbott 19154f81c0
docs: Clarify purpose of zulip-announce. 2022-01-19 15:34:24 -08:00
Eeshan Garg fd303e3b1b docs: Update the release checklist for our PyPI packages. 2022-01-19 13:59:13 -08:00
Alex Vandiver 5f237cb34e puppet: Document that upgrades from Git require 3GB.
The step of rebuilding static assets using webpack requires more than
2G of RAM.
2022-01-19 12:36:44 -08:00
Abhijeet Prasad Bodas 12eb52221b docs: Rename `contributing/gsoc-ideas` -> `contributing/gsoc`.
This page contains a lot of other material related to GSoC than
just project ideas.
We would also want to add a redirect from the old URL to the new
one from the RTD admin page.
2022-01-19 11:35:33 -08:00
Alya Abbott 3659d95092 developer docs: Update GSoC documentation. 2022-01-18 21:16:18 -08:00
Rishabh-792 177931a23d doc: Fix typos in accessibility doc.
Hyphenated open source to open-source.

Capitalized aXe to Axe.
2022-01-11 15:41:08 -08:00
Rishabh-792 6cd8e088e9 doc: Fix typo in zulipbot-usage doc.
Fix a spelling mistake in the zulipbot doc.
2022-01-11 15:40:06 -08:00
Rishabh-792 1ec018d237 doc: Fix typos in code reviewing doc.
Made some spelling and grammatical changes.
2022-01-11 15:40:05 -08:00
Alex Vandiver d328d3dd4d puppet: Allow routing camo requests through an outgoing proxy.
Because Camo includes logic to deny access to private subnets, routing
its requests through Smokescreen is generally not necessary.  However,
it may be necessary if Zulip has configured a non-Smokescreen exit
proxy.

Default Camo to using the proxy only if it is not Smokescreen, with a
new `proxy.enable_for_camo` setting to override this behaviour if need
be.  Note that that setting is in `zulip.conf` on the host with Camo
installed -- not the Zulip frontend host, if they are different.

Fixes: #20550.
2022-01-07 12:08:10 -08:00
Alex Vandiver 2c5fc1827c puppet: Standardize what values are bools, and what true is.
For `no_serve_uploads`, `http_only`, which previously specified
"non-empty" to enable, this tightens what values are true.  For
`pgroonga` and `queue_workers_multiprocess`, this broadens the
possible values from `enabled`, and `true` respectively.
2022-01-07 12:08:10 -08:00
Anders Kaseorg 1696144df7 docs: Consistently hyphenate “self-host” and “self-service”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-01-05 16:21:35 -08:00
Alex Vandiver 6218ed91c2 puppet: Use lazy-apps and uwsgi control sockets for rolling reloads.
Restarting the uwsgi processes by way of supervisor opens a window
during which nginx 502's all responses.  uwsgi has a configuration
called "chain reloading" which allows for rolling restart of the uwsgi
processes, such that only one process at once in unavailable; see
uwsgi documentation ([1]).

The tradeoff is that this requires that the uwsgi processes load the
libraries after forking, rather than before ("lazy apps"); in theory
this can lead to larger memory footprints, since they are not shared.
In practice, as Django defers much of the loading, this is not as much
of an issue.  In a very basic test of memory consumption (measured by
total memory - free - caches - buffers; 6 uwsgi workers), both
immediately after restarting Django, and after requesting `/` 60 times
with 6 concurrent requests:

                      |  Non-lazy  |  Lazy app  | Difference
    ------------------+------------+------------+-------------
    Fresh             |  2,827,216 |  2,870,480 |   +43,264
    After 60 requests |  3,332,284 |  3,409,608 |   +77,324
    ..................|............|............|.............
    Difference        |   +505,068 |   +539,128 |   +34,060

That is, "lazy app" loading increased the footprint pre-requests by
43MB, and after 60 requests grew the memory footprint by 539MB, as
opposed to non-lazy loading, which grew it by 505MB.  Using wsgi "lazy
app" loading does increase the memory footprint, but not by a large
percentage.

The other effect is that processes may be served by either old or new
code during the restart window.  This may cause transient failures
when new frontend code talks to old backend code.

Enable chain-reloading during graceful, puppetless restarts, but only
if enabled via a zulip.conf configuration flag.

Fixes #2559.

[1]: https://uwsgi-docs.readthedocs.io/en/latest/articles/TheArtOfGracefulReloading.html#chain-reloading-lazy-apps
2022-01-05 14:48:52 -08:00
BIKI DAS 42dd58cffe
docs: Fix a few typos in documentation. 2021-12-28 09:36:59 -08:00
BIKI DAS c1134a8bda
docs: Fix "should should" typo. 2021-12-28 09:19:04 -08:00
Anders Kaseorg 1d3520db12 webhooks: Remove space from UptimeRobot.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 68c99511a2 webhooks: Fix TeamCity capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 65868b09eb webhooks: Add missing space in Review Board.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg c02c053ec3 webhooks: Fix Mailchimp capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg cd8a01587b webhooks: Fix Jotform capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 3ca2f8ca1e webhooks: Fix Clubhouse capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 517ddbc9e6 setup-advanced: Remove misleading python3 symlink suggestion.
One should never have to manually symlink things in /usr/bin,
especially with -f.  That should be managed by the system package
manager.  Indeed, on CentOS 7 and 8, one can simply install the
python3 package and get a working /usr/bin/python3.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-16 17:30:04 -08:00
Alya Abbott aaf1258de2 developer docs: Tweak ToS for push notifications wording. 2021-12-14 14:13:34 -08:00
Tim Abbott ee77c6365a portico: Use /help/ style pages for displaying policies.
This replaces the TERMS_OF_SERVICE and PRIVACY_POLICY settings with
just a POLICIES_DIRECTORY setting, in order to support settings (like
Zulip Cloud) where there's more policies than just those two.

With minor changes by Eeshan Garg.
2021-12-10 17:56:12 -08:00
Alex Vandiver 01e8f752a8 puppet: Use certbot package timer, not our own cron job.
The certbot package installs its own systemd timer (and cron job,
which disabled itself if systemd is enabled) which updates
certificates.  This process races with the cron job which Zulip
installs -- the only difference being that Zulip respects the
`certbot.auto_renew` setting, and that it passes the deploy hook.
This means that occasionally nginx would not be reloaded, when the
systemd timer caught the expiration first.

Remove the custom cron job and `certbot-maybe-renew` script, and
reconfigure certbot to always reload nginx after deploying, using
certbot directory hooks.

Since `certbot.auto_renew` can't have an effect, remove the setting.
In turn, this removes the need for `--no-zulip-conf` to
`setup-certbot`.  `--deploy-hook` is similarly removed, as running
deploy hooks to restart nginx is now the default; pass
`--no-directory-hooks` in standalone mode to not attempt to reload
nginx.  The other property of `--deploy-hook`, of skipping symlinking
into place, is given its own flog.
2021-12-09 13:47:33 -08:00
Eeshan Garg 3bab91079f external links: Migrate the rest of /developer-community links.
We recently changed /developer-community to /development-community.
Now that this change is in production, we can also migrate the
external links in our ReadTheDocs documentation.
2021-12-09 12:14:26 -08:00
Alex Vandiver cb2d0ff32b postgresql: Support replication on PostgreSQL >= 11, document.
PostgreSQL 11 and below used a configuration file names
`recovery.conf` to manage replicas and standbys; support for this was
removed in PostgreSQL 12[1], and the configuration parameters were
moved into the main `postgresql.conf`.

Add `zulip.conf` settings for the primary server hostname and
replication username, so that the complete `postgresql.conf`
configuration on PostgreSQL 14 can continue to be managed, even when
replication is enabled.  For consistency, also begin writing out the
`recovery.conf` for PostgreSQL 11 and below.

In PostgreSQL 12 configuration and later, the `wal_level =
hot_standby` setting is removed, as `hot_standby` is equivalent to
`replica`, which is the default value[2].  Similarly, the
`hot_standby = on` setting is also the default[3].

Documentation is added for these features, and the commentary on the
"Export and Import" page referencing files under `puppet/zulip_ops/`
is removed, as those files no longer have any replication-specific
configuration.

[1]: https://www.postgresql.org/docs/current/recovery-config.html
[2]: https://www.postgresql.org/docs/12/runtime-config-wal.html#GUC-WAL-LEVEL
[3]: https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-HOT-STANDBY
2021-12-03 16:32:41 -08:00
Emilio López baea14ee57 docs: Clarify use of `loadbalancer.ips` when using a reverse proxy.
When Zulip is run behind one or more reverse proxies, you must
configure `loadbalancer.ips` so that Zulip respects the client IP
addresses found in the `X-Forwarded-For` header. This is not
immediately clear from the documentation, so this commit makes it more
clear and augments the existing examples to showcase this need.

Fixes: #19073
2021-12-03 13:59:31 -08:00
Alex Vandiver ab8be84b36 docs: Secret reading is done using RawConfigParser, not ConfigParser.
ConfigParser makes `%` signs require escaping, which is why it is not
used in Zulip, particularly for secrets.
2021-12-02 15:25:04 -08:00
Alex Vandiver 54d037f24a version: Update version and changelog after 4.8 release. 2021-12-01 23:42:11 +00:00
AEsping f6c4f17900 dev docs: Update Jinja translation tag link.
Updates the link to Jinja i18n extension for auto-translation.
2021-11-30 14:36:29 -08:00
AEsping 828313b54a dev docs: Update Jinja translation tag link.
Updates the link to Jinja i18n extension for auto-translation.
2021-11-30 14:36:29 -08:00
AEsping 704c9609ee dev docs: Update Tig link.
Updates the link to the Tig git visualizer.
2021-11-30 14:36:29 -08:00
AEsping 11f2575c31 dev docs: Update "Solo" link.
Fixes the link to "El adveribo <<solo>> y los pronombres
demonstrativos, sin tilde."
2021-11-30 14:36:29 -08:00
AEsping 510b8867a6 dev docs: Update Neil Green link in the reading list.
Fixes the link to the Neil Green presentation on TypeScript
vs Coffee Script vs ES6.

This is a change from slides to a video becasue the slides are
no longer available.
2021-11-30 14:36:29 -08:00
AEsping 55f9178506 dev docs: Update Black link.
Updates the link to the editior integration for Black.
2021-11-30 14:36:29 -08:00
AEsping 5410009a88 prod docs: Update BBB configuration link.
Updates the Big Blue Button customization link for
extracting shared secrets.
2021-11-30 14:36:29 -08:00
Mateusz Mandera 8c1a6f4bba docs: Suggest updating settings.py in OIDC instructions.
OIDC config features a get_secret call (so it requires adding an import)
as well as having a bunch of its instructions in the form of comments on
the various keys of the config dict - thus users should really update
settings.py to fetch all of that.
2021-11-29 15:52:52 -08:00
Alex Vandiver 0ae375e0f9 ci: Test upgrades from the latest minor release. 2021-11-25 08:00:34 -08:00
AEsping 6ad1c5c8ed docs:: Update GSoC application tips.
- Add missing link for GitHub.
- Fix broken links to Matt Ringel's blog post.
- Add link to Julia Evans blog post.
- Add section heading for "Questions Are Important."
- Rearrange some content to fit with new section heading.

With additional tweaks from tabbott:
* Avoid linking to chat.zulip.org not via our documentation.
* Avoid the CZO abbreviation.
2021-11-23 16:05:33 -08:00
Alex Vandiver b982222e03 camo: Replace with go-camo implementation.
The upstream of the `camo` repository[1] has been unmaintained for
several years, and is now archived by the owner.  Additionally, it has
a number of limitations:
 - It is installed as a sysinit service, which does not run under
   Docker
 - It does not prevent access to internal IPs, like 127.0.0.1
 - It does not respect standard `HTTP_proxy` environment variables,
   making it unable to use Smokescreen to prevent the prior flaw
 - It occasionally just crashes, and thus must have a cron job to
   restart it.

Swap camo out for the drop-in replacement go-camo[2], which has the
same external API, requiring not changes to Django code, but is more
maintained.  Additionally, it resolves all of the above complaints.

go-camo is not configured to use Smokescreen as a proxy, because its
own private-IP filtering prevents using a proxy which lies within that
IP space.  It is also unclear if the addition of Smokescreen would
provide any additional protection over the existing IP address
restrictions in go-camo.

go-camo has a subset of the security headers that our nginx reverse
proxy sets, and which camo set; provide the missing headers with `-H`
to ensure that go-camo, if exposed from behind some other non-nginx
load-balancer, still provides the necessary security headers.

Fixes #18351 by moving to supervisor.
Fixes zulip/docker-zulip#298 also by moving to supervisor.

[1] https://github.com/atmos/camo
[2] https://github.com/cactus/go-camo
2021-11-19 15:58:26 -08:00