This reverts commit 211232978f. The
`rabbitmq` user does not exist yet on first install, and the goal is
to create the `rabbitmq-env.conf` file before the package is
installed.
In production, the `wildcard-zulipchat.com.combined-chain.crt` file is
just a symlink to the snakeoil certificates; but we do not puppet that
symlink, which makes new hosts fail to start cleanly. Instead, point
explicitly to the snakeoil certificate, and explain why.
Directives in `location` blocks may or may not inherit from
surrounding `location` blocks; specifically, `add_header` directives
do not[1]:
> There could be several add_header directives. These directives are
> inherited from the previous configuration level if and only if there
> are no add_header directives defined on the current level.
In order to maintain the same headers (including, critically,
`Access-Control-Allow-Origin`) as the surrounding block, all
`add_header` directives must thus be repeated (which includes the
`include`).
For clarity, un-nest and repeat the entire `location` block as was
used for `/static/`, but with the additional `add_header`. This is
preferred to the of an `if $request_uri` statement to add the header,
as those can have unexpected or undefined results[2].
[1] http://nginx.org/en/docs/http/ngx_http_headers_module.html#add_header
[2] https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/
Redis is not nagios, and this only leads to confusion as to why there
is a nagios domain setting on frontend servers; it also leaves the
`redis0` part of the name buried in the template.
Switch to an explicit variable for the redis hostname.
This is more broadly useful than for just Kandra; provide
documentation and means to install Smokescreen for stand-alone
servers, and motivate its use somewhat more.
This means that in steady-state, `zulip-puppet-apply` is expected to
produce no changes or commands to execute. The verification step of
`setup-apt-repo` is quite fast, so this cleans up the output for very
little cost.
These optimizations only makes sense when all connections at a TCP
level are coming from the same host or set of hosts; as such, they
are only enabled if `loadbalancer.ips` is set in the `zulip.conf`.
This is required for unattended upgrades to actually run regularly.
In some distributions, it may be found in 20auto-upgrades, but placing
it here makes it more discoverable.
We haven't actively used this plugin in years, and so it was never
converted from the 2014-era monitoring to detect the hostname.
This seems worth fixing since we may want to migrate this logic to a
more modern monitoring system, and it's helpful to have it correct.
79931051bd allows outgoing emails from
localhost, but outgoing recipients are still subjected to virtualmaps.
This caused all outgoing email from Zulip with destination addresses
containing `.`, `+`, or starting with `mm`, to be redirected back
through the email gateway.
Bracket the virualmap addresses used for local delivery to the mail
gateway with a restriction on the domain matching the
`postfix.mailname` configuration, regex-escaped, so those only apply
to email destined for that domain.
The hostname is _not_ moved from `mydestination` to
`virtual_alias_domains`, as that would preclude delivery to
actually-local addresses, like `postmaster@`.
We run this tool at DEBUG log level in production, so we will still
see the notice on startup there; this avoids a spammy line in the
development environment output..
`wal-g wal-push` has a known bug with occasionally hanging after file
upload to S3[1]; set a rather long timeout on the upload process, so
that we don't simply stall forever when archiving WAL segments.
[1] https://github.com/wal-g/wal-g/issues/656
Logging `Host` is useful for determining access patterns to realms,
especially if ROOT_DOMAIN_LANDING_PAGE is set. Total response time is
useful in debugging access and performance patterns.
These are respected by `urllib`, and thus also `requests`. We set
`HTTP_proxy`, not `HTTP_PROXY`, because the latter is ignored in
situations which might be running under CGI -- in such cases it may be
coming from the `Proxy:` header in the request.
This provides a single reference point for all zulip.conf settings;
these mostly link out to the more complete documentation about each
setting, elsewhere.
Fixes#12490.
There is only one PostgreSQL database; the "appdb" is irrelevant.
Also use "postgresql," as it is the name of the software, whereas
"postgres" the name of the binary and colloquial name. This is minor
cleanup, but enabled by the other renames in the previous commit.
The "voyager" name is non-intuitive and not significant.
`zulip::voyager` and `zulip::dockervoyager` stubs are kept for
back-compatibility with existing `zulip.conf` files.
This moves the puppet configuration closer to the "roles and profiles
method"[1] which is suggested for organizing puppet classes. Notably,
here it makes clear which classes are meant to be able to stand alone
as deployments.
Shims are left behind at the previous names, for compatibility with
existing `zulip.conf` files when upgrading.
[1] https://puppet.com/docs/pe/2019.8/the_roles_and_profiles_method
This also removes direct includes of `zulip::common`, making
`zulip::base` gatekeep the inclusion of it. This helps enforce that
any top-level deploy only needs include a single class, and that any
configuration which is not meant to be deployed by itself will not
apply, due to lack of `zulip::common` include.
The following commit will better differentiate these top-level deploys
by moving them into a subdirectory.
Relying on `defined(Class['...'])` makes the class sensitive to
resource evaluation ordering, and thus brittle. It is also only
functional for a single service (thumbor).
Generalize by using `purge => true` for the directory to automatically
remove all un-managed files. This is more general than the previous
form, and may result in additional not-managed services being removed.
Restarting servers is what can cause service interruptions, and
increase risk. Add all of the servers that we use to the list of
ignored packages, and uncomment the default allowed-origins in order
to enable unattended upgrades.
d2aa81858c replaced the `apt::source` to set up debathena with
`Exec['setup-apt-repo-debathena']`, but mistakenly left the
`apt::source` in place in `zmirror` (but not `zmirror_personals`).
The `apt::source` resource type was later removed in c9d54f7854,
making the manifest to apply on `zmirror`.
Remove the broken and unnecessary `apt::source` resource.
This property is not related to the base zulip install; move it to
zulip::postgres_common, which is already used as a namespace for
various postgres variables.
There was likely more dependency complexity prior to 97766102df, but
there is now no reason to require that consumers explicitly include
zulip::apt_repository.
Use https://github.com/stripe/smokescreen to provide a server for an
outgoing proxy, run under supervisor. This will allow centralized
blocking of internal metadata IPs, localhost, and so forth, as well as
providing default request timeouts (10s by default).
We should eventually add templating for the set of hosts here, but
it's worth merging this change to remove the deleted hostname and
replace it with the current one.
Disabled on webservers in 047817b6b0, it has since lingered in
configuration, as well as running (to no effect) every minute on the
loadbalancer.
Remove the vestiges of its configuration.
This banner shows on lb1, advertising itself as lb0. There is no
compelling reason for a custom motd, especially one which needs to
be reconfigured for each host.
Since this was using repead individual get() calls previously, it
could not be monitored for having a consumer. Add it in, by marking
it of queue type "consumer" (the default), and adding Nagios lines for
it.
Also adjust missedmessage_emails to be monitored; it stopped using
LoopQueueProcessingWorker in 5cec566cb9, but was never added back
into the set of monitored consumers.
The rabbitmq cron jobs exist in order to call rabbitmqctl as root and
write the output to files that nagios can consume, since nagios is not
allowed to run rabbitmqctl.
In systems which do not have nagios configured, these every-minute
cron jobs add non-insignificant load, to no effect. Move their
installation into `zulip_ops`. In doing so, also combine the cron.d
files into a single file; this allows us to `ensure => absent` the old
filenames, removing them from existing systems. Leave the resulting
combined cron.d file in `zulip`, since it is still of general utility
and note.
The configuration change made in 1c17583ad5 only allowed delivery to
those specific Zulip addresses. However, they also prevent the
mailserver from being used as an outgoing email relay from Zulip,
since all mail that passed through the mailserver (from any
originator) was required to have a `RCPT TO` that matched those
regexes.
Allow mail originating from `mynetworks` to have an arbitrary
addresses in `RCPT TO`.
Use the validation of the tornado sharding config that
`stage_updated_sharding` does, by depending on it. This ensures that
we don't write out a supervisor or nginx config based on a
bad (e.g. non-sequential) list of tornado ports.
Fingerprinting the config is somewhat brittle -- it requires either
custom bootstrapping for old (fingerprint-less) configs, and may have
false-positives.
Since generating the config is lightweight, do so into the .tmp files,
and compare the output to the originals to determine if there are
changes to apply.
In order to both surface errors, as well as notify the user in case a
restart is necessary, we must run it twice. The `onlyif`
functionality cannot show configuration errors to the user, only
determine if the command runs or not. We thus run the command once,
judging errors as "interesting" enough to run the actual command,
whose failure will be verbose in Puppet and halt any steps that depend
on it.
Removing the `onlyif` would result in `stage_updated_sharding` showing
up in the output of every Puppet run, which obscures the important
messages it displays when an update to sharding is necessary.
Removing the `command` (e.g. making it an `echo`) would result in
removing the ability to report configuration errors. We thus have no
choice but to run it twice; this is thankfully low-overhead.