There is no reason that the base node access method should be run
under supervisor, which exists primarily to give access to the `zulip`
user to restart its managed services. This access is unnecessary for
Teleport, and also causes unwanted restarts of Teleport services when
the `supervisor` base configuration changes. Additionally,
supervisor does not support the in-place upgrade process that Teleport
uses, as it replaces its core process with a new one.
Switch to installing a systemd configuration file (as generated by
`teleport install systemd`) for each part of Teleport, customized to
pass a `--config` path. As such, we explicitly disable the `teleport`
service provided by the package.
The supervisor process is shut down by dint of no longer installing
the file, which purges it from the managed directory, and reloads
Supervisor to pick up the removed service.
Puppet _always_ sets the `+x` bit on directories if they have the `r`
bit set for that slot[^1]:
> When specifying numeric permissions for directories, Puppet sets the
> search permission wherever the read permission is set.
As such, for instance, `0640` is actually applied as `0750`.
Fix what we "want" to match what puppet is applying, by adding the `x`
bit. In none of these cases did we actually intend the directory to
not be executable.
[1] https://www.puppet.com/docs/puppet/5.5/types/file.html#file-attribute-mode
A number of autossh connections are already left open for
port-forwarding Munin ports; autossh starts the connections and
ensures that they are automatically restarted if they are severed.
However, this represents a missed opportunity. Nagios's monitoring
uses a large number of SSH connections to the remote hosts to run
commands on them; each of these connections requires doing a complete
SSH handshake and authentication, which can have non-trivial network
latency, particularly for hosts which may be located far away, in a
network topology sense (up to 1s for a no-op command!).
Use OpenSSH's ability to multiplex multiple connections over a single
socket, to reuse the already-established connection. We leave an
explicit `ControlMaster no` in the general configuration, and not
`auto`, as we do not wish any of the short-lived Nagios connections to
get promoted to being a control socket if the autossh is not running
for some reason.
We enable protocol-level keepalives, to give a better chance of the
socket being kept open.
The `needrestart` tool added in 22.04 is useful in terms of listing
which services may need to be restarted to pick up updated libraries.
However, it prompts about the current state of services needing
restart for *every* subsequent `apt-get upgrade`, and defaulting core
services to restarting requires carefully manually excluding them
every time, at risk of causing an unscheduled outage.
Build a list of default-off services based on the list in
unattended-upgrades.
This check loads Django, and as such must be run as the zulip user.
Repeat the same pattern used elsewhere in nagios, of writing a state
file, which is read by `check_cron_file`.
Our current EC2 systems don’t have an interface named ‘eth0’, and if
they did, this script would do nothing but crash with ImportError
because we have never installed boto.utils for Python 3.
(The message of commit 2a4d851a7c made
an effort to document for future researchers why this script should
not have been blindly converted to Python 3. However, commit
2dc6d09c2a (#14278) was evidently
unresearched and untested.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The homedir of a user cannot be changed if any processes are running
as them, so having it change over time as upgrades happen will break
puppet application, as the old grafana process under supervisor will
effectively lock changes to the user's homedir.
Unfortunately, that means that this change will thus fail to
puppet-apply unless `supervisorctl stop grafana` is run first, but
there's no way around that.
In the event that extracting doesn't produce the binary we expected it
to, all this will do is create an _empty_ file where we expect the
binary to be. This will likely muddle debugging.
Since the only reason the resourfce was made in the first place was to
make dependencies clear, switch to depending on the External_Dep
itself, when such a dependency is needed.
The default in the previous commit, inherited from camo, was to bind
to 0.0.0.0:9292. In standalone deployments, camo is deployed on the
same host as the nginx reverse proxy, and as such there is no need to
open it up to other IPs.
Make `zulip::camo` take an optional parameter, which allows overriding
it in puppet, but skips a `zulip.conf` setting for it, since it is
unlikely to be adjust by most users.
The upstream of the `camo` repository[1] has been unmaintained for
several years, and is now archived by the owner. Additionally, it has
a number of limitations:
- It is installed as a sysinit service, which does not run under
Docker
- It does not prevent access to internal IPs, like 127.0.0.1
- It does not respect standard `HTTP_proxy` environment variables,
making it unable to use Smokescreen to prevent the prior flaw
- It occasionally just crashes, and thus must have a cron job to
restart it.
Swap camo out for the drop-in replacement go-camo[2], which has the
same external API, requiring not changes to Django code, but is more
maintained. Additionally, it resolves all of the above complaints.
go-camo is not configured to use Smokescreen as a proxy, because its
own private-IP filtering prevents using a proxy which lies within that
IP space. It is also unclear if the addition of Smokescreen would
provide any additional protection over the existing IP address
restrictions in go-camo.
go-camo has a subset of the security headers that our nginx reverse
proxy sets, and which camo set; provide the missing headers with `-H`
to ensure that go-camo, if exposed from behind some other non-nginx
load-balancer, still provides the necessary security headers.
Fixes#18351 by moving to supervisor.
Fixeszulip/docker-zulip#298 also by moving to supervisor.
[1] https://github.com/atmos/camo
[2] https://github.com/cactus/go-camo