Commit Graph

184 Commits

Author SHA1 Message Date
Tim Abbott b41c2d93d1 puppet: Exclude squashfs filesystems from nagios disk checks.
These generally aren't being written to.
2019-06-16 16:22:23 -07:00
Tim Abbott 0ec1b4e82c puppet: Move check_send_receive_time to the _once ruleset.
We don't actually want to run this bundle of message-sending Nagios
checks to run on every single server.
2019-06-16 15:48:35 -07:00
Tim Abbott df83979c76 zulip_ops: Extract a prod_app_frontend_once ruleset. 2019-06-16 15:48:35 -07:00
Tim Abbott 738cfe54c3 puppet: Move app_frontend_once out of prod configuration.
That logic made it inconvenient to run multiple prod servers with the
same top-level puppet configuration.
2019-06-16 15:24:20 -07:00
Tim Abbott e85250941d puppet: Fix quoting of commented-out python3-boto.
This will avoid a linter error if/when we uncomment it.
2019-06-13 14:39:24 -07:00
Tim Abbott 337efe0fb7 puppet: Remove puppet-el, which no longer exists.
This package was only every available on Ubuntu Xenial.
2019-06-13 14:39:24 -07:00
Vishnu Ks ecdd3bea43 billing: Add cron job to run invoice_plans once a day.
Fixes #11960
2019-04-29 11:23:17 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Anders Kaseorg 9f7c0b7e65 postgres_master.pp: Fix wacky su command line.
The construction `su postgres -c -- bash -c 'psql …'` didn’t behave the
way it reads, and only worked by accident:

1. `-c --` sets the command to `--`.
2. `bash` sets the first argument to `bash`.
3. `-c 'psql …'` replaces the command with `psql …`.

Thus, `su` ended up executing `<shell> -c 'psql …' bash`, where
`<shell>` is the `postgres` user’s login shell, usually also `bash`,
which then executed 'psql …' and ignored the extra `bash`.

Unconfuse this construction.

Note from tabbott: The old code didn't even work by accident, it was
just broken.  The right fix is to move the quoting around properly.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-12 17:27:23 -07:00
Anders Kaseorg 649235cfec python: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-22 16:54:36 -08:00
Anders Kaseorg c109690cf8 puppet: Remove unused Python imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:02:12 -08:00
rht 9ee2ee046a puppet: Use systemctl instead of pg_ctlcluster on CentOS. 2019-01-05 15:49:03 -08:00
Tim Abbott 047817b6b0 puppet: Disable log2zulip cron job.
It hasn't been working for years, but more importantly, it spams up
root's mail queue so that one can't find important things in there
(e.g. the fact that the long-term-idle cron job was failing).
2019-01-05 10:56:44 -08:00
Tim Abbott 2558f101af docs: Add documentation for `if False` mypy pattern in scripts.
This should help make it clear what's going on with these scripts.
2018-12-17 11:12:53 -08:00
rht d2aa81858c puppet/zulip_ops: Replace apt::source with setup-apt-repo-debathena.
Tweaked by tabbott to use a clearer name.
2018-12-11 13:02:56 -08:00
Tim Abbott b218c2a70e loadbalancer: Use same certbot cert for zulipstaging.com.
This is a simple configuration improvement.
2018-12-07 13:43:21 -08:00
Tim Abbott 467694c1fa nginx: Enable http2 in external nginx configuration.
This should be a nice performance improvement for browsers that
support it.

We can't yet enabled this in the Zulip on-premise nginx configuration,
because that still has to support Trusty.
2018-12-07 13:43:02 -08:00
Tim Abbott 5abf4dee92 nagios: Add new host groups for Tornado processes.
We also move all the existing Tornado monitoring rules to the
singletornado_frontends rule.
2018-11-06 16:33:18 -08:00
Tim Abbott 5f3b79c9e7 nagios: Fix tab-based whitespace. 2018-11-06 16:30:29 -08:00
Tim Abbott dc7d44a245 puppet: Don't run calculate-first-visible-message-id on most systems.
This should only be run on systems that are running zilencer, because
the cron job is part of the zilencer project.
2018-10-30 11:40:24 -07:00
Tim Abbott 2c7f9ce0fc puppet: Fix puppet-lint warnings in various manifests.
Apparently, `puppet-lint` on Ubuntu trusty throws warnings for certain
quoting patterns that are OK in modern `puppet-lint`.  I believe the
old Zulip code was actually correct (i.e. the old `puppet-lint`
implementation was the problem), but it seems worth changing anyway to
suppress the warnings.

We also exclude more of puppet-apt from linting, since it's
third-party code.
2018-08-28 13:46:31 -07:00
Tim Abbott b53a712856 nginx: Update configuration for using certbot certs everywhere. 2018-08-22 11:59:15 -07:00
Tim Abbott 90828297e4 puppet-lint: Enforce double_quoted_strings check.
This makes our puppet codebase more consistent by using single-quoted
strings consistently.
2018-08-13 12:31:19 -07:00
Tim Abbott d0b51b70f4 puppet-lint: Enforce 2sp_soft_tables puppet-lint check.
This cleans up the puppet codebase's whitespace formatting to be more
consistent.
2018-08-13 12:31:16 -07:00
Tim Abbott b26e0a957d puppet-lint: Enforce arrow_alignment check.
This fixes all exceptions in our puppet codebase to this lint rule.
2018-08-13 12:30:57 -07:00
Aditya Bansal 710d4507de puppet-lint: Fix lines longer than 140 characters lint warnings.
We fix these by adding ignore statements in a bunch of files
where this error popped up. We target only specific lines using
the ignore statements and not the entire files.
2018-08-07 10:03:40 -07:00
Anders Kaseorg edfd5ef992 setup_disks.sh: Fix shellcheck warnings.
In puppet/zulip_ops/files/postgresql/setup_disks.sh line 15:
array_name=$(mdadm --examine --scan | sed 's/.*name=//')
^-- SC2034: array_name appears unused. Verify use (or export if used externally).

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-08-03 09:15:26 -07:00
Anders Kaseorg 5a0fecc2d5 munin_plugins: Fix shellcheck warnings.
In puppet/zulip_ops/files/munin-plugins/rabbitmq_connections line 66:
echo "connections.value $(HOME=$HOME rabbitmqctl list_connections | grep -v "^Listing" | grep -v "done.$" | wc -l)"
                                                                                         ^-- SC2126: Consider using grep -c instead of grep|wc -l.

In puppet/zulip_ops/files/munin-plugins/rabbitmq_consumers line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_unacknowledged line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_uncommitted line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_queue_memory line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-08-03 09:15:08 -07:00
Tim Abbott 02ae71f27f api: Stop using API keys for Django->Tornado authentication.
As part of our effort to change the data model away from each user
having a single API key, we're eliminating the couple requests that
were made from Django to Tornado (as part of a /register or home
request) where we used the user's API key grabbed from the database
for authentication.

Instead, we use the (already existing) internal_notify_view
authentication mechanism, which uses the SHARED_SECRET setting for
security, for these requests, and just fetch the user object using
get_user_profile_by_id directly.

Tweaked by Yago to include the new /api/v1/events/internal endpoint in
the exempt_patterns list in test_helpers, since it's an endpoint we call
through Tornado. Also added a couple missing return type annotations.
2018-07-30 12:28:31 -07:00
Tim Abbott 07af59d4cc tornado: Split get_events_backend into two functions.
The lower-layer function, now called get_events_backend, is intended
to be called by multiple code paths (including the upcoming
get_events_internal).
2018-07-30 12:28:31 -07:00
Tim Abbott 63fe39e381 zulip_ops: Disable Ubuntu's built-in update-motd.d files.
We can't really do this in the zulip manifests (since it's sorta a
sysadmin policy decision), but these scripts can cause significant
load when Nagios logs into a server (because many of them take 50ms or
more of work to run).  So we just get rid of them.
2018-05-06 18:47:40 -07:00
Tim Abbott 4e8487c886 nagios: Bump maximum processes limits.
These seemed to be flapping for no good reason.
2018-05-02 11:12:47 -07:00
Tim Abbott 718492638b puppet: Fix name for dhcpcd5 package.
Apparently the name dhcpcd isn't installable.
2018-04-23 11:32:07 -07:00
Tim Abbott 35aa4f0377 puppet: Sort ensure attributes to be always first.
This inconsistency was flagged by puppet-lint.
2018-04-22 23:41:49 -07:00
Tim Abbott e103c2ff2d puppet: Switch to modern quoted, octal file modes.
This is one of the prerequisite tasks for Puppet 4 support.

Constructed using puppet-lint.
2018-04-22 23:30:48 -07:00
Tim Abbott 62b12e0c34 zulip_ops: Add missing dependency on dhcpcd. 2018-04-19 14:27:48 -07:00
neiljp (Neil Pilgrim) 090b47ed19 mypy: Add explicit Optional for default=None parameters in various files. 2018-03-28 12:31:51 -07:00
neiljp (Neil Pilgrim) f32f3cbf72 mypy: Amend zulip-ec2-configure-interfaces to avoid None. 2018-03-23 11:39:54 -07:00
Tim Abbott d98be2f19f puppet: Only run analytics Nagios checks on machine running cron.
Running this on additional machines would be redundant; additionally,
the FillState checker cron job runs only on cron systems, so this will
crash on other app frontends.
2018-03-06 13:38:27 -08:00
Tim Abbott 8e8faab006 puppet: Move clearsessions cron job to app_frontend_once.
While this is a different system than I'd written up in #8004, I think
this is a better solution to the general problem of cron jobs to run
on just one server.

Fixes #8004.
2018-03-06 13:35:51 -08:00
Tim Abbott 3ae645ed12 puppet: Rename analytics.pp to app_frontend_once.pp. 2018-03-06 13:35:51 -08:00
Tim Abbott 24b6106c9c puppet: Dsiable checking for evictions in memcached nagios.
Zulip's caching model for message history is such that it is normal
and healthy for there to eventually be a nontrivial volume of
evictions.
2018-03-06 13:34:02 -08:00
Greg Price 4475950ddf queue: Restore prematurely-cut upgrade path.
Revert c8f034e9a "queue: Remove missedmessage_email_senders code."
As the comment in the code says, it ensures a smooth upgrade path
from 1.7.x; we can delete it in master after 1.8.0 is released.
The removal commit was merged early due to a communication failure.
2018-02-28 11:15:53 -08:00
Umair Khan c8f034e9a0 queue: Remove missedmessage_email_senders code.
After 68513952fb, all emails are sent through email_senders queue.
This commit removes code related to the legacy queue.
2018-02-21 16:43:56 -08:00
Tim Abbott 005b0fb566 puppet: Clean up ssh authorized_keys configuration rules. 2018-02-09 16:37:03 -08:00
Tim Abbott aca25b6f0a puppet: Move ssh configuration to use notify.
This handles more correctly the case where we're using the upstream
sshd_config file.
2018-02-09 16:37:03 -08:00
Tim Abbott 486de8abfc puppet: Edit some rules to support chat.zulip.org.
This should make it possible to use the zulip_ops base rules
successfully on chat.zulip.org.  Many of the changes in this commit
are hacks and probably can be cleaned up later, but given that we plan
to drop trusty support soon, it's likely that most of them will simply
be deleted then.
2018-02-09 16:37:03 -08:00
Rishi Gupta 1d581a9c6e nagios: Add nagios check for analytics state.
This should help us detect issues where the analytics cron jobs aren't
running properly.

The cron/nagios part of the implementation done by tabbott.
2018-02-09 16:36:05 -08:00
Tim Abbott 9ed2a94b8c nagios: Add configuration designed for full-stack servers.
This doesn't yet pass all Nagios checks correctly, and still has a few
flaws:
* The ideal setup code for the `nagios` user in the database isn't included.
* Some of the other details are a bit off; we need to split some host roles.

But it's better than nothing, and we can iterate from here.
2018-01-24 14:16:03 -08:00
Tim Abbott 2365b13b68 puppet: Move postgres Nagios plugin to main postgres-common.
This plugins package is required in order to use Nagios checks to
verify the Zulip postgres database, and thus belongs in the default
package set.
2018-01-23 10:31:48 -08:00