Tim Abbott
6a5cb0e48c
puppet: Make problems with Zephyr mirroring pageable.
...
Generally this indicates sending messages is completely broken.
2017-10-12 00:16:32 -07:00
Tim Abbott
e6e7bcf6e1
nagios: Move camo_check_url into configuration.
2017-10-05 21:09:24 -07:00
Tim Abbott
96c3014da0
nagios: Automate configuration of outgoing email with msmtp.
...
Now we no longer need to check in a bunch of hostnames in order to
configure Nagios.
2017-10-05 20:29:47 -07:00
Tim Abbott
162eaf8917
nagios: Modify check for swap to allow no swap.
...
If a machine is intentionally configured with no swap, that
shouldn't be a Nagios problem. This alert is intended to flag
machines that are swapping.
2017-10-05 20:07:44 -07:00
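A minimal sketch of the behavior described in the commit above, not the actual Nagios plugin or the configuration change itself: a swap check that treats "no swap configured" as OK and only alerts when free swap runs low. The function name `check_swap`, its parameters, and the 50%/25% thresholds are hypothetical and chosen purely for illustration.

```python
from typing import Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_swap(
    total_kb: int,
    free_kb: int,
    warn_free_pct: float = 50.0,  # hypothetical threshold
    crit_free_pct: float = 25.0,  # hypothetical threshold
) -> Tuple[int, str]:
    """Return a (status, message) pair for the given swap figures."""
    # A host with no swap at all is a deliberate configuration, not a fault.
    if total_kb == 0:
        return OK, "SWAP OK - no swap configured"

    free_pct = 100.0 * free_kb / total_kb
    if free_pct < crit_free_pct:
        return CRITICAL, f"SWAP CRITICAL - {free_pct:.0f}% free"
    if free_pct < warn_free_pct:
        return WARNING, f"SWAP WARNING - {free_pct:.0f}% free"
    return OK, f"SWAP OK - {free_pct:.0f}% free"
```

The key design point is the early return for `total_kb == 0`: without it, a percentage-based check either divides by zero or reports 0% free and raises a spurious alert.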
Tim Abbott
886a8853ac
nagios: Move server-specific config into hostgroups.
...
These new hostgroups exist so we can eliminate explicit references to
individual hosts in services.cfg.
2017-10-05 20:06:48 -07:00
Tim Abbott
5193936bc3
nagios: Add Memcached and Redis monitoring.
...
These are standard Nagios plugins that might sometimes be helpful.
2017-10-05 20:06:16 -07:00
Tim Abbott
f7d554d533
nagios: Rename zmirror2 to zmirrorp in configuration.
...
The "p" stands for "personals", aka zephyr private messages, which is
what this host manages.
2017-10-05 20:06:08 -07:00
Tim Abbott
062d280914
puppet: Clean up unnecessary pagerduty_nagios.cfg.
2017-10-05 19:23:33 -07:00
Tim Abbott
6017d3dec5
puppet: Move contacts.cfg to be a template.
2017-10-05 19:23:33 -07:00
Tim Abbott
09aec3e467
puppet: Move hosts.cfg to be managed by a template.
2017-10-05 19:23:33 -07:00
Tim Abbott
5a80c029a2
nagios: Update path to sync_public_streams to match new config.
2017-10-05 13:34:27 -07:00
Reid Barton
ccb4c5c26f
bots: Move zephyr-related files to api/integrations/zephyr/.
2017-05-26 15:07:02 -07:00
Elliott Jin
0ec9e54954
bots: Add queue and QueueProcessingWorker for embedded bots.
2017-05-25 15:00:51 -07:00
vaibhav
8881b5eb9f
Outgoing Webhook System: Check for @-mentioned outgoing webhook bots.
...
Also puts them into a processing queue, though the queue processor
does nothing.
Rewritten by tabbott to avoid unnecessary database queries in
do_send_messages.
2017-05-02 09:22:04 -07:00
K.Kanakhin
6a801db1c2
missed-emails-sending: Move email sending to separate queue worker.
...
- Add new 'missedmessage_email_senders' queue for sending missed-message emails.
- Add the new worker to process the 'missedmessage_email_senders' queue.
- Split aggregating missed messages and sending missed-message emails
into separate queue workers.
- Adapt tests for sending missed-message emails to the new logic.
Fixes #2607
2017-03-07 20:08:40 -08:00
Tim Abbott
fa8045a484
puppet: Add websockets Nagios test to configuration.
...
Since browser clients send messages via websockets and not the API,
this is an important element in making sure mission-critical Zulip
functionality is working.
2017-02-08 11:13:19 -08:00
JefftheBest1
9de75f5167
Fixed typos with separate
2017-01-12 04:52:05 -08:00
Tim Abbott
3e32102016
nagios: Fix various critical issues not tagged as pageable.
2017-01-06 21:49:20 -08:00
Tim Abbott
93c2c19775
nagios: Increase process count limits.
2017-01-06 21:49:15 -08:00
Tim Abbott
9ab8e7ba34
nagios: Disable swap checks for servers with no swap.
2017-01-06 21:39:07 -08:00
Tim Abbott
3e01ed1f73
nagios: Increase NTP max_check_attempts.
...
NTP often suffers from brief interruptions of service that lead to
spurious Nagios alerts; it makes sense to suppress these.
2017-01-06 21:32:43 -08:00
Tim Abbott
65774e1c4f
zulip_ops: use check_postgres package from apt.
2017-01-06 21:18:55 -08:00
Tim Abbott
4fbe201187
puppet: Automate autossh process monitoring maintenance.
...
Previously, the Zulip Nagios configuration effectively hardcoded the
count for how many systems should have autossh connections.
2016-10-26 00:49:03 -07:00
Tim Abbott
0a5a2c4eda
nagios: Automate authorized users list maintenance.
2016-10-26 00:37:29 -07:00
Tim Abbott
d490e83645
puppet: Upgrade nagios cgi.cfg with modern defaults.
2016-10-26 00:31:41 -07:00
Tim Abbott
1159ad4857
puppet: Upgrade nagios.cfg with modern defaults.
2016-10-26 00:31:41 -07:00
Tim Abbott
73178e5e5a
puppet: Run check_send_receive_time via a cron job.
...
This allows the actual nagios work involved with
check_send_receive_time nagios checks to be done by an unprivileged
"nagios" user rather than the "zulip" user.
2016-10-26 00:26:52 -07:00
Tim Abbott
96cf330649
puppet: ssh as the nagios user instead of zulip user.
...
This is a follow-up to 4f58fef54b, touching services.cfg instead of
commands.cfg.
2016-10-26 00:23:47 -07:00
Tim Abbott
c3727c9886
nagios: Remove old zulip.com trac/git/replica servers.
...
These are unlikely to be relevant to anyone.
2016-10-26 00:21:53 -07:00
Tim Abbott
383f39b543
nagios: Enable allow_empty_hostgroup_assignment.
...
This fixes the configuration being broken when we remove some of the
old zulip.com hosts that are unlikely to be of interest to anyone.
2016-10-26 00:19:21 -07:00
Tim Abbott
4f58fef54b
zulip_ops: Use nagios user for all Nagios checks.
...
There's no reason these Nagios checks need to run as the
semi-privileged Zulip user.
2016-10-26 00:17:26 -07:00
Tim Abbott
32d244dbe5
puppet: Add Nagios checks for other consumers.
2016-10-26 00:11:08 -07:00
Tim Abbott
080dd8c987
nagios: Ignore kthreads in check_procs tests.
...
Modern Linux can have a lot of kernel threads not doing anything.
Since this isn't interesting from a monitoring perspective, we ignore
these.
2016-10-26 00:10:40 -07:00
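As a rough illustration of why the commit above excludes kernel threads from process counts, here is a sketch that counts processes while skipping them. It is not how the check_procs plugin is implemented. It assumes the common Linux convention that kernel threads are children of kthreadd (PID 2); the function name `count_user_processes` and the sample data are hypothetical.

```python
from typing import Iterable, Tuple

# Assumption: on Linux, kthreadd runs as PID 2 and parents the kernel threads.
KTHREADD_PID = 2


def count_user_processes(procs: Iterable[Tuple[int, int]]) -> int:
    """Count processes from (pid, ppid) pairs, ignoring kernel threads."""
    return sum(
        1
        for pid, ppid in procs
        if pid != KTHREADD_PID and ppid != KTHREADD_PID
    )


# Made-up process table: init, kthreadd, two kernel threads, two user processes.
sample = [(1, 0), (2, 0), (3, 2), (10, 2), (500, 1), (501, 500)]
user_count = count_user_processes(sample)
```

On a host with many idle kernel threads, a raw process count can exceed a fixed threshold even when nothing is wrong in user space, which is the noise this filtering avoids.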
Tim Abbott
f9ad75f58e
puppet: Remove configuration for old zulip.com bots host.
...
This configuration didn't do anything anyway and just clutters the
repo.
2016-10-26 00:01:29 -07:00
Tim Abbott
2227e77cce
puppet: Remove Dropbox usernames from Nagios config.
2016-10-25 23:55:42 -07:00
Tim Abbott
36e336edc3
puppet: Rename zulip_internal to zulip_ops.
...
The old "zulip_internal" name was from back when Zulip, Inc. had two
distributions of Zulip, the enterprise distribution in puppet/zulip/
and the "internal" SaaS distribution in puppet/zulip_internal. I
think the name is a bit confusing in the new fully open-source Zulip
world, so we're replacing it with "zulip_ops". I don't think the new
name is perfect, but it's better.
In the following commits, we'll delete a bunch of pieces of Zulip,
Inc.'s infrastructure that don't exist anymore and thus are no longer
useful (e.g. the old Trac configuration), with the goal of cleaning
the repository of as much unnecessary content as possible.
2016-10-16 19:23:27 -07:00