zulip

Commit Graph

Author	SHA1	Message	Date
Tim Abbott	4cd3fd234c	puppet: Add supervisord configuration for feedback-bot. (imported from commit c7deece3e48d59de856393a4a6b7929757bc1c7c)	2013-02-05 14:27:56 -05:00
Tim Abbott	f5b44cf349	nagios: Add monitoring for zmirror subscriptions syncing. (imported from commit 2e4ae2c35d589f14b57758cd68a58f8b49b7ecf3)	2013-02-05 14:27:56 -05:00
Tim Abbott	3c6dc21b05	Add pagerduty_nagios.cfg to git. (imported from commit 2f7110d5ab65893afcb83e6f38944bf065abedff)	2013-02-01 14:50:28 -05:00
Tim Abbott	26aece90b8	nagios: Enable the Nagios commands feature. This allows us to in particular reschedule a Nagios check to run immediately, which I've in the past found super useful when trying to figure out whether we actually fixed a problem. Unfortunately, Nagios config sucks and there's no easy way to create a group containing all of us as people able to issue commands; you have to list them in like 8 different places. (imported from commit 2c1e53330eff1e47e09d0b1917136f101d64e86a)	2013-02-01 14:50:28 -05:00
Tim Abbott	1fe6045288	nagios: add check that process_user_activity is running. This fixes trac #670, and also adds the "-u humbug" parameter on the other check_procs run, since that is a good practice move to help avoid the check counting its parent process as one of the matches. (imported from commit 43ae9b4863ba67579a21c86a910b73019f85a538)	2013-02-01 14:50:28 -05:00
Tim Abbott	2dd2bc8759	nagios: Make default contact_groups not page. This will help us avoid making things accidentally pageable. Also, explicitly set contact_groups for all our services, to help encourage making explicit decisions about which new items are pageable. (imported from commit 740c6550d4a7091e58681435eeb7aaabf98df75c)	2013-02-01 14:50:28 -05:00
Keegan McAllister	6990260b59	[manual] Minify JavaScript and CSS in production. Manual deployment steps: The same Nginx reload as for "Get rid of the static-access-control mechanism". If deploying both commits at once, just do it once. (imported from commit dd8dbbf14b95fce0a4b6f66f462fa0a6b50bfb8c)	2013-01-31 15:41:01 -05:00
Keegan McAllister	5e9b0ba79d	[manual] Get rid of the static-access-control mechanism. We will minify our code, rather than trying to restrict who can see the un-minified code. Removing access control first simplifies things. Manual deployment steps: run "scp servers/puppet/files/nginx/humbug-include/app root@staging.humbughq.com:/etc/nginx/humbug-include/", then "ssh root@staging.humbughq.com service nginx reload", and then the same for app.humbughq.com once deployed to prod. (imported from commit 63788aa3fa7ba5fd97fcf85b05760abb5e7cae4b)	2013-01-31 15:34:12 -05:00
Leo Franchi	6e9b8d895c	Add munin plugin for send-receive timing (imported from commit e2ae0775379ce59ab43213e68ade4d3f88b578e6)	2013-01-31 13:02:57 -05:00
Jessica McKellar	14d0ec1096	nagios: add several postgres checks. (imported from commit 5440b2b14d5db11fa9794fe4bcb86a1d6fe90b5d)	2013-01-30 10:55:35 -05:00
Jessica McKellar	a5337033b7	nagios: add a send-receive delay check. (imported from commit ed58f49440fc1e8175ea02eb5d1b0ae8b53472f0)	2013-01-30 10:55:35 -05:00
Zev Benjamin	726ba8dad9	Make Postgres have a log prefix more like what pgFouine requires. We'll still need a conversion script, but it should be easy. pgFouine requires a log prefix of '%t [%p]: [%l-1] '. We instead use '%m [%c]: [%l-1] ' which contains strictly more data. Specifically, "%m" is "%t" (time) but with milliseconds and "%c" is "%p" (pid) but with the process start time. (imported from commit a0bb583b563bdea0ca19b8b21677df0b9a18092a)	2013-01-28 16:21:42 -05:00
Jessica McKellar	767bf16c1c	Hack up paths to be able to import both the API and Django model. (imported from commit ca89d6bf6208455db4b636198737698ffe575698)	2013-01-24 13:36:11 -05:00
Luke O'Malley	61843b8645	nagios: Add plugin to watch the latency for a message roundtrip. (imported from commit 75888fa4f7ceedb4a95e9b6c4012c32e106ee1ad)	2013-01-24 13:36:11 -05:00
Tim Abbott	2be39640d3	Add postgres config for new frontend. (imported from commit 0b67ec1cb2c4b06d85d875c14154dd3e453f05c2)	2013-01-17 22:08:39 -05:00
Keegan McAllister	c9a555b605	Nginx: Drop caching directive for /static. This might fix problems where users were running old code even after reloading. (imported from commit dedc4d513f884aa2bafa0c7cc7a817d6715b48a0)	2013-01-16 15:03:40 -05:00
Luke Faraone	d0a5d7f7e2	Serve static content in /dist on app (imported from commit b5850ee1f6c6663a27fee14f430f1fae7b690725)	2013-01-15 19:10:09 -05:00
Keegan McAllister	6d7ef69cda	nginx: Add config for plant.humbughq.com (imported from commit e90b8e350014b49de53bfd5640442060672e691d)	2013-01-11 17:41:11 -05:00
Keegan McAllister	56660f30f8	nginx: Factor out shared parts of app / staging config (imported from commit e00d5eec1bc58754db6e97935bc803fe3a4fe291)	2013-01-11 17:39:51 -05:00
Keegan McAllister	ef6a5220c8	nginx: Remove unused config humbug-dev (imported from commit 178a320bf56076c61f4010bf6cb89ba04798b4a4)	2013-01-11 17:39:48 -05:00
Jessica McKellar	9730a65f59	nagios: revamp check_user_zephyr_mirror_liveness to monitor sudden drops in mirror use. (imported from commit e92df66c40065584e84c049cfab8d82f71d6dddd)	2013-01-08 10:53:33 -05:00
Jessica McKellar	0655397536	Give the NTP check the default number of retries. It had a max_check_attempts of 1, which makes it susceptible to network blips. (imported from commit 20e51878d75bef36d02c5afaab78b8cdd701077f)	2013-01-08 10:53:33 -05:00
Jessica McKellar	62284f39f4	nagios: monitor feedback bot liveness. (imported from commit 64a97e74b8a44bf0a6faf97398f843d8209b8e36)	2013-01-08 10:53:32 -05:00
Jessica McKellar	5d7b64993b	nagios: Add monitoring for clock skew. (imported from commit 1db47e7c6b28c9dd119e4c50309867d52d3c294b)	2013-01-03 10:21:16 -05:00
Jessica McKellar	ee0b01b8a3	puppet: munin: Document the manual SSH tunnel setup required. The full documentation, referenced in the config file, is at https://wiki.humbughq.com/Deployment%20process/components#munin. (imported from commit b7f989accb2ee8c5f400e68bf7a7491115a7d0b3)	2013-01-02 17:41:50 -05:00
Jessica McKellar	9083b0f184	puppet: Add munin and munin-node config files. (imported from commit fa9d7b191fe89894f61f4fd15cb7382663e34837)	2013-01-02 17:41:50 -05:00
Jessica McKellar	d8cd78ec85	nagios: Add and make the default contact a PagerDuty group. (imported from commit 6ab1fd777f3ec7804e6b4f31eaa5efad51993f1a)	2013-01-02 17:41:50 -05:00
Jessica McKellar	cfad014596	nagios: Do check_user_zephyr_mirror_liveness as user humbug. That user has the necessary database certs. (imported from commit 2f0778a1c5ca5259143b8e7ab25b557a6ddd76df)	2013-01-02 17:41:49 -05:00
Zev Benjamin	a40b5da432	puppet: Use PostgreSQL's internal logging system. This also requires disabling logrotate for postgres log files. (imported from commit eeedb87a4f488829c59eddecc041654e762d6d0e)	2013-01-02 16:56:57 -05:00
Zev Benjamin	779191b30e	puppet: Add postgres server configuration files (imported from commit bbfe6e9246a9a172a48c4cf8257d32936de009f9)	2013-01-02 16:56:57 -05:00
Tim Abbott	a2f26f1106	Nagios: Fix retry interval of zephyr_mirror_forwarding check. (imported from commit eae984669dad0a2dd6779092e9759909fbbd1da7)	2012-12-19 11:21:47 -05:00
Zev Benjamin	1aa825e6d0	puppet: Add generic nagios monitoring for postgres.humbughq.com (imported from commit 9e732b69580bc3da8507a5fe6fdd81f044fb4443)	2012-12-13 11:30:02 -05:00
Zev Benjamin	dc6d48611d	puppet: Accept traffic on port 5432 (postgresql) (imported from commit bf30d0af2377209f3d5c10add3a526a1fee28dd8)	2012-12-13 11:30:02 -05:00
Jessica McKellar	375f8e3540	nagios: disable flap detection. This will ensure that we always get state change alerts, even when the service is changing states frequently. (imported from commit 57fa5a941dd1a6042eb782dbac2fed0e4cb934ba)	2012-12-11 10:22:52 -05:00
Keegan McAllister	d8b4cefccb	nagios: Remove AllowOverride AuthConfig. We don't use it. (imported from commit 875148e24e0de2815737b6bc03eeb7f1cb8d770d)	2012-12-03 17:54:16 -05:00
Keegan McAllister	2cf49c4ff2	nagios: Go straight to the service detail page. This bypasses the side navigation frame, but I think said frame currently provides negative value. (imported from commit b067d546e4a7fb95e7de2a35be7e7f947c7a0da1)	2012-12-03 17:54:16 -05:00
Keegan McAllister	d435f29308	Add X-Frame-Options header on nagios, trac, wiki. Prevents clickjacking attacks. (imported from commit 8b3872e607d8a4e714c280a3226465fde0d5a6ed)	2012-12-03 17:54:16 -05:00
Keegan McAllister	7c495d7232	Move the nagios Apache authentication directives to a <Location> block. Following the trac Apache config. (imported from commit 01e773f2361d85f45f190f6ade2510b84a2f88ee)	2012-12-03 17:54:15 -05:00
Keegan McAllister	41319fe820	Rework the nagios Apache config as a proper vhost. This also adds HSTS. Based on the trac Apache config. Fixes #435. Suggested viewing: git show -w (imported from commit e7e9fe74687b88497ddb21f74febfc7fdf9b1979)	2012-12-03 17:54:15 -05:00
Keegan McAllister	a9c16b38ce	Fix up whitespace in Apache configs (imported from commit 605253abf9b029e18774f80979d23c60ffca034b)	2012-12-03 17:54:15 -05:00
Keegan McAllister	922b44a1da	Add iptables config for zmirror.humbughq.com. For now we allow all UDP traffic. I'll look into doing something clever. This isn't puppetized, either. (imported from commit bdf53df87a5f6c8af6d950b25946b5ec8a4f910b)	2012-12-03 17:43:04 -05:00
Keegan McAllister	ed0cb0a5f8	Puppetize nginx.conf. Fixes #201. (imported from commit 0feaff372d94009fa51dabf2bda55062826e2ed5)	2012-12-03 15:58:16 -05:00
Keegan McAllister	4aa7615234	Nginx: Use $host instead of $server_name. The latter is just the first name in the 'server_name' directive. The former uses the HTTP Host header, if provided. This fixes the redirect from http://zephyr.humbughq.com to https://zephyr.humbughq.com (imported from commit be47b05f4f055bb2d1d82aebbe155579f49c538d)	2012-11-30 17:12:42 -05:00
Keegan McAllister	500a5e29c3	Nginx: Redirect unknown hostnames to https://humbughq.com (imported from commit f6dd65c1db033d09f1df8f0a5972f067f3aeb80a)	2012-11-30 15:32:32 -05:00
Keegan McAllister	ac18c533c8	Nginx: Serve the cert for zephyr.humbughq.com rather than app.humbughq.com. This will cause SSL errors for anyone still using the deprecated app.humbughq.com name, which we concluded is (almost?) nobody. (imported from commit 7f3c149a4064e7bdae8ec944f2bb8a482df6f90d)	2012-11-30 15:32:32 -05:00
Keegan McAllister	2fcb9cfd49	Nginx: Make zephyr.humbughq.com an alias for humbughq.com (imported from commit d23ef5aeed990a04f294b7dffe322b8d174c1f07)	2012-11-30 15:32:32 -05:00
Keegan McAllister	0f20150a81	Nagios: move /var/lib/nagios/humbug-api to /usr/local/lib/humbug (imported from commit ff3ff1e3cc54a4c556479e62e058002229143627)	2012-11-26 16:58:51 -05:00
Keegan McAllister	d7b3afef6b	Send Nagios alerts to Humbug. Fixes #385. (imported from commit 7dac013debd6ccff031fc4da0dd7185e198b4498)	2012-11-26 14:42:55 -05:00
Jessica McKellar	be27ec1ad4	nagios: Change zephyr mirror liveness check to only care about aggregate statistics. Too many individual users occasionally don't update their mirrors, causing us to be permanently alerting; we have sufficient user notification at this point (plus Waseem keeping an eye on /activity) that we don't need to alert on individual users. We do, however, still care if something happens (say, Linerva going down) that causes many users' mirrors to go down. (imported from commit 392952c95739e183d4a711120e3a963671cec289)	2012-11-26 10:31:29 -05:00
Keegan McAllister	75526a2c67	nagios: Drop ssh -o StrictHostKeyChecking=no. This is bad for security. I've checked that all currently known hosts for nagios@nagios.humbughq.com match one of our existing servers. When adding servers to nagios in the future, it will be necessary to do an initial manual ssh from nagios@ and check the host key fingerprint. (imported from commit adfd1d29f03343d4be04e87c5e26a018f31e5194)	2012-11-26 00:25:15 -05:00

85 Commits