110 Commits

Author SHA1 Message Date
Zev Benjamin 431acdb186 munin: Rename postgres to postgres0
(imported from commit 07c324b5b7e0579e7b97b07c4fdf55f0e66f131c)
2013-07-17 14:34:00 -04:00
Zev Benjamin b4a208445b Run check_postgres.pl against the correct database
We were previously running it against the 'postgres' database, which
meant we weren't actually checking the non-clusterwide statistics.

(imported from commit a6be529b16d5f1927463e49a7f7f4cf0b5299213)
2013-07-17 14:34:00 -04:00
Luke Faraone 1f811133d1 Serve static /dist/ content on app servers when hostname zulip.com is used
(imported from commit cc78ffafdffe5df2baf08bdd70a219dbb694337d)
2013-07-15 16:49:55 -04:00
Luke Faraone bb0a7c8fc3 [manual] Switch various configuration files to refer to .zulip.net.
We only want to change cases where we're talking about the hostname; HTTP
requests should still go to staging.humbughq.com for now.

Before this commit is deployed the hostname of staging.humbughq.com should
be changed to staging.zulip.net on the VM.

(the same for prod)

(imported from commit 7412530773f720ac227f40061c9ddb1a851e19bb)
2013-07-15 16:49:55 -04:00
Luke Faraone 9bef61ad87 Interpret X-Forwarded-For on app servers' nginx.
See:
    http://nginx.org/en/docs/http/ngx_http_realip_module.html#set_real_ip_from

(imported from commit adc4ebf46aefd1c71bda187d84519d8c31f6c590)
2013-07-15 16:49:55 -04:00
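To illustrate what commit 9bef61ad87 above asks nginx to do, here is a minimal Python sketch of how a server might choose the client address when the connecting peer is a trusted proxy. The trusted network range, the function name, and the sample addresses are all hypothetical. It models only the simple non-recursive case (take the last address in the header) and is not nginx's implementation.

```python
import ipaddress

# Hypothetical trusted proxy range (e.g. a load balancer subnet); not taken from the commit.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def real_client_ip(peer_addr, x_forwarded_for):
    """Return the address to treat as the client.

    If the connecting peer is not a trusted proxy, or the header is missing
    or unparseable, the peer address is returned unchanged. Otherwise the
    last address listed in X-Forwarded-For is returned.
    """
    peer = ipaddress.ip_address(peer_addr)
    if not any(peer in net for net in TRUSTED_PROXIES):
        return peer_addr
    if not x_forwarded_for:
        return peer_addr
    candidates = [part.strip() for part in x_forwarded_for.split(",") if part.strip()]
    if not candidates:
        return peer_addr
    last = candidates[-1]
    try:
        ipaddress.ip_address(last)  # validate only; the string form is returned
    except ValueError:
        return peer_addr
    return last
```

The header is only honoured when the peer is trusted, which is why the trusted range matters: otherwise any client could spoof its address by sending its own X-Forwarded-For.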
Luke Faraone 44b49b3bf8 Puppet configuration and associated nginx files for lb0.zulip.net.
lb0.zulip.net will proxy connections to the relevant backend servers.

Depressingly, SSL certificate verification of the backend servers is not
performed at this time, see:
    <http://trac.nginx.org/nginx/ticket/13>

The above-mentioned bug has existed since 2011, but a CVE was not
allocated until January. The nginx developers don't seem to care. Sigh.

In any case, this is of somewhat limited impact at Humbug, since we can
have reasonable confidence that communications within AWS are not
subject to active MITMs. Passive MITM is not a concern, because the
traffic *is* in fact encrypted.

(imported from commit c96e1235fc17192c7452e0417a1309cfcda62de2)
2013-07-15 16:49:55 -04:00
Luke Faraone ebde5ab341 Switch to logging module instead of syslog.
(imported from commit 4c2c2f0f23e2688ce916d33d0cf513e386dca70c)
2013-07-15 16:49:54 -04:00
Luke Faraone 4843303267 Automatically configure iptables and routing for secondary interfaces.
This is a horrible hack.

(imported from commit 01dca4514f01f7ad419d735b8879a25a999b552e)
2013-07-15 16:49:54 -04:00
Luke Faraone 0696a3fbd7 Automatically configure all interfaces (including virtual!) at boot
On EC2-VPC we have the ability to attach multiple addresses to one
interface, and multiple interfaces to one machine.

We should configure those interfaces whenever our system boots, and
ideally whenever networking is restarted.

This commit adds a script that is executed once eth0 is brought up that
proceeds to configure all subsequent interfaces, real and virtual.

The script is configured to be installed (along with the helper script
that calls it) on all systems via Puppet.

(imported from commit fdc153ef649edbb8fedd40ff4d77262aae593c39)
2013-07-15 16:49:54 -04:00
Leo Franchi 6a61c8d237 [manual] Change Humbug to Zulip in Sparkle, and start with 0.3.4
This requires a puppet apply on prod

(imported from commit 6890146fd5330acd1c5cbac5609191f332ebca4a)
2013-07-15 13:31:15 -04:00
Leo Franchi 2a5e53eaec [manual] Update desktop apps to 0.3.3
This requires a puppet apply on prod

(imported from commit aba8004684de70772d2ddd31a563b3650c4cbd9b)
2013-07-05 16:41:26 -04:00
Luke Faraone 1be1cb121c nginx / Puppet configuration for staging.zulip.com
We create a new sites-available entry which is essentially a duplicate of
sites-available/humbug-staging with s/humbug/zulip, and add the associated
symlink directive in Puppet.

(imported from commit febcb585ce93c21c6849d96458cc2bd096b30538)
2013-07-02 12:04:56 -04:00
Leo Franchi 975e13a1b8 Update sparkle to our 0.3.0 release
(imported from commit bd02d67fbd13d709b579f93a69d625da5517eec7)
2013-07-02 10:40:12 -04:00
Leo Franchi 7036915933 Add windows sparkle files
(imported from commit b7c0770acd34f44e961014a00d2059dfc7bef701)
2013-07-01 16:25:35 -04:00
Tim Abbott 3bdd446651 puppet: Fix nginx configuration for api.humbughq.com.
(imported from commit d8b535b666a3b3d758a62812a118413c619c09a5)
2013-06-28 15:57:28 -04:00
Tim Abbott ea8a80603a [manual] Change API URLs to be based on api.humbughq.com/api.
This must be deployed after we update our running nginx configuration
to serve api.humbughq.com.

(imported from commit b5c34ebdd595f55eecd6dca6a18a37f105107bd5)
2013-06-28 15:57:27 -04:00
Scott Feeney 83cd963c49 Remove unused imports
(imported from commit 9e3050c72a2d1137b9096c6cfa1c3945341b9a56)
2013-06-27 16:22:39 -04:00
Zev Benjamin 6f874995ff [schema] Use custom stopwords file for full text search
This stop words file is just the default Postgres english stop file
with all the rest of the letters of the alphabet added.  Adding the
extra letters ensures that, e.g., "bed" doesn't get transformed into
"bed | b".

(imported from commit 0be3ef9a43eb524ed4f081d5081a786cf602c487)
2013-06-27 14:18:53 -04:00
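The stop words change in commit 6f874995ff above can be sketched roughly as follows: take a default list and append every single letter it doesn't already contain. This is only an illustration. The function name and the tiny sample list are hypothetical stand-ins, not the actual Postgres english stop file or the script used in the commit.

```python
import string


def build_stopwords(default_words):
    """Return the default stop words plus every single letter not already present."""
    words = list(default_words)
    seen = set(words)
    for letter in string.ascii_lowercase:
        if letter not in seen:
            words.append(letter)
            seen.add(letter)
    return words


# A tiny stand-in for the default english stop list, for illustration only.
sample_default = ["a", "i", "the", "and"]
stopwords = build_stopwords(sample_default)

# Postgres stop word files list one word per line.
stopwords_file_text = "\n".join(stopwords) + "\n"
```

With every letter treated as a stop word, a stray single-letter token such as "b" is dropped rather than searched for.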
Tim Abbott 400db86008 [manual] nginx: Pass post-rewrite URIs to FastCGI.
This requires us to do a puppet apply when it is deployed to each of
staging and prod.

(imported from commit eed631ce10340e7fe3252cd8a4f05fd59ef3c942)
2013-06-25 16:34:43 -04:00
Tim Abbott ae89b25d69 nginx: Add fastcgi_params to puppet.
(imported from commit 12e6b02cd2cb411ab83a29a486053df6dff9ebb8)
2013-06-25 16:34:43 -04:00
Zev Benjamin 15d13f8f40 puppet: Add script for doing Postgres base backups and purging old backups
(imported from commit 93a92729b2e964e054aa1af7bcb8a0bae3fd1b33)
2013-06-21 14:08:57 -04:00
Zev Benjamin 33b3b1fa62 puppet: Switch which S3 bucket we back up Postgres to
The old bucket was versioned and didn't allow deletes.  This was
great for paranoia, but not so great for being able to delete old
backups.

(imported from commit be79b5c582ca5ee466cdfea6d3093b6d5ba0e23d)
2013-06-21 14:08:57 -04:00
Zev Benjamin 1b6514b89f puppet: Use the correct Postgres archive command
I hadn't changed it previously out of paranoia, in case we had a
faulty failover and ended up with two masters both uploading to the same place.
However, I now don't think this can happen, as recovery completion
will cause Postgres to start a new timeline.

(imported from commit d58f1aa306eff4f6fd950664ff658539c1249bdf)
2013-06-21 14:08:57 -04:00
Zev Benjamin bf82fadc95 puppet: Move /tmp to local storage on Postgres master servers
(imported from commit eae0a31faad6d95c8e2b55c11481aa19d7e108f2)
2013-06-21 14:08:57 -04:00
Luke Faraone 6bd3886406 Don't pass along client locale settings when sshing in to our servers
(imported from commit d25f2a47b60c1ac7e4dcbd4a0133d0c0c9698b4e)
2013-06-18 17:20:48 -04:00
Leo Franchi 23322a791d puppet: Add sparkle configuration files
(imported from commit e36efd64584d946bb13fb5b44af817e85345e197)
2013-06-18 16:12:14 -04:00
Tim Abbott 261300d10e puppet: Add Nagios crontab to puppet.
(imported from commit 353b167b303b27ccbfc0cd0130665399faab80dc)
2013-06-17 13:48:06 -04:00
Tim Abbott d3d5334a55 puppet: Import pagerduty_nagios.pl into puppet.
(imported from commit 1b91524498372d3e69f07468e4635c4d66c44d85)
2013-06-17 13:48:06 -04:00
Tim Abbott 5c388ed28e puppet: Run our wiki out of supervisord.
(imported from commit a8f6d14ce55de0e7458496f9debb15529120deaf)
2013-06-17 13:48:06 -04:00
Tim Abbott 4d31e5d79e puppet: Increase memcached memory limit to 512MB.
(imported from commit 152c2545a3337fb1d6794a41c63c4d0b148adecc)
2013-06-17 13:48:05 -04:00
Zev Benjamin a9e4441bee [manual] Serve static files from the same location across prod deploys
This only affects DEPLOYED installations.

This does not take care of removing old versions of static files from
that directory.  The problem is that staticfiles is clever and
doesn't copy files that are already there, so we can't depend on
mtime for detecting which files we no longer need.  Hopefully that
won't be too much of a problem for now.

(imported from commit 4341460dd5bc6544086fd445014ebdac58192910)
2013-06-12 17:46:38 -04:00
Leo Franchi 113180b7b7 nagios: Don't page about load/disk levels on non-critical servers.
Add a pageable_servers and not_pageable_servers hostgroup, and only page for
app/postgres/zmirror.

(imported from commit 15c286324e942bd38e2a600a3b9091044f117e28)
2013-06-05 10:20:56 -04:00
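The paging split described in commit 113180b7b7 above amounts to sorting hosts into two hostgroups by role. A minimal sketch, assuming hosts are named after their role with an optional numeric suffix (that naming rule, the function name, and the sample hostnames are assumptions, and the real change lives in Nagios configuration rather than Python):

```python
# Roles that should page, per the commit message.
PAGEABLE_ROLES = ("app", "postgres", "zmirror")


def hostgroup_for(hostname):
    """Classify a host by the role prefix of its short hostname."""
    short_name = hostname.split(".", 1)[0]
    # Assumption: a trailing number only distinguishes instances of a role.
    role = short_name.rstrip("0123456789")
    if role in PAGEABLE_ROLES:
        return "pageable_servers"
    return "not_pageable_servers"
```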
Tim Abbott efcf88a707 puppet: Fix paths in feedback-bot configuration.
(imported from commit e9407af884dc75490de5168e067453e77aa612d7)
2013-06-04 19:48:13 -04:00
Tim Abbott cd65aea287 Add our trac configuration to puppet.
(imported from commit 8a9cf825344cdf83e8233f15ba66bbf050c920e4)
2013-06-04 19:48:13 -04:00
Leo Franchi 8cc0a9b4f9 [manual] Require redis-server to be installed on our servers
This requires `redis-server` to be installed. Check it is installed before
deploying this commit. It also requires 'python-redis' to be installed.

(imported from commit e3434a04456e596f6c84c1a3c289a00aa7cbb2ed)
2013-06-04 09:43:09 -04:00
Leo Franchi f9a99192df Add supervisor conf file for stats
(imported from commit e9104676e714dc36050fef50cabe8386b6c52e4d)
2013-06-03 16:16:22 -04:00
Luke Faraone 742d3bb511 Move check_send_receive.py to the nagios plugins directory, renaming it.
For consistency, and because nobody could think of a reason to have it live
in bots/ with a symlink.

(imported from commit def372653fcdde2805729134fec9d4bc3ce294ec)
2013-05-29 15:36:47 -04:00
Luke Faraone 8570f5fe55 [manual] Configure prod to use our wildcard cert.
These changes can be applied with "puppet apply".

(imported from commit 999611539e81f452dd605bb98f70436737747c29)
2013-05-29 15:36:47 -04:00
Zev Benjamin d92d62412f [manual] Use humbug-deployments/current as the CWD for supervisor processes
Some of our code uses the CWD, so we have to set it.

The config file needs to be copied over.

(imported from commit cec991ccbffddf7ea4d1ec8471377221ddd7c669)
2013-05-29 14:13:39 -04:00
Zev Benjamin 6824c94b7e [manual] Remove dependence on /home/humbug/humbug git checkout on app frontends
Modified files need to be copied into the right place.  The checkout
on git.humbughq.com also needs to be updated.

(imported from commit dbe9e05a0512e1f59c7819dd8d44c2c4e9c83bcf)
2013-05-29 12:00:03 -04:00
Luke Faraone 08ad49184a Switch memcached user to "nobody" to match production.
(imported from commit 849ac9c1d7d6f06447b22e1c1ed2495f8c59943c)
2013-05-28 18:39:08 -04:00
Michael McCanna 0e77082873 [manual] Bump Nginx buffers, don't use fastcgi temp files
Nginx's fastcgi buffers default to 8 pages (32KB). I've bumped them to 4MB,
as responses to queries like get_old_messages are around 130KB and were
being ferried off to disk. In case this change to the buffer parameters isn't
enough, we explicitly set the maximum temporary file size to 0; if a fastcgi
response goes over the allocated buffers, it will be handled synchronously
and never go out to disk.

The manual step that must be done is to apply changes to /etc/nginx/humbug-include/app
from servers/puppet/modules/humbug/files/nginx/humbug-include/app.
The nginx process can be restarted with `/etc/init.d/nginx restart`.
This must be done for both staging and prod.

(imported from commit 99c1bd6989c54b7e230b7c04f2fdf09be7423352)
2013-05-28 18:13:45 -04:00
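The sizing argument in commit 0e77082873 above is simple arithmetic, sketched here in Python. This is a simplified model that only compares a response size with the total buffer space. The function name is hypothetical, and the split of the 4MB total into a buffer count and size is an assumption, since the commit message gives only the total.

```python
KB = 1024
MB = 1024 * KB


def fits_in_buffers(response_bytes, buffer_count, buffer_size):
    """True if the response fits within the in-memory buffers (simplified model)."""
    return response_bytes <= buffer_count * buffer_size


response = 130 * KB  # approximate size quoted in the commit message

# Default quoted in the commit: 8 pages of 4KB = 32KB.
default_fits = fits_in_buffers(response, 8, 4 * KB)

# Hypothetical split of the 4MB total: 1024 buffers of 4KB each.
bumped_fits = fits_in_buffers(response, 1024, 4 * KB)
```

Under this model a 130KB response overflows the 32KB default but fits comfortably in 4MB.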
Zev Benjamin cce8dfab84 [manual] Use the same socket across server restarts
We let supervisor create the socket for us by making humbug-django a
fcgi-program.  Unfortunately, supervisor doesn't support putting
fcgi-programs in groups (see
https://github.com/Supervisor/supervisor/issues/148), so we have to
restart tornado and django separately.

To deploy, copy the config files over and restart nginx and
supervisor (via stopping and then starting it because restart is
broken).  I believe the automated restart as part of
update-deployment will fail because of the way supervisor treats
programs in groups.  If so, after restarting supervisor, you will
also need to run restart-server manually to fill the caches and then
delete the lock directory in humbug-deployments.

(imported from commit bfb5db7dd42dcbc4bfefa2944355b3cbb2ef9104)
2013-05-23 00:19:17 -04:00
Zev Benjamin 8fd72a09bc Restart Django and Tornado separately from the other worker processes
The amount of process downtime during a supervisord-mediated restart
appears to be linear in the number of processes that are being
restarted.  Therefore, restarting just Django and Tornado causes less
downtime than doing them at the same time as the other worker
processes.

(imported from commit 1fa9ef547bcd88caeec49800664e37d5f2fcb7a8)
2013-05-21 16:13:39 -04:00
Zev Benjamin de3ba5a038 puppet: Replace postgres2 with postgres1 in pg_hba.conf
(imported from commit 2d8654f9382df7473ec12caf2067ef0af5fef791)
2013-05-20 23:55:03 -04:00
Leo Franchi 2fcc7c0c5c Fix aggregation rules to sum at correct frequency
(imported from commit a8a27c417ae6e9cc8a6c383313da27ff6d2e875f)
2013-05-20 23:55:03 -04:00
acrefoot 9d8f847fed [manual] Run server using supervisord
This change will make it so that processes related to the app.humbughq.com
server are run under supervisord, which uses a state machine model to ensure
that programs are running. It also ensures process startup order.

We will need to manually switch from the old way of running the server (in screen) to
this new way of doing things, on both staging and prod (app_frontend.pp has been
updated appropriately). This means:
1) cp servers/puppet/modules/humbug/files/supervisord/conf.d/humbug.conf /etc/supervisord/conf.d
2) install the supervisor package
3) kill those while loops in that screen session
4) mkdir /var/log/humbug (as root)
5) /etc/init.d/supervisord start
6) check that nothing broke

(imported from commit 055269a70973db89acd69049e01b185fabdc8f90)
2013-05-20 23:42:28 -04:00
Leo Franchi 25b915fa6a Enable rabbitmq consumer checks on app
(imported from commit e3df8bc849dc0e1ae2e7782c0c9be5c08d4818c2)
2013-05-20 23:29:54 -04:00
Leo Franchi 3d4e239247 Check rabbitmq consumers for all important queues
(imported from commit 1279d33e3e1c36ee8da01859875d24b54e14e2e6)
2013-05-17 01:02:35 -04:00
Luke Faraone c3421b31b9 Include certificate configuration for www.humbughq.com via Comodo
This expires on Aug 11 23:59:59 2013 GMT.

I've set a calendar event for this :)

(imported from commit fb426b703c88dd255536e10285375dc997e47b01)
2013-05-17 01:02:32 -04:00
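As a complement to the calendar reminder in commit c3421b31b9 above, the remaining lifetime of the certificate can be computed from the quoted expiry string. This is a minimal sketch: the function names are hypothetical, and it assumes the timestamp is always in the format quoted in the commit message.

```python
from datetime import datetime, timezone

EXPIRY_TEXT = "Aug 11 23:59:59 2013 GMT"  # as quoted in the commit message


def parse_expiry(text):
    """Parse a timestamp in the quoted format into an aware UTC datetime."""
    naive = datetime.strptime(text, "%b %d %H:%M:%S %Y GMT")
    return naive.replace(tzinfo=timezone.utc)


def days_until_expiry(now, expiry_text=EXPIRY_TEXT):
    """Whole days remaining between `now` (an aware datetime) and the expiry."""
    return (parse_expiry(expiry_text) - now).days
```

A monitoring check could call days_until_expiry(datetime.now(timezone.utc)) and alert once the result drops below some threshold.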