Commit Graph

223 Commits

Author SHA1 Message Date
Leo Franchi acec697fe7 Report unnarrow times as well as narrow times
(imported from commit b3a889aa11dc112508c5a1d213f68e5223a879fc)
2014-02-13 14:45:22 -05:00
Zev Benjamin 41e3a89398 [manual] puppet: Puppetize Munin
To deploy this, the zulip_internal::base and zulip_internal::munin classes must
be added to nagios.zulip.net.

(imported from commit 50d6a4ed19fcc9c62c7104977d69043bf5b9bbf9)
2014-02-13 13:26:40 -05:00
Leo Franchi 2efaf75b25 Release desktop app v0.4.3
(imported from commit 13f5b79ce483db22cfa136a1318eadc4d04eb746)
2014-02-12 16:18:34 -05:00
Zev Benjamin 32d66d6f73 [manual] Monitor the new redis servers with nagios and munin
We have to start the tunnels up manually and add them to the wiki

(imported from commit aa5f80630a651c3fb33bba321e9d4444b5c498a2)
2014-02-10 13:23:28 -05:00
Zev Benjamin 631783f3cd [manual] Use dedicated Redis server for staging
Before we deploy this commit, we must migrate the data from the staging redis
server to the new, dedicated redis server.  The steps for doing so are the
following:

* Remove the zulip::redis puppet class from staging's zulip.conf
* ssh once from staging to redis-staging.zulip.net so that the host key is known
* Create a tunnel from redis0.zulip.net to staging.zulip.net
  * zulip@redis0:~$ ssh -N -L 127.0.0.1:6380:127.0.0.1:6379 -o ServerAliveInterval=30 -o ServerAliveCountMax=3 staging.zulip.net
* Set the redis instance on redis0.zulip.net to replicate the one on staging.zulip.net
  * redis 127.0.0.1:6379> slaveof 127.0.0.1 6380
* Stop the app on staging
* Stop redis-server on staging
* Promote the redis server on redis0.zulip.net to a master
  * redis 127.0.0.1:6379> slaveof no one
* Do a puppet apply at this commit on staging (this will bring up the tunnel to redis0)
* Deploy this commit to staging (start the app on staging)
* Kill the tunnel from redis0.zulip.net to staging.zulip.net
* Uninstall redis-server on staging

The steps for migrating prod will be the same modulo s/staging/prod0/.

(imported from commit 546d258883ac299d65e896710edd0974b6bd60f8)
2014-02-10 13:23:28 -05:00
Zev Benjamin 1d7976d332 puppet: Add manifest for dedicated Redis server
(imported from commit 894ad5ca005de0fb9a64bfb58da374f72734eb8d)
2014-02-10 13:23:28 -05:00
Zev Benjamin 4d91bb39d3 [manual] puppet: Split out redis server configuration from app_frontend
The zulip::redis puppet class should be added to all our frontends' zulip.conf
after this is deployed.  No puppet apply is required.

(imported from commit ccea89f4779c6c49c0cbe837adcb5be21bfe55ab)
2014-02-10 13:23:28 -05:00
Luke Faraone c7565222f0 Fail fast if fqdn is not defined on Enterprise with Postfix
Otherwise, we won't be able to generate valid configuration files.

(imported from commit 5ec1a43fed5991dc609c470b596926a5febcd4c5)
2014-02-07 01:02:06 -05:00
Luke Faraone 602f7f96e5 Move postfix inclusion from public app_frontend to internal manifest
Otherwise, we will enable the postfix config on all frontends,
regardless of whether Enterprise deployments requested it.

(imported from commit 9592be3706adcee7547f6795f32fe7b8d85e71ee)
2014-02-07 01:01:33 -05:00
Luke Faraone 60cfd3cfb0 Accept SMTP connections on hosts.
(imported from commit 524ae3f4362ffea12ff96498ae554322f7fe8a3c)
2014-02-06 12:14:21 -05:00
Luke Faraone 24f8492236 [manual] Enable local email mirror on all frontends.
This removed the cronjob from all app_frontend servers and enables the
local Postfix mail server on the same.

This is a no-op on staging if the parent commit has already been
applied.

To deploy this commit, run a puppet-apply on prod.

(imported from commit 6d3977fd12088abcd33418279e9fa28f9b2a2006)
2014-02-06 10:26:56 -05:00
Luke Faraone 30a6fd3bd7 [manual] Enable postfix email mirror on staging
This will cause us to recieve messages sent to streams.staging.zulip.com
via the local Postfix daemon running on staging.

This commit does not impact prod. To deploy, a puppet-apply is needed on
staging.

(imported from commit 9eaedc28359f55a65b672a2e078c57362897c0de)
2014-02-04 10:38:17 -05:00
Luke Faraone 882047515c [manual] Move polling email mirror to prod from staging
This will allow us to roll out the Postfix-based mirror on staging in
the future without impacting production mirroring.

This branch should be puppet-deployed first on prod, then staging.

(imported from commit eceaa6c02a06f7074cacc19c6439e5928eef3ae4)
2014-02-04 10:38:17 -05:00
Luke Faraone 374acb7f24 [puppet] Move email mirror cron to public module
This way we can reference it in the documentation.

(imported from commit 37d5cbfcfb745e2b44768674f53d7ba450518cd0)
2014-02-04 10:38:17 -05:00
Luke Faraone de56b947d4 Remove unused postfix aliases file.
(imported from commit f40cb5b532aaf6421b9dd55a197644ecf65021a4)
2014-02-04 10:38:17 -05:00
Luke Faraone 38636d5125 Puppet configuration for postfix
(imported from commit 230325f6233c6d32ecab5f9fa3fc102373b22039)
2014-01-31 15:33:15 -05:00
Luke Faraone 760cd7a474 email-mirror: Run queue worker from supervisord
(imported from commit f496046bbc92b3d3b41aa15c3fbdd1d38556d6d0)
2014-01-31 15:33:15 -05:00
Luke Faraone 3263d09939 Convert zmirror to use puppet apt module for debathena sources
(imported from commit 080d59d2ac750d03b55460752d7fe7d02e72611c)
2014-01-31 13:43:04 -05:00
Luke Faraone aa52475e96 Switch to puppetlabs/apt
(imported from commit b2f581280dc7877051ef79d86eac671bfd455ace)
2014-01-31 13:43:04 -05:00
Tim Abbott 532cd061fb [puppet] Raise maximum items per page for trac.
(imported from commit 2ffa5e04c220a87d51cba42ade89874cc43ba584)
2014-01-29 17:22:19 -05:00
Tim Abbott 5108253e97 nagios: Make Zephyr mirroring alerts not pageable.
(imported from commit ab98af762b1edf93703fc865496aedc59ce7bd2d)
2014-01-24 13:53:48 -05:00
Zev Benjamin 759d33fad1 puppet: Check all disks via nagios, not just /
(imported from commit 0bc9fc150e791ce3ccec99688f3593a8678a87c9)
2014-01-23 13:37:27 -05:00
Tim Abbott 57c7634a4e Increase Zulip worker memory limits.
(imported from commit 6969eb1d2db0ee47c7b115b7f9b55ded2c9265dd)
2014-01-22 17:19:19 -05:00
Zev Benjamin c4e1d9f02a puppet: check_postgres_backup: Connect to the 'postgres' database
This allows the utility to run on trac.zulip.net, which doesn't have a 'zulip'
database.

(imported from commit c8eabb89e5e161191d6f2c92ca2b1428b17a9aa0)
2014-01-22 12:07:57 -05:00
Zev Benjamin 49f2657c8d nagios: Add check_postgres checks for the trac and wiki databases
We don't do the sequence check because that requires read access to the database
itself, which the zulip user doesn't have.

(imported from commit fba7604826353b2974e9757f01dcb426297993b3)
2014-01-22 12:07:56 -05:00
Zev Benjamin 3840cf760f nagios: Move a few services from hostgroup postgres -> hostgroup postgres_appdb
(imported from commit 54a738f19f176d36526d40968c379f6357d56e6b)
2014-01-22 12:07:56 -05:00
Zev Benjamin 1ae040c7fb nagios: Specify the db and user for check_postgres via arguments
(imported from commit c3b1a7fe7c63094ed8956ed1bdf4861d747637bd)
2014-01-22 12:07:56 -05:00
Zev Benjamin a974301b8b nagios: Add trac to the postgres_other hostgroup
(imported from commit 7e531b982b8f8961f2201cdc8b88d90d5d238907)
2014-01-22 12:07:56 -05:00
Zev Benjamin 41e274a8e4 nagios: Split postgres hostgroup into more fine-grained groups
(imported from commit ab5fcc0893fb8635defecdf3045a3ffdd5e26f14)
2014-01-22 12:07:56 -05:00
Leo Franchi e734155a1c Mount and make graphite backup drive when creating stats1
(imported from commit f8af032fa314812610d0ec7eb6227ebb0b3c2f32)
2014-01-22 10:49:49 -05:00
Luke Faraone 92ae790130 [manual] Switch listen address to www.humbughq.com for humbughq.com domains
We cannot use SNI for these legacy domains because old plugins still
connect to them.

This commit (along with the three previous commits) requires a lb0 nginx
deployment to function.

(imported from commit f47f3d7b597666508b3817d965fe8ce19d50c2c0)
2014-01-21 11:15:08 -05:00
Luke Faraone e852580a0e Use correct key for humbughq SAN cert.
This is live right now.

(imported from commit 051a44e2962557f3fc293e3e2f2e169a5d6e658c)
2014-01-21 11:15:07 -05:00
Luke Faraone c9158dd3d9 [manual] Use SNI cert instead of wildcard for humbughq
To deploy, the certs need to manually be copied to lb0's /etc/ssl/certs
directory, the nginx config updated, and the server restarted

(imported from commit c70c7678cd010a1b2b0aba830ab3d862005bd627)
2014-01-17 15:03:29 -05:00
Tim Abbott 7ce692b3c3 Restore serving the app on humbughq.
Partially reverts b1a8de8763

(imported from commit ddd9443d527f1e46f78008178b2410374551b8a6)
2014-01-17 15:03:29 -05:00
Luke Faraone 846be23ce2 Load SNI-enabled www.hhq.c cert
This replaces the old www.humbughq.com cert.

Contains these hostnames:
 * www.humbughq.com
 * api.humbughq.com
 * humbughq.com

Generated per 9d674d6a0.

(imported from commit 0ef3f0ff2a02996246868466b5e634ebf45439a2)
2014-01-17 15:03:16 -05:00
Luke Faraone ce50478a1e Move humbughq.com hosts to www.zulip.com IP
These are redirect hosts, so they don't need their own IP.  Supporting
non-SNI clients isn't a priority for us.

(imported from commit b1a8de8763ab944885518c868e4e30307d84c11d)
2014-01-16 15:56:16 -05:00
Luke Faraone 2c86c5c8ee Redirect humbughq domains to www.zulip.com per Waseem.
(imported from commit d5b8e8f33787d2a590516219ca4043b304b80a21)
2014-01-16 15:54:53 -05:00
Luke Faraone b6a2208d84 nginx configuration for customer29 on lb0
(imported from commit 7b6712e3e68aca71e81a6224af7d3f876af6ab1e)
2014-01-16 15:54:53 -05:00
Luke Faraone 8ebf0a414c Remove expired and unused SSL certificates
(imported from commit 7b058878183edc6cca593df6cd4b8cfeb15bab70)
2014-01-16 15:54:53 -05:00
Zev Benjamin 20e4e31dcf puppet: Update env-wal-e to take the S3 bucket to use from /etc/zulip/zulip.conf
This will let us do normal puppet applies on our postgres hosts again.

Crudini is already installed and /etc/zulip/zulip.conf has already been edited
on the relevant hosts.

(imported from commit 8e2b88d2fe2f7b2367ecb73a50a299200fe381a0)
2014-01-16 15:23:21 -05:00
Zev Benjamin ab1aafeb1c puppet: Add python-sqlalchemy dependency
(imported from commit 1ed6a8a730d368a97fad6cd478ec13e75504b789)
2014-01-14 11:47:12 -05:00
Zev Benjamin ef5ed9f9b9 puppet: Add postgresql-9.1-tsearch-extras dependency
Note that this change can not currently be applied on postgres hosts due to the
postgres puppet config currently being slightly broken.

(imported from commit 5d8ddeabfd9612d469a048256d22949c0bfa6aba)
2014-01-14 11:47:12 -05:00
Luke Faraone 16ae70948f Move python-googleapi dep to public Zulip manifest
(imported from commit 20298f82fbd674b3cf6b67b7741bf800b9733f36)
2014-01-13 16:24:21 -05:00
Luke Faraone 3948e1673d [manual] Accept OAuth2 tokens for API login via Google Apps
This is used by the Android app to authenticate without prompting for a
password.

To do so, we implement a custom authentication backend that validates
the ID token provided by Google and then tries to see if we have a
corresponding UserProfile on file for them.

If the attestation is valid but the user is unregistered, we return that
fact by modifying a dictionary passed in as a parameter. We then return
the appropriate error message via the API.

This commit adds a dependency on the "googleapi" module. On Debian-based
systems with the Zulip APT repository:
    sudo apt-get install python-googleapi

For OS X and other platforms:
    pip install googleapi

(imported from commit dbda4e657e5228f081c39af95f956bd32dd20139)
2014-01-13 13:30:55 -05:00
Leo Franchi 20f3b3af8f Fix zulip->zulip_internal puppet path change for apns checker
(imported from commit 1fd43a4f4907c24fcbbda73bbaf3cf092a6cace1)
2014-01-10 21:38:59 -05:00
Leo Franchi 91c54754fb [puppet] Add the apns-token crontab file to puppet
(imported from commit f12001453c9ca924c801a6000927e3ee2696a392)
2014-01-10 21:38:57 -05:00
Zev Benjamin c045644097 puppet: Run check_ntp_time against an NTP pool instead of time.mit.edu
MIT implemented NTP rate-limiting to defend against on-going reflection attacks,
which was causing our nagios checks to fail intermittently.  When the attacks
die down or when external sites fix their NTP configurations, checking against
time.mit.edu will stop failing.  However, there also isn't much of a reason to
stick with checking against a single server.

(imported from commit 2c2a1a04646b880b010cbb4b6d94016b1eccd1a0)
2014-01-06 17:30:09 -05:00
Jessica McKellar 61d660f9f3 [manual] digest: move cron job from staging to all app frontends.
Manual instructions:

This commit requires a puppet apply after deployment on both staging
and prod.

(imported from commit 2d10e33c6db2f5e9cc1204cdd5f2c91833da2a8e)
2013-12-20 12:50:23 -05:00
Tim Abbott bdcc2e5c52 nagios: Set max_check_attempts to 3 for batched queue processors.
(imported from commit ec0ac86726cd6ff3d0fdfcfcb161d3329fca02ac)
2013-12-19 17:31:41 -05:00
Tim Abbott b2d01e2da0 [manual] restart-server: Minimize downtime for message sender worker.
The manual step here is that we need to do the `puppet apply` before
pushing this commit, or `restart-server` will crash.

Previously we shut down everything in one group, which performed
poorly with supervisor's bad performance on restarting many daemons at
once.  Now we shut down the unimportant stuff, then the important
stuff, bring back the important stuff, and then bring back the
unimportant stuff.

This new model has a little over 5s of downtime for the core
user-facing daemons -- which is still far more than would be ideal,
but a lot less than the 13s or so that we had before.

Here's some logs with the current setup for the tornado/django downtime:
2013-12-19 20:16:51,995 restart-server: Stopping daemons
2013-12-19 20:16:53,461 restart-server: Starting daemons
2013-12-19 20:16:57,146 restart-server: Starting workers

Compare with the behavior on master today:
2013-12-19 20:21:45,281 restart-server: Stopping daemons
2013-12-19 20:21:49,225 restart-server: Starting daemons
2013-12-19 20:21:58,463 restart-server: Done!

(imported from commit b2c1ba77f3dc989551d0939779208465a8410435)
2013-12-19 17:21:23 -05:00