MIT implemented NTP rate-limiting to defend against on-going reflection attacks,
which was causing our nagios checks to fail intermittently. When the attacks
die down or when external sites fix their NTP configurations, checking against
time.mit.edu will stop failing. However, there also isn't much of a reason to
stick with checking against a single server.
(imported from commit 2c2a1a04646b880b010cbb4b6d94016b1eccd1a0)
Manual instructions:
This commit requires a puppet apply after deployment on both staging
and prod.
(imported from commit 2d10e33c6db2f5e9cc1204cdd5f2c91833da2a8e)
The manual step here is that we need to do the `puppet apply` before
pushing this commit, or `restart-server` will crash.
Previously we shut down everything in one group, which performed
poorly with supervisor's bad performance on restarting many daemons at
once. Now we shut down the unimportant stuff, then the important
stuff, bring back the important stuff, and then bring back the
unimportant stuff.
This new model has a little over 5s of downtime for the core
user-facing daemons -- which is still far more than would be ideal,
but a lot less than the 13s or so that we had before.
Here's some logs with the current setup for the tornado/django downtime:
2013-12-19 20:16:51,995 restart-server: Stopping daemons
2013-12-19 20:16:53,461 restart-server: Starting daemons
2013-12-19 20:16:57,146 restart-server: Starting workers
Compare with the behavior on master today:
2013-12-19 20:21:45,281 restart-server: Stopping daemons
2013-12-19 20:21:49,225 restart-server: Starting daemons
2013-12-19 20:21:58,463 restart-server: Done!
(imported from commit b2c1ba77f3dc989551d0939779208465a8410435)
We also move uploads.types to zulip-include-frontend since its only
needed on the frontends.
(imported from commit cfdf15c0c537f7ea4c239b0f882aeaa561929777)
This reqires a puppet apply as well as a manual move of the installed
files and symlink switch. Leo will do it when it hits master.
(imported from commit e58e52087ad38f1cb8e0e606b82266a93cf91e53)
It's confusing to have our log data on different files on different
systems (e.g. loadbalancer vs. app).
(imported from commit be701072ee05e2659f146b226a39f33cb4707180)
This tool is a little crude; it runs out of a cron job and will
forward to staging a notice about any new lines in the declared log
files, truncating if there are more than 10 lines.
(imported from commit 6748ddff1def0907b061dc278a3a848bd2e933f1)
Manual deployment instructions:
On staging, do a puppet apply.
No action needs to be taken for the prod deploy.
(imported from commit 0f6e5ab22aaeacfcc69d57de12f2bb6fac6f0635)
They were being installed as executable anyway, but this will make
running them manually a bit easier.
(imported from commit a1181d2c90770af5aa44b0f65a47a460efdcf2d7)
There were a few recently introduced bugs, and this also cuts down on
our having to review diffs that don't actually affect the relevant
server when doing updates.
(imported from commit 43f3cff9a414bc1632f45a8222012846353e8501)
The trailing "/" actually means "replace the location with /", which
is either useless or actively harmful, depending on the location.
(imported from commit 58b9c4c9e55e3a162ffce49c954bc2182ec57dde)
Previously we sometimes set it to $proxy_add_x_forwarded_for and other
times to $remote_addr, but according to
http://wiki.nginx.org/HttpProxyModule#.24proxy_add_x_forwarded_for
$proxy_add_x_forwarded_for handles this for us -- it will be
$remote_addr if there was no X-Forwarded-For header anyway.
(imported from commit 67dc52250e3e7751b1bf375d1a71d0272475435c)
We now have 2 variablse:
EXTERNAL_API_PATH: e.g. staging.zulip.com/api
EXTERNAL_API_URI: e.g. https://staging.zulip.com/api
The former is primarily needed for certain integrations.
(imported from commit 3878b99a4d835c5fcc2a2c6001bc7eeeaf4c9363)
Now that we've debugged the memory leak, I don't think we need this
anymore.
This reverts commit 1bdc7ee2f72bdebb1cdc94601247834a434614d6.
Conflicts:
puppet/zulip/files/cron.d/rabbitmq-numconsumers
puppet/zulip/files/supervisor/conf.d/zulip.conf
(imported from commit ff87f2aebcbc71013fa7a05aedb24e2dcad82ae6)
This is something we forgot to do in the VPC migration, so our IPs
have all been the lb0 IP in our logs :(.
(imported from commit 9d3fc69cf72a84f7bd7c54e50fb1e776a67d971f)