Otherwise we end up running Tornado against the _previous_ version of
our code!
Testing of using supervisorctl to stop and then start a process shows
it takes about the same amount of time as doing a supervisorctl
restart, so there's no reason not to split the two commands apart and
make it super clear that nothing is running at the time that we move
the deployment symlink.
(imported from commit c38049da2bfc9fa94320a32dbf3240d1fcba67f7)
We let supervisor create the socket for us by making humbug-django a
fcig-program. Unfortunately, supevisor doesn't support putting
fcgi-programs in groups (see
https://github.com/Supervisor/supervisor/issues/148), so we have to
restart tornado and django separately.
To deploy, copy the config files over and restart nginx and
supervisor (via stopping and then starting it because restart is
broken). I believe the automated restart as part of
update-deployment will fail because of the way supervisor treats
programs in groups. If so, after restarting supervisor, you will
also need to run restart-server manually to fill the caches and then
delete the lock directory in humbug-deployments.
(imported from commit bfb5db7dd42dcbc4bfefa2944355b3cbb2ef9104)
The amount of process downtime during a supervisord-mediated restart
appears to be linear in the number of processes that are being
restarted. Therefore, restarting just Django and Tornado causes less
downtime than doing them at the same time as the other worker
processes.
(imported from commit 1fa9ef547bcd88caeec49800664e37d5f2fcb7a8)
This change will make it so that processes related to the app.humbughq.com
server are run under supervisord, which uses a state machine model to ensure
that programs are running. It also ensure process startup order.
We will need to manually switch the old way of running server (in screen) into
this new way of doing things, on both staging and prod (app_frontend.pp has been
updated appropriately). This means:
1) cp servers/puppet/modules/humbug/files/supervisord/conf.d/humbug.conf /etc/supervisord/conf.d
2) installing the supervisor package.
3) killing those while loops in that screen session
4) mkdir /var/log/humbug (as root)
5) /etc/init.d/supervisord start
6) check that nothing broke
(imported from commit 055269a70973db89acd69049e01b185fabdc8f90)
The previous version ended up being (at least sometimes) wrong after
the recent deployment system changes.
(imported from commit dec3beb1b1bf8b9c9ad6820b93b0a5d730d020e8)
This requires manual steps on deploy to each of staging and prod:
(1) Run the new update-deployment code to setup the initial deployment directory.
(2) Restart all the programs running in screen sessions.
(3) Deploy the nginx changes and restart nginx.
(imported from commit 1ffe27933ee79274dc0a93d35c9938712de0ef36)
restart-server has been relatively slow recently, and it'd be nice to
know what it is spending its time doing when it hangs for a few
seconds.
(imported from commit a411c951f5a3f2a1366b6d5d3a40d0660ebec11b)
Our previous code could in theory end up clearing the caches it had
just filled, if Tornado's cache filling work happened to be faster
than the memcached flush.
(imported from commit 48174aadad398fb7a7c917a1df765c1261b12a55)
At Ksplice we used /usr/bin/python because we shipped dependencies as Debian /
Red Hat packages, which would be installed against the system Python. We were
also very careful to use only Python 2.3 features so that even old system
Python would still work.
None of that is true at Humbug. We expect users to install dependencies
themselves, so it's more likely that the Python in $PATH is correct. On OS X
in particular, it's common to have five broken Python installs and there's no
expectation that /usr/bin/python is the right one.
The files which aren't marked executable are not interesting to run as scripts,
so we just remove the line there. (In general it's common to have libraries
that can also be executed, to run test cases or whatever, but that's not the
case here.)
(imported from commit 437d4aee2c6e66601ad3334eefd50749cce2eca6)
Manual deployment steps: The same Nginx reload as for "Get rid of the
static-access-control mechanism". If deploying both commits at once,
just do it once.
(imported from commit dd8dbbf14b95fce0a4b6f66f462fa0a6b50bfb8c)