We ran into the 1024 file descriptor limit today for tornado. The
limits.conf file descriptor limit doesn't apply to supervisor the way
we currently start it, but it's a good idea to have this in place so that
if we move supervisor to run as the zulip user, we don't experience a
nasty surprise.
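A quick way to see which limit a process actually got is to ask from
inside it. This is only a sketch: the helper names and the 1024
threshold are illustrative, not part of our tooling.

```python
import resource

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def is_low(soft, threshold=1024):
    """True if the soft limit is at or below the threshold (hypothetical helper)."""
    # RLIM_INFINITY means "no limit", so it can never be too low.
    if soft == resource.RLIM_INFINITY:
        return False
    return soft <= threshold

soft, hard = nofile_limits()
print("soft limit:", soft, "hard limit:", hard, "low:", is_low(soft))
```

Running this under supervisor and again from a login shell would show
whether the two paths pick up different limits.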
(imported from commit 2eb4805b5c129bc2684d151b77f295c2eaa9fc3e)
To apply this change, we must not only do a puppet apply, but also
restart rabbitmq and epmd. RabbitMQ is easy to restart, but epmd is
a little more annoying. epmd is run as a side effect of starting up
rabbitmq-server, but is not stopped when rabbitmq-server is stopped.
Therefore, the correct procedure is to stop rabbitmq-server, kill
epmd (by running `epmd -kill`), and then start rabbitmq-server again.
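The ordering above can be sketched as a small script. Only
`epmd -kill` comes from this message; the `service` invocations are an
assumption about how rabbitmq-server is managed on the host, and the
function defaults to a dry run that just prints the steps.

```python
import subprocess

# Assumption: rabbitmq-server is controlled through `service`.
# Only `epmd -kill` is given verbatim in the commit message.
RESTART_STEPS = [
    ["service", "rabbitmq-server", "stop"],
    ["epmd", "-kill"],
    ["service", "rabbitmq-server", "start"],
]

def restart_rabbitmq(dry_run=True):
    """Run the steps in order, stopping at the first failure."""
    done = []
    for cmd in RESTART_STEPS:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check_call raises CalledProcessError on a non-zero exit,
            # so a failed stop never leads to killing epmd.
            subprocess.check_call(cmd)
        done.append(cmd)
    return done

if __name__ == "__main__":
    restart_rabbitmq(dry_run=True)
```

Pass `dry_run=False` (as root) to actually execute the commands.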
(imported from commit a651e5363a8b9a04b713c31baef379c566d5dbfc)
New dependency: sockjs-tornado
One known limitation is that we don't clean up sessions for
non-websockets transports. This is a bug in Tornado, so I'm going to
look at upgrading us to the latest version:
https://github.com/mrjoes/sockjs-tornado/issues/47
(imported from commit 31cdb7596dd5ee094ab006c31757db17dca8899b)
These parameters collectively determine when very old transaction ids
are replaced with the special FrozenXID transaction id, which is
older than any other transaction id. Old transaction ids must be
periodically replaced like this to prevent transaction id wraparound.
The only disadvantage of increasing autovacuum_freeze_max_age is
increased disk usage, but a value of 2 billion should only require
500MB, which is pretty trivial.
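The 500MB figure can be sanity-checked with a quick calculation. This
is a rough sketch that assumes the commit log stores two status bits
per transaction id; the function name is made up for illustration.

```python
# Assumption: two commit-status bits are kept per transaction id.
BITS_PER_TRANSACTION = 2

def commit_log_bytes(freeze_max_age):
    """Approximate commit-log size needed to cover freeze_max_age transactions."""
    return freeze_max_age * BITS_PER_TRANSACTION // 8

size = commit_log_bytes(2_000_000_000)
print(size // 10**6, "MB")
```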
Since changing autovacuum_freeze_max_age requires a restart, we will
still have to issue a manual VACUUM, with vacuum_freeze_min_age and
vacuum_freeze_table_age temporarily set to lower values.
(imported from commit 22f9ecdfc5b6a07918771d541192aa6d6369878a)
The psql command was previously failing because it was trying to use
the nonexistent DB user "zulip".
(imported from commit 68ad5c979ce33cc54e9d55f1822cba9fac648944)
Make sure to run `apt-get update` and then `tools/zulip-puppet-apply`
on staging and prod.
(imported from commit 8d1e6295c7c395333a05cd664aa28cab6496bdaa)
We need to make sure it's totally drop-in functional on the app servers
before I'd feel comfortable modifying their config.
(imported from commit e42d6e37732d65f827982aabaff9399ec1bda0f2)
This may require just doing an mv on the home directory, plus changing
the home directory in /etc/passwd. It should of course be done carefully.
(imported from commit 660997d897ee6d33563af74f0fc5d4267a911755)
This will probably require shutting down all the supervisor processes
and restarting them to deploy properly.
(imported from commit ce687078980b43be495c49f52ed6aa0e25cdad00)
This requires doing an `mv` and then `puppet apply` on each of staging
and prod as part of the deployment process.
(imported from commit 5d0be64a3846f7151d2036d2e0b31049bc1c2dd2)
These have been superseded by checks for the existence of consumers
of the relevant queues.
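For illustration, a consumer check of this kind could look roughly like
the sketch below. It is not our actual check: it assumes tab-separated
"name, consumer count" lines such as those printed by
`rabbitmqctl list_queues name consumers`, and the sample text and
function name are made up.

```python
def queues_without_consumers(listing, required):
    """Return the required queues that have no consumers in the listing."""
    counts = {}
    for line in listing.splitlines():
        parts = line.split("\t")
        # Skip banner/footer lines and anything that is not "name<TAB>count".
        if len(parts) != 2 or not parts[1].strip().isdigit():
            continue
        counts[parts[0]] = int(parts[1])
    # A queue that is absent from the listing counts as having no consumers.
    return [q for q in required if counts.get(q, 0) == 0]

# Hypothetical sample input, not real output from our servers.
SAMPLE = "Listing queues ...\nuser_activity\t1\nnotify_tornado\t0\n...done.\n"

missing = queues_without_consumers(
    SAMPLE, ["user_activity", "notify_tornado", "some_other_queue"]
)
print(missing)
```

Treating a missing queue the same as a queue with zero consumers keeps
the check from passing silently when a queue was never declared.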
(imported from commit 68a0e79734366411e39e9e4346b5a61bdd34144b)
This temporarily breaks the rabbitmq consumer checks for
user_activity and notify_tornado on prod. This should be deployed in
such a way as to minimize the time that the alert needs to be ignored.
(imported from commit 08fa2f0e7d78fca1346c62824573263e42339a45)
This temporarily breaks the rabbitmq consumer checks for the
user_activity and notify_tornado queues because their state files
were renamed to match their queue names. It will be fixed for
staging in the next commit.
(imported from commit a6aaa330a1134d8ddffe8f4959deb12b219f241a)
This requires a "puppet apply" to be done to create /var/log/zulip
before we deploy anything using the new directory.
(imported from commit 2d7baedbf923df9f01b152cf0bda6494f0eac936)
We need to manually remove the old humbug and humbug-staging sites-*
files when we deploy this via puppet.
(imported from commit d25e0172a14032c5acf1501668602d34b1b13b85)