Most of the code in show_unreads is for diagnosising unread
counts issues, and we may not use that often.
We're creating a dedicated fix_unreads management command with
less clutter.
This causes `upgrade-zulip-from-git`, as well as a no-option run of
`tools/build-release-tarball`, to produce a Zulip install running
Python 3, rather than Python 2. In particular this means that the
virtualenv we create, in which all application code runs, is Python 3.
One shebang line, on `zulip-ec2-configure-interfaces`, explicitly
keeps Python 2, and at least one external ops script, `wal-e`, also
still runs on Python 2. See discussion on the respective previous
commits that made those explicit. There may also be some other
third-party scripts we use, outside of this source tree and running
outside our virtualenv, that still run on Python 2.
This management command creates the same indexes as migrations
82, 83, and 95, which are all indexes on the huge UserMessage
table. (*)
This command quickly no-ops with clear messaging when the
indexes already exist, so it's idempotent in that regard. (If
somebody somehow creates an index by the same name incorrectly,
they can always drop it in dbshell and re-run this command.)
If any of the migrations have not been run, which we detect simply
by the existence of the indexes, then we create them using a
`CREATE INDEX CONCURRENTLY` command. This functionality in
postgres allows you to create indexes against large tables
without disrupting queries against those tables. The tradeoff
here is that creating indexes concurrently takes significantly
longer than doing them non-concurrently.
Since most tables are small, we typically just use regular
Django migrations and run them during a brief interval while
the app is down.
For indexes on big tables, we will want to run this command
as part of the upgrade process, and we will want to run
it while the app is still up, otherwise it's pointless.
All the code in create_indexes() is literally copy/pasted
from the relevant migrations, and that scheme should work
going forward. (It uses a different implementation of
create_index_if_not_exist than the migrations use, but the
code is identical lexically in the function.)
If we ever do major restructuring of our large tables, such
as UserMessage, and we end up droppping some of these indexes,
then we will need to make this command migrations-aware. For
now it's safe to assume that indexes are generally additive in
nature, and the sooner we create them during the upgrade process,
the better.
(*) UserMessage is huge for large installations, of course.
This no longer does the correct thing (in terms of onboarding emails,
default streams, etc), and is tempting for new server admins to use.
Once we remove it we'll also have the invariant that we can't have a realm
without a user, which will simplify accounts_register a bit.
This now breaks the process of cleaning up unread counts for
non-active streams into a three step process.
This allows us to use our unread message flags index, at least
in testing on dev. Here is the relevant excerpt from explain
analyze:
Bitmap Index Scan on zerver_usermessage_unread_message_id
This makes supervisor see the service as cheerfully running
and let it alone, rather than constantly retry starting it.
Because the crash/restart loop means repeatedly spending a
couple of seconds loading Django and the app, separated by
brief periods while supervisor notices the crash and acts
on it, it was actually consuming about 30-50% CPU on the
zulipchat.com staging server.