Update schema changes content and formatting.

(imported from commit 46f3d43bfe62883d490ea870b2d29f7a45397c08)
Yoyo Zhou 2015-08-19 17:05:00 -07:00
parent 4a6e132a44
commit f78abd2dcc
1 changed files with 84 additions and 80 deletions


@ -1,5 +1,5 @@
==============
Schema Changes
Schema changes
==============
If you are making a change that requires a database schema upgrade,
@ -8,68 +8,65 @@ there are a few extra things you need to keep in mind.
Using South for migrations
--------------------------
#. Discuss the change informally with other devs.
#. Edit zerver/models.py for your particular class.
1. Discuss the change informally with your team.
#. Edit ``zerver/models.py`` for your particular class.
#. See notes below about keep\_default.
* See notes below about keep\_default.
#. Run ``./manage.py schemamigration zerver --auto``
#. This will create the ``000#_***.py`` schema migration file in
``zerver/migrations``.
* This will create the ``000#_***.py`` schema migration file in
``zerver/migrations``.
#. Read Notes section below, as you may need to edit the migration. A
common step here is setting keep\_default to True.
#. Do "git add" with your new migration.
#. Run ./manage.py migrate zerver.
#. Read the `Notes and Cautions`_ section below, as you may need to edit
the migration. A common step here is setting keep\_default to True.
#. Do ``git add`` with your new migration.
#. Run ``./manage.py migrate zerver``.
#. Write supporting code or otherwise validate the DB change locally.
#. Email devel@ about your proposed changes to get formal sign off.
#. Commit your changes
TODO: Advice on testing schema changes?
#. Commit your changes:
#. The migration must be in the same commit as the models.py changes.
a. The migration must be in the same commit as the models.py changes.
#. Include [schema] in the commit message.
#. Include [manual] in the commit message if additional steps are
required.
#. Before deploying your code fix, read the notes on "Deploying to
staging".
#. Before deploying your code fix, read the notes on `Deploying to
staging`_.
Deploying to staging
--------------------
Always follow this process.
(1) Schedule the migration for after hours.
1. Schedule the migration for after hours.
(2) For long-running migrations, double check that you use appropriate
library helpers in migrate.py to ensure that changes happen in small
batches that are committed frequently.
#. For long-running migrations, double check that you use appropriate
library helpers in ``migrate.py`` to ensure that changes happen in small
batches that are committed frequently.
(3) Announce that you are doing the migration to the team, to avoid
simultaneous migrations and other outcomes of poor communication.
#. Announce that you are doing the migration to your team, to avoid
simultaneous migrations and other outcomes of poor communication.
(4) Do any administrative steps, such as increasing
checkpoint\_segments.
#. Do any administrative steps, such as increasing
checkpoint\_segments.
(5) Apply the migration in advance from staging, using commands similar
to the following, where "[your commit]" is the commit that has your
migration:
#. Apply the migration in advance from staging, using commands similar
to the following, where ``[your commit]`` is the commit that has your
migration::
::
cd ~/zulip
git fetch
cd /tmp
git clone ~/zulip
cd zulip
git checkout [your commit]
./manage.py migrate zerver
cd /tmp
rm -Rf zulip
ssh zulip@staging.zulip.net
cd zulip
git fetch
cd /tmp
git clone ~/zulip
cd zulip
git checkout [your commit]
./manage.py migrate zerver
cd /tmp
rm -Rf zulip
(6) Undo any temporary administrative changes, such as increasing
checkpoint segments.
#. Undo any temporary administrative changes, such as increasing
checkpoint\_segments.
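The batching advice for long-running migrations can be sketched roughly as follows. This is not the ``migrate.py`` helper itself (its API is not shown here); it is a minimal stand-in using the standard-library ``sqlite3`` module, and the function name ``backfill_in_batches``, the ``id`` primary key, and the table and column names are all hypothetical.

```python
import sqlite3


def backfill_in_batches(conn, table, column, value, batch_size=1000):
    """Fill NULLs in `column` with `value`, committing after every batch.

    Hypothetical helper for illustration only. `table` and `column` must be
    trusted identifiers, since they are interpolated into the SQL text.
    Assumes the table has an integer primary key named `id`.
    """
    if value is None:
        # Backfilling with NULL would never shrink the set of NULL rows.
        raise ValueError("value must not be None")
    total = 0
    while True:
        cur = conn.execute(
            f"UPDATE {table} SET {column} = ? "
            f"WHERE id IN (SELECT id FROM {table} "
            f"WHERE {column} IS NULL LIMIT ?)",
            (value, batch_size),
        )
        # Commit each batch so locks are held only briefly.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Each pass touches at most ``batch_size`` rows and commits right away, so other writers are blocked only for the length of one batch rather than for the whole backfill.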
Because staging and prod share a database, for most migrations, nothing
special needs to be done when deploying to prod since the shared
@ -80,48 +77,55 @@ Migrating to a new schema
-------------------------
When doing a git pull and noticing a [schema] commit, you must manually
perform a schema upgrade: ``./manage.py migrate zephyr``. You used to
need to run ``generate-fixtures --force`` to update the test data set,
but these days ``generate-fixtures`` should automatically detect whether
perform a schema upgrade: ``./manage.py migrate zerver``.
``generate-fixtures`` should automatically detect whether
the schema has changed and update things accordingly.
Notes and Cautions
------------------
- **Large tables**: For large tables like Message and UserMessage, you
want to take precautions when adding columns to the table, performing
data backfills, or building indexes. We have a migrate.py library to
help with adding columns and backfilling data. For building indexes,
we should do this outside of South using postgres's CONCURRENTLY
keyword.
- **Numbering conflicts across branches**: If you've done your schema
change in a branch, and meanwhile another schema change has taken
place, South will now have two migrations with the same number. To
fix this, delete the migration file that South generated, and re-run
``./manage.py schemamigration zephyr --auto``
- **Avoid nullables**: You generally no longer need a Nullable column
to avoid problems with staging and prod not having the same models.
See the next point about setting ``keep_default=True``.
- **Use keep\_default**: When adding a new column to an existing table,
you almost always will want to set ``keep_default=True`` in the South
migration ``db.add_column`` call. If you don't, everything will
appear to work fine in testing and on staging, but once the schema
migration is done, the pre-migration code running on prod will be
unable to save new rows for that table (so e.g. if you were adding a
new field to UserProfile, we'd be unable to create new users). The
exception to this rule is when your field default is not a constant
value. In this case, you'll need to do something special to either
set a database-level default or use a Nullable field and a multi-step
schema deploy process.
- **Rebase pain**: If you ever need to rebase a schema change past
other schema changes made on other branches, in addition to
renumbering your schema change, you also need to be sure to regenerate
at least the bottom part of your migration (which shows the current
state of all the models) after rebasing; if you don't, then the next
migration made after your migration is merged will incorrectly
attempt to re-apply all the schema changes made in the migration you
skipped. This can be potentially dangerous.
- **Enterprise Version**: Keep in mind that any schema migrations we
write will also be run on Zulip Enterprise installations. So don't do
anything to set particular values appropriate for our site which may
not be appropriate for Zulip Enterprise.
**Large tables**
For large tables like Message and UserMessage, you
want to take precautions when adding columns to the table, performing
data backfills, or building indexes. We have a ``migrate.py`` library to
help with adding columns and backfilling data. For building indexes,
we should do this outside of South using postgres's CONCURRENTLY
keyword.
**Numbering conflicts across branches**
If you've done your schema change in a branch, and meanwhile another
schema change has taken place, South will now have two migrations with
the same number. To fix this, delete the migration file that South
generated, and re-run ``./manage.py schemamigration zerver --auto``.
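One way to spot a numbering conflict before it bites is to scan the migrations directory for duplicate numeric prefixes. The sketch below assumes the four-digit ``000#_***.py`` naming described above; the function name ``find_numbering_conflicts`` is hypothetical and not part of South or this codebase.

```python
import re
from collections import defaultdict
from pathlib import Path


def find_numbering_conflicts(migrations_dir):
    """Return {number: [filenames]} for every migration number used twice.

    Hypothetical helper for illustration; assumes files are named
    NNNN_description.py with a four-digit prefix.
    """
    by_number = defaultdict(list)
    for path in sorted(Path(migrations_dir).glob("*.py")):
        match = re.match(r"(\d{4})_", path.name)
        if match:  # skips __init__.py and other non-migration files
            by_number[match.group(1)].append(path.name)
    return {num: names for num, names in by_number.items() if len(names) > 1}
```

Calling ``find_numbering_conflicts("zerver/migrations")`` after a merge or rebase returns an empty dict when every number is unique.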
**Avoid nullables**
You generally no longer need a Nullable column
to avoid problems with staging and prod not having the same models.
See the next point about setting ``keep_default=True``.
**Use keep\_default**
When adding a new column to an existing table,
you almost always will want to set ``keep_default=True`` in the South
migration ``db.add_column`` call. If you don't, everything will
appear to work fine in testing and on staging, but once the schema
migration is done, the pre-migration code running on prod will be
unable to save new rows for that table (so e.g. if you were adding a
new field to UserProfile, we'd be unable to create new users). The
exception to this rule is when your field default is not a constant
value. In this case, you'll need to do something special to either
set a database-level default or use a Nullable field and a multi-step
schema deploy process.
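The failure mode described here can be reproduced in miniature. The sketch below uses an in-memory ``sqlite3`` database as a stand-in for the real one (an assumption made only so the example is self-contained) and simulates pre-migration code that does not know about the new column. The table name, column name, and ``old_code_can_insert`` are hypothetical.

```python
import sqlite3


def old_code_can_insert(column_ddl):
    """Return True if an INSERT that omits the new column succeeds.

    `column_ddl` is the definition of the newly added column. The INSERT
    plays the role of pre-migration code that predates that column.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE userprofile "
        f"(id INTEGER PRIMARY KEY, email TEXT, {column_ddl})"
    )
    try:
        # Old code only knows about `email`.
        conn.execute("INSERT INTO userprofile (email) VALUES ('a@example.com')")
        return True
    except sqlite3.IntegrityError:
        # NOT NULL column with no database-level default rejects the row.
        return False
    finally:
        conn.close()
```

With ``"new_flag INTEGER NOT NULL"`` the insert is rejected, while ``"new_flag INTEGER NOT NULL DEFAULT 0"`` lets it through. That matches what happens when a migration does or does not leave a database-level default in place.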
**Rebase pain**
If you ever need to rebase a schema change past
other schema changes made on other branches, in addition to
renumbering your schema change, you also need to be sure to regenerate
at least the bottom part of your migration (which shows the current
state of all the models) after rebasing; if you don't, then the next
migration made after your migration is merged will incorrectly
attempt to re-apply all the schema changes made in the migration you
skipped. This can be potentially dangerous.
**Upstreaming**
We recommend upstreaming schema changes as soon as possible to
avoid schema numbering conflicts (see above).