From dc5dbcbee382acc85d6a5051dd31c48eec9143da Mon Sep 17 00:00:00 2001 From: David Rosa Date: Tue, 24 Sep 2019 16:58:20 -0700 Subject: [PATCH] docs: Merge "Backups" with export-and-import. - Merges the "Backups" section from production/maintain-secure-upgrade.md with existing "Backups" section in production/export-and-import.md. - Cleans up and makes content more clear/explicit. - Adds short missing section on how to use wal-e configuration. - Removes a lot of previously duplicate text explaining the difference between the tools. - Various textual tweaks by tabbott. Fixes #13184 and resolves #293. --- docs/production/export-and-import.md | 324 +++++++++++++++--- docs/production/install.md | 2 +- docs/production/maintain-secure-upgrade.md | 199 ----------- .../zerver/help/export-your-organization.md | 2 +- 4 files changed, 274 insertions(+), 253 deletions(-) diff --git a/docs/production/export-and-import.md b/docs/production/export-and-import.md index ec56705c48..e317cc0442 100644 --- a/docs/production/export-and-import.md +++ b/docs/production/export-and-import.md @@ -1,49 +1,270 @@ -# Export and import +# Backups, export and import -Zulip has high quality export and import tools that can be used to move data -from one Zulip server to another, do backups or compliance work, or migrate -from your own servers to the hosted Zulip Cloud service. +Zulip has high quality export and import tools that can be used to +move data from one Zulip server to another, do backups, compliance +work, or migrate from your own servers to the hosted Zulip Cloud +service (or back): -When using these tools, it's important to ensure that the Zulip server -you're exporting from and the one you're exporting to are running the -same version of Zulip, since we do change and extend the format from -time to time. +* The [Backup](#backups) tool is designed for exact restoration of a + Zulip server's state, for disaster recovery, testing with production + data, or hardware migration. 
This tool has a few limitations: + + * Backups must be restored on a server running the same Zulip + version (more precisely, one where `manage.py showmigrations` has + the same output). + * Backups must be restored on a server running the same `postgres` + version. + * Backups cannot be used to migrate organizations between self-hosting + and Zulip Cloud (doing so generally requires renumbering all the + users/messages/etc.). + + We highly recommend this tool in situations where it is applicable, + because it is highly optimized and highly stable, since the hard + work is done by the built-in backup feature of `postgres`. We also + document [backup details](#backup-details) for users managing + backups manually. + +* The logical [Data export](#data-export) tool is designed for + migrating data between Zulip Cloud and other Zulip servers, as well + as various auditing purposes. The logical export tool produces a + `.tar.gz` archive with most of the Zulip database data encoded in + JSON files, a format shared by our [data + import](#import-into-a-new-zulip-server) tools for third-party + services like + [Slack](https://zulipchat.com/help/import-from-slack). + + Like the backup tool, logical data exports must be imported on a + Zulip server running the same version. However, these exports + can be imported on Zulip servers running a different `postgres` version or + hosting a different set of Zulip organizations. We recommend this + tool in cases where the backup tool isn't applicable, including + situations where an easily machine-parsable export format is desired. + +* Zulip also has an [HTML archive + tool](https://github.com/zulip/zulip_archive), which is primarily + intended for public archives, but can also be useful to + inexpensively preserve public stream conversations when + decommissioning a Zulip organization. 
+ +* It's possible to set up [postgres streaming + replication](#postgres-streaming-replication) and the [S3 file + upload + backend](../production/upload-backends.html#s3-backend-configuration) + as part of a high availability environment. ## Backups -If you want to move hardware for a self-hosted Zulip installation, we -recommend Zulip's -[database-level backup and restoration process][backups] for a better -experience. Zulip's database-level backup process is faster, -structurally very unlikely to ever develop bugs, and will restore your -Zulip server to the exact state it was left in. The big thing it -can't do is support a migration to a server hosting a different set of -organizations than the original one, e.g. migrations between -self-hosting and Zulip Cloud (because doing so in the general case -requires renumbering all the users/messages/etc.). +The Zulip server has a built-in backup tool: -Zulip's export/import tools (documented on this page) have full -support for such a renumbering process. While these tools are -carefully designed and tested to make various classes of bugs -impossible or unlikely, the extra complexity required for renumbering -makes them structurally more risky than the direct postgres backup -process. +``` +# As the zulip user +/home/zulip/deployments/current/manage.py backup +# Or as root +su zulip -c '/home/zulip/deployments/current/manage.py backup' +``` -[backups]: ../production/maintain-secure-upgrade.html#backups +The backup tool provides the following options: +- `--output`: Path where the output file should be stored. If no path is + provided, the output file is saved to a temporary directory. +- `--skip-db`: Skip backup of the database. Useful if you're using a + remote postgres host with its own backup system and just need to + back up non-database state. +- `--skip-uploads`: If `LOCAL_UPLOADS_DIR` is set, user-uploaded files + in that directory will be ignored. 
-## Preventing changes during the export +This will generate a `.tar.gz` archive containing all the data stored +on your Zulip server that would be needed to restore your Zulip +server's state on another machine perfectly. + +### Restoring backups + +First, [install a new Zulip server through Step 3][install-server] +with the same version of both the base OS and Zulip from your previous +installation. Then, run as root: + +``` +/home/zulip/deployments/current/scripts/setup/restore-backup /path/to/backup +``` + +When that finishes, your Zulip server should be fully operational again. + +#### Changing the hostname + +It's common, when testing backup restoration, to restore backups with a +different user-facing hostname than the original server to avoid +disrupting service (e.g. `zuliptest.example.com` rather than +`zulip.example.com`). + +If you do so, just like any other time you change the hostname, you'll +need to [update `EXTERNAL_HOST`](../production/settings.md) and then +restart the Zulip server (after backup restoration completes). + +Until you do, your Zulip server will think its user-facing hostname is +still `zulip.example.com` and will return HTTP `400 BAD REQUEST` +errors when trying to access it via `zuliptest.example.com`. + +#### Inspecting a backup tarball + +If you're not sure what versions were in use when a given backup was +created, you can get that information via the files in the backup +tarball: `postgres-version`, `os-version`, and `zulip-version`. The +following command may be useful for viewing these files without +extracting the entire archive. + +``` +tar -Oaxf /path/to/archive/zulip-backup-rest.tar.gz zulip-backup/zulip-version +``` + +[install-server]: ../production/install.md + +### What is included + +Backups contain everything you need to fully restore your Zulip +server, including the database, settings, secrets from +`/etc/zulip`, and user-uploaded files stored on the Zulip server. 
+ +The following data is not included in these backup archives, +and you may want to back it up separately: + +* The server access/error logs from `/var/log/zulip`. The Zulip + server only appends to logs, and they can be very large compared to + the rest of the data for a Zulip server. + +* Files uploaded with the Zulip + [S3 file upload backend](../production/upload-backends.md). We + don't include these for two reasons. First, the uploaded file data + in S3 can easily be many times larger than the rest of the backup, + and downloading it all to a server doing a backup could easily + exceed its disk capacity. Additionally, S3 is a reliable persistent + storage system with its own high-quality tools for doing backups. + +* Transient data present in Zulip's RabbitMQ queues. For example, a + record that a missed-message email for a given Zulip message is + scheduled to be sent to a given user in 2 minutes, if the recipient + user doesn't interact with Zulip during that time window. You can + check the status of these queues using `rabbitmqctl list_queues` as root. + +* Certain highly transient state that Zulip doesn't store in a + database, such as typing status, API rate-limiting counters, + etc. that would have no value 1 minute after the backup is + completed. + +* SSL certificates. These are particularly security-sensitive + and either trivially replaced (if generated via Certbot) or provided + by the system administrator, so we do not include them in these backups. + +#### Backup details + +This section is primarily for users managing backups themselves +(e.g. if they're using a remote postgres database with an existing +backup strategy), and also serves as documentation for what is +included in the backups generated by Zulip's standard tools. The +data includes: + +* The postgres database. You can back it up like any postgres +database. We have some example tooling for doing that incrementally +into S3 using [wal-e](https://github.com/wal-e/wal-e) in +`puppet/zulip_ops/manifests/postgres_common.pp`. 
+In short, this requires: + - A Zulip 1.4 or newer release. + - An Amazon S3 bucket for storing the backups. + - `/etc/zulip/zulip-secrets.conf` on the postgres server like this: + ``` + [secrets] + s3_backups_key = # aws public key + s3_backups_secret_key = # aws secret key + s3_backups_bucket = # name of S3 bucket + ``` + - A cron job to run `/usr/local/bin/pg_backup_and_purge.py`. There's puppet + config for this in `puppet/zulip_internal/manifests/postgres_common.pp`. + - Verification that backups are running via + `/usr/lib/nagios/plugins/zulip_postgres_common/check_postgres_backup`. + +* Any user-uploaded files. If you're using S3 as storage for file +uploads, this is backed up in S3. But if you have instead set +`LOCAL_UPLOADS_DIR`, any files uploaded by users (including avatars) +will be stored in that directory and you'll want to back it up. + +* Your Zulip configuration including secrets from `/etc/zulip/`. +For example, if you lose the value of `secret_key`, all users will need to +log in again when you set up a replacement server since you won't be +able to verify their cookies. If you lose `avatar_salt`, any +user-uploaded avatars will need to be re-uploaded (since avatar +filenames are computed using a hash of `avatar_salt` and the user's +email), etc. + +[export-import]: ../production/export-and-import.md + +### Restore from manual backups + +To restore from a manual backup, the process is basically the reverse of the above: + +* Install a new server as normal by downloading a Zulip release tarball + and then using `scripts/setup/install`. You don't need + to run the `initialize-database` second stage which puts default + data into the database. + +* Unpack to `/etc/zulip` the `settings.py` and `zulip-secrets.conf` files + from your backups. + +* Restore your database from the backup using `wal-e`. If you ran + `initialize-database` anyway above, you'll want to run + `scripts/setup/postgres-init-db` to drop the initial database first. 
+ +* Reconfigure rabbitmq to use the password from `zulip-secrets.conf` + by running, as root, `scripts/setup/configure-rabbitmq`. + +* If you're using local file uploads, restore those files to the path + specified by `settings.LOCAL_UPLOADS_DIR` and (if appropriate) any + logs. + +* Start the server using `scripts/restart-server`. + +This restoration process can also be used to migrate a Zulip +installation from one server to another. + +We recommend running a disaster recovery test after setting up your backups to +confirm that your backups are working. You may also want to monitor +that they are up to date using the Nagios plugin at: +`puppet/zulip_ops/files/nagios_plugins/check_postgres_backup`. + +## Postgres streaming replication + +Zulip has database configuration for using Postgres streaming +replication. You can see the configuration in these files: + +* `puppet/zulip_ops/manifests/postgres_slave.pp` +* `puppet/zulip_ops/manifests/postgres_master.pp` +* `puppet/zulip_ops/files/postgresql/*` + +We use this configuration for zulipchat.com, and it works well in +production, but it's not fully generic. Contributions to make it a +supported and documented option for other installations are +appreciated. + +## Data export + +Zulip's powerful data export tool is designed to handle migration of a +Zulip organization between different hardware platforms; as a result, +these exports contain all non-transient data for a Zulip organization, +with the exception of passwords and API keys. + +We recommend using the [backup tool](#backups) if your primary goal is +backups. + +### Preventing changes during the export For best results, you'll want to shut down access to the organization -before exporting, so that nobody can send new messages (etc.) while +before exporting, so that nobody can send new messages (etc.) while you're exporting data. There are two ways to do this: 1. `supervisorctl stop all`, which stops the whole server. 
This is preferred if you're not hosting multiple organizations, because it has no side effects other than disabling the Zulip server for the duration. -1. `manage.py deactivate_realm`, which deactivates the target +1. `manage.py deactivate_realm -r 'target_org'`, which deactivates the target organization, logging out all active login sessions and preventing all -accounts in the from logging in or accessing the API. This is +accounts from logging in or accessing the API. This is preferred for environments like Zulip Cloud where you might want to export a single organization without disrupting any other users, and the intent is to move hosting of the organization (and forcing users @@ -55,15 +276,15 @@ that neither runs (using the `# ` at the start of the lines). If you'd like to use one of these options, remove the `# ` at the start of the lines for the appropriate option. -## Export your Zulip data +### Export your Zulip data Log in to a shell on your Zulip server as the `zulip` user. Run the following commands: ``` cd /home/zulip/deployments/current -# ./manage.py deactivate_realm -r '' # Deactivates the organization # supervisorctl stop all # Stops the Zulip server +# ./manage.py deactivate_realm -r '' # Deactivates the organization ./manage.py export -r '' # Exports the data ``` @@ -77,32 +298,31 @@ archive of all the organization's uploaded files. ## Import into a new Zulip server -(1.) [Install a new Zulip server](../production/install.md), -skipping "Step 3: Create a Zulip organization, and log in" (you'll -create your Zulip organization via the data import tool instead). - -(1a.) Ensure that the Zulip server you're importing into is running the same +1. [Install a new Zulip server](../production/install.md), +**skipping Step 3** (you'll create your Zulip organization via the data + import tool instead). + * Ensure that the Zulip server you're importing into is running the same version of Zulip as the server you're exporting from. 
-For exports from zulipchat.com, run the following: + * For exports from zulipchat.com, run the following: -``` -/home/zulip/deployments/current/scripts/upgrade-zulip-from-git master -``` + ``` + /home/zulip/deployments/current/scripts/upgrade-zulip-from-git master + ``` -Note that if your server has 2GB of RAM or less, you'll want to read the detailed instructions -[here][upgrade-zulip-from-git]. -It is not sufficient to be on the latest stable release, as zulipchat.com is -often several months of development ahead of the latest release. + * Note that if your server has 2GB of RAM or less, you'll want to read the + detailed instructions [here][upgrade-zulip-from-git]. + It is not sufficient to be on the latest stable release, as zulipchat.com is + often several months of development ahead of the latest release. -(2.) If your new Zulip server is meant to fully replace a previous Zulip +2. If your new Zulip server is meant to fully replace a previous Zulip server, you may want to copy the contents of `/etc/zulip` to your new server to reuse the server-level configuration and secret keys from your old server. See our -[documentation on backups][backups] for details on the contents of +[documentation on backups](#backups) for details on the contents of this directory. -(3.) Log in to a shell on your Zulip server as the `zulip` user. Run the +3. Log in to a shell on your Zulip server as the `zulip` user. Run the following commands, replacing the filename with the path to your data export tarball: @@ -115,12 +335,12 @@ cd /home/zulip/deployments/current # ./manage.py reactivate_realm -r '' # Reactivates the organization ``` -This could take several minutes to run, depending on how much data you're +This could take several minutes to run depending on how much data you're importing. 
[upgrade-zulip-from-git]: ../production/maintain-secure-upgrade.html#upgrading-from-a-git-repository -**Import options** +#### Import options The commands above create an imported organization on the root domain (`EXTERNAL_HOST`) of the Zulip installation. You can also import into a @@ -133,13 +353,13 @@ root domain. Replace the last two lines above with the following, after replacin ./manage.py reactivate_realm -r # Reactivates the organization ``` -## Logging in +### Logging in Once the import completes, all your users will have accounts in your new Zulip organization, but those accounts won't have passwords yet (since for security reasons, passwords are not exported). Your users will need to either authenticate using something like -Google auth, or start by resetting their passwords. +Google auth or start by resetting their passwords. You can use the `./manage.py send_password_reset_email` command to send password reset emails to your users. We @@ -156,7 +376,7 @@ and then once you're ready, you can email them to everyone using e.g. (replace `''` with your subdomain if you're using one). -## Deleting and re-importing +### Deleting and re-importing If you did a test import of a Zulip organization, you may want to delete the test import data from your Zulip server before doing a diff --git a/docs/production/install.md b/docs/production/install.md index 21e256f726..f0e643aa82 100644 --- a/docs/production/install.md +++ b/docs/production/install.md @@ -80,7 +80,7 @@ and return to the import instructions. [hipchat-import]: https://zulipchat.com/help/import-from-hipchat [slack-import]: https://zulipchat.com/help/import-from-slack -[zulip-backups]: ../production/maintain-secure-upgrade.html#backups +[zulip-backups]: ../production/export-and-import.html#backups Otherwise, open the link in a browser. Follow the prompts to set up your organization, and your own user account as an administrator. 
diff --git a/docs/production/maintain-secure-upgrade.md b/docs/production/maintain-secure-upgrade.md index fee5d25b73..3efdd6c26c 100644 --- a/docs/production/maintain-secure-upgrade.md +++ b/docs/production/maintain-secure-upgrade.md @@ -6,7 +6,6 @@ secure Zulip installation, including: - [Upgrading](#upgrading) - [Upgrading from a git repository](#upgrading-from-a-git-repository) - [Upgrading the operating system](#upgrading-the-operating-system) -- [Backups](#backups) - [Monitoring](#monitoring) - [Scalability](#scalability) - [Management commands](#management-commands) @@ -369,204 +368,6 @@ That last command will finish by restarting your Zulip server; you should now be able to navigate to its URL and confirm everything is working correctly. -## Backups - -Starting with Zulip 2.0, Zulip has a built-in backup tool: - -``` -# As the zulip user -/home/zulip/deployments/current/manage.py backup -# Or as root -su zulip -c '/home/zulip/deployments/current/manage.py backup' -``` - -The backup tool provides the following options: -- `--output`: Path where the output file should be stored. If no path is - provided, the output file would be saved to a temporary directory. -- `--skip-db`: If set, the tool will skip the backup of your database. -- `--skip-uploads`: If set, the tool will skip the backup of the uploads. - -This will generate a `.tar.gz` archive containing all the data stored -on your Zulip server that would be needed to restore your Zulip -server's state on another machine perfectly. - -### Restoring backups - -Backups generated using the Zulip 2.0 backup tool can be restored as -follows. - -First, [install a new Zulip server through Step 3][install-server] -with the version of both the base OS and Zulip from your previous -installation. Then, run as root: - -``` -/home/zulip/deployments/current/scripts/setup/restore-backup /path/to/backup -``` - -When that finishes, your Zulip server should be fully operational again. 
- -#### Changing the hostname - -It's common when testing backup restoration to restore backups with a -different user-facing hostname than the original server to avoid -disrupting service (e.g. `zuliptest.example.com` rather than -`zulip.example.com`). - -If you do so, just like any other time you change the hostname, you'll -need to [update `EXTERNAL_HOST`](../production/settings.md) and then -restart the Zulip server (after backup restoration completes). - -Until you do, your Zulip server will think its user-facing hostname is -still `zulip.example.com` and will return HTTP `400 BAD REQUEST` -errors when trying to access it via `zuliptest.example.com`. - -#### Inspecting a backup tarball - -If you're not sure what versions were in use when a given backup was -created, you can get that information via the files in the backup -tarball `postgres-version`, `os-version`, and `zulip-version`. The -following command may be useful for viewing these files without -extracting the entire archive. - -``` -tar -Oaxf /path/to/archive/zulip-backup-rest.tar.gz zulip-backup/zulip-version -``` - -[install-server]: ../production/install.md - -### What is included - -Zulip's backup tools includes everything you need to fully restore -your Zulip server from a user perspective. - -The following data present on a Zulip server is not included in these -backup archives, and you may want to backup separately: - -* Transient data present in Zulip's RabbitMQ queues. For example, a - record that a missed-message email for a given Zulip message is - scheduled to be sent to a given user in 2 minutes if the recipient - user doesn't interact with Zulip during that time window. You can - check their status using `rabbitmq list_queues` as root. - -* Certain highly transient state that Zulip doesn't store in a - database, such as typing status, API rate-limiting counters, - etc. that would have no value 1 minute after the backup is - completed. 
- -* The server access/error logs from `/var/log/zulip`, because a Zulip - server only appends to those log files (i.e. they aren't necessarily - to precisely restore your Zulip data), and they can be very large - compared to the rest of the data for a Zulip server. - -* Files uploaded with the Zulip - [S3 file upload backend](../production/upload-backends.md). We - don't include these for two reasons. First, the uploaded file data - in S3 can easily be many times larger than the rest of the backup, - and downloading it all to a server doing a backup could easily - exceed its disk capacity. Additionally, S3 is a reliable persistent - storage system with its own high-quality tools for doing backups. - Contributions of (documentation on) ready-to-use scripting for S3 - backups are welcome. - -* SSL certificates. Since these are security-sensitive and either - trivially replaced (if generated via Certbot) or provided by the - system administrator, we do not include them in these backups. - -### Backup details - -This section is primarily for users managing backups themselves -(E.g. if they're using a remote postgres database with an existing -backup strategy), and also serves as documentation for what is -included in the backups generated by Zulip's standard tools. That -data includes: - -* The postgres database. That you can back up like any postgres -database; we have some example tooling for doing that incrementally -into S3 using [wal-e](https://github.com/wal-e/wal-e) in -`puppet/zulip_ops/manifests/postgres_common.pp` (that's what we -use for zulip.com's database backups). Note that this module isn't -part of the Zulip server releases since it's part of the zulip.com -configuration (see -for a ticket about fixing this to make life easier for running -backups). - -* Any user-uploaded files. 
If you're using S3 as storage for file -uploads, this is backed up in S3, but if you have instead set -`LOCAL_UPLOADS_DIR`, any files uploaded by users (including avatars) -will be stored in that directory and you'll want to back it up. - -* Your Zulip configuration including secrets from `/etc/zulip/`. -E.g. if you lose the value of `secret_key`, all users will need to -login again when you setup a replacement server since you won't be -able to verify their cookies; if you lose `avatar_salt`, any -user-uploaded avatars will need to be re-uploaded (since avatar -filenames are computed using a hash of `avatar_salt` and user's -email), etc. - -Zulip also has a logical [data export and import tool][export-import], -which is useful for migrating data between Zulip Cloud and other Zulip -servers, as well as various auditing purposes. The big advantage of -the `manage.py backup` system over the export/import process is that -it's structurally very unlikely for the `postgres` process to ever -develop bugs, whereas the import/export tool requires some work for -every new feature we add to Zulip, and thus may occasionally have bugs -around corner cases. The export tool's advantage is that the export is -more human-readable and easier to parse, and doesn't have the -requirement that the same set of Zulip organizations exist on the two -servers (which is critical for migrations to and from Zulip Cloud). - -[export-import]: ../production/export-and-import.md - -### Restore from manual backups - -To restore from a manual backup, the process is basically the reverse of the above: - -* Install new server as normal by downloading a Zulip release tarball - and then using `scripts/setup/install`, you don't need - to run the `initialize-database` second stage which puts default - data into the database. - -* Unpack to `/etc/zulip` the `settings.py` and `zulip-secrets.conf` files - from your backups. 
- -* Restore your database from the backup using `wal-e`; if you ran - `initialize-database` anyway above, you'll want to first - `scripts/setup/postgres-init-db` to drop the initial database first. - -* Reconfigure rabbitmq to use the password from `secrets.conf` - by running, as root, `scripts/setup/configure-rabbitmq`. - -* If you're using local file uploads, restore those files to the path - specified by `settings.LOCAL_UPLOADS_DIR` and (if appropriate) any - logs. - -* Start the server using `scripts/restart-server`. - -This restoration process can also be used to migrate a Zulip -installation from one server to another. - -We recommend running a disaster recovery after you setup backups to -confirm that your backups are working; you may also want to monitor -that they are up to date using the Nagios plugin at: -`puppet/zulip_ops/files/nagios_plugins/check_postgres_backup`. - -Contributions to more fully automate this process or make this section -of the guide much more explicit and detailed are very welcome! - - -### Postgres streaming replication - -Zulip has database configuration for using Postgres streaming -replication; you can see the configuration in these files: - -* `puppet/zulip_ops/manifests/postgres_slave.pp` -* `puppet/zulip_ops/manifests/postgres_master.pp` -* `puppet/zulip_ops/files/postgresql/*` - -Contribution of a step-by-step guide for setting this up (and moving -this configuration to be available in the main `puppet/zulip/` tree) -would be very welcome! - ## Monitoring The complete Nagios configuration (sans secret keys) used to diff --git a/templates/zerver/help/export-your-organization.md b/templates/zerver/help/export-your-organization.md index 016addfa6e..e86d924c6f 100644 --- a/templates/zerver/help/export-your-organization.md +++ b/templates/zerver/help/export-your-organization.md @@ -80,6 +80,6 @@ their use of such an export. 
* [Import into an on-premises installation][import-only] -[production-backups]: https://zulip.readthedocs.io/en/stable/production/maintain-secure-upgrade.html#backups +[production-backups]: https://zulip.readthedocs.io/en/stable/production/export-and-import.html#backups [export-and-import]: https://zulip.readthedocs.io/en/latest/production/export-and-import.html [import-only]: https://zulip.readthedocs.io/en/latest/production/export-and-import.html#import-into-a-new-zulip-server