# Continuous integration (CI)

The Zulip server uses [CircleCI](https://circleci.com/) for continuous
integration. CircleCI runs frontend, backend and end-to-end production
installer tests. This page documents useful tools and tips for using
CircleCI and debugging issues with it.

## Goals

The overall goal of our CI is to avoid regressions and minimize the
total time spent debugging Zulip. We do that by trying to catch as
many future bugs as possible, while minimizing both latency and false
positives, both of which can waste a lot of developer time. There are
a few implications of this overall goal:

* If a test is failing nondeterministically in CI, we consider that to
  be an urgent problem.
* If the tests become a lot slower, that is also an urgent problem.
* Everything we do in CI should also have a way to run it quickly
  (under 1 minute, preferably under 3 seconds), in order to iterate fast
  in development. Except when working on the CI configuration itself, a
  developer should never have to repeatedly wait 10 minutes for a full CI
  run to iteratively debug something.

## CircleCI

### Useful debugging tips and tools

* Zulip uses the `ts` tool to log the current time on every line of
  the output in our CircleCI scripts. You can use this output to
  determine which steps are actually consuming a lot of time.

* You can [sign up your personal repo for CircleCI][circleci-setup] so
  that every remote branch you push will be tested, which can be helpful
  when debugging something complicated.

* With your personal repo signed up, CircleCI
  [allows you to SSH][circleci-ssh] into the job container if a job
  fails. SSHing into the containers can be helpful, especially in rare
  cases where the tests are passing on your computer but failing in
  CI. Make sure that you have uploaded your SSH keys to GitHub: CircleCI
  uses those SSH keys for authentication.

[docker-hub]: https://hub.docker.com/
[circleci-setup]: ../git/cloning.html#step-3-configure-continuous-integration-for-your-fork
[circleci-ssh]: https://circleci.com/docs/2.0/ssh-access-jobs/

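As a rough illustration of the `ts` tip above, the following Python
sketch ranks timestamped log lines by how long each one took before the
next line appeared. It assumes the log uses the default `ts` prefix
(for example `Apr 28 14:44:04`); the function name, the assumed year,
and the sample log are hypothetical and are not part of Zulip's
tooling. A run that crosses a year boundary would need extra handling.

```python
from datetime import datetime

# Assumption: lines carry the default `ts` prefix, e.g. "Apr 28 14:44:04 ...".
TS_FORMAT = "%Y %b %d %H:%M:%S"
ASSUMED_YEAR = "2020"  # the default prefix has no year; a leap year accepts Feb 29


def slowest_steps(lines):
    """Return (seconds, text) pairs, slowest first.

    Each line is charged the time that elapsed until the next
    timestamped line appeared; the final line gets no entry.
    """
    stamped = []
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 4:
            continue  # too short to hold a timestamp plus a message
        try:
            when = datetime.strptime(" ".join([ASSUMED_YEAR, *parts[:3]]), TS_FORMAT)
        except ValueError:
            continue  # the first three fields are not a timestamp
        stamped.append((when, parts[3].rstrip()))
    durations = [
        ((later - earlier).total_seconds(), text)
        for (earlier, text), (later, _) in zip(stamped, stamped[1:])
    ]
    return sorted(durations, reverse=True)


# Hypothetical log excerpt, not real CircleCI output.
sample_log = [
    "Apr 28 14:44:04 Installing dependencies",
    "Apr 28 14:44:34 Running backend tests",
    "Apr 28 14:49:34 Uploading coverage",
]

for seconds, text in slowest_steps(sample_log):
    print(f"{seconds:7.0f}s  {text}")
```
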
### Suites

The main CircleCI configuration file defining how the tests are run is
[.circleci/config.yml][circleci-config]. The scripts for running the
tests in CI live under `tools/ci`, but they are mostly thin wrappers
around [Zulip's test suites](../testing/testing.md) or production
installer tooling.

[circleci-config]: https://github.com/zulip/zulip/blob/master/.circleci/config.yml

We run multiple jobs during a CircleCI build to run Zulip's test
suites on our supported production platforms. They are currently:

* bionic-backend-frontend
* focal-backend

Each runs the Zulip backend test suites, using the indicated
platform/OS. As suggested by the names, only one job runs the frontend
test suites, since those are not platform-dependent.

Additionally, there are a few jobs designed to do an end-to-end test
on Zulip's production installer:

* bionic-production-build
* bionic-production-install
* xenial-legacy

The `production-build` job builds a Zulip release tarball, which is
then installed in a fresh container in the `production-install` job;
various Nagios and other checks are run to confirm the installation
worked.

The xenial-legacy tests are just designed to ensure we give the right
error messages when trying to install or upgrade a Xenial system to
master.

### Configuration

The remaining details in this section are primarily relevant for doing
development on our CI system and/or provisioning process.

The first key of the job section is `docker`. The `docker` key specifies
the image CircleCI should get from [Docker Hub][docker-hub] for running
the job. Once CircleCI fetches the image from Docker Hub, it will spin
up a Docker container. See the [images](#images) section to learn more
about the images we use in CircleCI for testing.

After booting the container from the configured image, CircleCI will
create the directory mentioned in `working_directory`, and all the
steps are run from there.

The `steps` section describes everything: fetching the Zulip code,
provisioning, fetching cached data, running tests and uploading
coverage reports. The steps with prefix `*` reference aliases, which
are defined in the `aliases` section at the top of the file.

### Images

CircleCI tests are run in containers that are spun up from the images
maintained by the Zulip team. The Dockerfiles for the various images
can be generated by running `./tools/ci/generate-dockerfiles`. This
command will generate the Dockerfiles of the three Ubuntu releases in
`./tools/ci/images/{release_name}` directories. Take a look at
`./tools/ci/images.yml` to see how the Dockerfiles for the three
releases differ from each other. To build images from the Dockerfiles
and upload them to Docker Hub, follow the instructions in the generated
Dockerfiles.

### Performance optimizations

#### Caching

An important element of making CircleCI perform effectively is
preserving, between jobs, the various caches that live under `/srv/`
in a Zulip development or production environment. In particular, we
cache the following:

* Python virtualenvs
* node_modules directories

This has a huge impact on the performance of running tests in
CircleCI; without these caches, the average test time would be several
times longer.

We have designed these caches carefully (they are also used in
production and the Zulip development environment) to ensure that each
is named by a hash of its dependencies and Ubuntu distribution name,
so Zulip should always be using the same version of dependencies it
would have used had the cache not existed. In practice, bugs are
always possible, so be mindful of this possibility.

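To make the naming scheme described above concrete, here is a minimal
Python sketch of deriving a cache name from dependency file contents
and the distribution name. It is an illustration under stated
assumptions, not Zulip's actual implementation: the function name, the
choice of SHA-1, and the sample file contents are all hypothetical. The
point is that any change to a dependency file or to the distribution
yields a different name, so a stale cache is never reused.

```python
import hashlib


def cache_dir_name(dependency_files, distro):
    """Derive a cache name from dependency file contents and the distro.

    `dependency_files` maps a file path to its contents as bytes.
    Hypothetical helper for illustration only.
    """
    digest = hashlib.sha1()
    for path in sorted(dependency_files):  # sorted for a stable result
        digest.update(path.encode())
        digest.update(b"\0")  # separator so adjacent fields cannot blur together
        digest.update(dependency_files[path])
        digest.update(b"\0")
    digest.update(distro.encode())
    return digest.hexdigest()


# Hypothetical dependency contents, not taken from the Zulip repository.
files = {
    "package.json": b'{"dependencies": {}}',
    "requirements/dev.txt": b"django\n",
}

bionic_name = cache_dir_name(files, "bionic")
focal_name = cache_dir_name(files, "focal")
print(bionic_name != focal_name)
```
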
A consequence of this caching is that test jobs for branches which
modify `package.json`, `requirements/`, and other key dependencies
will be significantly slower than normal, because they won't get to
benefit from the cache.

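As a small sketch of the consequence above, the following Python
function guesses whether a branch is likely to miss the cache, given
the list of paths it changes. The path list covers only the two
examples named in the text and is therefore incomplete; the function
name and the sample paths are hypothetical.

```python
# Assumption: only the two examples named in the text; the real set of
# cache-relevant dependency paths is larger.
DEPENDENCY_PATHS = ("package.json", "requirements/")


def likely_cache_miss(changed_paths, dependency_paths=DEPENDENCY_PATHS):
    """Return True if any changed path is a tracked dependency path."""
    for path in changed_paths:
        for dependency in dependency_paths:
            if dependency.endswith("/"):
                # Directory entry: any file underneath it counts.
                if path.startswith(dependency):
                    return True
            elif path == dependency:
                return True
    return False


# Hypothetical changed-file lists for two branches.
print(likely_cache_miss(["requirements/dev.in", "docs/index.md"]))
print(likely_cache_miss(["docs/index.md"]))
```
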