mirror of https://github.com/zulip/zulip.git
parent
928dd06cc8
commit
876fd95486
|
@ -0,0 +1,290 @@
|
|||
# Provisioning and third-party dependencies
|
||||
|
||||
Zulip is a large project, with well over 100 third-party dependencies,
|
||||
and managing them well is essential to the quality of the project. In
|
||||
this document, we discuss the various classes of dependencies that
|
||||
Zulip has, and how we manage them. Zulip's dependency management has
|
||||
some really nice properties:
|
||||
|
||||
* **Fast provisioning**. When switching to a different commit in the
|
||||
Zulip project with the same dependencies, it takes under 10 seconds
|
||||
to re-provision a working Zulip development environment after
|
||||
switching. If there are new dependencies, one only needs to wait to
|
||||
download the new ones, not all the pre-existing dependencies.
|
||||
* **Consistent provisioning**. Every time a Zulip development or
|
||||
production environment is provisioned/installed, it should end up
|
||||
using exactly the correct versions of all major dependencies.
|
||||
* **Low maintenance burden**. To the extent possible, we want to
|
||||
avoid manual work and keeping track of things that could be
|
||||
automated. This makes it easy to keep running the latest versions
|
||||
of our various dependencies.
|
||||
|
||||
The purpose of this document is to detail all of Zulip's third-party
|
||||
dependencies and how we manage their versions.
|
||||
|
||||
## Philosophy on adding third-party dependencies
|
||||
|
||||
In the Zulip project, we take a pragmatic approach to third-party
|
||||
dependencies. Overall, if a third-party project does something well
|
||||
that Zulip needs to do (and has an appropriate license), we'd love to
|
||||
use it rather than reinventing the wheel. If the third-party project
|
||||
needs some small changes to work, we prefer to make those changes and
|
||||
contribute them upstream. When the upstream maintainer is slow to
|
||||
respond, we may use a fork of the dependency until the code is merged
|
||||
upstream; as a result, we usually have a few packages in
|
||||
`requirements.txt` that are installed from a GitHub URL.
|
||||
|
||||
What we look for in choosing dependencies is whether the project is
|
||||
well-maintained. A quick look at how a project's issue tracker is
managed (or not) and at its test suite is usually enough to decide
whether it is going to be a high-maintenance dependency. That said,
we do still take on some smaller
|
||||
dependencies that don't have a well-managed project, if we feel that
|
||||
using the project will still be a better investment than writing our
|
||||
own implementation of that project's functionality. We've adopted a
|
||||
few projects in the past that had a good codebase but whose maintainer
|
||||
no longer had time for them.
|
||||
|
||||
One case where we apply added scrutiny to third-party dependencies is
|
||||
JS libraries. They are a particularly important concern because we
|
||||
want to keep the Zulip webapp's JS bundle small, so that Zulip
|
||||
continues to load quickly on systems with low network bandwidth.
|
||||
We scrutinize whether a large JS library's functionality justifies
its size much more closely than we do for Python dependencies, since
an extra 50KB of code usually doesn't matter in the backend, but
does in JavaScript.
|
||||
|
||||
## System packages
|
||||
|
||||
For the third-party services like Postgres, Redis, Nginx, and RabbitMQ
|
||||
that are documented in the
|
||||
[architecture overview](architecture-overview.html), we rely on the
|
||||
versions of those packages provided alongside the Linux distribution
|
||||
on which Zulip is deployed. Because Zulip
|
||||
[only supports Ubuntu in production](prod-requirements.html), this
|
||||
usually means `apt`, though we do support
|
||||
[other platforms in development](dev-setup-non-vagrant.html). Since
|
||||
we don't control the versions of these dependencies, we avoid relying
|
||||
on specific versions of these packages wherever possible.
|
||||
|
||||
The exact lists of `apt` packages needed by Zulip are maintained in a
|
||||
few places:
|
||||
* For production, in our puppet configuration, `puppet/zulip/`, using
|
||||
the `Package` and `SafePackage` directives.
|
||||
* For development, in `APT_DEPENDENCIES` in `tools/lib/provision.py`.
|
||||
* The packages needed to build a Zulip virtualenv, in
|
||||
`VENV_DEPENDENCIES` in `scripts/lib/setup_venv.py`. These are
|
||||
separate from the rest because (1) we may need to install a
|
||||
virtualenv before running the more complex scripts that, in turn,
|
||||
install other dependencies, and (2) because that list is shared
|
||||
between development and production.
|
||||
|
||||
We maintain a [PPA (personal package archive)][ppa] with some packages
|
||||
unique to Zulip (e.g. the `tsearch_extras` postgres extension) and
|
||||
backported versions of other dependencies (e.g. `camo`, to fix a buggy
|
||||
`init` script). Our goal is to shrink or eliminate this PPA where
|
||||
possible by getting issues addressed in the upstream distributions.
|
||||
|
||||
We also rely on the `pgroonga` PPA for the `pgroonga` postgres
|
||||
extension, used by our [full-text search](full-text-search.html).
|
||||
|
||||
## Python packages
|
||||
|
||||
We manage Python packages via the Python-standard `requirements.txt`
|
||||
system and virtualenvs, but there are a number of interesting details
|
||||
about how Zulip makes this system work well for us that are worth
|
||||
highlighting. The system is largely managed by the code in
|
||||
`scripts/lib/setup_venv.py`.
|
||||
|
||||
* **Using `pip` to manage dependencies**. This is standard in the
|
||||
Python ecosystem, and means we only need to record a list of
|
||||
versions in a `requirements.txt` file to declare what we're using.
|
||||
Since we have a few different installation targets, we maintain
|
||||
several `requirements.txt` format files in the `requirements/`
|
||||
directory (e.g. `dev.txt` for development, `prod.txt` for
|
||||
production, `docs.txt` for ReadTheDocs, `common.txt` for the vast
|
||||
majority of packages common to prod and development, etc.). We use
|
||||
`pip install --no-deps` to ensure we only install the packages we
|
||||
explicitly declare as dependencies.
|
||||
* **virtualenv with pinned versions**. For a large application like
|
||||
Zulip, it is important to ensure that we're always using consistent,
|
||||
predictable versions of all of our Python dependencies. To ensure
|
||||
this, we install our dependencies in a [virtualenv][] that contains
|
||||
only the packages and versions that Zulip needs, and we always pin
|
||||
exact versions of our dependencies in our `requirements.txt` files.
|
||||
We pin exact versions, not minimum versions, so that installing
|
||||
Zulip won't break if a dependency makes a buggy release. A side
|
||||
effect is that it's easy to debug problems caused by dependency
|
||||
upgrades, since we're always doing those upgrades with an explicit
|
||||
commit updating the `requirements/` directory.
|
||||
* **Caching of virtualenvs and packages**. To make updating the
|
||||
dependencies of a Zulip installation efficient, we maintain a cache
|
||||
of virtualenvs named by the hash of the relevant `requirements.txt`
|
||||
file (`scripts/lib/hash_reqs.py`). These caches live under
|
||||
`/srv/zulip-venv-cache/<hash>`. That way, when re-provisioning a
|
||||
development environment or deploying a new production version with
|
||||
the same Python dependencies, no downloading or installation is
|
||||
required: we just use the same virtualenv. When the only changes
|
||||
are upgraded versions, we'll use [virtualenv-clone][] to clone the
|
||||
most similar existing virtualenv and then just upgrade the packages
|
||||
needed, making small version upgrades extremely efficient. And
|
||||
finally, we use `pip`'s built-in caching to ensure that a specific
|
||||
version of a specific package is only downloaded once.
|
||||
* **Garbage-collecting caches**. We have a tool,
|
||||
`scripts/lib/clean-venv-cache`, which will clean old cached
|
||||
virtualenvs that are no longer in use. In production, the algorithm
|
||||
preserves recent virtualenvs as well as those in use by any current
|
||||
production deployment directory under `/home/zulip/deployments/`.
|
||||
This helps ensure that a Zulip installation doesn't leak large
|
||||
amounts of disk over time.
|
||||
* **Pinning versions of indirect dependencies**. We "pin" or "lock"
|
||||
the versions of our indirect dependencies with
|
||||
`tools/update-locked-requirements` (powered by `pip-compile`). What
|
||||
this means is that we have some "source" requirements files, like
|
||||
`requirements/common.txt`, that declare the packages that Zulip
|
||||
depends on directly. Those packages have their own recursive
|
||||
dependencies. When adding or removing a dependency from Zulip, one
|
||||
simply edits the appropriate "source" requirements files, and then
|
||||
runs `tools/update-locked-requirements`. That tool will use
`pip-compile` to generate the `prod_lock.txt` and `dev_lock.txt` files
|
||||
that explicitly declare versions of all of Zulip's recursive
|
||||
dependencies. For indirect dependencies (i.e. dependencies not
|
||||
explicitly declared in the source requirements files), it provides
|
||||
helpful comments explaining which direct dependency (or
|
||||
dependencies) needed that indirect dependency. The process for
|
||||
using this system is documented in more detail in
|
||||
`requirements/README.md`.
|
||||
* **Scripts**. Often, we want a script running in production to use
|
||||
the Zulip virtualenv. To make that work without a lot of duplicated
|
||||
code, we have a helpful library,
|
||||
`scripts/lib/setup_path_on_import.py`, which on import will put the
|
||||
currently running Python script into the Zulip virtualenv. This is
|
||||
called by `./manage.py` to ensure that our Django code always uses
|
||||
the correct virtualenv as well.
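
The hash-named virtualenv cache described above can be illustrated with a minimal sketch. This is not the code in `scripts/lib/hash_reqs.py`; the function names, the normalization step (ignoring comments and ordering), and the choice of SHA-1 are assumptions made purely for illustration. Only the `/srv/zulip-venv-cache` location comes from the text above.

```python
import hashlib
import os
from typing import Iterable

# Cache root named in this document; everything else here is hypothetical.
VENV_CACHE_DIR = "/srv/zulip-venv-cache"


def hash_requirements(lines: Iterable[str]) -> str:
    """Return a hex digest identifying a set of pinned requirements.

    Assumption: blank lines, comments, and ordering are ignored so that
    cosmetic edits to the file do not force a new virtualenv.
    """
    normalized = sorted(
        line.strip()
        for line in lines
        if line.strip() and not line.strip().startswith("#")
    )
    return hashlib.sha1("\n".join(normalized).encode("utf-8")).hexdigest()


def cached_venv_path(lines: Iterable[str], cache_dir: str = VENV_CACHE_DIR) -> str:
    """Directory where a virtualenv for these requirements would be cached."""
    return os.path.join(cache_dir, hash_requirements(lines))


def needs_build(lines: Iterable[str], cache_dir: str = VENV_CACHE_DIR) -> bool:
    """True when no cached virtualenv exists yet for these requirements."""
    return not os.path.isdir(cached_venv_path(lines, cache_dir))
```

With a scheme like this, two checkouts whose pinned requirements are identical resolve to the same cache directory, which is why re-provisioning with unchanged dependencies needs no downloads.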
|
||||
|
||||
## JavaScript and other frontend packages
|
||||
|
||||
We use the same set of strategies described for Python dependencies
|
||||
for most of our JavaScript dependencies, so we won't repeat the
|
||||
reasoning here.
|
||||
|
||||
* In a fashion very analogous to the Python codebase,
|
||||
`scripts/lib/node_cache.py` manages cached `node_modules`
|
||||
directories in `/srv/zulip-npm-cache`. Each is named by its hash,
|
||||
computed by the `generate_sha1sum_node_modules` function.
|
||||
`scripts/lib/clean-npm-cache` handles garbage-collection.
|
||||
* We use [yarn][], a `pip`-like tool for JavaScript, to download most
|
||||
JavaScript dependencies. Yarn talks to the standard [npm][]
|
||||
repository. We use the standard `package.json` file to declare our
|
||||
direct dependencies, with sections for development and
|
||||
production. Yarn takes care of pinning the versions of indirect
|
||||
dependencies in the `yarn.lock` file; `yarn upgrade` updates the
|
||||
`yarn.lock` file.
|
||||
* `tools/update-prod-static`. This process is discussed in detail in
|
||||
the [static asset pipeline](/front-end-build-process.html) article,
|
||||
but we don't use the `node_modules` directories directly in
|
||||
production. Instead, static assets are compiled using our static
|
||||
asset pipeline and it is the compiled assets that are served
|
||||
directly to users. As a result, we don't ship the `node_modules`
|
||||
directory in a Zulip production release tarball, which is a good
|
||||
thing, because doing so would more than double the size of a Zulip
|
||||
release tarball.
|
||||
* **Checked-in packages**. In contrast with Python, we have a few
|
||||
JavaScript dependencies that we have copied into the main Zulip
|
||||
repository under `static/third`, often with patches. These date
|
||||
from an era before `npm` existed. It is a project goal to eliminate
|
||||
these checked-in versions of dependencies and instead use versions
|
||||
managed by the npm repositories.
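
The garbage-collection policy shared by the virtualenv and `node_modules` caches (keep recent entries and anything still referenced by a deployment) can be sketched as follows. This is an illustrative sketch, not the logic of `scripts/lib/clean-venv-cache` or `scripts/lib/clean-npm-cache`; the function name, the 14-day threshold, and the use of modification times are all assumptions.

```python
from typing import Dict, List, Set

SECONDS_PER_DAY = 24 * 60 * 60


def select_stale_caches(
    cache_mtimes: Dict[str, float],
    in_use: Set[str],
    now: float,
    keep_days: int = 14,  # hypothetical retention window
) -> List[str]:
    """Return cache directories that are safe to delete.

    cache_mtimes maps each cache directory to its last-modified time
    (seconds since the epoch); in_use holds directories that some
    deployment still references and that must never be removed.
    """
    cutoff = now - keep_days * SECONDS_PER_DAY
    return sorted(
        path
        for path, mtime in cache_mtimes.items()
        if path not in in_use and mtime < cutoff
    )
```

Keeping the selection step free of filesystem calls makes the policy easy to test; a caller would gather the modification times and the in-use set, then delete whatever this function returns.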
|
||||
|
||||
## Node and Yarn
|
||||
|
||||
These are installed by `scripts/lib/install-node` (which in turn uses
|
||||
the standard third-party `nvm` installer to download `node` and pin
|
||||
its version) and `scripts/lib/third/install-yarn.sh` (the standard
|
||||
installer for `yarn`, modified to support installing to a path that is
|
||||
not the current user's home directory).
|
||||
|
||||
* `nvm` has its own system for installing each version of `node` at
|
||||
its own path, which we use, though we install a `/usr/local/bin/node`
|
||||
wrapper to access the desired version conveniently and efficiently
|
||||
(`nvm` has a lot of startup overhead).
|
||||
* `install-yarn.sh` is configured to install `yarn` at
|
||||
`/srv/zulip-yarn`. We don't do anything special to try to manage
|
||||
multiple versions of `yarn`.
|
||||
|
||||
## Other third-party and generated files
|
||||
|
||||
In this section, we discuss the other third-party dependencies,
|
||||
generated code, and other files whose original primary source is not
|
||||
the Zulip server repository, and how we provision and otherwise
|
||||
maintain them.
|
||||
|
||||
### Emoji
|
||||
|
||||
Zulip uses the [iamcal emoji data package][iamcal] for its emoji data
|
||||
and sprite sheets. We download this dependency using `npm`, and then
|
||||
have a tool, `tools/setup/build_emoji`, which reformats the emoji data
|
||||
into the files under `static/generated/emoji`. Those files are in
|
||||
turn used by our [markdown processor](markdown.html) and
|
||||
`tools/update-prod-static` to make Zulip's emoji work in the various
|
||||
environments where they need to be displayed.
|
||||
|
||||
Since processing emoji is a relatively expensive operation, as part of
|
||||
optimizing provisioning, we use the same caching strategy for the
|
||||
compiled emoji data as we use for virtualenvs and `node_modules`
|
||||
directories, with `scripts/lib/clean-emoji-cache` responsible for
|
||||
garbage-collection. This caching and garbage-collection is required
|
||||
because a correct emoji implementation involves over 1000 small image
|
||||
files and a few large ones. There is a more extended article on our
|
||||
[emoji infrastructure](emoji.html).
|
||||
|
||||
### Translations data
|
||||
|
||||
Zulip's [translations infrastructure](translating.html) generates
|
||||
several files from the source data, which we manage similarly to our
|
||||
emoji, but without the caching (and thus without the
|
||||
garbage-collection). New translations data is downloaded from
|
||||
Transifex and then compiled to generate both the production locale
|
||||
files and also language data in `static/locale/language*.json` using
|
||||
`manage.py compilemessages`, which extends the default Django
|
||||
implementation of that tool.
|
||||
|
||||
### Pygments data
|
||||
|
||||
The list of languages supported by our markdown syntax highlighting
|
||||
comes from the [pygments][] package. `tools/setup/build_pygments_data.py` is
|
||||
responsible for generating `static/generated/pygments_data.js` so that
|
||||
our JavaScript markdown processor has access to the supported list.
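
As a rough illustration of this kind of generation step, the sketch below collects language aliases from lexer metadata and serializes them. It is not the contents of `tools/setup/build_pygments_data.py`; the function names and the `{"langs": [...]}` layout are hypothetical, and the real generated file may be structured differently. The input tuples follow the `(name, aliases, filenames, mimetypes)` shape yielded by `pygments.lexers.get_all_lexers()`.

```python
import json
from typing import Iterable, List, Sequence, Tuple

# Shape of each item yielded by pygments.lexers.get_all_lexers().
LexerInfo = Tuple[str, Sequence[str], Sequence[str], Sequence[str]]


def collect_aliases(lexers: Iterable[LexerInfo]) -> List[str]:
    """Return the sorted, de-duplicated language aliases of all lexers."""
    aliases = set()
    for _name, lexer_aliases, _filenames, _mimetypes in lexers:
        aliases.update(lexer_aliases)
    return sorted(aliases)


def render_language_data(aliases: List[str]) -> str:
    """Serialize the alias list as JSON (hypothetical output layout)."""
    return json.dumps({"langs": aliases}, indent=2)


# Typical use, assuming pygments is installed:
#     from pygments.lexers import get_all_lexers
#     print(render_language_data(collect_aliases(get_all_lexers())))
```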
|
||||
|
||||
### Authors data
|
||||
|
||||
Zulip maintains data on the developers who have contributed the most to
|
||||
the current version of Zulip on the `/about` page. These data are
|
||||
fetched using the GitHub API with `tools/update-authors-json`. In
|
||||
development, it just returns some basic test data to avoid adding load
|
||||
to GitHub's APIs unnecessarily; it's primarily run as part of building
|
||||
a release tarball.
|
||||
|
||||
## Modifying provisioning
|
||||
|
||||
When making changes to Zulip's provisioning process or dependencies,
|
||||
usually one needs to think about making changes in 3 places:
|
||||
|
||||
* `tools/lib/provision.py`. This is the main provisioning script,
|
||||
used by most developers to maintain their development environment.
|
||||
* `docs/dev-setup-non-vagrant.md`. This is our "manual installation"
|
||||
documentation. Strategically, we'd like to move the support for more
|
||||
versions of Linux from here into `tools/lib/provision.py`.
|
||||
* Production. Our tools for compiling/generating static assets need
|
||||
to be called from `tools/update-prod-static`, which is called by
|
||||
`tools/build-release-tarball` (for doing Zulip releases) as well as
|
||||
`tools/upgrade-zulip-from-git` (for deploying a Zulip server off of
|
||||
master).
|
||||
|
||||
[virtualenv]: https://virtualenv.pypa.io/en/stable/
|
||||
[virtualenv-clone]: https://github.com/edwardgeorge/virtualenv-clone/
|
||||
[yarn]: https://yarnpkg.com/
|
||||
[ppa]: https://launchpad.net/~tabbott/+archive/ubuntu/zulip
|
||||
[iamcal]: https://github.com/iamcal/emoji-data
|
||||
[pygments]: http://pygments.org/
|
|
@ -5,6 +5,9 @@ developing new features for Zulip that require front-end changes,
|
|||
especially those that involve adding new files. For a more general
|
||||
overview, see the [new feature tutorial](new-feature-tutorial.html).
|
||||
|
||||
Our [dependencies documentation](dependencies.html) has useful
|
||||
relevant background as well.
|
||||
|
||||
## Primary build process
|
||||
|
||||
Most of the existing JS in Zulip is written
|
||||
|
@ -15,7 +18,10 @@ eval with sourcemaps. In production mode (and when creating a release tarball
|
|||
using `tools/build-release-tarball`), JavaScript files are concatenated and
|
||||
minified. We use the
|
||||
[django pipeline extension](https://django-pipeline.readthedocs.io/en/latest/)
|
||||
to manage our static assets, webpack, and typescript.
|
||||
to manage our static assets, webpack, and typescript. The main
|
||||
internal tool that does all of this work is
|
||||
`tools/update-prod-static`, which is called by both
|
||||
`tools/build-release-tarball` and `tools/upgrade-zulip-from-git`.
|
||||
|
||||
## Adding static files
|
||||
|
||||
|
|
|
@ -128,6 +128,7 @@ Contents:
|
|||
:maxdepth: 2
|
||||
:caption: Subsystem documentation
|
||||
|
||||
dependencies
|
||||
settings
|
||||
events-system
|
||||
queuing
|
||||
|
|