We’ve always been running CI on both push events and pull_request
events, which means it runs twice for commits that are pushed to a
pull request.
Filter the push events by branch name. Add the workflow_dispatch
event in case developers want to manually run CI on some other branch
that isn’t a pull request.
https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Comments out the steps in 'Create cache directories' that use
`actions/cache@2` so that the CI and production build can pass
while Github support issue is processed.
See https://github.com/actions/cache/issues/794 for an upstream report.
As a consequence:
• Bump minimum supported Python version to 3.8.
• Move Vagrant environment to Ubuntu 20.04, which has Python 3.8.
• Move CI frontend tests to Ubuntu 20.04.
• Move production build test to Ubuntu 20.04.
• Move 3.4 upgrade test to Ubuntu 20.04.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The job name is just the constant `production_build`. Renaming it to
have the OS in the key ensures that it is not shared across OS'es (for
instance between `4.x` and `main`, which are now bionic and buster,
respectively), and also allows it to share caches with the install
step, which uses the OS name in that place.
As a consequence:
• Bump minimum supported Python version to 3.7.
• Move Vagrant environment to Debian 10, which has Python 3.7.
• Move CI frontend tests to Debian 10.
• Move production build test to Debian 10.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
It should not use the configured zulip username, but should instead
pull from the login user (likely `nagios`), or an explicit alternate
provided PostgreSQL username. Failure to do so results in Nagios
failures because the `nagios` login does not have permissions to
authenticated the `zulip` PostgreSQL user.
This requires CI changes, as the install tests install as the `zulip`
login username, which allowed Nagios tests to pass previously; with
the custom database and username, however, they must be passed to
process_fts_updates explicitly when validating the install.
Production installs do not use the zilencer application, but the tests
do include it; as such, changes to any files which reference zilencer
are more likely to pass tests but fail production installs.
Run production tests when those files are changed.
We make a few adjustments:
* We now run full CI whenever pushing to master. It's cheap enough
that it's worth getting accurate signal.
* We now don't run production tests on PRs for changes to JavaScript/CSS
in static/ that don't also affect the webpack configuration.
* We sort the list of paths that trigger tests.
When Github Actions run in Docker, the default pid 1 entrypoint is
`tail -f /dev/null`. PID 1 is responsible for propagating signals to
its children, and calling `waitpid()` on defunct processes; `tail`
does not do these things. This results in zombie processes piling up
inside the container, which is not an issue in most contexts.
However, it affects `start-stop-daemon`, which hangs when stopping
daemon processes, as they are never reaped. This appears in CI as
`/etc/init.d/supervisor restart` never being able to succeed.
Run the docker container with `--init`, which spawns a
`/sbin/docker-init` PID 1 to handle the job of an init process.
This ensures that we exercise the fact that the Zulip installer may be
unpacked to a directory that may not be world-readable.
bc45525369 fixed a recent regression in
this behavior that would have been caught by this commit.
Thumbor and tc-aws have been dragging their feet on Python 3 support
for years, and even the alphas and unofficial forks we’ve been running
don’t seem to be maintained anymore. Depending on these projects is
no longer viable for us.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We had used 2>&1 to redirect stderr to stdout so it could be piped
into ts, but commit dd3cdd6ec5 (#17611)
removed ts, so we no longer need the redirection.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We support Debian as an OS for setting up the Zulip server. But the CI
does not run on pull request to test the setting up of the server on
Debian. Hence, add the check to CI.
Timestamps are logged automatically by GitHub Actions and can be
made visible using log settings easily. Hence we remove the
unnecessary timestamps here to make the logs look much cleaner.
This prevents Zulip CI from eventually consuming large amounts of
storage on one's GitHub account.
I picked a longer retention period for the Puppeteer artifacts because
humans look at those; the production tarballs are unlikely to be used
10 minutes after the run completes as they are just for the next stage
fo the build; certainly 14 days seems ample for any debugging.
The hash keys were missing hash for package.json and yarn.lock
because they were not present since we don't do a full checkout
in this job. We fix this by sending over those files and generating
hashes from them.
I usally verify these cache keys by clicking the Restore <cache>
step dropdown menu and then clicking the Run ... dropdown menu again
to see the generated hash.
All the steps are same from circleci except two steps:
1. The 'Add permissions ...' step is Actions specific as explained
in comments.
2. The step that used upload-artifacts is Actions verison of
presist_to_workspace.
Finally, I should note the duplication in this and zulip-ci
workflow. There are three reason this is not a problem:
1. It will be messy to mush this into zulip-ci workflow only for
benefit of un-duplicating the env and cache restore steps.
2. We needs this on its own workflow if we want to only run it
when production related dependencies are updated.
3. I don't see us updating the duplicated steps between both
workflow. Circle CI config is prefect example for this; nothing
is changed except for adding or updating steps which are not
duplicated.