The new tools now have more concise, more parallel names:
- rebuild-dev-database
- rebuild-test-database
The actual implementations are still pretty different:
rebuild-dev-database:
mostly delegates to 5 management scripts
rebuild-test-database:
is a very thin wrapper for generate-fixtures
We'll try to clean that up a bit soon.
We now have two functions related to digests
for processes:
is_digest_obsolete
write_digest_file
In most cases we now **wait** to write the
digest file until after we've successfully
run a process with its new inputs.
In one place, for database migrations, we
continue to write the digest optimistically.
We'll want to fix this, but it requires a
little more code cleanup.
Here is the typical sequence of events:
NEVER RUN -
is_digest_obsolete returns True
quickly (we don't compute a hash)
write_digest_file does a write (duh)
AFTER NO CHANGES -
is_digest_obsolete returns False
after reading one file for old
hash and multiple files to compute
hash
most callers skip write_digest_file
(no files are changed)
AFTER SOME CHANGES -
is_digest_obsolete returns True
after doing full checks
most callers call write_digest_file
*after* running a process
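
The sequence above can be sketched roughly as follows. This is only
an illustration, not the real implementation: the signatures, the
`compute_digest` helper, and the `run_if_needed` caller are all
assumptions made for the example.

```python
import hashlib
import os
from typing import Callable, List


def compute_digest(paths: List[str]) -> str:
    # Hypothetical helper: hash the contents of every input file
    # in a stable order.
    sha1 = hashlib.sha1()
    for path in sorted(paths):
        with open(path, "rb") as f:
            sha1.update(f.read())
    return sha1.hexdigest()


def is_digest_obsolete(digest_path: str, paths: List[str]) -> bool:
    # NEVER RUN: there is no stored digest, so we answer quickly
    # without computing a hash.
    if not os.path.exists(digest_path):
        return True
    # Otherwise read one file for the old hash and all inputs for
    # the new hash.
    with open(digest_path) as f:
        old_digest = f.read().strip()
    return compute_digest(paths) != old_digest


def write_digest_file(digest_path: str, paths: List[str]) -> None:
    with open(digest_path, "w") as f:
        f.write(compute_digest(paths))


def run_if_needed(
    digest_path: str, paths: List[str], process: Callable[[], None]
) -> bool:
    # Hypothetical caller: the digest is written only *after* the
    # process has run, so a failed run leaves the old digest alone.
    if not is_digest_obsolete(digest_path, paths):
        return False
    process()
    write_digest_file(digest_path, paths)
    return True
```

If `process` raises, `write_digest_file` is never reached, so the
next call still sees the digest as obsolete and tries again.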
There's no real reason to do the lazy import any
more, as we use this unconditionally inside `main`
(indirectly), and `provision_inner` runs after we
have set up the venv.
I make all of these functions for consistency,
and in particular I want to continue to avoid
`glob.glob` calls until we are actually
computing hashes.
This is mostly a prep to allow us to do
hashing in two separate places:
- check hashes
- update hashes
We would only update hashes **after** running
processes anew.
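
A minimal sketch of the idea, assuming hypothetical helper names
(`make_path_getter`, `hash_changed`, `update_hash`) that are not
taken from the actual code: the glob lives inside a function, so
nothing touches the filesystem until a hash is really computed.

```python
import glob
import hashlib
from typing import Callable, List

PathGetter = Callable[[], List[str]]


def make_path_getter(pattern: str) -> PathGetter:
    # The glob.glob call is deferred until get_paths() is invoked.
    def get_paths() -> List[str]:
        return sorted(glob.glob(pattern))

    return get_paths


def compute_hash(get_paths: PathGetter) -> str:
    sha1 = hashlib.sha1()
    for path in get_paths():
        with open(path, "rb") as f:
            sha1.update(f.read())
    return sha1.hexdigest()


def hash_changed(get_paths: PathGetter, old_hash: str) -> bool:
    # Place one: check hashes before deciding whether to run.
    return compute_hash(get_paths) != old_hash


def update_hash(get_paths: PathGetter) -> str:
    # Place two: recompute the hash after the process has run.
    return compute_hash(get_paths)
```

Both places share the same path getter, so the list of globs is
written down only once.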
For `provision_inner` I considered using a
class to put the three path-related helpers
into a mini namespace, but it felt too heavy.
It wouldn't be completely implausible here
to extract something like a JSON config
file that has a list of globs for each
process that we do path-hashing for, but I
want to clean up other stuff first.
We no longer need to maintain duplicate code
related to where we set up the emoji
cache directory.
And we no longer need two extra steps for
people doing advanced (i.e. manual) setup.
There was no clear benefit to having provision
build the cache directory for `build_emoji`,
when it was easy to make `build_emoji` more
self-sufficient. The `build_emoji` tool
was already importing the library that has
`run_as_root`, and it was already responsible
for 99% of the create-directory kind of tasks.
(We always call `build_emoji` unconditionally from
`provision`, so there's no rationale in terms
of avoiding startup time or something.)
ASIDE:
It's not completely clear to me why we need
to put this directory in "/srv", instead of
somewhere more local (like we already do for
Travis), but maybe it's just to be like
its siblings in "/srv":
node_modules
yarn.lock
zulip-emoji-cache
zulip-npm-cache
zulip-py3-venv
zulip-thumbor-venv
zulip-venv-cache
zulip-yarn
I guess the caches that we keep in var are
dev-only, although I think some of what's under
`zulip-emoji-cache` is also dev-only in nature?
./var/webpack-cache
./var/mypy-cache
In `docs/subsystems/emoji.md` we say this:
```
The `build_emoji` tool generates the set of files under
`static/generated/emoji` (or really, it generates the
`/srv/zulip-emoji-cache/<sha1>/emoji` tree, and
`static/generated/emoji` is a symlink to that tree; we do this in
order to cache old versions to make provisioning and production
deployments super fast in the common case that we haven't changed the
emoji tooling). [...]
```
I don't really understand that rationale for the development
case, since `static/generated` is as much ignored by `git` as
'/srv' is, without the complications of needing `sudo` to create it.
And in production, I'm not sure how much time we're really saving,
as it takes me about 1.4s to fully rebuild the cache in dev, not to
mention we're taking on upgrade risk by sharing files between versions.
If the directory `templates/zerver/emails/compiled/`
is missing, then we need to run `inline_email_css`
again.
This can happen if somebody gets overzealous about
cleaning untracked files.
This is more encapsulated and more efficient.
In the cases where `is_force` is `True` or
`pygments_data.json` is missing, we now avoid
the unnecessary step of importing `pygments`, at
least up front.
(Of course, we probably import that once we generate
the artifacts.)
If somebody is having issues with provision, it's
plausible they'll do something like `git clean -fX`
to clean up old artifacts of earlier provision runs,
as part of debugging things.
We defend against this by detecting the most obvious
symptom as cheaply as possible.
I remove `is_force` from `file_or_package_hash_updated`
and modernize its mypy annotations.
If `is_force` is `True`, we now just run the thing
we want to force-run without having to call
`file_or_package_hash_updated` to expensively
and riskily return `True`.
Another nice outcome of this change is that if
`file_or_package_hash_updated` returns `True`,
you can know that the file or package has
indeed been updated.
For the case of `build_pygments_data` we also
skip an `os.path.exists` check when `is_force`
is `True`.
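
The resulting order of checks looks roughly like this. It is a
sketch under assumptions: `need_to_run` and its parameters are
made-up names, and `hash_updated` stands in for a call to
`file_or_package_hash_updated` with its real arguments.

```python
import os
from typing import Callable


def need_to_run(
    is_force: bool,
    artifact_path: str,
    hash_updated: Callable[[], bool],
) -> bool:
    # Forced runs skip every other check, including os.path.exists.
    if is_force:
        return True
    # A missing artifact is the cheapest symptom to detect.
    if not os.path.exists(artifact_path):
        return True
    # Only now pay for the expensive hash comparison. Its result
    # no longer has to lie about whether anything changed.
    return hash_updated()
```

Passing the hash check as a callable keeps it from being evaluated
in the forced and missing-artifact cases.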
We will short-circuit more logic in the next
few commits, as well as cleaning up some of
the long/wrapped lines in the `if` statements.
We change the message for skipping RabbitMQ
configuration to match nearby messages:
No need to run `tools/setup/build_pygments_data`.
No need to run `scripts/setup/inline_email_css.py`.
No need to run `scripts/setup/configure-rabbitmq`.
No need to regenerate the dev DB.
No need to regenerate the test DB.
No need to run `manage.py compilemessages`.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
While it's a bit of extra complexity to do this check, which I'm not
excited about, we've had multiple folks spend significant time being
confused after rebasing past d7d8632525 led to `pygments_data.json`
being deleted, with provision not rebuilding it, so
this seems worth merging as a transitional fix even if we decide to
remove it in 2 months.
1) Created a new class `DatabaseType` and access its objects inside
`template_database_status()` instead of sending five arguments with
default values.
2) Made `check_files` and `setting_name` local variables instead of
function parameters since they had the same value (`None`) for every call.
Fixes #13845.
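
The shape of the `DatabaseType` refactor in point 1 can be
illustrated like this. The field names, the two instances, and the
body of `template_database_status` are hypothetical and only show
the pattern of passing one object instead of many defaulted
arguments.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseType:
    # Hypothetical fields; the real class may carry different data.
    database_name: str
    settings_module: str
    status_dir: str


# Hypothetical instances, one per kind of database.
DEV_DATABASE = DatabaseType("zulip", "zproject.settings", "var/dev_status")
TEST_DATABASE = DatabaseType(
    "zulip_test_template", "zproject.test_settings", "var/test_status"
)


def template_database_status(database_type: DatabaseType) -> str:
    # Simplified stand-in logic: a missing status directory means
    # the template database has to be rebuilt.
    if not os.path.exists(database_type.status_dir):
        return "needs_rebuild"
    return "current"
```

Callers then write `template_database_status(DEV_DATABASE)` rather
than repeating the same group of keyword arguments.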
We'll soon be documenting a production workflow that involves using
it, and that means it needs to live under scripts/ (since tools/ isn't
present in release tarballs).
Mismatching imports from outside and inside the virtualenv in the same
process was causing segfaults after apparently benign changes to the
script!
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>