We now use the `--streamlined` options for `run-dev.py`
when we use `test_server_running` for `test-api` and
`test-js-with-casper` (and its experimental
replacement, `test-js-with-puppeteer`).
This means we don't slow anything down with
processes like thumbor, process_fts_updates, etc.,
which aren't meaningfully exercised by these tests.
We may eventually want some tests to meaningfully
exercise those processes, and when that day comes,
we will need to add an extra argument to
`test_server_running`, probably, but until then,
we just always set `--streamlined` in that codepath.
There is actually a tool called `./tools/test-run-dev`
that we run in CI, and it will use the full mode.
It just doesn't verify much stuff--it mostly polls
the server without testing specific features.
This seems to save about 1s of the startup time on a system I use
(~10.6s -> ~9.7s).
This will give help up write new digest only if the db rebuild
succeeds. We were relying on the caller to
be successful in building db, this was hacky and unreliable.
We write new db digest once the caller succeeds, this ensures
that we write new digest after every successful attempt.
This fixes the anomality we were facing that Databases were rebuild
on the 2nd provision attempt with no changes to files or migrations.
This was happening because we didn't write a new digest for db
after the first provision (The case of DB didn't exist).
During the 1st provision, we check the template_status() of
Database both Dev and Test, but database_exists() of Databases
obviously returned false, and we rebuild the database,
but forgot to write_new_digest and hence the anomaly in the
second provision explained above.
Yes, it's slightly janky to create an
argparse.Namespace object like this, but it
saves us from shelling out to a script whose
only real value-add is parsing a single
`threshold_days` argument.
This saves about 130ms for a no-op provision.
We now prevent these variations:
* <hr/>
* <hr />
* <br/>
* <br />
We could enforce similar consistency for other void
tags, if we wished, but these two are particularly
prevalent.
Since in travis we don't have root access so we used to add different
srv path. As now we shifted our production suites to Circle CI
we don't need that code so removed it.
Also we used a hacky code in commit-lint-message for travis which is
now of no use.
We now forbid tags of the form `<foo ... />` in most
places, and we also forbid it even for several void
tags.
We make exceptions for tags that are already formatted
in two different ways in our codebase. This is mostly
svg tags, plus these common cases:
- br
- hr
- img
- input
It would be nice to lock down a convention for these,
even though the HTML specification is unopinionated
on these. We'll probably want to stay flexible for
svg tags, since they are sometimes copy/pasted from
other sources (although it's probably rare enough for
them that we can tolerate just doing minor edits as
needed).
If folks put something like '<br/>' in the HTML,
we would think the tag's name was "br/" instead
of "br". I think we were assuming most folks
would write either "<br>" or <br />".
ASIDE:
We should probably have a consistent
preference among these styles:
* <br>
* <br/>
* <br />
I prefer the first.
The new tools now have more concise, more parallel names:
- rebuild-dev-database
- rebuild-test-database
The actual implementations are still pretty different:
rebuild-dev-database:
mostly delegates to 5 management scripts
rebuild-test-database:
is a very thin wrapper for generate-fixtures
We'll try to clean that up a bit soon.
This tool was part of a very ad hoc investigation
during 2017 into our JS dep dependencies.
It's very out of date, and it has a non-trivial
maintenance cost, as these type of tools seem
to come up in every code sweep.
Restart postgres service if provision is called in production test suite.
This is required because terminate-psql-sessions script (used
in tools/ci/setup-production) throws error if postgres service is not running.
Restart rabbitmq service if provision is called in production test suite.
This is done to start the node as Circle CI don't start services on installation.
Removed memcached restart as flush-memcached script (which is furthur
used in tools/ci/production) throws UNKNOWN READ FAILURE if memcached is restarted
in development.
We now have two functions related to digests
for processes:
is_digest_obsolete
write_digest_file
In most cases we now **wait** to write the
digest file until after we've successfully
run a process with its new inputs.
In one place, for database migrations, we
continue to write the digest optimistically.
We'll want to fix this, but it requires a
little more code cleanup.
Here is the typical sequence of events:
NEVER RUN -
is_digest_obsolete returns True
quickly (we don't compute a hash)
write_digest_file does a write (duh)
AFTER NO CHANGES -
is_digest_obsolete returns False
after reading one file for old
hash and multiple files to compute
hash
most callers skip write_digest_file
(no files are changed)
AFTER SOME CHANGES -
is_digest_obsolete returns False
after doing full checks
most callers call write_digest_file
*after* running a process
There's no real reason to do the lazy import any
more, as we use this unconditionally inside `main`
(indirectly), and `provision_inner` runs after we
have set up the venv.
I make these all functions for consistency,
and in particular I want to continue to avoid
`glob.glob` calls until we are actually
computing hashes.
This is mostly a prep to allow us to do
hashing in two separate places:
- check hashes
- update hashes
We would only update hashes **after** running
processes anew.
For `provision_inner` I considered using a
class to put the three path-related helpers
into a mini namespace, but it felt too heavy.
It wouldn't be completely implausible here
to extract something like a JSON config
file that has a list of globs for each
process that we do path-hashing for, but I
want to clean up other stuff first.
We no longer need to maintain duplicate code
related to where we set up the emoji
cache directory.
And we no longer need two extra steps for
people doing advanced (i.e. manual) setup.
There was no clear benefit to having provision
build the cache directory for `build_emoji`,
when it was easy to make `build_emoji` more
self-sufficient. The `build_emoji` tool
was already importing the library that has
`run_as_root`, and it was already responsible
for 99% of the create-directory kind of tasks.
(We always call `build_emoji` unconditionally from
`provision`, so there's no rationale in terms
of avoiding startup time or something.)
ASIDE:
Its not completely clear to me why we need
to put this directory in "/srv", instead of
somewhere more local (like we already do for
Travis), but maybe it's just to be like
its siblings in "/srv":
node_modules
yarn.lock
zulip-emoji-cache
zulip-npm-cache
zulip-py3-venv
zulip-thumbor-venv
zulip-venv-cache
zulip-yarn
I guess the caches that we keep in var are
dev-only, although I think some of what's under
`zulip-emoji-cache` is also dev-only in nature?
./var/webpack-cache
./var/mypy-cache
In `docs/subsystems/emoji.md` we say this:
```
The `build_emoji` tool generates the set of files under
`static/generated/emoji` (or really, it generates the
`/srv/zulip-emoji-cache/<sha1>/emoji` tree, and
`static/generated/emoji` is a symlink to that tree;we do this in
order to cache old versions to make provisioning and production
deployments super fast in the common case that we haven't changed the
emoji tooling). [...]
```
I don't really understand that rationale for the development
case, since `static/generated` is as much ignored by `git` as
'/srv' is, without the complications of needing `sudo` to create it.
And in production, I'm not sure how much time we're really saving,
as it takes me about 1.4s to fully rebuild the cache in dev, not to
mention we're taking on upgrade risk by sharing files between versions.
If the directory `templates/zerver/emails/compiled/`
is missing, then we need to run `inline_email_css`
again.
This can happen if somebody gets overzealous about
cleaning untracked files.
This is more encapsulated and more efficient.
In the cases where `is_force` is `True` or
`pygments_data.json` is missing, we now avoid
the unnecessary step of importing `pygments`, at
least up front.
(Of course, we probably import that once we generate
the artifacts.)
If somebody is having issues with provision, it's
plausible they'll do something like `git clean -fX`
to clean up old artifacts of earlier provision runs,
as part of debugging things.
We defend against this by detecting the most obvious
symptom as cheaply as possible.
I remove `is_force` from `file_or_package_hash_updated`
and modernize its mypy annotations.
If `is_force` is `True`, we just now run the thing
we want to force-run without having to call
`file_or_package_hash_updated` to expensively
and riskily return `True`.
Another nice outcome of this change is that if
`file_or_package_hash_updated` returns `True`,
you can know that the file or package has
indeed been updated.
For the case of `build_pygments_data` we also
skip an `os.path.exists` check when `is_force`
is `True`.
We will short-circuit more logic in the next
few commits, as well as cleaning up some of
the long/wrapper lines in the `if` statements.
We change the message for skipping RabbitMQ
configuration to match nearby messages:
No need to run `tools/setup/build_pygments_data`.
No need to run `scripts/setup/inline_email_css.py`.
No need to run `scripts/setup/configure-rabbitmq.
No need to regenerate the dev DB.
No need to regenerate the test DB.
No need to run `manage.py compilemessages`.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This is verbatim from Git upstream, at an older version. (The one
change since then is to add localization for the messages like "You
have unstaged changes" -- which complicates the code, is important and
worth it for Git itself, but for our tools we can do without.)
This function will replace our use of `git diff-index --quiet HEAD`
in several scripts. The key differences in behavior are:
* The `git update-index --refresh`. Without this, on Windows
apparently `git diff-index` routinely (but not all the time!)
reports that tons of files have changed. See report:
https://chat.zulip.org/#narrow/stream/9-issues/topic/.2E.2Ftools.2Ffetch-pull-request.20issue/near/834435
* Instead of one command comparing the worktree to HEAD, we
separately compare the worktree to the index and the index to
HEAD, and abort if either diff is nonempty. This one is obvious,
but rather an edge case (it matters only if you've managed to
make the worktree and HEAD agree while the index has some
changes), and the extra code is annoying if written out in every
script that needs it. But that's what a subroutine is for. :-)
We'll make a few tweaks before actually switching to use this.
Add sgrep (sgrep.dev) to tooling and include simple rule as
proof of concept. Included rule detects use of old django render
function.
Also added a rule that looks for if-else statements where both
code paths are identical.