The previous model for these Nagios checks was kinda crazy -- every
minute, we'd run a full `rabbitmqctl list_consumers` for each of the
dozen+ consumers that we have, and then repeat the exact same parsing
logic each time to determine whether the target queue had a running
consumer, writing the result out to a state file.
Because `rabbitmqctl list_consumers` takes a small amount of resources
per run, on systems where CPU is very limited (e.g. t2 style AWS
instances), this repeated CPU wastage could be problematic.
Now we just do that `rabbitmqctl list_consumers` once per minute, and
output all the state files from a single command.
Further TODO items on this front include removing the hardcoded list
of queues.
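As a rough illustration of the single-pass approach described above, here is a minimal sketch. It assumes the `rabbitmqctl list_consumers` output is tab-separated with the queue name in the first column; the state file name and the `status|message` format are hypothetical and not taken from the actual check.

```python
from pathlib import Path
from typing import Dict, Iterable, Set


def queues_with_consumers(list_consumers_output: str) -> Set[str]:
    """Collect queue names from `rabbitmqctl list_consumers` output.

    Assumption: rows are tab-separated and the queue name is the
    first column.
    """
    queues = set()
    for line in list_consumers_output.splitlines():
        parts = line.split("\t")
        # Skip banner lines such as "Listing consumers ..." (no tabs).
        if len(parts) < 2:
            continue
        queues.add(parts[0].strip())
    return queues


def write_state_files(
    output: str, expected_queues: Iterable[str], state_dir: Path
) -> Dict[str, int]:
    """Parse the output once and write one state file per expected queue."""
    consuming = queues_with_consumers(output)
    statuses = {}
    for queue in expected_queues:
        # Nagios convention: 0 is OK, 2 is CRITICAL.
        status = 0 if queue in consuming else 2
        message = "has a running consumer" if status == 0 else "has no consumer"
        # Hypothetical file name and format, for illustration only.
        state_file = state_dir / f"check-rabbitmq-consumers-{queue}"
        state_file.write_text(f"{status}|queue {queue} {message}\n")
        statuses[queue] = status
    return statuses
```

In a real check, the `output` argument would come from running the command once per minute, e.g. via `subprocess.check_output(["rabbitmqctl", "list_consumers"], text=True)`.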
Because rabbitmq doesn't support changing the nodename of a running
rabbitmq node, Zulip installations suffered a plague of issues where
e.g. a Zulip server would reboot, the hostname would change, and
suddenly the local rabbitmq instance being used by Zulip would stop
working.
We address this problem by using, by default, a fixed rabbitmq
nodename, but providing server administrators the option to set the
rabbitmq nodename used by Zulip however they choose.
To upgrade an existing server to use this new configuration, one will
need to add something like the following to /etc/zulip/zulip.conf:
    [rabbitmq]
    nodename = zulip@localhost
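For illustration only, a setting like this could be read with a fallback roughly as follows. This is a sketch using Python's `configparser`, not the puppet code that actually consumes the setting; the helper name and the fallback value are assumptions.

```python
import configparser

# Hypothetical fallback; the real default lives in the Zulip configuration.
DEFAULT_NODENAME = "zulip@localhost"


def rabbitmq_nodename(conf_text: str, default: str = DEFAULT_NODENAME) -> str:
    """Return the [rabbitmq] nodename from zulip.conf-style text.

    Falls back to `default` when the section or option is missing.
    """
    parser = configparser.RawConfigParser()
    parser.read_string(conf_text)
    return parser.get("rabbitmq", "nodename", fallback=default)
```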
However, I don't believe we have the puppet code in place to make this
work correctly at initial installation unless rabbitmq-server is
already installed (but not running). That is easy to set up in Travis
CI, but I haven't been willing to do it for the installer. So for now,
this just fixes our Travis CI problems.
Fixes: #1579.
Travis CI seems to have changed the way the snakeoil SSL certs are
generated in their infrastructure, so we need to update our expected
"success" HTTP headers accordingly.
It seems that we no longer get the message, 'zerver/lib/actions.py
modified; restarting server', but the server reloads successfully
nonetheless.
Fixes: #1341.
The find-add-class tool, when in lint mode, verifies that we can
understand all calls to addClass from our JS code.
When in non-lint mode, i.e. verbose mode, the tool prints out a
list of tuples of (fn, class) that we can use as we wish in other
tools.
We were ignoring singleton tags like "input" tags in
html-grep. This was an artifact of our tokenizer originally
being built to check indentation of templates, for which
singleton tags had been a distraction. This fix actually cleans up
the template checking logic as well, since it can now rely
on the tokenizer to classify special tags and singleton tags.
The tokenizer is more complete and more specific.
This reverts commit 3f95e567c1.
Apparently `apt-add-repository` fails periodically in CI. I suspect
this is some sort of silly networking problem, but given that all
we're saving is a few lines of code, the old version was better if
this fails basically ever.
Now, `tools/test-all` calls a new program called `tools/tests-tools`
that runs unit tests in `test_css_parser.py` and `test_template_parser.py`.
This puts 100% line coverage on tools/lib/css_parser.py.
This puts about 50% line coverage on tools/lib/template_parser.py.
`tools/lint-all` now calls the new `tools/check-css`.
The css_parser library parses CSS into a data structure
that remembers line numbers and columns of semantically
meaningful tokens and adjoining white space/tokens. It
is intended to be used for various linting tasks.
The file `tools/check-css` runs a few files through the
parser and makes sure they round trip. This has some value
right away, as files that fail to parse will cause an
exception to be thrown and thus alert developers to syntax
errors. We expect to grow this into more advanced linting
tasks eventually.
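The round-trip idea can be sketched generically as below. The tokenizer here is a toy stand-in that keeps whitespace runs as tokens; it is not the css_parser library, and the function names are hypothetical.

```python
import re
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Toy stand-in for a parser: split into whitespace and non-whitespace runs."""
    return re.findall(r"\s+|\S+", text)


def round_trips(
    text: str,
    parse: Callable[[str], List[str]],
    serialize: Callable[[List[str]], str],
) -> bool:
    """Return True if serializing the parsed text reproduces it exactly."""
    return serialize(parse(text)) == text
```

A parser that keeps every character (including whitespace) passes this check, while a lossy one such as `str.split` paired with `" ".join` fails on input like `"a  b"`, because the doubled space is lost.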
`npm install` occasionally fails nondeterministically; this change
makes such failures likely to be resolved automatically in most cases
by simply retrying.
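A minimal sketch of the retry idea, assuming a fixed number of attempts with a short pause in between. The helper name, attempt count, and delay are illustrative and are not taken from the actual change.

```python
import subprocess
import time
from typing import Sequence


def run_with_retries(
    command: Sequence[str], attempts: int = 3, delay_seconds: float = 1.0
) -> bool:
    """Run `command` until it exits 0, trying at most `attempts` times."""
    for attempt in range(1, attempts + 1):
        if subprocess.call(list(command)) == 0:
            return True
        # Pause before the next attempt, but not after the last one.
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False
```

For example, `run_with_retries(["npm", "install"])` would give a flaky install up to three chances before reporting failure.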