* Now queue_workers.py sorts queue names and prints them on their own
line. Previously it's output was nondeterministic.
* Simplified grep strategy for removing the "test" worker.
This list was likely to end up out of date quickly, since it wasn't
documented that you need to update it when adding a queue. The best
solution is to just not require it to be updated.
The original test was written in shell script which launches a new
django instance for every tests. By doing it in Python, we avoid
the overhead and reduce the test time to <1 second.
Fixes#3620.
Before this commit, provisioning was done by executing provision.py,
which printed the log directly to stdout, making debugging harder.
This commit creates a wrapper bash script 'provision' in tools, which
calls 'zulip/scripts/tools/provision_vm.py' (the new location of
provision.py) and prints all the output to
'zulip/var/log/zulip/zulip_provision.log' via 'tee'.
Travis tests and docs have been modified accordingly.
- Add websocket client to create connection with SockJS websocket server.
It contains callback method to launch after connection setup.
- Add '--websocket' parameter to 'check_send_receive_time' script to
check websocket connection.
- Add testing websocket connection to production installation checking.
- Add cronjob to launch websocket connection nagios test.
This makes it possible for Zulip Nagios monitoring to check for
problems impacting the websockets sending code path, which is what all
web users use.
Previously, we were doing this request to the production server before
waiting for all the supervisord processes to start; it's possible this
could cause failures where we hit the server before the Django uwsgi
processes are up.
Hopefully fixes#2723.
Most of the changes to support this were merged some time ago; what
remains are these changes:
* Update requirements.txt
* Django 1.10: Upgrade success-http-headers.txt file.
- We no longer get the absolute urls, see
https://docs.djangoproject.com/en/1.10/releases/1.9/#http-redirects-no-longer-forced-to-absolute-uris
- The headers are capitalized, previously, they were in upper case.
* Bump PROVISON_VERSION to 3.0 since this is a disruptive change.
Fixes#3.
- Add script to compile documentation build and start crawler
to check documentation.
- Add documentation test script to backend travis test case.
- Add log level argument to test-documentation script.
Fixes#1492
Previously, we checked scripts in a separate run to work around mypy
not supporting multiple scripts with the same name. Since we have
fixed that issue, we can restore the original behavior.
We leave the --scripts-only option available, though I'm not sure it's
particularly useful and we'll probably eventually remove it.
We're now at the point where 100% of functions checked by mypy is
fully annotated; to avoid regressions, we're enforcing the requirement
that it stay this way. We still have a moderate amount of code that
is neither checked by mypy nor annotated, but it seems reasonable to
annotate that code at the same time as we get a chance to fix the mypy
issues in it.
This is implemented by using the --disallow-untyped-defs option in
mypy by default.
The previous model for these Nagios checks was kinda crazy -- every
minute, we'd run a full `rabbitmctl list_consumers` for each of the
dozen+ consumers that we have, and then do the exact same parsing
logic for each to determine whether the target queue has a running
consumer to write out a state file.
Because `rabbitmctl list_consumers` takes a small amount of resources,
on systems where CPU is very limited (e.g. t2 style AWS instances),
this minor CPU wastage could be problematic.
Now we just do that `rabbitmqctl list_consumers` once per minute, and
output all the state files from a single command.
Further TODO items on this front include removing the hardcoded list
of queues.
Because rabbitmq doesn't support changing the nodename of a running
rabbitmq node, Zulip installations suffered a plague of issues where
e.g. a Zulip server would reboot, the hostname would change, and
suddenly the local rabbitmq instance being used by Zulip would stop
working.
We address this problem by using, by default, a fixed rabbitmq
nodename, but providing server administrators the option to set the
rabbitmq nodename used by Zulip however they choose.
To upgrade an existing server to use this new configuration, one will
need to add something like the following to /etc/zulip/zulip.conf:
[rabbitmq]
nodename = zulip@localhost
However, I don't believe we have the puppet code in place to make this
work correctly at initial installation without rabbitmq-server being
already installed (but off), as we can easily setup in Travis CI but I
haven't been willing to do for the installer. So for now, this just
fixes our Travis CI problems.
Fixes: #1579.
Travis CI seems to have changed the way the snakeoil SSL certs are
generated in their infrastructure, so we need to update our expected
"success" HTTP headers accordingly.
It seems that we no longer get the message, 'zerver/lib/actions.py
modified; restarting server', but the server reloads successfully
nonetheless.
Fixes: #1341.
Run '/puppet/zulip/files/nagios_plugins/zulip_app_frontend/check_send_receive_time'
script as 'zulip' user so that the connection to the database can be
made correctly.
This is important for both ensuring the Nagios checks work correctly
in production, as well as making sure the `zulip` user can access the
virtualenv (owned by the `travis` user) in Travis CI.
Previously, we were wasting time every time we installed packages,
because `apt-get upgrade` would only install most of the packages
`apt-get dist-upgrade` would.
Camo is a caching image proxy, used in Zulip to avoid mixed-content
warnings by proxying HTTP image content over HTTPS. We've been using
it in zulip.com production for years; this change makes it available
in standalone Zulip deployments.
This should save several minutes off the Travis CI `production`
suite's runtime, since previously we were doing the full apt upgrade
process twice, resulting in things like multiple expensive rebuilds of
the initramfs.
Add 'six' to setup-py3k, because it is being used in tools/lister.py.
Add 'typing' to setup-py3k, so that tools/lister.py can be type
annotated in the future.
The `with sh.sudo` pattern that we were using in python-sh was
deprecated, and emperically hangs on Ubuntu xenial. Since in general
the use of python-sh/python-pbs caused trouble (requiring extra
dependencies, confusing syntax), this just removes it.
We replace it with a new zulip_tools.py library function that echoes
the command line and streams the output.
We do the same to install-phantomjs so we can remove that dependency.
tools/travis/py3k used to always exit with exit code 0.
It should exit with 1 when fixers detect a compatibility issue.
py3k used [ -z "$failed" ] to check if there was a failure.
This is wrong, since if no failure has occured, failed=0,
and -z checks if a string is of zero length. This commit also
fixes this bug.
In py3k, "git reset --hard" was called only if
libmodernize.fixes.fix_dict_six changed files and some of those
changes are not considered false positives by py3k.
But if all of those changes are not considered false positives
by py3k, then "git reset --hard" is not called and the repository
is no longer clean.
This commit fixes this bug.
tools/travis/py3k used to only check files whose names ended with .py.
Now it also checks python scripts which don't have an extension.
It uses tools/lister.py to get a list of all python files.
It's not clear whether this will end up being net negative in value in
the long term since it's kinda hard to understand the output, but in
the short term it should prevent regressions.
Travis CI's model of installing every version of postgres on the test
VM and then shutting all the versions other than the one requested
down seems to not work very well with doing apt upgrades. It seems
the best way to resolve this is to just uninstall the versions we
don't need.
We ran into a bug with the Travis CI infrastructure where it postgres
9.1 is installed on the system, and so when we'd do an apt upgrade
with a new version of 9.1, the 9.1 daemon would end up getting started
and conflict with the 9.3 daemon we were trying to run.
This test caught a few bugs where refactoring had made management
commands fail (and would have caught a few more recent ones).
Ideally we'd replace this with a more advanced test that actually
tests that the management command do something useful, but it's a
start.