The hash keys were missing hash for package.json and yarn.lock
because they were not present since we don't do a full checkout
in this job. We fix this by sending over those files and generating
hashes from them.
I usally verify these cache keys by clicking the Restore <cache>
step dropdown menu and then clicking the Run ... dropdown menu again
to see the generated hash.
There were a lots of flakes in CI recently because typeahead didn't
appear when Enter was pressed and real emails are not accepted as
valid inputs. To fix this we wait for typeahead to appear and then
click that instead of Enter. We also use delay option to type the
email (100ms delay between keypresses) since without we'd also get
flakes.
Re-enable puppeteer test in CI after this fix too.
Previously, we copied them to /tmp and from there we specified those
assets we copied in circleci config in presist_to_workspace step.
Copying it to a directory allows us to get rid of list in circleci
config and GitHub Actions's upload artifact (their version of
presist to workspace) doesn't allow us to specify indivivual files
so only is this cleaner but required.
success-http-headers-bionic.txt and success-http-headers-focal.txt
differ only in the nginx version so this substitution will allow
us to have single file for both of them. Also this change helps
to avoid CI failure if Nginx version is updated in the OS.
The automated tests running in CircleCI don't actually use the `zulip`
db, so we can skip running migrations on it in some CircleCI shards to
save time.
NOTE: This only effects build jobs that run provision, except the
`production-build` job where we skip building the dbs altogether.
Migrations still run on `focal-backend` build job to ensure
we are testing all our development setup code.
Run production suites on Ubuntu Focal.
Added separate success-http-headers files for Focal and Bionic.
Also excluded them from whitespace rules in lint.
memcached 1.5.22 in Ubuntu 20.04 has a bug where it looks for its SASL
configuration at /etc/sasl2/memcached.conf/memcached.conf instead of
/etc/sasl2/memcached.conf.
We already use a workaround for this while applying puppet configurations in
99e71f3786 but for docker builds we used
do memcached hack since we can not use systemd in docker containers.
Since we use this option in our docker-zulip project also
so rather than using it as a test suite option we made it
more specific i.e. --build-release-tarball-only.
Fixes#14962
* codecov needs `.coverage` file in the pwd to upload coverage
results.
* concurrency='multiprocessing' forces `.coverage` file to have a
data_suffix, we explicity set it to "" for the file to have no suffix.
Filed issue as https://github.com/nedbat/coveragepy/issues/989
This allows to run scripts between extraction and install
process.
It will be used to restore npm caches for production install jobs.
We extract the tarball in the working directory so that yarn.lock and
package.json are available to restore cache.
(And also so the path is deterministic).
Runs a test at the end of tools/ci/backend to check if any untracked
files have been created during this ci tests. Exits with error code 1
if untracked files are found, otherwise exits successfully.
Fixes#14691.
Restart postgres service if provision is called in production test suite.
This is required because terminate-psql-sessions script (used
in tools/ci/setup-production) throws error if postgres service is not running.
Restart rabbitmq service if provision is called in production test suite.
This is done to start the node as Circle CI don't start services on installation.
Removed memcached restart as flush-memcached script (which is furthur
used in tools/ci/production) throws UNKNOWN READ FAILURE if memcached is restarted
in development.
Since now we want to use production suites on Circle CI so there
is no need to set TRAVIS in env while running scripts.
CIRCLECI is set default in the enviroment of Circle CI builds
so we can use it directly.
Also Travis CI had rabbitmq-server installed so we had to add workaround
in install script to avoid the error. That workaround is removed.
Used postgres 10 inplace of postgres 9.5 as it is used in Bionic.
Upgraded nginx version in success-http-headers which is in CircleCI
Bionic enviroment.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>