Yes, it's slightly janky to create an
argparse.Namespace object like this, but it
saves us from shelling out to a script whose
only real value-add is parsing a single
`threshold_days` argument.
This saves about 130ms for a no-op provision.
Since in travis we don't have root access so we used to add different
srv path. As now we shifted our production suites to Circle CI
we don't need that code so removed it.
Also we used a hacky code in commit-lint-message for travis which is
now of no use.
Now that we've cleaned up this tool's output, there's no reason to use
an awkward mechanism to hide its output; we can just print it out like
a normal program.
Fixes#14644; resolves#14701.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Since now we want to use production suites on Circle CI so there
is no need to set TRAVIS in env while running scripts.
CIRCLECI is set default in the enviroment of Circle CI builds
so we can use it directly.
Also Travis CI had rabbitmq-server installed so we had to add workaround
in install script to avoid the error. That workaround is removed.
We now have two functions related to digests
for processes:
is_digest_obsolete
write_digest_file
In most cases we now **wait** to write the
digest file until after we've successfully
run a process with its new inputs.
In one place, for database migrations, we
continue to write the digest optimistically.
We'll want to fix this, but it requires a
little more code cleanup.
Here is the typical sequence of events:
NEVER RUN -
is_digest_obsolete returns True
quickly (we don't compute a hash)
write_digest_file does a write (duh)
AFTER NO CHANGES -
is_digest_obsolete returns False
after reading one file for old
hash and multiple files to compute
hash
most callers skip write_digest_file
(no files are changed)
AFTER SOME CHANGES -
is_digest_obsolete returns False
after doing full checks
most callers call write_digest_file
*after* running a process
I remove `is_force` from `file_or_package_hash_updated`
and modernize its mypy annotations.
If `is_force` is `True`, we just now run the thing
we want to force-run without having to call
`file_or_package_hash_updated` to expensively
and riskily return `True`.
Another nice outcome of this change is that if
`file_or_package_hash_updated` returns `True`,
you can know that the file or package has
indeed been updated.
For the case of `build_pygments_data` we also
skip an `os.path.exists` check when `is_force`
is `True`.
We will short-circuit more logic in the next
few commits, as well as cleaning up some of
the long/wrapper lines in the `if` statements.
We stopped using tsearch-extras in Zulip 2.1.0 after Anders figured
out how to achieve its goals with native postgres. However, we never
did a `DROP EXTENSION` on systems thta had upgraded, which meant that
backups created on systems originally installed with Zulip 2.0.x and
older, and later upgraded to Zulip 2.1.x, could not be restored on
Zulip servers created with a fresh install of Zulip 2.1.x.
We can't do this with a normal database migration, because DROP
EXTENSION has to be done as the postgres user, so we add some custom
migration code in the upgrade-zulip-stage-2 tool.
It's safe to run this whenever tsearch_extras.control is installed because:
* Zulip is AFAIK the only software that ever used tsearch_extras.
* The package was only installed via puppet on production servers configured to
run a local Zulip database.
* We'll only run this code once per system, because it removes the
package and thus the control files.
Fixes#13612.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Used get_venv_dependencies function to return the correct dependencies
for RHEL, Centos, Fedora rather than importing them as separate
COMMON_YUM_DEPENDENCIES in provision and create-production-venv.
In virtualenv ≥ 20, the site_packages variable was removed from
activate_this.py. To avoid a KeyError, replace
activate_locals['site_packages'] with os.path.join(venv, 'lib',
python_version), where python_version is the 'pythonX.Y' name of the
directory where site-packages resides in the virtualenv.
Fixes#14025.
Added a get_venv_dependencies() function in setup_venv.py which
returns VENV_DEPENDENCIES according to the vendor and os_version.
The reason for adding this function was because python-dev will be
depreciated in Focal but can be used as python2-dev so when adding
support for Focal VENV_DEPENDENCIES should to be os_version dependent.
isort 5 knows not to reorder imports across function calls, so this
will stop isort from breaking our code.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This adds Ubuntu 19.10 as a valid provisioning target.
The release test in setup-apt-repo was changed from a list of values to
a regex check for brevity.
The “Smileys & People” category has been split into “Smilys & Emotion”
and “People & Body”.
Also, fix generate_sha1sum_emoji to read the emoji-datasource-google
version from yarn.lock, since package.json only gives a version range.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
These docstrings hadn't been properly updated in years, and bad an
awkward mix of a bad version of the user-facing documentation and
details that are no longer true (e.g. references to "Voyager").
(One important detail is that we have real documentation for this
system now).
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This simplifies the RDS installation process to avoid awkwardly
requiring running the installer twice, and also is significantly more
robust in handling issues around rerunning the installer.
Finally, the answer for whether dictionaries are missing is available
to Django for future use in warnings/etc. around full-text search not
being great with this configuration, should they be required.
`copytree` throws an error if the target already exists, and we don’t
really want to rerun the copy anyway.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This is needed on at least Debian 10, otherwise xmlsec fails to
install: `Could not find xmlsec1 config. Are libxmlsec1-dev and
pkg-config installed?`
Also remove libxmlsec1-openssl, which libxmlsec1-dev already depends.
(No changes are needed on RHEL, where libxml2-devel and xmlsec1-devel
already declare a requirement on /usr/bin/pkg-config.)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
The output log from running clean_unused_caches was too verbose as
part of the `upgrade-zulip` overall output. While this output is
potentially helpful when running it directly for debugging, it's
certainly redundant for the main production use case.
So a new flag --no-print-headers is introduced. It suppresses the
header outputs for the subtools.
Fixes#13214.
This allows the system to get updates to the Groonga repository
signing key, so `apt update` doesn’t start failing when the key
changes (like it recently did).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
debian-archive-keyring is a dependency of the essential package apt,
so it is present in every Debian system.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
virtualenv on Ubuntu 16.04, when creating a new environment, downloads
the current version of setuptools, then replaces its pkg_resources
with an old copy from
/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl.
This causes problems, a simple example of which is reproducible from
the ubuntu:16.04 Docker base image as follows:
apt-get update
apt-get -y install python3-virtualenv
python3 -m virtualenv -p python3 /ve
/ve/bin/pip install sockjs-tornado
/ve/bin/pip download sockjs-tornado
→ `AttributeError: '_NamespacePath' object has no attribute 'sort'`
More relevantly, it breaks pip-compile in the same way. To fix this,
we need to force setuptools to be reinstalled, even if we’re asking
for the same version.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
To replace DISTRIB_FAMILY, there’s now an os_families function using
the standard ID and ID_LIKE information in /etc/os-release.
Fixes#13070; fixes#13071.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We no longer use tsearch_extras, and the camo patch is irrelevant on
systemd systems (Xenial and newer). So we no longer need to
provide/install a PPA at all.
Closes#13027.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Now that we're implemented tsearch_extras in pure postgres, we no
longer need a custom extension. This should help us considerably, as
it means we no longer need to ship custom apt packages at all.
Fixes#467.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
As predicted in https://www.kb.cert.org/vuls/id/319816/, a malicious
worm is beginning to spread across the npm ecosystem through package
postinstall scripts. Only instead of direct self-replicating code,
the replication vector is the temptation to monetize postinstall
scripts by polluting the console logs with paid advertisements. The
effect will be the same unless we all put a stop to this while we
still can.
Apply the recommended VU#319816 workaround, which is to disable
lifecycle scripts when installing npm packages. The only fallout is:
* node-sass can’t run because it uses compiled native code; we replace
it with Dart Sass.
* phantomjs-prebuilt doesn’t download the binary at install time; we
tell it to download it in run-casper.
* ttf2woff2 transparently falls back from native code to an Emscripten
build.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This commit finishes adding end-to-end support for the install script
on Debian Buster (making it production ready). Some support for this
was already added in prior commits such as
99414e2d96.
We plan to revert the postgres hunks of this once we've built
tsearch_extras for our packagecloud archive.
Fixes#9828.
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
As a result of dropping support for trusty, we can remove our old
pattern of putting `if False` before importing the typing module,
which was essential for Python 3.4 support, but not required and maybe
harmful on newer versions.
cron_file_helper
check_rabbitmq_consumers
hash_reqs
check_zephyr_mirror
check_personal_zephyr_mirrors
check_cron_file
zulip_tools
check_postgres_replication_lag
api_test_helpers
purge-old-deployments
setup_venv
node_cache
clean_venv_cache
clean_node_cache
clean_emoji_cache
pg_backup_and_purge
restore-backup
generate_secrets
zulip-ec2-configure-interfaces
diagnose
check_user_zephyr_mirror_liveness
The comment that tabbott edited into my commit while wimpifying this
function is wrong on multiple levels.
Firstly, the way in which users might be “running our scripts” was
never relevant. `__file__` is not the script that the user ran, it’s
zulip_tools.py itself. What matters is not how the user ran the
script, but rather how zulip_tools was imported. If zulip_tools was
imported as scripts.lib.zulip_tools, then `__file__` must end with
`scripts/lib/zulip_tools.py`, so running dirname three times on it is
fine. In fact, in Python ≥ 3.4 (we don’t support anything older),
`__file__` in an imported module is always an absolute path, so it
must end with `scripts/lib/zulip_tools.py` in any case.
(At present, there’s one script that imports lib.zulip_tools, and the
installer runs scripts/lib/zulip_tools.py as a script, but those uses
don’t hit this function.)
Secondly, even if we do care about `__file__` being a funny relative
path, there’s still no reason to have two calls to `realpath`.
`realpath(dirname(dirname(dirname(realpath(…)))))` is equivalent to
`dirname(dirname(dirname(realpath(…)))), as the inner `realpath` has
already canonicalized symlinks at every level.
This version also deals with `__file__` being a funny relative
path (assuming none of scripts, lib, and zulip_tools.py are themselves
symlinks), while making fewer `lstat` calls than either of the above
constructions.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This tool can be used to update the API field of local
zuliprc files for dummy users of development server
(iago, prospero, etc) with the correct API key from database.
This tool can be run after provisioning (or similar tools) which change
the API keys in the database.
As of commit cff40c557b (#9300), these
files are no longer served directly to the browser. Disentangle them
from the static asset pipeline so we can refactor it without worrying
about them.
This has the side effect of eliminating the accidental duplication of
translation data via hash-naming in our release tarballs.
This reverts commit b546391f0b (#1148).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Apparently, the `chown -R` would never run if the original clone
attempt had networking errors, leading to inability to use
upgrade-zulip-from-git without manual intervention.
Previously, it didn't properly update the stamp files that determine
our caching behavior, so if one ran test-backend afterwards, nothing
would happen.
A secondary issue that this commit does not fix is that provision will
end up rerunning the whole thing.
The ids that will be used for each particular run of the test suite are
written to a unique file. Each file will then be used as a time
reference of when the suite was ran.
This change sets up the ability for a complete clean up of potentially
leaked database templates.
Tweaked by tabbott to remove these files after successful database
cleanup.
We use `git describe --tags` to get information about the number of commit since
the last major version, and the sha of the current HEAD. This is added to the
ZULIP_VERSION when a deploy is done from `git`.
Modified heavily by punchagan to:
* to use git describe instead of `git log` and `wc`
* use a separate script to run the git describe command
* write the file with version info to var/ and remove it from the repo
Fixes#4685.
API changes:
* The behaviour of Date.toLocaleTimeString() reverts to pre 8.0.0,
this only affects automated tests. Lots of other API changes but
we didn't use any of those.
* The internal sorting algorithm changed which causes one of our own
compare function to miss coverage.
Simulate isn’t enough in some cases. The error message when this
fails looks sufficiently non-alarming.
LXC:
default: + apt-get -dy install lsb-release apt-transport-https gnupg
default: Reading package lists...
default: Building dependency tree...
default:
default: Reading state information...
default: lsb-release is already the newest version.
default: gnupg is already the newest version.
default: The following NEW packages will be installed:
default: apt-transport-https
default: 0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
default: Need to get 25.1 kB of archives.
default: After this operation, 238 kB of additional disk space will be used.
default: Err http://archive.ubuntu.com/ubuntu/ trusty-updates/main apt-transport-https amd64 1.0.1ubuntu2.3
default: 404 Not Found [IP: 91.189.88.161 80]
default: Err http://security.ubuntu.com/ubuntu/ trusty-security/main apt-transport-https amd64 1.0.1ubuntu2.3
default: 404 Not Found [IP: 91.189.88.161 80]
default: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/a/apt/apt-transport-https_1.0.1ubuntu2.3_amd64.deb 404 Not Found [IP: 91.189.88.161 80]
default:
default: E: Some files failed to download
default: + apt-get update
[…]
default: Fetched 4,504 kB in 7s (611 kB/s)
default: Reading package lists...
default: + apt-get -y install lsb-release apt-transport-https gnupg
default: Reading package lists...
Docker:
default: + apt-get -dy install lsb-release apt-transport-https gnupg
default: Reading package lists...
default: Building dependency tree...
default:
default: Reading state information...
default: Package gnupg is not available, but is referred to by another package.
default: This may mean that the package is missing, has been obsoleted, or
default: is only available from another source
default: E: Package 'gnupg' has no installation candidate
default: + apt-get update
[…]
default: Fetched 16.2 MB in 5s (3,326 kB/s)
default: Reading package lists...
default: + apt-get -y install lsb-release apt-transport-https gnupg
default: Reading package lists...
(All in green.)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
The comment explains this in more detail, but basically one previously
needed the `--from-git` option to `upgrade-zulip-stage-2` if one had
last installed/upgraded from Git, and not that option otherwise, which
would have forced us to make the OS upgrade documentation much more
complicated than it needed to be.
activate_this.py has always documented that it should be exec()ed with
locals = globals, and in virtualenv 16.0.0 it raises a NameError
otherwise.
As a simplified demonstration of the weird things that can go wrong
when locals ≠ globals:
>>> exec('a = 1; print([a])', {}, {})
[1]
>>> exec('a = 1; print([a for b in [1]])', {}, {})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in <module>
File "<string>", line 1, in <listcomp>
NameError: name 'a' is not defined
>>> exec('a = 1; print([a for b in [1]])', {})
[1]
Top-level assignments go into locals, but from inside a new scope like
a list comprehension, they’re read out of globals, which doesn’t work.
Fixes#12030.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
In addition to upgrading dependencies being generally useful, this may
fix situations where yarn fails but returns a success status code in the
presence of an HTTP proxy.
Now that we have the run_as_root helper function, we don't need to
install sudo to run Zulip in production
This reverts commit a7d7d181ea.
Fixes#10036.
Few folks will be upgrading from versions of Zulip old enough to not
have virtualenv-clone, and those who are won't be able to use it due
to older dependencies having been removed.
Apparently, while upgrade-zulip-from-git always ensures that zulip
deployment directories are owned by the Zulip user, unpack-zulip (aka
the tarball code path) has them owned by root.
The user ID detection logic in su_to_zulip's helper get_zulip_uid was
intended to support both development environments (where the user ID
might vary) and production environments. For development
environments, the existing code is fine, but given this unpack-zulip
permissions issue, we need to have code to fallback to 'zulip' if the
detection logic detects the "zulip" user has having UID 0.
Apparently, virtualenv-clone ends up copying the success-stamp file
that we use to track whether a virtualenv was successfully
provisioned, which results in problems if we get a network error in
the pip install stage afterwards.
The comment explains our fix, but basically we just delete
success-stamp after the clone.
Fixes#11301.
On usage errors (except --help), write usage message to stderr and
exit with nonzero status.
Forbid setting the hostname and email to the example values. Those
are specifically checked for and would fail later.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Instead, manually activate it in the one place where this
functionality was used (tools/lib/provision.py). This way we avoid
trying to activate the Python 2 thumbor virtualenv from Python 3.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Nowm unless you specify `--fill-cache`, memcached caches will not be
pre-filled after a server restart. This will be helpful when someone
is in a hurry (e.g. if the server is down right now, or if he/she
testing a configuration change in a newly setup server), it's best to
just restart without pre-filling the cache.
Fixes: #10900.
The site_packages variable points to (e.g.)
zulip-py3-venv/lib/python3.4/site-packages. If that doesn’t exist,
we’re probably running the wrong Python version.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We still create a Python 2 virtualenv for thumbor but that’s
separate (/srv/zulip-thumbor-venv from
scripts/lib/create-thumbor-venv).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Otherwise this causes an error
```
AttributeError: type object 'Callable' has no attribute '_abc_registry'
```
on 3.7. While the error is specific to 3.7, it is safer to uninstall
typing for all the versions that don't require a pip-provided typing
library.
/bin/sh and /usr/bin/env are the only two binaries that NixOS provides
at a fixed path (outside a buildFHSUserEnv sandbox).
This discussion was split from #11004.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This is a common bug that users might be tempated to introduce.
And also fix two instances of this bug that were present in our
codebase, including an important one in our upgrade code path.
This makes it possible to add --skip-purge-old-deployments in the
deploy_options section of /etc/zulip/zulip.conf, and control whether
old deployments are purged automatically on a system.
We still need to do https://github.com/zulip/zulip/issues/10534 and
probably also to add these arguments to be directly passed into
upgrade-zulip, but that can wait for future work.
Fixes#10946.
Since yarn has a package.json conveniently available, we can parse
that with jq, saving the expensive operation of starting up yarn.
This saves ~300ms in a no-op provision.
Apparently, we were incorrectly expressing the paths in the
caches_in_use data structures for these two cache-cleaning algorithms,
resulting in the default threshhold_days algorithm controlling which
caches could be garbage-collected. While the emoji one was just a
performance optimization for upgrade-zulip-from-git, it was possible
for the main `node_modules` cache in use in production to be GCed,
resulting in LaTeX rendering being broken.
This fixes an actual user-facing issue in our mobile push
notifications documentation (where we were incorrectly failing to
quote the argument to `./manage.py register_server` making it not
work), as well as preventing future similar issues from occurring
again via a linter rule.
Apparently, on Debian stretch, the gnupg package isn't installed by
default, which means that our `apt-key add` commands were failing with
these errors on an ultra-minimal Debian installation:
+ apt-key add ./scripts/setup/packagecloud.asc
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
+ apt-key add ./scripts/setup/pgroonga-debian.asc
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
Fixes#10480.
The original code was actually broken, in that it checked the wrong
path, but it didn't matter because it used `ln -nsf`.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Previously, we unconditionally tried to restart the Tornado process
name corresponding to the historically always-true case of a single
Tornado process. This resulted in Tornado not being automatically
restarted on a production deployment on servers with more than one
Tornado process configured.
This commit allows specifying Subject Alternative Names to issue certs
for multiple domains using certbot. The first name passed to certbot-auto
becomes the common name for the certificate; common name and the other
names are then added to the SAN field. All of these arguments are now
positional. Also read the following for the certbot syntax reference:
https://community.letsencrypt.org/t/how-to-specify-subject-name-on-san/Fixes#10674.
By far the dominant cause of errors when installing apt packages is
not having the Universe repository enabled in Ubuntu bionic (this
seems to have started happening a lot recently; I wonder if Ubuntu
changed the defaults for new server installs or something?).
In any case, providing that suggestion in the error output should help
reduce these a lot.
In scripts/lib/install line 71:
ZULIP_PATH="$(readlink -f $(dirname $0)/../..)"
^-- SC2046: Quote this to prevent word splitting.
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 105:
mem_kb=$(cat /proc/meminfo | head -n1 | awk '{print $2}')
^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
In scripts/lib/install line 141:
apt-get -y dist-upgrade $APT_OPTIONS
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 145:
$ADDITIONAL_PACKAGES
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 254:
if [ -n "ZULIP_ADMINISTRATOR" ]; then
^-- SC2157: Argument to -n is always true due to literal strings.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We use it to drop privileges from root to other users in the installer
process (which ideally, we would remove, but it will take some
annoying refactoring).
This should generally be safe to do, since the default sudo
permissions only allow root to use it anyway.
See https://github.com/zulip/zulip/issues/10036 for the follow-up
issue of removing the need to do this.
Because we renamed the "google" iconset to be the modern Google set,
not what is now called the "googleblob" icon set, we need to make sure
that our usually correct policy of not overwriting image files under
`prod-static/` doesn't apply to files potentially being copied in for
the emoji images.
We fix this by just deleting the `images-google-64` directory on
upgrade if it contains the googleblob version of the "hotdog" emoji.
Fixes#10038.
Previously, we were having issues installing on Debian Stretch with
non-English locales, because `locale-gen` actually doesn't take a
locale as an argument (and thus `locale-gen en_US.UTF-8` did nothing).
We should instead be calling localedef directly.
Thanks to Tom Daff for debugging this.
Fixes#10629.
For building Zulip in an environment where a custom CA certificate is
required to access the public Internet, one needs to be able to
specify that CA certificate for all network access done by the Zulip
installer/build process. This change allows configuring that via the
environment.
Thanks to changes in restart-server, this is now already happening there.
(The restart-server changes were required to ensure that if the
upgrade failes and one just does
/home/zulip/deployments/next/restart-server to recover, the right
thing happens; so this is the correct resolution to the conflict).
In scripts/lib/setup-apt-repo line 6:
zulip_source_hash=`sha1sum $SOURCES_FILE`
^-- SC2006: Use $(..) instead of legacy `..`.
In scripts/lib/setup-apt-repo line 10:
SCRIPTS_PATH="$(dirname $(dirname $0))"
^-- SC2046: Quote this to prevent word splitting.
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/setup-apt-repo line 36:
if [ "$zulip_source_hash" = "`sha1sum $SOURCES_FILE`" ] && ! [ -e "$STAMP_FILE" ]; then
^-- SC2006: Use $(..) instead of legacy `..`.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/install-node line 34:
source "$NVM_DIR/nvm.sh"
^-- SC1090: Can't follow non-constant source. Use a directive to specify location.
In scripts/lib/install-node line 36:
export NODE_BIN="$(nvm which default)"
^-- SC2155: Declare and assign separately to avoid masking return values.
In scripts/lib/install-node line 39:
n=$(which node)
^-- SC2230: which is non-standard. Use builtin 'command -v' instead.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/create-zulip-admin line 3:
if ([ "$ZULIP_USER_CREATION_ENABLED" == "True" ] || [ "$ZULIP_USER_CREATION_ENABLED" == "true" ]) && \
^-- SC2235: Use { ..; } instead of (..) to avoid subshell overhead.
In scripts/lib/create-zulip-admin line 4:
([ -z "$ZULIP_USER_DOMAIN" ] || \
^-- SC2235: Use { ..; } instead of (..) to avoid subshell overhead.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/certbot-maybe-renew line 8:
case "$(echo "$value" | tr A-Z a-z)" in
^-- SC2019: Use '[:upper:]' to support accents and foreign alphabets.
^-- SC2018: Use '[:lower:]' to support accents and foreign alphabets.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This flag is used to track which user/message pairs correspond to an
active mobile push notification, that should potentially be cleared
when the user reads the message.
This flag should never appear on a message that is also marked as
read; eventually we may want a cron job to check for that condition.
We include a partial index on UserMessage for this flag.
Apparently, our Python 3 conversion for the early-migrations logic
here was incorrect, and as a result we never set
need_create_large_indexes to True (because we were checking whether a
`bytes` was inside a list of `str`s).
The simplest fix would be to just add a `.decode()` in one place, but
this refactor to just decode at the beginning is a lot more readable.
The is_private flag is intended to be set if recipient type is
'private'(1) or 'huddle'(3), otherwise i.e if it is 'stream'(2), it
should be unset.
This commit adds a database index for the is_private flag (which we'll
need to use it). That index is used to reset the flag if it was
already set. The already set flags were due to a previous removal of
is_me_message flag for which the values were not cleared out.
For now, the is_private flag is always 0 since the really hard part of
this migration is clearing the unspecified previous state; future
commits will fully implement it actually doing something.
History: Migration rewritten significantly by tabbott to ensure it
runs in only 3 minutes on chat.zulip.org. A key detail in making that
work was to ensure that we use the new index for the queries to find
rows to update (which currently requires the `order_by` and `limit`
clauses).
This package is important in order to avoid scary-looking errors
whenever we upgrade the dependencies in thumbor.txt (where
virtualenv-clone isn't installed in the venv, and then gets installed
by the code we just added a TODO comment to.
Apparently, perl at least expects LANG, LANGUAGE, and LC_ALL to be
consistent, and thus apt spits out a bunch of warnings if these are
different. So if we're forcing LC_ALL in these installer/upgrade
script blocks, we should force the rest too.
I believe this fixes the remaining locale part of #9946.
--agree-tos is useful for the Docker environment, where we won't have
an interactive shell present for agreeing to the ToS.
--deploy-hook is also useful for the Docker environment; it makes it
possible to customize what deploy hook (if any) we pass into the
underlying cerbot command.
This migrates Zulip to use a dramatically better set of names and
aliases for our emoji set, defined in emoji_names.py (which is in turn
manually generated from our hand-curated CSV file).
This should significantly improve the experience of using Zulip's
emoji picker and emoji typeahead for finding what one is looking for.
We were already correctly including libssl-dev in Zulip's dependencies
in development environment provisioning, but (at least now) it's
needed to build certain Python packages like pycurl when building a
Zulip virtualenv in production. I haven't investigated why we didn't
need this on Ubuntu, but one possible reason would be that some other
library in our dependencies list happens to depend on it on Ubuntu.
We fix this by moving the dependency over to the shared
VENV_DEPENDENCIES list.
Fixes part of #9946.
Apparently, at least some Debian stretch systems don't have an
/etc/lsb-release, so the optimization that we did in
5d39a0f0fc broke our installer on
Debian.
We fix this, by falling back to calling the lsb_release command on
systems that don't have a faster way to do it.
Fixes part of #9946.
This commits adds the necessary puppet configuration and
installer/upgrade code for installing and managing the thumbor service
in production. This configuration is gated by the 'thumbor.pp'
manifest being enabled (which is not yet the default), and so this
commit should have no effect in a default Zulip production environment
(or in the long term, in any Zulip production server that isn't using
thumbor).
Credit for this effort is shared by @TigorC (who initiated the work on
this project), @joshland (who did a great deal of work on this and got
it working during PyCon 2017) and @adnrs96, who completed the work.
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This saves about 400ms when running clean-unused-caches, basically by
calling its sub-rountines by import (rather than
`subprocess.check_call()`). The performance optimization seems well worth it.
Fixes#9766.
This file looks like it's producing some kind of compilation of the
mobile strings, that the mobile app will somehow end up using --
especially as it refers to its output as a "resource file". In
reality, it compiles statistics to be included in the language-picker
UI in the web app. Give appropriate names to the identifiers so it's
less confusing.
This improves the performance of these operations, by saving a ~50ms
Python process startup. While not a major performance improvement, it
seems worth it, given how often these commands get run.
Fixes#9571.
On newer distros like Xenial, Stretch, etc., we were incorrectly not
installing the Python 3 version of the virtualenv package. This was
accidentally working because most base images with Python already have
this package too, but this was failing to install the right
dependencies in our Docker builds, requiring unnecessary manual code.
We fixed this some time ago for provision.py, but not for production.
The docker installer configuration incorrectly had has_appserver set
to 0; this meant that (A) the docker-zulip code needed to copy the
block of code in the installer for the `has_appserver` case into the
Dockerfile (unnecessarily), and (B) one couldn't use `install` from a
Git ref (because the static asset compiler didn't end up in the right
place).
It appears that docker-zulip tried to set this flag in their `install`
command line, but the construction inside `install` meant that didn't
work.
This fixes adding the Ubuntu repositories for Debian, as well as makes
sure that we install the debian-archive-keyring package on Debian,
which is only priority important (and thus might be missing).
It wasn't obvious reading this message that you can perfectly well
bring your own SSL/TLS certificate; unless you read quite a bit
between the lines where we say "could not find", or followed the link
to the detailed docs, the message sounded like you had to either use
--certbot or --self-signed-cert.
So, explicitly mention the BYO option. Because the "complete chain"
requirement is a bit tricky, don't try to give instructions for it
in this message; just refer the reader to the docs.
Also, drop the logic to identify which of the files is missing; it
certainly makes the code more complex, and I think even the error
message is actually clearer when it just gives the complete list of
required files -- it's much more likely that the reader doesn't know
what's required than that they do and have missed one, and even then
it's easy for them to look for themselves.
This fixes a bug where provision was failing since our most recent
upgrade to yarn/nvm/node.
It turns out my original fix was the correct fix, but to the wrong
third-party tool: nvm, not yarn, was the offender.
Apparently, new versions of yarn use the HOME environment variable to
figure out where to access their configuration, and sudo apparently
doesn't clear that variable, so install-node was being run with HOME
set to something under /home/vagrant (e.g.).
Fix this by just setting that environment variable correctly.
This replaces 250a036ff8, which
misdiagnosed the issue.
It appears that some change in yarn's versioning system means that
installing yarn itself ends up chowning its config directory
incorrectly to be owned by root, preventing `yarn install` from
working later.
node -> v8.9.4
yarn -> 1.5.1
nvm -> 0.33.8
Also updates a test in timerender.js which depends on time
provided by node which is now changed in newer release.
Some changes have been made in circeci script, we just create ~/.config
directory and chown it to circleci user so installing new version of yarn
does not cause any ci failure on circleci during provision.
This is the analog of 7b2c9223e7 for the
emoji cache; the only difference is that the existing code was working
correctly. It's still worth changing for improved robustness.
We saw issues with /srv/zulip_npm_cache being cleaned incorrectly by
this tool in production (more correctly, we noticed broken symlinks to
those directories, even from the current deployment). Print-debugging
showed that indeed older deployments were being ignored, because the
logic for `get_caches_in_use` was totally broken (this was sorta
masked because we also keep the last week's deployments).
The specific bug here turned out to be that we weren't passing the
`production` argument to generate_sha1sum_node_modules, but the
broader problem is that this logic isn't robust to changes in the
hashing algorithm.
Fix this by replacing the broken logic for trying to compute the
correct hash for that deployment with just checking the symlink inside
the deployment to let it self-report.
We can't easily do this same change for clean-venv-cache, because we
use multiple virtualenvs there. But a similar change could be useful
for the emoji cache as well.
Fixes#8116.
This is apparently installed by the perl package; I hadn't even known
it existed. We of course want to use the sha1sum command from
coreutils.
Fixes#8836.
This commit switches our emoji infrastructure to use 256 color indexed
64px spritesheets. Earlier we were using non-indexed 32px spritesheets
which were blurry on high dpi displays. These indexed spritesheets not
only provide a crispier display but are also smaller in size.
This commit also removes the `emoji-datasource` package as a dependency
as all the data is now sourced from individual datasource packages.
Fixes: #7862.
The installation isn't really complete here, and wasn't even when this
was the only success case; the instructions we're giving are for the
next step in the installation.
These instructions don't say what to do in an actual use case for this
option, but decent instructions there will require having a concrete
use case in front of us and designing the flow for it. At this stage,
just say where we are in the normal flow, and an admin who's chosen to
go off that flow can figure out how they want to vary it from there.
This flips the experimental `--express` option to be the default.
We retain the old behavior, where the script exits before
`initialize-database`, as an option `--no-init-db`; it might be useful
in e.g. a migration scenario (from a Zulip install elsewhere, or
another chat system) where the admin wants to set up the database
separately.
The install instructions are adjusted to match, getting shorter by two
steps and a bunch of words. I think this opens up opportunities to
refactor the text to simplify things further, too, but leaving that
for another commit.
Also tweak the "production" test suite to match.
Kind of unfortunate because the `sudo` interface for running a command
is objectively better -- a list of arguments, rather than a string to
be re-parsed by the shell. But some bare-bones machine images lack
`sudo`, so this makes things a bit more portable.
We do the following here:
* Remove libjasper-dev from THUMBOR_VENV_DEPENDENCIES.
Reason: This dependancy wasn't really needed by us for using
thumbor. It was a dependancy for using open-cv as Imaging Engine
in thumbor but we use PIL (Pillow now) as Imaging Engine.
* Add zlib1g-dev, libfreetype6-dev to THUMBOR_VENV_DEPENDENCIES.
Reason: These are dependancies of Pillow which are required for it
Pillow to function. Since we use Pillow in thumbor as Imaging Engine
we need these. Stuff before this didn't break because we also use
Pillow in development Environment and have these dependancies
installed from VENV_DEPENDENCIES as well.
We'll make this the normal behavior soon, once we're satisfied with
our arrangements for sending the admin straight to realm creation and
using the app without configuring email. The instructions in the docs
will also have to change accordingly, of course.
This causes us to give an error if you pass the installer any
positional arguments, e.g. with `--`. There's no reason you'd want
to do this, but I accidentally did it by passing an extra `--` to
the `test-install/install` wrapper and spent a few minutes on
confused debugging.
This gives us just one way of adopting a self-signed cert, rather than
one script which would generate a new one and an option to another
which would symlink to the system's snakeoil cert. Now those two
codepaths converge, and do the same thing.
The small advantage of generating our own over the alternative is that
it lets us set the name in the cert to EXTERNAL_HOST, rather than the
system's hostname as embedded in the system snakeoil certs. Not a big
deal, but might make things go slightly smoother if some browsers are
lenient (in a way that they probably shouldn't be.)
Before this fix, the installer has an extremely annoying bug where
when run inside a container with `lxc-attach`, when the installer
finishes, the `lxc-attach` just hangs and doesn't respond even to
C-c or C-z. The only way to get the terminal back is to root around
from some other terminal to find the PID and kill it; then run
something like `stty sane` to fix the messed-up terminal settings
left behind.
After bisecting pieces of the install script to locate which step
was causing the issue, it comes down to the `service camo restart`.
The comment here indicates that we knew about an annoying bug here
years ago, and just swept it under the rug by skipping this step
when in Travis. >_<
The issue can be reproduced by running simply `service camo restart`
under `lxc-attach` instead of the installer; or `service camo start`,
following a `service camo stop`. If `lxc-attach` is used to get an
interactive shell, these commands appear to work fine; but then when
that shell exits, the same hang appears. So, when we start camo
we're evidently leaving some kind of mess that entangles the daemon
with our shell.
Looking at the camo initscript where it starts the daemon, there's
not much code, and one flag jumps out as suspicious:
start-stop-daemon --start --quiet --pidfile $PIDFILE -bm \
--exec $DAEMON --no-close -c nobody --test > /dev/null 2>&1 \
|| return 1
start-stop-daemon --start --quiet --pidfile $PIDFILE -bm \
--no-close -c nobody --exec $DAEMON -- \
$DAEMON_ARGS >> /var/log/camo/camo.log 2>&1 \
|| return 2
What does `--no-close` do?
-C, --no-close
Do not close any file descriptor when forcing the daemon
into the background (since version 1.16.5). Used for
debugging purposes to see the process output, or to
redirect file descriptors to log the process output.
And in fact, looking in /proc/PID/fd while a hang is happening finds
that fd 0 on the camo daemon process, aka stdin, is connected to our
terminal.
So, stop that by denying the initscript our stdin in the first place.
This fixes the problem.
The Debian maintainer turns out to be "Zulip Debian Packaging Team",
at debian@zulip.com; so this package and its bugs are basically ours.
This provides a major simplification for non-production installs,
including our own testing (it's already in both the test-install
harness script and the "production" test suite) as well as potential
admins evaluating Zulip.
Ultimately this should probably be the default behavior, with perhaps
something shown to admins on the web as a reminder and link to help on
installing a better certificate. For now, pending working through
that, just get the behavior in and leave it opt-in.
The third-party `install-yarn.sh` script uses `curl`, and we invoke it
in `install-node`. So we need to install it as a dependency.
We've mostly gotten away with this because it's common for `curl` to
already be installed; but it isn't always.
Apparently, this was checking the wrong path in Travis CI, and thus
never actually running (meaning we'd accumulate every `node_modules`
directory ever in the Travis caches, which in turn resulted in very
slow builds).
This updates commit 11ab545f3 "install: Set the locale ..."
to be somewhat cleaner, and to explain more in the commit message.
In some environments, either pip itself fails or some packages fail to
install, and setting the locale to en_US.UTF-8 resolves the issue.
We heard reports of this kind of behavior with at least two different
sets of symptoms, with 1.7.0 or its release candidates:
https://chat.zulip.org/#narrow/stream/general/subject/Trusty.201.2E7.20Upgrade/near/302214https://chat.zulip.org/#narrow/stream/production.20help/subject/1.2E6.20to.201.2E7/near/306250
In all reported cases, commit 11ab545f3 or equivalent fixed the issue.
Setting LC_CTYPE is redundant when also setting LC_ALL, because LC_ALL
overrides all `LC_*` environment variables; so skip that. Also move
the line in `install` to a more appropriate spot, and adjust the
comments.
This commit renames various source requirements files like `dev.txt`,
`mypy.txt` etc to `dev.in`, `mypy.in` etc and various locked requirements
files like `dev_lock.txt`, `mypy_lock.txt` etc to `dev.txt`, `mypy.txt`
etc. This will help in emphasizing to the user that *.in are actually
input to `update-locked-requirements` tool which should be run after
updating any of these.
In this commit we add new dependencies needed for running thumbor.
Also we add the script for creating the virtual environment ready
for thumbor.
Note: Thumbor will use python2 and thus have different virtualenv
dedicated to it.
Credits to @TigorC and @joshland as well for there work on this.
This allows the installer to continue using this script for the
`standalone` method, while the no-argument form now uses the same
`webroot` method as the renewal cron job, suitable for running
by hand to adopt Certbot after initial install.
Certbot replaces the cert files under /etc/letsencrypt/live/,
which our nginx config refers to symlinks to; but it doesn't
tell nginx there's been an update, so nginx keeps serving the
old cert.
This is fine as long as nginx is restarted, or just told to
reload its config, at some point before the cert actually
expires about 30 days later. Which is probably the common
case, but of course we should make it just work. So, if we
actually renew a cert, tell nginx to reload its config now.
This causes the cron job to run only when a Zulip-managed certbot
install is actually set up.
Inside `install`, zulip.conf doesn't yet exist when we run
setup-certbot, so we write the setting later. But we also give
setup-certbot the ability to write the setting itself, so that we
can recommend it in instructions for adopting certbot in an
existing Zulip installation.
Except in:
- docs/writing-bots-guide.md, because bots are supposed to be Python 2
compatible
- puppet/zulip_ops/files/zulip-ec2-configure-interfaces, because this
script is still on python2.7
- tools/lint
- tools/linter_lib
- tools/lister.py
For the latter two, because they might be yanked away to a separate repo
for general use with other FLOSS projects.
This didn't work at all when one did a `vagrant destroy` and then
`vagrant up`, because the cache state would be preserved even though
the machine is gone.
Fixes#5981.
This should make it easier to script the installation process, and
also conveniently are the options one would want for the --certbot
option.
Significantly modified by tabbott to have a sane right interface,
include --help, and avoid printing all the `set -x` garbage before the
usage notices.
Based on #450, with commits
restructured by Rein Zustand.
Tweaks by Rein Zustand:
- Replace configure-cert with generate-self-signed-certs
- `mv scripts/lib/create-zulip-admin.sh scripts/lib/create-zulip-admin`
We were checking for whether an item in the deployments directory
represents a directory but were using its relative path which was
causing a false value to be returned for all items irrespective of
their being a directory or not if the script was invoked from some
where other than the deployments directory.
This commit re-arranges the arguments of `purge_unused_caches()`
function in order to remain consistent with other similar functions
in the library like `may_be_perform_caching()`.
This function will replace the repetitive definition of `parse_args()`
in various cache cleaning scripts. Also adds a `--verbose` argument
to the parser.
Historically, one has needed to build a release tarball in order to
use/test the Zulip installer, but you could upgrade a Zulip server
from Git. However, the only reason for that requirement was that we
didn't run `tools/update-prod-static` as part of the install script if
it's required. A good test for that case is whether we're in a Git
repository, but a better one is to check whether the prod-static
content exists in the tarball paths.
Fixes#3704.
This enforces our use of a consistent style in how we access Python
modules; "from os.path import dirname" is a particularly popular
abbreviation inconsistent with our style, and so it deserves a lint
rule.
Commit message and error text tweaked by tabbott.
Fixes#6543.
Based on the `dry_run` flag, this function either purges the list
of directories passed to them or prints a listing of the directories
it would have purged/kept_back, had the `dry_run` flag been false.
Apparently, the refactoring to make this script only run when changes
are present was buggy, in that if `apt-get update` failed, running
provision against wouldn't rerun `apt-get update`, resulting in a
broken state that requires expertise to fix. This closes that gap, by
using a stamp file to ensure we always successfully update apt before
proceeding.
It doesn't fix existing installations.
Modify `generate_sha1sum_node_modules()` such that it can calculate
the hash for a particular installation.
Tweaked by tabbott to use os.path.realpath in the setup_dir
calculation, to ensure it's consistent.
This should make it much more likely that users see this before
waiting a long time for other things to happen, since the `apt-get
dist-upgrade` step is really slow. We can't move further to the top,
since this requires `lsb_release` to be installed.
Given the path of directory containing all the caches, a list of
caches in use and threshold days, this function returns a list
of caches which can be removed safely.
This function returns a list of all the deployments directories
which are newer than some threshold number of days including the
`/root/zulip` directory if it exists.
This saves us from spending 200-250ms of CPU time importing Django
again just to log that we're running a management command. On
`scripts/restart-server`, this saves us from one thundering herd of
Django startups when all the queue workers are restarted; but there's
still the Django startup for the `manage.py` process itself for each
worker, so on a machine with e.g. 2 (virtual) cores the restart is
still painful.
This causes `upgrade-zulip-from-git`, as well as a no-option run of
`tools/build-release-tarball`, to produce a Zulip install running
Python 3, rather than Python 2. In particular this means that the
virtualenv we create, in which all application code runs, is Python 3.
One shebang line, on `zulip-ec2-configure-interfaces`, explicitly
keeps Python 2, and at least one external ops script, `wal-e`, also
still runs on Python 2. See discussion on the respective previous
commits that made those explicit. There may also be some other
third-party scripts we use, outside of this source tree and running
outside our virtualenv, that still run on Python 2.
We now call the create_large_migrations management command as part of
upgrade-zulip-stage-2 if needed, so that we can create large indexes
while the app is still up.
We can't fully support it until we fix the tsearch_extras availability
issue, but for now, this is an improvement.
Tweaked by tabbott to cover the outstanding tsearch_extras issue.
Also make our dependency on `six` (for e.g. `replace-tarball-shebang`)
explicit -- we've been getting it via `python-pip`, but `python3-pip`
(on trusty) doesn't have that dependency for some reason.
Since we can use both perfer_offline=True and False in a since build
prefer_offline shouldn't be used as a cache key or it will confuse the
cleanup script. Since yarn install (if successful) should be idempotent.
This will probably be ok.
If we do wind up with a symlink lying around at `local_settings.py`,
it won't do us any harm and shouldn't be materially more confusing
than the regular file we've long had there for almost all installs.
It'll also only last as long as the current deploy. So just
let it be, and simplify the code a bit.
Also add a line to help the reader understand the remaining half of
this logic (which is essential so long as people might have pre-1.4.0
deploys lying around that they eventually get around to trying to
upgrade). The fact that it's addressed to a situation which exists
only in the past of this tree, not in its present, makes a brief
comment potentially very helpful.
This replaces nvm in npm-wrapper by harcoding the path the way we do
with node. The main benefit is that this saves a few hundred
milliseconds every time we invoke npm.
For performance reasons, we spawn each linter in a separate OS thread.
The downside of this is that all lints would end up in stdout without
much visual separation, resulting in confusing error log. This commit
introduce the `print_err` function, which shows which linter each line
of lint is from.
Basically we just seperate out the sha1sum generation for the
node modules so that it can be reused later for cache clearance
logic. This is achieved by adding a function which returns the
sha1sum based HEX digest.
The Zulip email mirror script called by postfix had performance/load
issues, because it spent so much time on startup/import due to use of
the Zulip virtualenv.
The script was rewritten using pure python (no Django) to improve
performance.
The install script was failing on 2nd+ attempts if the first attempt
was interrupted.
This failure happened because zulip-venv already existed at
`current_venv_path`. Changing the `ln` command's flags from `-s` to
`-nsf` should make this part of the script idempotent.
This fixes a significant performance issue with LaTeX rendering (and
other things that invoked node) where starting up node took a few
hundred milliseconds due to nvm initialization.
Tweaked by tabbott to avoid copying the node binary itself, instead
using a tiny wrapper script.
This is important primarily because it's possible a future version of
node will expect to find libraries/dependencies/etc. installed via NVM
at some path related to the path of the node binary itself, and that's
more guaranteed with this new model.
Fixes#4618.
This fixes a performance problem where we were previously starting up
a full Django process (~0.7s even on a fast machine) every time a new
email came in, potentially allowing users to accidentally DoS a Zulip
server. Now, we just post over HTTPS, allowing the existing thread
pool support to do its job.
- Add script wrapper to communicate postfix pipe with django web server
over HTTP(S). It uses shared_secret authentication mode.
- Add django view to process messages from email mirror server.
- Clean management command `email-mirror`. Left just functional
for cron email processing.
- Add routes for new tornado view.
- Change pipe script in master process postfix config template
based on updated script.
- Add tests.
Tweaked by tabbott to adjust the directory and set better defaults.
Fixes#2421.
Follow-on from #2373/ PR https://github.com/zulip/zulip/pull/4316, to set an
appropriate umask also when upgrading so files have appropriate permissions.
I've tested this by starting from a clean install, deleting /srv/* so new
files are downloaded, and then doing an upgrade. It worked starting with both
a current version from master and an older release installed with a less
restrictive umask and then the umask changed.
Fixes#2373.
* Now queue_workers.py sorts queue names and prints them on their own
line. Previously it's output was nondeterministic.
* Simplified grep strategy for removing the "test" worker.
This list was likely to end up out of date quickly, since it wasn't
documented that you need to update it when adding a queue. The best
solution is to just not require it to be updated.
Now that we no longer use node_modules at all in production (it's only
used to generate static assets), we don't include `node_modules` in
the production tarballs, and thus we shouldn't attempt to copy
`node_modules` out of the production tarballs when installing.
Fixes a regression introduced in
d71f2e7b9b.
This saves about a minute of downtime when using
upgrade-zulip-from-git in the default configuration.
It should also save several seconds of downtime when upgrading to a
production release tarball as well.
This indirectly causes the RabbitMQ node name for new Zulip
installations to default to zulip@localhost, which would eliminate the
persistent problems we have had
Fixes#194, #465, #1375, #1751.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This adds a dependency on the realpath package on trusty; we could try
to remove it if needed, but given that realpath is included in
coreutils on Xenial (and presumably anything else modern), I think
it's reasonable to add it.
Fixes#1797.
Previously, success_stamp was touched whenever we used a particular
node_modules version; it makes more sense to only touch it when the
node_modules directory has actually changed.
get_package_names did not correctly strip the GitHub URLs from package
names, resulting in the "package names" for our dependencies installed
from Git being tracked with the complete sha1sum included in the name.
This meant that upgrading our virtualenvs incorrectly ended up
resorting to creating an entirely new virtualenv whenever we changed a
dependency that had previously been installed from GitHub URLs.
generate-secrets.py now requires --development for development environment
setup or --production for production environment setup (and one of these
options is mandatory).
This solves the problem that it was somewhat easy to accidentally run
generate-secrets.py without the `-d` option while doing manual development
environment setup.
Fixes: #1911.
This is a first pass at building a framework for collecting various
stats about realms, users, streams, etc. Includes:
* New analytics tables for storing counts data
* Raw SQL queries for pulling data from zerver/models.py tables
* Aggregation functions for aggregating hourly stats into daily stats, and
aggregating user/stream level stats into realm level stats
* A management command for pulling the data
Note that counts.py was added to the linter exclude list due to errors
around %%s.
This adds a new system for copying packages from old virtualenvs that
are sufficiently similar to the new virtualenv required.
In practice, this results in a huge performance improvement for
re-provisioning Zulip development environments when the requirements
files have changed (which is the dominant performance problem with
provision today).
Fixes: #1507.
Between releases 1.3.13 and 1.4.0, local_settings.py was renamed to
prod_settings.py. The upgrade scripts were adjusted to reflect this name
change. But because the first part of the upgrade script is run with the
currently installed version's code, the symlink to /etc/zulip/settings.py is
created with the old name. This was causing upgrade-zulip-stage-2 to fail.
Now upgrade-zulip-stage-2 creates the symlink at zproject/prod_settings.py
if it doesn't already exist.
Fixes#1731.
Because rabbitmq doesn't support changing the nodename of a running
rabbitmq node, Zulip installations suffered a plague of issues where
e.g. a Zulip server would reboot, the hostname would change, and
suddenly the local rabbitmq instance being used by Zulip would stop
working.
We address this problem by using, by default, a fixed rabbitmq
nodename, but providing server administrators the option to set the
rabbitmq nodename used by Zulip however they choose.
To upgrade an existing server to use this new configuration, one will
need to add something like the following to /etc/zulip/zulip.conf:
[rabbitmq]
nodename = zulip@localhost
However, I don't believe we have the puppet code in place to make this
work correctly at initial installation without rabbitmq-server being
already installed (but off), as we can easily setup in Travis CI but I
haven't been willing to do for the installer. So for now, this just
fixes our Travis CI problems.
Fixes: #1579.
This reverts commit 3f95e567c1.
Apparently `apt-add-repository` fails periodically in CI. I suspect
this is some sort of silly networking problem, but given that all
we're saving is a few lines of code, the old version was better if
this fails basically ever.