Few folks will be upgrading from versions of Zulip old enough to not
have virtualenv-clone, and those who are won't be able to use it due
to older dependencies having been removed.
Apparently, while upgrade-zulip-from-git always ensures that zulip
deployment directories are owned by the Zulip user, unpack-zulip (aka
the tarball code path) has them owned by root.
The user ID detection logic in su_to_zulip's helper get_zulip_uid was
intended to support both development environments (where the user ID
might vary) and production environments. For development
environments, the existing code is fine, but given this unpack-zulip
permissions issue, we need to have code to fallback to 'zulip' if the
detection logic detects the "zulip" user has having UID 0.
There’s no reason to do this unless you’re, like, trying to trip the
Let’s Encrypt rate limits (or perhaps trying to manually test this code).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Apparently, virtualenv-clone ends up copying the success-stamp file
that we use to track whether a virtualenv was successfully
provisioned, which results in problems if we get a network error in
the pip install stage afterwards.
The comment explains our fix, but basically we just delete
success-stamp after the clone.
Fixes#11301.
On usage errors (except --help), write usage message to stderr and
exit with nonzero status.
Forbid setting the hostname and email to the example values. Those
are specifically checked for and would fail later.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Instead, manually activate it in the one place where this
functionality was used (tools/lib/provision.py). This way we avoid
trying to activate the Python 2 thumbor virtualenv from Python 3.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Nowm unless you specify `--fill-cache`, memcached caches will not be
pre-filled after a server restart. This will be helpful when someone
is in a hurry (e.g. if the server is down right now, or if he/she
testing a configuration change in a newly setup server), it's best to
just restart without pre-filling the cache.
Fixes: #10900.
The site_packages variable points to (e.g.)
zulip-py3-venv/lib/python3.4/site-packages. If that doesn’t exist,
we’re probably running the wrong Python version.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We still create a Python 2 virtualenv for thumbor but that’s
separate (/srv/zulip-thumbor-venv from
scripts/lib/create-thumbor-venv).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Otherwise this causes an error
```
AttributeError: type object 'Callable' has no attribute '_abc_registry'
```
on 3.7. While the error is specific to 3.7, it is safer to uninstall
typing for all the versions that don't require a pip-provided typing
library.
/bin/sh and /usr/bin/env are the only two binaries that NixOS provides
at a fixed path (outside a buildFHSUserEnv sandbox).
This discussion was split from #11004.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This is a common bug that users might be tempated to introduce.
And also fix two instances of this bug that were present in our
codebase, including an important one in our upgrade code path.
This makes it possible to add --skip-purge-old-deployments in the
deploy_options section of /etc/zulip/zulip.conf, and control whether
old deployments are purged automatically on a system.
We still need to do https://github.com/zulip/zulip/issues/10534 and
probably also to add these arguments to be directly passed into
upgrade-zulip, but that can wait for future work.
Fixes#10946.
This commit works by vendoring the couple functions we still use from
puppetlabs stdlib (join and range), but removing the rest of the
puppetlabs codebase, and of course cleaning up our linter rules in the
process.
Fixes#7423.
Since yarn has a package.json conveniently available, we can parse
that with jq, saving the expensive operation of starting up yarn.
This saves ~300ms in a no-op provision.
This makes it possible for the Puppet codebase to access the path to
the relevant /home/zulip/deployments type directory that puppet was
run from, which in turn makes it possible to safely call scripts from
here.
Based on work by Rein Zustand.
Apparently, we were incorrectly expressing the paths in the
caches_in_use data structures for these two cache-cleaning algorithms,
resulting in the default threshhold_days algorithm controlling which
caches could be garbage-collected. While the emoji one was just a
performance optimization for upgrade-zulip-from-git, it was possible
for the main `node_modules` cache in use in production to be GCed,
resulting in LaTeX rendering being broken.
This fixes an actual user-facing issue in our mobile push
notifications documentation (where we were incorrectly failing to
quote the argument to `./manage.py register_server` making it not
work), as well as preventing future similar issues from occurring
again via a linter rule.
Apparently, on Debian stretch, the gnupg package isn't installed by
default, which means that our `apt-key add` commands were failing with
these errors on an ultra-minimal Debian installation:
+ apt-key add ./scripts/setup/packagecloud.asc
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
+ apt-key add ./scripts/setup/pgroonga-debian.asc
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
Fixes#10480.
The original code was actually broken, in that it checked the wrong
path, but it didn't matter because it used `ln -nsf`.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Previously, we unconditionally tried to restart the Tornado process
name corresponding to the historically always-true case of a single
Tornado process. This resulted in Tornado not being automatically
restarted on a production deployment on servers with more than one
Tornado process configured.
This library was absolutely essential as part of our Python 2->3
migration process, but all of its calls should be either no-ops or
encode/decode operations.
Note also that the library has been wrong since the incorrect
refactoring in 1f9244e060.
Fixes#10807.
This commit allows specifying Subject Alternative Names to issue certs
for multiple domains using certbot. The first name passed to certbot-auto
becomes the common name for the certificate; common name and the other
names are then added to the SAN field. All of these arguments are now
positional. Also read the following for the certbot syntax reference:
https://community.letsencrypt.org/t/how-to-specify-subject-name-on-san/Fixes#10674.
By far the dominant cause of errors when installing apt packages is
not having the Universe repository enabled in Ubuntu bionic (this
seems to have started happening a lot recently; I wonder if Ubuntu
changed the defaults for new server installs or something?).
In any case, providing that suggestion in the error output should help
reduce these a lot.
This allows our Tornado monitoring to correctly report whether
multiple configured Tornado processes are running.
This setup isn't ideal, in that it can't detect cases where the wrong
set of Tornado processes are running, but it's nice and simple and
should catch most actual problems.
Fixes#10706.
Issue: Before this commit, the `refname` positional argument to
`upgrade-zulip-from-git` script would run successfully for a branch
name on the given remote, but the script would fail if it was
provided with a tag or commit ID.
Solution: 'git clone -q -b refname LOCAL_GIT_CACHE_DIR deploy_path`
would be split into two commands:
1.) `git clone -q LOCAL_GIT_CACHE_DIR deploy_path`
2.) `git checkout -b deploy_timestamp refname` which makes a new
branch with the same name as the timestamp used in make_deploy_path.
Adds an optional argument `--remote-url` to specify the remote URL.
Command line remote URL will be given preference above the one
in /etc/zulip/zulip.conf.
Fixes#6092.
In scripts/lib/install line 71:
ZULIP_PATH="$(readlink -f $(dirname $0)/../..)"
^-- SC2046: Quote this to prevent word splitting.
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 105:
mem_kb=$(cat /proc/meminfo | head -n1 | awk '{print $2}')
^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
In scripts/lib/install line 141:
apt-get -y dist-upgrade $APT_OPTIONS
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 145:
$ADDITIONAL_PACKAGES
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 254:
if [ -n "ZULIP_ADMINISTRATOR" ]; then
^-- SC2157: Argument to -n is always true due to literal strings.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/terminate-psql-sessions line 16:
major=$(echo "$version" | cut -d. -f1,2)
^-- SC2034: major appears unused. Verify use (or export if used externally).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We use it to drop privileges from root to other users in the installer
process (which ideally, we would remove, but it will take some
annoying refactoring).
This should generally be safe to do, since the default sudo
permissions only allow root to use it anyway.
See https://github.com/zulip/zulip/issues/10036 for the follow-up
issue of removing the need to do this.
This dramatically reduces the Tornado downtime when restarting a Zulip
server, which is generally the most significant source of user-facing
bad experiences.
Because we renamed the "google" iconset to be the modern Google set,
not what is now called the "googleblob" icon set, we need to make sure
that our usually correct policy of not overwriting image files under
`prod-static/` doesn't apply to files potentially being copied in for
the emoji images.
We fix this by just deleting the `images-google-64` directory on
upgrade if it contains the googleblob version of the "hotdog" emoji.
Fixes#10038.
Previously, we were having issues installing on Debian Stretch with
non-English locales, because `locale-gen` actually doesn't take a
locale as an argument (and thus `locale-gen en_US.UTF-8` did nothing).
We should instead be calling localedef directly.
Thanks to Tom Daff for debugging this.
Fixes#10629.
For building Zulip in an environment where a custom CA certificate is
required to access the public Internet, one needs to be able to
specify that CA certificate for all network access done by the Zulip
installer/build process. This change allows configuring that via the
environment.
Thanks to changes in restart-server, this is now already happening there.
(The restart-server changes were required to ensure that if the
upgrade failes and one just does
/home/zulip/deployments/next/restart-server to recover, the right
thing happens; so this is the correct resolution to the conflict).
In scripts/setup/terminate-psql-sessions line 5:
[ "$1" = "`echo -e "$1\n$2" | sort -V | tail -n1`" ]
^-- SC2006: Use $(..) instead of legacy `..`.
^-- SC1117: Backslash is literal in "\n". Prefer explicit escaping: "\\n".
In scripts/setup/terminate-psql-sessions line 20:
major=$(echo $version | cut -d. -f1,2)
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/terminate-psql-sessions line 24:
tables=$(echo "'$@'" | sed "s/ /','/g")
^-- SC2145: Argument mixes string and array. Use * or separate argument.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/setup-certbot line 64:
if [ -z "$DOMAIN" -o -z "$EMAIL" ]; then
^-- SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.
In scripts/setup/setup-certbot line 73:
method_args=(--webroot --webroot-path=/var/lib/zulip/certbot-webroot/)
^-- SC2191: The = here is literal. To assign by index, use ( [index]=value ) with no spaces. To keep as literal, quote it.
In scripts/setup/setup-certbot line 112:
if [ -z "$deploy_hook" ]; then
^-- SC2128: Expanding an array without an index only gives the first element.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/postgres-init-db line 12:
records=`su "$POSTGRES_USER" -c "psql -Atc 'SELECT COUNT(*) FROM zulip.zerver_message;' zulip" | cat`
^-- SC2006: Use $(..) instead of legacy `..`.
In scripts/setup/postgres-init-db line 35:
source "$(dirname "$0")/terminate-psql-sessions" postgres zulip zulip_base
^-- SC1090: Can't follow non-constant source. Use a directive to specify location.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/install line 18:
if [ $failed = 1 ]; then
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/install line 19:
echo -e "\033[0;31m"
^-- SC1117: Backslash is literal in "\0". Prefer explicit escaping: "\\0".
In scripts/setup/install line 25:
echo -e "\033[0m"
^-- SC1117: Backslash is literal in "\0". Prefer explicit escaping: "\\0".
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/initialize-database line 38:
echo -e "\033[32mPopulating default database failed."
^-- SC1117: Backslash is literal in "\0". Prefer explicit escaping: "\\0".
In scripts/setup/initialize-database line 42:
echo -e "\033[0m"
^-- SC1117: Backslash is literal in "\0". Prefer explicit escaping: "\\0".
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/generate-self-signed-cert line 36:
if [ -n "$EXISTS_OK" ] && [ -e "$KEYFILE" -a -e "$CERTFILE" ]; then
^-- SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
In scripts/setup/generate-self-signed-cert line 40:
if [ -z "$FORCE" ] && [ -e "$KEYFILE" -o -e "$CERTFILE" ]; then
^-- SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/configure-rabbitmq line 13:
sudo rabbitmqctl $RABBITMQ_FLAGS delete_user "$RABBITMQ_USERNAME" || true
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/configure-rabbitmq line 14:
sudo rabbitmqctl $RABBITMQ_FLAGS delete_user zulip || true
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/configure-rabbitmq line 15:
sudo rabbitmqctl $RABBITMQ_FLAGS delete_user guest || true
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/configure-rabbitmq line 16:
sudo rabbitmqctl $RABBITMQ_FLAGS add_user "$RABBITMQ_USERNAME" "$RABBITMQ_PASSWORD"
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/configure-rabbitmq line 17:
sudo rabbitmqctl $RABBITMQ_FLAGS set_user_tags "$RABBITMQ_USERNAME" administrator
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/setup/configure-rabbitmq line 18:
sudo rabbitmqctl $RABBITMQ_FLAGS set_permissions -p / "$RABBITMQ_USERNAME" '.*' '.*' '.*'
^-- SC2086: Double quote to prevent globbing and word splitting.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/setup-apt-repo line 6:
zulip_source_hash=`sha1sum $SOURCES_FILE`
^-- SC2006: Use $(..) instead of legacy `..`.
In scripts/lib/setup-apt-repo line 10:
SCRIPTS_PATH="$(dirname $(dirname $0))"
^-- SC2046: Quote this to prevent word splitting.
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/setup-apt-repo line 36:
if [ "$zulip_source_hash" = "`sha1sum $SOURCES_FILE`" ] && ! [ -e "$STAMP_FILE" ]; then
^-- SC2006: Use $(..) instead of legacy `..`.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/install-node line 34:
source "$NVM_DIR/nvm.sh"
^-- SC1090: Can't follow non-constant source. Use a directive to specify location.
In scripts/lib/install-node line 36:
export NODE_BIN="$(nvm which default)"
^-- SC2155: Declare and assign separately to avoid masking return values.
In scripts/lib/install-node line 39:
n=$(which node)
^-- SC2230: which is non-standard. Use builtin 'command -v' instead.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/create-zulip-admin line 3:
if ([ "$ZULIP_USER_CREATION_ENABLED" == "True" ] || [ "$ZULIP_USER_CREATION_ENABLED" == "true" ]) && \
^-- SC2235: Use { ..; } instead of (..) to avoid subshell overhead.
In scripts/lib/create-zulip-admin line 4:
([ -z "$ZULIP_USER_DOMAIN" ] || \
^-- SC2235: Use { ..; } instead of (..) to avoid subshell overhead.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/lib/certbot-maybe-renew line 8:
case "$(echo "$value" | tr A-Z a-z)" in
^-- SC2019: Use '[:upper:]' to support accents and foreign alphabets.
^-- SC2018: Use '[:lower:]' to support accents and foreign alphabets.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This flag is used to track which user/message pairs correspond to an
active mobile push notification, that should potentially be cleared
when the user reads the message.
This flag should never appear on a message that is also marked as
read; eventually we may want a cron job to check for that condition.
We include a partial index on UserMessage for this flag.
Apparently, our Python 3 conversion for the early-migrations logic
here was incorrect, and as a result we never set
need_create_large_indexes to True (because we were checking whether a
`bytes` was inside a list of `str`s).
The simplest fix would be to just add a `.decode()` in one place, but
this refactor to just decode at the beginning is a lot more readable.
This is mostly important in that if you're running this as part of a
follow-up to a failed upgrade, and you don't do this,
process-fts-updates will be left not running, resulting in full-text
search not updating.
The is_private flag is intended to be set if recipient type is
'private'(1) or 'huddle'(3), otherwise i.e if it is 'stream'(2), it
should be unset.
This commit adds a database index for the is_private flag (which we'll
need to use it). That index is used to reset the flag if it was
already set. The already set flags were due to a previous removal of
is_me_message flag for which the values were not cleared out.
For now, the is_private flag is always 0 since the really hard part of
this migration is clearing the unspecified previous state; future
commits will fully implement it actually doing something.
History: Migration rewritten significantly by tabbott to ensure it
runs in only 3 minutes on chat.zulip.org. A key detail in making that
work was to ensure that we use the new index for the queries to find
rows to update (which currently requires the `order_by` and `limit`
clauses).
This package is important in order to avoid scary-looking errors
whenever we upgrade the dependencies in thumbor.txt (where
virtualenv-clone isn't installed in the venv, and then gets installed
by the code we just added a TODO comment to.
Apparently, perl at least expects LANG, LANGUAGE, and LC_ALL to be
consistent, and thus apt spits out a bunch of warnings if these are
different. So if we're forcing LC_ALL in these installer/upgrade
script blocks, we should force the rest too.
I believe this fixes the remaining locale part of #9946.
--agree-tos is useful for the Docker environment, where we won't have
an interactive shell present for agreeing to the ToS.
--deploy-hook is also useful for the Docker environment; it makes it
possible to customize what deploy hook (if any) we pass into the
underlying cerbot command.
This migrates Zulip to use a dramatically better set of names and
aliases for our emoji set, defined in emoji_names.py (which is in turn
manually generated from our hand-curated CSV file).
This should significantly improve the experience of using Zulip's
emoji picker and emoji typeahead for finding what one is looking for.
We were already correctly including libssl-dev in Zulip's dependencies
in development environment provisioning, but (at least now) it's
needed to build certain Python packages like pycurl when building a
Zulip virtualenv in production. I haven't investigated why we didn't
need this on Ubuntu, but one possible reason would be that some other
library in our dependencies list happens to depend on it on Ubuntu.
We fix this by moving the dependency over to the shared
VENV_DEPENDENCIES list.
Fixes part of #9946.
Apparently, at least some Debian stretch systems don't have an
/etc/lsb-release, so the optimization that we did in
5d39a0f0fc broke our installer on
Debian.
We fix this, by falling back to calling the lsb_release command on
systems that don't have a faster way to do it.
Fixes part of #9946.
This commits adds the necessary puppet configuration and
installer/upgrade code for installing and managing the thumbor service
in production. This configuration is gated by the 'thumbor.pp'
manifest being enabled (which is not yet the default), and so this
commit should have no effect in a default Zulip production environment
(or in the long term, in any Zulip production server that isn't using
thumbor).
Credit for this effort is shared by @TigorC (who initiated the work on
this project), @joshland (who did a great deal of work on this and got
it working during PyCon 2017) and @adnrs96, who completed the work.
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This saves about 400ms when running clean-unused-caches, basically by
calling its sub-rountines by import (rather than
`subprocess.check_call()`). The performance optimization seems well worth it.
Fixes#9766.
This file looks like it's producing some kind of compilation of the
mobile strings, that the mobile app will somehow end up using --
especially as it refers to its output as a "resource file". In
reality, it compiles statistics to be included in the language-picker
UI in the web app. Give appropriate names to the identifiers so it's
less confusing.
This improves the performance of these operations, by saving a ~50ms
Python process startup. While not a major performance improvement, it
seems worth it, given how often these commands get run.
Fixes#9571.
Structurally, this queue has the same property as the missed_message
one, namely that it accumulates things and processes them only every
few minutes.
This should stop Zulip from paging in response to slow queries
accumulating when a server restart happens.
On newer distros like Xenial, Stretch, etc., we were incorrectly not
installing the Python 3 version of the virtualenv package. This was
accidentally working because most base images with Python already have
this package too, but this was failing to install the right
dependencies in our Docker builds, requiring unnecessary manual code.
We fixed this some time ago for provision.py, but not for production.
This is multi-stage build which first builds tsearch-extras with the
current version of postgres and then configs postgres for zulip. The
zulip config installs the hunspell dictionaries, stop words file,
tsearch-extras, and creates the initial database.
**Testing Plan:**
1) `docker-compose up` the existing config.
2) Build the new image
3) Edit docker-compose.yml to use the new image id
4) `docker-compose up` and verify full text search is still working.
The docker installer configuration incorrectly had has_appserver set
to 0; this meant that (A) the docker-zulip code needed to copy the
block of code in the installer for the `has_appserver` case into the
Dockerfile (unnecessarily), and (B) one couldn't use `install` from a
Git ref (because the static asset compiler didn't end up in the right
place).
It appears that docker-zulip tried to set this flag in their `install`
command line, but the construction inside `install` meant that didn't
work.
This fixes adding the Ubuntu repositories for Debian, as well as makes
sure that we install the debian-archive-keyring package on Debian,
which is only priority important (and thus might be missing).
It wasn't obvious reading this message that you can perfectly well
bring your own SSL/TLS certificate; unless you read quite a bit
between the lines where we say "could not find", or followed the link
to the detailed docs, the message sounded like you had to either use
--certbot or --self-signed-cert.
So, explicitly mention the BYO option. Because the "complete chain"
requirement is a bit tricky, don't try to give instructions for it
in this message; just refer the reader to the docs.
Also, drop the logic to identify which of the files is missing; it
certainly makes the code more complex, and I think even the error
message is actually clearer when it just gives the complete list of
required files -- it's much more likely that the reader doesn't know
what's required than that they do and have missed one, and even then
it's easy for them to look for themselves.
This fixes a bug where provision was failing since our most recent
upgrade to yarn/nvm/node.
It turns out my original fix was the correct fix, but to the wrong
third-party tool: nvm, not yarn, was the offender.
Apparently, new versions of yarn use the HOME environment variable to
figure out where to access their configuration, and sudo apparently
doesn't clear that variable, so install-node was being run with HOME
set to something under /home/vagrant (e.g.).
Fix this by just setting that environment variable correctly.
This replaces 250a036ff8, which
misdiagnosed the issue.
It appears that some change in yarn's versioning system means that
installing yarn itself ends up chowning its config directory
incorrectly to be owned by root, preventing `yarn install` from
working later.
node -> v8.9.4
yarn -> 1.5.1
nvm -> 0.33.8
Also updates a test in timerender.js which depends on time
provided by node which is now changed in newer release.
Some changes have been made in circeci script, we just create ~/.config
directory and chown it to circleci user so installing new version of yarn
does not cause any ci failure on circleci during provision.
This is the analog of 7b2c9223e7 for the
emoji cache; the only difference is that the existing code was working
correctly. It's still worth changing for improved robustness.
We saw issues with /srv/zulip_npm_cache being cleaned incorrectly by
this tool in production (more correctly, we noticed broken symlinks to
those directories, even from the current deployment). Print-debugging
showed that indeed older deployments were being ignored, because the
logic for `get_caches_in_use` was totally broken (this was sorta
masked because we also keep the last week's deployments).
The specific bug here turned out to be that we weren't passing the
`production` argument to generate_sha1sum_node_modules, but the
broader problem is that this logic isn't robust to changes in the
hashing algorithm.
Fix this by replacing the broken logic for trying to compute the
correct hash for that deployment with just checking the symlink inside
the deployment to let it self-report.
We can't easily do this same change for clean-venv-cache, because we
use multiple virtualenvs there. But a similar change could be useful
for the emoji cache as well.
Fixes#8116.
This is apparently installed by the perl package; I hadn't even known
it existed. We of course want to use the sha1sum command from
coreutils.
Fixes#8836.
This commit switches our emoji infrastructure to use 256 color indexed
64px spritesheets. Earlier we were using non-indexed 32px spritesheets
which were blurry on high dpi displays. These indexed spritesheets not
only provide a crispier display but are also smaller in size.
This commit also removes the `emoji-datasource` package as a dependency
as all the data is now sourced from individual datasource packages.
Fixes: #7862.
The installation isn't really complete here, and wasn't even when this
was the only success case; the instructions we're giving are for the
next step in the installation.
These instructions don't say what to do in an actual use case for this
option, but decent instructions there will require having a concrete
use case in front of us and designing the flow for it. At this stage,
just say where we are in the normal flow, and an admin who's chosen to
go off that flow can figure out how they want to vary it from there.
This flips the experimental `--express` option to be the default.
We retain the old behavior, where the script exits before
`initialize-database`, as an option `--no-init-db`; it might be useful
in e.g. a migration scenario (from a Zulip install elsewhere, or
another chat system) where the admin wants to set up the database
separately.
The install instructions are adjusted to match, getting shorter by two
steps and a bunch of words. I think this opens up opportunities to
refactor the text to simplify things further, too, but leaving that
for another commit.
Also tweak the "production" test suite to match.
Kind of unfortunate because the `sudo` interface for running a command
is objectively better -- a list of arguments, rather than a string to
be re-parsed by the shell. But some bare-bones machine images lack
`sudo`, so this makes things a bit more portable.
Revert c8f034e9a "queue: Remove missedmessage_email_senders code."
As the comment in the code says, it ensures a smooth upgrade path
from 1.7.x; we can delete it in master after 1.8.0 is released.
The removal commit was merged early due to a communication failure.
We do the following here:
* Remove libjasper-dev from THUMBOR_VENV_DEPENDENCIES.
Reason: This dependancy wasn't really needed by us for using
thumbor. It was a dependancy for using open-cv as Imaging Engine
in thumbor but we use PIL (Pillow now) as Imaging Engine.
* Add zlib1g-dev, libfreetype6-dev to THUMBOR_VENV_DEPENDENCIES.
Reason: These are dependancies of Pillow which are required for it
Pillow to function. Since we use Pillow in thumbor as Imaging Engine
we need these. Stuff before this didn't break because we also use
Pillow in development Environment and have these dependancies
installed from VENV_DEPENDENCIES as well.
The zulip user has no need to see this file; it's used by nginx.
And when we set up the cert early in install, there's no zulip user
yet anyway, so this fails.
We'll make this the normal behavior soon, once we're satisfied with
our arrangements for sending the admin straight to realm creation and
using the app without configuring email. The instructions in the docs
will also have to change accordingly, of course.
This script iterates over all the mobile.json resources and creates a
single file at static/locale/mobile_info.json which contains total and
not-translated strings information against each language. After doing
this, it deletes all the mobile i18n resources downloaded by
tools/sync-translations because we neither want to check them in our
repository nor we want to make our repository dirty.
This causes us to give an error if you pass the installer any
positional arguments, e.g. with `--`. There's no reason you'd want
to do this, but I accidentally did it by passing an extra `--` to
the `test-install/install` wrapper and spent a few minutes on
confused debugging.
Thanks to the magic of `set -x`, I noticed this:
```
+ cat
++ ssl-cert
/tmp/src/zulip-server/scripts/setup/generate-self-signed-cert: line 49: ssl-cert: command not found
+ apt-get install -y openssl
[...]
```
In other words, we were trying to run `ssl-cert` -- the name of a
Debian package I meant to refer to in a comment inside the templated
temporary config file for `openssl req` -- as if it were a command.
It wasn't, hence the error.
Because `set -e` has loopholes like a sieve, this didn't cause the
script to exit, just produced this funny output and presumably caused
the config file's comment to be missing a word. In principle, it
could do something surprising if for some reason there were a command
named `ssl-cert` on PATH.
Fix it.
This gives us just one way of adopting a self-signed cert, rather than
one script which would generate a new one and an option to another
which would symlink to the system's snakeoil cert. Now those two
codepaths converge, and do the same thing.
The small advantage of generating our own over the alternative is that
it lets us set the name in the cert to EXTERNAL_HOST, rather than the
system's hostname as embedded in the system snakeoil certs. Not a big
deal, but might make things go slightly smoother if some browsers are
lenient (in a way that they probably shouldn't be.)
Take the core of the logic from how Debian generates the system's
/etc/ssl/certs/ssl-cert-snakeoil.pem ; that gives me more confidence
in the various config choices, and it also demonstrates a much cleaner
way to use the `openssl` tool. Also replace the outer shell logic for
CLI and logging with a cleaner version.
Before this fix, the installer has an extremely annoying bug where
when run inside a container with `lxc-attach`, when the installer
finishes, the `lxc-attach` just hangs and doesn't respond even to
C-c or C-z. The only way to get the terminal back is to root around
from some other terminal to find the PID and kill it; then run
something like `stty sane` to fix the messed-up terminal settings
left behind.
After bisecting pieces of the install script to locate which step
was causing the issue, it comes down to the `service camo restart`.
The comment here indicates that we knew about an annoying bug here
years ago, and just swept it under the rug by skipping this step
when in Travis. >_<
The issue can be reproduced by running simply `service camo restart`
under `lxc-attach` instead of the installer; or `service camo start`,
following a `service camo stop`. If `lxc-attach` is used to get an
interactive shell, these commands appear to work fine; but then when
that shell exits, the same hang appears. So, when we start camo
we're evidently leaving some kind of mess that entangles the daemon
with our shell.
Looking at the camo initscript where it starts the daemon, there's
not much code, and one flag jumps out as suspicious:
start-stop-daemon --start --quiet --pidfile $PIDFILE -bm \
--exec $DAEMON --no-close -c nobody --test > /dev/null 2>&1 \
|| return 1
start-stop-daemon --start --quiet --pidfile $PIDFILE -bm \
--no-close -c nobody --exec $DAEMON -- \
$DAEMON_ARGS >> /var/log/camo/camo.log 2>&1 \
|| return 2
What does `--no-close` do?
-C, --no-close
Do not close any file descriptor when forcing the daemon
into the background (since version 1.16.5). Used for
debugging purposes to see the process output, or to
redirect file descriptors to log the process output.
And in fact, looking in /proc/PID/fd while a hang is happening finds
that fd 0 on the camo daemon process, aka stdin, is connected to our
terminal.
So, stop that by denying the initscript our stdin in the first place.
This fixes the problem.
The Debian maintainer turns out to be "Zulip Debian Packaging Team",
at debian@zulip.com; so this package and its bugs are basically ours.
This provides a major simplification for non-production installs,
including our own testing (it's already in both the test-install
harness script and the "production" test suite) as well as potential
admins evaluating Zulip.
Ultimately this should probably be the default behavior, with perhaps
something shown to admins on the web as a reminder and link to help on
installing a better certificate. For now, pending working through
that, just get the behavior in and leave it opt-in.
It's not appropriate for our script to pass the `--agree-tos` flag
without any evidence of the user actually having any knowledge of,
let alone intent to agree to, any such ToS. Stop doing that.
Fortunately this script hasn't been part of any release, so it's
likely that no users have gone down this path.
The third-party `install-yarn.sh` script uses `curl`, and we invoke it
in `install-node`. So we need to install it as a dependency.
We've mostly gotten away with this because it's common for `curl` to
already be installed; but it isn't always.
This commit just copies all the code from MissedMessageSendingWorker
class to a new EmailSendingWorker class. All the logic to send an email
through a queue was already there. This commit only makes the logic
generic. It does so by creating a special purpose queue called
'email_senders' to send any type of email. To make
MissedMessageSendingWorker still work we derive it from
EmailSendingWorker. All the tests that were testing
MissedMessageSendingWorker now run against EmailSendingWorker.
Apparently, this was checking the wrong path in Travis CI, and thus
never actually running (meaning we'd accumulate every `node_modules`
directory ever in the Travis caches, which in turn resulted in very
slow builds).
This updates commit 11ab545f3 "install: Set the locale ..."
to be somewhat cleaner, and to explain more in the commit message.
In some environments, either pip itself fails or some packages fail to
install, and setting the locale to en_US.UTF-8 resolves the issue.
We heard reports of this kind of behavior with at least two different
sets of symptoms, with 1.7.0 or its release candidates:
https://chat.zulip.org/#narrow/stream/general/subject/Trusty.201.2E7.20Upgrade/near/302214https://chat.zulip.org/#narrow/stream/production.20help/subject/1.2E6.20to.201.2E7/near/306250
In all reported cases, commit 11ab545f3 or equivalent fixed the issue.
Setting LC_CTYPE is redundant when also setting LC_ALL, because LC_ALL
overrides all `LC_*` environment variables; so skip that. Also move
the line in `install` to a more appropriate spot, and adjust the
comments.
This fixes a bug where, when a user is unsubscribed from a stream,
they might have unread messages on that stream leak. While it might
seem to be a minor problem, it can cause significant problems for
computing the `unread_msgs` data structures, since it means we need to
add an extra filter for whether the user is still subscribed, either
in the backend or in the UI.
Fixes#7095.
This commit renames various source requirements files like `dev.txt`,
`mypy.txt` etc to `dev.in`, `mypy.in` etc and various locked requirements
files like `dev_lock.txt`, `mypy_lock.txt` etc to `dev.txt`, `mypy.txt`
etc. This will help in emphasizing to the user that *.in are actually
input to `update-locked-requirements` tool which should be run after
updating any of these.
In this commit we add new dependencies needed for running thumbor.
Also we add the script for creating the virtual environment ready
for thumbor.
Note: Thumbor will use python2 and thus have different virtualenv
dedicated to it.
Credits to @TigorC and @joshland as well for there work on this.
The script already won't work without them; so if the user gets the
invocation wrong, give a halfway-reasonable error rather than just
crash into the ground.
This allows the installer to continue using this script for the
`standalone` method, while the no-argument form now uses the same
`webroot` method as the renewal cron job, suitable for running
by hand to adopt Certbot after initial install.
Certbot replaces the cert files under /etc/letsencrypt/live/,
which our nginx config refers to symlinks to; but it doesn't
tell nginx there's been an update, so nginx keeps serving the
old cert.
This is fine as long as nginx is restarted, or just told to
reload its config, at some point before the cert actually
expires about 30 days later. Which is probably the common
case, but of course we should make it just work. So, if we
actually renew a cert, tell nginx to reload its config now.
This causes the cron job to run only when a Zulip-managed certbot
install is actually set up.
Inside `install`, zulip.conf doesn't yet exist when we run
setup-certbot, so we write the setting later. But we also give
setup-certbot the ability to write the setting itself, so that we
can recommend it in instructions for adopting certbot in an
existing Zulip installation.
This helps make this script suitable to run on existing installations,
by mitigating any worry about clobbering existing certs with links to
the new ones, in case the admin changes their mind or was using the
certs for something else too.
Except in:
- docs/writing-bots-guide.md, because bots are supposed to be Python 2
compatible
- puppet/zulip_ops/files/zulip-ec2-configure-interfaces, because this
script is still on python2.7
- tools/lint
- tools/linter_lib
- tools/lister.py
For the latter two, because they might be yanked away to a separate repo
for general use with other FLOSS projects.