Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
certbot-auto doesn’t work on Ubuntu 20.04, and won’t be updated; we
migrate to instead using the certbot package shipped with the OS
instead. Also made sure that sure certbot gets installed when running
zulip-puppet-apply, to handle existing systems.
We already override the umask in upgrade-zulip-stage-2, but that’s too
late since we’ve already written a bunch of files in stage 1. I would
have removed the stage 2 override, but the OS upgrade documentation
references running stage 2 directly.
Fixes#15164. Note that an affected installation will need to upgrade
twice, because the first upgrade uses the old stage 1.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Ubuntu Focal comes with ruby 2.7 and the latest puppet
has some issues with it so to suppress puppet
warnings with ruby 2.7 we added RUBYOPT = "-W0" in
the environment.
This allows straight-forward configuration of realm-based Tornado
sharding through simply editing /etc/zulip/zulip.conf to configure
shards and running scripts/refresh-sharding-and-restart.
Co-Author-By: Mateusz Mandera <mateusz.mandera@zulip.com>
While this functionality to post slow queries to a Zulip stream was
very useful in the early days of Zulip, when there were only a few
hundred accounts, it's long since been useless since (1) the total
request volume on larger Zulip servers run by Zulip developers, and
(2) other server operators don't want real-time notifications of slow
backend queries. The right structure for this is just a log file.
We get rid of the queue and replace it with a "zulip.slow_queries"
logger, which will still log to /var/log/zulip/slow_queries.log for
ease of access to this information and propagate to the other logging
handlers. Reducing the amount of queues is good for lowering zulip's
memory footprint and restart performance, since we run at least one
dedicated queue worker process for each one in most configurations.
Yes, it's slightly janky to create an
argparse.Namespace object like this, but it
saves us from shelling out to a script whose
only real value-add is parsing a single
`threshold_days` argument.
This saves about 130ms for a no-op provision.
We try to avoid importing Django settings unless
we really need them, since we want this program
to run very quickly during `provision` (when
secrets have already been generated earlier).
Since in travis we don't have root access so we used to add different
srv path. As now we shifted our production suites to Circle CI
we don't need that code so removed it.
Also we used a hacky code in commit-lint-message for travis which is
now of no use.
Now that we've cleaned up this tool's output, there's no reason to use
an awkward mechanism to hide its output; we can just print it out like
a normal program.
Fixes#14644; resolves#14701.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Previously, the send_custom_email code path leaked files in paths that
were not `.gitignored`, under templates/zerver/emails.
This became problematic when we added automated tests for this code
path, as it meant we leaked these files every time `test-backend` ran.
Fix this by ensuring all the files we generate are in this special
subdirectory.
Since now we want to use production suites on Circle CI so there
is no need to set TRAVIS in env while running scripts.
CIRCLECI is set default in the enviroment of Circle CI builds
so we can use it directly.
Also Travis CI had rabbitmq-server installed so we had to add workaround
in install script to avoid the error. That workaround is removed.
We now have two functions related to digests
for processes:
is_digest_obsolete
write_digest_file
In most cases we now **wait** to write the
digest file until after we've successfully
run a process with its new inputs.
In one place, for database migrations, we
continue to write the digest optimistically.
We'll want to fix this, but it requires a
little more code cleanup.
Here is the typical sequence of events:
NEVER RUN -
is_digest_obsolete returns True
quickly (we don't compute a hash)
write_digest_file does a write (duh)
AFTER NO CHANGES -
is_digest_obsolete returns False
after reading one file for old
hash and multiple files to compute
hash
most callers skip write_digest_file
(no files are changed)
AFTER SOME CHANGES -
is_digest_obsolete returns False
after doing full checks
most callers call write_digest_file
*after* running a process
I remove `is_force` from `file_or_package_hash_updated`
and modernize its mypy annotations.
If `is_force` is `True`, we just now run the thing
we want to force-run without having to call
`file_or_package_hash_updated` to expensively
and riskily return `True`.
Another nice outcome of this change is that if
`file_or_package_hash_updated` returns `True`,
you can know that the file or package has
indeed been updated.
For the case of `build_pygments_data` we also
skip an `os.path.exists` check when `is_force`
is `True`.
We will short-circuit more logic in the next
few commits, as well as cleaning up some of
the long/wrapper lines in the `if` statements.
We stopped using tsearch-extras in Zulip 2.1.0 after Anders figured
out how to achieve its goals with native postgres. However, we never
did a `DROP EXTENSION` on systems thta had upgraded, which meant that
backups created on systems originally installed with Zulip 2.0.x and
older, and later upgraded to Zulip 2.1.x, could not be restored on
Zulip servers created with a fresh install of Zulip 2.1.x.
We can't do this with a normal database migration, because DROP
EXTENSION has to be done as the postgres user, so we add some custom
migration code in the upgrade-zulip-stage-2 tool.
It's safe to run this whenever tsearch_extras.control is installed because:
* Zulip is AFAIK the only software that ever used tsearch_extras.
* The package was only installed via puppet on production servers configured to
run a local Zulip database.
* We'll only run this code once per system, because it removes the
package and thus the control files.
Fixes#13612.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
After some testing, I've confirmed that this seems to behave
significantly better in terms of the number of failed requests due to
Tornado being the process of restarting compared with the previous
version, as each individual process is only down for a short time,
rather than all of them being down at once.
Missing commas in the definition of all the queues to check meant that it would be looking for queues with concatenated names, rather than the correct ones. Added the commas.
Used get_venv_dependencies function to return the correct dependencies
for RHEL, Centos, Fedora rather than importing them as separate
COMMON_YUM_DEPENDENCIES in provision and create-production-venv.
In virtualenv ≥ 20, the site_packages variable was removed from
activate_this.py. To avoid a KeyError, replace
activate_locals['site_packages'] with os.path.join(venv, 'lib',
python_version), where python_version is the 'pythonX.Y' name of the
directory where site-packages resides in the virtualenv.
Fixes#14025.
Added a get_venv_dependencies() function in setup_venv.py which
returns VENV_DEPENDENCIES according to the vendor and os_version.
The reason for adding this function was because python-dev will be
depreciated in Focal but can be used as python2-dev so when adding
support for Focal VENV_DEPENDENCIES should to be os_version dependent.
isort 5 knows not to reorder imports across function calls, so this
will stop isort from breaking our code.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This adds Ubuntu 19.10 as a valid provisioning target.
The release test in setup-apt-repo was changed from a list of values to
a regex check for brevity.
The “Smileys & People” category has been split into “Smilys & Emotion”
and “People & Body”.
Also, fix generate_sha1sum_emoji to read the emoji-datasource-google
version from yarn.lock, since package.json only gives a version range.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".
At the time, we just needed a name quickly, but it was never a good
name, just a placeholder. This removes that placeholder name from
much of the codebase. A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
These docstrings hadn't been properly updated in years, and bad an
awkward mix of a bad version of the user-facing documentation and
details that are no longer true (e.g. references to "Voyager").
(One important detail is that we have real documentation for this
system now).
This legacy cross-realm bot hasn't been used in several years, as far
as I know. If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.
Fixes#13533.
At some point the PostgreSQL Docker image started creating the zulip
database for us, which caused our CREATE DATABASE to fail.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Our recent fixes to using the system's configured memcached settings
broke populate_db, because its hacky clear_database helper is called
with a hacked-up settings module.
We fix this by first moving this out-of-place code from models.py into
populate_db, and then saving the settings required to access memcached
so that we can use them in clear_database.
We also fix a mypy erorr in flush-memcached that matches the same
issue fixed in clear_database.
This simplifies the RDS installation process to avoid awkwardly
requiring running the installer twice, and also is significantly more
robust in handling issues around rerunning the installer.
Finally, the answer for whether dictionaries are missing is available
to Django for future use in warnings/etc. around full-text search not
being great with this configuration, should they be required.
We'll be soon documenting a production workflow that involves using
it, and that means it needs to live under scripts/ (since tools/ isn't
present in release tarballs).
`copytree` throws an error if the target already exists, and we don’t
really want to rerun the copy anyway.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Ultimately, this isn't an effective way to monitor this queue; we want
time-based monitoring, not count-based monitoring. Doing that
properly will likely involve modifying the queue processor to write
something about its status.
But until we add the monitoring we want, it makes sense to leave this
active with low limits.
This is needed on at least Debian 10, otherwise xmlsec fails to
install: `Could not find xmlsec1 config. Are libxmlsec1-dev and
pkg-config installed?`
Also remove libxmlsec1-openssl, which libxmlsec1-dev already depends.
(No changes are needed on RHEL, where libxml2-devel and xmlsec1-devel
already declare a requirement on /usr/bin/pkg-config.)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
The output log from running clean_unused_caches was too verbose as
part of the `upgrade-zulip` overall output. While this output is
potentially helpful when running it directly for debugging, it's
certainly redundant for the main production use case.
So a new flag --no-print-headers is introduced. It suppresses the
header outputs for the subtools.
Fixes#13214.
This allows the system to get updates to the Groonga repository
signing key, so `apt update` doesn’t start failing when the key
changes (like it recently did).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
debian-archive-keyring is a dependency of the essential package apt,
so it is present in every Debian system.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
virtualenv on Ubuntu 16.04, when creating a new environment, downloads
the current version of setuptools, then replaces its pkg_resources
with an old copy from
/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl.
This causes problems, a simple example of which is reproducible from
the ubuntu:16.04 Docker base image as follows:
apt-get update
apt-get -y install python3-virtualenv
python3 -m virtualenv -p python3 /ve
/ve/bin/pip install sockjs-tornado
/ve/bin/pip download sockjs-tornado
→ `AttributeError: '_NamespacePath' object has no attribute 'sort'`
More relevantly, it breaks pip-compile in the same way. To fix this,
we need to force setuptools to be reinstalled, even if we’re asking
for the same version.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This should dramatically improve the queue processor's performance in
cases where there's a very high volume of requests on a given endpoint
by a given user, as described in the new docstring.
Until we test this more broadly in production, we won't know if this
is a full solution to the problem, but I think it's likely. We've
never seen the UserActivityInterval worker end up backlogged without a
total queue processor outage, and it should have a similar workload.
Fixes#13180.
To replace DISTRIB_FAMILY, there’s now an os_families function using
the standard ID and ID_LIKE information in /etc/os-release.
Fixes#13070; fixes#13071.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We no longer use tsearch_extras, and the camo patch is irrelevant on
systemd systems (Xenial and newer). So we no longer need to
provide/install a PPA at all.
Closes#13027.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Now that we're implemented tsearch_extras in pure postgres, we no
longer need a custom extension. This should help us considerably, as
it means we no longer need to ship custom apt packages at all.
Fixes#467.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
As predicted in https://www.kb.cert.org/vuls/id/319816/, a malicious
worm is beginning to spread across the npm ecosystem through package
postinstall scripts. Only instead of direct self-replicating code,
the replication vector is the temptation to monetize postinstall
scripts by polluting the console logs with paid advertisements. The
effect will be the same unless we all put a stop to this while we
still can.
Apply the recommended VU#319816 workaround, which is to disable
lifecycle scripts when installing npm packages. The only fallout is:
* node-sass can’t run because it uses compiled native code; we replace
it with Dart Sass.
* phantomjs-prebuilt doesn’t download the binary at install time; we
tell it to download it in run-casper.
* ttf2woff2 transparently falls back from native code to an Emscripten
build.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This commit finishes adding end-to-end support for the install script
on Debian Buster (making it production ready). Some support for this
was already added in prior commits such as
99414e2d96.
We plan to revert the postgres hunks of this once we've built
tsearch_extras for our packagecloud archive.
Fixes#9828.
Delete trailing newlines from all files, except
tools/ci/success-http-headers.txt and tools/setup/dev-motd, where they
are significant, and static/third, where we want to stay close to
upstream.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
As a result of dropping support for trusty, we can remove our old
pattern of putting `if False` before importing the typing module,
which was essential for Python 3.4 support, but not required and maybe
harmful on newer versions.
cron_file_helper
check_rabbitmq_consumers
hash_reqs
check_zephyr_mirror
check_personal_zephyr_mirrors
check_cron_file
zulip_tools
check_postgres_replication_lag
api_test_helpers
purge-old-deployments
setup_venv
node_cache
clean_venv_cache
clean_node_cache
clean_emoji_cache
pg_backup_and_purge
restore-backup
generate_secrets
zulip-ec2-configure-interfaces
diagnose
check_user_zephyr_mirror_liveness
The comment that tabbott edited into my commit while wimpifying this
function is wrong on multiple levels.
Firstly, the way in which users might be “running our scripts” was
never relevant. `__file__` is not the script that the user ran, it’s
zulip_tools.py itself. What matters is not how the user ran the
script, but rather how zulip_tools was imported. If zulip_tools was
imported as scripts.lib.zulip_tools, then `__file__` must end with
`scripts/lib/zulip_tools.py`, so running dirname three times on it is
fine. In fact, in Python ≥ 3.4 (we don’t support anything older),
`__file__` in an imported module is always an absolute path, so it
must end with `scripts/lib/zulip_tools.py` in any case.
(At present, there’s one script that imports lib.zulip_tools, and the
installer runs scripts/lib/zulip_tools.py as a script, but those uses
don’t hit this function.)
Secondly, even if we do care about `__file__` being a funny relative
path, there’s still no reason to have two calls to `realpath`.
`realpath(dirname(dirname(dirname(realpath(…)))))` is equivalent to
`dirname(dirname(dirname(realpath(…)))), as the inner `realpath` has
already canonicalized symlinks at every level.
This version also deals with `__file__` being a funny relative
path (assuming none of scripts, lib, and zulip_tools.py are themselves
symlinks), while making fewer `lstat` calls than either of the above
constructions.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This tool can be used to update the API field of local
zuliprc files for dummy users of development server
(iago, prospero, etc) with the correct API key from database.
This tool can be run after provisioning (or similar tools) which change
the API keys in the database.
As of commit cff40c557b (#9300), these
files are no longer served directly to the browser. Disentangle them
from the static asset pipeline so we can refactor it without worrying
about them.
This has the side effect of eliminating the accidental duplication of
translation data via hash-naming in our release tarballs.
This reverts commit b546391f0b (#1148).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Apparently, the `chown -R` would never run if the original clone
attempt had networking errors, leading to inability to use
upgrade-zulip-from-git without manual intervention.
Previously, it didn't properly update the stamp files that determine
our caching behavior, so if one ran test-backend afterwards, nothing
would happen.
A secondary issue that this commit does not fix is that provision will
end up rerunning the whole thing.
The ids that will be used for each particular run of the test suite are
written to a unique file. Each file will then be used as a time
reference of when the suite was ran.
This change sets up the ability for a complete clean up of potentially
leaked database templates.
Tweaked by tabbott to remove these files after successful database
cleanup.
We use `git describe --tags` to get information about the number of commit since
the last major version, and the sha of the current HEAD. This is added to the
ZULIP_VERSION when a deploy is done from `git`.
Modified heavily by punchagan to:
* to use git describe instead of `git log` and `wc`
* use a separate script to run the git describe command
* write the file with version info to var/ and remove it from the repo
Fixes#4685.
Previously, if you restored onto a different production system from
the one where you took the backup, backup restoration would fail
because the generated rabbitmq passwords for the two systems would be
different, and we didn't update the restored system to use the
password from the original system.
Fixes#12114.
This should ensure that we apply any special configuration for the
database system (e.g. installing `pgroonga`) before we try to restore
the database contents from the archive.
For pgroonga in particular, this is important so that we can preserve
the configuration of the extension in the `pg_restore` process.
Fixes#12345.
With the S3 file upload backend, we don't store uploads locally, so
the `uploads` directory in the backup will be empty, and more
importantly, LOCAL_UPLOADS_DIR will be None, which the previous code
crashed on.
API changes:
* The behaviour of Date.toLocaleTimeString() reverts to pre 8.0.0,
this only affects automated tests. Lots of other API changes but
we didn't use any of those.
* The internal sorting algorithm changed which causes one of our own
compare function to miss coverage.
Simulate isn’t enough in some cases. The error message when this
fails looks sufficiently non-alarming.
LXC:
default: + apt-get -dy install lsb-release apt-transport-https gnupg
default: Reading package lists...
default: Building dependency tree...
default:
default: Reading state information...
default: lsb-release is already the newest version.
default: gnupg is already the newest version.
default: The following NEW packages will be installed:
default: apt-transport-https
default: 0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
default: Need to get 25.1 kB of archives.
default: After this operation, 238 kB of additional disk space will be used.
default: Err http://archive.ubuntu.com/ubuntu/ trusty-updates/main apt-transport-https amd64 1.0.1ubuntu2.3
default: 404 Not Found [IP: 91.189.88.161 80]
default: Err http://security.ubuntu.com/ubuntu/ trusty-security/main apt-transport-https amd64 1.0.1ubuntu2.3
default: 404 Not Found [IP: 91.189.88.161 80]
default: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/a/apt/apt-transport-https_1.0.1ubuntu2.3_amd64.deb 404 Not Found [IP: 91.189.88.161 80]
default:
default: E: Some files failed to download
default: + apt-get update
[…]
default: Fetched 4,504 kB in 7s (611 kB/s)
default: Reading package lists...
default: + apt-get -y install lsb-release apt-transport-https gnupg
default: Reading package lists...
Docker:
default: + apt-get -dy install lsb-release apt-transport-https gnupg
default: Reading package lists...
default: Building dependency tree...
default:
default: Reading state information...
default: Package gnupg is not available, but is referred to by another package.
default: This may mean that the package is missing, has been obsoleted, or
default: is only available from another source
default: E: Package 'gnupg' has no installation candidate
default: + apt-get update
[…]
default: Fetched 16.2 MB in 5s (3,326 kB/s)
default: Reading package lists...
default: + apt-get -y install lsb-release apt-transport-https gnupg
default: Reading package lists...
(All in green.)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We have been semi-accidentally relying on the fact that terminate-psql-sessions
fails silently when there are PIDs we don't have permission to terminate.
This actually happens somewhat often, generally when we're doing a series of
operations in quick succession by different users, because postgres processes
live a little longer than the `psql` shell that started them.
As part of adding ON_STOP_ERROR to all of our postgres commands, it makes
sense to enforce we don't fail here, but that means we need to actually filter
the target PIDs to only ones we can actually kill.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Also use psql -e (--echo-queries) in scripts that use ‘set -x’, so
errors can be traced to a specific query from the output.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
The comment explains this in more detail, but basically one previously
needed the `--from-git` option to `upgrade-zulip-stage-2` if one had
last installed/upgraded from Git, and not that option otherwise, which
would have forced us to make the OS upgrade documentation much more
complicated than it needed to be.
Fixes permission errors when running restore-backup on a tarball
inaccessible to the zulip user.
Fixes#12125.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
activate_this.py has always documented that it should be exec()ed with
locals = globals, and in virtualenv 16.0.0 it raises a NameError
otherwise.
As a simplified demonstration of the weird things that can go wrong
when locals ≠ globals:
>>> exec('a = 1; print([a])', {}, {})
[1]
>>> exec('a = 1; print([a for b in [1]])', {}, {})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in <module>
File "<string>", line 1, in <listcomp>
NameError: name 'a' is not defined
>>> exec('a = 1; print([a for b in [1]])', {})
[1]
Top-level assignments go into locals, but from inside a new scope like
a list comprehension, they’re read out of globals, which doesn’t work.
Fixes#12030.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
In addition to upgrading dependencies being generally useful, this may
fix situations where yarn fails but returns a success status code in the
presence of an HTTP proxy.
The commit 87d1809657 changed the time when
digests are sent by 3 hours to account for moving from the US East Coast to the
West Coast, but didn't change the time period exception in the
`check-rabbitmq-queue` script.
Closes#5415
Now that we have the run_as_root helper function, we don't need to
install sudo to run Zulip in production
This reverts commit a7d7d181ea.
Fixes#10036.
Few folks will be upgrading from versions of Zulip old enough to not
have virtualenv-clone, and those who are won't be able to use it due
to older dependencies having been removed.
Apparently, while upgrade-zulip-from-git always ensures that zulip
deployment directories are owned by the Zulip user, unpack-zulip (aka
the tarball code path) has them owned by root.
The user ID detection logic in su_to_zulip's helper get_zulip_uid was
intended to support both development environments (where the user ID
might vary) and production environments. For development
environments, the existing code is fine, but given this unpack-zulip
permissions issue, we need to have code to fallback to 'zulip' if the
detection logic detects the "zulip" user has having UID 0.
There’s no reason to do this unless you’re, like, trying to trip the
Let’s Encrypt rate limits (or perhaps trying to manually test this code).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Apparently, virtualenv-clone ends up copying the success-stamp file
that we use to track whether a virtualenv was successfully
provisioned, which results in problems if we get a network error in
the pip install stage afterwards.
The comment explains our fix, but basically we just delete
success-stamp after the clone.
Fixes#11301.
On usage errors (except --help), write usage message to stderr and
exit with nonzero status.
Forbid setting the hostname and email to the example values. Those
are specifically checked for and would fail later.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Instead, manually activate it in the one place where this
functionality was used (tools/lib/provision.py). This way we avoid
trying to activate the Python 2 thumbor virtualenv from Python 3.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Nowm unless you specify `--fill-cache`, memcached caches will not be
pre-filled after a server restart. This will be helpful when someone
is in a hurry (e.g. if the server is down right now, or if he/she
testing a configuration change in a newly setup server), it's best to
just restart without pre-filling the cache.
Fixes: #10900.
The site_packages variable points to (e.g.)
zulip-py3-venv/lib/python3.4/site-packages. If that doesn’t exist,
we’re probably running the wrong Python version.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We still create a Python 2 virtualenv for thumbor but that’s
separate (/srv/zulip-thumbor-venv from
scripts/lib/create-thumbor-venv).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Otherwise this causes an error
```
AttributeError: type object 'Callable' has no attribute '_abc_registry'
```
on 3.7. While the error is specific to 3.7, it is safer to uninstall
typing for all the versions that don't require a pip-provided typing
library.
/bin/sh and /usr/bin/env are the only two binaries that NixOS provides
at a fixed path (outside a buildFHSUserEnv sandbox).
This discussion was split from #11004.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This is a common bug that users might be tempated to introduce.
And also fix two instances of this bug that were present in our
codebase, including an important one in our upgrade code path.
This makes it possible to add --skip-purge-old-deployments in the
deploy_options section of /etc/zulip/zulip.conf, and control whether
old deployments are purged automatically on a system.
We still need to do https://github.com/zulip/zulip/issues/10534 and
probably also to add these arguments to be directly passed into
upgrade-zulip, but that can wait for future work.
Fixes#10946.
This commit works by vendoring the couple functions we still use from
puppetlabs stdlib (join and range), but removing the rest of the
puppetlabs codebase, and of course cleaning up our linter rules in the
process.
Fixes#7423.
Since yarn has a package.json conveniently available, we can parse
that with jq, saving the expensive operation of starting up yarn.
This saves ~300ms in a no-op provision.
This makes it possible for the Puppet codebase to access the path to
the relevant /home/zulip/deployments type directory that puppet was
run from, which in turn makes it possible to safely call scripts from
here.
Based on work by Rein Zustand.
Apparently, we were incorrectly expressing the paths in the
caches_in_use data structures for these two cache-cleaning algorithms,
resulting in the default threshhold_days algorithm controlling which
caches could be garbage-collected. While the emoji one was just a
performance optimization for upgrade-zulip-from-git, it was possible
for the main `node_modules` cache in use in production to be GCed,
resulting in LaTeX rendering being broken.
This fixes an actual user-facing issue in our mobile push
notifications documentation (where we were incorrectly failing to
quote the argument to `./manage.py register_server` making it not
work), as well as preventing future similar issues from occurring
again via a linter rule.
Apparently, on Debian stretch, the gnupg package isn't installed by
default, which means that our `apt-key add` commands were failing with
these errors on an ultra-minimal Debian installation:
+ apt-key add ./scripts/setup/packagecloud.asc
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
+ apt-key add ./scripts/setup/pgroonga-debian.asc
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
Fixes#10480.
The original code was actually broken, in that it checked the wrong
path, but it didn't matter because it used `ln -nsf`.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Previously, we unconditionally tried to restart the Tornado process
name corresponding to the historically always-true case of a single
Tornado process. This resulted in Tornado not being automatically
restarted on a production deployment on servers with more than one
Tornado process configured.
This library was absolutely essential as part of our Python 2->3
migration process, but all of its calls should be either no-ops or
encode/decode operations.
Note also that the library has been wrong since the incorrect
refactoring in 1f9244e060.
Fixes#10807.
This commit allows specifying Subject Alternative Names to issue certs
for multiple domains using certbot. The first name passed to certbot-auto
becomes the common name for the certificate; common name and the other
names are then added to the SAN field. All of these arguments are now
positional. Also read the following for the certbot syntax reference:
https://community.letsencrypt.org/t/how-to-specify-subject-name-on-san/Fixes#10674.
By far the dominant cause of errors when installing apt packages is
not having the Universe repository enabled in Ubuntu bionic (this
seems to have started happening a lot recently; I wonder if Ubuntu
changed the defaults for new server installs or something?).
In any case, providing that suggestion in the error output should help
reduce these a lot.
This allows our Tornado monitoring to correctly report whether
multiple configured Tornado processes are running.
This setup isn't ideal, in that it can't detect cases where the wrong
set of Tornado processes are running, but it's nice and simple and
should catch most actual problems.
Fixes#10706.
Issue: Before this commit, the `refname` positional argument to
`upgrade-zulip-from-git` script would run successfully for a branch
name on the given remote, but the script would fail if it was
provided with a tag or commit ID.
Solution: 'git clone -q -b refname LOCAL_GIT_CACHE_DIR deploy_path`
would be split into two commands:
1.) `git clone -q LOCAL_GIT_CACHE_DIR deploy_path`
2.) `git checkout -b deploy_timestamp refname` which makes a new
branch with the same name as the timestamp used in make_deploy_path.
Adds an optional argument `--remote-url` to specify the remote URL.
Command line remote URL will be given preference above the one
in /etc/zulip/zulip.conf.
Fixes#6092.
In scripts/lib/install line 71:
ZULIP_PATH="$(readlink -f $(dirname $0)/../..)"
^-- SC2046: Quote this to prevent word splitting.
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 105:
mem_kb=$(cat /proc/meminfo | head -n1 | awk '{print $2}')
^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
In scripts/lib/install line 141:
apt-get -y dist-upgrade $APT_OPTIONS
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 145:
$ADDITIONAL_PACKAGES
^-- SC2086: Double quote to prevent globbing and word splitting.
In scripts/lib/install line 254:
if [ -n "ZULIP_ADMINISTRATOR" ]; then
^-- SC2157: Argument to -n is always true due to literal strings.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
In scripts/setup/terminate-psql-sessions line 16:
major=$(echo "$version" | cut -d. -f1,2)
^-- SC2034: major appears unused. Verify use (or export if used externally).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
We use it to drop privileges from root to other users in the installer
process (which ideally, we would remove, but it will take some
annoying refactoring).
This should generally be safe to do, since the default sudo
permissions only allow root to use it anyway.
See https://github.com/zulip/zulip/issues/10036 for the follow-up
issue of removing the need to do this.
This dramatically reduces the Tornado downtime when restarting a Zulip
server, which is generally the most significant source of user-facing
bad experiences.
Because we renamed the "google" iconset to be the modern Google set,
not what is now called the "googleblob" icon set, we need to make sure
that our usually correct policy of not overwriting image files under
`prod-static/` doesn't apply to files potentially being copied in for
the emoji images.
We fix this by just deleting the `images-google-64` directory on
upgrade if it contains the googleblob version of the "hotdog" emoji.
Fixes#10038.
Previously, we were having issues installing on Debian Stretch with
non-English locales, because `locale-gen` actually doesn't take a
locale as an argument (and thus `locale-gen en_US.UTF-8` did nothing).
We should instead be calling localedef directly.
Thanks to Tom Daff for debugging this.
Fixes#10629.
For building Zulip in an environment where a custom CA certificate is
required to access the public Internet, one needs to be able to
specify that CA certificate for all network access done by the Zulip
installer/build process. This change allows configuring that via the
environment.
Thanks to changes in restart-server, this is now already happening there.
(The restart-server changes were required to ensure that if the
upgrade failes and one just does
/home/zulip/deployments/next/restart-server to recover, the right
thing happens; so this is the correct resolution to the conflict).