785 Commits

Author SHA1 Message Date
Anders Kaseorg 8c733a3f68 create-db.sql: Start by dropping the zulip database if needed.
At some point the PostgreSQL Docker image started creating the zulip
database for us, which caused our CREATE DATABASE to fail.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-15 18:04:34 -08:00
Anders Kaseorg 298d45b46a create-db.sql: Handle exception if zulip user already exists.
Fixes #13530.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-15 18:04:34 -08:00
Anders Kaseorg a82032a182 generate_secrets: Enable Redis authentication in production.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-15 17:35:15 -08:00
Anders Kaseorg 3360df7ad1 generate_secrets: Enable memcached authentication in production.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-15 17:35:15 -08:00
Anders Kaseorg cdda983e90 settings: Support optional memcached authentication.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-15 17:35:15 -08:00
Anders Kaseorg ea6934c26d dependencies: Remove WebSockets system for sending messages.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013.  We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they did not.  Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.

While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.

This WebSockets code path had a lot of downsides/complexity,
including:

* The messy hack involving constructing an emulated request object to
  hook into doing Django requests.
* The `message_senders` queue processor system (which increases RAM
  needs and must be provisioned independently from the rest of the
  server).
* A duplicate check_send_receive_time Nagios test specific to
  WebSockets.
* The requirement for users to have their firewalls/NATs allow
  WebSocket connections, and a setting to disable them for networks
  where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
  been poorly maintained, and periodically throws random JavaScript
  exceptions in our production environments without a deep enough
  traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
  server restart, and especially for large installations like
  zulipchat.com, resulting in extra delay before messages can be sent
  again.

As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients.  We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.

If we later want WebSockets, we’ll likely want to just use Django
Channels.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-14 22:34:00 -08:00
Tim Abbott 571ce2f5cb populate_db: Fix handling of memcached flushing.
Our recent fixes to using the system's configured memcached settings
broke populate_db, because its hacky clear_database helper is called
with a hacked-up settings module.

We fix this by first moving this out-of-place code from models.py into
populate_db, and then saving the settings required to access memcached
so that we can use them in clear_database.

We also fix a mypy error in flush-memcached that matches the same
issue fixed in clear_database.
2020-01-13 18:05:21 -08:00
Anders Kaseorg 699626f3cf flush-memcached: Use pylibmc.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-13 17:38:18 -08:00
rht cd3907648d prod install: Use ID_LIKE to help select os family. 2020-01-07 13:25:25 -08:00
rht bc94e8e815 prod install: Use /etc/os-release for Ubuntu/Debian to get os_id, os_version_id. 2020-01-07 13:25:25 -08:00
rht 9898c07e0d prod install: Add the CentOS version of the step to do dist-upgrade. 2020-01-07 13:25:25 -08:00
rht bf76696d67 prod install: Add the CentOS version of the step to install preparatory packages. 2020-01-07 13:25:25 -08:00
rht 6dd5dc32fc prod install: Add the CentOS version of the step to upgrade packages. 2020-01-07 13:25:25 -08:00
rht d88a7bbb42 prod install: Add the CentOS version of the step to update packages. 2020-01-07 13:25:25 -08:00
rht 49d7adb3cb prod install: Parse CentOS os identifications from /etc/os-release. 2020-01-07 13:25:25 -08:00
rht 771f6d213f prod install: Rename os_codename into os_version_id 2020-01-07 13:25:25 -08:00
Anders Kaseorg a78f8647d8 install: Run generate_secrets.py before zulip-puppet-apply.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-05 22:48:08 -08:00
Anders Kaseorg ab211c7acf lint: Tell ShellCheck to look for sourced files at relative paths.
This uses the new -P option of ShellCheck 0.7.0.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-18 03:48:02 -08:00
Vishnu KS 6901087246 install: Use crudini for storing value of POSTGRES_MISSING_DICTIONARIES.
This simplifies the RDS installation process to avoid awkwardly
requiring running the installer twice, and also is significantly more
robust in handling issues around rerunning the installer.

Finally, the answer for whether dictionaries are missing is available
to Django for future use in warnings/etc. around full-text search not
being great with this configuration, should they be required.
2019-12-13 12:05:39 -08:00
Vishnu KS 6c97a36355 install: Support remote database services like RDS.
Documentation and variable names edited by tabbott.
2019-12-12 12:59:45 -08:00
Anders Kaseorg 347fd80864 generate_secrets: Remove unused initial_password_salt in production.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-09 23:06:53 -08:00
Anders Kaseorg 7ebba2901a generate_secrets: Remove unused local_database_password in production.
Fixes #13464.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-09 13:05:31 -08:00
Tim Abbott 4e421ebe12 scripts: Move inline-email-css from tools to scripts.
We'll be soon documenting a production workflow that involves using
it, and that means it needs to live under scripts/ (since tools/ isn't
present in release tarballs).
2019-11-15 17:39:42 -08:00
Anders Kaseorg 0d20145b93 mypy: Upgrade from 0.730 to 0.740.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-11-13 12:38:45 -08:00
Anders Kaseorg ac49736311 install-node: Upgrade Node 12.11.1 to 12.13.0, Yarn 1.19.0 to 1.19.1.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-11-11 16:26:31 -08:00
Anders Kaseorg d6377b00c0 node_cache: Don’t retry copying node_modules; let yarn do its thing.
`copytree` throws an error if the target already exists, and we don’t
really want to rerun the copy anyway.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-10-29 12:30:28 -07:00
Tim Abbott bbc1484253 check-rabbitmq-queue: Adjust thresholds for paging.
Ultimately, this isn't an effective way to monitor this queue; we want
time-based monitoring, not count-based monitoring.  Doing that
properly will likely involve modifying the queue processor to write
something about its status.

But until we add the monitoring we want, it makes sense to leave this
active with low limits.
2019-10-13 22:39:52 -07:00
Anders Kaseorg 775162d687 setup_venv: Use pip install --require-hashes for better security.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-10-06 15:21:18 -07:00
Anders Kaseorg 9182293d50 node_cache: Preserve symlinks when copying an old node_modules tree.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-10-06 15:19:53 -07:00
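
The two node_cache commits above both rest on `shutil.copytree` behavior. The following is a minimal sketch of that behavior only, not the node_cache code itself; the directory and file names are made up for illustration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for a cached tree and its destination.
    src = os.path.join(tmp, "cached_node_modules")
    dst = os.path.join(tmp, "node_modules")
    os.makedirs(os.path.join(src, "pkg"))
    with open(os.path.join(src, "pkg", "index.js"), "w") as f:
        f.write("module.exports = {};\n")
    # A relative symlink inside the tree, as package managers often create.
    os.symlink("pkg", os.path.join(src, "alias"))

    # symlinks=True copies the link itself instead of the files it points to.
    shutil.copytree(src, dst, symlinks=True)
    link_preserved = os.path.islink(os.path.join(dst, "alias"))

    # With default arguments, copying onto an existing target raises.
    try:
        shutil.copytree(src, dst, symlinks=True)
        second_copy_failed = False
    except FileExistsError:
        second_copy_failed = True
```

Because the second call fails once the target exists, retrying the copy cannot succeed, which matches the reasoning in the commit message.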
Anders Kaseorg 8432d97edf setup_venv: Add pkg-config to VENV_DEPENDENCIES.
This is needed on at least Debian 10, otherwise xmlsec fails to
install: `Could not find xmlsec1 config. Are libxmlsec1-dev and
pkg-config installed?`

Also remove libxmlsec1-openssl, which libxmlsec1-dev already depends on.

(No changes are needed on RHEL, where libxml2-devel and xmlsec1-devel
already declare a requirement on /usr/bin/pkg-config.)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-10-05 18:24:32 -07:00
Anders Kaseorg 1235dc3bec install-node: Upgrade to Node 12.11.1, Yarn 1.19.0.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-10-05 18:07:53 -07:00
Anders Kaseorg 0af22dad18 flush-memcached: Respect MEMCACHED_LOCATION; handle errors.
Fixes #13238.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-10-01 16:05:55 -07:00
ab1nash 71f0fecda7 scripts: Clean up output from 'clean_unused_caches'.
The output log from running clean_unused_caches was too verbose as
part of the `upgrade-zulip` overall output.  While this output is
potentially helpful when running it directly for debugging, it's
certainly redundant for the main production use case.

So a new flag --no-print-headers is introduced.  It suppresses the
header outputs for the subtools.

Fixes #13214.
2019-09-30 10:51:00 -07:00
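
A flag like `--no-print-headers` can be wired up with `argparse` roughly as follows. This is only a sketch: the flag name comes from the commit message, while the subtool names, the `run` helper, and the header text are hypothetical and not taken from clean_unused_caches.

```python
import argparse
from typing import Callable, List

# Hypothetical subtool names; the real script's internals are not shown in this log.
SUBTOOLS = ["venv cache", "node_modules cache", "emoji cache"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative cache cleaner")
    parser.add_argument(
        "--no-print-headers",
        dest="no_headers",
        action="store_true",
        help="Suppress the per-subtool header lines.",
    )
    return parser


def run(argv: List[str], emit: Callable[[str], None] = print) -> None:
    args = build_parser().parse_args(argv)
    for name in SUBTOOLS:
        if not args.no_headers:
            emit("Cleaning {}".format(name))
        # The actual cleanup work for each subtool would go here.


# Collect the output instead of printing so both modes can be compared.
verbose_lines: List[str] = []
run([], verbose_lines.append)

quiet_lines: List[str] = []
run(["--no-print-headers"], quiet_lines.append)
```

A caller such as an upgrade script would pass the flag, while a person debugging by hand would omit it and keep the headers.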
Mateusz Mandera c42077c12f dependencies: Add dependencies needed for SAML. 2019-09-28 12:15:13 -07:00
Tim Abbott a84bb89bdc scripts: Move mobile i18n code out of scripts/.
Like other code that is only used in the development environment, this
doesn't belong in scripts/.
2019-09-24 12:57:42 -07:00
Tim Abbott 27b3c1a312 provision: Move install-shellcheck to proper directory.
Scripts in scripts/ should be exclusively code that is used in
production, and this isn't.
2019-09-24 12:54:33 -07:00
Anders Kaseorg 4fdc80a9c7 setup-apt-repo: Install groonga-keyring.
This allows the system to get updates to the Groonga repository
signing key, so `apt update` doesn’t start failing when the key
changes (like it recently did).

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-23 16:01:39 -07:00
Anders Kaseorg d1e504079d setup-apt-repo: Don’t waste time installing debian-archive-keyring.
debian-archive-keyring is a dependency of the essential package apt,
so it is present in every Debian system.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-23 16:01:39 -07:00
Anders Kaseorg 2ff87bd888 setup: Update groonga APT repository signing key.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-23 16:01:39 -07:00
Anders Kaseorg 76492b25ae setup_venv: Install pip.txt requirements with --force-reinstall.
virtualenv on Ubuntu 16.04, when creating a new environment, downloads
the current version of setuptools, then replaces its pkg_resources
with an old copy from
/usr/share/python-wheels/pkg_resources-0.0.0-py2.py3-none-any.whl.
This causes problems, a simple example of which is reproducible from
the ubuntu:16.04 Docker base image as follows:

    apt-get update
    apt-get -y install python3-virtualenv
    python3 -m virtualenv -p python3 /ve
    /ve/bin/pip install sockjs-tornado
    /ve/bin/pip download sockjs-tornado

→ `AttributeError: '_NamespacePath' object has no attribute 'sort'`

More relevantly, it breaks pip-compile in the same way.  To fix this,
we need to force setuptools to be reinstalled, even if we’re asking
for the same version.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-23 13:23:58 -07:00
Anders Kaseorg 8d91bebf95 restart-server: Warn if the shell’s PWD goes through an updated symlink.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-21 12:02:15 -07:00
Tim Abbott 1c73ce2450 user_activity: Use LoopQueueProcessingWorker strategy.
This should dramatically improve the queue processor's performance in
cases where there's a very high volume of requests on a given endpoint
by a given user, as described in the new docstring.

Until we test this more broadly in production, we won't know if this
is a full solution to the problem, but I think it's likely.  We've
never seen the UserActivityInterval worker end up backlogged without a
total queue processor outage, and it should have a similar workload.

Fixes #13180.
2019-09-21 11:48:24 -07:00
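
The docstring mentioned in the user_activity commit above is not reproduced in this log. One plausible reading is that a loop-style worker pulls a batch of events and collapses repeats for the same user and endpoint before writing anything. The sketch below illustrates that idea only; the event fields and the `collapse_events` helper are assumptions, not Zulip's code.

```python
from typing import Dict, Iterable, Tuple

# Assumed key shape: (user id, client name, endpoint name).
Key = Tuple[int, str, str]


def collapse_events(events: Iterable[dict]) -> Dict[Key, Tuple[int, int]]:
    """Reduce a batch to one (count, latest timestamp) pair per key."""
    summary: Dict[Key, Tuple[int, int]] = {}
    for event in events:
        key = (event["user_id"], event["client"], event["query"])
        count, last_time = summary.get(key, (0, 0))
        summary[key] = (count + 1, max(last_time, event["time"]))
    return summary


# Hypothetical batch: two hits on one endpoint by one user, one unrelated hit.
batch = [
    {"user_id": 7, "client": "website", "query": "get_events", "time": 100},
    {"user_id": 7, "client": "website", "query": "get_events", "time": 105},
    {"user_id": 9, "client": "mobile", "query": "send_message", "time": 101},
]
collapsed = collapse_events(batch)
```

Under this assumption, a burst of requests from one user to one endpoint costs one database write per batch rather than one per request.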
Anders Kaseorg 2e1494bdbd setup-apt-repo: Add ca-certificates to pre_setup_deps.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-19 20:15:43 -07:00
Anders Kaseorg 2ec946ad4d postgres-init-db: Require an Enter press in confirmation prompt.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-09-12 16:04:55 -07:00
Anders Kaseorg 096ef1445f parse_os_release: Use /etc/os-release always; remove DISTRIB_FAMILY.
To replace DISTRIB_FAMILY, there’s now an os_families function using
the standard ID and ID_LIKE information in /etc/os-release.

Fixes #13070; fixes #13071.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-29 17:30:20 -07:00
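
The parse_os_release commit above describes deriving OS families from the standard ID and ID_LIKE fields. A minimal sketch of that idea, assuming a simplified parser; the function bodies and sample data here are illustrative and are not the code from the commit.

```python
import shlex
from typing import Dict, Set


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse os-release style KEY=value lines into a dict."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        # shlex.split strips the shell-style quoting the format allows.
        parts = shlex.split(value)
        info[key] = parts[0] if parts else ""
    return info


def os_families(info: Dict[str, str]) -> Set[str]:
    """Return the distro's own ID plus every ID listed in ID_LIKE."""
    return {info["ID"], *info.get("ID_LIKE", "").split()}


# Sample content in the /etc/os-release format (values are illustrative).
SAMPLE = '''NAME="Ubuntu"
ID=ubuntu
ID_LIKE=debian
VERSION_ID="18.04"
'''

info = parse_os_release(SAMPLE)
families = os_families(info)
```

A caller can then test membership, for example `"debian" in families`, instead of comparing against a separate family variable.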
Anders Kaseorg 875002108f setup_venv: Remove CentOS workaround for fixed pycurl bug.
We are installing pycurl 7.43.0.3 which includes the fix.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-29 16:38:38 -07:00
Anders Kaseorg db44d61aab setup-apt-repo: Remove PPA and packagecloud repository.
We no longer use tsearch_extras, and the camo patch is irrelevant on
systemd systems (Xenial and newer).  So we no longer need to
provide/install a PPA at all.

Closes #13027.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-29 12:53:04 -07:00
Anders Kaseorg 6701c4463c search: Remove now unnecessary tsearch_extras dependency.
Now that we've implemented tsearch_extras in pure postgres, we no
longer need a custom extension.  This should help us considerably, as
it means we no longer need to ship custom apt packages at all.

Fixes #467.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-29 12:49:26 -07:00
rht 07808e35be parse_lsb_release: Use /etc/os-release instead of /etc/lsb-release. 2019-08-28 17:53:27 -07:00
Anders Kaseorg 9e481e353a .yarnrc: Set ignore-scripts true.
Follow up to #13065, to keep manual yarn invocations consistent with
our automated ones.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-28 16:15:54 -07:00