Commit Graph

145 Commits

Author SHA1 Message Date
Tim Abbott e1ce53ac46 puppet: Update nagios checks for disk to exclude kernel filesystems.
The fact that we have to explicitly list these is almost certainly a
bug in check_disk, but at least this works.
2020-04-16 17:49:29 -07:00
Tim Abbott cfbb617f5c puppet: Update nagios configuration for checking local disk. 2020-04-16 17:48:36 -07:00
Tim Abbott 8e5a866122 puppet: Update tuning for load average monitoring. 2020-04-16 16:47:05 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Vishnu KS 449f7e2d4b team: Generate team page data using cron job.
This eliminates the contributors data as a possible source of
flakiness when installing Zulip from Git.

Fixes #14351.
2020-04-08 12:52:31 -07:00
Stefan Weil d2fa058cc1
text: Fix some typos (most of them found and fixed by codespell).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-27 17:25:56 -07:00
Anders Kaseorg 7ff9b22500 docs: Convert many http URLs to https.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-26 21:35:32 -07:00
Anders Kaseorg 687553a661 setup_path_on_import: Replace with setup_path function.
isort 5 knows not to reorder imports across function calls, so this
will stop isort from breaking our code.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-25 15:40:21 -08:00
Mateusz Mandera 4c5a8e6f0c queue: Remove missedmessage_email_senders. 2020-01-31 12:13:51 -08:00
Tim Abbott dd969b5339 install: Remove references to "Zulip Voyager".
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".

At the time, we just needed a name quickly, but it was never a good
name, just a placeholder.  This removes that placeholder name from
much of the codebase.  A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
2020-01-30 12:40:41 -08:00
Tim Abbott d70e799466 bots: Remove FEEDBACK_BOT implementation.
This legacy cross-realm bot hasn't been used in several years, as far
as I know.  If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.

Fixes #13533.
2020-01-25 22:41:39 -08:00
Anders Kaseorg ea6934c26d dependencies: Remove WebSockets system for sending messages.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013.  We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not.  Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.

While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.

This WebSockets code path had a lot of downsides/complexity,
including:

* The messy hack involving constructing an emulated request object to
  hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
  needs and must be provisioned independently from the rest of the
  server).
* A duplicate check_send_receive_time Nagios test specific to
  WebSockets.
* The requirement for users to have their firewalls/NATs allow
  WebSocket connections, and a setting to disable them for networks
  where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
  been poorly maintained, and periodically throws random JavaScript
  exceptions in our production environments without a deep enough
  traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
  server restart, and especially for large installations like
  zulipchat.com, resulting in extra delay before messages can be sent
  again.

As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients.  We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.

If we later want WebSockets, we’ll likely want to just use Django
Channels.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-14 22:34:00 -08:00
Tim Abbott f84c037225 puppet: Tune check_postgres_locks parameters.
This has been a spurious alert for a long time.

It's unclear that this check is useful at all, but if it spikes
dramatically above what's normal, there's perhaps still utility in
being alerted.
2019-10-23 15:04:38 -07:00
Tim Abbott e4dee9532c nagios: Update configuration for user_activity worker change.
Since LoopQueueProcessingWorker jobs cannot be monitored by checking
for connected consumers (since they poll, rather than consuming as
events arrive), they can't be monitored with check_consumers.  It's
OK, because that monitoring was redundant with monitoring for
potential growth in their queue that we have as well.

Also clean up the block comments for the two other similar queue
procesors.
2019-09-23 11:49:46 -07:00
Anders Kaseorg 0962393933 cleanup: Delete trailing newlines.
Delete trailing newlines from all files, except
tools/ci/success-http-headers.txt and tools/setup/dev-motd, where they
are significant, and static/third, where we want to stay close to
upstream.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-06 23:29:11 -07:00
Anders Kaseorg becef760bf cleanup: Delete leading newlines.
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines.  Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-06 23:29:11 -07:00
Wyatt Hoodes a109508e34 typing: Remove now-unnecessary conditional import.
As a result of dropping support for trusty, we can remove our old
pattern of putting `if False` before importing the typing module,
which was essential for Python 3.4 support, but not required and maybe
harmful on newer versions.

cron_file_helper
check_rabbitmq_consumers
hash_reqs
check_zephyr_mirror
check_personal_zephyr_mirrors
check_cron_file
zulip_tools
check_postgres_replication_lag
api_test_helpers
purge-old-deployments
setup_venv
node_cache
clean_venv_cache
clean_node_cache
clean_emoji_cache
pg_backup_and_purge
restore-backup
generate_secrets
zulip-ec2-configure-interfaces
diagnose
check_user_zephyr_mirror_liveness
2019-07-29 15:18:22 -07:00
Wyatt Hoodes e331a758c3 python: Migrate open statements to use with.
This is low priority, but it's nice to be consistently using the best
practice pattern.

Fixes: #12419.
2019-07-20 15:48:52 -07:00
Tim Abbott b41c2d93d1 puppet: Exclude squashfs filesystems from nagios disk checks.
These generally aren't being written to.
2019-06-16 16:22:23 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Anders Kaseorg 649235cfec python: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-22 16:54:36 -08:00
Anders Kaseorg c109690cf8 puppet: Remove unused Python imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:02:12 -08:00
rht 9ee2ee046a puppet: Use systemctl instead of pg_ctlcluster on CentOS. 2019-01-05 15:49:03 -08:00
Tim Abbott 2558f101af docs: Add documentation for `if False` mypy pattern in scripts.
This should help make it clear what's going on with these scripts.
2018-12-17 11:12:53 -08:00
Tim Abbott b218c2a70e loadbalancer: Use same certbot cert for zulipstaging.com.
This is a simple configuration improvement.
2018-12-07 13:43:21 -08:00
Tim Abbott 467694c1fa nginx: Enable http2 in external nginx configuration.
This should be a nice performance improvement for browsers that
support it.

We can't yet enabled this in the Zulip on-premise nginx configuration,
because that still has to support Trusty.
2018-12-07 13:43:02 -08:00
Tim Abbott 5abf4dee92 nagios: Add new host groups for Tornado processes.
We also move all the existing Tornado monitoring rules to the
singletornado_frontends rule.
2018-11-06 16:33:18 -08:00
Tim Abbott 5f3b79c9e7 nagios: Fix tab-based whitespace. 2018-11-06 16:30:29 -08:00
Tim Abbott b53a712856 nginx: Update configuration for using certbot certs everywhere. 2018-08-22 11:59:15 -07:00
Anders Kaseorg edfd5ef992 setup_disks.sh: Fix shellcheck warnings.
In puppet/zulip_ops/files/postgresql/setup_disks.sh line 15:
array_name=$(mdadm --examine --scan | sed 's/.*name=//')
^-- SC2034: array_name appears unused. Verify use (or export if used externally).

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-08-03 09:15:26 -07:00
Anders Kaseorg 5a0fecc2d5 munin_plugins: Fix shellcheck warnings.
In puppet/zulip_ops/files/munin-plugins/rabbitmq_connections line 66:
echo "connections.value $(HOME=$HOME rabbitmqctl list_connections | grep -v "^Listing" | grep -v "done.$" | wc -l)"
                                                                                         ^-- SC2126: Consider using grep -c instead of grep|wc -l.

In puppet/zulip_ops/files/munin-plugins/rabbitmq_consumers line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_unacknowledged line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_messages_uncommitted line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

In puppet/zulip_ops/files/munin-plugins/rabbitmq_queue_memory line 32:
VHOST=${vhost:-"/"}
^-- SC2034: VHOST appears unused. Verify use (or export if used externally).

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-08-03 09:15:08 -07:00
Tim Abbott 02ae71f27f api: Stop using API keys for Django->Tornado authentication.
As part of our effort to change the data model away from each user
having a single API key, we're eliminating the couple requests that
were made from Django to Tornado (as part of a /register or home
request) where we used the user's API key grabbed from the database
for authentication.

Instead, we use the (already existing) internal_notify_view
authentication mechanism, which uses the SHARED_SECRET setting for
security, for these requests, and just fetch the user object using
get_user_profile_by_id directly.

Tweaked by Yago to include the new /api/v1/events/internal endpoint in
the exempt_patterns list in test_helpers, since it's an endpoint we call
through Tornado. Also added a couple missing return type annotations.
2018-07-30 12:28:31 -07:00
Tim Abbott 07af59d4cc tornado: Split get_events_backend into two functions.
The lower-layer function, now called get_events_backend, is intended
to be called by multiple code paths (including the upcoming
get_events_internal).
2018-07-30 12:28:31 -07:00
neiljp (Neil Pilgrim) 090b47ed19 mypy: Add explicit Optional for default=None parameters in various files. 2018-03-28 12:31:51 -07:00
neiljp (Neil Pilgrim) f32f3cbf72 mypy: Amend zulip-ec2-configure-interfaces to avoid None. 2018-03-23 11:39:54 -07:00
Tim Abbott d98be2f19f puppet: Only run analytics Nagios checks on machine running cron.
Running this on additional machines would be redundant; additionally,
the FillState checker cron job runs only on cron systems, so this will
crash on other app frontends.
2018-03-06 13:38:27 -08:00
Tim Abbott 8e8faab006 puppet: Move clearsessions cron job to app_frontend_once.
While this is a different system than I'd written up in #8004, I think
this is a better solution to the general problem of cron jobs to run
on just one server.

Fixes #8004.
2018-03-06 13:35:51 -08:00
Tim Abbott 24b6106c9c puppet: Dsiable checking for evictions in memcached nagios.
Zulip's caching model for message history is such that it is normal
and healthy for there to eventually be a nontrivial volume of
evictions.
2018-03-06 13:34:02 -08:00
Greg Price 4475950ddf queue: Restore prematurely-cut upgrade path.
Revert c8f034e9a "queue: Remove missedmessage_email_senders code."
As the comment in the code says, it ensures a smooth upgrade path
from 1.7.x; we can delete it in master after 1.8.0 is released.
The removal commit was merged early due to a communication failure.
2018-02-28 11:15:53 -08:00
Umair Khan c8f034e9a0 queue: Remove missedmessage_email_senders code.
After 68513952fb, all emails are sent through email_senders queue.
This commit removes code related to the legacy queue.
2018-02-21 16:43:56 -08:00
Rishi Gupta 1d581a9c6e nagios: Add nagios check for analytics state.
This should help us detect issues where the analytics cron jobs aren't
running properly.

The cron/nagios part of the implementation done by tabbott.
2018-02-09 16:36:05 -08:00
Umair Khan 68513952fb email-worker: Create EmailSendingWorker.
This commit just copies all the code from MissedMessageSendingWorker
class to a new EmailSendingWorker class. All the logic to send an email
through a queue was already there. This commit only makes the logic
generic. It does so by creating a special purpose queue called
'email_senders' to send any type of email. To make
MissedMessageSendingWorker still work we derive it from
EmailSendingWorker. All the tests that were testing
MissedMessageSendingWorker now run against EmailSendingWorker.
2017-12-20 19:36:27 -08:00
Vishnu Ks 766511e519 actions: Mark all messages as read when user unsubscribes from stream.
This fixes a bug where, when a user is unsubscribed from a stream,
they might have unread messages on that stream leak.  While it might
seem to be a minor problem, it can cause significant problems for
computing the `unread_msgs` data structures, since it means we need to
add an extra filter for whether the user is still subscribed, either
in the backend or in the UI.

Fixes #7095.
2017-11-21 20:09:17 -08:00
Tim Abbott 94554c65da certbot: Modify nginx configuration to support automated renewal. 2017-11-08 12:32:26 -08:00
Tim Abbott 62bb465896 puppet: Modify lb0 nginx configuration. 2017-11-08 12:32:26 -08:00
rht 549a26860f refactor: Remove six.moves.range import. 2017-11-07 10:46:42 -08:00
Tim Abbott 0d1194811f mypy: Remove ignores for a few typeshed bugs fixed upstream. 2017-10-27 17:09:00 -07:00
Tim Abbott 540cae19a8 puppet: Remove obsolete sparkle configuration.
Sparkle was the auto-update system used by the legacy desktop app.  We
haven't been capable of using it for auto-update in years, so there's
no reason to keep around the configuration.

The new Electron app uses a different system anyway.
2017-10-19 16:35:55 -07:00
rht b57289aacd py3: Remove all `from __future__ import print_function.
Except for these files:
- tools/linter_lib/*
- tools/lib
- tools/lister.py
2017-10-18 12:07:19 -07:00
rht 2f3ae84e5a py3: Remove all `__future__ import division`. 2017-10-17 23:09:12 -07:00