Commit Graph

47 Commits

Author SHA1 Message Date
Tim Abbott bbc1484253 check-rabbitmq-queue: Adjust threshholds for paging.
Ultimately, this isn't an effective way to monitor this queue; we want
time-based monitoring, not count-based monitoring.  Doing that
properly will likely involve modifying the queue processor to write
something about its status.

But until we add the monitoring we want, it makes sense to leave this
active with low limits.
2019-10-13 22:39:52 -07:00
Tim Abbott 1c73ce2450 user_activity: Use LoopQueueProcessingWorker strategy.
This should dramatically improve the queue processor's performance in
cases where there's a very high volume of requests on a given endpoint
by a given user, as described in the new docstring.

Until we test this more broadly in production, we won't know if this
is a full solution to the problem, but I think it's likely.  We've
never seen the UserActivityInterval worker end up backlogged without a
total queue processor outage, and it should have a similar workload.

Fixes #13180.
2019-09-21 11:48:24 -07:00
Wyatt Hoodes a109508e34 typing: Remove now-unnecessary conditional import.
As a result of dropping support for trusty, we can remove our old
pattern of putting `if False` before importing the typing module,
which was essential for Python 3.4 support, but not required and maybe
harmful on newer versions.

cron_file_helper
check_rabbitmq_consumers
hash_reqs
check_zephyr_mirror
check_personal_zephyr_mirrors
check_cron_file
zulip_tools
check_postgres_replication_lag
api_test_helpers
purge-old-deployments
setup_venv
node_cache
clean_venv_cache
clean_node_cache
clean_emoji_cache
pg_backup_and_purge
restore-backup
generate_secrets
zulip-ec2-configure-interfaces
diagnose
check_user_zephyr_mirror_liveness
2019-07-29 15:18:22 -07:00
Wyatt Hoodes e331a758c3 python: Migrate open statements to use with.
This is low priority, but it's nice to be consistently using the best
practice pattern.

Fixes: #12419.
2019-07-20 15:48:52 -07:00
Tim Abbott ad81f700a1 scripts: Remove nagios overrides for missedmessage_emails.
Since 5cec566cb9, the
missedmessage_emails queue no longer is expected to grow a backlog
over time.
2019-04-13 20:43:07 -07:00
Puneeth Chaganti 9876f1b14e check_rabbitmq_queue: Fix the time period when we ignore long queues.
The commit 87d1809657 changed the time when
digests are sent by 3 hours to account for moving from the US East Coast to the
West Coast, but didn't change the time period exception in the
`check-rabbitmq-queue` script.

Closes #5415
2019-04-13 20:43:07 -07:00
Anders Kaseorg e984107966 scripts: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:02:58 -08:00
Tim Abbott 2558f101af docs: Add documentation for `if False` mypy pattern in scripts.
This should help make it clear what's going on with these scripts.
2018-12-17 11:12:53 -08:00
Tim Abbott 3f03dcdf5e nagios: Support multiple tornado processes.
This allows our Tornado monitoring to correctly report whether
multiple configured Tornado processes are running.

This setup isn't ideal, in that it can't detect cases where the wrong
set of Tornado processes are running, but it's nice and simple and
should catch most actual problems.
2018-11-06 16:50:03 -08:00
Tim Abbott 0cac7e1cd3 tornado: Extract functions for Tornado queue names.
This moves all control for what queue to use for which realm in our
Tornado system to just the sharding.py file; no actual sharding is
done yet.
2018-11-02 17:00:10 -07:00
Anders Kaseorg 09b8ccd510 scripts/nagios/check-rabbitmq-consumers: Avoid shelling out for mv.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-07-19 10:43:37 -07:00
Tim Abbott 999f264ad3 check_rabbitmq_queue: Exclude slow_queries queue from alerting.
Structurally, this queue has the same property as the missed_message
one, namely that it accumulates things and processes them only every
few minutes.

This should stop Zulip from paging in response to slow queries
accumulating when a server restart happens.
2018-05-25 13:06:50 -07:00
Greg Price 4475950ddf queue: Restore prematurely-cut upgrade path.
Revert c8f034e9a "queue: Remove missedmessage_email_senders code."
As the comment in the code says, it ensures a smooth upgrade path
from 1.7.x; we can delete it in master after 1.8.0 is released.
The removal commit was merged early due to a communication failure.
2018-02-28 11:15:53 -08:00
Umair Khan c8f034e9a0 queue: Remove missedmessage_email_senders code.
After 68513952fb, all emails are sent through email_senders queue.
This commit removes code related to the legacy queue.
2018-02-21 16:43:56 -08:00
Umair Khan 68513952fb email-worker: Create EmailSendingWorker.
This commit just copies all the code from MissedMessageSendingWorker
class to a new EmailSendingWorker class. All the logic to send an email
through a queue was already there. This commit only makes the logic
generic. It does so by creating a special purpose queue called
'email_senders' to send any type of email. To make
MissedMessageSendingWorker still work we derive it from
EmailSendingWorker. All the tests that were testing
MissedMessageSendingWorker now run against EmailSendingWorker.
2017-12-20 19:36:27 -08:00
rht 54fb88f331 scripts: Replace optparse with argparse. 2017-11-21 21:23:41 -08:00
Vishnu Ks 766511e519 actions: Mark all messages as read when user unsubscribes from stream.
This fixes a bug where, when a user is unsubscribed from a stream,
they might have unread messages on that stream leak.  While it might
seem to be a minor problem, it can cause significant problems for
computing the `unread_msgs` data structures, since it means we need to
add an extra filter for whether the user is still subscribed, either
in the backend or in the UI.

Fixes #7095.
2017-11-21 20:09:17 -08:00
rht 53e37aa511 scripts: Text-wrap long lines exceeding 110. 2017-11-10 16:22:26 -08:00
rht 71188d7b0a scripts: Remove import print_function. 2017-09-29 15:43:30 -07:00
Greg Price a099e698e2 py3: Switch almost all shebang lines to use `python3`.
This causes `upgrade-zulip-from-git`, as well as a no-option run of
`tools/build-release-tarball`, to produce a Zulip install running
Python 3, rather than Python 2.  In particular this means that the
virtualenv we create, in which all application code runs, is Python 3.

One shebang line, on `zulip-ec2-configure-interfaces`, explicitly
keeps Python 2, and at least one external ops script, `wal-e`, also
still runs on Python 2.  See discussion on the respective previous
commits that made those explicit.  There may also be some other
third-party scripts we use, outside of this source tree and running
outside our virtualenv, that still run on Python 2.
2017-08-16 17:54:43 -07:00
Aditya Bansal 807fee68d6 pep8: Add compliance with rule E261 nagios/check-rabbitmq-consumers. 2017-05-31 17:07:15 -07:00
Elliott Jin 0ec9e54954 bots: Add queue and QueueProcessingWorker for embedded bots. 2017-05-25 15:00:51 -07:00
vaibhav 8881b5eb9f Outgoing Webhook System: Check for @-mentioned outgoing webhook bots.
Also puts them into a processing queue, though the queue processor
does nothing.

Rewritten by tabbott to avoid unnecessary database queries in
do_send_messages.
2017-05-02 09:22:04 -07:00
K.Kanakhin 6a801db1c2 missed-emails-sending: Move email sending to separate queue worker.
- Add new 'missedmessage_email_senders' queue for sending missed messages emails.
- Add the new worker to process 'missedmessage_email_senders' queue.
- Split aggregation missed messages and sending missed messages email
  to separate queue workers.
- Adapt tests for sending missed emails to the new logic.

Fixes #2607
2017-03-07 20:08:40 -08:00
Tim Abbott 0afe832fc7 check-rabbitmq-consumers: Fix typing import issue. 2017-03-04 15:35:26 -08:00
Raghav Jajodia a3a03bd6a5 mypy: Added Dict, List and Set imports.
Fixed mypy errors associated with the upgrade.
2017-03-04 14:33:44 -08:00
Tim Abbott fe0c4cad85 check-rabbitmq-consumers: Go back to hardcoding for now.
This should fix the production test suite in Travis CI, so that we can
debug what's broken here offline.
2017-02-22 22:58:59 -08:00
Tim Abbott b81add60fe check-rabbitmq-consumers: Fix queue_workers call. 2017-02-22 00:48:43 -08:00
Tim Abbott aa6567ee34 queue_workers: Fix confusing --queue_type argument name. 2017-02-22 00:23:26 -08:00
Tim Abbott 19896460f0 nagios: Fix RabbitMQ Nagios checks running Django as root.
This can cause problems by making the /var/log/zulip files owned by
root (not zulip) and thus not writable by the Zulip user.
2017-02-22 00:20:57 -08:00
Tim Abbott 333062f08e nagios: Automate queue list in check-rabbitmq-consumers. 2017-02-19 16:19:55 -08:00
Tim Abbott 34046c1f55 check-rabbitmq-consumers: Add missing embed_links consumer. 2017-02-19 13:12:00 -08:00
Tim Abbott 213af24e47 check-rabbitmq-consumers: Reformat worker_queues list. 2017-02-19 13:12:00 -08:00
Tim Abbott 4e171ce787 lint: Clean up E126 PEP-8 rule. 2017-01-23 22:06:13 -08:00
Tommy Ip e4091c6413 pep8: Fix E222 violations. 2016-11-30 21:49:02 +00:00
Tommy Ip 46b7d54b3e pep8: Fix E701 violations. 2016-11-30 20:45:09 +00:00
Anders Kaseorg 573ec14955 Remove shebang line from non-scripts
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2016-11-26 13:20:22 -08:00
Sahil Dua 058587da77 Remove extra new lines at the ends of Zulip authoried files.
Fixes #1627.

[tweaked by tabbott to avoid patching third-party modules, for now]
2016-09-26 21:05:24 -07:00
Tim Abbott e7c3a0c819 check-rabbitmq-consumers: Add missing tornado_return consumer.
I'd like to move this list to be automatically generated, but this
fixes the fact that it's missing for now.
2016-08-17 22:53:00 -07:00
Tim Abbott 88a123d5e0 Fix excessive CPU usage by rabbitmq-numconsumers Nagios checks.
The previous model for these Nagios checks was kinda crazy -- every
minute, we'd run a full `rabbitmctl list_consumers` for each of the
dozen+ consumers that we have, and then do the exact same parsing
logic for each to determine whether the target queue has a running
consumer to write out a state file.

Because `rabbitmctl list_consumers` takes a small amount of resources,
on systems where CPU is very limited (e.g. t2 style AWS instances),
this minor CPU wastage could be problematic.

Now we just do that `rabbitmqctl list_consumers` once per minute, and
output all the state files from a single command.

Further TODO items on this front include removing the hardcoded list
of queues.
2016-08-12 14:09:36 -07:00
Tim Abbott 0d39ed82d1 Annotate cron_file_helper. 2016-08-04 15:57:03 -07:00
Eklavya Sharma 51ea5c1602 scripts/: Make subprocess calls unicode-aware. 2016-07-26 12:06:41 -07:00
Eklavya Sharma 11732f9ab0 Make all scripts in scripts/ pass mypy check. 2016-07-24 00:17:21 +05:30
Eklavya Sharma 149938d468 Change shebangs from python2.7 to python. 2016-05-29 05:03:08 -07:00
Tim Abbott cb81a59e38 Move write-rabbitmq-consumers-state-file to scripts/nagios/. 2016-05-07 19:37:06 -07:00
Tim Abbott 2761c012e5 Move rabbitmq consumer checks from bots/ to scripts/nagios/. 2016-05-07 19:37:06 -07:00
Tim Abbott be6566dc5c nagios: Move cron_file_helper from bots/ to scripts/lib.
This ensures the tool is available in Zulip production deployments.
2016-05-07 19:37:06 -07:00