47598 Commits

Author SHA1 Message Date
Alex Vandiver 499284d2fd nagios: Split postgresql into primary and replica.
Replication checks should only run on primary and replicas, not
standalone hosts; while `autovac_freeze` currently only runs on
primary hosts, it functions identically on replicas, and is fine to
run there.

Make `autovac_freeze` run on all `postgresql` hosts, and make
standalone hosts no longer `postgres_primary`, so they do not fail the
replication tests.
2022-06-22 12:07:38 -07:00
Alex Vandiver 38e435347b nagios: Add missing queue consumer checks. 2022-06-22 12:07:38 -07:00
Alex Vandiver e01a4242aa nagios: Sort queue consumer checks. 2022-06-22 12:07:38 -07:00
Alex Vandiver 2c90c7a010 nagios: Switch `check_remote_arg_string` queue checks to consumer checks.
This style of check just looks for matching process names using
`check_remote_arg_string`, which dates to 8edbd64bb8.  These were
added because the original two (`missedmessage_emails` and
`slow_queries`) did not create consumers, instead polling for events.

Switch these to checking the queue consumer counts that the
`check-rabbitmq-consumers` check is already writing out.  Since the
`missedmessage_emails` was _already_ checked via the consumer check, a
duplicate is not added.
2022-06-22 12:07:38 -07:00
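
Note on 2c90c7a010: a minimal sketch of what a consumer-count check could look like, as opposed to matching process names. The data format is an assumption; here the counts that `check-rabbitmq-consumers` writes out are modeled as a plain dict, and `check_consumers` is a hypothetical helper, not the actual Zulip check. Only the Nagios exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN) are standard.

```python
OK, CRITICAL, UNKNOWN = 0, 2, 3  # standard Nagios plugin exit codes


def check_consumers(queue, counts, expected=1):
    """Return (exit_code, message) for one queue.

    `counts` maps queue name -> number of consumers (assumed format).
    """
    if queue not in counts:
        # No data at all is different from "zero consumers".
        return (UNKNOWN, f"UNKNOWN: no consumer count recorded for {queue}")
    actual = counts[queue]
    if actual < expected:
        return (CRITICAL, f"CRITICAL: {queue} has {actual} consumers, expected {expected}")
    return (OK, f"OK: {queue} has {actual} consumers")
```

Unlike a process-name match, this reports a worker that is running but has not attached to its queue.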
Alex Vandiver f48d543d9b nagios: Make and use a "rabbitmq-consumer-service" template service. 2022-06-22 12:07:38 -07:00
Alex Vandiver 775a084d0f nagios: Add a catchall "other" set. 2022-06-22 12:07:38 -07:00
Alex Vandiver 83c82c8e15 nagios: Adjust load alerting by hostgroup.
Even the `pageable_servers` group did not page for high load -- in
part because what was "high" depends on the servers.  Set slightly
better limits based on server role.
2022-06-22 12:07:38 -07:00
Alex Vandiver 2a14aa5180 nagios: Add a `fullstack` hostgroup.
This will be used to apply checks only to czo.
2022-06-22 12:07:38 -07:00
Alex Vandiver b5ecfc327f nagios: Remove unnecessary `web` hostgroup.
This had identical membership to `frontends`.
2022-06-22 12:07:38 -07:00
Alex Vandiver 4be9025212 nagios: Remove redundant `postgresql` hostgroup.
This is implied by `postgresql_primary`.
2022-06-22 12:07:38 -07:00
Alex Vandiver d9d0014fb4 nagios: Rename `zmirror_main` into `zmirror` hostgroup.
`zmirror` itself was `zmirror_main` + `zmirrorp` but was unused; we
consistently just use the term `zmirror` for the non-personals server,
so use it as the hostgroup name.
2022-06-22 12:07:38 -07:00
Alex Vandiver 70c36985b4 nagios: Remove frontends from redis group.
The Redis nagios checks themselves are done against `redis` +
`frontends` groups, so there is no need to misleadingly place
`frontends` in the `redis` hostgroup.
2022-06-22 12:07:38 -07:00
Alex Vandiver 08127086bc nagios: Remove misleading "staging_frontends" from standalone.
No services are tested for the `staging_frontends` hostgroup, so this
does not alter the checks.
2022-06-22 12:07:38 -07:00
Alex Vandiver d804de871d nagios: Move staging and prod hostgroups adjacent. 2022-06-22 12:07:38 -07:00
Alex Vandiver 4c17f2bccc nagios: The frontends hostgroup now includes prod and staging frontends.
This lets the config file remove some repetition.
2022-06-22 12:07:38 -07:00
Alex Vandiver 1e81775fa0 nagios: Drop unhelpful hostgroup comment. 2022-06-22 12:07:38 -07:00
Alex Vandiver 7b584401ac nagios: Reformat hostgroups. 2022-06-22 12:07:38 -07:00
Alex Vandiver 93bcb86345 nagios: Reorder service checks. 2022-06-22 12:07:38 -07:00
Alex Vandiver eaaa2fbff8 nagios: Use canonical "hostgroup_name" consistently. 2022-06-22 12:07:38 -07:00
Alex Vandiver e8996b53a5 nagios: Remove unused has_swap hostgroup. 2022-06-22 12:07:38 -07:00
Alex Vandiver 33472ee9ff nagios: Remove unused stats host set. 2022-06-22 12:07:38 -07:00
Alex Vandiver bc4f4b4862 nagios: Make the pageable/not/flaky tri-state clearer. 2022-06-22 12:07:38 -07:00
Alex Vandiver c74f195fba nagios: Split AWS and non-AWS hosts, for ntp checks.
The non-AWS hosts cannot use the AWS ntp server for their check.
2022-06-22 12:07:38 -07:00
Alex Vandiver 872efdee58 nagios: Fold single- and multitornado_frontends back into frontends.
5abf4dee92 made this distinction, then multitornado_frontends was
never used; the singletornado_frontends alerting worked even for the
multiple-Tornado instances.

Remove the useless and misleading distinction.
2022-06-22 12:07:38 -07:00
Alex Vandiver 27b63d0baf check-rabbitmq-consumers: Fix a misleading comment. 2022-06-22 12:07:38 -07:00
Alex Vandiver 4e06ee45c7 check-rabbitmq-consumers: Remove unused --min-threshold.
This has never actually been used -- and does not make sense with the
check-all-queues-at-once model switched to in 88a123d5e0.  The
Tornado processes are the only ones we expect to be non-1, and since
they were added in 3f03dcdf5e the right number has been read from
config, not passed as an argument.
2022-06-22 12:07:38 -07:00
Alex Vandiver 53c01aa299 check-rabbitmq-consumers: Remove --queue argument from help.
This has not been accepted since 88a123d5e0.
2022-06-22 12:07:38 -07:00
Alex Vandiver 91379fd67e ci: Update upgrade test to 5.3, from 5.2. 2022-06-21 17:40:33 -07:00
Anders Kaseorg dc6af98e52 nginx: Add Cache-Control headers for Django-hashed static files.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-21 17:26:23 -07:00
Alex Vandiver 9ad74739aa version: Update version and changelog after 5.3 release. 2022-06-21 20:48:24 +00:00
Anders Kaseorg 20f9293f1f CVE-2022-31017: Fix edit event exposure in protected-history streams.
When editing an old message in a private stream with protected
history, the server would incorrectly send an API event including the
edited message to all of the stream’s current subscribers, including
those who should not have access to the old message. This API event is
ignored by official clients, so it could only be observed by a user
using a modified client or their browser’s developer tools.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-21 13:41:23 -07:00
Zixuan James Li c72fe80525 management: Remove migrate_stream_notifications.
The script has been broken since `Subscription.notifications`
was removed.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-06-21 12:57:01 -07:00
Zixuan James Li aebed0e57f management: Remove rename_stream.
Now that it is trivial to rename a stream in the UI, and since the
command has been broken for 3 years unnoticed, it is unnecessary to
maintain it anymore.

Fixes #22244.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-06-21 12:56:54 -07:00
Anders Kaseorg 028c2e4ec9 ui_init: Remove unused .subscription_header hover handlers.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-21 12:54:51 -07:00
Alex Vandiver 0645656fd8 process_fts_updates: Nagios may lack permissions to load Django config.
Even if Django and PostgreSQL are on the same host, the `nagios` user
may lack permissions to read accessory configuration files needed to
load the Django configuration (e.g. authentication keys).

Catch those failures, and switch to loading the required settings from
`/etc/zulip/zulip.conf`.
2022-06-21 12:50:13 -07:00
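
Note on 0645656fd8: the fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `load_db_settings`, the `[database]` section, and its `name`/`user` keys are hypothetical, and the exception types caught are a guess at what a permissions failure would raise. The config is passed in as text so the sketch runs without `/etc/zulip/zulip.conf` being present.

```python
import configparser


def load_db_settings(load_django_settings, conf_text):
    """Prefer the Django settings; fall back to an INI-style config file.

    `load_django_settings` is any callable returning a settings dict.
    `conf_text` is the contents of the fallback config file.
    """
    try:
        return load_django_settings()
    except (ImportError, PermissionError):
        # Assumed failure modes: the user running the check cannot read
        # files that the Django configuration needs (e.g. secrets).
        parser = configparser.RawConfigParser()
        parser.read_string(conf_text)
        return {
            # Hypothetical section and key names, with defaults.
            "name": parser.get("database", "name", fallback="zulip"),
            "user": parser.get("database", "user", fallback="zulip"),
        }
```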
Alex Vandiver a35af3f38b install/upgrade: Allow new packages during `apt-get upgrade`.
`postgresql-14.4` is a notable upgrade in the PostgreSQL series, as it
fixes potential database corruption from `CREATE INDEX CONCURRENTLY`
statements which are run while rows are modified[1].  However, it also
requires an upgrade from `libllvm9` to `libllvm10`, which means it is
not installed by a mere `apt-get upgrade`.

Add the `--with-new-pkgs` flag to all of the potentially relevant
`apt-get upgrade` calls, so that this (and similar) packages are
upgraded successfully.

[1]: https://www.postgresql.org/docs/release/14.4/
2022-06-21 11:21:49 -07:00
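
Note on a35af3f38b: a small sketch of building the `apt-get upgrade` invocation with and without `--with-new-pkgs`. The helper name `apt_upgrade_command` and the `-y` flag are assumptions for illustration; only `--with-new-pkgs` comes from the commit message.

```python
import subprocess


def apt_upgrade_command(allow_new_packages=True):
    """Build the argv for an unattended `apt-get upgrade`."""
    cmd = ["apt-get", "-y", "upgrade"]
    if allow_new_packages:
        # Lets upgrades proceed when they need packages that are not yet
        # installed (e.g. a new library dependency).
        cmd.append("--with-new-pkgs")
    return cmd


def run_apt_upgrade():
    # Needs root; raises CalledProcessError if apt-get fails.
    subprocess.run(apt_upgrade_command(), check=True)
```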
Anders Kaseorg 95303a9929 assets: Remove license for previously deleted emoji images.
The images themselves had been deleted by commit
cc33b68d73, and were then sanitized out
of the commit history.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-10 12:27:48 -07:00
Anders Kaseorg a1bdadfb3d audio: Remove the copy of zulip.ogg outside notification_sounds.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-10 12:27:48 -07:00
Anders Kaseorg 048b52ed5c images: Delete various unused images.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-10 12:27:48 -07:00
Anders Kaseorg 1e6bc5b36b styles: Delete some unused CSS.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-10 12:27:48 -07:00
Alex Vandiver 5bdc4b3562 upgrade-zulip-from-git: init, then add remote.
30457ecd02 removed the `--mirror` from
initial clones, but did not add back `--bare`, which `--mirror`
implies.  This leads to `/srv/zulip.git` having a working tree in it,
with a `/srv/zulip.git/.git` directory.

This is mostly harmless, and since the bug was recent, not worth
introducing additional complexity into the upgrade process to handle.

Calling `git clone --bare`, however, would clone the refs into
`refs/heads/`, not the `refs/remotes/origin/` we want.  Instead, use
`git init --bare`, followed by `git remote add origin`.  The remote
will be fetched by the usual `git fetch --all --prune` which is below.
2022-06-09 11:18:42 -07:00
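
Note on 5bdc4b3562: the sequence described above (`git init --bare`, then `git remote add origin`, then the usual fetch) can be sketched as a list of commands. The helper names and the example URL are hypothetical; this is not the upgrade script itself.

```python
import subprocess


def bare_repo_setup_commands(repo_dir, remote_url):
    """Commands to create a bare repo whose fetches land in refs/remotes/origin/."""
    return [
        # A bare repository has no working tree and no nested .git directory.
        ["git", "init", "--bare", repo_dir],
        # `git remote add` configures the default fetch refspec,
        # +refs/heads/*:refs/remotes/origin/*, unlike `git clone --bare`.
        ["git", "-C", repo_dir, "remote", "add", "origin", remote_url],
        ["git", "-C", repo_dir, "fetch", "--all", "--prune"],
    ]


def set_up_bare_repo(repo_dir, remote_url):
    for cmd in bare_repo_setup_commands(repo_dir, remote_url):
        subprocess.run(cmd, check=True)
```

For example, `set_up_bare_repo("/srv/zulip.git", "https://example.com/zulip.git")` would run the three commands in order, stopping at the first failure.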
Alex Vandiver 1639792e9e upgrade-zulip-from-git: Check fetch refspecs, not mirror flag.
While the `remote.origin.mirror` boolean being set is a very good
proxy for having been cloned with `--mirror`, it is technically only used
when pushing into the remote[1].  What we care about is if fetches
from this remote will overwrite `refs/heads/`, or all of `refs/` --
the latter of which is most likely, from having run `git clone
--bare`.

Detect either of these fetch refspecs, and not the mirror flag.  We
let the upgrade process error out if `remote.origin.fetch` is unset,
as that represents an unexpected state.  We ignore failures to unset
the `remote.origin.mirror` flag, in case it is not set already.

[1]: https://git-scm.com/docs/git-config#Documentation/git-config.txt-remoteltnamegtmirror
2022-06-09 11:18:42 -07:00
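
Note on 1639792e9e: a minimal sketch of the refspec test described above, assuming the values come from something like `git config --get-all remote.origin.fetch`. The function name `fetch_overwrites_local_refs` is hypothetical, and the real script may match refspecs differently.

```python
def fetch_overwrites_local_refs(fetch_refspecs):
    """Return True if any refspec fetches into refs/heads/ or all of refs/."""
    risky_destinations = {"refs/*", "refs/heads/*"}
    for refspec in fetch_refspecs:
        # A leading "+" only forces non-fast-forward updates; it does not
        # change where the refs are written.
        spec = refspec.lstrip("+")
        # Refspecs have the form <src>:<dst>.
        _, sep, dst = spec.partition(":")
        if sep and dst in risky_destinations:
            return True
    return False
```

An empty list returns `False` here; a caller following the commit message would treat an unset `remote.origin.fetch` as an error before reaching this check.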
Anders Kaseorg 0430705d13 test_tornado: Call process_event on first fetch_events return.
The 0.1 second delay was sometimes not long enough to guarantee we hit
the async response path, resulting in a nondeterministic coverage
failure.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-08 17:10:38 -07:00
Anders Kaseorg e112b619cc tornado: Fix race condition on handler._request.
Commit 6fd1a558b7 (#21469) introduced an
await point where get_events_backend calls fetch_events in order to
switch threads.  This opened the possibility that, in the window
between the connect_handler call in fetch_events and the old location
of this assignment in get_events_backend, an event could arrive,
causing ClientDescriptor.add_event to crash on missing
handler._request.  Fix this by assigning handler._request earlier.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-08 17:10:38 -07:00
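
Note on e112b619cc: the race can be illustrated with a simplified, single-threaded asyncio model. `Handler`, `Client`, and `simulate` are stand-ins invented for this sketch; only the names `connect_handler`, `fetch_events`, `add_event`, and `handler._request` come from the commit message, and the real code involves a thread switch rather than `asyncio.sleep(0)`.

```python
import asyncio


class Handler:
    """Stand-in request handler; `_request` does not exist until assigned."""


class Client:
    def __init__(self):
        self.handler = None

    def connect_handler(self, handler):
        self.handler = handler

    def add_event(self, event):
        # Raises AttributeError if the handler is connected but
        # `_request` has not been assigned yet.
        return (self.handler._request, event)


async def fetch_events(client, handler):
    client.connect_handler(handler)
    await asyncio.sleep(0)  # await point: other code may run here


async def get_events(client, handler, request, assign_early):
    if assign_early:
        handler._request = request  # fixed ordering
    await fetch_events(client, handler)
    if not assign_early:
        handler._request = request  # old ordering: too late


async def simulate(assign_early):
    client, handler = Client(), Handler()
    task = asyncio.create_task(get_events(client, handler, "req", assign_early))
    await asyncio.sleep(0)  # let get_events reach its await point
    try:
        result = client.add_event("evt")  # event arrives inside the window
    except AttributeError:
        result = None
    await task
    return result
```

With the old ordering the event hits a connected handler that has no `_request`; assigning it before the await closes the window.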
Mateusz Mandera acfa55138e retention: Add docstring info on how archive cleaning works.
In particular, it's important to record the special treatment around
ArchivedAttachment rows not being deleted in this step.
2022-06-08 15:12:36 -07:00
Tim Abbott 1304c435ce populate_db: Restore the test database's predictable set of topics.
856eff0fe6 perturbed the initial state
of the 30 messages in the test database, which unfortunately some
tests depend on.
2022-06-08 12:34:45 -07:00
David Rosa 3474567521 help-docs: Document "Copy link to topic" mobile app feature.
Adds a "Mobile" tab with straightforward instructions for accessing
the long-press menu using two new macros.

Knowing where to access the long-press menu is intuitive, except when
the user is in a topic narrow. It's not immediately obvious that the
top bar can be long-pressed so this adds a "Tip block" using a new macro
to clarify things in this scenario.

Also, this combines the instructions for Desktop and Web into a single
tab because the numbered steps work on both platforms. So this documents
the alternate method via the browser's address bar as a "Tip block" to
avoid stacking alternative numbered steps into a single tab.
This updates the stream and message link instructions too.

Fixes: #22147.
2022-06-08 12:24:09 -07:00
David Rosa 52ef574d3e help-docs: Fix minor errors in "Link to a message or conversation".
Removes the ":" characters that accidentally ended up in the "Get a link
to a specific topic" and "Get a link to a specific stream" headings.

Renames the "Via browser's address bar" tab to "Web" so that it
stays consistent with other help center articles.

Fixes part of #22147.
2022-06-08 12:20:32 -07:00
Aman Agrawal d7444f919d app-loading: Add a delay before showing the reload message.
Fixes #22182

This message often flashes on screen briefly, causing unnecessary
worry for the user (is the app likely to not load?).
To address this, we add a delay before the message is shown.

As a consequence, we change the notice to no longer suggest waiting a
few seconds, since that delay has already elapsed by the time it is shown.
2022-06-08 12:17:55 -07:00
Tim Abbott 856eff0fe6 populate_db: Ensure many streams have "more topics" in development.
We increase the total number of messages, since increasing the number
of topics would otherwise have the side effect of making it hard to
find longer conversations.
2022-06-08 12:08:56 -07:00