zulip

History

Alex Vandiver 9b1bdfefcd nagios: Use a better index on UserActivity for zephyr alerting. Limiting only by client_name and query leads to a very poorly-indexed lookup on `query` which throws out nearly all of its rows: ``` Nested Loop (cost=50885.64..60522.96 rows=821 width=8) -> Index Scan using zerver_client_name_key on zerver_client (cost=0.28..2.49 rows=1 width=4) Index Cond: ((name)::text = 'zephyr_mirror'::text) -> Bitmap Heap Scan on zerver_useractivity (cost=50885.37..60429.95 rows=9052 width=12) Recheck Cond: ((client_id = zerver_client.id) AND ((query)::text = ANY ('{get_events,/api/v1/events}'::text[]))) -> BitmapAnd (cost=50885.37..50885.37 rows=9052 width=0) -> Bitmap Index Scan on zerver_useractivity_2bfe9d72 (cost=0.00..16631.82 rows=..large.. width=0) Index Cond: (client_id = zerver_client.id) -> Bitmap Index Scan on zerver_useractivity_1b1cc7f0 (cost=0.00..34103.95 rows=..large.. width=0) Index Cond: ((query)::text = ANY ('{get_events,/api/v1/events}'::text[])) ``` A partial index on the client and query list is extremely effective here in reducing PostgreSQL's workload; however, we cannot easily write it as a migration, since it depends on the value of the ID of the `zephyr_mirror` client. Since this is only relevant for Zulip Cloud, we manually create the index: ```sql CREATE INDEX CONCURRENTLY zerver_useractivity_zehpyr_liveness ON zerver_useractivity(last_visit) WHERE client_id = 1005 AND query IN ('get_events', '/api/v1/events'); ``` We rewrite the query to do the time limit, distinct, and count in SQL, instead of Python, and make use of this index. This turns a 20-second query into two 10ms queries.		2023-11-30 16:01:55 -08:00
..
files	nagios: Use a better index on UserActivity for zephyr alerting.	2023-11-30 16:01:55 -08:00
manifests	zulip_ops: Configure stats to be pushed to status.zulip.com.	2023-11-16 16:21:12 -05:00
templates	zulip_ops: Configure stats to be pushed to status.zulip.com.	2023-11-16 16:21:12 -05:00