zulip/zerver
Alex Vandiver 07c4291749 message: Rewrite personals query to be more performant and accurate.
The previous query suffered from bad corner cases when the user had
received a large number of direct messages but sent very few,
comparatively.  This meant that the first half of the UNION would
retrieve a very large number of UserMessage rows, requiring fetching a
large number of Message rows, merely to throw them away upon
determining that the recipient was the current user.

Rather than merging two queries of "last 1k received" + "last 1k sent",
we now make better use of the UserMessage rows to find "last 1k
sent or received."  This may change the list of recipients, as large
disparities in sent/received messages may result in pushing the
most-recently-sent users off of the list.  These are likely uncommon
edge cases, however -- and the disparity is the whole reason for the
performance problem.
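
As a rough illustration of how the recipient list can differ, here is a
minimal in-memory sketch.  It is not the actual SQL; `DirectMessage`,
`old_strategy`, `new_strategy`, and the tiny `limit` are hypothetical
names and values chosen only for this example, and it models just the
"which messages are considered" part of each plan.

```python
from typing import NamedTuple


class DirectMessage(NamedTuple):
    # Hypothetical stand-in for a direct message; ids increase over time.
    id: int
    sender: str
    recipient: str


def latest_per_peer(messages, me):
    """Map each conversation partner to the max message id seen."""
    latest = {}
    for m in messages:
        peer = m.recipient if m.sender == me else m.sender
        latest[peer] = max(latest.get(peer, 0), m.id)
    return latest


def old_strategy(messages, me, limit):
    """Merge "last `limit` received" with "last `limit` sent"."""
    by_recency = sorted(messages, key=lambda m: m.id, reverse=True)
    received = [m for m in by_recency if m.recipient == me][:limit]
    sent = [m for m in by_recency if m.sender == me][:limit]
    return latest_per_peer(received + sent, me)


def new_strategy(messages, me, limit):
    """Take the "last `limit` sent or received" in a single pass."""
    by_recency = sorted(messages, key=lambda m: m.id, reverse=True)
    involved = [m for m in by_recency if me in (m.sender, m.recipient)]
    return latest_per_peer(involved[:limit], me)


# One old sent message, followed by several newer received ones.
history = [
    DirectMessage(1, "me", "alice"),
    DirectMessage(2, "bob", "me"),
    DirectMessage(3, "carol", "me"),
    DirectMessage(4, "bob", "me"),
]

print(old_strategy(history, "me", limit=2))
print(new_strategy(history, "me", limit=2))
```

With many more received than sent messages, the single sent message to
`alice` falls outside the combined window in the sketch, so she drops
off the new list while still appearing in the old one.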

This also provides more correct answers.  In the case where my
1001st message sent was to person A today, but my most recent message
received was from them yesterday, the previous plan would report the
message-id of the message I received yesterday as the max, and not
that of the more recent message I sent today.

While we could theoretically raise the `RECENT_CONVERSATIONS_LIMIT` to
more frequently match the same recipient list as previously, this
increases the cost of the most common cases unreasonably.  With a
1000-message limit, the common cases are slightly faster, and the tail
latencies are very much improved; raising `RECENT_CONVERSATIONS_LIMIT`
would increase the result similarity to the old algorithm, at the cost
of the p50 and p75.

|        |   Old   |   New   |
| ------ | ------- | ------- |
| Mean   | 0.05287 | 0.02520 |
| p50    | 0.00695 | 0.00556 |
| p75    | 0.05592 | 0.03351 |
| p90    | 0.14645 | 0.08026 |
| p95    | 0.20181 | 0.10906 |
| p99    | 0.30691 | 0.16014 |
| p99.9  | 0.57894 | 0.19521 |
| max    | 22.0610 | 0.22184 |

On the whole, however, the much more tightly bounded worst case is
worth the small changes to the result set.
2024-01-18 09:30:20 -08:00
..
actions actions: Rename *topic local variables to *topic_name. 2024-01-15 09:40:43 -08:00
data_import models: Extract zerver.models.streams. 2023-12-16 22:08:44 -08:00
integration_fixtures/nagios
lib message: Rewrite personals query to be more performant and accurate. 2024-01-18 09:30:20 -08:00
management process_queue: For threaded workers, create them when they start. 2024-01-12 08:38:46 -08:00
migrations models: Extract zerver.models.realm_audit_logs. 2023-12-16 22:08:44 -08:00
models actions: Rename *topic local variables to *topic_name. 2024-01-15 09:40:43 -08:00
openapi docs: Fix other help pages that were renamed or moved, to save a redirect. 2024-01-11 13:52:12 -08:00
tests find_account: Remove emails as URL parameters. 2024-01-16 09:39:00 -08:00
tornado models: Extract zerver.models.clients. 2023-12-16 22:08:44 -08:00
transaction_tests models: Extract zerver.models.realms. 2023-12-16 22:08:44 -08:00
views find_account: Remove emails as URL parameters. 2024-01-16 09:39:00 -08:00
webhooks webhooks: Rename *topic local variables to *topic_name. 2024-01-17 08:35:29 -08:00
worker queue_processors: Defer initial email connection creation. 2024-01-12 08:38:46 -08:00
__init__.py
apps.py mypy: Enable new error explicit-override. 2023-10-12 12:28:41 -07:00
context_processors.py login: Remove external_authentication_methods from page_params. 2023-12-29 13:02:12 -08:00
decorator.py auth: Add hardening authenticate(use_dummy_backend=True) in do_login. 2024-01-15 12:18:48 -08:00
filters.py mypy: Enable new error explicit-override. 2023-10-12 12:28:41 -07:00
forms.py models: Extract zerver.models.realms. 2023-12-16 22:08:44 -08:00
logging_handlers.py error_notify: Remove custom email error reporting handler. 2023-07-20 11:00:09 -07:00
middleware.py csrf_failure: Update error page. 2024-01-10 09:49:24 -08:00
signals.py email: Add a space after the time and AM/PM in the login email. 2023-11-27 09:47:30 -08:00