From 912c6ab90590fda167be97c2a89e1099c427895d Mon Sep 17 00:00:00 2001
From: Tim Abbott <tabbott@zulip.com>
Date: Tue, 16 Feb 2021 12:55:21 -0800
Subject: [PATCH] docs: Add table to scalability article.

This table can hopefully replace some of the prose discussion about
relative scalability impact (though I don't do that editing in this
commit).
---
 docs/subsystems/performance.md | 44 +++++++++++++++++++++++++++-------
 1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/docs/subsystems/performance.md b/docs/subsystems/performance.md
index 54c2f295a3..7569d87857 100644
--- a/docs/subsystems/performance.md
+++ b/docs/subsystems/performance.md
@@ -78,16 +78,42 @@ optimizations with any cost in code readability to save a few
 milliseconds that would be invisible to the end user.
 
 In Zulip's documentation, our general rule is to primarily write facts
-that are likely to remain true for a long time.  While the numbers in
-this article will surely shift with time and hardware, we expect the
-rough sense of them (as well as the list of important endpoints) to
-remain constant for the foreseeable future.
+that are likely to remain true for a long time.  While the numbers
+presented here vary with hardware, usage patterns, and time (there's
+substantial oscillation within a 24-hour period), we expect the rough
+sense of them (as well as the list of important endpoints) not to
+vary dramatically over time.
 
-As a spoiler, there are two categories of endpoints that are important
-for scalability: those with extremely high request volumes, and those
-with moderately high request volumes that are also expensive.  We
-first discuss the two endpoints in the first category, and then
-proceed to discuss the rest.
+``` eval_rst
+=======================   ============  ==============  ===============
+Endpoint                  Average time  Request volume  Average impact
+=======================   ============  ==============  ===============
+POST /users/me/presence   25ms          36%             9000
+GET /messages             70ms          3%              2100
+GET /                     300ms         0.3%            900
+GET /events               2ms           44%             880
+GET /user_uploads/*       12ms          5%              600
+POST /messages/flags      25ms          1.5%            375
+POST /messages            40ms          0.5%            200
+POST /users/me/*          50ms          0.04%           20
+=======================   ============  ==============  ===============
+```
+
+The "Average impact" above is computed by multiplying request volume
+by average time (up to a constant scale factor); this tells you
+roughly that endpoint's **relative** contribution to the steady-state
+total CPU load of the system.  It's not precise (waiting for a network
+request is counted the same as active CPU time), but it's extremely
+useful for providing intuition for which code paths are most important
+to optimize, especially since network wait is in practice largely
+waiting for postgres or memcached to do work.
+
+As one can see, there are two categories of endpoints that are
+important for scalability: those with extremely high request volumes,
+and those with moderately high request volumes that are also
+expensive.  By contrast, it doesn't matter for scalability how
+expensive `POST /users/me/subscriptions` is, because its volume is
+negligible.
 
 ### Tornado