zulip/zerver
Steve Howell 730da55bf8 Pre-fetch user ids for presence query.
Before this commit, Postgres would choose a suboptimal query
plan to find all presence rows belonging to a realm.  We now
do an extra query to get the list of relevant user_ids, which allows
the next query to take advantage of UserPresence's index on
user_profile_id.

Here is the query plan for the offending query (this particular query isn't
verbatim from the code, but it's representative of the problem):

    explain analyze
    select client_id
    from zerver_userpresence
    INNER JOIN zerver_userprofile ON
        zerver_userprofile.id = zerver_userpresence.user_profile_id
    WHERE
        zerver_userprofile.is_active and
        zerver_userprofile.realm_id = 3;

     Hash Join  (cost=149.66..506.82 rows=5007 width=4) (actual time=48.834..121.215 rows=5007 loops=1)
       Hash Cond: (zerver_userprofile.id = zerver_userpresence.user_profile_id)
       ->  Seq Scan on zerver_userprofile  (cost=0.00..260.11 rows=5369 width=4) (actual time=0.009..24.322 rows=5021 loops=1)
             Filter: (is_active AND (realm_id = 3))
             Rows Removed by Filter: 3
       ->  Hash  (cost=87.07..87.07 rows=5007 width=8) (actual time=48.789..48.789 rows=5010 loops=1)
             Buckets: 1024  Batches: 1  Memory Usage: 196kB
             ->  Seq Scan on zerver_userpresence  (cost=0.00..87.07 rows=5007 width=8) (actual time=0.007..24.355 rows=5010 loops=1)
     Total runtime: 145.063 ms

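You can see above that both tables are read with sequential scans: the
realm_id filter is applied row by row to zerver_userprofile, and the index on
zerver_userpresence.user_profile_id is never used.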

When you decompose the query into two queries, the total time is about 100ms
(down from about 145ms), for a savings of roughly 30%.  I imagine the savings
would be even greater on an instance with lots of realms.  This was tested on
dev with one really large realm and one tiny realm.
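
The decomposition described above can be sketched roughly as follows. This is
an illustration only, not the code from this commit: it uses an in-memory
SQLite database instead of Postgres, the tables are cut down to the columns in
the example query, and the names build_demo_db, fetch_presence_client_ids and
the index name are hypothetical.

```python
import sqlite3


def build_demo_db():
    # Hypothetical, cut-down versions of the two tables from the example query.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE zerver_userprofile (
            id INTEGER PRIMARY KEY,
            realm_id INTEGER NOT NULL,
            is_active INTEGER NOT NULL
        );
        CREATE TABLE zerver_userpresence (
            id INTEGER PRIMARY KEY,
            user_profile_id INTEGER NOT NULL,
            client_id INTEGER NOT NULL
        );
        CREATE INDEX demo_userpresence_user_profile_id
            ON zerver_userpresence (user_profile_id);
        """
    )
    conn.executemany(
        "INSERT INTO zerver_userprofile (id, realm_id, is_active) VALUES (?, ?, ?)",
        [(1, 3, 1), (2, 3, 0), (3, 4, 1), (4, 3, 1)],
    )
    conn.executemany(
        "INSERT INTO zerver_userpresence (user_profile_id, client_id) VALUES (?, ?)",
        [(1, 10), (2, 11), (3, 12), (4, 13), (4, 14)],
    )
    return conn


def fetch_presence_client_ids(conn, realm_id):
    # Query 1: pre-fetch the ids of the active users in the realm.
    user_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM zerver_userprofile WHERE is_active AND realm_id = ?",
            (realm_id,),
        )
    ]
    if not user_ids:
        return []

    # Query 2: look up presence rows by user_profile_id only, so the planner
    # can use the index on that column instead of joining the two tables.
    # (A very long id list would need batching; this demo keeps it small.)
    placeholders = ",".join("?" * len(user_ids))
    rows = conn.execute(
        "SELECT client_id FROM zerver_userpresence "
        "WHERE user_profile_id IN (" + placeholders + ") ORDER BY id",
        user_ids,
    )
    return [row[0] for row in rows]
```

The second query no longer mentions zerver_userprofile at all, which is what
lets it be answered from the user_profile_id index.  Whether that is actually
faster depends on the planner and the data, so timings should be checked with
explain analyze on a real instance.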
2017-09-08 12:32:17 -07:00
fixtures test fixtures: Add some more quotes and remove some Oscar Wilde. 2017-08-23 13:00:39 -07:00
lib fix_unreads: Add docstring explaining migration use case. 2017-09-07 07:06:03 -07:00
management fix_unreads: Remove commit() call in fix(). 2017-09-07 07:06:03 -07:00
migrations Add migration to fix unread messages. 2017-09-07 07:06:03 -07:00
templatetags templatetags: Fix buggy exception clause. 2017-08-25 00:39:58 -07:00
tests fix_unreads: Remove commit() call in fix(). 2017-09-07 07:06:03 -07:00
tornado Calculate idle users more efficiently when sending messages. 2017-09-07 06:59:44 -07:00
views Add MutedTopic model. 2017-09-02 09:19:51 -07:00
webhooks trello: Use client_head wrapper in tests. 2017-08-26 13:45:27 -07:00
worker mypy: Set assign_queue() parameter queue_type to not be Optional. 2017-08-07 21:27:50 -07:00
__init__.py caching: Add configuration class for post-migration cache flushing. 2016-10-27 23:26:34 -07:00
apps.py Add notifications on new logins to Zulip. 2017-03-25 16:50:52 -07:00
context_processors.py settings: Rename SERVER_URI to ROOT_DOMAIN_URI. 2017-08-28 14:09:28 -07:00
decorator.py logger: Add new create_logger abstraction to simplify logging. 2017-08-27 18:31:53 -07:00
filters.py mypy: Added Dict, List and Set imports. 2017-03-04 14:33:44 -08:00
forms.py forms: Replace is_inactive with more comprehensive check. 2017-08-24 23:16:31 -07:00
logging_handlers.py logging_handlers: Fix tracebacks being emailed in subject lines. 2017-04-25 18:55:11 -07:00
middleware.py mypy: Correct 2 type annotations in zerver/middleware.py. 2017-08-15 17:50:18 -07:00
models.py Pre-fetch user ids for presence query. 2017-09-08 12:32:17 -07:00
signals.py Update "MacOS" text to "macOS" 2017-08-26 09:00:42 -07:00
static_header.txt
storage.py mypy: Remove type: ignores not needed in Python 3. 2017-08-25 11:04:20 -07:00