Commit Graph

5768 Commits

Author SHA1 Message Date
Steve Howell 4e7fce60ee Add possible_mentions() to speed up rendering.
We now triage message content for possible mentions before
going to the cache/DB to get name info.  This will create an
extra data hop for messages with mentions, but it will save
a fairly expensive cache lookup for most messages.  (This will
be especially helpful for large realms.)

[Note that we need a subsequent commit to actually make the speedup
happen here, since avatars also cause us to look up all users in
the realm.]
2017-09-15 01:09:08 -07:00
Steve Howell 7a4c3c1a5c Make mentions regex more strict for all/everyone.
We only want `@all` and `@everyone` as shorthands.  For user
names we want askerisks: `@**Steve Howell**`.
2017-09-15 01:09:08 -07:00
Steve Howell 6a625eef66 Clean up model imports for bugdown/__init__.py.
We broke some circular dependencies a while back, so we can
move a bunch of imports to the top of the file.
2017-09-15 01:09:08 -07:00
Eeshan Garg a041a23199 webhooks/trello: Ignore when Board background is changed.
I feel like getting notifications about a board's background being
changed isn't very useful information and could interrupt the flow
of other important information such as Card changes or movement,
so I think we should not support this event and
should simply ignore such payloads in the future.
2017-09-15 01:02:47 -07:00
Tim Abbott d06cb4e4fd event_queue: Make path forward for push/email notifications clearer.
This is a nonfunctional refactor, designed primarily to make it
simpler to extend this code path when we later add support for
controlling whether email notifications go out on stream messages.
2017-09-15 01:01:11 -07:00
Tim Abbott d0e8163f13 event_queue: Remove some unnecessary parenthesis. 2017-09-15 01:01:11 -07:00
Tim Abbott e085af3324 Fix stream_push_notify feature to not send emails.
Previously, due to a logic bug, this feature would also send email
notifications for all messages on the stream, which is definitely not
the intent.  The recent refactoring we just did makes the logic more
obvious.
2017-09-15 01:01:11 -07:00
Tim Abbott 22ea2a5858 event_queue: Separate email and push notification loops. 2017-09-15 01:01:11 -07:00
Tim Abbott 7fa0325fb5 event_queue: Refactor notified logic. 2017-09-15 01:01:11 -07:00
Greg Price c4b506998f tornado: Disable routine logging in dev.
This creates a lot of logging noise, and also causes confusion
for new contributors when something isn't working as they expect
and they aren't sure if this message is normal or an error.
2017-09-14 12:38:57 -07:00
Tim Abbott e8f835d852 migrations: Fix migration 0041 failures for long attachment filenames.
We should have done this a long time ago, but better late than never.
Basically, this migration would crash in the event that there were any
attachments with particularly long names.  The fix is the next
migration, 0042; we just inline it here to avoid that crash.
2017-09-14 07:00:07 -07:00
Tim Abbott 5722237f59 push: Rename received_pm to private_message.
This is a clearer name for this now more broadly used interface.
2017-09-14 05:41:37 -07:00
Sarah 97571a203d push: Add new formatting for stream message push and add tests.
This should make the push notifications for messages to streams with
the new stream push notifications setting enabled make sense.
2017-09-14 05:41:37 -07:00
Sarah c3a8138f74 user_settings: Add push notifications for all stream messages.
Add setting to enable push notifications for all stream messages.
2017-09-14 05:41:37 -07:00
Steve Howell 41e3a819da Inline get_recipient_user_ids() into two callers.
This sets us up a subsequent commit where we need more data
from the Subscription table to build recipient info, so the
function boundary doesn't work any more for get_recipient_info,
which is part of the heavily optimized send-message
path.

We used to share code here with typing notifications, but
typing notifications need a lot less data than the
send-message path, so it's useful to decouple these two
things.  The idioms that are duplicated here are pretty simple
one-liners.
2017-09-14 05:13:58 -07:00
Steve Howell ac61c48964 Optimize get_status_dict_by_realm().
This change optimizes get_status_dict_by_realm() by
introducing query_for_ids(), which quickly computes
an "IN" clause for user ids.

This change also inlines the `two_weeks_ago` check, but
that is just for clarity, not performance.
2017-09-14 04:22:02 -07:00
Steve Howell aade317d87 Extract UserPresence.get_status_dicts_for_rows().
The prior version of this function was passed in a QuerySet, which
made it difficult to effectively profile the callers, and there
is really no compelling reason to pass in a query any more.
2017-09-14 04:22:02 -07:00
Umair Khan 1f93c06b76 i18n: Optimize get_language_list().
compilemessages command now does all the heavy lifting by creating a
language_name_map.json file under locale directory. This file is used
by get_language_list to retrieve the require information.

Fixes: #6486
2017-09-14 02:28:58 -07:00
Steve Howell 6c90940f84 performance: Add UserMessageLite class.
This speeds up sending messages significantly.

For 1000 users, this speeds up create_user_messages from
0.652s to 0.0558s, so basically a 10x speedup.
2017-09-12 04:22:55 -07:00
Steve Howell 811fcf51ee Extract create_user_messages.
The logic to create UserMessage rows when you create a message
is very self-contained, and it's helpful to be able to profile it.
2017-09-12 04:22:55 -07:00
Steve Howell 7fbffb8e30 Optimize bulk inserts for UserMessage rows.
Avoiding ORM overhead makes inserting UserMessage rows
about 15 times faster.
2017-09-12 04:22:55 -07:00
Steve Howell d723be125a Optimize get_recipient_info() for sending messages.
This commit makes get_recipient_info() faster by never creating
Django ORM objects.  We use the ORM to create a values query
instead, and then we iterate over the rows to create various
collections of ids.

In order to avoid lots of code duplication, this commit unifies
how we query UserProfile for PMs and streams.  Prior to this
commit we were getting "wide" UserProfile objects out of
our memcached cache.  Now we just go to the database with our
list of userids.  The new approach at worst adds one hop to the
database for PMs, which aren't really a performance bottleneck
(compared to streams).  And the new approach actually saves a
hop when both partners aren't in cache (plus we don't pay the
penalty of hitting the cache itself).

The performance improvement here is easy to measure for messages
to streams with many users, even with all the other activity
that goes on inside do_send_messages().  I took test_performance()
in test_messages.py, set num_extra_users to 3000, and consistently
measured a ~20% speedup in do_send_messages().

This commit also eliminates fetching of emails.  We probably
could have done that in a prior commit, but in this commit it
is very explicit that we don't need it.  While removing email
from the query is a no-brainer, it actually had a negigible
impact on performance.  Almost all the savings here comes from
not create UserProfile objects.
2017-09-12 04:22:55 -07:00
Steve Howell d00c001b5f Create get_recipient_info().
This function returns a summary of recipient data for a message
that's being sent.  It's mostly just moving code into the
old function called get_recipient_user_profiles().
2017-09-12 04:22:55 -07:00
Steve Howell b562dedb53 Avoid using email to detect that the feedback bot is addressed.
This commit is necessary to prevent bringing back emails from the
DB for all N recipients of a message just to see if the feedback
bot is being invoked.
2017-09-12 04:22:55 -07:00
Steve Howell 6f0289ae79 do_send_messages(): Extract internal push_notify_user_ids set.
This is one more step toward not needing UserProfile objects.
2017-09-12 04:22:55 -07:00
Steve Howell 82b2bd8b65 Take user_ids in get_userids_for_missed_messages().
This helps us phase out the need for getting lots of UserProfile
objects.
2017-09-12 04:22:55 -07:00
Steve Howell 06c388774f do_send_messages(): Clean up service bot code.
We calculate `service_bot_tuples` earlier in the function, so that
we don't need "full" UserProfile objects later in the function.

This is part of consolidating code that basically just needs to
triage user_ids.
2017-09-12 04:22:55 -07:00
Steve Howell a22a22966f do_send_messages(): Create UserMessage objects with user_id.
This starts to phase out the need for UserProfile objects in
do_send_messages().  UserProfile objects are expensive to create
for large streams with lots of users.  The objects in the code
before this commit aren't even full UserProfile objects.

This change mostly sets up future performance improvements, but
we also get a minor speedup here when we run a test with 3000
stream subscribers.
2017-09-12 04:22:55 -07:00
Steve Howell ba397b5109 Use user_ids, not full objects, in render path.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.

By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
2017-09-12 04:22:55 -07:00
Steve Howell 9e8c24168d Extract get_typing_user_profiles().
This function is essentially a copy of get_recipient_user_profiles,
which is about to go away. The new function enforces the contract of
typing indicators, which is that they don't apply to streams, which
allows us to use a relatively simple approach for getting user
profile objects.

We are diverging this code, because the send-message path needs
more optimizations.
2017-09-12 04:22:55 -07:00
Steve Howell c87cc1447f Extract get_recipient_user_ids. 2017-09-12 04:22:55 -07:00
Steve Howell 56a552eec3 Get UserProfile objects directly for stream messages.
This change introduces an extra hop to the database, but it is
generally faster due to nuances of the DB and the ORM.  It
also sets us up to optimize get_recipient_user_profiles() by
avoiding creating ORM objects.

I measured the impact of this using a stream with 3000
subscribers, half of whom were idle, and it speeds things up
by 10%.
2017-09-12 04:22:55 -07:00
Steve Howell 262abe41ab Add a performance test for do_send_messages(). 2017-09-12 04:22:55 -07:00
Steve Howell 019d541e47 Optimize UserMessage.flags_list().
This small function was consuming way too much time when we
sent messages to many recipients.
2017-09-09 11:03:43 -07:00
Steve Howell d3cfa1ab35 Optimize PushDeviceToken query.
Avoid a join to UserProfile here speeds up the query from
86ms -> 28ms when you analyze it with about 2000 mobile users
in a 5000-user realm.

We also avoid some code duplication here, since we filter
UserPresence for the same group of users as we filter
PushDeviceToken.
2017-09-08 12:32:17 -07:00
Steve Howell cb3832a147 Use sets, not lists, for mobile_user_ids.
This avoids an O(N-squared) hit during presence queries.  The speedup
here is probably negligible compared to everything else going on, but
sets are more semantically correct, anyway.
2017-09-08 12:32:17 -07:00
Steve Howell b6bb7f2b1e Fix bug where we hard code realm for PushDeviceToken.
This had no test coverage, which is part of the reason it went
undetected, plus many instances probably only have one realm
with realm_id=1.
2017-09-08 12:32:17 -07:00
Steve Howell 730da55bf8 Pre-fetch user ids for presence query.
Before this commit, postgres would choose a non-optimal query
plan to find all presence rows belonging to a realm.  We now
do an extra query to get the list of relevant user_ids, which allows
the next query to take advantage of UserPresence's index on
user_profile_id.

Here is the query plan for the offending query (this particular query isn't
verbatim from the code, but it's representative of the problem):

    explain analyze
    select client_id
    from zerver_userpresence
    INNER JOIN zerver_userprofile ON
        zerver_userprofile.id = zerver_userpresence.user_profile_id
    WHERE
        zerver_userprofile.is_active and
        zerver_userprofile.realm_id = 3;

     Hash Join  (cost=149.66..506.82 rows=5007 width=4) (actual time=48.834..121.215 rows=5007 loops=1)
       Hash Cond: (zerver_userprofile.id = zerver_userpresence.user_profile_id)
       ->  Seq Scan on zerver_userprofile  (cost=0.00..260.11 rows=5369 width=4) (actual time=0.009..24.322 rows=5021 loops=1)
             Filter: (is_active AND (realm_id = 3))
             Rows Removed by Filter: 3
       ->  Hash  (cost=87.07..87.07 rows=5007 width=8) (actual time=48.789..48.789 rows=5010 loops=1)
             Buckets: 1024  Batches: 1  Memory Usage: 196kB
             ->  Seq Scan on zerver_userpresence  (cost=0.00..87.07 rows=5007 width=8) (actual time=0.007..24.355 rows=5010 loops=1)
     Total runtime: 145.063 ms

You can see above that we're filtering on realm_id instead of using an index.

When you decompose the query into two queries, the total time is about 100ms, for a
savings of 33%.  I imagine the savings would be even greater on an instance with lots
of realms.  This was tested on dev with one really large realm and one tiny realm.
2017-09-08 12:32:17 -07:00
Steve Howell 6076a6a38d Remove unused is_mirror_dummy fields. 2017-09-08 12:32:17 -07:00
Steve Howell c19b3aec0c Avoid sorting in UserPresence query.
We were using `.order_by('user_profile_id', '-timestamp') in our
UserPresence query in get_status_dicts_for_query.

We don't need a full sort to produce the dictionary of statuses.
In fact the whole operation in Python is still O(N):

    - divvy rows up to be per-user in an O(N) pass
    - find max row for the 'aggregated' entry in an O(n) pass
      per user

The one minor annoyance of this fix is that datetime_to_timestamp
is lossy, so if you naively call to_presence_dict before finding
the "max" row, you get test flakes if rows are created during the
same second.  I decided to avoid calling to_presence_dict so there
are fewer moving parts, but there's still the ugly step of having
to remove the "dt" field from the final results.
2017-09-08 12:32:17 -07:00
Steve Howell 642e059725 fix_unreads: Add docstring explaining migration use case. 2017-09-07 07:06:03 -07:00
Steve Howell 4dfe6bb320 Add migration to fix unread messages. 2017-09-07 07:06:03 -07:00
Steve Howell 69203c1c81 fix_unreads: Remove commit() call in fix().
The commit() call in fix() breaks migrations and tests (unless you
mock) due to outer transactions.

We now explicitly call commit() from the management command.
2017-09-07 07:06:03 -07:00
Steve Howell 638675cd7e fix_unreads: Use raw SQL to check topic mutes.
Using raw SQL for checking the topic mutes makes it easier
to use the library in a migration.
2017-09-07 07:06:03 -07:00
Steve Howell 8cc8e87daf fix_unreads: Use logging instead of print. 2017-09-07 07:06:03 -07:00
Steve Howell a2fe4178be Extract zerver/lib/fix_unreads.py.
This is a pure code move.
2017-09-07 07:06:03 -07:00
Steve Howell 848c0803bd Exclude muted topics from unread count. 2017-09-07 07:06:03 -07:00
Steve Howell f5edeb01ae Calculate idle users more efficiently when sending messages.
Usually a small minority of users are eligible to receive missed
message emails or mobile notifications.

We now filter users first before hitting UserPresence to find idle
users.  We also simply check for the existence of recent activity
rather than borrowing the more complicated data structures that we
use for the buddy list.
2017-09-07 06:59:44 -07:00
Steve Howell 97c5f085e7 minor: Extract locals in do_send_messages().
This is a prepartory commit for another refactoring.
2017-09-07 06:59:44 -07:00
Steve Howell 981f557422 Extract receiver_is_off_zulip().
We are splitting out this logic from the more complicated
UserPresence-related logic, so that we can simplify the latter.
2017-09-07 06:59:44 -07:00
Steve Howell 776bdc59db Avoid unnecessary steps in process_message_event().
There is no reason to compute receiver_is_idle() unless a user
is actually PM'ed or mentioned.
2017-09-07 06:59:44 -07:00
Umair Khan f7d8db792c makemessages: Allow whitespaces after comma in i18n.
We allow such patterns:

```
i18n.t('Test __variable__',
        {variable: "script"})
```
2017-09-06 07:01:43 -07:00
Steve Howell 0721115c64 model: Remove user_profile.muted_topics.
(We now track muted topics in the MutedTopic model.
2017-09-02 09:19:51 -07:00
Steve Howell 4ac6bc46c7 Add MutedTopic model.
This commit completely switches us over to using a
dedicated model called MutedTopic to track which topics
a user has muted.

This includes the necessary migrations to create the
table and populate it from legacy data in UserProfile.

A subsequent commit will actually remove the old field
in UserProfile.
2017-09-02 09:19:51 -07:00
Steve Howell 06ca364049 minor: Test round-trip behavior for mutes.
Instead of peeking directly at the DB to verify our mutes are
set correctly, we now use the library function.  This prepares
us to modify the DB internals while preserving the tests.
2017-08-30 09:14:41 -07:00
Brock Whittaker 2140a4aa01 landing: Add /plans/ describing ways to use Zulip.
Note from tabbott: This isn't yet linked to and will need to go
through significantly more iteration, but it's a start.
2017-08-30 07:56:22 -07:00
Greg Price a4bcf1a64b APNs: Handle HTTP connection errors, and retry.
Should help with #6321 as at least a band-aid.
2017-08-29 15:27:41 -07:00
Greg Price 780e1ac5b2 push notifs: Add a simple test for the new APNs provider. 2017-08-29 15:27:41 -07:00
Steve Howell 0501570cd1 Remove POST-based API for setting topic mutes. 2017-08-29 16:53:38 -04:00
Steve Howell 828459a24b Extract build_topic_mute_checker into topic_mutes.py.
We had two duplicate versions of this function, and one
of them was broken with respect to case insensitivity.
2017-08-29 16:53:38 -04:00
Steve Howell 8c4a5a9f7a Extract exclude_topic_mutes.
This is mostly a pure code move, but I cleaned up the code
slightly to use early-return.
2017-08-29 16:53:38 -04:00
Steve Howell 0959c978c3 Fix lint error from recent subdomains commit.
We did a code sweep recently for subdomains (see
60be89d0).
2017-08-29 08:35:37 -07:00
Tim Abbott 60be89d00e test_push_notifications: Declare subdomains explicitly. 2017-08-28 23:19:07 -07:00
Tim Abbott c5d699b6fb tests_classes: Add DEFAULT_SUBDOMAIN feature.
This should make life a little easier for those tests that need to use
the same subdomain like 20 times.
2017-08-28 23:17:33 -07:00
Tim Abbott 4a22316d90 test_decorator: Add explicit subdomains in tests. 2017-08-28 22:51:57 -07:00
Tim Abbott a8b9ffc020 test_classes: Include more detail in incorrect JSON responses.
If the status code is wrong, we show the actual error message now,
which often saves a bit of time when debugging.
2017-08-28 21:43:41 -07:00
Rishi Gupta c1997e759c password_reset: Change email to be appropriate for obtaining first password.
The situation if, for instance, the user signed up via google auth, and now
needs a password to get their API key.
2017-08-28 20:39:53 -07:00
Tim Abbott f1ad819547 home: Remove compatibility code for old name Humbug.
Since Zulip stopped being called Humbug in like 2013, this code hasn't
been useful in years, and is a bit confusing.
2017-08-28 16:15:58 -07:00
Steve Howell 0106add546 mypy: Use TypedDict for UnreadMessageResult. 2017-08-28 14:48:19 -07:00
Tim Abbott 50f5560bd1 accounts: Standardize URL for find_account.
This changes it to match the /accounts/ URL style for all of our other
auth code path endpoints.
2017-08-28 14:36:59 -07:00
Tim Abbott ac0d90e533 portico: Rename 'find_my_team' to 'find_account'. 2017-08-28 14:29:29 -07:00
Tim Abbott a0a1fe1512 settings: Rename SERVER_URI to ROOT_DOMAIN_URI.
This should be a lot less confusing.

See #6013 for discussion.
2017-08-28 14:09:28 -07:00
Yago González 659bff1ffb i18n: Fix URLs misparsing in translation tags.
The double forward slash (//) after the protocol in URLs was being
mistakenly considered the beginning of an inline JS comment, causing
internationalization strings being cut unexpectedly.

Now the check for inline JS comments is only run in .js files.
2017-08-28 13:54:17 -07:00
Umair Khan ecfafc05c0 registration: Use already_registered to show error.
Use this new variable to determine if the user already exists while
doing registration. While doing login through GitHub if we press
*Go back to login*, we pass email using email variable. As a result,
the login page starts showing the "User already exists error" if we
don't change the variable.
2017-08-28 07:02:11 -07:00
Steve Howell 73c30774cb admins: Add private streams to never_subscribed.
Admins need to know about private streams to delete them, even
if they are not subscribed.  We send the minimal info possible
to the client to allow them to have a UI for that.
2017-08-27 19:08:04 -07:00
Steve Howell 8ea9b80a8c Clean up test_never_subscribed_streams().
This basically extracts a few helper methods and makes the data
setup a bit more explicit.
2017-08-27 19:08:04 -07:00
Steve Howell 313f73258d Allow admins to delete private streams (backend only).
This is the backend piece.  Getting the UI right here is a bit
more complicated here, but this allows admins to use the API
to delete streams.
2017-08-27 19:08:04 -07:00
Aditya Bansal d9c9bfe7f6 logger: Add new create_logger abstraction to simplify logging.
This deduplicates a ton of Python logger-creation code to use a single
standard implementation, so we can avoid copy-paste problems.
2017-08-27 18:31:53 -07:00
Tim Abbott e092f1afff logging: Fix soft_deactivation log declaration.
Apparently, the soft deactivation log was incorrectly grabbing the
root logger, and thus screwing up where everything got logged.
2017-08-27 18:30:52 -07:00
Tim Abbott 70e16da81c decorator: Fix request.user handling of remote servers.
The refactor in b46af40bd3 didn't
correctly translate the code for managing request.user and
request._email, resulting in requests for the push notification
bouncer being rejected with this exception:

AttributeError: 'AnonymousUser' object has no attribute 'rate_limits'
2017-08-27 16:35:17 -07:00
Preston Hansen 5a501784f2 digest emails: Add unit tests for digest email management.
Fixes #6266.
2017-08-27 13:10:14 -07:00
Vishnu Ks 8fc8ac0799 management: Override CommandError to mention --entire-server argument. 2017-08-27 12:34:23 -07:00
Vishnu Ks dc63f838d7 backend-tests: Add tests for get_users with all-users argument enabled. 2017-08-27 12:34:23 -07:00
Vishnu Ks 23b63238c4 management: Handle the invalid user arguments cases separately. 2017-08-27 12:34:23 -07:00
Aditya Bansal b232563e12 soft-deactivation: Add cron job for weekly soft deactivating users. 2017-08-27 11:33:06 -07:00
Aditya Bansal 9d7e23c100 softdeactivation/management: Make specifying realm an optional arg. 2017-08-27 11:33:06 -07:00
Tim Abbott ed31a5988c models: fix badly line-wrapped type annotation.
Fixes #6290.
2017-08-27 11:27:07 -07:00
Preston Hansen e8a608f733 management: Move enqueue_digest_email handler to digest. 2017-08-27 10:13:11 -07:00
Preston Hansen 9a4b17cf9b management: Move queue_digest_recipient to digest. 2017-08-27 10:13:11 -07:00
Preston Hansen 2aabf4fc67 management: Move should_process_digest to digest. 2017-08-27 10:13:11 -07:00
Preston Hansen 25a40806df management: Move inactive_since to digest. 2017-08-27 10:13:11 -07:00
Tim Abbott f1648af607 migrations: Update UserMessage model for is_me_message removal.
And while we're at it, document the related migration we need to do.
2017-08-27 10:11:43 -07:00
Tim Abbott 1c8c5cc36f test_messages: Fix deactivation tests for new /me behavior. 2017-08-27 09:58:02 -07:00
Tim Abbott 133f005530 markdown: Remove is_me_message UserMessage flags.
This never made sense to be a flag on the UserMessage table, since
it's not per-user state.  And in fact it doesn't need to be in a
database at all, since it's easily computed from content anyway.

Fixes #1099.
2017-08-27 09:34:24 -07:00
Tim Abbott e5da9966c2 bugdown: Remove now-unnecessary short_names.
This field hasn't been used since we removed the related mention
syntax.
2017-08-27 08:45:02 -07:00
Tim Abbott 92efe94a27 tests: Remove unnecessary apns mock. 2017-08-26 15:00:08 -07:00
Tim Abbott f0637cb01a push_notifications: Fix one last lint error. 2017-08-26 14:34:17 -07:00
Tim Abbott 00036ac8db push_notifications: Fix mypy error. 2017-08-26 14:33:43 -07:00
Rishi Gupta 1215757217 send_email: Remove confusing comment. 2017-08-26 14:24:32 -07:00
Rishi Gupta 1a43ef40cf emails: Add email_images_base_uri to context in build_email. 2017-08-26 14:24:32 -07:00