Commit Graph

824 Commits

Author SHA1 Message Date
derAnfaenger ad0407578d queue processors: Test flow through UserPresenceWorker.consume(). 2017-10-16 23:20:13 -07:00
Robert Hönig e749deb136 onboarding: Add welcome-bot response to initial user message.
Fixes #6030.
2017-10-11 20:45:42 -07:00
Steve Howell a6ad9a6d7c Add is_zephyr to the Stream model.
Add this field to the Stream model will prevent us from having
to look at realm data for several types of stream operations, which
can be prone to either doing extra database lookups or making
our cached data bloated.

Going forward, we'll set stream.is_zephyr to True whenever the
realm's string id is "zephyr".
2017-10-11 16:15:56 -07:00
Steve Howell c1d7fc6e80 Only require stream_id in private_stream_user_ids(). 2017-10-08 20:18:34 -07:00
Steve Howell 7dbea8a2bf Only require stream_id in subscribed_to_stream().
Since subscribed_to_stream is only doing an id lookup
on the Stream model to find out if a user is subscribed to
a stream, there's no reason to require a full Stream object.

It's currently the case that all callers do have full Stream
objects handy to pass in to this function, but it's still a
good practice to have functions only ask for objects that they
need.
2017-10-08 20:18:34 -07:00
Tim Abbott d215ea1e37 actions: Rename all_subs_by_stream to all_subscribers_by_stream.
The previous name sounded a bit too much like they were subcription
objects.
2017-10-08 12:33:53 -07:00
Steve Howell 3e6bfe1b23 Use user_ids, not emails, for bulk stream operations.
We now return user_ids for subscribers to streams in add-stream
events.  This allows us to eliminate the UserLite class for
both bulk adds and bulk removes.  It also simplifies some JS
code that already wanted to use user_ids, not emails.

Fixes #6898
2017-10-08 12:31:12 -07:00
Steve Howell 10a30bece1 Rename presence_idle_userids -> presence_idle_user_ids. 2017-10-07 12:16:45 -07:00
Steve Howell fbaef43ac3 Rename bot_owner_userids -> bot_owner_user_ids. 2017-10-07 12:16:45 -07:00
Steve Howell a331b4f64d Optimize query_all_subs_by_stream().
Using lightweight objects will speed up adding new users
to realms.

We also sort the query results, which lets us itertools.groupby
to more efficiently build the data structure.

Profiling on a large data set shows about a 25x speedup for this
function, and before the optimization, this function accounts
for most of the time spend in bulk_add_subscriptions.

There's a lot less memory to allocate.  I didn't measure
the memory difference.

When we test-deployed this to chat.zulip.org, we got about a 6x
speedup.
2017-10-06 11:03:44 -07:00
Steve Howell f5ddc40d14 Have get_peer_user_ids_for_stream_change() use user_ids. 2017-10-06 11:03:44 -07:00
Steve Howell aae0b2a826 Notify offline users about edited stream messages.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.

This effectively boils down to sending notifications to newly
mentioned users.  The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person.  If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.

We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.

The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.

We can expect a couple changes in the future:

    * Requirements may change here, and it might make sense
      to send offline notifications on the update side even
      in circumstances where the original message had a
      notification.

    * We may track more notifications in a DB model, which
      may simplify our short-circuit logic.

In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
2017-10-03 15:57:06 -07:00
Tim Abbott 654562b942 check_message: Reject null bytes in message content.
Postgres doesn't like them, we don't have an obvious way to escape
them, and they tend to be sent by buggy tools where it'd be better for
the user to get an error.

This fixes a 500 we were getting occasionally.
2017-10-03 15:32:04 -07:00
Tim Abbott 83a226ad3d do_send_message: Replace get_user_profile_by_email.
get_system_bot is the new way to do this sort of thing.
2017-10-02 16:15:36 -07:00
Eeshan Garg 763437cc9c backend: Introduce check_send_private_message.
check_send_private_message utilizes Addressee.for_user_profile()
to send a private message to a user_profile.
2017-10-02 15:27:26 -07:00
Steve Howell 2be713a7e4 Rename get_userids_for_missed_messages().
We rename this function to get_active_presence_idle_userids().
2017-10-02 15:19:28 -07:00
Steve Howell e660428c21 Rename missed_message_userids to presence_idle_userids. 2017-10-02 15:19:28 -07:00
Steve Howell 32d5e297a3 Rename get_idle_userids to filter_presence_idle_userids.
We have two different concepts of "idle", and this function
is based on the "presence" aspect of idleness.  There is also
idleness in terms of a user having no current client
descriptors accepting messages, and we check that later in
the process for things like sending missed message emails.
2017-10-02 15:19:28 -07:00
Eeshan Garg 24aff0d0a2 backend: Introduce check_send_stream_message.
check_send_stream_message is a simpler version of
check_send_message for sending messages where the addressee is
a stream. Instead of relying on Addressee.legacy_build,
check_send_stream_message uses Addressee.for_stream. Consequently,
it eschews many of check_send_message's kwargs that aren't needed
when the intended recipient of a message is a stream.
2017-09-30 17:48:55 -07:00
Steve Howell aaaaa66d4c refactor: Move default_sending_stream logic to Addressee.
Having Addressee take care of setting stream_name to
sender.default_sending_stream.name makes us able to have
the invariant that stream_name is never None when the
message type is 'stream', which will help for mypy, among
other things.

One thing to be aware of is that Addressee does do a little
bit of validation work, and this adds yet another JsonableError
exception.  I don't view this as a bad thing, just something to
know.
2017-09-28 12:14:08 -07:00
rht 035ed93111 zerver/lib: remove `import six`. 2017-09-27 19:10:28 -07:00
rht 2e12fe5e2e zerver/lib: Remove print_function. 2017-09-27 18:05:45 -07:00
Steve Howell de0b47fd4e Always notify service bots about stream mentions.
Before this change, we were only triggering service bots
for stream mentions when the bot was subscribed to the
stream.
2017-09-27 17:22:12 -07:00
Tim Abbott 28eaf5620e emails: Use common_context for email change notifications.
This also lets us remove `realm_uri`.
2017-09-27 16:48:18 -07:00
Steve Howell 1b518f1983 Return mentioned users in get_user_info_for_message_updates().
The dictionary result for get_user_info_for_message_updates()
now has a `mention_user_ids` field that is a set of user ids
who were mentioned in a message.
2017-09-27 16:01:50 -07:00
Steve Howell 646abb57b7 refactor: Extract get_user_info_for_message_updates.
We'll want to expand this to get users that were mentioned in
the prior message, but this commit is just a refactoring.
2017-09-27 16:01:50 -07:00
Steve Howell 5abf52de71 Extract get_idle_userids().
This function will help us send missed-message mails for
updates, in a future commit.
2017-09-27 16:01:50 -07:00
rht f43e54d352 zerver/lib: Remove absolute_import. 2017-09-27 10:00:39 -07:00
Steve Howell b340b28055 Extract get_service_bot_events().
There are several reasons to extract this function:

    * It's easy to unit test without extensive mocking.
    * It will show up when we profile code.
    * It is something that you can mostly ignore for
      most messages.

The main reason to extract this, though, is that we are about
to do some fairly complex splicing of data for the use case
of mentioning service bots on streams they are not subscribed to,
and we want to localize the complexity.
2017-09-26 18:49:03 -07:00
Tim Abbott c11b1623dd addressee: Accept a realm object in legacy_build.
This fixes a bug where the internal_prep_message code path would
incorrectly ignore the `realm` that was passed into it.  As a result,
attempts to send messages using the system bots with this code path
would crash.

As a sidenote, we really need to make our test system consistent with
production in terms of whether the user's realm is the same as the
system realm.
2017-09-25 14:06:29 -07:00
Tim Abbott 14c1660a55 addressee: Pass realm into get_user_profiles.
We don't access any attributes of the sender other than the realm, and
as it turns out, we in some cases want to use a different realm than
the sender's.
2017-09-25 14:00:46 -07:00
Tim Abbott 8e2c91b09c actions: Use internal_send_private_message. 2017-09-25 13:59:04 -07:00
Tim Abbott 1edd137263 RealmAuditLog: Pass acting_user to do_reactivate_user. 2017-09-22 07:33:02 -07:00
Rishi Gupta 6ec3595b77 emails: Change enqueue_welcome_emails to take a user rather than user_id. 2017-09-22 06:20:33 -07:00
Rishi Gupta a7c8770f97 emails: Move enqueue_welcome_emails outside of signups queue.
The only thing this queue should do is sign you up for the newsletter, since
it is only populated if newsletter_data is not None.
2017-09-22 06:20:33 -07:00
Tim Abbott f706f657c0 signup: Fix invitation emails not being cleared properly.
Previously, invitation reminder emails were only being cleared after a
successful signup if newsletter_data was available, since that was the
circumstance in which we were calling the relevant queue processor
code.  Now, we (1) clear them when a human user finishes signing up
and (2) correctly clear them using the 'address' field of
ScheduleEmail, not user_id.
2017-09-21 06:15:11 -07:00
Steve Howell 428d3027c2 Only require ids for finding DefaultStream objects.
We don't need full Realm objects to find DefaultStream
objects for a realm.  So now a few functions related to
adding/removing default streams use realm_id for lookups.

Similarly, we don't need a full Stream object to find
out if a stream exists in DefaultStream, so we do id
lookups there as well.

This sets us up to use thinner objects in callers.
2017-09-20 10:31:33 -07:00
Steve Howell fd58d472a5 Use update_fields in do_deactivate_stream.
We are generally explicit about which fields we save.
2017-09-20 10:31:33 -07:00
Steve Howell 7b2340decd Extract stream_name_in_use().
Checking to see if a stream exists is more idiomatic
if we just use exists() from Django.  We encapsulate it
for case insensitivity purposes.
2017-09-20 10:31:33 -07:00
Steve Howell 9046efb71a Use `prereg.users.set()` in do_invite_users.
This is a bit more idiomatic for many-to-many relationships.
2017-09-20 10:31:33 -07:00
Steve Howell 8ad7133351 Cache active_user_ids() more directly.
We now have a dedicated cache for active_user_ids() that only
stores a list of user_ids.

Before this commit, active_user_ids() used a cache of UserProfile
dictionaries, so it incurred unnecessary deserialization costs for
all the user fields that it sliced away in a list comprehension.

Because the cache is skinnier here, we also need to invalidate it
less frequently.  Basically, all we care about is new users, realm
deactivations, and user deactivations.

It's hard to measure how much this will improve performance, because
the speedup for any operation here is pretty minor, but we use this
function a lot, so hopefully it will make the overall system more
healthy.
2017-09-20 10:31:33 -07:00
Steve Howell cad3a35b6a Only require realm_id for active_user_ids().
This is mostly a preparatory commit for an upcoming optimization
related to stream data, but it probably does save us an
occasional DB hop to the realm table.
2017-09-20 10:31:33 -07:00
Steve Howell 26735eeeac Only require realm_id for get_active_user_dicts_in_realm().
This is a preparatory commit that will eventually allow us
to avoid fetching realm info that we don't need, in other
parts of the codebase.
2017-09-20 10:31:33 -07:00
Steve Howell 0966bf1a48 Simplify get_stream_cache_key().
Before this commit, we could pass in either a Realm object
or a realm_id to get_stream_cache_key().  Now we consistently
pass it a realm_id.
2017-09-20 10:31:33 -07:00
Greg Price 0c7dbd2e8a message send: Cut is_active from the values query in get_recipient_info.
This is unused since the query started filtering on is_active=True, in
51d4f16fe "Ignore inactive users in get_recipient_info()."
2017-09-19 20:08:39 -07:00
Steve Howell c4b17f2f80 Optimize get_recipient_info() with get_ids_for().
The get_ids_for() function produces a 2x speedup for
1000 users.
2017-09-16 03:07:13 -07:00
Steve Howell 84041d3195 Use itertools.groupby in bulk_get_subscriber_user_ids().
This results in about a 20% speedup by making more O(N)
things happen in C vs. Python.
2017-09-15 10:44:32 -07:00
Steve Howell 24b9f72b22 Use raw SQL in bulk_get_subscriber_user_ids().
This leads to more than a 2x speedup when tested with
20k+ total subscribers.  (For large realms with lots of default
streams, this function deals with LOTS of data, so it is important
to optimize.)
2017-09-15 10:44:32 -07:00
Steve Howell 1553dc00e0 Introduce StreamRecipient class.
This class encapsulates the mapping of stream ids to
recipient ids, and it is optimized for bulk use and
repeated use (i.e. it remembers values it already fetched).

This particular commit barely improves the performance
of gather_subscriptions_helper, but it sets us up for
further optimizations.

Long term, we may try to denormalize stream_id on to the
Subscriber table or otherwise modify the database so we
don't have to jump through hoops to do this kind of mapping.
This commit will help enable those changes, because we
isolate the mapping to this one new class.
2017-09-15 10:44:32 -07:00
Steve Howell fc2e485ca7 Sort emails in gather_subscriptions().
This helps makes the tests more deterministic.
2017-09-15 10:44:32 -07:00