Commit Graph

134 Commits

Author SHA1 Message Date
rht 5ee40bf718 Remove usage of six.moves.binary_type. 2017-11-09 10:00:00 -08:00
Harshit Bansal 65838bb825 email_gateway: Disable code block processor for email gateway.
Generally emails are not written with markdown in mind and hence
sometimes render in strange ways. This commit fixes a particular
issue that was causing whitespace before paragraphs to be treated
as code block due to which email content was being rendered in a
box that scrolls in right direction a lot.

Fixes: #7045.
2017-11-09 09:56:35 -08:00
Steve Howell ae0b27a7ed Extract messages_for_ids. 2017-11-07 17:48:27 -08:00
Steve Howell 88e1e284bb Restructure send-message code for gravatars.
This refactoring doesn't change behavior, but it sets us up
to more easily handle a register setting for `client_gravatar`,
which will allow clients to tell us they're going to compute
their own gravatar URLs.

The `client_gravatar` flag already exists in our code, but it
is only used for Django views (users/messages) but not for
Zulip events.

The main change is to move the call to `set_sender_avatar` into
`finalize_payload`, which adds the boolean `client_gravatar`
parameter to that function.  And then we update various callers
to supply that flag.

One small performance benefit of this change is that we now
lazily compute the client message payloads in
`event_queue.process_message_event` now, so this will improve
performance if all interested clients have the same value of
`apply_markdown`.  But the change here is really preparing us
for the additional boolean parameter, which will cause us to
have four variations of the payload.
2017-11-07 10:36:02 -08:00
rht e311842a1b zerver/lib: Remove inheritance from object. 2017-11-06 08:53:48 -08:00
neiljp (Neil Pilgrim) fed757452c mypy: Clarify type of lookup_dict in aggregate_message_dict. 2017-11-04 19:47:44 -07:00
rht fef7d6ba09 zerver/lib: Remove u prefix from strings.
License: Apache-2.0
Signed-off-by: rht <rhtbot@protonmail.com>
2017-11-03 15:34:37 -07:00
Umair Khan 636046aec9 user-groups: Add basic backend for UserGroup model.
This adds the data model and bugdown support for the new UserGroup
mention feature.

Before it'll be fully operational, we'll still need:
* A backend API for making these.
* A UI for interacting with that API.
* Typeahead on the frontend.
* CSS to make them look pretty and see who's in them.
2017-10-31 15:16:14 -07:00
Steve Howell b3192d17ab refactor: Extract get_stream_subscriptions_for_user(). 2017-10-29 18:36:35 -07:00
Steve Howell 8ac26dfb9b refactor: Introduce bugdown.MentionData class.
We now have a MentionData class that encapsulates
the users who are possibly mentioned in a message.

Not that the rendering code may not keep all the mentions,
since things like backticks will suppress the mention.

We populate this now in do_send_messages, so that we can use
the info earlier in the message-sending process.  This info
now gets passed down the call stack as an optional parameter.

Note that bugdown.convert() still populates the data when its
callers decline to pass in a MentionData object.

This is mostly a preparatory commit, as we don't take advantage
of the data yet in do_send_messages.
2017-10-26 22:16:47 -07:00
Tim Abbott 8e2cdedf9a lint: Fix lines in Python codebase longer than 120 characters. 2017-10-26 17:47:30 -07:00
Steve Howell 635675fe48 Reduce queries needed for sending messages.
In do_send_messages, we only produce one dictionary for
the event queues, instead of different flavors for text
vs. html.  This prevents two unnecessary queries to the
database.

It also means we only put one dictionary on the "message"
event queue instead of two, albeit a wider one that has
some values that won't be sent to the actual clients.

This wider dictionary from MessageDict.wide_dict is also
used for the `feedback_messages` queue and service bot
queues.  Since the extra fields are possibly useful down
the road, and they'll just be ignored for now, we don't
bother to remove them.  Also, those queue processors won't
have access to `content_type`, which they shouldn't need.

Fixes #6947
2017-10-26 16:35:28 -07:00
Steve Howell 9b6a4d0b16 refactor: Extract MessageDict.finalize_payload(). 2017-10-26 16:35:28 -07:00
Steve Howell df93a99b50 Cache only one row per message.
Before this change, we populated two cache entries for each
message that we sent.  The entries were largely redundant,
with the only difference being whether we sent the content
as raw markdown or as the rendered HTML.

This commit makes it so we only have one cache entry per
message, and it includes both content and rendered_content.

One legacy source on confusion here is that `content`
changes meaning when you're on the front end.  Here is the
situation going forward:

    database:
        content = raw
        rendered_contented = rendered

    cache entry:
        content = raw
        rendered_contented = rendered

    payload for the frontend:
        content = raw (for apply_markdown=False)
        content = rendered (for apply_markdown=True)
2017-10-26 16:35:28 -07:00
Steve Howell 0e106a2488 Add client_gratavar support to GET /messages.
Clients fetching messages can now specify that they are able
to compute their avatar, and if they set client_gratavar to
True in the request (w/our normal encoding scheme), then the
backend will not compute it, and the payload will be smaller.

The fix starts with get_messages_backend.  The flag gets
passed down through these functions:

    * MessageDict.post_process_dicts.
    * MessageDict.set_sender_avatar.

We also fix up the callers for post_process_dicts to explicitly
pass in the client_gravatar path, but for now they all just hard
code the value to False.
2017-10-20 15:49:21 -07:00
Steve Howell 6fbaf7e80f Remove sender-related fields from message cache.
This change makes the cache entries smaller for message
dictionaries.  It also ensures we get valid data put into
message dictionaries if, for example, the sender's avatar
changes.

After this change, all of the attributes for a message
sender are only fetched during post-processing with two
exceptions:

    * We get sender_id for "free" from the message,
      and it's the primary key that we need to figure
      out which data to fetch in post-processing.

    * We need sender_realm_id to be able to cache topic
      links, and a sender's realm id will never change,
      so it's not a concern for invalidating cache rows.

All the other attributes are either likely to change (e.g.
sender avatar_version) and/or impact the size of cache
entries more severely than the two small id fields above.

This change should improve our overall system performance
by reducing the amount of memory used by every N message
rows we cache, and typically N will be in the thousands or
so on a large realm.

The other major implication of this change is that when
a user changes their avatar, and then later messages that
the user sent are fetched, all of the fields that go into
computing the avatar url will be pulled from the database,
not from cache.
2017-10-16 23:37:10 -07:00
Steve Howell d909355dc2 refactor: Move methods from models.py -> lib/messages.py.
Message.get_raw_db_rows is moved to MessageDict, since its
implementation details are highly coupled to other methods
in MessageDict.

And then sew_messages_and_reactions comes along for the
ride.

We eventually want to move Reaction.get_raw_db_rows to there
as well.
2017-10-16 23:37:10 -07:00
Steve Howell 4919eb4abd Extract MessageDict.set_sender_avatar().
We now populate the avatar url as part of the post
processing step of building message dictionaries,
so that the avatar url is no longer in cache.

This change makes the cache slimmer, because instead
of caching the avatar url (which often includes a long
hash), we just cache the smaller fields that are used
to compute the url.

Note that this commit still has the problem that we're
essentially computing the avatar url from cached fields
that can be invalid.  We will address that a few commits
later.

An immediate benefit of this change is that how we compute
avatar urls (or whether we compute them all) is now decoupled
from caching concerns.  We will address this later as
well.  (Some clients will be capable of computing their
own gravatar urls, for example.)
2017-10-16 23:37:10 -07:00
Steve Howell 3c6cc3d454 Defer deleting intermediate values in message dictionaries.
We're about to have multiple post-processing stages for building
message dictionaries.  Rather than having individual "hydration"
methods remove intermediate values, we just wait until the end.

This decouples the hyrdration steps.  The potentional problem
here is that we may have a field like sender_is_mirror_dummy
that isn't part of the final payload, but we need it for
calculating display recipients and avatars.  We don't want to
delete it too early from the objects.
2017-10-16 23:37:10 -07:00
Steve Howell b0e844c676 refactor: Use get_avatar_field in message.py.
This is part of deprecating avatar_url_from_dict and
eventually supporting the client_gravatar field in
message-related requests from clients.
2017-10-12 14:00:41 -07:00
Steve Howell fed972d1fb Fix bug with applying message events to unread counts.
The `is_mentioned` flag in message events was buggy.  We now
look directly at flags.

We will kill off `is_mentioned` in a subsequent commit.

We also remove some debugging code in the test that was failing
before this fix.  The test would only fail when `is_mentioned`
was wrong, which never happened when you ran a single test, and
which would happen randomly when you ran multiple tests.
2017-10-11 16:55:34 -07:00
Steve Howell 7c726a5e77 Remove sender names from the message cache.
This removes sender names from the message cache, since
they aren't guaranteed to be valid, and they're inexpensive
to add.

This commit will make the message cache entries smaller
by removing sender___full_name and sender__short_name
fields.

Then we add in the sender fields to the message payloads
by doing a query against the unique sender ids of the
messages we are processing.

This change leads to 2 extra database hops for most of
our message-related codepaths.  The reason there are 2 hops
instead of 1 is that we basically re-calculate way too
much data to get a no-markdown dictionary.
2017-10-11 11:37:16 -07:00
Steve Howell 3910448b1d Extract MessageDict.post_process_dicts().
Introduce MessageDict.post_process_dicts() will allow us
the ability to do the following:

    * use less memory in the cache for repeated data
    * prevent cache invalidation
    * format data according to different client needs

The first use of this function is pretty inconsequential, but
it sets us up for more consequential changes.

In this commit we defer the MessageDict.hydrate_recipient_info
step until after we pull data out of the cache.  This impacts
cache size as follows:

    * streams - negligibly bigger
    * PMs/huddles - slimmer due to not needing to repeat
                    sender data like email/full_name

Again, the main point of this change is to start setting up
the infrastructure to do post-processing.
2017-10-11 11:37:16 -07:00
Steve Howell 6bf43e6332 refactor: Extract MessageDict.hydrate_recipient_info().
This is a first step to eventually slimming the message cache,
but there are still some moving parts there to be worked through.

The more immediate benefit of extracting this function is that
we can put tests on it.  Also, it isolates some functionality
that may go away as our clients gets smarter.
2017-10-11 11:37:16 -07:00
Steve Howell d6e21b5ca9 Collect sender_ids (by topic) in `unread_msgs`.
This will allow the mobile app to say "A, B, and C are
talking" in the topic views.
2017-10-05 10:37:15 -07:00
Steve Howell e56084fcf7 Simplify how we apply events for unread messages.
The logic to apply events to page_params['unread_msgs'] was
complicated due to the aggregated data structures that we pass
down to the client.

Now we defer the aggregation logic until after we apply the
events.  This leads to some simplifications in that codepath,
as well as some performance enhancements.

The intermediate data structure has sets and dictionaries that
generally are keyed by message_id, so most message-related
updates are O(1) in nature.

Also, by waiting to compute the counts until the end, it's a
bit less messy to try to keep track of increments/decrements.
Instead, we just update the dictionaries and sets during the
event-apply phase.

This change also fixes some corner cases:

    * We now respect mutes when updating counts.
    * For message updates, instead of bluntly updating
      the whole topic bucket, we update individual
      message ids.

Unfortunately, this change doesn't seem to address the pesky
test that fails sporadically on Travis, related to mention
updates.  It will change the symptom, slightly, though.
2017-10-05 09:42:20 -07:00
Steve Howell f55b22e937 Add get_muted_stream_ids().
This function replaces get_muted_recipient_ids().  This will
set us up to apply events more easily.
2017-10-05 09:32:16 -07:00
Steve Howell 941b1c781c Refactor get_unread_message_ids_per_recipient().
We now have two helper functions:

    * get_raw_unread_data
    * aggregate_unread_data

Separating the concerns is nice.  The first function does
all the data collection.  The second function should be fast,
and it only re-organizes the data into an aggregated form
that makes the page_params payload smaller and easier for
clients to work with.

For the first function, we try to return data structures
that are easier to manipulate than the end result.  This
will allow us to apply events more easily, in a subsequent
commit.
2017-10-05 09:32:16 -07:00
rht f43e54d352 zerver/lib: Remove absolute_import. 2017-09-27 10:00:39 -07:00
neiljp (Neil Pilgrim) 133d679feb mypy: Avoid Message.is_status_message if rendered_content is None. 2017-09-25 16:02:56 -07:00
Steve Howell ba397b5109 Use user_ids, not full objects, in render path.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.

By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
2017-09-12 04:22:55 -07:00
Steve Howell 848c0803bd Exclude muted topics from unread count. 2017-09-07 07:06:03 -07:00
Steve Howell 0106add546 mypy: Use TypedDict for UnreadMessageResult. 2017-08-28 14:48:19 -07:00
Tim Abbott 133f005530 markdown: Remove is_me_message UserMessage flags.
This never made sense to be a flag on the UserMessage table, since
it's not per-user state.  And in fact it doesn't need to be in a
database at all, since it's easily computed from content anyway.

Fixes #1099.
2017-08-27 09:34:24 -07:00
Tim Abbott 1bb09e35d2 message: Add assertions for invalid recipient types. 2017-08-25 00:39:36 -07:00
Steve Howell ead40d8d08 Exclude muted streams from page_params.unread_msgs.count.
This adds one fairly cheap query, and gets the bankruptcy count
more in the ballpark of the home unread count.  (But we don't
account for topics yet.)
2017-08-23 17:39:22 -07:00
Tim Abbott 9081f2cf44 reactions: Store the emoji codepoint in the database.
This is the first part of a larger migration to convert Zulip's
reactions storage to something based on the codepoint, not the emoji
name that the user typed in, so that we don't need to worry about
changes in the names we're using breaking the emoji storage.
2017-08-15 09:29:27 -07:00
Aditya Bansal 0cb909b978 events: Fill in missing messages for a returing soft_deactivated user. 2017-08-15 08:33:16 -07:00
Steve Howell 658ac782a2 Add page_params.unread_msgs.count.
This field is convenient for bankruptcy checks.  Clients could
calculate it from page_params.unread_msgs before this change, but
it would kind of a painful calculation.

To add count, we had to simplify the mypy annotations, which weren't
really accurate before.
2017-08-14 12:38:09 -07:00
Steve Howell c0dec29f5f Exclude inactive streams from unread counts. 2017-08-14 12:38:09 -07:00
Steve Howell c7b9044ee5 Fix apply_unread_message_event() for mentions.
We were exiting this function in certain cases before updating
mentions. This bug was always there, but it was flaky in terms
of database setup whether the tests would fail, so now the
relevant test sends three consecutive messages.

We also avoid putting duplicate message ids in mentions.
2017-08-10 05:09:04 -04:00
Steve Howell 257e110996 unread: Only send clients 5000 most recent unread messages. 2017-08-02 09:40:47 -07:00
Tim Abbott 886db27de4 unread_msgs: Fix nondeterminstic ordering of unread_msgs IDs.
This should make this test flake impossible:
https://travis-ci.org/zulip/zulip/jobs/259439642#L1990
2017-07-31 10:43:43 -07:00
Steve Howell e6e3bbb780 Add a "mentions" section to unread message ids. 2017-07-27 16:14:26 -07:00
Jason Michalski 4f0110e081 Add unread_msgs to the initial state data.
We are adding a new list of unread message ids grouped by
conversation to the queue registration result. This will allow
clients to show accurate unread badges without needing to load an
unbound number of historic messages.

Jason started this commit, and then Steve Howell finished it.

We only identify conversations using stream_id/user_id info;
we may need a subsequent version that includes things like
stream names and user emails/names for API clients that don't
have data structures to map ids -> attributes.
2017-07-27 16:14:25 -07:00
Vaida Plankyte 5aee5b395a message.py: Use the singular 'they' pronoun. 2017-07-05 09:27:44 -07:00
Harshit Bansal 7be2e17827 message.py: Use dict's subscript syntax in `ReactionDict`.
Instead of using dict's `get()` method use the subscript syntax
so that we can assert correctly that the reaction row contains
all the fields and if not raise the `KeyError` instead of silently
returning None.
2017-06-09 16:38:58 -07:00
Elliott Jin 8b98b79646 bots: Generate queue events for embedded bots. 2017-05-25 15:00:51 -07:00
Christian Hudon 1761a3b1c1 mypy: strict optional fixes. 2017-05-24 18:50:59 -07:00
Aditya Bansal 84eadc0562 pep8: Add compliance with rule E261 to zerver/lib/message.py. 2017-05-18 03:00:32 +05:30
Tim Abbott 9f7236eec1 message: Remove unused old gravatar_hash field from message dicts.
This was deprecated and replaced some 4 years ago.
2017-05-09 22:33:27 -07:00
Tim Abbott e3505bd5ae avatar: Fix memcached query loop fetching messages.
This fixes a major performance issue, where we would fetch
user_profile objects inside a code path that had already bulk-fetched
the necessary user objects.

Like the similar related changes we just made, the fix is to marshall
and pass the data into the avatar library directly.
2017-05-09 22:33:27 -07:00
vaibhav 8881b5eb9f Outgoing Webhook System: Check for @-mentioned outgoing webhook bots.
Also puts them into a processing queue, though the queue processor
does nothing.

Rewritten by tabbott to avoid unnecessary database queries in
do_send_messages.
2017-05-02 09:22:04 -07:00
Rishi Gupta b416587aab Change sender_domain to sender_realm_str in message dict. 2017-03-25 19:50:24 -07:00
Rishi Gupta 88abb7871d Remove domain from list of pre-fetched fields for message recipients. 2017-03-25 19:50:24 -07:00
Susan Salituro a2689d6952 message.py: Delete unused function. 2017-03-18 16:08:36 -07:00
Raghav Jajodia a3a03bd6a5 mypy: Added Dict, List and Set imports.
Fixed mypy errors associated with the upgrade.
2017-03-04 14:33:44 -08:00
Steve Howell ad24133b94 Have functions in lib/avatar.py use avatar versions.
In some cases here we simplify things by calling avatar_url()
instead of get_avatar_url(), when we have a user_profile record
handy.  For other cases we pass in an extra avatar_version
parameter to get_avatar_url(), including from avatar_url().
2017-02-17 10:19:56 -08:00
Steve Howell 65a4eb8ec8 Add sender_avatar_version to message caches.
We will use this in computing avatar URLs.
2017-02-17 10:19:56 -08:00
Tim Abbott e746868375 mypy: Fix optional typing usage in rendering code path. 2017-02-10 23:53:44 -08:00
Steve Howell 709493cd75 Pass in sent_by_bot flag to bugdown parser.
We will use this flag to suppress certain url previews
for bots.
2017-02-03 17:07:38 -08:00
Tim Abbott 4e171ce787 lint: Clean up E126 PEP-8 rule. 2017-01-23 22:06:13 -08:00
Tim Abbott fe4f7b1170 lint: Clean up E711 PEP-8 rule. 2017-01-23 21:11:49 -08:00
Tim Abbott 5d52f1ec17 bugdown: Move realm_filters_key logic out of callers.
This gets rid of the confusing duplicate realm_filters_key and
message_realm arguments that previously were passed to bugdown.
2017-01-21 21:37:57 -08:00
Sampriti Panda 34a4a1378d bugdown: Use specified realm, not sender realm, for rendering.
This changes bugdown to use the realm passed in by the caller (if any)
for rendering, fixing a problem where bots such as the notification
bot would have their messages rendering using the admin realm's
settings, not the settings of the realm their messages are being sent
into.

Also adds a test for the notification bot case.

Fixes #3215.
2017-01-21 21:37:57 -08:00
Tim Abbott bc138f72f4 render_markdown: Refactor realm_filters_key logic.
This moves the realm_filter_key variable, primarily used for clarity,
up from Bugdown into the render_markdown function.

We'll need this for the upcoming commits.
2017-01-21 21:37:57 -08:00
Tim Abbott 19b89eb050 bugdown: Rename realm_id to realm_filters_key.
This should substantially improve the clarity of the code, since
inside bugdown, this is only being used as a hash key that happens to
usually be a realm ID, not used as a Realm ID.
2017-01-16 21:48:55 -08:00
Steve Howell 99b5c00ec1 Add stream id to message dictionary.
We now pass the stream id in the messages we send to
our clients.
2017-01-05 15:32:45 -08:00
Bojidar Marinov 5fc65efd69 messages: Allow rendering message content without having an actual message.
This is useful for doing rendering in the emoji search code path.
2017-01-05 15:16:43 -08:00
Rishi Gupta cf762eaf84 Change X.realm.id to X.realm_id across codebase.
This makes it more clearly the pattern in the Zulip codebase, and thus
decreases the risk of accidentally doing database queries.
2017-01-03 16:46:26 -08:00
Rishi Gupta b206d6f251 message.py: Change domain to realm_id in render_markdown args. 2017-01-03 16:46:14 -08:00
Rishi Gupta c6e12e74be Change domain to realm_id in bugdown and realm filter dicts and caches. 2017-01-03 16:25:20 -08:00
Rishi Gupta be6f54d7bb messages.py: Add sender.realm.id to MessageDict. 2017-01-03 16:25:20 -08:00
Robert Hönig ef3069a5d3 mypy: Convert the isinstances function in /zerver/lib/ to use typing.Text. 2016-12-25 10:33:45 -08:00
Robert Hönig 0917493588 mypy: Convert zerver/lib to use typing.Text. 2016-12-25 10:33:45 -08:00
Tim Abbott 0b33be50f3 lint: fix some whitespace issues in new reactions code. 2016-12-14 20:37:13 -08:00
Kracekumar R 61d2297c17 Add reactions in the /json/messages endpoint. 2016-12-14 19:21:04 -08:00
Igor Tokarev c93f1d4eda Add oembed/Open Graph/Meta tags data retrieval from inline links.
This change adds support for displaying inline open graph previews for
links posted into Zulip.

It is designed to interact correctly with message editing.

This adds the new settings.INLINE_URL_EMBED_PREVIEW setting to control
whether this feature is enabled.

By default, this setting is currently disabled, so that we can burn it
in for a bit before it impacts users more broadly.

Eventually, we may want to make this manageable via a (set of?)
per-realm settings.  E.g. I can imagine a realm wanting to be able to
enable/disable it for certain URLs.
2016-12-07 17:40:18 -08:00
Sidhant Bhavnani 8c0c12c1d9 pep8: Fix E303 violations. 2016-12-02 15:34:11 -08:00
Tim Abbott f1a399a4e1 message: Create new access_message library function.
With reactions and other upcoming features, we'll be adding several
places where we need to check whether a particular user can access a
particular message.  It's best to just have a single helper function
for this purpose that we can use everywhere.
2016-10-11 17:17:19 -07:00
Steve Howell b2f84f0fa4 Move render_markdown() to lib/message.py.
This removes a bugdown circular dependency.
2016-10-04 11:34:53 -07:00
Steve Howell 7fb992dba3 Simplify and "fix" render_old_messages management command.
The command to render old messages now looks for all messages
not matching the bugdown version, and it no longer directly calls
into model code.  We should still be extremely cautious about
using this code.
2016-10-04 11:31:20 -07:00
Steve Howell 6b71f5bd5f Inline most calls to set_rendered_content().
This is part of breaking the circular dependency on
bugdown in models.py.  A subsequent commit will fully
kill off set_rendered_content().
2016-10-04 11:31:20 -07:00
Steve Howell 583a6bbadd Extract zerver/lib/message.py.
This pulls message-related code from models.py into a new
module called message.py, and it starts to break some bugdown
dependencies.  All the methods here are basically related to
serializing Message objects as dictionaries for caches and
events.

    extract_message_dict
    stringify_message_dict
    message_to_dict
    message_to_dict_json
    MessageDict.to_dict_uncached
    MessageDict.to_dict_uncached_helper
    MessageDict.build_dict_from_raw_db_row
    MessageDict.build_message_dict

This fix also removes a circular dependency related
to get_avatar_url.

Also, there was kind of a latent bug in Message.need_to_render_content
where it was depending on other calls to Message to import bugdown
and set it globally in the namespace.  We really need to just
eliminate the function, since it's so small and used by code that
may be doing very sketchy things, but for now I just fix it.  (The
bug would possibly be exposed by moving build_message_dict out to the
library.)
2016-10-04 11:31:20 -07:00