Commit Graph

2472 Commits

Author SHA1 Message Date
derAnfaenger ad0407578d queue processors: Test flow through UserPresenceWorker.consume(). 2017-10-16 23:20:13 -07:00
derAnfaenger af699500b7 tests: Add option to call queue processor consumer.
This makes tests of queue processors more realistic,
by adding a parameter to `queue_json_publish` that
calls a queue's consumer function if accessed in a test.

Fixes part of #6542.
2017-10-16 23:20:13 -07:00
Tim Abbott b5c107ed27 push_notifications: Remove unnecessary check for no devices.
This should have been checked by the caller anyway.
2017-10-13 17:30:20 -07:00
Tim Abbott 27a450b58d push_notifications: Improve error message for GCM sending issues.
This addresses one of the sources of confusion in #6993.
2017-10-13 17:30:11 -07:00
Harshit Bansal 7d5bcf5534 notifications: Use lxml instead of hacky regexes to scrub inline images. 2017-10-13 16:13:58 +00:00
Harshit Bansal 8d42f42ef2 notifications: Correctly convert relative narrow links to absolute URLs. 2017-10-13 15:44:47 +00:00
Tim Abbott 1d314c3bf4 bugdown: Remove ERROR_BOT markdown rendering notices.
Nobody has used this feature in years, and it causes certain types of
markdown issues in development to completely DoS the development
environment by making it possible for the "Bugdown timeout" exception
handler to timeout in bugdown.

Since we already send an email to the server administrators, there's
no need to replace this feature with anything.
2017-10-12 17:45:33 -07:00
Steve Howell e0bc1b114e Make sure mentions refer only to active users.
An active user can share the same full name as a deactivated
user.  We now only allow mention syntax to find users who are
activated.

Fixed #6978
2017-10-12 17:11:36 -07:00
Steve Howell a9d25f8719 refactor: Simplify avatar_url.
This function is now a thin wrapper around get_avatar_field.
2017-10-12 14:00:41 -07:00
Steve Howell b0e844c676 refactor: Use get_avatar_field in message.py.
This is part of deprecating avatar_url_from_dict and
eventually supporting the client_gravatar field in
message-related requests from clients.
2017-10-12 14:00:41 -07:00
Steve Howell 1fc6a5febc Add get_avatar_field() function.
This function is designed to replace avatar_url() and
avatar_url_from_dict() over time.

There are a few things new about it:

    * We make the parameters more explicit, rather than
      passing in an opaque dictionary or requiring a
      UserProfile object.  (A lot of our callers want
      to use `values()` for efficiency sake, since we
      are often doing bulk user operations.)

    * We start to support the client_gravatar option.
2017-10-12 14:00:41 -07:00
Tim Abbott 5435fbf6c6 html_diff: Add missing mypy import.
It's getting really annoying that this isn't checked by our linter.
2017-10-12 00:13:58 -07:00
Tim Abbott 339e206c90 highlight_html_differences: Improve logging output.
Now at least it will give the message ID, and thus be possible to
debug.
2017-10-11 23:38:29 -07:00
Robert Hönig e749deb136 onboarding: Add welcome-bot response to initial user message.
Fixes #6030.
2017-10-11 20:45:42 -07:00
derAnfaenger 5ddc336844 tests: Add welcome bot as user. 2017-10-11 20:45:42 -07:00
Tim Abbott 298c59f7fd push_notifications: Fix error message for unregistered bouncer.
Previously, we were just returning a JSON error to the client, when it
was a server problem.

Fixes #6639.
2017-10-11 19:09:24 -07:00
Tim Abbott b3b5d5b7cd report: Avoid sending raw message content in error reporting.
This fixes a violation of Zulip's privacy policies (that error
reporting never contain message content) in the previous commit.
2017-10-11 17:44:05 -07:00
Steve Howell fed972d1fb Fix bug with applying message events to unread counts.
The `is_mentioned` flag in message events was buggy.  We now
look directly at flags.

We will kill off `is_mentioned` in a subsequent commit.

We also remove some debugging code in the test that was failing
before this fix.  The test would only fail when `is_mentioned`
was wrong, which never happened when you ran a single test, and
which would happen randomly when you ran multiple tests.
2017-10-11 16:55:34 -07:00
Steve Howell a6ad9a6d7c Add is_zephyr to the Stream model.
Add this field to the Stream model will prevent us from having
to look at realm data for several types of stream operations, which
can be prone to either doing extra database lookups or making
our cached data bloated.

Going forward, we'll set stream.is_zephyr to True whenever the
realm's string id is "zephyr".
2017-10-11 16:15:56 -07:00
derAnfaenger 61aebd036f tools: Remove `.py` extensions from user scripts. 2017-10-11 12:52:36 -07:00
Steve Howell 7c726a5e77 Remove sender names from the message cache.
This removes sender names from the message cache, since
they aren't guaranteed to be valid, and they're inexpensive
to add.

This commit will make the message cache entries smaller
by removing sender___full_name and sender__short_name
fields.

Then we add in the sender fields to the message payloads
by doing a query against the unique sender ids of the
messages we are processing.

This change leads to 2 extra database hops for most of
our message-related codepaths.  The reason there are 2 hops
instead of 1 is that we basically re-calculate way too
much data to get a no-markdown dictionary.
2017-10-11 11:37:16 -07:00
Steve Howell 3910448b1d Extract MessageDict.post_process_dicts().
Introduce MessageDict.post_process_dicts() will allow us
the ability to do the following:

    * use less memory in the cache for repeated data
    * prevent cache invalidation
    * format data according to different client needs

The first use of this function is pretty inconsequential, but
it sets us up for more consequential changes.

In this commit we defer the MessageDict.hydrate_recipient_info
step until after we pull data out of the cache.  This impacts
cache size as follows:

    * streams - negligibly bigger
    * PMs/huddles - slimmer due to not needing to repeat
                    sender data like email/full_name

Again, the main point of this change is to start setting up
the infrastructure to do post-processing.
2017-10-11 11:37:16 -07:00
Steve Howell 6bf43e6332 refactor: Extract MessageDict.hydrate_recipient_info().
This is a first step to eventually slimming the message cache,
but there are still some moving parts there to be worked through.

The more immediate benefit of extracting this function is that
we can put tests on it.  Also, it isolates some functionality
that may go away as our clients gets smarter.
2017-10-11 11:37:16 -07:00
Eeshan Garg 0ca1224b3e integrations: Render xkcd bot's documentation. 2017-10-09 11:40:44 -07:00
Eeshan Garg 71eee35bce webhooks: Add a Google Code-in integration. 2017-10-09 09:04:39 -07:00
Steve Howell c1d7fc6e80 Only require stream_id in private_stream_user_ids(). 2017-10-08 20:18:34 -07:00
Steve Howell 7dbea8a2bf Only require stream_id in subscribed_to_stream().
Since subscribed_to_stream is only doing an id lookup
on the Stream model to find out if a user is subscribed to
a stream, there's no reason to require a full Stream object.

It's currently the case that all callers do have full Stream
objects handy to pass in to this function, but it's still a
good practice to have functions only ask for objects that they
need.
2017-10-08 20:18:34 -07:00
Tim Abbott ec080aed6b mypy: Workaround lxml annotations being busted. 2017-10-08 12:38:20 -07:00
Tim Abbott d215ea1e37 actions: Rename all_subs_by_stream to all_subscribers_by_stream.
The previous name sounded a bit too much like they were subcription
objects.
2017-10-08 12:33:53 -07:00
Steve Howell 3e6bfe1b23 Use user_ids, not emails, for bulk stream operations.
We now return user_ids for subscribers to streams in add-stream
events.  This allows us to eliminate the UserLite class for
both bulk adds and bulk removes.  It also simplifies some JS
code that already wanted to use user_ids, not emails.

Fixes #6898
2017-10-08 12:31:12 -07:00
Harshit Bansal 3c434f0d86 notifications: Switch to use `make_links_absolute()` from lxml library.
Instead of using custom regexes for converting relative URLs to
absolute URLs switch to using `make_links_absolute()` function
from lxml library.
2017-10-08 12:15:30 -07:00
Steve Howell 10a30bece1 Rename presence_idle_userids -> presence_idle_user_ids. 2017-10-07 12:16:45 -07:00
Steve Howell fbaef43ac3 Rename bot_owner_userids -> bot_owner_user_ids. 2017-10-07 12:16:45 -07:00
Greg Price aa4104a5af logging: Add option to show the PID in each log message. 2017-10-06 19:21:40 -07:00
Harshit Bansal 5a6584890d push_notifications: Start using `get_mobile_push_content()` function. 2017-10-06 16:47:25 -07:00
Harshit Bansal 28628eeaeb push_notifications: Add `truncate_content()` function.
This function truncates the textual content at correct length.
(It will be updated later to handle corner cases of unicode
combining characters and tags when we start supporting them.)
2017-10-06 16:44:19 -07:00
Harshit Bansal b5a1aacfb3 push_notifications: Add `get_mobile_push_content()` function.
Given the rendered content of a message, this function strips
all the markup replacing emojis with their corresponding unicode
representation.
2017-10-06 16:44:18 -07:00
Steve Howell 9202777d7f tests: Provide more useful output in assert_length(). 2017-10-06 14:30:30 -07:00
Steve Howell a331b4f64d Optimize query_all_subs_by_stream().
Using lightweight objects will speed up adding new users
to realms.

We also sort the query results, which lets us itertools.groupby
to more efficiently build the data structure.

Profiling on a large data set shows about a 25x speedup for this
function, and before the optimization, this function accounts
for most of the time spend in bulk_add_subscriptions.

There's a lot less memory to allocate.  I didn't measure
the memory difference.

When we test-deployed this to chat.zulip.org, we got about a 6x
speedup.
2017-10-06 11:03:44 -07:00
Steve Howell f5ddc40d14 Have get_peer_user_ids_for_stream_change() use user_ids. 2017-10-06 11:03:44 -07:00
Rishi Gupta 0596c4a810 analytics: Enforce various datetime arguments are in UTC.
Sort of a hacky hammer, but
* The original design of the analytics system mistakenly attempted to play
  nicely with non-UTC datetimes.
* Timezone errors are really hard to find and debug, and don't jump out that
  easily when reading code.

I don't know of any outstanding errors, but putting a few "assert this
timezone is in UTC" around will hopefully reduce the chance that there are
any current or future timezone errors.

Note that none of these functions are called outside of the analytics code
(and tests). This commit also doesn't change any current behavior, assuming
a database where all datetimes have been being stored in UTC.
2017-10-05 11:22:06 -07:00
Rishi Gupta 0c2b4d22a7 analytics: Convert datetimes coming from the API into UTC.
Previously, entering a non-UTC end time for a daily stat would give you
incorrect results. This is because:
* All daily stats are collected at and have end_times in the database in
  midnight UTC.
* For daily stats, time_range returns a list of datetimes at midnight in the
  timezone of its end argument. These datetimes are the only ones we look
  for when looking for rows corresponding to the stat in the database.
* Previously, we passed on the end argument from the API to time_range,
  without modification.
2017-10-05 11:22:06 -07:00
Rishi Gupta 70f6c47edc analytics: Extract verify_utc into its own function.
No functional changes, other than making the error message more generic.
2017-10-05 11:22:06 -07:00
Steve Howell d6e21b5ca9 Collect sender_ids (by topic) in `unread_msgs`.
This will allow the mobile app to say "A, B, and C are
talking" in the topic views.
2017-10-05 10:37:15 -07:00
Steve Howell e56084fcf7 Simplify how we apply events for unread messages.
The logic to apply events to page_params['unread_msgs'] was
complicated due to the aggregated data structures that we pass
down to the client.

Now we defer the aggregation logic until after we apply the
events.  This leads to some simplifications in that codepath,
as well as some performance enhancements.

The intermediate data structure has sets and dictionaries that
generally are keyed by message_id, so most message-related
updates are O(1) in nature.

Also, by waiting to compute the counts until the end, it's a
bit less messy to try to keep track of increments/decrements.
Instead, we just update the dictionaries and sets during the
event-apply phase.

This change also fixes some corner cases:

    * We now respect mutes when updating counts.
    * For message updates, instead of bluntly updating
      the whole topic bucket, we update individual
      message ids.

Unfortunately, this change doesn't seem to address the pesky
test that fails sporadically on Travis, related to mention
updates.  It will change the symptom, slightly, though.
2017-10-05 09:42:20 -07:00
Steve Howell c567f105c9 Have topic_is_muted take a stream_id.
This function doesn't need a full Stream object to detect
whether a stream is muted, so we can save future callers
from doing unnecessary DB fetches.
2017-10-05 09:32:16 -07:00
Steve Howell f55b22e937 Add get_muted_stream_ids().
This function replaces get_muted_recipient_ids().  This will
set us up to apply events more easily.
2017-10-05 09:32:16 -07:00
Steve Howell 941b1c781c Refactor get_unread_message_ids_per_recipient().
We now have two helper functions:

    * get_raw_unread_data
    * aggregate_unread_data

Separating the concerns is nice.  The first function does
all the data collection.  The second function should be fast,
and it only re-organizes the data into an aggregated form
that makes the page_params payload smaller and easier for
clients to work with.

For the first function, we try to return data structures
that are easier to manipulate than the end result.  This
will allow us to apply events more easily, in a subsequent
commit.
2017-10-05 09:32:16 -07:00
Harshit Bansal ef35e6ac3f reactions: Switch to using `name_to_codepoint`.
Instead of using `unified_reactions` mapping start using
`name_to_codepoint` mapping for converting emoji name to
codepoints. We were using `unified_reactions` mapping
because prior to emoji web PR `name_to_codepoint` mapping
was generated using emoji_map.json which contained old
codepoints but for reactions new codepoints were required
to display them using sprite sheets.
2017-10-04 23:09:14 -07:00
Tim Abbott ce7ab0474d error_notify: Add IP address to browser error reports.
This should make debugging a bit more convenient when we want to know
which of a user's clients was involved.
2017-10-04 13:46:05 -07:00