34022 Commits

Author SHA1 Message Date
rht 771f6d213f prod install: Rename os_codename into os_version_id 2020-01-07 13:25:25 -08:00
rht dccfb0ebe9 puppet: Remove duplicate postgresql-client safepackage check on CentOS. 2020-01-07 13:25:25 -08:00
Hashir Sarwar 0cabacb8ab export: Fix data export parallelization.
This improves the approach of creating multiple parallel processes by
using subprocess.Popen() instead of run_parallel() and
subprocess.call() while exporting an organization's message
history.  This prevents forking twice for each individual subprocess.

While this has some performance benefit, the main reason to fix this
is that it fixes an issue with the data export web UI introduced in
run_parallel forks exited).

Fixes #12904.
2020-01-07 13:23:18 -08:00
Mateusz Mandera b87cf22b33 email_mirror: Move send_to_mm_address code to process_missed_message.
process_missed_message did nothing other than calling
send_to_missed_message_address with the same arguments, so there's no
reason to have these as separate functions.
2020-01-07 13:03:32 -08:00
Mateusz Mandera c011d2c6d3 email_mirror: Migrate missed message addresses from redis to database.
Addresses point 1 of #13533.

MissedMessageEmailAddress objects get tied to the specific message that was
missed by the user. A useful benefit of that is that an email message sent
to that address will handle topic changes - if the message that was
missed gets its topic changed, the email response will get posted under
the new topic, while in the old model it would get posted under the
old topic, which could potentially be confusing.

Migrating redis data to this new model is a bit tricky, so the migration
code has comments explaining some of the compromises made there, and
test_migrations.py tests handling of the various possible cases that
could arise.
2020-01-07 13:03:22 -08:00
Mateusz Mandera 9077bbfefd models: Add MissedMessageEmailAddress class.
Preparatory commit for making the email mirror use the database instead
of redis for missed message addresses.

This model will represent missed message email addresses, which
currently have their data stored in redis.
The redis data will be converted and migrated into these models and
the email mirror will start using them in the main commit.
2020-01-07 12:46:55 -08:00
Steve Howell 630aadb7e0 bot_owner_id: Explicitly set bot_owner_id to None.
For cross realm bots, explicitly set bot_owner_id
to None.  This makes it clear that the cross realm
bots have no owner, whereas before it could be
misdiagnosed as the server forgetting to set the
field.
2020-01-07 12:33:14 -08:00
showell 96a50422f7 minor: Avoid recip.user_id defensive fallback.
The recip.id || recip.user_id idiom has only been
needed for some old unit tests.

It was previously required as a bad workaround for the
local echo issue fixed in dd1a6a97bd 
where we would get `display_recipient` values added in an invalid format.
2020-01-06 12:30:00 -08:00
Tim Abbott 185b52e5e7 slack import: Clarify confusion around xoxe- tokens. 2020-01-06 11:20:29 -08:00
Steve Howell 94761b806c node tests: Restore 100% coverage to pm_list. 2020-01-06 10:21:23 -08:00
Steve Howell c22c796f1d refactor: Extract is_all_privates().
I want to be able to easily test this without
having to simulate all the jQuery side effects.

This simply preserves the old logic, which seems
to handle one edge case without handling every
possible edge case.  The edge cases aren't super
important here, though, since the only thing it affects
is bolding "Private Messages", and when to do that
is somewhat up to personal tastes.

Having said that, we could definitely improve
this code and possibly should move some of this
logic to either narrow_state.js or filter.js.
2020-01-06 10:21:23 -08:00
Steve Howell 5b168d0530 pm_list: Set active-sub-filter in template.
Instead of doing various ad-hoc calculations of
which PM is "active" and plumbing it through various
functions and then updating it via jQuery instead of
just the template, we now just calculate `is_active`
in `_build_private_messages_list` with a little
helper function.
2020-01-06 10:21:23 -08:00
Steve Howell da1392efd2 node test: Remove complicated pm_list test.
This test mostly tests logic that I'm about
to remove in subsequent commits, and it's a bit
messy.

This commit removes 100% line coverage, but I
will restore that a few commits later.
2020-01-06 10:21:23 -08:00
Steve Howell 066a02a987 pm_list: Remove obsolete active_conversation parameter.
In 3cfc3ca24b I removed
the feature that limited PM conversations to five or
less (including the active conversation), but I
didn't clean up this parameter.  I think lint was
confused by the fact that we did mutate it.

I am wondering if this started out as an experiment
and was never fully polished before the push?  Or
maybe I was just careless.  Anyway, I don't
think there were any symptoms here--it was just dead code
that we didn't need.
2020-01-06 10:21:23 -08:00
Anders Kaseorg a78f8647d8 install: Run generate_secrets.py before zulip-puppet-apply.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-05 22:48:08 -08:00
Anders Kaseorg 1f31d6d32c dependencies: Upgrade vnu-jar.
This version includes my fix for the ‘Attribute “placeholder”’ test
flake (https://github.com/validator/validator/pull/884).

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-05 22:44:41 -08:00
Tim Abbott 9f72e5fc87 int_dict: Move filter_values helper to dict.
This fixes a rebase issue between the int_dict introduction and use
for people.js with the introduction of filter_values on dict.js and use
inside people.js.
2020-01-05 13:18:34 -08:00
Steve Howell 493afcb9f0 zjsquery: Add data support.
Before this we just noop'ed it, since at one time
we were trying to deprecate this in favor
of attr calls.
2020-01-05 12:28:37 -08:00
Steve Howell 9ba1829243 streams: Use IntDict for stream/topic unread counts.
Note that we haven't fully swept this for Dict,
since some dicts are keyed by strings.  For
example PM counts can have a huddle like
"101,102,103" as a key.
2020-01-05 12:28:34 -08:00
Steve Howell 579bad4829 refactor: Use Set for default_stream_ids. 2020-01-05 12:28:33 -08:00
Steve Howell 9f7be51ce8 streams: Replace Dict with IntDict in stream_data.
There's another Dict that we'll convert to a Set
in a subsequent commit.
2020-01-05 12:28:28 -08:00
Steve Howell 73d0350a24 people: Use ints in is_my_user_id().
This should be slightly more performant, and we
often call this function N times, such as when
rendering the buddy list.

There's a minor change to pm_list to avoid
an unnecessary computation on huddles that would
otherwise trigger a blueslip warning for the
huddles case.
2020-01-05 12:28:23 -08:00
Steve Howell 552f07428d people: Simplify people.get_recipient_count.
Once we get past the special check for fake
person objects already having `pm_recipient_count`,
we can rely on the object being a `person`
object with `user_id` set.
2020-01-05 12:27:45 -08:00
Steve Howell bc5589c2a7 people: Clean up recip.id code.
When we are pulling data from message.display_recipient
for private messages, the user_id field is always
called 'id', not 'user_id', so we can simplify
some defensive code.
2020-01-05 12:27:30 -08:00
Steve Howell 7630b859c3 js: Use IntDict in people.js.
This required lots of manual testing:

    - search/navigate user presence
    - send PM and mention user
    - pay attention to compose fade
    - send stream msg and mention user
    - open Private Messages in top-left and click
    - test unread counts
    - invite user who already has account
    - search for users in search bar
    - check user settings
        - User Groups
        - Users
        - Deactivated Users
        - Bots
    - create a bot
    - mention user groups
    - send group PM then click on lower right
    - view/edit/create streams

If there are still pieces of code that don't convert
ids to ints, the code should still work but report
blueslip errors.

I try to mostly convert user_ids to ints in the callers,
since often the callers are dealing with small amounts
of data, like user ids from huddles.
2020-01-05 12:27:28 -08:00
Steve Howell 4e59937632 js: Add IntDict class.
We don't use this yet, but we will soon.

We report errors if users pass in strings instead of
ints, but we try to still use the key.
2020-01-05 12:27:26 -08:00
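
The behavior described in 4e59937632 above (report an error for string keys, but still try to use the key) can be sketched roughly as follows. This is a minimal illustration, not the actual Zulip IntDict: the class body, the `report_error` callback, and the `parseInt` fallback are all assumptions made for the example.

```javascript
// Hypothetical sketch of an int-keyed dictionary; not Zulip's real IntDict.
class IntDict {
    constructor(report_error = console.error) {
        this._map = new Map();
        // Assumed error hook, standing in for something like blueslip.error.
        this._report_error = report_error;
    }

    // Normalize a key to a number, complaining if the caller passed a string.
    _convert(key) {
        if (typeof key === "number") {
            return key;
        }
        // Report the misuse, but still try to use the key.
        this._report_error("IntDict key is not a number: " + JSON.stringify(key));
        return parseInt(key, 10);
    }

    get(key) {
        return this._map.get(this._convert(key));
    }

    set(key, value) {
        this._map.set(this._convert(key), value);
        return value;
    }

    has(key) {
        return this._map.has(this._convert(key));
    }

    delete(key) {
        return this._map.delete(this._convert(key));
    }
}

// Usage: a string key still finds the entry, but one error is recorded.
const reported = [];
const user_names = new IntDict((msg) => reported.push(msg));
user_names.set(101, "alice");
const found = user_names.get("101");
```

Converting ids to ints in the callers, as 7630b859c3 describes, keeps the error hook quiet in normal operation.
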
Steve Howell 26168eaa98 search: Optimize search bar suggestions for large realms.
We only ever show 3 or 4 people in search suggestions
(possibly w/a couple variations, like pm-with/sender/etc.),
so we can try to search a smaller subset of people
before going through the entire realm.

We use message_store.user_ids() for this, since you
typically want to search messages for people that
have sent messages recently, and we already sort
based on PM conversations.
2020-01-04 12:58:00 -08:00
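
The idea in 26168eaa98 above, searching a small set of likely users before the whole realm, might look something like the sketch below. The function name `find_matching_people`, its parameters, and the `Map` of people are hypothetical; only the two-pass strategy comes from the commit message.

```javascript
// Hypothetical helper; not the actual Zulip search code.
// recent_user_ids plays the role of message_store.user_ids().
function find_matching_people(matcher, recent_user_ids, people_by_id, limit) {
    const results = [];
    const recent = new Set(recent_user_ids);

    // Pass 1: users who appear in recently loaded messages (a small set).
    for (const user_id of recent) {
        const person = people_by_id.get(user_id);
        if (person !== undefined && matcher(person)) {
            results.push(person);
            if (results.length >= limit) {
                return results;
            }
        }
    }

    // Pass 2: scan the rest of the realm only if we still need more results.
    for (const [user_id, person] of people_by_id) {
        if (!recent.has(user_id) && matcher(person)) {
            results.push(person);
            if (results.length >= limit) {
                return results;
            }
        }
    }

    return results;
}
```

Because only a handful of suggestions are ever shown, the first pass will often fill the limit and the full scan is skipped.
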
Steve Howell 7016292558 search: Track user_ids in message_store.
We'll use this for search.
2020-01-04 12:57:58 -08:00
Steve Howell a5bf6984bc search: Extract make_people_getter().
This helper lets us reduce the number of people
queries down from 4 to either 0 or 1.
2020-01-04 12:55:40 -08:00
Steve Howell d87c5d7b1f search: Use people.filter_all_persons() in search.
This should avoid some memory allocations.

We also use build_person_matcher to avoid
repeating the same logic over and over
again to process the query into termlets.

We also remove people.get_all_persons() and
people.person_matches_query().
2020-01-04 12:53:32 -08:00
Steve Howell d91a0ab9c7 typeahead: Remove diacritics on full names, not pieces.
This may actually be a slowdown for the worst case
scenario, but it sets us up to be able to easily
short circuit the removal of diacritic characters
for users that have pure ascii names.

For example, czo has lots of names like this:

    - Tim Abbott
    - Steve Howell

Since they're pure ascii, we can do a one-time
check.  A subsequent commit will show how we use
this.
2020-01-03 17:46:59 -08:00
Steve Howell 7d7028b7d0 performance: Speed up PM lookaheads.
This looks like simple code cleanup, but it's more
than that.

The code cleanup here is that we don't have three
callbacks to get a list of typeaheads for bootstrap.
Instead, we just have one function that does all the
main work.

And then the speedup comes from the fact we no longer
need to remove diacritics from the query for every
time through our loop of seeing if a person matches
the query.

It's a bit subtle to see in the diff, but these are
the relevant lines:

    const matcher = exports.get_person_or_user_group_matcher(query);
    const filtered_results = _.filter(people_and_groups, matcher);

Before this, bootstrap was doing $.grep, and we'd have
to reinitialize the matcher for every person.

If you profile this before and after, you'll see that
remove_diacritics gets called fewer times.

To profile this, you want to load lots of users into
your DB and try to autocomplete "Extra", as in "Extra1 User".

If you try to autocomplete something else, then my patch
won't really help, and `remove_diacritics` will still
show up as expensive.  Because it is that expensive a function.
2020-01-03 17:42:29 -08:00
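
To make the speedup in 7d7028b7d0 above concrete, here is a small sketch of the matcher-factory pattern: the query is normalized once when the matcher is built, not once per person. The `remove_diacritics` below is a simplified stand-in, `get_person_matcher` is a hypothetical name, and `Array.prototype.filter` replaces `_.filter` to keep the example self-contained.

```javascript
// Simplified stand-in for remove_diacritics; the counter makes the cost visible.
let diacritics_calls = 0;
function remove_diacritics(s) {
    diacritics_calls += 1;
    // Decompose accented characters, then drop the combining marks.
    return s.normalize("NFKD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical matcher factory: do the per-query work once, return a predicate.
function get_person_matcher(query) {
    const clean_query = remove_diacritics(query).toLowerCase();
    return function (person) {
        const name = remove_diacritics(person.full_name).toLowerCase();
        return name.includes(clean_query);
    };
}

const people = [
    {full_name: "Zoë Example"},
    {full_name: "Extra1 User"},
    {full_name: "Steve Howell"},
];

const matcher = get_person_matcher("Extra");
const filtered_results = people.filter(matcher);
```

With three people this calls `remove_diacritics` four times (once for the query, once per name). Rebuilding the matcher for every person would call it six times, and the gap grows with the number of people.
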
Steve Howell a0a94b54c9 refactor: Extract helpers for user/stream matching.
These had to be done in tandem, since they were
both kinda coupled to the function that is now
called query_matches_name_description.

(This commit slightly negatively impacts PM
lookups, but this is addressed in the subsequent
commit, which makes PMs much faster.  The impact
is super minimal--it's just an extra function
dispatch.)
2020-01-03 17:42:29 -08:00
Steve Howell 303ab00760 typeahead: Extract get_topic_matcher. 2020-01-03 17:42:27 -08:00
Steve Howell e9c2a7ef7c typeahead: Extract get_language_matcher. 2020-01-03 17:42:25 -08:00
Steve Howell b23df43c1f typeahead: Extract get_slash__matcher. 2020-01-03 17:42:22 -08:00
Steve Howell 676397a026 typeahead: Extract get_emoji_matcher. 2020-01-03 17:42:20 -08:00
Steve Howell ccf6640660 refactor: Have compose_content_matcher return a function.
This may seem silly now, since we are returning a function
that still dispatches over all flavors of search for
every item, but subsequent commits will make it obvious
why I'm doing this.
2020-01-03 17:39:50 -08:00
Steve Howell b65da7cbe9 compose typeahead: Do matching/sorting without callbacks.
We want to do our own matching of items, rather than
just giving a callback to bootstrap, which does $.grep
on all the items.

Doing our own matching gives us flexibility for future
improvements like custom data structures for searching
through big amounts of data.  Even in the short term
we can speed up searches by pulling expensive operations
outside the grep/filter call.

This architecture has been in place for our search
bar since ~2014.
2020-01-03 17:39:48 -08:00
Steve Howell 9afad9e054 node tests: Add commented-out benchmarks for Dict.
The benchmark is commented out.  It takes only a few
milliseconds to run, so there may be no reason not
to always run it.  It doesn't test correctness, so
it would arguably inflate line coverage, but set/get
are obviously covered elsewhere.
2020-01-03 17:19:59 -08:00
Steve Howell 30ad1b6f16 zjsunit: Remove Dict dependency.
We now require the actual tests to explicitly
zrequire Dict, rather than magically adding this.

In one case, the use of Dict was clearly just for
the test (not the app), so I converted that to an ordinary
JS object (see timerender.js).
2020-01-03 17:19:59 -08:00
Steve Howell d41f714eff comments: Update comment for zjsunit/i18n.js. 2020-01-03 17:19:59 -08:00
Steve Howell 897320b2c4 zjquery: Use Map instead of Dict.
This seems to speed up the whole test suite
by about 20%, although measurements are a bit
noisy.
2020-01-03 17:19:59 -08:00
Steve Howell ee3e488e02 js: Extract FoldDict class.
We have ~5 years of proof that we'll probably never
extend Dict with more options.

Breaking the classes in two makes both a little faster
(no options to check), and we remove some options
in FoldDict that are never used (from/from_array).

A possible next step is to fine-tune the Dict to use
Map internally.

Note that the TypeScript types for FoldDict are now
more specific (requiring string keys).  Of course,
this isn't really enforced until we convert other
modules to TS.
2020-01-03 17:19:50 -08:00
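
For readers unfamiliar with the split in ee3e488e02 above, the following is a rough sketch of what a standalone case-folding dictionary could look like. It assumes "Fold" refers to case-insensitive string keys, and it uses `Map` internally, which the commit only mentions as a possible next step; the method set is illustrative and not the actual FoldDict API.

```javascript
// Hypothetical sketch of a case-insensitive, string-keyed dictionary.
class FoldDict {
    constructor() {
        // Maps the folded key to {k: original key, v: value}.
        this._items = new Map();
    }

    // With no options to check, key handling is a single code path.
    _munge(key) {
        if (typeof key !== "string") {
            throw new TypeError("FoldDict keys must be strings");
        }
        return key.toLowerCase();
    }

    get(key) {
        const mapping = this._items.get(this._munge(key));
        return mapping === undefined ? undefined : mapping.v;
    }

    set(key, value) {
        this._items.set(this._munge(key), {k: key, v: value});
        return value;
    }

    has(key) {
        return this._items.has(this._munge(key));
    }

    // Return keys with the casing used in the most recent set() call.
    keys() {
        return Array.from(this._items.values(), (mapping) => mapping.k);
    }
}

const stream_ids = new FoldDict();
stream_ids.set("Denmark", 1);
const denmark_id = stream_ids.get("DENMARK");
```
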
Steve Howell 9cd075ffb1 people: Use Set() in track_duplicate_full_name().
This is more idiomatic and probably
faster for most browsers.  (This function
gets called for each name in page load,
so any slowness is magnified.)
2020-01-03 17:19:38 -08:00
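
A minimal sketch of the Set-based approach mentioned in 9cd075ffb1 above is shown below. The module-level `Map`, the function signatures, and `is_duplicate_full_name` are assumptions made for illustration; the real function in people.js may be structured differently.

```javascript
// Hypothetical bookkeeping: full name -> Set of user ids sharing that name.
const ids_by_full_name = new Map();

function track_duplicate_full_name(full_name, user_id) {
    let ids = ids_by_full_name.get(full_name);
    if (ids === undefined) {
        ids = new Set();
        ids_by_full_name.set(full_name, ids);
    }
    // Set.add is idempotent, so tracking the same user twice is harmless.
    ids.add(user_id);
}

function is_duplicate_full_name(full_name) {
    const ids = ids_by_full_name.get(full_name);
    return ids !== undefined && ids.size > 1;
}

track_duplicate_full_name("Steve Howell", 7);
track_duplicate_full_name("Steve Howell", 7);
track_duplicate_full_name("Tim Abbott", 8);
track_duplicate_full_name("Tim Abbott", 9);
```

After these calls, only "Tim Abbott" counts as a duplicate name, since "Steve Howell" was tracked twice for the same user id.
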
Mateusz Mandera 510bc60663 test_helpers: Set Recipient class attrs in use_db_models.
Model classes fetched through apps.get_model don't get methods or class
attributes. It's not feasible to add them to all these objects in
use_db_models, but Recipient.PERSONAL etc. are worth setting, since
doing that increases the range of functions that can successfully be
imported and called in test_migrations.py.
2020-01-03 16:56:58 -08:00
Mateusz Mandera a993604fae test_email_notifs: Clean up mocking.
These tests had a lot of very repetitive, identical mocking, in some
tests without even doing anything with the mocks. It's cleaner to put
the mock in the one relevant, common place for all the tests that need
it, and remove it from tests that had no use for the mocking.
2020-01-03 16:56:58 -08:00
Mateusz Mandera d691c249db api: Return a JsonableError if API key of invalid format is given. 2020-01-03 16:56:42 -08:00
Mateusz Mandera 72401b229f utils: Add a function to check if string can be an API key. 2020-01-03 16:56:42 -08:00
Mateusz Mandera 4f2897fafc cache: Validate keys before passing them to memcached.
Fixes #13504.

This commit is purely an improvement in error handling.

We used to not do any validation on keys before passing them to
memcached, which meant for invalid keys, memcached's own key
validation would throw an exception.  Unfortunately, the resulting
error messages are super hard to read; the traceback structure doesn't
even show where the call into memcached happened.

In this commit we add validation to all the basic cache_* functions, and
appropriate handling in their callers.

We also add a lot of tests for the new behavior, which has the nice
effect of giving us decent coverage of all these core caching
functions which previously had been primarily tested manually.
2020-01-03 16:56:42 -08:00