Commit Graph

33983 Commits

Author SHA1 Message Date
Steve Howell 9afad9e054 node tests: Add commented-out benchmarks for Dict.
The benchmark is commented out.  It takes only a few
milliseconds to run, so there may be no reason not
to always run it.  It doesn't test correctness, so
it would arguably inflate line coverage, but set/get
are obviously covered elsewhere.
2020-01-03 17:19:59 -08:00
Steve Howell 30ad1b6f16 zjsunit: Remove Dict dependency.
We now require the actual tests to explicitly
to zrequire Dict, rather than magically adding this.

In one case, the use of Dict was clearly just for
the test (not the app), so I converted that an ordinary
JS object (see timerender.js).
2020-01-03 17:19:59 -08:00
Steve Howell d41f714eff comments: Update comment for zjsunit/i18n.js. 2020-01-03 17:19:59 -08:00
Steve Howell 897320b2c4 zjquery: Use Map instead of Dict.
This seems to speed up the whole test suite
by about 20%, although measurements are a bit
noisy.
2020-01-03 17:19:59 -08:00
Steve Howell ee3e488e02 js: Extract FoldDict class.
We have ~5 years of proof that we'll probably never
extend Dict with more options.

Breaking the classes into makes both a little faster
(no options to check), and we remove some options
in FoldDict that are never used (from/from_array).

A possible next step is to fine-tune the Dict to use
Map internally.

Note that the TypeScript types for FoldDict are now
more specific (requiring string keys).  Of course,
this isn't really enforced until we convert other
modules to TS.
2020-01-03 17:19:50 -08:00
Steve Howell 9cd075ffb1 people: Use Set() in track_duplicate_full_name().
This is more idiomatic and probably
faster for most browsers.  (This function
gets called for each name in page load,
so any slowness is magnified.)
2020-01-03 17:19:38 -08:00
Mateusz Mandera 510bc60663 test_helpers: Set Recipient class attrs in use_db_models.
Model classes fetched through apps.get_model don't get methods or class
attributes. It's not feasible to add them to all these objects in
use_db_models, but Recipient.PERSONAL etc. are worth setting, since
doing that increases the range of functions that can successfully be
imported and called in test_migrations.py.
2020-01-03 16:56:58 -08:00
Mateusz Mandera a993604fae test_email_notifs: Clean up mocking.
These tests had a lot of very repetetive, identical mocking, in some
tests without even doing anything with the mocks. It's cleaner to put
the mock in the one relevant, common place for all the tests that need
it, and remove it from tests who had no use for the mocking.
2020-01-03 16:56:58 -08:00
Mateusz Mandera d691c249db api: Return a JsonableError if API key of invalid format is given. 2020-01-03 16:56:42 -08:00
Mateusz Mandera 72401b229f utils: Add a function to check if string can be an API key. 2020-01-03 16:56:42 -08:00
Mateusz Mandera 4f2897fafc cache: Validate keys before passing them to memcached.
Fixes #13504.

This commit is purely an improvement in error handling.

We used to not do any validation on keys before passing them to
memcached, which meant for invalid keys, memcached's own key
validation would throw an exception.  Unfortunately, the resulting
error messages are super hard to read; the traceback structure doesn't
even show where the call into memcached happened.

In this commit we add validation to all the basic cache_* functions, and
appropriate handling in their callers.

We also add a lot of tests for the new behavior, which has the nice
effect of giving us decent coverage of all these core caching
functions which previously had been primarily tested manually.
2020-01-03 16:56:42 -08:00
Mateusz Mandera 5bb84a2255 default_settings: Fix inaccurate "below" phrase in comments.
These are leftovers from where we had default settings in the
settings.py file. Now that the files are separate those references to
"below" are not correct.
2020-01-03 16:52:31 -08:00
Mateusz Mandera e477cae800 docs: Fix missing apostrophe in EMAIL_HOST_USER value. 2020-01-03 16:52:31 -08:00
Mateusz Mandera dc59850d15 docs: Fix incorrect path to get-django-setting script. 2020-01-03 16:52:31 -08:00
Mateusz Mandera d88494deae docs: Add some troubleshooting notes for ldap. 2020-01-03 16:52:30 -08:00
Mateusz Mandera e81aa740bc ldap: Protect against troublesome deactivations in ldap sync.
If ldap sync is run while ldap is misconfigured, it can end up causing
troublesome deactivations due to not finding users in ldap -
deactivating all users, or deactivating all administrators of a realm,
which then will require manual intervention to reactivate at least one
admin in django shell.
This change prevents such potential troublesome situations which are
overwhelmingly likely to be unintentional. If intentional, --force
option can be used to remove the protection.
2020-01-03 16:46:07 -08:00
Mateusz Mandera bfb963b9aa docs: Include suggested USERNAME_ATTR in example AD ldap configs. 2020-01-03 16:46:07 -08:00
Steve Howell b3a69154a6 refactor: Export compare_for_relevance.
This future-proofs us a bit more for test coverage.
2020-01-03 14:58:05 -08:00
Steve Howell 0985842c62 Fix sorting for broadcast mentions.
We had a potentially nasty bug where we
weren't guaranteeing that all/stream/everyone
collated in consistent ways inside of
`compare_people_for_relevance`, which can
send certain types of sort algorithms into
an infinite loop. I doubt this ever happened
in practice, but it's obviously worth fixing.

Now we also have a clear tiebreaker between
any two all/everyone/stream mentions, which
is the idx field.

Finally, this should be a bit more efficient.
2020-01-03 14:58:05 -08:00
Steve Howell 758786ab87 refactor: Extract broadcast_mentions.
This will be helpful for testing.
2020-01-03 14:58:05 -08:00
Steve Howell 773161cbb7 tests: Test "all" mentions more realistically.
We don't have people named "all".  Instead, we
create pseudo person objects with email/full_name
of "all" (along with some other fields).  The tests
now reflect this.
2020-01-03 14:58:05 -08:00
Steve Howell d227988519 tests: Split up sort_recipient tests. 2020-01-03 14:58:05 -08:00
Steve Howell cde01aeeb0 tests: Avoid list mutation.
To test dups we can just create a new list.
2020-01-03 14:58:05 -08:00
Steve Howell 5c43180a70 tests: Use names for test objects. 2020-01-03 14:58:05 -08:00
Steve Howell 49ba916be7 refactor: Rename *_for_at_mentioning functions.
This name was misleading, since this code is used
in sort_recipients, which happens when you, for
example, autocomplete persons in the "To:" box
when composing (and has nothing to do with
mentioning).
2020-01-03 14:58:05 -08:00
Steve Howell 1577662a67 refactor: Clean up exports.compose_matches_sorter. 2020-01-02 12:11:50 -08:00
Steve Howell c2c5878c3a refactor: Clean up compose_content_matcher.
The switch statement is easier to read, and
we also want to eventually remove the "this"
that couples us to the awkward typeahead
hacks.
2020-01-02 12:11:50 -08:00
Steve Howell ebf4195bf3 refactor: Extract clean_query_lowercase().
This makes it a bit easier to find common patterns,
plus it sets us up to pull the calls even further
up the stack.

The first rule of dealing with user data is sanitize
at the edges, not deep down in some function that
has many callers.  Putting this code so deep down
in the stack means it's more likely to be called in
a loop.
2020-01-02 12:11:48 -08:00
Steve Howell 4699710856 refactor: Move clean_query further up the stack.
This moves clean_query into all the callers
of query_matches_source_attrs.

This doesn't change anything performance-wise,
but it sets up future commits.
2020-01-02 12:10:10 -08:00
Steve Howell 8448832bfe refactor: Move clean_query up the stack.
This change is easy--we only had one caller.

This change means any query going against a
target with multiple `match_attrs`, such as
user names (first name, last) only has to
clean the query once per person.
2020-01-02 12:10:10 -08:00
Steve Howell 5b01efda7b typeahead: Extract clean_query helper. 2020-01-02 12:10:07 -08:00
Steve Howell b5d0eab0c6 dict: Add filter_values() method.
This method can help us avoid some memory
allocations.
2020-01-02 12:03:45 -08:00
Steve Howell 8b04cf1288 people: Use is_my_user_id in get_people_for_stream_create.
We want to get away from email-based checks.
2020-01-02 12:03:43 -08:00
Steve Howell 7229a943f0 tests: Use add_in_realm for "me" in people tests.
This is more realistic for testing.
2020-01-02 12:03:04 -08:00
Steve Howell 54cb857fee refactor: Rename people.get_rest_of_realm().
We want to mostly deprecate this function (see
the comment I added), so I gave it a more specific
name.

Ideally I'd just fix `stream_create`, but it does
use this function in a couple places, and it's helpful
to reuse the same sort here.  In one place stream_create
actually unshifts the "me" user back to the top of the
list, which makes sense for its use case.
2020-01-02 12:03:04 -08:00
Steve Howell 405a529340 server: Sort user_ids in recent PM conversations.
This change should prevent test flakes, plus
it's more deterministic behavior for clients,
who will generally comma-join the ids into
a key for their internal data structures.

I was able to verify test coverage on this
by making the sort reversed, which would
cause test_huddle_send_message_events to
fail.
2020-01-02 11:59:58 -08:00
Steve Howell 6e93f330c6 bug fix: Fix huddles in "Private Messages".
If two user_ids in a recent huddle have ids
that sort lexically differently than numerically,
such as 7 and 66, then we were creating two
different buckets in pm_conversations.

This regression was introduced in
263ac0eb45 on
November 21, 2019.
2020-01-02 11:59:58 -08:00
Steve Howell 0e68387975 refactor: Have pm_conversations take user_ids.
Instead of having our callers pass in a possibly
non-canonical version of a user_ids_string, just
have them pass in a list.

The next commit will canonicalize the sort.
2020-01-02 11:59:58 -08:00
Steve Howell ab6f4af33a tests: Use tricky server data in unit tests.
The server may send us ids in the order
[11, 2], instead of [2, 11].  We don't want
to rely on server behavior, regardless, for
the sort.

Our tests now show we process that data.

The current code is is still buggy and causes
us to show the same huddle two different times
for situations where the lexical sort doesn't
match the numerical sort.

This happens on czo often, where Tim is user
7, and his id sorts lexically after ids like
58, 622, 4444, etc.
2020-01-02 11:59:58 -08:00
Anders Kaseorg 8f281c4fc9 apply_event: Replace list comprehension with list.remove.
This should be about 4 times faster, saving something like half a
millisecond on each stream of 10000 subscribers.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-31 10:06:09 -08:00
Mateusz Mandera bbafced254 api docs: Advertise "topic" argument instead of "subject" on /messages.
They have the same meaning but we're transitioning away from the
"subject" terminology, so we should advertise "topic" in docs.
2019-12-30 17:22:46 -08:00
Tim Abbott 86721c3be5 docs: Fix a few typos in Vagrant docs. 2019-12-30 10:34:48 -08:00
Jonas Svatos 9dd29438f5 base Zulip PostgreSQL Docker container on PGroonga official one 2019-12-30 10:20:25 -08:00
Steve Howell b3b83f223d minor: Avoid dict lookup for color.
The only thing get_color() does is look
up a sub:

    exports.get_color = function (stream_name) {
        const sub = exports.get_sub(stream_name);
        if (sub === undefined) {
            return stream_color.default_color;
        }
        return sub.color;
    };

So if we have a sub already, there's no point
calling the helper.

Obviously, this isn't a huge deal, but it happens
N times during page load.
2019-12-30 09:50:22 -08:00
Steve Howell 0711c7ea49 performance: Avoid dup calls to subscribed_streams().
In stream_sort.sort_groups, we now have the caller
pass us in the list of streams, since they are getting
them anyway.
2019-12-30 09:50:22 -08:00
Steve Howell 33246c5c49 streams: Simplify claim_colors.
This is about a millisecond faster for lots of streams,
since it does more work with native Set.
2019-12-30 09:50:22 -08:00
Steve Howell 631811e686 streams: Add BinaryDict for stream_data.
This should make any operation on subscribed
streams faster (we won't need to filter out
unsubscribed streams every time).

I started writing this before I realized we
had a bug where we call `subscribed_streams`
in a nested loop.

After fixing the bugs, this is not as much of
a bottleneck, but it's still a speedup in many
important places:

    * build left sidebar
    * every keystroke in search bar
    * first keystroke in making #stream_links
    * every keystroke in compose stream box

The streams settings code is kinda complicated.
It does a non-deterministic sort of the "others"
bucket when you add elements to the left panel.
They get hidden, anyway.  Our values() call now
puts subscribed streams first.  It never guaranteed
order, but putting subscribed streams first is
probably a good behavior for most situations.
2019-12-30 09:50:20 -08:00
Steve Howell a3512553a8 streams: Add LazySet for subscribers.
This defers O(N*S) operations, where

    N = number of streams
    S = number of subscribers per stream

In many cases we never do an O(N) operation on
a stream.  Exceptions include:

    - checking stream links from the compose box
    - editing a stream
    - adding members to a newly added stream

An operation that used to be O(N)--computing
the number of subscribers--is now O(1), and we
don't even pay O(N) on a one-time basis to
compute it (not counting the cost to build the
array from JSON, but we have to do that).
2019-12-30 09:47:55 -08:00
Steve Howell e804f39f0e performance: Avoid expensive call in stream_data.is_active.
Calling `set_filter_out_inactives` is expensive, since we
count up the number of subscribed streams, which iterates
through all your streams, creates a new list of subscribed
streams, then counts them.

In my dev setup, I created 700 streams, and this shaved
about 700ms off of the initial call to `build_stream_list`.
2019-12-30 09:45:46 -08:00
Steve Howell 70470dea1c settings: Use correct email when searching users.
If we aren't showing users emails, then we don't
want to use emails in the search.

And if we are showing users emails, we want to
search on the email that's displayed to them.
For admins this will be delivery_email.

For regular users we arguably shouldn't search
on emails either, since it mostly causes confusion,
but this commit just preserves the current
behavior for those users (unless `show_email` is
false).
2019-12-30 09:43:24 -08:00