zulip

Commit Graph

Author	SHA1	Message	Date
Steve Howell	d227988519	tests: Split up sort_recipient tests.	2020-01-03 14:58:05 -08:00
Steve Howell	cde01aeeb0	tests: Avoid list mutation. To test dups we can just create a new list.	2020-01-03 14:58:05 -08:00
Steve Howell	5c43180a70	tests: Use names for test objects.	2020-01-03 14:58:05 -08:00
Steve Howell	49ba916be7	refactor: Rename *_for_at_mentioning functions. This name was misleading, since this code is used in sort_recipients, which happens when you, for example, autocomplete persons in the "To:" box when composing (and has nothing to do with mentioning).	2020-01-03 14:58:05 -08:00
Steve Howell	1577662a67	refactor: Clean up exports.compose_matches_sorter.	2020-01-02 12:11:50 -08:00
Steve Howell	c2c5878c3a	refactor: Clean up compose_content_matcher. The switch statement is easier to read, and we also want to eventually remove the "this" that couples us to the awkward typeahead hacks.	2020-01-02 12:11:50 -08:00
Steve Howell	ebf4195bf3	refactor: Extract clean_query_lowercase(). This makes it a bit easier to find common patterns, plus it sets us up to pull the calls even further up the stack. The first rule of dealing with user data is sanitize at the edges, not deep down in some function that has many callers. Putting this code so deep down in the stack means it's more likely to be called in a loop.	2020-01-02 12:11:48 -08:00
Steve Howell	4699710856	refactor: Move clean_query further up the stack. This moves clean_query into all the callers of query_matches_source_attrs. This doesn't change anything performance-wise, but it sets up future commits.	2020-01-02 12:10:10 -08:00
Steve Howell	8448832bfe	refactor: Move clean_query up the stack. This change is easy--we only had one caller. This change means any query going against a target with multiple `match_attrs`, such as user names (first name, last) only has to clean the query once per person.	2020-01-02 12:10:10 -08:00
Steve Howell	5b01efda7b	typeahead: Extract clean_query helper.	2020-01-02 12:10:07 -08:00
Steve Howell	b5d0eab0c6	dict: Add filter_values() method. This method can help us avoid some memory allocations.	2020-01-02 12:03:45 -08:00
Steve Howell	8b04cf1288	people: Use is_my_user_id in get_people_for_stream_create. We want to get away from email-based checks.	2020-01-02 12:03:43 -08:00
Steve Howell	7229a943f0	tests: Use add_in_realm for "me" in people tests. This is more realistic for testing.	2020-01-02 12:03:04 -08:00
Steve Howell	54cb857fee	refactor: Rename people.get_rest_of_realm(). We want to mostly deprecate this function (see the comment I added), so I gave it a more specific name. Ideally I'd just fix `stream_create`, but it does use this function in a couple places, and it's helpful to reuse the same sort here. In one place stream_create actually unshifts the "me" user back to the top of the list, which makes sense for its use case.	2020-01-02 12:03:04 -08:00
Steve Howell	405a529340	server: Sort user_ids in recent PM conversations. This change should prevent test flakes, plus it's more deterministic behavior for clients, who will generally comma-join the ids into a key for their internal data structures. I was able to verify test coverage on this by making the sort reversed, which would cause test_huddle_send_message_events to fail.	2020-01-02 11:59:58 -08:00
Steve Howell	6e93f330c6	bug fix: Fix huddles in "Private Messages". If two user_ids in a recent huddle have ids that sort lexically differently than numerically, such as 7 and 66, then we were creating two different buckets in pm_conversations. This regression was introduced in `263ac0eb45` on November 21, 2019.	2020-01-02 11:59:58 -08:00
Steve Howell	0e68387975	refactor: Have pm_conversations take user_ids. Instead of having our callers pass in a possibly non-canonical version of a user_ids_string, just have them pass in a list. The next commit will canonicalize the sort.	2020-01-02 11:59:58 -08:00
Steve Howell	ab6f4af33a	tests: Use tricky server data in unit tests. The server may send us ids in the order [11, 2], instead of [2, 11]. We don't want to rely on server behavior, regardless, for the sort. Our tests now show we process that data. The current code is still buggy and causes us to show the same huddle two different times for situations where the lexical sort doesn't match the numerical sort. This happens on czo often, where Tim is user 7, and his id sorts lexically after ids like 58, 622, 4444, etc.	2020-01-02 11:59:58 -08:00
Anders Kaseorg	8f281c4fc9	apply_event: Replace list comprehension with list.remove. This should be about 4 times faster, saving something like half a millisecond on each stream of 10000 subscribers. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-12-31 10:06:09 -08:00
Mateusz Mandera	bbafced254	api docs: Advertise "topic" argument instead of "subject" on /messages. They have the same meaning but we're transitioning away from the "subject" terminology, so we should advertise "topic" in docs.	2019-12-30 17:22:46 -08:00
Tim Abbott	86721c3be5	docs: Fix a few typos in Vagrant docs.	2019-12-30 10:34:48 -08:00
Jonas Svatos	9dd29438f5	base Zulip PostgreSQL Docker container on PGroonga official one	2019-12-30 10:20:25 -08:00
Steve Howell	b3b83f223d	minor: Avoid dict lookup for color. The only thing get_color() does is look up a sub: exports.get_color = function (stream_name) { const sub = exports.get_sub(stream_name); if (sub === undefined) { return stream_color.default_color; } return sub.color; }; So if we have a sub already, there's no point calling the helper. Obviously, this isn't a huge deal, but it happens N times during page load.	2019-12-30 09:50:22 -08:00
Steve Howell	0711c7ea49	performance: Avoid dup calls to subscribed_streams(). In stream_sort.sort_groups, we now have the caller pass us in the list of streams, since they are getting them anyway.	2019-12-30 09:50:22 -08:00
Steve Howell	33246c5c49	streams: Simplify claim_colors. This is about a millisecond faster for lots of streams, since it does more work with native Set.	2019-12-30 09:50:22 -08:00
Steve Howell	631811e686	streams: Add BinaryDict for stream_data. This should make any operation on subscribed streams faster (we won't need to filter out unsubscribed streams every time). I started writing this before I realized we had a bug where we call `subscribed_streams` in a nested loop. After fixing the bugs, this is not as much of a bottleneck, but it's still a speedup in many important places: * build left sidebar * every keystroke in search bar * first keystroke in making #stream_links * every keystroke in compose stream box The streams settings code is kinda complicated. It does a non-deterministic sort of the "others" bucket when you add elements to the left panel. They get hidden, anyway. Our values() call now puts subscribed streams first. It never guaranteed order, but putting subscribed streams first is probably a good behavior for most situations.	2019-12-30 09:50:20 -08:00
Steve Howell	a3512553a8	streams: Add LazySet for subscribers. This defers O(N*S) operations, where N = number of streams S = number of subscribers per stream In many cases we never do an O(N) operation on a stream. Exceptions include: - checking stream links from the compose box - editing a stream - adding members to a newly added stream An operation that used to be O(N)--computing the number of subscribers--is now O(1), and we don't even pay O(N) on a one-time basis to compute it (not counting the cost to build the array from JSON, but we have to do that).	2019-12-30 09:47:55 -08:00
Steve Howell	e804f39f0e	performance: Avoid expensive call in stream_data.is_active. Calling `set_filter_out_inactives` is expensive, since we count up the number of subscribed streams, which iterates through all your streams, creates a new list of subscribed streams, then counts them. In my dev setup, I created 700 streams, and this shaved about 700ms off of the initial call to `build_stream_list`.	2019-12-30 09:45:46 -08:00
Steve Howell	70470dea1c	settings: Use correct email when searching users. If we aren't showing users emails, then we don't want to use emails in the search. And if we are showing users emails, we want to search on the email that's displayed to them. For admins this will be delivery_email. For regular users we arguably shouldn't search on emails either, since it mostly causes confusion, but this commit just preserves the current behavior for those users (unless `show_email` is false).	2019-12-30 09:43:24 -08:00
Steve Howell	3e4326afda	refactor: Extract email_for_user_settings. We want to be able to unit test this value, since it's conditional on several factors: - am I an admin? - can non-admins view emails? - do we have delivery_email for the user? I'm mocking show_email in the tests, since the show_email code is in `settings_org` and kind of hard to unit test. It's not impossible, but it's too much for this commit. (Either we need to extract it out to a nice file or deal with mocking jQuery. That module is mostly data-oriented, so it would be nice to have something like `settings_config` that is actually pure data.)	2019-12-28 11:22:24 -08:00
Steve Howell	3a95be2f2f	refactor: Extract matches_user_settings_search. This was duplicate code. I'm moving it to people for pragmatic reasons--it's hard to unit test stuff in settings_users.js due to all the jQuery. It's also nice to have all people-related search code in one place, just for auditing purposes.	2019-12-28 11:22:24 -08:00
Steve Howell	5e0fc25f74	bug fix: Allow admins to filter users in settings. It appears `c28c3015` caused a regression where we set `email` to undefined if a user does not have `delivery_email` set, and this causes filtering of users to fail for admins doing user settings. This fixes only one of the issues reported in issue #13554. There's probably no easy fix to scrolling taking long, but I think fixing search will mostly address that complaint. The Rust folks seem to agree with me that the search results are too noisy. If I search for "s" I get: * names like Steve (good) * names like Jesse (noisy) * anybody with s in their email (super noisy) Here is the relevant code: return ( item.full_name.toLowerCase().indexOf(value) >= 0 || email.toLowerCase().indexOf(value) >= 0 );	2019-12-28 11:22:24 -08:00
Steve Howell	1df7a7280a	Avoid unnecessary is_ascii checks on search termlets. We now can call is_ascii only once per search termlet when we are filtering multiple persons on the same query. (This requires the caller to use `build_person_matcher` outside a loop or before a `_.filter` call.)	2019-12-28 11:14:21 -08:00
Steve Howell	399e83aa70	minor: Tweak build_person_matcher. This is not a major speedup, but we do a couple simple things here: - trim the query outside the function we build (that might be called multiple times) - don't split names before we possibly early-exit with an email match	2019-12-28 11:14:21 -08:00
Steve Howell	a718b47095	refactor: Speed up filter_people_by_search_terms. We now call build_person_matcher outside the loop.	2019-12-28 11:14:21 -08:00
Steve Howell	9c525f8ecb	refactor: Extract build_person_matcher(). This will allow us to change some O(N) behavior to O(1) where we are performing the same query on a bunch of people. (Subsequent commits will actually take advantage of this prefactoring.)	2019-12-28 11:14:21 -08:00
Steve Howell	ab34ee0800	search performance: Stop at max_items. Once we have max_items results, stop trying to get more items. This should really help large realms when you do a search on streams that turns up more than N streams (where N is about 12). We won't even bother to find people.	2019-12-28 11:09:28 -08:00
Steve Howell	8406d34145	search: Extract make_attacher. This class gives us more control over attaching suggestions to our eventual result. The main thing we do now is remove duplicates as they're encountered. This will make sense in the follow up commit, where we can short circuit actions as soon as we get enough results.	2019-12-28 11:09:26 -08:00
Steve Howell	97293aef96	search: Simplify legacy search code. We now have a list of filterers that we walk through.	2019-12-28 11:09:25 -08:00
Steve Howell	09326cb467	refactor: Extract finalize_results. This has a few benefits: - we remove some duplicate code - we can see finalize_results in profiles It turns out finalize_results is expensive for some searches. If the search itself doesn't do a ton of work but returns a lot of results, we see it in finalize_results. It brings to attention that we should be truncating items earlier instead of doing lots of unnecessary work.	2019-12-28 11:09:25 -08:00
Steve Howell	4141abc171	search: Slightly speed up stream highlighting. This isn't a huge speedup, but it's an easy code change. We remove the two-liner highlight_with_escaping, which was only called in one place, and when we inline it into the caller, we can pull the first line, which builds the regex, out of the loop.	2019-12-28 11:09:23 -08:00
Steve Howell	7a2d9a0579	refactor: Extract build_highlight_regex.	2019-12-28 10:57:53 -08:00
Steve Howell	abfd39987c	refactor: Remove duplicate code. The code we removed in highlight_with_escaping is exactly the same code as in highlight_with_escaping_and_regex. I actually copy/pasted this code five years ago and am now removing the duplication. :)	2019-12-28 10:57:53 -08:00
Steve Howell	abdd4b54f4	performance: Speed up search bar highlighting. When we're highlighting all the people that show up in a search from the search bar, we need to fairly expensively build a regex from the query: query = query.toLowerCase(); query = query.replace(/[\-\[\]{}()*+?.,\\\^$|#\s]/g, '\\$&'); const regex = new RegExp('(^' + query + ')', 'ig'); Even though the final regex is presumably cached, we still needed to do that `query.replace` for every person. Even for relatively small numbers of persons, this would show up in profiles as expensive. Now we just build the query once by using a pattern where you call a function outside the loop to build an inner function that's used in the loop that closes on the `query` above. The diff probably shows this better than I explained it here.	2019-12-28 10:57:53 -08:00
Steve Howell	6a9eaebff2	populate_db: Make num_recips calculation more clear. Extracting this calculation makes it easier to hack it when you're trying to load lots of users. We probably want a slightly more realistic calculation here for stress testing. And also fewer rows. But at least now it's a little more clear what it's doing.	2019-12-28 10:56:03 -08:00
Steve Howell	4a8c70593f	populate_db: Add random names for large user batches. If extra_users > 1000, add some names that are more interesting than Extra111 User.	2019-12-28 10:56:03 -08:00
Mateusz Mandera	e90866876c	queue: Take advantage of ABC for defining abstract worker base classes. QueueProcessingWorker and LoopQueueProcessingWorker are abstract classes meant to be subclassed by a class that will define its own consume() or consume_batch() method. ABCs are suited for that and we can tag consume/consume_batch with the @abstractmethod wrapper, which will make subclasses that don't define these methods properly impossible to even instantiate (as opposed to only crashing once consume() is called). It's also nicely detected by mypy, which will throw errors such as this on invalid use: error: Only concrete class can be given where "Type[TestWorker]" is expected error: Cannot instantiate abstract class 'TestWorker' with abstract attribute 'consume' Due to it being detected by mypy, we can remove the test test_worker_noconsume which just tested the old version of this - raising an exception when the unimplemented consume() gets called. Now it can be handled already on the linter level.	2019-12-28 10:52:17 -08:00
Mateusz Mandera	ec209a9bc9	test_queue_worker: Extract a repetitive mock.	2019-12-28 10:52:13 -08:00
Mateusz Mandera	a54640fc68	queue: Share exception handling code between loop and normal workers. LoopQueueProcessingWorker can handle exceptions inside consume_batch in a similar manner to how QueueProcessingWorker handles exceptions inside consume.	2019-12-28 10:47:36 -08:00
Mateusz Mandera	e559447f83	ldap: Improve logging. Our ldap integration is quite sensitive to misconfigurations, so more logging is better than less to help debug those issues. Despite the following docstring on ZulipLDAPException: "Since this inherits from _LDAPUser.AuthenticationFailed, these will be caught and logged at debug level inside django-auth-ldap's authenticate()" We weren't actually logging anything, because debug level messages were ignored due to our general logging settings. It is however desirable to log these errors, as they can prove useful in debugging configuration problems. The django_auth_ldap logger can get fairly spammy on debug level, so we delegate ldap logging to a separate file /var/log/zulip/ldap.log to avoid spamming server.log too much.	2019-12-28 10:47:08 -08:00
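
The huddle fix in 6e93f330c6 (with its test in ab6f4af33a) turns on the difference between lexical and numeric ordering of user ids. The sketch below illustrates that difference only; `canonicalUserIdsString` is a hypothetical helper name and is not taken from the Zulip codebase.

```javascript
// Hypothetical helper (an assumption for illustration, not Zulip's code):
// build one canonical key for a group PM from a list of numeric user ids.
function canonicalUserIdsString(userIds) {
    // Copy first so the caller's array is not mutated, then compare as
    // numbers. Without a comparator, sort() compares elements as strings.
    const sorted = [...userIds].sort((a, b) => a - b);
    return sorted.join(",");
}

// Default sort is lexical: "66" sorts before "7".
const lexicalKey = [7, 66].sort().join(",");

// Numeric sort gives the same key regardless of the order the ids arrive in.
const numericKey = canonicalUserIdsString([66, 7]);
const sameHuddle =
    canonicalUserIdsString([11, 2]) === canonicalUserIdsString([2, 11]);

console.log(lexicalKey, numericKey, sameHuddle);
```

If some callers build their key with the default sort and others sort numerically, ids such as 7 and 66 produce two different keys for the same conversation, which matches the duplicate buckets the commit message describes.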

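Several of the search commits above (abdd4b54f4, 9c525f8ecb, a718b47095) describe the same pattern: do the per-query work once, outside the loop, and return an inner function that closes over the result. A minimal sketch of that pattern follows, reusing the regex from the abdd4b54f4 message; `buildHighlighter` and the `<strong>` markup are assumptions made for illustration, and HTML escaping is skipped.

```javascript
// Hypothetical sketch of the "build once, use in the loop" pattern.
// buildHighlighter is an illustrative name, not a function from Zulip.
function buildHighlighter(query) {
    // Per-query work: runs once, no matter how many names are highlighted.
    const escaped = query
        .toLowerCase()
        .replace(/[\-\[\]{}()*+?.,\\\^$|#\s]/g, "\\$&");
    const regex = new RegExp("(^" + escaped + ")", "ig");

    // Per-item work: the inner function closes over `regex`.
    // This sketch does not HTML-escape `name`.
    return function highlight(name) {
        return name.replace(regex, "<strong>$1</strong>");
    };
}

// Build the highlighter outside the loop, then apply it to each person.
const highlight = buildHighlighter("st");
const rendered = ["Steve", "Jesse", "Stan"].map(highlight);

console.log(rendered);
```

The cost of escaping the query and compiling the regex is paid once per search rather than once per person, which is the saving the commit messages point to.
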

34262 Commits
