Commit Graph

33978 Commits

Author SHA1 Message Date
Steve Howell a718b47095 refactor: Speed up filter_people_by_search_terms.
We now call build_person_matcher outside the loop.
2019-12-28 11:14:21 -08:00
Steve Howell 9c525f8ecb refactor: Extract build_person_matcher().
This will allow use to change some O(N) behavior
to O(1) where we are performing the same query
on a bunch of people.  (Subsequent commits will
actually take advantage of this prefactoring.)
2019-12-28 11:14:21 -08:00
Steve Howell ab34ee0800 search performance: Stop at max_items.
Once we have max_items results, stop trying
to get more items.

This should really help large realms when
you do a search on streams that turns up
more than N streams (where N is about 12).
We won't even bother to find people.
2019-12-28 11:09:28 -08:00
Steve Howell 8406d34145 search: Extract make_attacher.
This class gives us more control over attaching
suggestions to our eventual result.  The main
thing we do now is remove duplicates as they're
encountered.

This will make sense in the follow up commit,
where we can short circuit actions as soon as
we get enough results.
2019-12-28 11:09:26 -08:00
Steve Howell 97293aef96 search: Simplify legacy search code.
We now have a list of filterers that we walk through.
2019-12-28 11:09:25 -08:00
Steve Howell 09326cb467 refactor: Extract finalize_results.
This has a few benefits:

    - we remove some duplicate code
    - we can see finalize_results in profiles

It turns out finalize_results is expensive
for some searches. If the search itself doesn't
do a ton of work but returns a lot of results,
we see it in finalize_results.  It brings to
attention that we should be truncating items
earlier instead of doing lots of unnecessary
work.
2019-12-28 11:09:25 -08:00
Steve Howell 4141abc171 search: Slightly speed up stream highlighting.
This isn't a huge speedup, but it's an easy
code change.

We remove the two-liner highlight_with_escaping,
which was only called in one place, and when
we inline it into the caller, we can pull the
first line, which builds the regex, out of the
loop.
2019-12-28 11:09:23 -08:00
Steve Howell 7a2d9a0579 refactor: Extract build_highlight_regex. 2019-12-28 10:57:53 -08:00
Steve Howell abfd39987c refactor: Remove duplicate code.
The code we removed in highlight_with_escaping
is exactly the same code as in
highlight_with_escaping_and_regex.

I actually copy/pasted this code five years
ago and am now removing the duplication. :)
2019-12-28 10:57:53 -08:00
Steve Howell abdd4b54f4 performance: Speed up search bar highlighting.
When we're highlighting all the people that show
up in a search from the search bar, we need
to fairly expensively build a regex from the
query:

    query = query.toLowerCase();
    query = query.replace(/[\-\[\]{}()*+?.,\\\^$|#\s]/g, '\\$&');
    const regex = new RegExp('(^' + query + ')', 'ig');

Even though the final regex is presumably cached, we
still needed to do that `query.replace` for every person.
Even for relatively small numbers of persons, this would
show up in profiles as expensive.

Now we just build the query once by using a pattern
where you call a function outside the loop to build
an inner function that's used in the loop that closes
on the `query` above.  The diff probably shows this
better than I explained it here.
2019-12-28 10:57:53 -08:00
Steve Howell 6a9eaebff2 populate_db: Make num_recips calculation more clear.
Extracting this calculation makes it easier to hack
it when you're trying to load lots of users.

We probably want a slightly more realistic calculation
here for stress testing.  And also fewer rows.  But
at least now it's a little more clear what it's doing.
2019-12-28 10:56:03 -08:00
Steve Howell 4a8c70593f populate_db: Add random names for large user batches.
If extra_users > 1000, add some names that are more
interesing than Extra111 User.
2019-12-28 10:56:03 -08:00
Mateusz Mandera e90866876c queue: Take advantage of ABC for defining abstract worker base classes.
QueueProcessingWorker and LoopQueueProcessingWorker are abstract classes
meant to be subclassed by a class that will define its own consume()
or consume_batch() method. ABCs are suited for that and we can tag
consume/consume_batch with the @abstractmethod wrapper which will
prevent subclasses that don't define these methods properly to be
impossible to even instantiate (as opposed to only crashing once
consume() is called). It's also nicely detected by mypy, which will
throw errors such as this on invalid use:

error: Only concrete class can be given where "Type[TestWorker]" is
expected
error: Cannot instantiate abstract class 'TestWorker' with abstract
attribute 'consume'

Due to it being detected by mypy, we can remove the test
test_worker_noconsume which just tested the old version of this -
raising an exception when the unimplemented consume() gets called. Now
it can be handled already on the linter level.
2019-12-28 10:52:17 -08:00
Mateusz Mandera ec209a9bc9 test_queue_worker: Extract a repetitive mock. 2019-12-28 10:52:13 -08:00
Mateusz Mandera a54640fc68 queue: Share exception handling code between loop and normal workers.
LoopQueueProcessingWorker can handle exceptions inside consume_batch in
a similar manner to how QueueProcessingWorker handles exceptions inside
consume.
2019-12-28 10:47:36 -08:00
Mateusz Mandera e559447f83 ldap: Improve logging.
Our ldap integration is quite sensitive to misconfigurations, so more
logging is better than less to help debug those issues.
Despite the following docstring on ZulipLDAPException:

"Since this inherits from _LDAPUser.AuthenticationFailed, these will
be caught and logged at debug level inside django-auth-ldap's
authenticate()"

We weren't actually logging anything, because debug level messages were
ignored due to our general logging settings. It is however desirable to
log these errors, as they can prove useful in debugging configuration
problems. The django_auth_ldap logger can get fairly spammy on debug
level, so we delegate ldap logging to a separate file
/var/log/zulip/ldap.log to avoid spamming server.log too much.
2019-12-28 10:47:08 -08:00
Mateusz Mandera a180f01e6b ldap: Use a cleaner super().authenticate() call in ZulipLDAPAuthBackend. 2019-12-28 10:47:08 -08:00
Steve Howell 49cd719273 tests: Verify subscriptions more thoroughly.
When you subscribe/unsubscribe yourself, you want
to make sure these calls work too:

    subscribed_subs
    unsubscribed_subs
2019-12-28 07:29:51 -05:00
Steve Howell 32a1ef20d1 minor: Extract helper for search tests. 2019-12-28 05:43:04 -05:00
Steve Howell f3ebb5fbee test cleanup: Extract a sorter() helper. 2019-12-28 05:43:04 -05:00
Steve Howell 9ac2fe2826 test cleanup: Extract a matcher() helper. 2019-12-28 05:43:04 -05:00
Anders Kaseorg 4b590cc522 templates: Correct sample Google authorized redirect URI.
The required URI was changed in #11450.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-21 20:08:31 -08:00
Anders Kaseorg 1f8d21af74 test-install: Use lxc-destroy -f instead of lxc-stop.
Fixes this error after rebooting the host:

$ sudo ./destroy-all  -f
zulip-install-bionic-41MM2
lxc-stop: zulip-install-bionic-41MM2: tools/lxc_stop.c: main: 191 zulip-install-bionic-41MM2 is not running

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-18 03:48:39 -08:00
Anders Kaseorg 9b5f9858fb test-install: Run lxc-attach with --clear-env.
The host environment variables (especially PATH) should not be allowed
to pollute the test and could interfere with it.

This allows test-install to run on a NixOS host.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-18 03:48:39 -08:00
Anders Kaseorg ab211c7acf lint: Tell ShellCheck to look for sourced files at relative paths.
This uses the new -P option of ShellCheck 0.7.0.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-18 03:48:02 -08:00
Tim Abbott 02169c48cf ldap: Fix bad interaction between EMAIL_ADDRESS_VISIBILITY and LDAP sync.
A block of LDAP integration code related to data synchronization did
not correctly handle EMAIL_ADDRESS_VISIBILITY_ADMINS, as it was
accessing .email, not .delivery_email, both for logging and doing the
mapping between email addresses and LDAP users.

Fixes #13539.
2019-12-15 22:59:02 -08:00
Alex Dehnert 998b52dd8c docs: Remove mention of Dropbox CLA.
Based on discussion in https://chat.zulip.org/#narrow/stream/19-documentation/topic/CLA and 
https://chat.zulip.org/#narrow/stream/105-zulipbot/topic/New.20contributor.20definition, signing 
the Dropbox CLA is no longer a requirement, and it's a essentially a documentation bug that it's 
still mentioned.
2019-12-14 22:07:38 -08:00
Tim Abbott 18de865daa docs: Simplify a few details in our release checklist. 2019-12-13 17:24:49 -08:00
Tim Abbott b68ff6446c version: Update version and changelog for Zulip 2.1.1 release. 2019-12-13 17:19:45 -08:00
Tim Abbott e38c58e7c7 docs: Rewrite LDAP discussion of AUTH_LDAP_REVERSE_EMAIL_SEARCH.
This moves the mandatory configuration for options A/B/C into a single
bulleted list for each option, rather than split across two steps; I
think the result is significantly more readable.

It also fixes a bug where we suggested setting
AUTH_LDAP_REVERSE_EMAIL_SEARCH = AUTH_LDAP_USER_SEARCH in some cases,
whereas in fact it will never work because the parameters are
`%(email)s`, not `%(user)s`.

Also, now that one needs to set AUTH_LDAP_REVERSE_EMAIL_SEARCH, it
seems worth adding values for that to the Active Directory
instructions.  Thanks to @alfonsrv for the suggestion.
2019-12-13 13:55:52 -08:00
Vishnu KS 6901087246 install: Use crudini for storing value of POSTGRES_MISSING_DICTIONARIES.
This simplifies the RDS installation process to avoid awkwardly
requiring running the installer twice, and also is significantly more
robust in handling issues around rerunning the installer.

Finally, the answer for whether dictionaries are missing is available
to Django for future use in warnings/etc. around full-text search not
being great with this configuration, should they be required.
2019-12-13 12:05:39 -08:00
Tim Abbott 851eb1a6ee generate_test_data: Remove some useless type annotations.
One of these caused a parser error trying to run pyre on Zulip; the
other is just useless as the type can be inferred.
2019-12-13 11:52:23 -08:00
Mateusz Mandera 1926649dae migrations: Avoid triggering backend initalization in migration 0209.
Fixes #13528.
The email_auth_enabled check caused all enabled backends to get
initialized, and thus if LDAP was enabled the check_ldap_config()
check would cause an error if LDAP was misconfigured
(for example missing the new settings).
2019-12-13 10:54:05 -08:00
Tim Abbott 9812c6d445 version: Update version strings following 2.1 release. 2019-12-12 22:53:52 -08:00
Rohitt Vashishtha dc4181beec minor: Fix typo in changelog. 2019-12-12 22:52:09 -08:00
Tim Abbott 03a3ae8b61 Release Zulip Server 2.1.0. 2019-12-12 22:23:22 -08:00
Tim Abbott 35959d43c4 docs: Clean up troubleshooting guide.
This article is definitely still below our polish goals, but this is
also definitely an improvement.
2019-12-12 22:19:12 -08:00
Tim Abbott bb6bf837ad docs: Update changelog in preparation for 2.1 release. 2019-12-12 21:11:37 -08:00
Tim Abbott 29ace8100f i18n: Update translation data from transifex. 2019-12-12 20:34:46 -08:00
Tim Abbott 7ccc8373e2 bugdown: Fix logic for extracting attachment path_id.
In 3892a8afd8, we restructured the
system for managing uploaded files to a much cleaner model where we
just do parsing inside bugdown.

That new model had potentially buggy handling of cases around both
relative URLs and URLS starting with `realm.host`.

We address this by further rewriting the handling of attachments to
avoid regular expressions entirely, instead relying on urllib for
parsing, and having bugdown output `path_id` values, so that there's
no need for any conversions between formats outside bugdowm.

The check_attachment_reference_change function for processing message
updates is significantly simplified in the process.

The new check on the hostname has the side effect of requiring us to
fix some previously weird/buggy test data.

Co-Author-By: Anders Kaseorg <anders@zulipchat.com>
Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>
2019-12-12 20:30:26 -08:00
Tim Abbott 4adcd35698 version: Update version and changelog for Zulip 2.0.8 release. 2019-12-12 17:32:27 -08:00
Anders Kaseorg 8e37862b69 CVE-2019-19775: Close open redirect in thumbnail view.
This closes an open redirect vulnerability, one case of which was
found by Graham Bleaney and Ibrahim Mohamed using Pysa.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 17:29:20 -08:00
Mateusz Mandera 8bd2a130a9 docs: Fix some typos. 2019-12-12 17:19:10 -08:00
Tim Abbott 171c6f119d docs: Clean up upgrade/modify documentation. 2019-12-12 17:02:07 -08:00
Tim Abbott 305adc4f64 docs: Clean up requirements page. 2019-12-12 16:31:02 -08:00
Tim Abbott 080864ca44 docs: Minor edits to export and management command docs. 2019-12-12 16:06:40 -08:00
Tim Abbott ea60670c9f docs: Clean up some editing issues in export docs. 2019-12-12 15:56:23 -08:00
Tim Abbott 7bde70bb52 migrations: Batch fix_has_link_attribute migration.
This avoids risk of OOM issues on servers with relatively limited RAM
and millions of messages of history; apparently, fetching all messages
ordered by ID could be quite memory-intensive even with an iterator
usage model.

Fortunately, we have other migrations that already follow this pattern
of iterating over messages, so it's easy to borrow existing code to
make this migration run reasonably.
2019-12-12 15:29:49 -08:00
Tim Abbott 4901dc3795 url_preview: Fix parsing of open graph tags.
Our open graph parser logic sloppily mixed data obtained by parsing
open graph properties with trusted data set by our oembed parser.

We fix this by consistenly using our explicit whitelist of generic
properties (image, title, and description) in both places where we
interact with open graph properties.  The fixes are redundant with
each other, but doing both helps in making the intent of the code
clearer.

This issue fixed here was originally reported as an XSS vulnerability
in the upcoming Inline URL Previews feature found by Graham Bleaney
and Ibrahim Mohamed using Pysa.  The recent Oembed changes close that
vulnerability, but this change is still worth doing to make the
implementation do what it looks like it does.
2019-12-12 15:24:38 -08:00
Anders Kaseorg faa3ea0b8e oembed: Remove unsound HTML filtering.
The frontend now takes care of confining the HTML.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 15:24:38 -08:00