Commit Graph

54 Commits

Author SHA1 Message Date
Anders Kaseorg bf45f921a7 url_preview: Allow Beautiful Soup to get the charset from <meta>.
An HTML document sent without a charset in the Content-Type header
needs to be scanned for a charset in <meta> tags.  We need to pass
bytes instead of str to Beautiful Soup to allow it to do this.

Fixes #16843.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-15 11:30:57 -08:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
akshatdalton 287c4ed2bb markdown: Fix Youtube and Vimeo preview overriding markdown link titles bug.
Initially markdown titles were overridden by Youtube and Vimeo preview titles.
But now it will check if any markdown title is present to replace Youtube or
Vimeo preview titles, if preview of linked websites is enabled.
Fixes #16100
2020-10-19 12:06:13 -07:00
Alex Vandiver b1cac67c31 tests: Check JSON serializability of test data with mock_queue_publish. 2020-09-03 17:34:31 -07:00
Alex Vandiver ad8943a64a url_preview: Only extract img tags with an `src`.
Some `<img>` tags do not have an SRC, if they are rewritten using JS
to have one later.  Attempting to access `first_image['src']` on these
will raise an exception, as they have no such attribute.

Only look for images which have a defined `src` attribute on them.  We
could instead check if `first_image.has_attr('src')`, but this seems
only likely to produce fewer valid images.
2020-08-18 14:26:21 -04:00
Anders Kaseorg 61d0417e75 python: Replace ujson with orjson.
Fixes #6507.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:55:12 -07:00
Mohit Gupta 8b8cfb2e73 test_link_embed: Add assertLogs to prevent spam in test-backend. 2020-07-26 16:14:17 -07:00
Mohit Gupta 3f5fc13491 refactor: Rename zerver.lib.bugdown to zerver.lib.markdown .
This commit is first of few commita which aim to change all the
bugdown references to markdown. This commits rename the files,
file path mentions and change the imports.
Variables and other references to bugdown will be renamed in susequent
commits.
2020-06-26 17:08:37 -07:00
Tim Abbott 4d7550d705 views: Extract message_edit.py for message editing views.
This is a pretty clean extraction of files that lets us shrink one of
our largest files.
2020-06-22 15:08:34 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 8dd83228e7 python: Convert "".format to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-08 15:31:20 -07:00
Anders Kaseorg 840cf4b885 requirements: Drop direct dependency on mock.
mock is just a backport of the standard library’s unittest.mock now.

The SAMLAuthBackendTest change is needed because
MagicMock.call_args.args wasn’t introduced until Python
3.8 (https://bugs.python.org/issue21269).

The PROVISION_VERSION bump is skipped because mock is still an
indirect dev requirement via moto.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-26 11:40:42 -07:00
Anders Kaseorg 78c70b1424 bugdown: Leave link titles alone until clean_user_content_links.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-09 16:32:40 -07:00
Mateusz Mandera 770086f983 url_preview: Discard url in oembed if server returns invalid json.
This fixes the scenario where we'd get errors in the
FetchLinksEmbedData queue processor if oembed got invalid json from the
URL.
2020-04-11 11:54:54 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Anders Kaseorg 4f748fb627 markdown: Stop setting target="_blank".
This setting is being overridden by the frontend since the last
commit, and the security model is clearer and more robust if we don't
make it appear as though the markdown processor is handling this
issue.

Co-authored-by: Tim Abbott <tabbott@zulipchat.com>
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-01 14:01:45 -07:00
Steve Howell 1b16693526 tests: Limit email-based logins.
We now have this API...

If you really just need to log in
and not do anything with the actual
user:

    self.login('hamlet')

If you're gonna use the user in the
rest of the test:

    hamlet = self.example_user('hamlet')
    self.login_user(hamlet)

If you are specifically testing
email/password logins (used only in 4 places):

    self.login_by_email(email, password)

And for failures uses this (used twice):

    self.assert_login_failure(email)
2020-03-11 17:10:22 -07:00
Steve Howell 5e2a32c936 tests: Use users in send_*_message.
This commit mostly makes our tests less
noisy, since emails are no longer an important
detail of sending messages (they're not even
really used in the API).

It also sets us up to have more scrutiny
on delivery_email/email in the future
for things that actually matter.  (This is
a prep commit for something along those
lines, kind of hard to explain the full
plan.)
2020-03-07 18:30:13 -08:00
Tim Abbott 4901dc3795 url_preview: Fix parsing of open graph tags.
Our open graph parser logic sloppily mixed data obtained by parsing
open graph properties with trusted data set by our oembed parser.

We fix this by consistenly using our explicit whitelist of generic
properties (image, title, and description) in both places where we
interact with open graph properties.  The fixes are redundant with
each other, but doing both helps in making the intent of the code
clearer.

This issue fixed here was originally reported as an XSS vulnerability
in the upcoming Inline URL Previews feature found by Graham Bleaney
and Ibrahim Mohamed using Pysa.  The recent Oembed changes close that
vulnerability, but this change is still worth doing to make the
implementation do what it looks like it does.
2019-12-12 15:24:38 -08:00
Anders Kaseorg faa3ea0b8e oembed: Remove unsound HTML filtering.
The frontend now takes care of confining the HTML.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 15:24:38 -08:00
Hariom Verma 107da5402c url preview: Replace YouTube URLs with their titles.
Modified by punchagan to:
* Replace URLs with titles only if the inline url embed previews are turned on
* Add a test for youtube titles replacing URLs

The titles for the videos are fetched asynchronously after the message has been
sent via the code that fetches metadata for open graph previews. So, the URLs
are replaced with titles only if the inline embed url previews feature is
enabled.

Ideally, YouTube previews should be shown only if inline url previews are
enabled, but this feature is in beta, while YouTube previews are pretty stable.
Once this feature is out of beta, YouTube previews should be shown only if the
url previews feature is turned on.

YouTube preview image is calculated as soon as the message is sent, while the
title needs to be fetched using a network request. This means that the URL is
replaced only after the data has been fetched from the request, and happens a
couple of seconds after the message has been rendered.

Closes #7549
2019-07-12 19:14:19 -07:00
Puneeth Chaganti b10fc1d896 url preview: Don't show a message embed if there's no image. 2019-07-03 14:38:19 -07:00
Puneeth Chaganti 9aa5a2b369 url preview: Use oEmbed html for videos.
Ensure that the html is safe, before using it. The html is considered if it is
in an iframe with a http/https src, based on the recommendations here:
https://oembed.com/#section3

We directly embed the `iframe` html into the lightbox overlay.
2019-05-31 15:59:03 -07:00
Puneeth Chaganti c8cb785950 url preview: Show inline images as previews for oEmbed photo pages. 2019-05-31 15:59:03 -07:00
Puneeth Chaganti 8c0c9ca7a4 url preview: Turn Realm.inline_url_embed_preview off by default. 2019-05-31 15:28:32 -07:00
Puneeth Chaganti 22d0cd9696 url preview: Don't cache embed data when fetch has network errors. 2019-05-30 16:45:22 -07:00
Puneeth Chaganti 4ac9778d69 url preview: Catch network errors during get for page content.
We may be successfully able to get the page once, to get the content type, but
the server or network may go down and cause problems when fetching the page for
parsing its meta tags.
2019-05-13 13:55:00 -07:00
Puneeth Chaganti 59555ee7e5 url preview: Confirm content-type before trying to show previews.
Currently, we only show previews for URLs which are HTML pages, which could
contain other media. We don't show previews for links to non-HTML pages, like
pdf documents or audio/video files. To verify that the URL posted is an HTML
page, we verify the content-type of the page, either using server headers or by
sniffing the content.

Closes #8358
2019-05-13 13:45:17 -07:00
Tim Abbott cf0fc7c221 test_link_embed: Fix unused variable.
This should have been in bc2ebd0f09.
2019-05-06 16:04:37 -07:00
Puneeth Chaganti bc2ebd0f09 url preview: Refactor test code to create mock responses. 2019-05-06 12:37:32 -07:00
Puneeth Chaganti da33b72848 url preview: Use in-memory caching in dev environment. 2019-05-06 12:37:32 -07:00
Tim Abbott 11ea8ae9fe tests: Fix test broken by recent url preview change. 2019-02-05 13:45:00 -08:00
Anders Kaseorg 3127fb4dbd zerver/tests: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:43:03 -08:00
Steve Howell 76deb30312 preview: Hash cache keys for preview urls.
We don't want really long urls to lead to truncated
keys, or we could theoretically have two different
urls get mixed up previews.

Also, this suppresses warnings about exceeding the
250 char limit.

Finally, this gives the key a proper prefix.
2018-10-14 09:28:57 -07:00
neiljp (Neil Pilgrim) b5aa705137 mypy: test_link_embed.py: add assert & remove from mypy.ini. 2018-06-19 10:48:38 -07:00
Nikhil Kumar Mishra 2cf32bda12 embed link: Add test for link_embed_data_from_cache. 2018-04-05 10:48:40 -07:00
Vishnu Ks 59b8f85c63 bugdown: Do only image preview if relative URL. 2018-03-06 13:50:02 -08:00
Tim Abbott 8dc82f97c7 python: Wrap long def lines in test files.
We don't have our linter checking test files due to ultra-long strings
that are often present in test output that we verify.  But it's worth
at least cleaning out all the ultra-long def lines.
2017-11-16 22:00:53 -08:00
rht 4f5b1c0a5a zerver/tests: Use python 3 syntax for typing in most files. 2017-11-16 21:52:01 -08:00
derAnfaenger 970e8c5df2 queue processors: Add full coverage for FetchLinksEmbedData. 2017-11-09 16:01:24 -08:00
Steve Howell cf1a4540ef tests: Fix send_message calls in test_link_embed.py. 2017-10-28 10:20:59 -07:00
rht 26f5d9a32c zerver/tests: Remove print_function. 2017-09-27 18:05:45 -07:00
rht 1e87a4b68c zerver/tests: Remove absolute_import. 2017-09-27 10:00:39 -07:00
Aditya Bansal f32c1892ff preview.py: Fix error raised on uploading file with unicode filename. 2017-06-19 14:58:44 -04:00
Lech Kaiel 6b49e667ef tests: Replaced @zulip.com references with self.example_ functions.
This cleans up the excessive use of @zulip.com emails in the tests.
2017-05-23 20:59:50 -07:00
Rohitt Vashishtha a2c1e10986 bugdown: Add domain-name to relative image-url in open graph data.
fixes #4608.
2017-05-04 16:54:10 -07:00
Ayush Jain bddcfb1c96 Add realm-level settings to control inline image and url preview.
This gives users more control in case they don't want previews,
especially for the "previews of linked websites" feature.

Fixes: #2640.
2017-03-21 15:46:17 -07:00
Tim Abbott bc38870136 preview: Fix adding links in message editing.
When you edit a message to contain links, and URL previews are
enabled, previously we'd throw an exception, because the realm ID
wasn't included in the event.

Also adds a test so that we can have effective test coverage on this
codepath, though this history is actually that I found the bug through
writing this test :).
2017-03-01 10:38:47 -08:00
Tim Abbott d754d257f2 PreviewTestCase: Use the actual queued event.
This makes the test significantly simpler and more faithful, but not
mocking the event put into queue_json_publish.
2017-03-01 10:37:55 -08:00
Tim Abbott b6bf45b0da PreviewTestCase: Factor out open_graph_html variable. 2017-03-01 10:37:50 -08:00