Commit Graph

429 Commits

Author SHA1 Message Date
Anders Kaseorg 03e147d5e1 python: Replace NamedTuple with dataclass.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 15:19:31 -07:00
Anders Kaseorg 6480deaf27 python: Convert more "".format to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus --keep-percent-format, with more
restrictions patched out.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 14:48:09 -07:00
Anders Kaseorg 8dd83228e7 python: Convert "".format to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-08 15:31:20 -07:00
Anders Kaseorg 0797626911 bugdown: Avoid deprecated ElementTree.getchildren method.
Fixes warnings like these with python -Wd:

/home/circleci/zulip/zerver/lib/bugdown/__init__.py:327: DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.
  for child in currElementPair.value.getchildren():
/home/circleci/zulip/zerver/lib/bugdown/__init__.py:328: DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.
  if child.getchildren():
/home/circleci/zulip/zerver/lib/bugdown/__init__.py:282: DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.
  for child in currElement.getchildren():
/home/circleci/zulip/zerver/lib/bugdown/__init__.py:283: DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.
  if child.getchildren():

https://docs.python.org/3.8/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.getchildren

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-04 14:17:08 -07:00
Anders Kaseorg 67dbb3b2a9 bugdown: Replace deprecated markdown members with md.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-03 17:23:20 -07:00
Anders Kaseorg f65af9cdb7 bugdown: Replace deprecated markdown.util.etree with ElementTree.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-03 17:23:20 -07:00
Anders Kaseorg 01ccb75259 bugdown: Replace deprecated cElementTree with ElementTree.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-03 17:23:20 -07:00
Anders Kaseorg aad0b8b2da bugdown: Add assert to work around type problem.
url_to_a returns Union[Element, str], but str cannot be appended to
Element; that would raise TypeError at runtime.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-03 17:23:20 -07:00
Anders Kaseorg ba7906a3c6 bugdown: Revert NamedTuple declarations to old style.
This reverts part of commit 8bcdf4ca97
(#15093), to work around a mypy caching bug:
https://github.com/python/mypy/issues/7281.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-28 16:47:08 -07:00
Anders Kaseorg 8bcdf4ca97 python: Convert TypedDict declarations to Python 3.6 style.
A subset of the diff generated by pyupgrade --py36-plus
--keep-percent-format.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-26 11:43:40 -07:00
Tim Abbott 463f1503fc Revert "markdown: Process fenced code blocks in blockquotes."
This reverts commit 7002f98ea1.

This failed tests due to some sort of conflict with a recent
python-markdown upgrade.
2020-05-25 18:13:03 -07:00
Rohitt Vashishtha 7002f98ea1 markdown: Process fenced code blocks in blockquotes.
We handle fenced code blocks in a preprocessor, and > style blockquotes
are parsed in a blockprocessor. Pymarkdown doesn't run the preprocessors
again on any blocks that it is parsing, and is unlikely to accept our
solution upstream; they intend to convert fenced_code to a block parser.

We simply run all the preprocessors on the text again, with the exception
of NormalizeWhitespace which removed delimiters used by HtmlStash to mark
preprocessed html code. To counter this, we subclass NormalizeWhitespace
and use our customized version for when it is called from a blockparser.

Upstream issue: https://github.com/Python-Markdown/markdown/issues/53

Fixes #12800.
2020-05-25 17:35:10 -07:00
Rohitt Vashishtha 52c25a9301 markdown-timestamp: Use data-timestamp attribute.
This commit shifts our timestamp syntax to be of the form:

    <span class="timestamp data-timestamp="123456"></span>

since value is not a valid attribute of span elements.
2020-05-20 14:28:08 -07:00
Rohitt Vashishtha b062e8332f markdown: Add timestamp syntax to markdown processors.
This adds support for syntax like: !time(Jun 7 2017, 6:30 PM) so that
everyone sees the time in their own local timezone. This can be used
when scheduling online meetings, etc.

This adds some hardcoded values for timezones, because of there
being no sureshot way of determining the timezone easily. However,
since the main way of using the feature should be a typeahead for
entering the time, this shouldn't be cause of much concern.

Fixes #5176.
2020-05-20 14:23:55 -07:00
shubhamgupta2956 9cd8644c7c uploads: Add support for ".jpe" file extension.
Currently when the user uploads files with ".jpe" file extension, the
markdown is converted to link but the image is not embedded.

This commit adds the support for ".jpe" file extension.

Fixes #14863
2020-05-10 22:55:52 -07:00
Anders Kaseorg 78c70b1424 bugdown: Leave link titles alone until clean_user_content_links.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-09 16:32:40 -07:00
Anders Kaseorg 32f3fd1c77 bugdown: Fix ElementPair typing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-09 16:32:40 -07:00
Anders Kaseorg 6aaeab75bc bugdown: Fix ResultWithFamily typing.
It needs to be a full class because a generic NamedTuple doesn’t work
in Python 3.6.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-09 16:32:40 -07:00
Rohitt Vashishtha 7d3a31cd8b bugdown: Support hanging_lists preprocessor for indented lists.
Previously, hanging_lists preprocessor didn't consider anything
indented at 4 or above spaces to be a list. This meant that when
we had a list like:

1. 1
  2. 2
    3. 3
  2. 2a
1. 1a

We would insert a newline between 3. 3 and 2. 2a. This resulted
in the block processor breaeking down 1 list into 2 blocks, which
messed up the nesting and indentation for the second block.
2020-04-30 17:54:40 -07:00
Abhishek-Balaji 818776faae alert bugdown: Change name of AlertWordsNotificationProcessor.
This is a precursor commit to change the name of
AlertWordNotificationProcessor to AlertWordsNotificationProcessor
to match the change from UserProfile.alert_words to Alertword.
2020-04-27 10:45:18 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Anders Kaseorg 1cf63eb5bf python: Whitespace fixes from autopep8.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-21 17:58:09 -07:00
Anders Kaseorg d3c55c166e requirements: Upgrade mypy from 0.761 to 0.770.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-18 13:09:51 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Anders Kaseorg 2d45308546 CVE-2020-10935: Fix XSS vulnerability in local link rewriting.
Make sure rewrite_local_links_to_relative does not accidentally change
the meaning of links.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-01 14:01:45 -07:00
Anders Kaseorg 4f748fb627 markdown: Stop setting target="_blank".
This setting is being overridden by the frontend since the last
commit, and the security model is clearer and more robust if we don't
make it appear as though the markdown processor is handling this
issue.

Co-authored-by: Tim Abbott <tabbott@zulipchat.com>
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-01 14:01:45 -07:00
Tim Abbott e3a4aeeffa CVE-2020-9445: Remove unused and insecure modal_link feature.
Zulip's modal_link markdown feature has not been used since 2017; it
was a hack used for a 2013-era tutorial feature and was never used
outside that use case.

Unfortunately, it's sloppy implementation was exposed in the markdown
processor for all users, not just the tutorial use case.

More importantly, it was buggy, in that it did not validate the link
using the standard validation approach used by our other code
interacting with links.

The right solution is simply to remove it.
2020-04-01 14:01:45 -07:00
Stefan Weil d2fa058cc1
text: Fix some typos (most of them found and fixed by codespell).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-27 17:25:56 -07:00
Anders Kaseorg 7ff9b22500 docs: Convert many http URLs to https.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-26 21:35:32 -07:00
Rohitt Vashishtha 2fab45e530 bugdown: Use AtomicString in UserMentionPattern.
This fixes the user-mention counterpart of #14080.
2020-03-06 11:35:56 -08:00
Rohitt Vashishtha 7f9d8e1907 bugdown: Use AtomicString in UserGroupMentionPattern.
This fixes the user-group counterpart of #14080.
2020-03-06 11:35:56 -08:00
Rohitt Vashishtha ff5e2b6eb7 bugdown: Avoid hanging list paragraphs being processed as codeblocks.
Previously, the input:

====================
- One
  - Two

    Two continued
====================

Would produce the same output as:

====================
- One
  - Two

```
Two continued
```
====================

This was because our CodeBlockProcessor had a higher priority than
the ListIndentProcessor. This issue was discussed here:
https://chat.zulip.org/#narrow/stream/9-issues/topic/continuation.20paragraphs.20in.20list.20items.
2020-03-03 12:08:19 -08:00
Rohitt Vashishtha cd7396e732 bugdown: Update outdated comment about Zulip's heading support. 2020-03-03 11:54:18 -08:00
Rohitt Vashishtha 62a7e464fb bugdown: Use AtomicString in StreamPattern.
This fixes the stream counterpart of #14080.
2020-03-02 00:03:33 -08:00
Rohitt Vashishtha 245de9e1e2 bugdown: Use AtomicString in StreamTopicPattern.
Fixes #14080.
2020-03-02 00:03:33 -08:00
Anders Kaseorg e257253e64 emoji_codes: Replace JS module with JSON module.
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.

Update the backend to use emoji_codes.json too instead of the three
separate JSON files.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-12 10:09:12 -08:00
Rohitt Vashishtha 630c564fc7 bugdown: Rewrite List Preprocessor logic to properly parse fences.
Previously, we didn't track opening and closing fences separately,
with led to bugs like not parsing a list that was immediately after
a quoted fence; we treated each ``` as a new fence.

This commit rewrites the function to maintain a stack of currently
open fences. If any of the parent fences is a code fence, we do not
insert a new line before a list.

We also add some test cases specifically to test this behavior with
complexly nested lists.

Fixes #13745.
2020-01-27 17:14:27 -08:00
Tim Abbott 7ccc8373e2 bugdown: Fix logic for extracting attachment path_id.
In 3892a8afd8, we restructured the
system for managing uploaded files to a much cleaner model where we
just do parsing inside bugdown.

That new model had potentially buggy handling of cases around both
relative URLs and URLS starting with `realm.host`.

We address this by further rewriting the handling of attachments to
avoid regular expressions entirely, instead relying on urllib for
parsing, and having bugdown output `path_id` values, so that there's
no need for any conversions between formats outside bugdowm.

The check_attachment_reference_change function for processing message
updates is significantly simplified in the process.

The new check on the hostname has the side effect of requiring us to
fix some previously weird/buggy test data.

Co-Author-By: Anders Kaseorg <anders@zulipchat.com>
Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>
2019-12-12 20:30:26 -08:00
Anders Kaseorg 8e37862b69 CVE-2019-19775: Close open redirect in thumbnail view.
This closes an open redirect vulnerability, one case of which was
found by Graham Bleaney and Ibrahim Mohamed using Pysa.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 17:29:20 -08:00
Rohitt Vashishtha 3892a8afd8 messages: Set has_attachment correctly using Bugdown.
Previously, we would naively set has_attachment just by searching
the whole messages for strings like `/user_uploads/...`. We now
prevent running do_claim_attachments for messages that obviously
do not have an attachment in them that we previously ran.

For example: attachments in codeblocks or
             attachments that otherwise do not match our link syntax.

The new implementation runs that check on only the urls that
bugdown determines should be rendered. We also refactor some
Attachment tests in test_messages to test this change.

The new method is:

1. Create a list of potential_attachment_urls in Bugdown while rendering.
2. Loop over this list in do_claim_attachments for the actual claiming.
   For saving:
3. If we claimed an attachment, set message.has_attachment to True.
   For updating:
3. If claimed_attachment != message.has_attachment: update has_attachment.

We do not modify the logic for 'unclaiming' attachments when editing.
2019-12-11 11:03:44 -08:00
Rohitt Vashishtha 4674cc5098 bugdown: Set message.has_image while rendering message. 2019-12-11 17:01:41 +05:30
dustinheestand 157c98de99 bugdown: Correctly set has_link attribute on messages.
Now autolinks and message edits affect the has_link attribute on messages.
2019-12-11 17:01:41 +05:30
Rohitt Vashishtha 182503e5c0 bugdown: Move helper methods to InlineInterestingLinksProcessor.
add_a, add_oembed_data and add_embed are only called by
InlineInterestingLinksProcessor and this commit allows
these methods to access self.markdown object.
2019-12-10 15:35:00 -08:00
Rohitt Vashishtha 1229e69e9b bugdown: Reenable -,+ to begin a markdown list.
This commit has a side-effect that we also now allow mixed lists,
but they have different syntax from the commonmark implementation
and our marked output. For example, without the closing li tags:

  Input    Bugdown     Marked
-------------------------------------
         <ul>
- Hello    <li>Hello  <ul><li>Hello</ul>
+ World    <li>World  <ul><li>World
+ Again    <li>Again      <li>Again</ul>
* And      <li>And    <ul><li>And
* Again    <li>Again      <li>Again</ul>
         </ul>

The bugdown render is in line with what a user in #13447 requests.

Fixes #13477.
2019-12-09 16:13:02 -08:00
Rohitt Vashishtha 9174c636ce bugdown: Store if message has wildcards in MentionData.
We also switch the underlying exctact_mention_text method to use
a regular for loop, as well as make the related methods return
tuples of (names, is_wildcard). This abstraction is hidden from the
MentionData callers behind mention_data.message_has_wildcards().

Concerns #13430.
2019-12-02 12:12:35 -08:00
Anders Kaseorg cce85f6ec7 dependencies: Upgrade katex from 0.10.2 to 0.11.1.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-11-11 16:26:31 -08:00
Thomas Ip 574c35c0b8 markdown: Render ordered lists using <ol> markup.
This brings us in line, and also allows us to style these more like
unordered lists, which is visually more appealing.

On the backend, we now use the default list blockprocessor + sane list
extension of python-markdown to get proper list markup; on the
frontend, we mostly return to upstream's code as they have followed
CommonMark on this issue.

Using <ol> here necessarily removes the behaviour of not renumbering
on lists written like 3, 4, 7; hopefully users will be OK with the
change.

Fixes #12822.
2019-09-08 16:42:20 -07:00
Tim Abbott 62c9ea7cf9 linkifiers: Fix problems with capture groups called "name".
Apparently, due to poor naming of the outer capture group we use to
separate the actual match from the surrounding whitespace (etc.) we
use to determine if the syntax is a possible linkifier start/end, if
you created a linkifier using "name" as the capture group, we'd try to
compile a pattern with two capture groups called "name", which would
500, preventing anyone from accessing the organization.
2019-08-30 09:36:14 -07:00
Rohitt Vashishtha 8b443a25b8 markdown: Show link href if title is empty.
Fixes #6221.
2019-08-25 21:36:42 -07:00
Rohitt Vashishtha abe2dab88c markdown: Upgrade to use InlineProcessor for links.
This commit wraps up the major work that we held back when upgrading
py-markdown 2.6.11 to 3.0.1. Since we were making our custom changes
to the link syntax, at the time we stuck to using the old method of
parsing links. This lays the groundwork for further changes to our
link and image link handling, and brings us on par with upstream.

Also, we now better document the ways in which our link handling is
different from upstream.
2019-08-25 21:36:42 -07:00