Commit Graph

61 Commits

Author SHA1 Message Date
Anders Kaseorg ceb7e2d2bd Revert "markdown: Add support to shorten GitHub links."
This reverts commit 9c6d8d9d81 (#16916).

This feature has known bugs, and also wants some design changes to
make it customizable like linkifiers, so we’re retargeting this to
post-4.x.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-02 15:52:34 -07:00
Sumanth V Rao e12f682e2e markdown: Include text & url in `topic_links` parameter of our API.
The linkifier code now includes both the shortened text and the expanded
URL, sorted by the order of the occurrence in a topic. This list is passed
back in the `topic_links` parameter of the /messages and the /events APIs.

topic_links earlier vs now:

earlier: ['https://www.google.com', 'https://github.com/zulip/zulip/32']

now: [{'url': 'https://www.google.com', 'text': 'https://www.google/com},
      {'url': 'https://github.com/zulip/zulip/32', 'text': '#32'}]

Similarly, the topic_links local echo logic in the frontend now returns
back an object.

Fixes: #17109.
2021-03-30 15:53:07 -07:00
m-e-l-u-h-a-n 1b8a5a3344 markdown: Refactor backend logic for handling user mention.
Backend logic for handling user mention was cluttered
because it was handled at two stages first in
get_possible_mentions_info while fetching mention data
based on the messsage and then later in UserMentionPattern
which handles processing of text for mention.

Ideally UserMentionPattern should depend on
get_possible_mentions_info only for data but there was a
shared logic between these two that made it hard to debug
any possible bugs.

Updates in this commit make both of these functions
coherent in terms of logic and also add appropiate
comments to improve readability of these functions.

There was also a hidden bug that if a user A is
mentioned in with @**name|id** then @**invalid|id**
again mentioned A because of the way we handled mentions
earlier. It is solved as a result of this refactor and
appropiate test has been added for this.

This has been tested manually as well as by adding new
test to address missing case.
2021-03-28 16:52:48 -07:00
m-e-l-u-h-a-n 2699048208 markdown: Extend user mention syntax to support user_id for mentioning.
Extend our markdown system to support mentioning of users
by id also. Following these changes, it would be possible
to mention users with @**|user_id** and silently mention
using @_**|user_id**.

Main intention for extending the mention syntax is to make
it convenient for bots to mention a users using their ids. It
is to be noted that previous syntax are also supported.

Documentation tweaked by tabbott for better readability.

The changes were tested manually in development server, and also
by adding some new backend and frontend tests.

Fixes: #17487.
2021-03-25 00:44:56 -07:00
akshatdalton 9c6d8d9d81 markdown: Add support to shorten GitHub links.
We add support to shorten links and test their shortening in
well-organized, clean manner that makes it trivial to extend the
GitHub approach for GitLab and perhaps other services.

We only shorten basic types of GitHub links (issue, PR, commit) that
fit a set of simple common patterns; the default behaviour of Autolink
is kept for everything else.

Logic added in frontend and backend Markdown Processor is identical.
This makes easy to extend the logic for other services like GitLab.

Fixes #11895.
2021-03-25 00:39:44 -07:00
m-e-l-u-h-a-n 830c4acedc markdown: Fix invalid mention bug for stream and stream topic mention.
Modifies `StreamPattern` and `StreamTopicPattern` to inherit
from InlineProcessor instead of Pattern. This change is done
because Pattern stopped checking for matching patterns as soon
as it found a match which was not a valid stream. Due to this
all the subsequent mention failed, even if they were valid.
This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.

Due to above changes verbose_compile is no longer used for
precompiling STREAM_LINK_REGEX, STREAM_TOPIC_LINK_REGEX as
adds ^(.*?) and (.*?)$ which cause extra overhead of matching
pattern which is not required. With new InlineProcessor these
extra patterns at beggining and end are not required.
So, StreamPattern and StreamTopicPattern now define their own
__init__ method for precompiling the regex.

Fixes #17535.

These changes were tested locally in dev server and by adding
some new markdown tests to test these.
2021-03-23 01:28:30 -07:00
m-e-l-u-h-a-n dadbba0c25 markdown: Fix invalid mention bug for user group mention.
Modifies `UserGroupMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user group. Due to this all
the subsequent user group mention failed, even if they were
valid. This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.

This was reported as issue #17535.

These changes were tested locally in dev server and by adding
some new markdown tests to test these.
2021-03-23 01:28:30 -07:00
m-e-l-u-h-a-n c8979a5100 markdown: Fix invalid mention bug for user mention.
Modifies `UserMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user. Due to this all the
subsequent user mention failed. This bug was only present in
backend renderring due to markdown.inlinepatterns.Pattern.

This was reported as issue #17535.

These changes were tested locally in dev server and by adding
some new markdown tests to test these.
2021-03-23 01:28:30 -07:00
Anders Kaseorg 23088b5d78 markdown: Fix some Any annotations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-03-17 18:41:46 -07:00
Anders Kaseorg 9864907985 mypy: Correct typing.re imports to typing.
Although typing.re exists in the standard library, mypy has never
recognized it.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-03-17 18:41:46 -07:00
Anders Kaseorg 0a09c9dfd7 markdown: Re-enable typeshed stub for Python-Markdown.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-03-10 11:49:59 -08:00
Anders Kaseorg b728727d9d timeout: Remove unnecessary varargs support.
Mypy can check it this way.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-15 17:05:28 -08:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg ae0afa2390 markdown: Explode config dict.
Commit 434094e599 (#11321) changed this
from an Extension to a subclass of Markdown, so it no longer has any
reason to use a config dict structured like that of an Extension.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-05 10:52:31 -05:00
Anders Kaseorg 4ca66e7278 timezone: Correct common_timezones dictionary.
The changes are as follows:

• Fix one day offset in all western zones.
• Correct CST from -64800 to -21600 and CDT from -68400 to -18000.
• Disambiguate PST in favor of -28000 over +28000.
• Add GMT, UTC, WET, previously excluded for being at offset 0.
• Add ACDT, AEDT, AKST, MET, MSK, NST, NZDT, PKT, which the previous
  code did not find.
• Remove numbered abbreviations -12, …, +14, which are unnecessary.
• Remove MSD and PKST, which are no longer used.

Hardcode the dict and verify it with a test, so that future
discrepancies won’t go silently unnoticed.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-01-27 15:23:15 -08:00
Anders Kaseorg fbf8ce0305 markdown: Add types for extra Markdown members.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg b48bdc65b9 markdown: Fix AlertWordNotificationProcessor.run type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 9573f6dc00 markdown: Fix build_block_parser type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 060036dfd5 markdown: Merge build_engine into Markdown constructor.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 08c64f5cfa markdown: Fix imports for compatibility with typeshed stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
akshatdalton 620e9cbf72 markdown: Fix merging of separate quotations.
Initally, when writing two or more quotes, having
a blank line in between them, merges those quotes.
This created confusion especially in "quote and reply".

This commit fixes such issues. Now two or more quotes
having a blank line in between them, will not get merged.

This change is correct both for usability and for improving our
compatibility with CommonMark.

Fixes #14379.
2020-10-30 15:21:15 -07:00
Anders Kaseorg 1352f2f233 python: Replace manual quote_plus usage with urlencode.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-27 13:47:02 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Anders Kaseorg e513b75e86 markdown: Remove handler for old bug with incompatible twitter library.
See commit 8b002040e0 and #86.  The
development environment bug that necessitated this handler has long
been irrelevant.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:30:26 -07:00
akshatdalton 287c4ed2bb markdown: Fix Youtube and Vimeo preview overriding markdown link titles bug.
Initially markdown titles were overridden by Youtube and Vimeo preview titles.
But now it will check if any markdown title is present to replace Youtube or
Vimeo preview titles, if preview of linked websites is enabled.
Fixes #16100
2020-10-19 12:06:13 -07:00
Anders Kaseorg 7f69c1d3d5 python: Catch specific exceptions from requests.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-11 16:11:41 -07:00
Aman Agrawal 1b5b82e712 RealmFilterPattern: Mark converted content as AtomicString.
If multiple filters match the same string, we run into an infinite
loop of converting string into urls. To fix it, we mark the matched
string as atomic after first conversion.
2020-09-22 15:10:38 -07:00
Alex Vandiver 03c6a0f182 markdown: Skip other common file extensions in linking, sort. 2020-09-21 21:03:29 -07:00
Alex Vandiver 4361ce1246 markdown: Use tlds package to keep updated list of TLDs.
Also remove a useage of "blacklist."
2020-09-21 21:03:29 -07:00
Anders Kaseorg dfab09b17d markdown: Replace hyperlink requirement with urllib.parse.
The previous code only worked by accident and hyperlink 20.0.0 breaks
it.

>>> hyperlink.parse("example.com").replace(scheme="https")
DecodedURL(url=URL.from_text('https:example.com'))

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-13 15:37:28 -07:00
Anders Kaseorg 02725d32dd python: Rewrite list() as [].
Suggested by the flake8-comprehensions plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg a276eefcfe python: Rewrite dict() as {}.
Suggested by the flake8-comprehensions plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg ab120a03bc python: Replace unnecessary intermediate lists with generators.
Mostly suggested by the flake8-comprehension plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Alex Vandiver 5b74de7be7 markdown: Add another twitter code to retry-later.
Error code 131 is documented to be an arbitrary server error on
Twitter's side; add it to the retry list.
2020-08-18 10:32:24 -07:00
Alex Vandiver 092ed87ae3 markdown: Cache Twitter 403 responses that are semi-permanent.
03ca3afbc2 added more codes that are equivalent to 404's; this adds to
the list of cache-as-None codes a couple which are equivalent to
403's.  It does not comprise _all_ possible 403-like codes -- many of
them are "the client is not OK," which is relevant to log as an error
still.
2020-08-18 10:32:24 -07:00
Anders Kaseorg 768f9f93cd docs: Capitalize Markdown consistently.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:23:06 -07:00
Alex Vandiver 2928bbc8bd logging: Report stack_info on logging.exception calls.
The exception trace only goes from where the exception was thrown up
to where the `logging.exception` call is; any context as to where
_that_ was called from is lost, unless `stack_info` is passed as well.
Having the stack is particularly useful for Sentry exceptions, which
gain the full stack trace.

Add `stack_info=True` on all `logging.exception` calls with a
non-trivial stack; we omit `wsgi.py`.  Adjusts tests to match.
2020-08-11 10:16:54 -07:00
Alex Vandiver 90cdda9836 markdown: Link the twitter response code docs inline. 2020-07-31 10:35:41 -07:00
Alex Vandiver 03ca3afbc2 markdown: Treat more twitter codes as also permanent failures.
Per the API documentation[1], the following codes all correspond to
HTTP 404:

 - `34`: **Sorry, that page does not exist.**  The specified resource
   was not found.
 - `144`: **No status found with that ID.**  The requested Tweet ID is
   not found (if it existed, it was probably deleted)
 - `421`: **This Tweet is no longer available.**  The Tweet cannot be
   retrieved. This may be for a number of reasons.
 - `422`: **This Tweet is no longer available because it violated the
   Twitter Rules.**  The Tweet is not available in the API.

Treat all of these identically.

[1] https://developer.twitter.com/en/docs/basics/response-codes
2020-07-31 10:35:41 -07:00
Alex Vandiver fc141af30e markdown: Factor out twitter error code handling. 2020-07-31 10:35:41 -07:00
Vinit Singh 308cf8ac00 markdown: Inline Youtube previews instead of appending it to the end.
This change makes our handling of youtube-url previews consistent
with how we handle our inline images. This allows the previews to
render next to the paragraph that links to the youtube video.

Follow-up to PR #15773.
2020-07-22 16:11:17 -07:00
Rohitt Vashishtha fb2946aaf6 Revert "markdown: Remove paragraphs that only contain a tweet link."
This reverts commit d3770153a6.

We do not show a link to the tweet in our preview, so we should revert
to our previous behavior for now.
2020-07-17 14:30:22 -07:00
Rohitt Vashishtha d3770153a6 markdown: Remove paragraphs that only contain a tweet link.
This is similar to our behavior with image previews, and helps
reduce clutter in the final rendered html.

We add the string 'Tweet: ' to our existing tests so those tests
remain the same.
2020-07-13 12:24:32 -07:00
Rohitt Vashishtha 87e01cd1fa markdown: Inline Twitter previews instead of appending at end.
This commit makes our handling of twitter previews consistent with
how we handle our inline images so that tweets render next to the
paragraph that links to the tweet.
2020-07-13 12:24:32 -07:00
Rohitt Vashishtha a8ab745ee4 markdown: Extract get_inlining_information for link previews.
We decouple the logic of insertion rules for inline links from
image preview logic. Now, we can use this same logic for other
kinds of link previews as well.
2020-07-13 12:24:32 -07:00
Rohitt Vashishtha 912e372c4e markdown: Remove !avatar() and !gravatar() syntax.
This particular commit has been a long time coming. For reference,
!avatar(email) was an undocumented syntax that simply rendered an
inline 50px avatar for a user in a message, essentially allowing
you to create a user pill like:

`!avatar(alice@example.com) Alice: hey!`

---

Reimplementation

If we decide to reimplement this or a similar feature in the future,
we could use something like `<avatar:userid>` syntax which is more
in line with creating links in markdown. Even then, it would not be
a good idea to add this instead of supporting inline images directly.

Since any usecases of such a syntax are in automation, we do not need
to make it userfriendly and something like the following is a better
implementation that doesn't need a custom syntax:

`![avatar for Alice](/avatar/1234?s=50) Alice: hey!`

---

History

We initially added this syntax back in 2012 and it was 'deprecated'
from the get go. Here's what the original commit had to say about
the new syntax:

> We'll use this internally for the commit bot.  We might eventually
> disable it for external users.

We eventually did start using this for our github integrations in 2013
but since then, those integrations have been neglected in favor of
our GitHub webhooks which do not use this syntax.

When we copied `!gravatar` to add the `!avatar` syntax, we also noted
that we want to deprecate the `!gravatar` syntax entirely - in 2013!

Since then, we haven't advertised either of these syntaxes anywhere
in our docs, and the only two places where this syntax remains is
our game bots that could easily do without these, and the git commit
integration that we have deprecated anyway.

We do not have any evidence of someone asking about this syntax on
chat.zulip.org when developing an integration and rightfully so- only
the people who work on Zulip (and specifically, markdown) are likely
to stumble upon it and try it out.

This is also the only peice of code due to which we had to look up
emails -> userid mapping in our backend markdown. By removing this,
we entirely remove the backend markdown's dependency on user emails
to render messages.

---

Relevant commits:

- Oct 2012, Initial commit        c31462c278
- Nov 2013, Update commit bot     968c393826
- Nov 2013, Add avatar syntax     761c0a0266
- Sep 2017, Avoid email use       c3032a7fe8
- Apr 2019, Remove from webhook   674fcfcce1
2020-07-07 10:39:44 -07:00
Rohitt Vashishtha 732ec3c0e6 timestamp: Change syntax to `<time:timestammp>`.
We had been using !time() syntax for timestamps so far. Since its
an unreleased feature, we can make changes without affecting many
people.

Fixes #15442.
2020-07-06 15:53:56 -07:00
Mohit Gupta f8d1e0f86a refactor: Rename convert to markdown_convert.
Prior to this commit whenever convert was imported from zerver.lib.markdown
it was aliased as markdown_convert for readability.
This commit rename convert function to markdown_convert so that it can be
directly import it without aliasing and without compromising readability.
2020-07-06 12:39:59 -07:00
Anders Kaseorg e24b2fdf06 markdown: Fix strict_optional errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-07-06 11:25:48 -07:00