Commit Graph

228 Commits

Author SHA1 Message Date
Anders Kaseorg c2c96eb0cf python: Annotate type aliases with TypeAlias.
This is not strictly necessary but it’s clearer and improves mypy’s
error messages.

https://docs.python.org/3/library/typing.html#typing.TypeAlias
https://mypy.readthedocs.io/en/stable/kinds_of_types.html#type-aliases

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 10:02:49 -07:00
Anders Kaseorg 3b09197fdf ruff: Fix RUF015 Prefer `next(...)` over single element slice.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-23 15:20:53 -07:00
Anders Kaseorg bd2f327a25 markdown: Remove obsolete comment.
It was obsoleted by commit 07fef56c74
(#19144).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-20 11:00:09 -07:00
Prakhar Pratyush 4c9d26ce17 mention: Send notifications for @topic wildcard mentions.
This commit completes the notifications part of the @topic
wildcard mention feature.

Notifications are sent to the topic participants for the
@topic wildcard mention.
2023-07-17 09:39:24 -07:00
Steve Howell b742f1241f realm emoji: Use a single cache for all lookups.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.

Cache misses should be very infrequent per realm.

If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.

I also renamed the two related functions.  I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.

Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
2023-07-17 09:35:53 -07:00
Prakhar Pratyush 0891f9f65a mention: Determine @topic mention during message rendering.
This commit adds a boolean field `mentions_topic_wildcard`
to the `MessageRenderingResult` dataclass.

The field is set to true only if message rendering determines
the message has an actual topic wildcard mention in it (and not,
e.g., topic wildcard mention syntax inside a code block).

The rendered content for topic wildcard mention is
'<span class="topic-mention">{wildcard}</span>'.

The 'topic-mention' class is the identifier for the wildcard
mention being a topic wildcard mention.

We don't use 'data-user-id="*"' and "user-mention" class for
topic wildcard mentions and eventually plan to remove them for
stream wildcard mentions too in a separate mini-project.
2023-07-13 11:34:48 -07:00
Lauryn Menard d84fd73db4 markdown-processor: Update insertion_index check for multiple classes.
Updates find_proper_insertion_index to check for the inline image
classes as matching at least one of the classes in the element's
attrib["class"] so that cases where an inline preview image has
multiple classes, like YouTube video previews, will have the
correct insertion index.

Fixes #26186.
2023-07-07 11:07:45 -04:00
Alex Vandiver ff53ee8e28 markdown: Only attempt to adjust /wiki/File: paths on Wikipedia. 2023-07-06 17:50:25 -07:00
Prakhar Pratyush 179d5cb37d mention: Replace 'wildcards' with 'stream_wildcards'.
This prep commit replaces the 'wildcard' keyword in the codebase
with 'stream_wildcard' at some places for better readability, as
we plan to introduce 'topic_wildcards' as a part of the
'@topic mention' project.

Currently, 'wildcards = ["all", "everyone", "stream"]' which is an
alias to mention everyone in the stream, hence better renamed as
'stream_wildcards'.

Eventually, we will have:
'stream_wildcard' as an alias to mention everyone in the stream.
'topic_wildcard' as an alias to mention everyone in the topic.
'wildcard' refers to 'stream_wildcard' and 'topic_wildcard' as a whole.
2023-07-03 22:03:17 -07:00
Tim Abbott dce4a3c98e markdown: Remove most of Twitter integration.
Twitter removed their v1 API. We take care to keep the existing cached
results around for now, and to not poison that cache, since we might
be able replace this with something that can still use the existing
cache.
2023-05-29 10:43:35 -07:00
Anders Kaseorg 9db3451333 Remove statsd support.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-04-25 19:58:16 -07:00
Zixuan James Li 268f858f39 linkifier: Support URL templates for linkifiers.
This swaps out url_format_string from all of our APIs and replaces it
with url_template. Note that the documentation changes in the following
commits  will be squashed with this commit.

We change the "url_format" key to "url_template" for the
realm_linkifiers events in event_schema, along with updating
LinkifierDict. "url_template" is the name chosen to normalize
mixed usages of "url_format_string" and "url_format" throughout
the backend.

The markdown processor is updated to stop handling the format string
interpolation and delegate the task template expansion to the uri_template
library instead.

This change affects many test cases. We mostly just replace "%(name)s"
with "{name}", "url_format_string" with "url_template" to make sure that
they still pass. There are some test cases dedicated for testing "%"
escaping, which aren't relevant anymore and are subject to removal.
But for now we keep most of them as-is, and make sure that "%" is always
escaped since we do not use it for variable substitution any more.

Since url_format_string is not populated anymore, a migration is created
to remove this field entirely, and make url_template non-nullable since
we will always populate it. Note that it is possible to have
url_template being null after migration 0422 and before 0424, but
in practice, url_template will not be None after backfilling and the
backend now is always setting url_template.

With the removal of url_format_string, RealmFilter model will now be cleaned
with URL template checks, and the old checks for escapes are removed.

We also modified RealmFilter.clean to skip the validation when the
url_template is invalid. This avoids raising mulitple ValidationError's
when calling full_clean on a linkifier. But we might eventually want to
have a more centric approach to data validation instead of having
the same validation in both the clean method and the validator.

Fixes #23124.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-04-19 12:20:49 -07:00
Anders Kaseorg 087660a87e requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-05 14:46:28 -08:00
Anders Kaseorg c1675913a2 web: Move web app to ‘web’ directory.
Ever since we started bundling the app with webpack, there’s been less
and less overlap between our ‘static’ directory (files belonging to
the frontend app) and Django’s interpretation of the ‘static’
directory (files served directly to the web).

Split the app out to its own ‘web’ directory outside of ‘static’, and
remove all the custom collectstatic --ignore rules.  This makes it
much clearer what’s actually being served to the web, and what’s being
bundled by webpack.  It also shrinks the release tarball by 3%.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-23 16:04:17 -08:00
Alex Vandiver ccecc8eb84 markdown: Comment why we do not hash or use STATIC_URL for :zulip:. 2023-02-14 17:17:06 -05:00
Anders Kaseorg 0a1904a6a7 markdown: Rewrite YouTube URL parser without regex spaghetti.
This also adds support for the new YouTube Shorts URLs.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 22:34:51 -08:00
Anders Kaseorg 70ac144d57 markdown: Replace custom cache decorator with functools.lru_cache.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 15:46:11 -08:00
Anders Kaseorg b91788b945 markdown: Replace deprecated UnescapePostprocessor.
See https://github.com/Python-Markdown/markdown/pull/1272.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:36:47 -08:00
Anders Kaseorg da3cf5ea7a ruff: Fix RSE102 Unnecessary parentheses on raised exception.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:34:55 -08:00
Anders Kaseorg df001db1a9 black: Reformat with Black 23.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.

(This does not actually upgrade our Python environment to Black 23
yet.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-02 10:40:13 -08:00
Anders Kaseorg 4eda29bd86 ruff: Fix RUF005 Consider spread instead of concatenation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-26 10:16:30 -08:00
Anders Kaseorg 7a7513f6e0 ruff: Fix SIM201 Use `… != …` instead of `not … == …`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg b8b29dc3ad ruff: Fix SIM110 Use `return any(…)` instead of `for` loop.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg ff1971f5ad ruff: Fix SIM105 Use `contextlib.suppress` instead of try-except-pass.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg b0e569f07c ruff: Fix SIM102 nested `if` statements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Trident Pancake c6ea673cc9 markdown: Update max inline preview from 10 to 24.
The max inline preview limit was previously increased to 10 by #20789.
However, as issue #23624 shows, it's still causing confusion for users
when they include more than 10 links.

Bump this limit up to 24, which is a multiple of the 4 image preview
per line logic.
2023-01-18 14:58:00 -05:00
Anders Kaseorg 17300f196c ruff: Fix ISC003 Explicitly concatenated string.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-04 16:25:07 -08:00
Anders Kaseorg 2c5e114f8b ruff: Fix ISC001 Implicitly concatenated string literals on one line.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-04 16:25:07 -08:00
Anders Kaseorg b5cad938b8 ruff: Fix DTZ006 `datetime.datetime.fromtimestamp()` without `tz` argument.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-04 16:25:07 -08:00
Anders Kaseorg f7e97b1180 ruff: Fix PLW0602 Using global but no assignment is done.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-04 16:25:07 -08:00
Zixuan James Li a3a0103d86 markdown: Calculate linkifier precedence in topics.
This uses the linkifier index among the list of linkifiers in the
replacement as the priority to order the replacement order for
patterns in the topic. This avoids having multiple overlapping matches
that each produce a link.

The linkifier with the lowest id will be prioritized when its pattern
overlaps with another. Linkifiers are prioritized over raw URLs.

Note that the same algorithm is used for local echoing and the
backend markdown processor.

Fixes #23715.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-12-13 15:16:20 -08:00
Zixuan James Li 4602c34108 markdown: Correctly retrieve indices for repeated matches.
The same pattern being matched multiple times in a topic cannot be
properly ordered using topic_name.find(match_text) and etc. when there
are multiple matches of the same pattern in the topic.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-12-13 15:16:20 -08:00
Anders Kaseorg e634e3276a ruff: Fix PLC0414 Import alias does not rename original package.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-12-04 22:11:24 -08:00
Anders Kaseorg 73c4da7974 ruff: Fix N818 exception name should be named with an Error suffix.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-17 16:52:00 -08:00
Anders Kaseorg 924d530292 ruff: Fix N813 camelcase imported as lowercase.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-16 09:29:11 -08:00
Anders Kaseorg 2876ae8e48 ruff: Fix N803 argument name should be lowercase.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-16 09:29:11 -08:00
Anders Kaseorg 46955da3a0 ruff: Fix ANN204 missing return type annotation for __init__.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-16 09:29:11 -08:00
Alex Vandiver 5d42a0cb00 linkifiers: Support %20 in URLs for topic links.
9381a3bd45 added support for linkifier pattern URLs containing
`%20`-style escapes, but only did so for the codepath which is used in
the message body -- topic links did not understand them.

Expand the support to include when they are substituted into topics.
2022-10-11 14:31:13 -07:00
Anders Kaseorg 8230324068 markdown: Store ZulipMarkdown in members with the right type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-06 15:15:10 -07:00
Anders Kaseorg 3cf91e9e45 markdown: Rename our Markdown subclass to ZulipMarkdown.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-06 15:15:10 -07:00
Anders Kaseorg 97be895cf0 markdown: Remove Optional from zulip_rendering_result type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-06 15:15:10 -07:00
Anders Kaseorg d01c99d2ee markdown: Add missing None check in InlineInterestingLinkProcessor.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-06 15:15:10 -07:00
Anders Kaseorg 4a61e36def CVE-2022-36048: Rewrite only specific local links to relative.
Due to mismatches between the URL parsers in Python and browsers, it
was possible to hoodwink rewrite_local_links_to_relative into
generating links that browsers would interpret as absolute.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-08-24 16:29:09 -07:00
N-Shar-ma ef044b8697 markdown: Update characters allowed before @ and stream mentions.
Now the following characters are allowed before @-mentions and stream
references (starting with #) for proper rendering - {, [, /.

This commit makes the markdown rendering consistent with autocomplete
(anything that is autocompleted is also rendered properly).
2022-08-06 19:29:39 -07:00
Mateusz Mandera 2299aa3382 docs: Remove some outdated references to thumbnailing.md doc.
The doc was removed in 405bc8dabf
2022-07-12 17:44:24 -07:00
Anders Kaseorg 8246ee7c57 mypy: Add links to specific mypy bugs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-07-05 17:54:58 -07:00
Anders Kaseorg dc33a0ae67 markdown: Rewrite include plugin without markdown-include.
markdown-include is GPL licensed.

Also, rewrite it as a block processor, so that it works correctly
inside indented blocks.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-26 17:36:31 -07:00
Anders Kaseorg 6331a314d4 Correctly hyphenate “non-”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-27 22:10:31 -07:00
Anders Kaseorg a2825e5984 python: Use Python 3.8 typing.{Protocol,TypedDict}.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-04-27 12:57:49 -07:00
Alex Vandiver 351bdfaf78 preview: Use cache only as a non-durable cache, not an IPC.
The `get_link_embed_data` / `link_embed_data_from_cache` pair as
introduced in c93f1d4eda uses the cache
as a temporary store inside of the `embed_links` worker; this means
that it must be durable storage, or the worker will stall and re-fetch
the same links to preview them.

Switch to plumbing through the fetched URL embed data as an parameter
to the Markdown evaluation which uses them, rather than using the
cache as an intermediary.  This frees up the cache to be merely a
non-durable cache.

As a side-effect, this removes get_cache_with_key, and
link_embed_data_from_cache which was its only callsite.
2022-04-15 14:48:12 -07:00
Alex Vandiver 327ff9ea0f preview: Use a dataclass for the embed data.
This is significantly cleaner than passing around `Dict[str, Any]` all
of the time.
2022-04-15 14:48:12 -07:00
Alex Vandiver 661c333377 markdown: Use named parameters to add_a helper.
This has enough parameters that it benefits from making which is which
explicit.
2022-04-15 14:48:12 -07:00
Alex Vandiver 452a30305d markdown: Clarify url parameter of "add_a" helper. 2022-04-15 14:48:12 -07:00
Alex Vandiver 1ac0035f8c markdown: Allow whitespace overlaps in topic linkifiers.
`prepare_linkifier_pattern`, as of db934be064, adds a match to the
end of the regex, of either the end of string, or a non-word character
-- this is in place of a negative look-ahead, which is no longer
possible in re2.  This causes the regex to consume trailing
whitespace, and thus not be able to match twice in succession with
`pattern.finditer` -- "#1234 #5678" fails to match because the space
is consumed by the first match of the regex.

Rather than use `pattern.finditer`, write own own version, which
rewinds over the non-word character consumed after the match, if any.
This allows the same "after" non-word character to also satisfy the
"before" of the next match.

Fixes #21502.
2022-03-22 15:40:03 -07:00
Anders Kaseorg 1629d6bfb3 python: Reformat with Black 22 (stable).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-02-18 18:03:13 -08:00
Anders Kaseorg df304c40da markdown: Use built-in hex formatting for unicode_emoji_to_codepoint.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-02-03 11:00:04 -08:00
Puneeth Chaganti d55c137277 emoji: Add yellow_large_square and green_large_square emojis.
Wordle has recently become a thing and it uses green, yellow and white (or
black in dark mode) large square unicode characters to let people share their
gameplay. Zulip converts the white and black large square unicode characters to
emojis, but not the green and yellow ones. This causes the Wordle grid to be
misaligned when shared on Zulip.

This commit adds green and yellow large square emojis to our emoji list to fix
the problem.
2022-02-02 16:26:31 -08:00
Puneeth Chaganti 6beb84b553 emoji: Use str.rjust to pad codepoint strings instead of a loop. 2022-02-02 16:26:30 -08:00
Puneeth Chaganti 0eeb74b3c2 emoji: Fix minor typo in unicode_emoji_to_codepoint comment. 2022-02-02 16:26:28 -08:00
Alex Vandiver 19f891968d markdown: Increase the maximum number of image previews per message.
The limit here is purely to prevent breakage in case of a pathological
number of images in a single message; 5 images is entirely possible in
a reasonable message, and causes user confusion when they are not
expended.

Increase the limit to 10 per message.
2022-01-14 11:30:07 -08:00
Steve Howell 4adcaf92f7 refactor: Attach get_stream_name_map to MentionData.
This diff looks slightly noisy, but the main chunk of
code that we moved here has the same logic as before,
and it just gets realm_id from MentionBackend now, instead
of having our markdown processor have to supply it.

We basically want MentionData to be the gatekeeper of
mention data, and then we delegate backend tasks to
MentionBackend.

Soon we will add a cache to MentionBacked, which will
justify this change a bit more.
2021-12-30 11:28:15 -08:00
Steve Howell c6448263c3 refactor: Add MentionBackend.
We will eventually use this to avoid redundant
queries.

The diff is slightly noisy here, but there are no
logic changes.
2021-12-30 11:28:15 -08:00
Steve Howell ea252ab53e refactor: Convert FullNameInfo to a dataclass.
As part of this we no longer query for email, which
is a vestige of when we used emails to identify users
on the frontend.
2021-12-30 11:28:15 -08:00
Steve Howell f5fc348786 mypy: Add explicit types for dbdata references.
When our handlers specifically reference self.md.zulip_db_data,
we now use an explicit type.

We probably want a more robust solution here, such as a semgrep
rule.
2021-12-30 11:28:15 -08:00
Steve Howell df84892aad markdown: Convert DbData to a dataclass. 2021-12-30 11:28:15 -08:00
Steve Howell 4e551f8279 refactor: Introduce get_stream_name_map.
We only need a name -> id map, and the FullNameInfo
type was a lie.
2021-12-30 11:28:15 -08:00
Steve Howell c04a8097f3 mypy: Add EmojiInfo type.
We now serialize still_url as None for non-animated emojis,
instead of omitting the field. The webapp does proper checks
for falsiness here.  The mobile app does not yet use the field
(to my knowledge).

We bump the API version here. More discussion here:

https://chat.zulip.org/#narrow/stream/378-api-design/topic/still_url/near/1302573
2021-12-30 11:28:14 -08:00
Alex Vandiver 6a40c17ccf markdown: CSS-escape preview links.
This adds `soupsieve` as an explicit dependency, but intentionally
does not adjust the provision version, as it was already an indirect
dependency.
2021-10-26 18:17:23 -07:00
Alex Vandiver 52f74bbd9b markdown: Run URL preview links through camo.
Not proxying these requests through camo is a security concern.
Furthermore, on the desktop client, any embed image which is hosted on
a server with an expired or otherwise invalid certificate will trigger
a blocking modal window with no clear source and a confusing error
message; see zulip/zulip-desktop#1119.

Rewrite all `message_embed_image` URLs through camo, if it is enabled.
2021-10-26 18:17:23 -07:00
Anders Kaseorg 58920affd4 python: Remove re.UNICODE flag (redundant in Python 3).
https://docs.python.org/3/library/re.html#re.A

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-22 13:42:29 -07:00
Alex Vandiver 9381a3bd45 linkifiers: Support URL percent-encoded bytes.
Supporting URL percent-encoded bytes is possible using `%%20`, but this
is not necessarily very understandable to end-users, even those that
understand percent encoding.

Allow `%20` in linkifier URL format strings, and transform them into
`%%20` in the pattern just before they are applied in markdown
translation.  Care must be taken here, such that already-escaped `%`s
are not escaped an extra time.

We do this before rendering, and not before storage, as
a simplification; the JS-side linkifier at present only understands
`%(foo)s` and thus needs no changes, and to avoid an un-escaping pass
before showing in the admin UI.
2021-10-22 13:00:20 -07:00
Anders Kaseorg 4839b7ed27 url_preview: Interpret og:image relative to full page URL.
og:image is supposed to be an absolute URL, but some sites incorrectly
provide a relative URL.  In this case, it makes more sense to
interpret it relative to the full page URL after redirects, rather
than relative to just the domain part of the page URL before
redirects.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-21 12:20:37 -07:00
Alex Vandiver db934be064 CVE-2021-41115: Use re2 for user-supplied linkifier patterns.
Zulip attempts to validate that the regular expressions that admins
enter for linkifiers are well-formatted, and only contain a specific
subset of regex grammar.  The process of checking these
properties (via a regex!) can cause denial-of-service via
backtracking.

Furthermore, this validation itself does not prevent the creation of
linkifiers which themselves cause denial-of-service when they are
executed.  As the validator accepts literally anything inside of a
`(?P<word>...)` block, any quadratic backtracking expression can be
hidden therein.

Switch user-provided linkifier patterns to be matched in the Markdown
processor by the `re2` library, which is guaranteed constant-time.
This somewhat limits the possible features of the regular
expression (notably, look-head and -behind, and back-references);
however, these features had never been advertised as working in the
context of linkifiers.

A migration removes any existing linkifiers which would not function
under re2, after printing them for posterity during the upgrade; they
are unlikely to be common, and are impossible to fix automatically.

The denial-of-service in the linkifier validator was discovered by
@erik-krogh and @yoff, as GHSL-2021-118.
2021-10-04 21:26:24 +00:00
Tim Abbott 545911b051 markdown: Remove useless locless_schemes check.
This check was copied from upstream python-markdown's "safe mode"
before they removed that feature.  The upstream history is that they
introduced this check in
2db5d1c8e4,
which was not a complete security check, and then added the
immediately following check (with an allowlist of schemes) in
0b4ffbb60e.

Their first, incomplete check provides no security benefit and makes
the code hard to reason about, so we remove it.
2021-09-09 09:03:40 -07:00
rht c24ab8c4d3 markdown: Expand list of safelisted URL schemes to match HTML spec. 2021-09-09 09:03:40 -07:00
Anders Kaseorg 66ad6a4583 docs: Inline code spans are not blocks.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-07 16:12:39 -07:00
Anders Kaseorg 646c04eff2 Rename default branch to ‘main’.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-09-06 12:56:35 -07:00
Alex Vandiver 4d428490fd outgoing_http: Use OutgoingSession subclasses in more places.
This adds the X-Smokescreen-Role header to proxy connections, to track
usage from various codepaths, and enforces a timeout.  Timeouts were
kept consistent with their previous values, or set to 5s if they had
none previously.
2021-09-01 05:34:13 -07:00
Priyansh Garg 1e51c23494 markdown: Remove unnecessary checks for zulip_message.
This commits removes some unnecessary checks for `self.md.zulip_message`,
which were put there historically, as earlier we used to add the additional
properties like mentions_user_ids, alert_words, etc. to Message dict
only. These were later moved to MessageRenderingResult class in commit
75cea329b but the checks weren't removed.

This is important because while rendering the messages imported from
other chat tools (like Rocket.Chat), the Message dict is not passed to
the markdown, due to which the checks for `self.md.zerver_message` fails
and hence, things like user mentions, stream/topic mentions are not
rendered in the imported messages properly.
2021-08-31 16:53:42 -07:00
Anders Kaseorg 4206e5f00b python: Remove locally dead code.
These changes are all independent of each other; I just didn’t feel
like making dozens of commits for them.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-19 01:51:37 -07:00
Anders Kaseorg 806494da06 markdown: Stream and parse incrementally in fetch_open_graph_image.
This way we can stop reading as soon as we get to the body.  Also,
send an Accept header, check that the request was actually successful,
use lxml.etree.iterparse instead of a broken hand-rolled state
machine, and support XHTML, all for negative 28 lines of code.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-05 09:17:32 -07:00
Priyansh Garg 0a875c1c4c markdown: Fix jpeg extension in `IMAGE_EXTENSIONS`. 2021-08-05 08:54:02 -07:00
Anders Kaseorg 42fa62e563 Revert "time_widget: Make the generated time string more readable."
This reverts commit 1965584eec.

This syntax has a bad interaction with table syntax and needs to be
rethought.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-03 16:45:31 -07:00
Ganesh Pawar 1965584eec time_widget: Make the generated time string more readable.
Before: <time:2021-07-14T00:14:00-07:00>
After: <time:2021-07-14|00:14:00|UTC-07:00>

Fixes #19205
2021-08-02 23:17:01 -07:00
Anders Kaseorg 3665deb93a python: Remove unnecessary intermediate lists.
Generated automatically by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
Anders Kaseorg 162e9d6c0b fenced_code: Optimize FENCE_RE to fix cubic worst-case complexity.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-22 16:40:44 -07:00
Anders Kaseorg c56440ded0 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-05 12:23:06 -07:00
Priyansh Garg 94a2be06f3 markdown: Use a shared variable for IMAGE_EXTENSION. 2021-07-02 11:22:55 -07:00
akshatdalton 44a298b671 minor: Use `OUTER_CAPTURE_GROUP` variable instead of string value. 2021-06-25 17:43:27 -07:00
akshatdalton 490f6b6880 markdown: Extract regex in local variables. 2021-06-25 17:43:01 -07:00
PIG208 75cea329b4 markdown: Refactor out additional properties added to Message.
This adds a new class called MessageRenderingResult to contain the
additional properties we added to the Message object (like alert_words)
as well as the rendered content to ensure typesafe reference. No
behavioral change is made except changes in typing.

This is a preparatory change for adding django-stubs to the backend.

Related: #18777
2021-06-24 18:14:53 -07:00
akshatdalton c507931ac8 refactor: Export non-markdown logic in mention.py. 2021-06-14 13:26:30 -07:00
Wesley Aptekar-Cassels d5ba94082a markdown: Increase max rendered message length to 1MB.
This should help with #17425, where messages with lots of LaTeX are
lost, due to the large expansion factor.

This isn't a total fix for this - large messages with lots of LaTeX
can still end up larger than 1MB, and rendering could timeout, but
this fix should help significantly.

1MB is still small enough that I don't expect we'll run into any DOS
problems - my testing didn't show any problems rendering messages that
contain ~1MB of LaTeX.
2021-06-03 10:10:35 -07:00
akshatdalton 7df62ebbaf settings: Make `MAX_MESSAGE_LENGTH` a server-level setting.
This will offer users who are self-hosting to adjust
this value. Moreover, this will help to reduce the
overall time taken to test `test_markdown.py` (since
this can be now overridden with `override_settings`
Django decorator).

This is done as a prep commit for #18641.
2021-06-03 09:26:28 -07:00
akshatdalton 832c763c38 minor: Remove unnecessary `__init__` method in `InlineInterestingLinkProcessor`.
Subclass `Treeprocessor` takes care of the `__init__` method.
2021-05-26 17:13:03 -07:00
Anders Kaseorg bac96cae80 markdown: Fix Dropbox image previews.
?dl=1 causes Dropbox to send Content-Type: application/binary, which
can’t be interpreted by Camo.  Use ?raw=1 instead.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-25 13:42:29 -07:00
akshatdalton 503247ebfa refactor: Add class `CompiledInlineProcessor` to de-duplicate code. 2021-05-23 14:30:22 -07:00
akshatdalton 78f26b6031 minor: Use `super` to initialize subclass. 2021-05-23 14:30:22 -07:00
akshatdalton 18203d8af3 markdown: Silence user group mention inside blockquotes. 2021-05-18 17:31:25 -07:00
akshatdalton 0245b590e9 markdown: Add support for user group silent mention.
Prior to this, we only supported direct mention to
the user groups. This commit extends that support
to silent mention for the user groups.
A related test case is also added.

Fixes: #11711.
2021-05-18 17:31:25 -07:00