This adds a new class called MessageRenderingResult to contain the
additional properties we added to the Message object (like alert_words)
as well as the rendered content to ensure typesafe reference. No
behavioral change is made except changes in typing.
This is a preparatory change for adding django-stubs to the backend.
Related: #18777
This should help with #17425, where messages with lots of LaTeX are
lost, due to the large expansion factor.
This isn't a total fix for this - large messages with lots of LaTeX
can still end up larger than 1MB, and rendering could timeout, but
this fix should help significantly.
1MB is still small enough that I don't expect we'll run into any DOS
problems - my testing didn't show any problems rendering messages that
contain ~1MB of LaTeX.
This will offer users who are self-hosting to adjust
this value. Moreover, this will help to reduce the
overall time taken to test `test_markdown.py` (since
this can be now overridden with `override_settings`
Django decorator).
This is done as a prep commit for #18641.
?dl=1 causes Dropbox to send Content-Type: application/binary, which
can’t be interpreted by Camo. Use ?raw=1 instead.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Prior to this, we only supported direct mention to
the user groups. This commit extends that support
to silent mention for the user groups.
A related test case is also added.
Fixes: #11711.
Earlier, USER_GROUP_MENTIONS_RE was:
r"(?<![^\s\'\"\(,:<])@(\*[^\*]+\*)"
For the syntax: *foo*, this was unnecessarily capturing it as
*foo* and the extraction of `foo` was done using another helper
function: `extract_user_group`.
This is now changed as:
r"(?<![^\s\'\"\(,:<])@(\*(?P<match>[^\*]+)\*)"
and extraction of `foo` can be done just by using the named capture
group `match`.
This change also helps to simplify its related code path.
Earlier, MENTIONS_RE was:
r"(?<![^\s\'\"\(,:<])@(?P<silent>_?)(?P<match>\*\*[^\*]+\*\*)"
For the syntax: **foo**, this was unnecessarily capturing it as
**foo** and adding extra operation for the extraction of `foo`.
This is now changed as:
r"(?<![^\s\'\"\(,:<])@(?P<silent>_?)(\*\*(?P<match>[^\*]+)\*\*)"
and extraction of `foo` can be done just by using the named capture
group `match`.
This change also helps to simplify its related code path.
Following the convention, we use uppercase for
regex. Also, `user_group_mentions` is given a
conventional name ending with `*_RE`: `USER_GROUP_MENTIONS_RE`.
A message containing wildcard mention when quoted (which
is turned into a silent mention) or message with silent
wildcard mention notifies the users by sending desktop,
sound, and missed message email notifications. This
is clearly a bug which is fixed by this commit.
Fixes: #18354.
Requesting external images is a privacy risk, so route all external
images through Camo.
Tweaked by tabbott for better test coverage, more comments, and to fix
bugs.
This logic likely never ran due to a combination of bugs.
* Running `maybe_update_markdown_engines` unconditionally meant that
`if md_engine_key in md_engines` was likely always true.
* Introduced in 65838bb: DEFAULT_MARKDOWN_KEY could never be in
md_engines, so should we have ever reached that code path, we'd have
tried to rebuild all markdown engines every time.
And it also wasn't clearly helpful -- because we fetch all linkifiers
for a realm on every request anyway, we don't really save database
queries by doing a bulk fetch on startup, and doing so would likely
result in a material regression to Zulip's overall startup time that
we were creating markdown engines for large numbers of realms in bulk
during process startup.
This reverts commit 9c6d8d9d81 (#16916).
This feature has known bugs, and also wants some design changes to
make it customizable like linkifiers, so we’re retargeting this to
post-4.x.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Backend logic for handling user mention was cluttered
because it was handled at two stages first in
get_possible_mentions_info while fetching mention data
based on the messsage and then later in UserMentionPattern
which handles processing of text for mention.
Ideally UserMentionPattern should depend on
get_possible_mentions_info only for data but there was a
shared logic between these two that made it hard to debug
any possible bugs.
Updates in this commit make both of these functions
coherent in terms of logic and also add appropiate
comments to improve readability of these functions.
There was also a hidden bug that if a user A is
mentioned in with @**name|id** then @**invalid|id**
again mentioned A because of the way we handled mentions
earlier. It is solved as a result of this refactor and
appropiate test has been added for this.
This has been tested manually as well as by adding new
test to address missing case.
Extend our markdown system to support mentioning of users
by id also. Following these changes, it would be possible
to mention users with @**|user_id** and silently mention
using @_**|user_id**.
Main intention for extending the mention syntax is to make
it convenient for bots to mention a users using their ids. It
is to be noted that previous syntax are also supported.
Documentation tweaked by tabbott for better readability.
The changes were tested manually in development server, and also
by adding some new backend and frontend tests.
Fixes: #17487.
We add support to shorten links and test their shortening in
well-organized, clean manner that makes it trivial to extend the
GitHub approach for GitLab and perhaps other services.
We only shorten basic types of GitHub links (issue, PR, commit) that
fit a set of simple common patterns; the default behaviour of Autolink
is kept for everything else.
Logic added in frontend and backend Markdown Processor is identical.
This makes easy to extend the logic for other services like GitLab.
Fixes#11895.
Modifies `StreamPattern` and `StreamTopicPattern` to inherit
from InlineProcessor instead of Pattern. This change is done
because Pattern stopped checking for matching patterns as soon
as it found a match which was not a valid stream. Due to this
all the subsequent mention failed, even if they were valid.
This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.
Due to above changes verbose_compile is no longer used for
precompiling STREAM_LINK_REGEX, STREAM_TOPIC_LINK_REGEX as
adds ^(.*?) and (.*?)$ which cause extra overhead of matching
pattern which is not required. With new InlineProcessor these
extra patterns at beggining and end are not required.
So, StreamPattern and StreamTopicPattern now define their own
__init__ method for precompiling the regex.
Fixes#17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Modifies `UserGroupMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user group. Due to this all
the subsequent user group mention failed, even if they were
valid. This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.
This was reported as issue #17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Modifies `UserMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user. Due to this all the
subsequent user mention failed. This bug was only present in
backend renderring due to markdown.inlinepatterns.Pattern.
This was reported as issue #17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Commit 434094e599 (#11321) changed this
from an Extension to a subclass of Markdown, so it no longer has any
reason to use a config dict structured like that of an Extension.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The changes are as follows:
• Fix one day offset in all western zones.
• Correct CST from -64800 to -21600 and CDT from -68400 to -18000.
• Disambiguate PST in favor of -28000 over +28000.
• Add GMT, UTC, WET, previously excluded for being at offset 0.
• Add ACDT, AEDT, AKST, MET, MSK, NST, NZDT, PKT, which the previous
code did not find.
• Remove numbered abbreviations -12, …, +14, which are unnecessary.
• Remove MSD and PKST, which are no longer used.
Hardcode the dict and verify it with a test, so that future
discrepancies won’t go silently unnoticed.
Signed-off-by: Anders Kaseorg <anders@zulip.com>