Commit Graph

211 Commits

Author SHA1 Message Date
Vector73 4430ab9cbe zerver: Replace `realm_uri` with `realm_url` in backend files.
Co-authored-by: Junyao Chen <junyao.chen@socitydao.org>
2024-06-03 10:07:10 -07:00
Alex Vandiver b7bf9e41c7 markdown: Catch urllib3 exceptions which are raised during streaming.
requests transforms the base urllib3 exceptions into
requests.RequestExceptions -- but only within code that it is
running.  When parsing the streaming body in fetch_open_graph_image,
the read itself (inside lxml) may trigger urllib3 to raise its own
timeout error -- which escapes the current catch of
requests.RequestExceptions.

Catch both requests and urllib3 exceptions.
2024-05-24 14:54:53 -07:00
Tomas Fonseca 753b8e1f6f markdown: Added error handling for invalid time zones.
If an invalid timezone (such as +32h) was provided, the
timestamp.astimezone call would throw an exception, causing the
message send to fail. Replace that with a user-facing error.

Fixes #19658.
2024-05-16 09:38:25 -07:00
Vector73 8ab526a25a models: Replace realm.uri with realm.url.
In #23380, we are changing all occurrences of uri with url in order to
follow the latest URL standard. Previous PRs #25038 and #25045 has
replaced the occurences of uri that has no direct relation with realm.

This commit changes just the model property, which has no API
compatibility concerns.
2024-05-08 11:12:43 -07:00
Anders Kaseorg 96fbe060a6 python: Mark regexes as raw strings.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-26 12:30:31 -07:00
Anders Kaseorg 72018cc26b timeout: Rename to unsafe_timeout.
This timeout strategy using asynchronous exceptions has a number of
safety caveats (read the docstring!!) and should only be used in very
specific circumstances.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-04-18 11:50:38 -07:00
Alex Vandiver 693b959656 markdown: Switch to directly URL-escaping CSS URLs.
soupsieve is a heavy-weight dependency, and Tornado pulls it in by way
of markdown rendering; since we are only using it for a very simple
process, perform that manually.

Per CSS spec[^1]:

> In quoted <string> url()s, only newlines and the character used to
> quote the string need to be escaped.

[^1]: https://drafts.csswg.org/css-values/#urls
2024-04-16 10:48:51 -07:00
Alex Vandiver 0f9b7f112b message: Move render_markdown into zerver.lib.markdown. 2024-02-14 12:27:03 -08:00
Anders Kaseorg 93198a19ed requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-01-29 10:41:54 -08:00
Alex Vandiver 4ab9cd7cf2 markdown: Prevent OverflowError with large time integers.
`<time:1234567890123>` causes a "signed integer is greater than
maximum" exception from dateutil.parser; datetime also cannot handle
it ("year 41091 is out of range") but that is a ValueError which is
already caught.

Catch the OverflowError thrown by dateutil.
2024-01-05 12:01:06 -08:00
Anders Kaseorg 21ab3858a7 models: Extract zerver.models.linkifiers.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Anders Kaseorg 67fb485797 models: Extract zerver.models.realm_emoji.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-16 22:08:44 -08:00
Sahil Batra 72aa4b256d message: Do not allow guest to mention inaccessible users. 2023-12-09 16:59:38 -08:00
Anders Kaseorg 223b626256 python: Use urlsplit instead of urlparse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00
Anders Kaseorg 3853fa875a python: Consistently use from…import for urllib.parse.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 13:03:07 -08:00
Anders Kaseorg 8a7916f21a python: Consistently use from…import for datetime.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-12-05 12:01:18 -08:00
Alex Vandiver 8c8dbb3d66 markdown: Stop attempting to expand/collapse re2 regex.
549dd8a4c4 changed the regex that we build to contain whitespace for
readability, and strip that back out before returning it.
Unfortunately, this also serves to strip out whitespace in the source
linkifier, causing it to not match expected strings.

Revert 549dd8a4c4.

Fixes: #27854.
2023-11-28 15:07:23 -08:00
Prakhar Pratyush 49388d5d3d topic_mentions: Fix restriction rule for @-topic mentions.
Now, the topic wildcard mention follows the following
rules:
* If the topic has less than 15 participants , anyone
can use @ topic mentions.
* For more than 15, the org setting 'wildcard_mention_policy'
determines who can use @ topic mentions.

Earlier, topic wildcard mentions followed the same restriction
as stream wildcard mentions, which was incorrect.

Fixes part of #27700.
2023-11-23 12:52:25 -08:00
Aman Agrawal 71ea6e8863 realm_inline_image_preview: Use it to toggle video previews too.
This setting now also works to decide whether to show previews of
uploaded or linked videos.
2023-11-20 08:48:39 -08:00
Tim Abbott 3cfe4b720c Revert "linkifiers: Match JS implementation to server implementation."
This reverts commit 091e2f177b.

This version of python_to_js_linkifier fails for at least some real
linkifiers. We'll likely re-introduce this after a bit more debugging.
2023-11-16 14:59:48 -08:00
Anders Kaseorg f4e7a11c35 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-11-15 15:27:54 -08:00
Alex Vandiver 091e2f177b linkifiers: Match JS implementation to server implementation.
Since the server-side implementation no longer uses look-ahead
or (more importantly) look-behind, it is possible to exactly implement
in Javascript.  This removes a common class which would prevent local
echo.

This requires reworking the topic linking algorithm, to march the
server's as well.  The tests and behaviour are adjusted in so doing --
previously, the JS implementation would have linked `#foo` with a
`foo` regex on the linkifier, but the server implementation would not
have.
2023-11-14 20:43:39 -08:00
Alex Vandiver 70b20e9d2b markdown: Use \p{White_Space} equivalent for linkifier boundaries.
We do not use \p{White_Space} itself because re2 does not support it.
2023-11-14 20:43:39 -08:00
Alex Vandiver 549dd8a4c4 markdown: Expand prepare_linkifier_pattern to make it more legible. 2023-11-14 20:43:39 -08:00
roanster007 dc492867af user_mention: Fix mentions of deactivated users.
Previously, when a deactivated user was mentioned, he wasn't
rendered as a Pill. This is because the dataset for validating mentions
only included active users, which is fixed by removing that filter.

To allow only silent mentions of them, an extra is_active property
added to FullNameInfo class, which is populated from the query,
which tells if user is deactivated. This is used to convert any
mentions of them to silent mentions in the backend markdown.

Fixes #26857
2023-11-08 09:48:31 -08:00
Anders Kaseorg a50eb2e809 mypy: Enable new error explicit-override.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-12 12:28:41 -07:00
Anders Kaseorg 835ee69c80 docs: Fix grammar errors found by mwic.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-09 13:24:09 -07:00
Aman Agrawal 8ef52d55d3 markdown: Add support for inline video thumbnails. 2023-10-02 22:39:02 -07:00
Anders Kaseorg 2665a3ce2b python: Elide unnecessary list wrappers.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-13 12:41:23 -07:00
Adrián Oliva 732ad89f3d markdown: Fix URL link topic skipping query.
When searching for links inside a topic name, the question mark (?)
was used to split the topic. If a URL had a query after the URL
(e.g., "?foo=bar"), then the query was trimmed from the URL.

Removing the question mark from `basic_link_splitter` is sufficient
to fix this issue. The `get_web_link_regex` function then removes
the trailing punctuation if any, including literal question marks.

Fixes #26368.
2023-09-08 16:17:11 -07:00
evykassirer 0289beb784 emoji: Match emoji sequences in markdown.
Fixes #11767.

Previously multi-character emoji sequences weren't matched in the
emoji regex, so we'd convert the characters to separate images,
breaking the intended display.

This change allows us to match the full emoji sequence, and
therefore show the correct image.
2023-08-23 16:18:15 -07:00
Anders Kaseorg 36dde99308 ruff: Appease SIM118 `"class" not in uncle.keys()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-17 17:05:34 -07:00
Anders Kaseorg 562a79ab76 ruff: Fix PERF401 Use a list comprehension to create a transformed list.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Anders Kaseorg c2c96eb0cf python: Annotate type aliases with TypeAlias.
This is not strictly necessary but it’s clearer and improves mypy’s
error messages.

https://docs.python.org/3/library/typing.html#typing.TypeAlias
https://mypy.readthedocs.io/en/stable/kinds_of_types.html#type-aliases

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 10:02:49 -07:00
Anders Kaseorg 3b09197fdf ruff: Fix RUF015 Prefer `next(...)` over single element slice.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-23 15:20:53 -07:00
Anders Kaseorg bd2f327a25 markdown: Remove obsolete comment.
It was obsoleted by commit 07fef56c74
(#19144).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-20 11:00:09 -07:00
Prakhar Pratyush 4c9d26ce17 mention: Send notifications for @topic wildcard mentions.
This commit completes the notifications part of the @topic
wildcard mention feature.

Notifications are sent to the topic participants for the
@topic wildcard mention.
2023-07-17 09:39:24 -07:00
Steve Howell b742f1241f realm emoji: Use a single cache for all lookups.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.

Cache misses should be very infrequent per realm.

If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.

I also renamed the two related functions.  I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.

Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
2023-07-17 09:35:53 -07:00
Prakhar Pratyush 0891f9f65a mention: Determine @topic mention during message rendering.
This commit adds a boolean field `mentions_topic_wildcard`
to the `MessageRenderingResult` dataclass.

The field is set to true only if message rendering determines
the message has an actual topic wildcard mention in it (and not,
e.g., topic wildcard mention syntax inside a code block).

The rendered content for topic wildcard mention is
'<span class="topic-mention">{wildcard}</span>'.

The 'topic-mention' class is the identifier for the wildcard
mention being a topic wildcard mention.

We don't use 'data-user-id="*"' and "user-mention" class for
topic wildcard mentions and eventually plan to remove them for
stream wildcard mentions too in a separate mini-project.
2023-07-13 11:34:48 -07:00
Lauryn Menard d84fd73db4 markdown-processor: Update insertion_index check for multiple classes.
Updates find_proper_insertion_index to check for the inline image
classes as matching at least one of the classes in the element's
attrib["class"] so that cases where an inline preview image has
multiple classes, like YouTube video previews, will have the
correct insertion index.

Fixes #26186.
2023-07-07 11:07:45 -04:00
Alex Vandiver ff53ee8e28 markdown: Only attempt to adjust /wiki/File: paths on Wikipedia. 2023-07-06 17:50:25 -07:00
Prakhar Pratyush 179d5cb37d mention: Replace 'wildcards' with 'stream_wildcards'.
This prep commit replaces the 'wildcard' keyword in the codebase
with 'stream_wildcard' at some places for better readability, as
we plan to introduce 'topic_wildcards' as a part of the
'@topic mention' project.

Currently, 'wildcards = ["all", "everyone", "stream"]' which is an
alias to mention everyone in the stream, hence better renamed as
'stream_wildcards'.

Eventually, we will have:
'stream_wildcard' as an alias to mention everyone in the stream.
'topic_wildcard' as an alias to mention everyone in the topic.
'wildcard' refers to 'stream_wildcard' and 'topic_wildcard' as a whole.
2023-07-03 22:03:17 -07:00
Tim Abbott dce4a3c98e markdown: Remove most of Twitter integration.
Twitter removed their v1 API. We take care to keep the existing cached
results around for now, and to not poison that cache, since we might
be able replace this with something that can still use the existing
cache.
2023-05-29 10:43:35 -07:00
Anders Kaseorg 9db3451333 Remove statsd support.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-04-25 19:58:16 -07:00
Zixuan James Li 268f858f39 linkifier: Support URL templates for linkifiers.
This swaps out url_format_string from all of our APIs and replaces it
with url_template. Note that the documentation changes in the following
commits  will be squashed with this commit.

We change the "url_format" key to "url_template" for the
realm_linkifiers events in event_schema, along with updating
LinkifierDict. "url_template" is the name chosen to normalize
mixed usages of "url_format_string" and "url_format" throughout
the backend.

The markdown processor is updated to stop handling the format string
interpolation and delegate the task template expansion to the uri_template
library instead.

This change affects many test cases. We mostly just replace "%(name)s"
with "{name}", "url_format_string" with "url_template" to make sure that
they still pass. There are some test cases dedicated for testing "%"
escaping, which aren't relevant anymore and are subject to removal.
But for now we keep most of them as-is, and make sure that "%" is always
escaped since we do not use it for variable substitution any more.

Since url_format_string is not populated anymore, a migration is created
to remove this field entirely, and make url_template non-nullable since
we will always populate it. Note that it is possible to have
url_template being null after migration 0422 and before 0424, but
in practice, url_template will not be None after backfilling and the
backend now is always setting url_template.

With the removal of url_format_string, RealmFilter model will now be cleaned
with URL template checks, and the old checks for escapes are removed.

We also modified RealmFilter.clean to skip the validation when the
url_template is invalid. This avoids raising mulitple ValidationError's
when calling full_clean on a linkifier. But we might eventually want to
have a more centric approach to data validation instead of having
the same validation in both the clean method and the validator.

Fixes #23124.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-04-19 12:20:49 -07:00
Anders Kaseorg 087660a87e requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-05 14:46:28 -08:00
Anders Kaseorg c1675913a2 web: Move web app to ‘web’ directory.
Ever since we started bundling the app with webpack, there’s been less
and less overlap between our ‘static’ directory (files belonging to
the frontend app) and Django’s interpretation of the ‘static’
directory (files served directly to the web).

Split the app out to its own ‘web’ directory outside of ‘static’, and
remove all the custom collectstatic --ignore rules.  This makes it
much clearer what’s actually being served to the web, and what’s being
bundled by webpack.  It also shrinks the release tarball by 3%.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-23 16:04:17 -08:00
Alex Vandiver ccecc8eb84 markdown: Comment why we do not hash or use STATIC_URL for :zulip:. 2023-02-14 17:17:06 -05:00
Anders Kaseorg 0a1904a6a7 markdown: Rewrite YouTube URL parser without regex spaghetti.
This also adds support for the new YouTube Shorts URLs.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 22:34:51 -08:00
Anders Kaseorg 70ac144d57 markdown: Replace custom cache decorator with functools.lru_cache.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-09 15:46:11 -08:00