We do not use `get_link_embed_data` for messsages sent by
bots, as bots often repeat the same URL over and over again
and are generally either text-focused or have their own
mechanisms to provide preview content.
Fixes#2968.
(The commit q7ef4e40258280e202325c9295579c93fb948b replaced
data-user-email with data-user-id, but we still need to
support data-user-email for old clients like non-updated
androids and we still want to start the migration forward
to data-user-id.)
This changes bugdown to use the realm passed in by the caller (if any)
for rendering, fixing a problem where bots such as the notification
bot would have their messages rendering using the admin realm's
settings, not the settings of the realm their messages are being sent
into.
Also adds a test for the notification bot case.
Fixes#3215.
A lot of care has been taken to ensure we're using the realm that the
message is being sent into, not the realm of the sender, to correctly
handle the logic for cross-realm bot users such as the notifications
bot.
This should substantially improve the clarity of the code, since
inside bugdown, this is only being used as a hash key that happens to
usually be a realm ID, not used as a Realm ID.
This change adds support for displaying inline open graph previews for
links posted into Zulip.
It is designed to interact correctly with message editing.
This adds the new settings.INLINE_URL_EMBED_PREVIEW setting to control
whether this feature is enabled.
By default, this setting is currently disabled, so that we can burn it
in for a bit before it impacts users more broadly.
Eventually, we may want to make this manageable via a (set of?)
per-realm settings. E.g. I can imagine a realm wanting to be able to
enable/disable it for certain URLs.
Now that we have updated python-markdown, we remove the deprecated
safe_mode. We used safe_mode to escape raw html, so now instead we
pass in an EscapeHtml markdown extension to the markdown engine.
See https://pythonhosted.org/Markdown/release-2.6.html for details on
the deprecation.
Fixes: #2037 (also addresses the remaining piece of #2043).
This distinguishes between YouTube Videos and Image Previews by adding
a particular “youtube-video” class to the preview along with changing
the title to the video ID rather than the link. This serves to allow
the lightbox to ID when a lightbox preview should be treated like a
YouTube video rather than an image preview.
This also modifies the tests in bug down to expect a youtube-video class
along with the title to just be the video ID on YouTube rather than the
entire URL link.
The changes that required us to fork this extension had been merged
into upstream CodeHilite, so we can remove it and switch to using the
version that comes with python-markdown.
This updates Bugdown to reflect the changes in the updated
markdown. In particular, we now pass a default config object in the
__init__ for the Bugdown extension, update the make_md_engine function
to take kwargs as opposed to a config list, and have UListProcessor
inherit from ulist as opposed to olist (which no longer works).
We update the (forked from upstream) fenced_code extension's
makeExtension to take args and kwargs, and update
FencedBlockPreprocessor __init__ method with updated Codehilite
arguments.
We update the (forked from upstream) Codehilite extension to
mirror the logic with the latest upstream Codehilite:
Add parse_hl_lines function
update makeExtension to take args and kwarfs instead of config
list
Add regex for highlight lines
use linenums instead of linenos
use get_formatter_by_name instead of HtmlFormatter
user get_lexer_by_name instead of TextLexer
add hl_lines and use_pygments arguments to the codehlite
constructor
Change the parameter name of some functions from 'md' to 'content',
since the name 'md' seems to be the reason why this parameter was
wrongly annotated.
This sets the “title” attribute on the image to the actual title of
the image specified by the user in their markdown, rather than just
the URL of the full link to it.
The bugdown parser no longer has a concept of which users need which
alert words, since it can't really do anything actionable with that info
from a rendering standpoint.
Instead, our calling code passes in a set of search words to the parser.
The parser returns the list of words it finds in the message.
Then the model method builds up the list of user ids that should be
flagged as having alert words in the message.
This refactoring is a little more involved than I'd like, but there are
still some circular dependency issues with rendering code, so I need to
pass in the rather complicated realm_alert_words data structure all the way
from the action through the model to the renderer.
This change shouldn't change the overall behavior of the system, except
that it does remove some duplicate regex checks that were occurring when
multiple users may have had the same alert word.
We now raise an exception in bugdown.do_convert() if rendering
fails, to avoid silent failures, and then calling code can convert
the exception to a JsonableError.
!avatar, !modal_link, !gravatar, etc. were incorrectly being processed
before the escape character for code blocks.
While we're at it, we add tests for these special syntaxes.
Many stubs in xml.etree.ElementTree use Union[str, bytes] as
return type. Mypy wants us to correctly handle each case. This
is correct, but not useful for us since we know that we'll always
get str. So force the return value to text_type, to supress mypy
errors.
Add a class 'BaseHandler' and make it a base class of OuterHandler,
QuoteHandler and CodeHandler. This will help annotate some functions
and improve type checking.
Change `str` to `text_type` in annotations in zerver/models.py
related to realm emoji and realm filters.
Also fix clashing annotations in zerver/lib/bugdown/__init__.py.
Had to add some "type: ignore" because the pattern used in match
doesn't affect the type returned. A fix for this issue has been pushed
to typeshed - https://github.com/python/typeshed/pull/244
Since we don't have a stable way to get the Dropbox preview failure
image (and it was sorta a weird setup anyway), it seems best to just
remove the condition.
This also requires updating the required version of oauthlib; previously an
appropriate version was being installed only because it was a dependency of
the wrong twitter library.
This only affects development environments and/or hand-built
installations relying on the contents of requirements.txt.
To fix existing environments, the incorrect api needs to be explicitly
removed with `pip uninstall twitter`.
Fixes#86.
This reverts commit 39f2908a32c0276b1d87ecedc876c71dd35a9b2f.
We're not including the preview_fail.png image in the release.
(imported from commit 2de1451de2f9b1727fc3a7e64c380b71c0f2caa8)
One common place that this happens (for us) is on a local
Dropbox .dev.corp.dropbox.com instance, which can't be reached
by the Zulip servers.
This commit also:
* Fixes the test suite
* Properly previews /photos/ links
(imported from commit b4788b6236e7a9d390e1efc4673be34d9ba5e091)
See #2357. We now support `~~~ .py ` with that trailing space.
Note that the test coverage is Python-side only due to
bugdown_matches_marked being set to false, since we don't yet
support language syntax on the client side.
(imported from commit ccd5fcb0eee01478d349161400103480678d7486)
We now will match an alert word even if it is used at the boundry of
bolding, backtick escaping, or caret quoting.
Closes trac #2186.
(imported from commit 984bc63eb621772c95a01ca5c5bfeb190767f71f)
Apparently the "inline" treeprocessor is what runs the inline
patterns. Also re-enable the rewriting-to-https support.
(imported from commit 2fde2c1f15217a784f26b16db25ee745f424f2f0)
When new streams are created we now send a message with a custom
markdown tag that renders a subscribe button.
(imported from commit 9dfba280b3b4ff4f32f6431ef9227867c8bf4b40)
Image and video links in the twitter API are media and need to be
handed on separately. We also include a preview image if the media link
is a to a picture.
(imported from commit 2bd00d267e51b29ad0ba681195b2bfea9b991d8c)
This converts links in tweets to a tags. We also convert the displayed
text to the target of the twitter short URL. Mentions are linked to the
users twitter page.
(imported from commit 192d5546a7eea82759f9ae30d82c102aed15ff71)
* Deal with shorter tweet IDs
(some old tweets don't have a full 18-character ID)
* Allow trailing slash
* Deal with old-style #! syntax
* Deal with links that link to a photo
(imported from commit 008a98c806f3b8dddd9e2f18a8f002af6932766f)
These images at least load now, but that's because Camo redirects
the browser to the origin server, so the only effect is an extra
round-trip time.
(imported from commit 0d6b9c888a5cdfaa9299272d74a085e872dfa434)
Now we can nest fenced code/quote blocks inside of quote
blocks down to arbitrary depths. Code blocks are always leafs.
Fenced blocks start with at least three tildes or backticks,
and the clump of punctuation then becomes the terminator for
the block. If the user ends their message without terminators,
all blocks are automatically closed.
When inside a quote block, you can start another fenced block
with any header that doesn't match the end-string of the outer
block. (If you don't want to specify a language, then you
can change the number of backticks/tildes to avoid amiguity.)
Most of the heavy lifting happens in FencedBlockPreprocessor.run().
The parser works by pushing handlers on to a stack and popping
them off when the ends of blocks are encountered. Parents communicate
with their children by passing in a simple Python list of strings
for the child to append to. Handlers also maintain their own
lists for their own content, and when their done() method is called,
they render their data as needed.
The handlers are objects returned by functions, and the handler
functions close on variables push, pop, and processor. The closure
style here makes the handlers pretty tightly coupled to the outer
run() method. If we wanted to move to a class-based style, the
tradeoff would be that the class instances would have to marshall
push/pop/processor etc., but we could test the components more
easily in isolation.
Dealing with blank lines is very fiddly inside of bugdown.
The new functionality here is captured in the test
BugdownTest.test_complexly_nested_quote().
(imported from commit 53886c8de74bdf2bbd3cef8be9de25f05bddb93c)
CUSTOMER13 doesn't want it, and there's currently no nginx config
or configurable Camo URI, so it wouldn't work if image preview
were enabled.
(imported from commit 615d4a32acbc4d4d590f88cf4e7d45d8f49db1d3)
The !gravatar markdown no longer hard codes to Gravatar, but
instead it serves up our generic avatar URL.
(imported from commit 4e3e2baeb3374bcf025a18ff27a8452b975c22b7)
Arguably the nl2br extension should be doing this for us. Given that
we're using nl2br, the "two spaces at the end of a line makes a line
break" rule doesn't make any sense (since every newline leads to a
linebreak), so we disable it.
(imported from commit 5ffa2ac8a825642ad31e085c532091e076665710)
Trac #1162
The process_fence method replaces code blocks with placeholders, so
indexes stored before the replacement are incorrect. However, because
the closed code blocks have been replaced, we can simply search the
whole string for any remaining opening code block markers.
(imported from commit 6a9e6924840f8f3ca5175da7c52a905e27c1fabd)
We want to avoid opening a DB connection in the markdown thread
as its DB connection might live for a long time
(imported from commit 7700b2ca793ee5e9add7f071b92f22a4bf576b3d)
The text of manual links are already AtomicStrings, so linkified strings
should be too.
Moves emoji detection to happen after linkification, so the emoji rule
won't look at links.
(imported from commit 9c56bce6a0e873b398255e0762dfb312a4a9a64e)
InlinePatterns should return None on failure, not text that may
have placeholders in it.
(imported from commit f9d8d22b2b8cfa7a92ecf3e52a6c76b48e6f0175)
LinkPattern returned a string which contained a placeholder if the URL was
considered invalid. AtomicLinkPattern wrapped this in an AtomicString,
where the placeholder doesn't get removed properly.
m.group(0) is always incorrect because python-markdown modifies your regex
to include more than you specified (this is why part of the message got
duplicated).
(imported from commit 576bdf09c2b677cf4bc56484c363eb05f2110158)
It is triggered by specifying the "language" of a code block to
"quote" or "quoted":
Hamlet said:
~~~ quote
To be or **not** to be.
That is the question
~~~
(imported from commit 847a0602e335e9f2955e32d9955adf8ac8de068c)
Now parsed: 🍺,🍺;🍺!
If \w characters surround :foo:, we still say it's NOT an
emoji, but we used to do this for \S characters, so it's loosened up.
(imported from commit 49b33d2f0ffdcfde8947ae411a4addcf4c24af9c)
This needs to be deployed to both staging and prod at the same
off-peak time (and the schema migration run).
At the time it is deployed, we need to make a few changes directly in
the database:
(1) UPDATE django_content_type set app_label='zerver' where app_label='zephyr';
(2) UPDATE south_migrationhistory set app_name='zerver' where app_name='zephyr';
(imported from commit eb3fd719571740189514ef0b884738cb30df1320)