This commit increases the rendered_content limit from 2x to 10x of the
original message length.
Earlier, we had placed a limit of MAX_MESSAGE_LENGTH * 2 for the
rendered content (explained in commit
77addc5456). That limit was based on
the assumption that in most cases, the rendered content wouldn't cause
a large increase in message length. However, quite prominently in
syntax highlighted codeblocks, that wasn't true and this caused the
limit condition to be hit for long messages composed primarily of code
blocks.
Example: The following message would render close to 10x it's original size.
```py
if:
def:
print("x", var)
x = y
```
Because the syntax highlighted logic is extremely compressible, having
rendered_content reach up to 100KB doesn't create a network
performance problem.
Also switches the default behaviour of the code to not translate the
emoticons. Earlier, the code was testing-aware, and used to translate
when there was no user profile data available(assuming that as a testing
environment).
In 18e43895ff we replaced
stream subscribe buttons with stream links. The new feature
has been well tested and well received for over a year now,
so it's safe to remove the older feature at this point.
Older sites will have super old messages that still have the
rendered markup; this commit does not attempt to address those
situations. Most likely, clicking on an old button in the old
message will either do nothing or look like a message reply.
This commit refactors the bugdown to perform a lookup only on active
realm emojis. This was needed because once we migrate realm emojis
to be addressed by `id` rather than name, it will be costly to
perform a lookup on all the realm emojis.
This field has been unused by clients for some time, and isn't great
for our public archive feature plans (where we'll not want to be
including email addresses in messages).
Add `translate_emoticons` to `prop_types` and `expected_keys`.
Furthermore, create a emoji-translating Markdown inline pattern.
Also use a JavaScript version of `translate_emoticons` and then use
this function during Markdown previews and as a preprocessor. This
is only needed for previews, because usually emoticon translation
happens on the backend after sending.
Add tests for emoticon translation, a settings UI, and a /help/ page
as well.
Tweaked by tabbott to fix various test failurse as well as how this
handles whitespace, requiring emoticons to not have adjacent
characters.
Fixes#1768.
The Markdown extension that lives inside
zerver/lib/bugdown/api_code_example.py previously used ujson.
ujson's `dumps` function doesn't accept a `separators` argument,
which means we have no control over how the JSON is pretty-printed.
This resulted in JSON fixtures with no spaces after the colon, which
looks unnecessarily convoluted.
So now, we use the built-in `json` module to get around this.
For further reading, this issue
<https://github.com/esnme/ultrajson/issues/82> opened on ujson's
repo explains why they are reluctant to support such formatting
due to performance considerations.
This commit prefixes stream names in urls with stream ids,
so that the urls don't break when we rename streams.
strean name: foo bar.com%
before: #narrow/stream/foo.20bar.2Ecom.25
after: #narrow/stream/20-foo-bar.2Ecom.25
For new realms, everything is simple under the new scheme, since
we just parse out the stream id every time to figure out where
to narrow.
For old realms, any old URLs will still work under the new scheme,
assuming the stream hasn't been renamed (and of course old urls
wouldn't have survived stream renaming in the first place). The one
exception is the hopefully rare case of a stream name starting with
something like "99-" and colliding with another stream whose id is 99.
The way that we enocde the stream name portion of the URL is kind
of unimportant now, since we really only look at the stream id, but
we still want a safe encoding of the name that is mostly human
readable, so we now convert spaces to dashes in the stream name. Also,
we try to ensure more code on both sides (frontend and backend) calls
common functions to do the encoding.
Fixes#4713
This enforces `**` around all the mentions including "at-all" and
"at-everyone" mentions. Hence this makes `@all` and `@everyone`
invalid mentions, resulting into proper syntax for these mentions as
`@**all**` and `@**everyone**` respectively.
Note from tabbott: This removes an old feature/syntax, which made
sense back when @Tim was also a way to mention a user with Tim as
their first name. Given how nice typeahead is now, the user part of
the feature was removed a while ago; this should have gone at the same
time.
Fixes: #8143.
This commit modifies the Markdown extension in bugdown/api_code_examples.py
to support rendering code examples in multiple languages by specifying
the language like so:
{generate_code_example(python)|doc.md|example}
This makes us one step closer towards adding support for testable
JavaScript code examples.
If some bug in Bugdown results in a rendered message content that is
bigger than twice the message size, we now just throw an exception
from Bugdown. This is considerably better than the old behavior,
which might result in an enormous message being placed in the database
(potentially, bigger than the 1MB limit to store in memcached), which
would in turn result in tragic consequences.
This fixes#8322, in that it prevents the super bad outcome seen there
(where basically Zulip became unusable for everyone on the stream
where the message is posted). Now, the failure mode is just the
message failing to send. Still not ideal (and requires further work
on the URL embed feature), but not a minor problem, not a major one.
This commit adds support for passing in an argument to the macro
"call" to explicitly specify a fixture to render, like so:
{generate_code_example|doc_name|fixture(stream_message_with_args)}
To generate a code exammple,
{generate_code_example|<api_doc_md>|example} sounds better and
more intuitive than,
{generate_code_example|<api_doc_md>|method}
Some of our code examples can only be run with administrator
credentials (such as create-user). Thus, the Markdown extension
for generating code examples should have an option to include
the lines that recommend using an admin zuliprc instead of a
non-admin one.
Now that the Markdown extension defined in
zerver/lib/bugdown/api_generate_examples depended on code in the
tools/lib/* directory, it caused the production tests to fail since
the tools/ directory wouldn't exist in a production environment.
This commit adds a Markdown extension that allows the following
syntax,
{generate_code_example|<md_file_name>|<fixture or method>}
to generate code examples and fixtures found in tools/lib/api_tests.py
and templates/zerver/api/fixtures.json, respectively.
This also amends a commit from Brock Whittaker <brock@zulipchat.com>
that merges two separate functions for YouTube videos and Vimeo videos
into a generic video recall function.
Fixes#7550.
This commit does the following:
* Move the Arguments table data from stream-message.md and
private-message.md to a JSON file.
* Add a Markdown extension that allows one to include and render
a table from a JSON file like so:
{generate_arguments_table|arguments.json|private-stream.md}
* Use Bootstrap's .table class to format the table instead of
relying on custom CSS.
This uses the correct regex for strikethrough. Also, added
a test to make sure that strikethrough works when it contains
link with whitespace.
Fixes#7596.
This name hasn't been right since f7f2ec0ac back in 2013; this handler
sends the log record to a queue, whose consumer will not only maybe
send a Zulip message but definitely send an email. I found this
pretty confusing when I first worked on this logging code and was
looking for how exception emails got sent; so now that I see exactly
what's actually happening here, fix it.
Adds a markdown preprocessor that finds ordered lists where all items
use the same number and change them to be in normal increasing order,
starting with that number.
Fixes#5159.
Hides URL if the message content == image url so that sending gifs or
images feels less cluttered. Uses the url_to_a() function to generate
the expected url string for matching.
Fixes#7324.
The character ">" now only starts a blockquote if the resulting
blockquote would be non-empty. Thus, by itself, ">" is now
interpreted literally by bugdown, fixing #687. The message
with contents consisting of ">>>" is now parsed as a doubly
(not triply) nested blockquote with contents ">". Properly
formed blockquotes have identical behavior as before, but now
bugdown can no longer produce empty blockquotes as output.
Fixes#2886, #687.
This commit helps reduce clutter on the navigation sidebar.
Creates new directories and moves relevant files into them.
Modifies index.rst, symlinks, and image paths accordingly.
This commit also enables expandable/collapsible navigation items,
renames files in docs/development and docs/production,
modifies /tools/test-documentation so that it overrides a theme setting,
Also updates links to other docs, file paths in the codebase that point
to developer documents, and files that should be excluded from lint tests.
Note that this commit does not update direct links to
zulip.readthedocs.io in the codebase; those will be resolved in an
upcoming follow-up commit (it'll be easier to verify all the links
once this is merged and ReadTheDocs is updated).
Fixes#5265.
While fixing an issue related to email gateway messages not getting
rendered properly, I unknowingly introduced a bug in the markdown
engine updation code. This commit fixes it. The issue was that for
a realm having email gateway setup, updation of realm filters would
lead to the updation of only one of the markdown engines not both.
The intended use of $$ is for inline expressions, not for multiline
ones; ```math is an acceptable alternative for the latter. Hence,
the $$-syntax for inline TeX no longer permits newlines within it.
This was also necessary for the next change to be sensible; namely
allowing for spaces around both $$ when crafting inline TeX instead of
forcing everything to be crammed together, e.g. $$x=7$$. In order to
avoid uninentionally creating inline expressions, the opening and
closing $$'s of an inline expression must now both exactly consist of
two dollar signs, no more and no less.
Fixes: #6488.
Generally emails are not written with markdown in mind and hence
sometimes render in strange ways. This commit fixes a particular
issue that was causing whitespace before paragraphs to be treated
as code block due to which email content was being rendered in a
box that scrolls in right direction a lot.
Fixes: #7045.
This adds the data model and bugdown support for the new UserGroup
mention feature.
Before it'll be fully operational, we'll still need:
* A backend API for making these.
* A UI for interacting with that API.
* Typeahead on the frontend.
* CSS to make them look pretty and see who's in them.
We now have a MentionData class that encapsulates
the users who are possibly mentioned in a message.
Not that the rendering code may not keep all the mentions,
since things like backticks will suppress the mention.
We populate this now in do_send_messages, so that we can use
the info earlier in the message-sending process. This info
now gets passed down the call stack as an optional parameter.
Note that bugdown.convert() still populates the data when its
callers decline to pass in a MentionData object.
This is mostly a preparatory commit, as we don't take advantage
of the data yet in do_send_messages.
Nobody has used this feature in years, and it causes certain types of
markdown issues in development to completely DoS the development
environment by making it possible for the "Bugdown timeout" exception
handler to timeout in bugdown.
Since we already send an email to the server administrators, there's
no need to replace this feature with anything.
This commit switches to use sprite sheets for rendering emojis
in all the remaining places, i.e., message bodies and composebox
typeahead. This commit also includes some changes to notifications.py
file so that the spans used for rendering emojis can be converted
to corresponding image tags so that we don't break the emoji rendering
in missed message emails since we can't use sprite sheets there.
As part of switching the bugdown system to use sprite sheets, we need
to switch the name_to_codepoint mappings to match the new sprite
sheets. This has the side effect of fixing a bunch of emoji like
numbers and flag emoji in the emoji pickers.
Fixes: #3895.
Fixes: #3972.
We now triage message content for possible mentions before
going to the cache/DB to get name info. This will create an
extra data hop for messages with mentions, but it will save
a fairly expensive cache lookup for most messages. (This will
be especially helpful for large realms.)
[Note that we need a subsequent commit to actually make the speedup
happen here, since avatars also cause us to look up all users in
the realm.]
Given typeahed and the fact that this only worked if the person had a
full name that didn't contain whitespace, this side effect of the
original @shortname mentionfeature that we removed was experienced by
users as a bug.
Fixes#6142.
Process the unicode emojis in twitter link previews and render them
properly. Before this we were not processing the unicode emojis in
twitter link previews and hence on the systems which don't have
fonts for displaying them they were rendered as blank boxes.
Fixes: #5427.
This commit renames list named `to_linkify` in twitter link processor
to `to_process` and adds a `type` field to each entry in it to
indicate the type of data represented by that particular entry.
Unicode emojis when rendered should display canonical short name.
Similarly, the alt text should be of the format `:<short_name>:`.
For both of these we currently display the actual unicode symbol.
As some systems don't have the fonts necessary for displaying them
properly, they are rendered as empty square blocks. This commit also
ensures that the markup generated for emoji generated by canonical
name and by an unicode emoji is same.
Fixes: #5555.
We used shortnames for mentioning users before we had autocomplete
feature. Since we now have autocomplete typeahead, this syntax is
no more useful and just causes problems. This commit removes the
shortname mention syntax.
Fixes: #4189.
A deactivated realm emoji should neither be accepted further as a
reaction nor its further occurences in a message be rendered as an
emoji. However, all the old occurences should continue to render
normally.
The `data-toggle` property prevented the new style of overlay modals
from launching, and regardless, isn't a future-proof options for how
this should work.
The regex we were using didn't cover all the unicode blocks
to which our emojis belong. This commit fixes the regex to
include all the unicode blocks and also updates the
corresponding JS regex in marked.js.
Fixes: #3460.
Unicode codepoints are of minimum length 4, padded with extra
zeroes if the length is less than 4. This commit fixes the
`unicode_emoji_to_codepoint()` to ensure that the codepoint
it generates are of correct length.
- Add file_name field to `RealmEmoji` model and migration.
- Add emoji upload supporting to Upload backends.
- Add uploaded file processing to emoji views.
- Use emoji source url as based for display url.
- Change emoji form for image uploading.
- Fix back-end tests.
- Fix front-end tests.
- Add tests for emoji uploading.
Fixes#1134
Earlier, a stack was being used to go through the message and search
for links. Because of this, in some cases the images were added to
the preview in reverse. Using a queue will keep the image previews in
the same order as they appeared in the message.
Fixes#4453.
By default, Python markdown tab length for indents is 4 spaces, which
require using 4 spaces or a tab to create nested elements. This
modifies that setting to specify 2-space indentation for nesting
elements only.
Modified significantly by tabbott to limit the change to just list
indentation.
Fixes#4252.
Use `name_to_codepoint.json` file (and the similar structure in
emoji_codes.js) to map emoji names directly to codepoints and change
the rendered emoji image to `unicode/<codepoint.png>` rather than
`<emoji_name>.png`.
Fixes: #3539.
We do not use `get_link_embed_data` for messsages sent by
bots, as bots often repeat the same URL over and over again
and are generally either text-focused or have their own
mechanisms to provide preview content.
Fixes#2968.
(The commit q7ef4e40258280e202325c9295579c93fb948b replaced
data-user-email with data-user-id, but we still need to
support data-user-email for old clients like non-updated
androids and we still want to start the migration forward
to data-user-id.)
This changes bugdown to use the realm passed in by the caller (if any)
for rendering, fixing a problem where bots such as the notification
bot would have their messages rendering using the admin realm's
settings, not the settings of the realm their messages are being sent
into.
Also adds a test for the notification bot case.
Fixes#3215.
A lot of care has been taken to ensure we're using the realm that the
message is being sent into, not the realm of the sender, to correctly
handle the logic for cross-realm bot users such as the notifications
bot.
This should substantially improve the clarity of the code, since
inside bugdown, this is only being used as a hash key that happens to
usually be a realm ID, not used as a Realm ID.
This change adds support for displaying inline open graph previews for
links posted into Zulip.
It is designed to interact correctly with message editing.
This adds the new settings.INLINE_URL_EMBED_PREVIEW setting to control
whether this feature is enabled.
By default, this setting is currently disabled, so that we can burn it
in for a bit before it impacts users more broadly.
Eventually, we may want to make this manageable via a (set of?)
per-realm settings. E.g. I can imagine a realm wanting to be able to
enable/disable it for certain URLs.
Now that we have updated python-markdown, we remove the deprecated
safe_mode. We used safe_mode to escape raw html, so now instead we
pass in an EscapeHtml markdown extension to the markdown engine.
See https://pythonhosted.org/Markdown/release-2.6.html for details on
the deprecation.
Fixes: #2037 (also addresses the remaining piece of #2043).
This distinguishes between YouTube Videos and Image Previews by adding
a particular “youtube-video” class to the preview along with changing
the title to the video ID rather than the link. This serves to allow
the lightbox to ID when a lightbox preview should be treated like a
YouTube video rather than an image preview.
This also modifies the tests in bug down to expect a youtube-video class
along with the title to just be the video ID on YouTube rather than the
entire URL link.
The changes that required us to fork this extension had been merged
into upstream CodeHilite, so we can remove it and switch to using the
version that comes with python-markdown.
This updates Bugdown to reflect the changes in the updated
markdown. In particular, we now pass a default config object in the
__init__ for the Bugdown extension, update the make_md_engine function
to take kwargs as opposed to a config list, and have UListProcessor
inherit from ulist as opposed to olist (which no longer works).
We update the (forked from upstream) fenced_code extension's
makeExtension to take args and kwargs, and update
FencedBlockPreprocessor __init__ method with updated Codehilite
arguments.
We update the (forked from upstream) Codehilite extension to
mirror the logic with the latest upstream Codehilite:
Add parse_hl_lines function
update makeExtension to take args and kwarfs instead of config
list
Add regex for highlight lines
use linenums instead of linenos
use get_formatter_by_name instead of HtmlFormatter
user get_lexer_by_name instead of TextLexer
add hl_lines and use_pygments arguments to the codehlite
constructor
Change the parameter name of some functions from 'md' to 'content',
since the name 'md' seems to be the reason why this parameter was
wrongly annotated.
This sets the “title” attribute on the image to the actual title of
the image specified by the user in their markdown, rather than just
the URL of the full link to it.
The bugdown parser no longer has a concept of which users need which
alert words, since it can't really do anything actionable with that info
from a rendering standpoint.
Instead, our calling code passes in a set of search words to the parser.
The parser returns the list of words it finds in the message.
Then the model method builds up the list of user ids that should be
flagged as having alert words in the message.
This refactoring is a little more involved than I'd like, but there are
still some circular dependency issues with rendering code, so I need to
pass in the rather complicated realm_alert_words data structure all the way
from the action through the model to the renderer.
This change shouldn't change the overall behavior of the system, except
that it does remove some duplicate regex checks that were occurring when
multiple users may have had the same alert word.
We now raise an exception in bugdown.do_convert() if rendering
fails, to avoid silent failures, and then calling code can convert
the exception to a JsonableError.
!avatar, !modal_link, !gravatar, etc. were incorrectly being processed
before the escape character for code blocks.
While we're at it, we add tests for these special syntaxes.
Many stubs in xml.etree.ElementTree use Union[str, bytes] as
return type. Mypy wants us to correctly handle each case. This
is correct, but not useful for us since we know that we'll always
get str. So force the return value to text_type, to supress mypy
errors.
Add a class 'BaseHandler' and make it a base class of OuterHandler,
QuoteHandler and CodeHandler. This will help annotate some functions
and improve type checking.
Change `str` to `text_type` in annotations in zerver/models.py
related to realm emoji and realm filters.
Also fix clashing annotations in zerver/lib/bugdown/__init__.py.
Had to add some "type: ignore" because the pattern used in match
doesn't affect the type returned. A fix for this issue has been pushed
to typeshed - https://github.com/python/typeshed/pull/244
Since we don't have a stable way to get the Dropbox preview failure
image (and it was sorta a weird setup anyway), it seems best to just
remove the condition.