Commit Graph

34 Commits

Author SHA1 Message Date
Alex Vandiver fa6daee4e1 markdown: Fix use of pure_markdown for non-pure markdown rendering.
`render_markdown_path` renders Markdown, and also (since baff121115)
runs Jinja2 on the resulting HTML.

The `pure_markdown` flag was added in 0a99fa2fd6, and did two
things: retried the path directly in the filesystem if it wasn't found
by the Jinja2 resolver, and also skipped the subsequent Jinja2
templating step (regardless of where the content was found).  In this
context, the name `pure_markdown` made some sense.  The only two
callsites were the TOS and privacy policy renders, which might have
had user-supplied arbitrary paths, and we wished to handle absolute
paths in addition to ones inside `templates/`.

Unfortunately, the follow-up of 01bd55bbcb did not refactor the
logic -- it changed it, by making `pure_markdown` only do the former
of the two behaviors.  Passing `pure_markdown=True` after that commit
still caused it to always run Jinja2, but allowed it to look elsewhere
in the filesystem.

This set the stage for calls, such as the one introduced in
dedea23745, which passed both a context for Jinja2, as well as
`pure_markdown=True` implying that Jinja2 was not to be used.

Split the two previous behaviors of the `pure_markdown` flag, and use
pre-existing data to control them, rather than an explicit flag.  For
handling policy information which is stored at an absolute path
outside of the template root, we switch to using the template search
path if and only if the path is relative.  This also closes the
potential inconsistency based on CWD when `pure_markdown=True` was
passed and the path was relative, not absolute.

Decide whether to run Jinja2 based on if a context is passed in at
all.  This restores the behavior in the initial 0a99fa2fd6 where a
call to `rendar_markdown_path` could be made to just render markdown,
and not some other unmentioned and unrelated templating language as
well.
2023-03-17 08:46:25 -07:00
Anders Kaseorg da3cf5ea7a ruff: Fix RSE102 Unnecessary parentheses on raised exception.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:34:55 -08:00
Anders Kaseorg b0e569f07c ruff: Fix SIM102 nested `if` statements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Anders Kaseorg 73c4da7974 ruff: Fix N818 exception name should be named with an Error suffix.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-17 16:52:00 -08:00
Anders Kaseorg 2bd81dd5c9 fenced_code: Avoid sloppy AttributeError handler.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-06 15:15:10 -07:00
Anders Kaseorg 7f0e11bd06 markdown: Rename preprocessor_priorities module to priorities.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-06-26 17:36:31 -07:00
Anders Kaseorg b0ce4f1bce docs: Fix many spelling mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-02-07 18:51:06 -08:00
Eeshan Garg bfbd77ca5c markdown: Organize preprocessor priorities in one place.
All of our custom Markdown extensions have priorities that govern
the order in which the preprocessors will be run. It is more
convenient to have these all in one file so that you can easily
discern the order at first glance.

Thanks to Alya Abbott for reporting the bug that led to this
refactoring!
2021-09-20 16:57:43 -07:00
Anders Kaseorg 498d2b48d9 fenced_code: Use find_lexer_class_by_name.
This is more efficient than get_lexer_by_name, since we don’t need to
instantiate the class just to get its name.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 22:31:46 -07:00
Anders Kaseorg 162e9d6c0b fenced_code: Optimize FENCE_RE to fix cubic worst-case complexity.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-22 16:40:44 -07:00
Anders Kaseorg dea935f26f fenced_code: Write FENCE_RE with a raw string.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-22 16:40:43 -07:00
akshatdalton 7d9e71be21 fenced_code: Add `process_contents` flag to de-duplicate code. 2021-07-22 10:57:23 -07:00
akshatdalton 6b5812082e markdown: Fix shebang line eliminating behaviour of Codehilite.
See the block comment explaining the motivation for this change, but
basically, the shebang feature of Python-Markdown's Codehilite
extension could be really confusing and is not part of the CommonMark
standard.

1. https://python-markdown.github.io/extensions/code_hilite/#shebang-no-path
2. eacff473a2/markdown/extensions/codehilite.py (L164-L180)

Fixes: #18591.
2021-07-15 15:18:33 -07:00
Anders Kaseorg 8486499314 fenced_code: Fix processor type annotation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-09 17:10:31 -07:00
akshatdalton db1cf3b521 refactor: Add class `ZulipBaseHandler` to de-duplicate code. 2021-07-07 17:53:22 -07:00
Anders Kaseorg d0c6f4f400 python: Strip leading and trailing spaces from docstrings.
This is enforced by Black ≥ 21.4b0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-07 22:42:39 -07:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg e7e1fde6ec fenced_code: Use immutable type for codehilite_conf.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:28 -08:00
Anders Kaseorg 08c64f5cfa markdown: Fix imports for compatibility with typeshed stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg ac5cbf7693 Revert "markdown: Escape lang when echoing back custom non-pygments languages."
This reverts commit 564b199fe6, which
was part of #16308.

Escaping is either required or incorrect; it is never “defensive”.
This escaping is incorrect.  lxml already escapes attributes during
serialization (any other behavior would be a serious bug), and
additional escaping just results in double escaping.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-02 16:23:48 -08:00
akshatdalton 620e9cbf72 markdown: Fix merging of separate quotations.
Initally, when writing two or more quotes, having
a blank line in between them, merges those quotes.
This created confusion especially in "quote and reply".

This commit fixes such issues. Now two or more quotes
having a blank line in between them, will not get merged.

This change is correct both for usability and for improving our
compatibility with CommonMark.

Fixes #14379.
2020-10-30 15:21:15 -07:00
Anders Kaseorg 9281dccae4 python: Serialize lxml elements directly to str.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg 254b904965 markdown: Migrate off deprecated extension registration interface.
Fixes #15205.

https://python-markdown.github.io/change_log/release-3.0/#homegrown-ordereddict-has-been-replaced-with-a-purpose-built-registry
https://python-markdown.github.io/change_log/release-3.0/#md_globals-keyword-deprecated-from-extension-api

The priority numbers are arbitrarily chosen to preserve the existing
order.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-19 18:31:12 -07:00
Anders Kaseorg d81a93cdf3 requirements: Upgrade markdown to 3.3.1.
Upstream has slightly changed the whitespace around stashes.  Take
this opportunity to clean up the extra blank lines we were outputting.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-19 11:54:14 -07:00
Sumanth V Rao 564b199fe6 markdown: Escape lang when echoing back custom non-pygments languages.
In ae58ed5a7 we decided to echo back the text, when no Pygments lexer
matching that language was found. When we do so, we must take care to
HTML escape the lang before wrapping it in a data-code-language attribute.

Tweaked by tabbott to make clear the escaping is defensive.
2020-09-18 17:12:11 -07:00
Tim Abbott ae58ed5a74 markdown: Tweak data-code-language testing and comments.
This should make it clearer the precise decisions we've made about the
intended semantics of this feature.
2020-09-15 12:30:57 -07:00
Sumanth V Rao b0c9e0a295 markdown: Rename fenced code data-attribute to data-code-language. 2020-09-15 20:09:58 +05:30
Sumanth V Rao 033351609d markdown: Add data-codehilite-language attr for fenced code.
When converting fenced code markdown, we add the language (if specified)
in a data-attribute by tweaking the HTML generated. Doing so, allows the
frontend to make use of this attr to display view-in-playground option
for codeblocks.

We use pygments to get the lexer subclass name and use that instead of
directly using the language in the data-attribute. Doing so, helps us
map different language aliases (like `js` and `javascript`) into a common
variable (like `JavaScript`) - and avoids the client from dealing with
multiple tags corresponding to the same language.

The html structure for a message like this:

``` js
..content..
```

would now be:

<div class="codehilite" data-codehilite-language="JavaScript">
    <pre>..content..</pre>
</div>

Tests and fixtures amended.
2020-09-14 21:25:19 -07:00
Anders Kaseorg f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg 6189e4d0c1 python: Convert more percent formatting to "".format.
Semgrep has gotten a little more clever at applying the percent
formatting rule.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-07-13 13:16:38 -07:00
Mohit Gupta 08e74558a9 refactor: Rename remaining bugdown word to markdown in .py files.
This commit is part of series of commits aimed at renaming bugdown to
markdown.
2020-06-29 15:03:20 -07:00
Mohit Gupta 05cce86670 refactor: Change BugdownRenderingException to MarkdownRenderingException.
This commit is part of series of commits aimed at renaming bugdown to
markdown.
2020-06-26 17:08:37 -07:00
Mohit Gupta 3f5fc13491 refactor: Rename zerver.lib.bugdown to zerver.lib.markdown .
This commit is first of few commita which aim to change all the
bugdown references to markdown. This commits rename the files,
file path mentions and change the imports.
Variables and other references to bugdown will be renamed in susequent
commits.
2020-06-26 17:08:37 -07:00