"""
Fenced Code Extension for Python Markdown
=========================================

This extension adds Fenced Code Blocks to Python-Markdown.

    >>> import markdown
    >>> text = '''
    ... A paragraph before a fenced code block:
    ...
    ... ~~~
    ... Fenced code block
    ... ~~~
    ... '''
    >>> html = markdown.markdown(text, extensions=['fenced_code'])
    >>> print(html)
    <p>A paragraph before a fenced code block:</p>
    <pre><code>Fenced code block
    </code></pre>

Works with safe_mode also (we check this because we are using the HtmlStash):

    >>> print(markdown.markdown(text, extensions=['fenced_code'], safe_mode='replace'))
    <p>A paragraph before a fenced code block:</p>
    <pre><code>Fenced code block
    </code></pre>

Include tildes in a code block and wrap with blank lines:

    >>> text = '''
    ... ~~~~~~~~
    ...
    ... ~~~~
    ... ~~~~~~~~'''
    >>> print(markdown.markdown(text, extensions=['fenced_code']))
    <pre><code>
    ~~~~
    </code></pre>

Removes trailing whitespace from code blocks that causes horizontal scrolling
    >>> import markdown
    >>> text = '''
    ... A paragraph before a fenced code block:
    ...
    ... ~~~
    ... Fenced code block \t\t\t\t\t\t\t
    ... ~~~
    ... '''
    >>> html = markdown.markdown(text, extensions=['fenced_code'])
    >>> print(html)
    <p>A paragraph before a fenced code block:</p>
    <pre><code>Fenced code block
    </code></pre>

Language tags:

    >>> text = '''
    ... ~~~~{.python}
    ... # Some python code
    ... ~~~~'''
    >>> print(markdown.markdown(text, extensions=['fenced_code']))
    <pre><code class="python"># Some python code
    </code></pre>

Copyright 2007-2008 [Waylan Limberg](http://achinghead.com/).

Project website: <http://packages.python.org/Markdown/extensions/fenced_code_blocks.html>
Contact: markdown@freewisdom.org

License: BSD (see ../docs/LICENSE for details)

Dependencies:
* [Python 2.4+](http://python.org)
* [Markdown 2.0+](http://packages.python.org/Markdown/)
* [Pygments (optional)](http://pygments.org)

"""

import re
from typing import Any, Callable, Iterable, Mapping, MutableSequence, Optional, Sequence

import lxml.html
from django.utils.html import escape
from markdown import Markdown
from markdown.extensions import Extension, codehilite
from markdown.extensions.codehilite import CodeHiliteExtension, parse_hl_lines
from markdown.preprocessors import Preprocessor
from pygments.lexers import find_lexer_class_by_name
from pygments.util import ClassNotFound
from typing_extensions import override

from zerver.lib.exceptions import MarkdownRenderingError
from zerver.lib.markdown.priorities import PREPROCESSOR_PRIORITIES
from zerver.lib.tex import render_tex

# Global vars
FENCE_RE = re.compile(
    r"""
    # ~~~ or ```
    (?P<fence>
        ^(?:~{3,}|`{3,})
    )

    [ ]* # spaces

    (?:
        # language, like ".py" or "{javascript}"
        \{?\.?
        (?P<lang>
            [a-zA-Z0-9_+-./#]+
        ) # "py" or "javascript"

        [ ]* # spaces

        # header for features that use fenced block header syntax (like spoilers)
        (?P<header>
            [^ ~`][^~`]*
        )?
        \}?
    )?
    $
    """,
    re.VERBOSE,
)

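To make the pattern above easier to read, here is a small sketch that runs it against a few opening lines. The regular expression is copied unchanged from FENCE_RE; the helper name parse_fence and the sample lines are assumptions made for illustration and are not part of the module.

```python
import re
from typing import Optional

# Copied from FENCE_RE above so that the sketch runs on its own.
FENCE_RE = re.compile(
    r"""
    # ~~~ or ```
    (?P<fence>
        ^(?:~{3,}|`{3,})
    )

    [ ]* # spaces

    (?:
        # language, like ".py" or "{javascript}"
        \{?\.?
        (?P<lang>
            [a-zA-Z0-9_+-./#]+
        ) # "py" or "javascript"

        [ ]* # spaces

        # header for features that use fenced block header syntax (like spoilers)
        (?P<header>
            [^ ~`][^~`]*
        )?
        \}?
    )?
    $
    """,
    re.VERBOSE,
)


def parse_fence(line: str) -> Optional[tuple[str, Optional[str], Optional[str]]]:
    """Hypothetical helper: return (fence, lang, header), or None if not a fence."""
    m = FENCE_RE.match(line)
    if m is None:
        return None
    return m.group("fence"), m.group("lang"), m.group("header")


# An opening fence with a language tag.
print(parse_fence("```python"))
# A spoiler fence: the text after the language becomes the header.
print(parse_fence("```spoiler Answer to the riddle"))
# A bare fence has neither language nor header.
print(parse_fence("~~~"))
# Ordinary text is not a fence at all.
print(parse_fence("plain text"))
```

The fence group keeps the exact run of tildes or backticks, which is what a handler later compares against to find the closing line.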
Support arbitrarily nested fenced quote/code blocks.

Now we can nest fenced code/quote blocks inside of quote
blocks down to arbitrary depths. Code blocks are always leaves.

Fenced blocks start with at least three tildes or backticks,
and the clump of punctuation then becomes the terminator for
the block. If the user ends their message without terminators,
all blocks are automatically closed.

When inside a quote block, you can start another fenced block
with any header that doesn't match the end-string of the outer
block. (If you don't want to specify a language, then you
can change the number of backticks/tildes to avoid ambiguity.)

Most of the heavy lifting happens in FencedBlockPreprocessor.run().
The parser works by pushing handlers onto a stack and popping
them off when the ends of blocks are encountered. Parents communicate
with their children by passing in a simple Python list of strings
for the child to append to. Handlers also maintain their own
lists for their own content, and when their done() method is called,
they render their data as needed.

The handlers are objects returned by functions, and the handler
functions close on variables push, pop, and processor. The closure
style here makes the handlers pretty tightly coupled to the outer
run() method. If we wanted to move to a class-based style, the
tradeoff would be that the class instances would have to marshal
push/pop/processor etc., but we could test the components more
easily in isolation.

Dealing with blank lines is very fiddly inside of bugdown.

The new functionality here is captured in the test
BugdownTest.test_complexly_nested_quote().

(imported from commit 53886c8de74bdf2bbd3cef8be9de25f05bddb93c)
2013-11-20 23:25:48 +01:00
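The stack-of-handlers design described in the commit message above can be sketched in a few lines. This is a deliberately simplified model, not the module's implementation: the Handler class, the render function, and the "> " and four-space renderings are assumptions chosen to keep the example short, and fence detection uses a plain prefix check rather than FENCE_RE.

```python
from typing import Optional


class Handler:
    """Collects lines until its closing fence, then renders into its parent's list."""

    def __init__(
        self, stack: list["Handler"], output: list[str], fence: Optional[str], kind: str
    ) -> None:
        self.stack = stack
        self.output = output  # the parent's list of lines
        self.fence = fence  # the exact clump that closes this block
        self.kind = kind  # "outer", "quote", or "code"
        self.lines: list[str] = []

    def handle_line(self, line: str) -> None:
        if self.fence is not None and line.rstrip() == self.fence:
            self.done()
        elif self.kind != "code" and line.startswith(("```", "~~~")):
            # Code blocks are leaves; only outer and quote blocks open children.
            marker = line[0]
            fence = line[: len(line) - len(line.lstrip(marker))]
            lang = line[len(fence) :].strip()
            kind = "quote" if lang == "quote" else "code"
            # The child appends its rendered result to this handler's lines.
            self.stack.append(Handler(self.stack, self.lines, fence, kind))
        else:
            self.lines.append(line.rstrip())

    def done(self) -> None:
        if self.kind == "quote":
            self.output.extend("> " + text for text in self.lines)
        elif self.kind == "code":
            self.output.extend("    " + text for text in self.lines)
        else:
            self.output.extend(self.lines)
        self.stack.pop()


def render(text: str) -> list[str]:
    output: list[str] = []
    stack: list[Handler] = []
    stack.append(Handler(stack, output, fence=None, kind="outer"))
    for line in text.split("\n"):
        stack[-1].handle_line(line)
    # Blocks left open at the end of the message are closed automatically.
    while stack:
        stack[-1].done()
    return output


nested = "\n".join(
    ["before", "```quote", "quoted text", "~~~", "code in quote", "~~~", "```", "after"]
)
print(render(nested))
```

Using a different fence for the inner block (tildes inside backticks) is what lets the quote handler tell a nested opener apart from its own terminator.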

CODE_WRAP = "<pre><code{}>{}\n</code></pre>"
LANG_TAG = ' class="{}"'


def validate_curl_content(lines: list[str]) -> None:
    error_msg = """
Missing required -X argument in curl command:

{command}
""".strip()

    for line in lines:
        regex = r'curl [-](sS)?X "?(GET|DELETE|PATCH|POST)"?'
        if line.startswith("curl") and re.search(regex, line) is None:
            raise MarkdownRenderingError(error_msg.format(command=line.strip()))


CODE_VALIDATORS: dict[Optional[str], Callable[[list[str]], None]] = {
    "curl": validate_curl_content,
}


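As an illustration of what the curl validator accepts and rejects, the following sketch reuses the same regular expression. RenderingError is a hypothetical stand-in for MarkdownRenderingError so that the example runs without Zulip installed, and is_valid and the example.com URLs are likewise assumptions for the demonstration.

```python
import re


class RenderingError(Exception):
    """Hypothetical stand-in for zerver.lib.exceptions.MarkdownRenderingError."""


# Same pattern as in validate_curl_content above.
CURL_RE = re.compile(r'curl [-](sS)?X "?(GET|DELETE|PATCH|POST)"?')


def validate_curl_content(lines: list[str]) -> None:
    for line in lines:
        # Only lines that start a curl command are checked; continuation
        # lines such as "    -d ..." pass through untouched.
        if line.startswith("curl") and CURL_RE.search(line) is None:
            raise RenderingError(
                "Missing required -X argument in curl command:\n\n" + line.strip()
            )


def is_valid(lines: list[str]) -> bool:
    """Hypothetical convenience wrapper that turns the exception into a bool."""
    try:
        validate_curl_content(lines)
    except RenderingError:
        return False
    return True


print(is_valid(["curl -sSX GET https://example.com/api/v1/messages"]))
print(is_valid(["curl -X POST https://example.com/api/v1/messages", "    -d 'content=hi'"]))
print(is_valid(["curl https://example.com/api/v1/messages"]))
```

The validator only runs when run_content_validators is enabled and the block's language is "curl", so ordinary code blocks are unaffected.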
class FencedCodeExtension(Extension):
    def __init__(self, config: Mapping[str, Any] = {}) -> None:
        self.config = {
            "run_content_validators": [
                config.get("run_content_validators", False),
                "Boolean specifying whether to run content validation code in CodeHandler",
            ],
        }

        for key, value in config.items():
            self.setConfig(key, value)

    @override
    def extendMarkdown(self, md: Markdown) -> None:
        """Add FencedBlockPreprocessor to the Markdown instance."""
        md.registerExtension(self)
        processor = FencedBlockPreprocessor(
            md, run_content_validators=self.config["run_content_validators"][0]
        )
        md.preprocessors.register(
            processor, "fenced_code_block", PREPROCESSOR_PRIORITIES["fenced_code_block"]
        )


class ZulipBaseHandler:
    def __init__(
        self,
        processor: "FencedBlockPreprocessor",
        output: MutableSequence[str],
        fence: Optional[str] = None,
        process_contents: bool = False,
    ) -> None:
        self.processor = processor
        self.output = output
        self.fence = fence
        self.process_contents = process_contents
        self.lines: list[str] = []

    def handle_line(self, line: str) -> None:
        if line.rstrip() == self.fence:
            self.done()
        else:
            self.lines.append(line.rstrip())

    def done(self) -> None:
        if self.lines:
            text = "\n".join(self.lines)
            text = self.format_text(text)

            # For code blocks, the contents should not receive further
            # processing. Whereas with quote and spoiler blocks, we
            # explicitly want Markdown formatting of the content
            # inside. This behavior is controlled by the
            # process_contents configuration flag.
            if not self.process_contents:
                text = self.processor.placeholder(text)

            processed_lines = text.split("\n")
            self.output.append("")
            self.output.extend(processed_lines)
            self.output.append("")
        self.processor.pop()

    def format_text(self, text: str) -> str:
        """Returns a formatted text.
        Subclasses should override this method.
        """
        raise NotImplementedError


def generic_handler(
    processor: "FencedBlockPreprocessor",
    output: MutableSequence[str],
    fence: str,
    lang: Optional[str],
    header: Optional[str],
    run_content_validators: bool = False,
    default_language: Optional[str] = None,
) -> ZulipBaseHandler:
    if lang is not None:
        lang = lang.lower()
    if lang in ("quote", "quoted"):
        return QuoteHandler(processor, output, fence, default_language)
    elif lang == "math":
        return TexHandler(processor, output, fence)
    elif lang == "spoiler":
        return SpoilerHandler(processor, output, fence, header)
    else:
        return CodeHandler(processor, output, fence, lang, run_content_validators)


def check_for_new_fence(
    processor: "FencedBlockPreprocessor",
    output: MutableSequence[str],
    line: str,
    run_content_validators: bool = False,
    default_language: Optional[str] = None,
) -> None:
    m = FENCE_RE.match(line)
    if m:
        fence = m.group("fence")
        lang: Optional[str] = m.group("lang")
        header: Optional[str] = m.group("header")
        if not lang and default_language:
            lang = default_language
        handler = generic_handler(
            processor, output, fence, lang, header, run_content_validators, default_language
        )
        processor.push(handler)
    else:
        output.append(line)


class OuterHandler(ZulipBaseHandler):
    def __init__(
        self,
        processor: "FencedBlockPreprocessor",
        output: MutableSequence[str],
        run_content_validators: bool = False,
        default_language: Optional[str] = None,
    ) -> None:
        self.run_content_validators = run_content_validators
        self.default_language = default_language
        super().__init__(processor, output)

    @override
    def handle_line(self, line: str) -> None:
        check_for_new_fence(
            self.processor, self.output, line, self.run_content_validators, self.default_language
        )


class CodeHandler(ZulipBaseHandler):
    def __init__(
        self,
        processor: "FencedBlockPreprocessor",
        output: MutableSequence[str],
        fence: str,
        lang: Optional[str],
        run_content_validators: bool = False,
    ) -> None:
        self.lang = lang
        self.run_content_validators = run_content_validators
        super().__init__(processor, output, fence)

    @override
    def done(self) -> None:
        # run content validators (if any)
        if self.run_content_validators:
            validator = CODE_VALIDATORS.get(self.lang, lambda text: None)
            validator(self.lines)
        super().done()

    @override
    def format_text(self, text: str) -> str:
        return self.processor.format_code(self.lang, text)


class QuoteHandler(ZulipBaseHandler):
    def __init__(
        self,
        processor: "FencedBlockPreprocessor",
        output: MutableSequence[str],
        fence: str,
        default_language: Optional[str] = None,
    ) -> None:
        self.default_language = default_language
        super().__init__(processor, output, fence, process_contents=True)

    @override
    def handle_line(self, line: str) -> None:
        if line.rstrip() == self.fence:
            self.done()
        else:
            check_for_new_fence(
                self.processor, self.lines, line, default_language=self.default_language
            )

    @override
    def format_text(self, text: str) -> str:
        return self.processor.format_quote(text)


class SpoilerHandler(ZulipBaseHandler):
    def __init__(
        self,
        processor: "FencedBlockPreprocessor",
        output: MutableSequence[str],
        fence: str,
        spoiler_header: Optional[str],
    ) -> None:
        self.spoiler_header = spoiler_header
        super().__init__(processor, output, fence, process_contents=True)

    @override
    def handle_line(self, line: str) -> None:
        if line.rstrip() == self.fence:
            self.done()
        else:
            check_for_new_fence(self.processor, self.lines, line)

    @override
    def format_text(self, text: str) -> str:
        return self.processor.format_spoiler(self.spoiler_header, text)


class TexHandler(ZulipBaseHandler):
    @override
    def format_text(self, text: str) -> str:
        return self.processor.format_tex(text)


class CodeHilite(codehilite.CodeHilite):
    def _parseHeader(self) -> None:
        # Python-Markdown has a feature to parse-and-hide shebang
        # lines present in code blocks:
        #
        # https://python-markdown.github.io/extensions/code_hilite/#shebang-no-path
        #
        # While using shebang lines for language detection is
        # reasonable, we don't want this feature because it can be
        # really confusing when doing anything else in a one-line code
        # block that starts with `!` (which would then render as an
        # empty code block!).  So we disable the feature, by
        # overriding this function, which implements it in CodeHilite
        # upstream.

        # split text into lines
        lines = self.src.split("\n")
        # Python-Markdown pops out the first line which we are avoiding here.
        # Examine first line
        fl = lines[0]

        c = re.compile(
            r"""
            (?:(?:^::+)|(?P<shebang>^[#]!)) # Shebang or 2 or more colons
            (?P<path>(?:/\w+)*[/ ])?        # Zero or 1 path
            (?P<lang>[\w#.+-]*)             # The language
            \s*                             # Arbitrary whitespace
            # Optional highlight lines, single- or double-quote-delimited
            (hl_lines=(?P<quot>"|')(?P<hl_lines>.*?)(?P=quot))?
            """,
            re.VERBOSE,
        )
        # Search first line for shebang
        m = c.search(fl)
        if m:
            # We have a match
            try:
                self.lang = m.group("lang").lower()
            except IndexError:  # nocoverage
                self.lang = None

            if self.options["linenos"] is None and m.group("shebang"):
                # Overridable and Shebang exists - use line numbers
                self.options["linenos"] = True

            self.options["hl_lines"] = parse_hl_lines(m.group("hl_lines"))

        self.src = "\n".join(lines).strip("\n")


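The header pattern used in _parseHeader can be tried in isolation. The sketch below copies the regular expression as written above and reports the groups it captures; the parse_header helper and the sample first lines are assumptions for illustration. It does not reproduce the rest of the method, in particular the decision to leave the first line in the source.

```python
import re
from typing import Optional

# Copied from the pattern compiled inside _parseHeader above.
HEADER_RE = re.compile(
    r"""
    (?:(?:^::+)|(?P<shebang>^[#]!)) # Shebang or 2 or more colons
    (?P<path>(?:/\w+)*[/ ])?        # Zero or 1 path
    (?P<lang>[\w#.+-]*)             # The language
    \s*                             # Arbitrary whitespace
    # Optional highlight lines, single- or double-quote-delimited
    (hl_lines=(?P<quot>"|')(?P<hl_lines>.*?)(?P=quot))?
    """,
    re.VERBOSE,
)


def parse_header(first_line: str) -> Optional[dict[str, Optional[str]]]:
    """Hypothetical helper: return the captured groups, or None without a header."""
    m = HEADER_RE.search(first_line)
    if m is None:
        return None
    return {
        "shebang": m.group("shebang"),
        "path": m.group("path"),
        "lang": m.group("lang").lower(),
        "hl_lines": m.group("hl_lines"),
    }


# A shebang with a path: the interpreter name becomes the language.
print(parse_header("#!/usr/bin/python"))
# The colon form, with highlighted lines requested.
print(parse_header(':::python hl_lines="1 3"'))
# A first line without a header marker yields no match.
print(parse_header("print('hello')"))
```

In the method itself, the raw hl_lines string is then passed to parse_hl_lines, and a shebang match turns line numbers on when the linenos option is unset.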
class FencedBlockPreprocessor(Preprocessor):
|
2021-02-12 08:19:30 +01:00
|
|
|
def __init__(self, md: Markdown, run_content_validators: bool = False) -> None:
|
2020-10-19 06:37:43 +02:00
|
|
|
super().__init__(md)
|
2012-11-19 17:55:28 +01:00
|
|
|
|
|
|
|
self.checked_for_codehilite = False
|
2019-05-16 22:38:53 +02:00
|
|
|
self.run_content_validators = run_content_validators
|
2020-11-11 00:33:05 +01:00
|
|
|
self.codehilite_conf: Mapping[str, Sequence[Any]] = {}
|
2012-11-19 17:55:28 +01:00
|
|
|
|
2021-06-06 20:02:24 +02:00
|
|
|
def push(self, handler: ZulipBaseHandler) -> None:
|
2018-11-02 17:11:42 +01:00
|
|
|
self.handlers.append(handler)
|
|
|
|
|
|
|
|
def pop(self) -> None:
|
|
|
|
self.handlers.pop()
|
|
|
|
|
2023-10-12 19:43:45 +02:00
|
|
|
@override
|
2024-07-12 02:30:17 +02:00
|
|
|
def run(self, lines: Iterable[str]) -> list[str]:
|
2021-05-08 02:36:30 +02:00
|
|
|
"""Match and store Fenced Code Blocks in the HtmlStash."""
|
Support arbitrarily nested fenced quote/code blocks.
Now we can nest fenced code/quote blocks inside of quote
blocks down to arbitrary depths. Code blocks are always leafs.
Fenced blocks start with at least three tildes or backticks,
and the clump of punctuation then becomes the terminator for
the block. If the user ends their message without terminators,
all blocks are automatically closed.
When inside a quote block, you can start another fenced block
with any header that doesn't match the end-string of the outer
block. (If you don't want to specify a language, then you
can change the number of backticks/tildes to avoid amiguity.)
Most of the heavy lifting happens in FencedBlockPreprocessor.run().
The parser works by pushing handlers on to a stack and popping
them off when the ends of blocks are encountered. Parents communicate
with their children by passing in a simple Python list of strings
for the child to append to. Handlers also maintain their own
lists for their own content, and when their done() method is called,
they render their data as needed.
The handlers are objects returned by functions, and the handler
functions close on variables push, pop, and processor. The closure
style here makes the handlers pretty tightly coupled to the outer
run() method. If we wanted to move to a class-based style, the
tradeoff would be that the class instances would have to marshall
push/pop/processor etc., but we could test the components more
easily in isolation.
Dealing with blank lines is very fiddly inside of bugdown.
The new functionality here is captured in the test
BugdownTest.test_complexly_nested_quote().
(imported from commit 53886c8de74bdf2bbd3cef8be9de25f05bddb93c)
2013-11-20 23:25:48 +01:00
|
|
|
|
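The handler-stack design described in the commit message above can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the Zulip implementation: the names `SketchParser`, `SketchOuter`, `SketchCode`, and the simplified `FENCE` pattern are assumptions made for the example, nesting is limited to one level, and rendering is reduced to wrapping the collected lines in `<code>` markers.

```python
import re

# Hypothetical, simplified fence pattern: three or more backticks or tildes,
# optionally followed by a language name.
FENCE = re.compile(r"^(?P<fence>`{3,}|~{3,})\s*(?P<lang>\w*)\s*$")


class SketchParser:
    """Keeps a stack of handlers; the top of the stack receives each line."""

    def __init__(self) -> None:
        self.handlers: list = []

    def push(self, handler) -> None:
        self.handlers.append(handler)

    def pop(self) -> None:
        self.handlers.pop()

    def run(self, lines: list[str]) -> list[str]:
        output: list[str] = []
        self.push(SketchOuter(self, output))
        for line in lines:
            self.handlers[-1].handle_line(line)
        # Unterminated blocks are closed automatically at end of input.
        while self.handlers:
            self.handlers[-1].done()
        return output


class SketchOuter:
    """Top-level handler: passes text through and opens code blocks."""

    def __init__(self, parser: SketchParser, output: list[str]) -> None:
        self.parser = parser
        self.output = output

    def handle_line(self, line: str) -> None:
        m = FENCE.match(line)
        if m:
            # The child appends its rendered result to the parent's list.
            self.parser.push(SketchCode(self.parser, self.output, m.group("fence")))
        else:
            self.output.append(line)

    def done(self) -> None:
        self.parser.pop()


class SketchCode:
    """Leaf handler: collects lines until the opening fence reappears."""

    def __init__(self, parser: SketchParser, output: list[str], fence: str) -> None:
        self.parser = parser
        self.output = output
        self.fence = fence
        self.lines: list[str] = []

    def handle_line(self, line: str) -> None:
        if line.rstrip() == self.fence:
            self.done()
        else:
            self.lines.append(line)

    def done(self) -> None:
        # Render the collected content into the parent's list, then leave the stack.
        self.output.append("<code>" + "\n".join(self.lines) + "</code>")
        self.parser.pop()


result = SketchParser().run(
    ["intro", "~~~", "x = 1", "~~~", "outro", "```", "unterminated"]
)
```

The last block in the sample input has no closing fence, so it is only rendered by the final `while self.handlers` loop, which mirrors the "all blocks are automatically closed" behavior the commit message describes.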
2022-10-06 22:56:33 +02:00
|
|
|
from zerver.lib.markdown import ZulipMarkdown
|
|
|
|
|
2024-07-12 02:30:17 +02:00
|
|
|
output: list[str] = []
|
2013-11-20 23:25:48 +01:00
|
|
|
|
|
|
|
processor = self
|
2024-07-12 02:30:17 +02:00
|
|
|
self.handlers: list[ZulipBaseHandler] = []
|
2018-11-02 17:11:42 +01:00
|
|
|
|
2020-04-13 06:26:25 +02:00
|
|
|
default_language = None
|
2022-10-06 22:56:33 +02:00
|
|
|
if isinstance(self.md, ZulipMarkdown) and self.md.zulip_realm is not None:
|
2020-04-13 06:26:25 +02:00
|
|
|
default_language = self.md.zulip_realm.default_code_block_language
|
|
|
|
handler = OuterHandler(processor, output, self.run_content_validators, default_language)
|
2018-11-02 17:11:42 +01:00
|
|
|
self.push(handler)
|
2020-05-26 03:13:03 +02:00
|
|
|
|
2013-11-20 23:25:48 +01:00
|
|
|
for line in lines:
|
2018-11-02 17:11:42 +01:00
|
|
|
self.handlers[-1].handle_line(line)
|
2013-11-20 23:25:48 +01:00
|
|
|
|
2018-11-02 17:11:42 +01:00
|
|
|
while self.handlers:
|
|
|
|
self.handlers[-1].done()
|
2013-11-20 23:25:48 +01:00
|
|
|
|
|
|
|
# This fiddly handling of newlines at the end of our output was done to make
|
2020-06-28 16:40:18 +02:00
|
|
|
# existing tests pass. Markdown is just kind of funny when it comes to newlines,
|
2013-11-20 23:25:48 +01:00
|
|
|
# but we could probably remove this hack.
|
2021-02-12 08:20:45 +01:00
|
|
|
if len(output) > 2 and output[-2] != "":
|
|
|
|
output.append("")
|
2013-11-20 23:25:48 +01:00
|
|
|
return output
|
|
|
|
|
2021-05-13 19:42:53 +02:00
|
|
|
def format_code(self, lang: Optional[str], text: str) -> str:
|
2013-11-20 19:48:44 +01:00
|
|
|
if lang:
|
2020-07-10 01:57:43 +02:00
|
|
|
langclass = LANG_TAG.format(lang)
|
2016-06-16 13:24:52 +02:00
|
|
|
else:
|
2021-02-12 08:20:45 +01:00
|
|
|
langclass = ""
|
2013-11-20 19:48:44 +01:00
|
|
|
|
2013-11-20 19:11:07 +01:00
|
|
|
# Check for code hilite extension
|
|
|
|
if not self.checked_for_codehilite:
|
2020-06-03 04:16:38 +02:00
|
|
|
for ext in self.md.registeredExtensions:
|
2013-11-20 19:11:07 +01:00
|
|
|
if isinstance(ext, CodeHiliteExtension):
|
|
|
|
self.codehilite_conf = ext.config
|
|
|
|
break
|
|
|
|
|
|
|
|
self.checked_for_codehilite = True
|
|
|
|
|
|
|
|
# If config is not empty, then the codehilite extension
|
2022-02-08 00:13:33 +01:00
|
|
|
# is enabled, so we call it to highlight the code
|
2013-11-20 19:11:07 +01:00
|
|
|
if self.codehilite_conf:
|
2021-02-12 08:19:30 +01:00
|
|
|
highliter = CodeHilite(
|
|
|
|
text,
|
2021-02-12 08:20:45 +01:00
|
|
|
linenums=self.codehilite_conf["linenums"][0],
|
|
|
|
guess_lang=self.codehilite_conf["guess_lang"][0],
|
|
|
|
css_class=self.codehilite_conf["css_class"][0],
|
|
|
|
style=self.codehilite_conf["pygments_style"][0],
|
|
|
|
use_pygments=self.codehilite_conf["use_pygments"][0],
|
2023-09-12 21:10:57 +02:00
|
|
|
lang=lang or None,
|
2021-02-12 08:20:45 +01:00
|
|
|
noclasses=self.codehilite_conf["noclasses"][0],
|
2023-08-19 20:10:01 +02:00
|
|
|
# By default, the Pygments PHP lexers won't highlight
|
|
|
|
# code without a `<?php` marker at the start of the
|
|
|
|
# code block, which is undesired in the common case of
|
|
|
|
# pasting a snippet of PHP code rather than whole
|
|
|
|
# file. The `startinline` option overrides this
|
|
|
|
# behavior for PHP-descended languages and has no
|
|
|
|
# effect on other lexers.
|
|
|
|
#
|
|
|
|
# See https://pygments.org/docs/lexers/#lexers-for-php-and-related-languages
|
|
|
|
startinline=True,
|
2021-02-12 08:19:30 +01:00
|
|
|
)
|
2013-11-20 19:11:07 +01:00
|
|
|
|
2021-02-12 08:20:45 +01:00
|
|
|
code = highliter.hilite().rstrip("\n")
|
2013-11-20 19:11:07 +01:00
|
|
|
else:
|
2020-07-10 01:57:43 +02:00
|
|
|
code = CODE_WRAP.format(langclass, self._escape(text))
|
2013-11-20 19:11:07 +01:00
|
|
|
|
2020-09-15 06:43:56 +02:00
|
|
|
# To support our "view in playground" feature, the frontend
|
|
|
|
# needs to know what Pygments language was used for
|
|
|
|
# highlighting this code block. We record this in a data
|
|
|
|
# attribute attached to the outer `pre` element.
|
|
|
|
# Unfortunately, the pygments API doesn't offer a way to add
|
|
|
|
# this, so we need to do it in a post-processing step.
|
2020-09-06 08:41:37 +02:00
|
|
|
if lang:
|
2020-10-30 01:31:33 +01:00
|
|
|
div_tag = lxml.html.fromstring(code)
|
2020-09-15 06:43:56 +02:00
|
|
|
|
|
|
|
# For the value of our data element, we get the lexer
|
|
|
|
# subclass name instead of directly using the language,
|
|
|
|
# since that canonicalizes aliases (e.g., `js` and
|
|
|
|
# `javascript` will be mapped to `JavaScript`).
|
2020-09-06 08:41:37 +02:00
|
|
|
try:
|
2021-08-03 05:44:19 +02:00
|
|
|
code_language = find_lexer_class_by_name(lang).name
|
2020-09-06 08:41:37 +02:00
|
|
|
except ClassNotFound:
|
2020-09-15 06:43:56 +02:00
|
|
|
# If there isn't a Pygments lexer by this name, we
|
|
|
|
# still tag it with the user's data-code-language
|
|
|
|
# value, since this allows hooking up a "playground"
|
|
|
|
# for custom "languages" that aren't known to Pygments.
|
2020-11-03 00:28:38 +01:00
|
|
|
code_language = lang
|
2020-09-15 06:43:56 +02:00
|
|
|
|
2021-02-12 08:20:45 +01:00
|
|
|
div_tag.attrib["data-code-language"] = code_language
|
2020-10-30 01:31:33 +01:00
|
|
|
code = lxml.html.tostring(div_tag, encoding="unicode")
|
2013-11-20 19:11:07 +01:00
|
|
|
return code
|
2013-01-29 16:14:30 +01:00
|
|
|
|
2018-05-10 19:13:36 +02:00
|
|
|
def format_quote(self, text: str) -> str:
|
2020-10-30 20:10:29 +01:00
|
|
|
paragraphs = text.split("\n")
|
2013-11-20 19:29:54 +01:00
|
|
|
quoted_paragraphs = []
|
|
|
|
for paragraph in paragraphs:
|
|
|
|
lines = paragraph.split("\n")
|
2020-10-30 20:10:29 +01:00
|
|
|
quoted_paragraphs.append("\n".join("> " + line for line in lines))
|
|
|
|
return "\n".join(quoted_paragraphs)
|
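The prefixing logic in `format_quote` can be tried in isolation. The sketch below restates it as a free function; the name `quote_text` is hypothetical and exists only for this illustration. Every line, including blank ones, gains a `> ` prefix so that Markdown later treats the whole span as one blockquote.

```python
def quote_text(text: str) -> str:
    # Prefix every line with "> " so Markdown renders it as a blockquote.
    return "\n".join("> " + line for line in text.split("\n"))


quoted = quote_text("first\n\nsecond")
```

Here `quoted` holds three lines: `> first`, `> ` and `> second`.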
2013-11-20 19:29:54 +01:00
|
|
|
|
2021-05-13 19:42:53 +02:00
|
|
|
def format_spoiler(self, header: Optional[str], text: str) -> str:
|
2020-04-04 22:14:34 +02:00
|
|
|
output = []
|
|
|
|
header_div_open_html = '<div class="spoiler-block"><div class="spoiler-header">'
|
2020-09-02 02:50:08 +02:00
|
|
|
end_header_start_content_html = '</div><div class="spoiler-content" aria-hidden="true">'
|
2021-02-12 08:20:45 +01:00
|
|
|
footer_html = "</div></div>"
|
2020-04-04 22:14:34 +02:00
|
|
|
|
|
|
|
output.append(self.placeholder(header_div_open_html))
|
2021-05-13 19:42:53 +02:00
|
|
|
if header is not None:
|
|
|
|
output.append(header)
|
2020-04-04 22:14:34 +02:00
|
|
|
output.append(self.placeholder(end_header_start_content_html))
|
|
|
|
output.append(text)
|
|
|
|
output.append(self.placeholder(footer_html))
|
|
|
|
return "\n\n".join(output)
|
|
|
|
|
2018-05-10 19:13:36 +02:00
|
|
|
def format_tex(self, text: str) -> str:
|
2017-03-20 16:56:39 +01:00
|
|
|
paragraphs = text.split("\n\n")
|
|
|
|
tex_paragraphs = []
|
|
|
|
for paragraph in paragraphs:
|
|
|
|
html = render_tex(paragraph, is_inline=False)
|
|
|
|
if html is not None:
|
|
|
|
tex_paragraphs.append(html)
|
|
|
|
else:
|
2021-02-12 08:20:45 +01:00
|
|
|
tex_paragraphs.append('<span class="tex-error">' + escape(paragraph) + "</span>")
|
2017-03-20 16:56:39 +01:00
|
|
|
return "\n\n".join(tex_paragraphs)
|
|
|
|
|
2018-05-10 19:13:36 +02:00
|
|
|
def placeholder(self, code: str) -> str:
|
2020-06-03 04:16:38 +02:00
|
|
|
return self.md.htmlStash.store(code)
|
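`placeholder` relies on Python-Markdown's HTML stash: raw HTML is stored away and replaced by an opaque token so that later Markdown processing leaves it untouched, and the token is swapped back for the HTML at the end. A toy version of that idea is sketched below. The `ToyStash` class and its token format are assumptions for illustration only and are not Python-Markdown's actual class or placeholder format.

```python
class ToyStash:
    """Hypothetical stand-in for an HTML stash."""

    def __init__(self) -> None:
        self.blocks: list[str] = []

    def store(self, html: str) -> str:
        # Remember the raw HTML and hand back an opaque token in its place.
        self.blocks.append(html)
        return f"\x02stash:{len(self.blocks) - 1}\x03"

    def restore(self, text: str) -> str:
        # Swap each token back for the HTML it stands for.
        for i, html in enumerate(self.blocks):
            text = text.replace(f"\x02stash:{i}\x03", html)
        return text


stash = ToyStash()
token = stash.store('<div class="spoiler-block">')
restored = stash.restore(f"before {token} after")
```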
2013-11-20 21:03:57 +01:00
|
|
|
|
2018-05-10 19:13:36 +02:00
|
|
|
def _escape(self, txt: str) -> str:
|
2021-05-08 02:36:30 +02:00
|
|
|
"""basic html escaping"""
|
2021-02-12 08:20:45 +01:00
|
|
|
txt = txt.replace("&", "&amp;")
|
|
|
|
txt = txt.replace("<", "&lt;")
|
|
|
|
txt = txt.replace(">", "&gt;")
|
|
|
|
txt = txt.replace('"', "&quot;")
|
2012-11-19 17:55:28 +01:00
|
|
|
return txt
|
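The order of replacements in an escaping helper like `_escape` matters: `&` has to be handled first, otherwise the ampersands introduced by the other entities would be escaped a second time. The standalone sketch below uses a hypothetical free function, `basic_escape`, and compares it with the standard library's `html.escape`, which produces the same result for input that contains no single quotes.

```python
import html


def basic_escape(txt: str) -> str:
    # "&" must be replaced first so the entities added below are not re-escaped.
    txt = txt.replace("&", "&amp;")
    txt = txt.replace("<", "&lt;")
    txt = txt.replace(">", "&gt;")
    txt = txt.replace('"', "&quot;")
    return txt


sample = '<a href="x">Tom & Jerry</a>'
escaped = basic_escape(sample)
same_as_stdlib = escaped == html.escape(sample)
```

`html.escape` additionally escapes single quotes by default, so the two functions diverge on input that contains `'`.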
|
|
|
|
|
|
|
|
markdown: Fix use of pure_markdown for non-pure markdown rendering.
`render_markdown_path` renders Markdown, and also (since baff121115a1)
runs Jinja2 on the resulting HTML.
The `pure_markdown` flag was added in 0a99fa2fd669, and did two
things: retried the path directly in the filesystem if it wasn't found
by the Jinja2 resolver, and also skipped the subsequent Jinja2
templating step (regardless of where the content was found). In this
context, the name `pure_markdown` made some sense. The only two
callsites were the TOS and privacy policy renders, which might have
had user-supplied arbitrary paths, and we wished to handle absolute
paths in addition to ones inside `templates/`.
Unfortunately, the follow-up of 01bd55bbcbf7 did not refactor the
logic -- it changed it, by making `pure_markdown` only do the former
of the two behaviors. Passing `pure_markdown=True` after that commit
still caused it to always run Jinja2, but allowed it to look elsewhere
in the filesystem.
This set the stage for calls, such as the one introduced in
dedea237456b, which passed both a context for Jinja2, as well as
`pure_markdown=True` implying that Jinja2 was not to be used.
Split the two previous behaviors of the `pure_markdown` flag, and use
pre-existing data to control them, rather than an explicit flag. For
handling policy information which is stored at an absolute path
outside of the template root, we switch to using the template search
path if and only if the path is relative. This also closes the
potential inconsistency based on CWD when `pure_markdown=True` was
passed and the path was relative, not absolute.
Decide whether to run Jinja2 based on if a context is passed in at
all. This restores the behavior in the initial 0a99fa2fd669 where a
call to `render_markdown_path` could be made to just render markdown,
and not some other unmentioned and unrelated templating language as
well.
2023-03-10 02:47:44 +01:00
|
|
|
def makeExtension(*args: Any, **kwargs: Any) -> FencedCodeExtension:
|
2019-05-16 22:38:53 +02:00
|
|
|
return FencedCodeExtension(kwargs)
|
2012-11-19 17:55:28 +01:00
|
|
|
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2012-11-19 17:55:28 +01:00
|
|
|
if __name__ == "__main__":
|
|
|
|
import doctest
|
2021-02-12 08:19:30 +01:00
|
|
|
|
2012-11-19 17:55:28 +01:00
|
|
|
doctest.testmod()
|