zulip/web/src/linkifiers.ts

113 lines
3.6 KiB
TypeScript
Raw Normal View History

linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
import url_template_lib from "url-template";
import * as blueslip from "./blueslip.ts";
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
type LinkifierMap = Map<
RegExp,
{url_template: url_template_lib.Template; group_number_to_name: Record<number, string>}
>;
const linkifier_map: LinkifierMap = new Map();
type Linkifier = {
pattern: string;
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
url_template: string;
id: number;
};
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
export function get_linkifier_map(): LinkifierMap {
return linkifier_map;
}
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
function python_to_js_linkifier(
pattern: string,
url: string,
): [RegExp | null, url_template_lib.Template, Record<number, string>] {
// Converts a python named-group regex to a javascript-compatible numbered
// group regex... with a regex!
const named_group_re = /\(?P<([^>]+?)>/g;
let match = named_group_re.exec(pattern);
let current_group = 1;
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
const group_number_to_name: Record<number, string> = {};
while (match) {
const name = match[1]!;
// Replace named group with regular matching group
pattern = pattern.replace("(?P<" + name + ">", "(");
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
// Map numbered reference to named reference for template expansion
group_number_to_name[current_group] = name;
// Reset the RegExp state
named_group_re.lastIndex = 0;
match = named_group_re.exec(pattern);
current_group += 1;
}
// Convert any python in-regex flags to RegExp flags
let js_flags = "g";
const inline_flag_re = /\(\?([Limsux]+)\)/;
match = inline_flag_re.exec(pattern);
// JS regexes only support i (case insensitivity) and m (multiline)
// flags, so keep those and ignore the rest
if (match) {
const py_flags = match[1]!;
for (const flag of py_flags) {
if ("im".includes(flag)) {
js_flags += flag;
}
}
pattern = pattern.replace(inline_flag_re, "");
}
// Ideally we should have been checking that linkifiers
// begin with certain characters but since there is no
// support for negative lookbehind in javascript, we check
// for this condition in `contains_backend_only_syntax()`
// function. If the condition is satisfied then the message
// is rendered locally, otherwise, we return false there and
// message is rendered on the backend which has proper support
// for negative lookbehind.
pattern = pattern + /(?!\w)/.source;
let final_regex = null;
try {
final_regex = new RegExp(pattern, js_flags);
} catch (error) {
// We have an error computing the generated regex syntax.
// We'll ignore this linkifier for now, but log this
// failure for debugging later.
if (error instanceof SyntaxError) {
blueslip.error("python_to_js_linkifier failure!", {pattern}, error);
} else {
// Don't swallow any other (unexpected) exceptions.
/* istanbul ignore next */
throw error;
}
}
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
const url_template = url_template_lib.parse(url);
return [final_regex, url_template, group_number_to_name];
}
export function update_linkifier_rules(linkifiers: Linkifier[]): void {
linkifier_map.clear();
for (const linkifier of linkifiers) {
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
const [regex, url_template, group_number_to_name] = python_to_js_linkifier(
linkifier.pattern,
linkifier.url_template,
);
if (!regex) {
// Skip any linkifiers that could not be converted
continue;
}
linkifier: Support URL templates for linkifiers. This swaps out url_format_string from all of our APIs and replaces it with url_template. Note that the documentation changes in the following commits will be squashed with this commit. We change the "url_format" key to "url_template" for the realm_linkifiers events in event_schema, along with updating LinkifierDict. "url_template" is the name chosen to normalize mixed usages of "url_format_string" and "url_format" throughout the backend. The markdown processor is updated to stop handling the format string interpolation and delegate the task template expansion to the uri_template library instead. This change affects many test cases. We mostly just replace "%(name)s" with "{name}", "url_format_string" with "url_template" to make sure that they still pass. There are some test cases dedicated for testing "%" escaping, which aren't relevant anymore and are subject to removal. But for now we keep most of them as-is, and make sure that "%" is always escaped since we do not use it for variable substitution any more. Since url_format_string is not populated anymore, a migration is created to remove this field entirely, and make url_template non-nullable since we will always populate it. Note that it is possible to have url_template being null after migration 0422 and before 0424, but in practice, url_template will not be None after backfilling and the backend now is always setting url_template. With the removal of url_format_string, RealmFilter model will now be cleaned with URL template checks, and the old checks for escapes are removed. We also modified RealmFilter.clean to skip the validation when the url_template is invalid. This avoids raising mulitple ValidationError's when calling full_clean on a linkifier. But we might eventually want to have a more centric approach to data validation instead of having the same validation in both the clean method and the validator. Fixes #23124. Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 20:55:31 +02:00
linkifier_map.set(regex, {
url_template,
group_number_to_name,
});
}
}
export function initialize(linkifiers: Linkifier[]): void {
update_linkifier_rules(linkifiers);
}