We now create tokens for whitespace and text, such that you
could rebuild the template file with "".join(token.s for
token in tokens).
I also fixed a few bugs related to not parsing
whitespace-control tokens.
We no longer ignore template variables, although we could do
a lot better at validating them.
The most immediate use case for the more thorough parser is
to simplify the pretty printer, but it should also make it
less likely for us to skip over new template constructs
(i.e. the tool will fail hard rather than acting strange).
Note that this speeds up the tool by almost 3x, which may be
slightly surprising considering we are building more tokens.
The reason is that we are now munching efficiently through
big chunks of whitespace and text at a time, rather than
checking each individual character to see if it starts one
of the N other token types.
The changes to the pretty_print module here are a bit ugly,
but they should mostly be made irrelevant in subsequent
commits.
This reverses the policy that was set, but incompletely enforced, by
commit 951514dd7d. The self-closing tag
syntax is clearer, more consistent, simpler to parse, compatible with
XML, preferred by Prettier, and (most importantly now) required by
FormatJS.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We now forbid tags of the form `<foo ... />` in most
places, and we also forbid it even for several void
tags.
We make exceptions for tags that are already formatted
in two different ways in our codebase. This is mostly
svg tags, plus these common cases:
- br
- hr
- img
- input
It would be nice to lock down a convention for these,
even though the HTML specification is unopinionated
on these. We'll probably want to stay flexible for
svg tags, since they are sometimes copy/pasted from
other sources (although it's probably rare enough for
them that we can tolerate just doing minor edits as
needed).
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Tweaked by tabbott to not remove it from lister.py, linter_lib, and
friends, since those are intended to support both Python 2 and 3
(we're planning to extract them from the repository).
In this commit we enhance our current template linter to detect
duplicate ids and report them during lint checks. html_branches.py
was topped up with a new function build_id_dict for the purpose.
Also the get_tag_info function in same file was updated to parse
ids and classes more robustly in cases of template variables.
split_for_id_and_class function was added to serve this purpose.
Unit tests for both the functions were created under
tests/test_html_branches. Also a directory under tests called
test_template_data was created to hold templates for testing under
newly created functionality.
check_templates was modified to print to console any duplicates
detected.
showell reviewed my commit and helped me out.
Fixes#2950.
This code is not directly related to the template parser, so it
can safely live in its own file.
The only significant change to the code is to the signature of
`html_branches` so that it can be called without requiring a file.
Since it's only used in html_grep, that has been updated to reflect
this change.
Fixes: #1774.