zulip/tools/i18n
Josh Klar 70b30e7792 i18n: Unescape Unicode sequences in JSON.
This greatly improves the readability of the diffs and in-codebase
translation strings over using ASCII sequences for unicode in the JSON.

We've previously noticed [^1] some JSON translation files ending up with
escaped Unicode sequences on disk, which Transifex indicates is expected
behavior [^2], though it is sometimes fixed by `manage.py
compilemessages` [^3]. Further, as noted in #23932 [^4], some JSON
translation files include HTML-escaped entities like quotation marks.

This script will ingest valid JSON files and output them as proper UTF-8
files with appropriately unescaped (unless otherwise necessary, like
double quotes being backslash-escaped) sequences, except when the key
itself contains HTML escape sequences (as it's presumed the value of
such entries must be pre-escaped before being passed to consumers).

[^1]: https://chat.zulip.org/#narrow/stream/58-translation/topic/Transifex.20client/near/1479205

[^2]: https://chat.zulip.org/#narrow/stream/58-translation/topic/an.20email.20for.20Transifex.20support/near/1481287

[^3]: https://chat.zulip.org/#narrow/stream/58-translation/topic/an.20email.20for.20Transifex.20support/near/1481908

[^4]: Which is not end-to-end fixed yet by this commit: that will
require a new release of Zulip Server.

gitlint-ignore: B1, title-trailing-punctuation, body-min-length, body-is-missing
2023-01-17 13:19:45 -08:00
..
process-mobile-i18n
push-translations
sync-translations i18n: Unescape Unicode sequences in JSON. 2023-01-17 13:19:45 -08:00
unescape-contents i18n: Unescape Unicode sequences in JSON. 2023-01-17 13:19:45 -08:00
unescape-html-in-json-translations i18n: Unescape Unicode sequences in JSON. 2023-01-17 13:19:45 -08:00