TOR users are legitimate users of the system; however, that system can
also be used for abuse -- specifically, by evading IP-based
rate-limiting.
For the purposes of IP-based rate-limiting, add a
RATE_LIMIT_TOR_TOGETHER flag, defaulting to false, which lumps all
requests from TOR exit nodes into the same bucket. This may allow a
TOR user to deny other TOR users access to the find-my-account and
new-realm endpoints, but this is a low cost for cutting off a
significant potential abuse vector.
If enabled, the list of TOR exit nodes is fetched from their public
endpoint once per hour, via a cron job, and cached on disk. Django
processes load this data from disk, and cache it in memcached.
Requests are spared from the burden of checking disk on failure via a
circuitbreaker, which trips of there are two failures in a row, and
only begins trying again after 10 minutes.
This commit adds django-cte as dependency
which will be used for querying recursive
group membership.
Extracted this commit from #19866.
Co-authored-by: Anders Kaseorg <anders@zulip.com>
re2[1] compiles (strictly) regular expressions to deterministic finite
automata, which guarantees linear-time behavior; `google-re2` is a
drop-in replacement for the `re` module which uses re2 under the hood.
[1]: https://github.com/google/re2/
Recommonmark is no longer maintained, and MyST-Parser is much more
complete.
https://myst-parser.readthedocs.io/
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Effectively revert commit a8f66573f3.
This adds a little more bureaucracy when adding a new untyped
third-party library, but will prevent bugs like the ones fixed
in #19105 and #19106.
Signed-off-by: Anders Kaseorg <anders@zulip.com>