This prevents a memory leak caused by the `SimpleLazyObject` instance of
`UserProfile` that create a reference loop with the request object
via `ZulipRequestNotes`.
We will no longer use the HttpRequest to store the rate limit data.
Using ZulipRequestNotes, we can access rate_limit and ratelimits_applied
with type hints support. We also save the process of initializing
ratelimits_applied by giving it a default value.
If the user is logged in, we'll stick to rate limiting by the
UserProfile. In case of requests without authentication, we'll apply the
same limits but to the IP address.
This simplifies the code, as it allows using the mechanism of converting
JsonableErrors into a response instead of having separate, but
ultimately similar, logic in RateLimitMiddleware.
We don't touch tests here because "rate limited" error responses are
already verified in test_external.py.
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This will protect us in case of some kinds of bugs that could allow
making requests such as password authentication attempts to tornado.
Without restricting the domains to which the in-memory backend can
be applied, such bugs would lead to attackers having multiple times
larger rate limits for these sensitive requests.
The Redis-based rate limiting approach takes a lot of time talking to
Redis with 3-4 network requests to Redis on each request. It had a
negative impact on the performance of `get_events()` since this is our
single highest-traffic endpoint.
This commit introduces an in-process rate limiting alternate for
`/json/events` endpoint. The implementation uses Leaky Bucket
algorithm and Python dictionaries instead of Redis. This drops the
rate limiting time for `get_events()` from about 3000us to less than
100us (on my system).
Fixes#13913.
Co-Author-by: Mateusz Mandera <mateusz.mandera@protonmail.com>
Co-Author-by: Anders Kaseorg <anders@zulipchat.com>
If we had a rule like "max 3 requests in 2 seconds", there was an
inconsistency between is_ratelimited() and get_api_calls_left().
If you had:
request #1 at time 0
request #2 and #3 at some times < 2
Next request, if exactly at time 2, would not get ratelimited, but if
get_api_calls_left was called, it would return 0. This was due to
inconsistency on the boundary - the check in is_ratelimited was
exclusive, while get_api_calls_left uses zcount, which is inclusive.
time_reset returned from api_calls_left() was a timestamp, but
mistakenly treated as delta seconds. We change the return value of
api_calls_left() to be delta seconds, to be consistent with the return
value of rate_limit().
The information used to be stored in a request._ratelimit dict, but
there's no need for that, and a list is a simpler structure, so this
allows us to simplify the plumbing somewhat.
That's the value that matters to the code that catches the exception,
and this change allows simplifying the plumbing somewhat, and gets rid
of the get_rate_limit_result_from_request function.
Instead of operating on RateLimitedObjects, and making the classes
depend on each too strongly. This also allows getting rid of get_keys()
function from RateLimitedObject, which was a redis rate limiter
implementation detail. RateLimitedObject should only define their own
key() function and the logic forming various necessary redis keys from
them should be in RedisRateLimiterBackend.
type().__name__ is sufficient, and much readable than type(), so it's
better to use the former for keys.
We also make the classes consistent in forming the keys in the format
type(self).__name__:identifier and adjust logger.warning and statsd to
take advantage of that and simply log the key().
As more types of rate limiting of requests are added, one request may
end up having various limits applied to it - and the middleware needs to
be able to handle that. We implement that through a set_response_headers
function, which sets the X-RateLimit-* headers in a sensible way based
on all the limits that were applied to the request.
Previous cleanups (mostly the removals of Python __future__ imports)
were done in a way that introduced leading newlines. Delete leading
newlines from all files, except static/assets/zulip-emoji/NOTICE,
which is a verbatim copy of the Apache 2.0 license.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Fixes#11209.
This requires changing how zadd is used in rate_limiter.py:
In redis-py >= 3.0 the pairs to ZADD need to be passed as a dictionary,
not as *args or **kwargs, as described at
https://pypi.org/project/redis/3.2.1/ in the section
"Upgrading from redis-py 2.X to 3.0".
The rate_limiter change has to be in one commit with the redis upgrade,
because the dict format is not supported before redis-py 3.0.
We create rate_limit_entity as a general rate-limiting function for
RateLimitedObjects, from code that was possible to abstract away from
rate_limit_user and that will be used for other kinds of rate limiting.
We make rate_limit_user use this new general framework from now.
We should rate-limit users when our rate limiter deadlocks trying to
increment its count; we also now log at warning level (so it doesn't
send spammy emails) and include details on the user and route was, so
that we can properly investigate whether the rate-limiting on the
route was in error.