zulip/zerver/lib/per_request_cache.py


from collections.abc import Callable
from typing import Any, TypeVar
# per-request caches: Add per_request_cache library (2023-07-14 19:46:50 +02:00).
#
# We have historically cached two types of values on a per-request basis
# in memory:
#
# * linkifiers
# * display recipients
#
# Both of these caches were hand-written, and both cache values that are
# also in memcached, so the per-request cache essentially only saves us a
# few memcached hits.
#
# The linkifier per-request cache is a necessary evil: it is an important
# part of message rendering, and it is not easy to restructure the code to
# fetch a single value up front and pass it down the stack.
#
# We may no longer need the display recipient per-request cache, since the
# code is now generally smart about hydrating recipient data, but that
# hypothesis has not been thoroughly researched.
#
# Fortunately, it is straightforward to write a glorified memoize
# decorator and tie it into key places in the code:
#
# * middleware
# * tests (e.g. asserting db counts)
# * queue processors
#
# This reduces the amount of code to maintain, and it gets us closer to
# possibly phasing out this whole technique, though that effort is beyond
# the scope of this change. We could add instrumentation to the decorator
# to see how often we save a non-trivial number of round trips to
# memcached.
#
# Note that when we flush linkifiers, we just use a big hammer and flush
# the entire per-request cache for linkifiers, since there is only ever
# one realm in the cache.
ReturnT = TypeVar("ReturnT")
FUNCTION_NAME_TO_PER_REQUEST_RESULT: dict[str, dict[int, Any]] = {}
def return_same_value_during_entire_request(f: Callable[..., ReturnT]) -> Callable[..., ReturnT]:
    cache_key = f.__name__
    assert cache_key not in FUNCTION_NAME_TO_PER_REQUEST_RESULT
    FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key] = {}

    def wrapper(key: int, *args: Any) -> ReturnT:
        if key in FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key]:
            return FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key][key]

        result = f(key, *args)
        FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key][key] = result
        return result

    return wrapper
def flush_per_request_cache(cache_key: str) -> None:
    if cache_key in FUNCTION_NAME_TO_PER_REQUEST_RESULT:
        FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key] = {}


def flush_per_request_caches() -> None:
    for cache_key in FUNCTION_NAME_TO_PER_REQUEST_RESULT:
        FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key] = {}
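As a usage sketch: the decorator memoizes by the first positional argument for the lifetime of a request, and `flush_per_request_caches()` (called by middleware at the end of each request) resets everything. The `get_expensive_value` function and the call counter below are illustrative, not part of the Zulip codebase; the library code is repeated here so the sketch is self-contained.

```python
from collections.abc import Callable
from typing import Any, TypeVar

ReturnT = TypeVar("ReturnT")

FUNCTION_NAME_TO_PER_REQUEST_RESULT: dict[str, dict[int, Any]] = {}


def return_same_value_during_entire_request(f: Callable[..., ReturnT]) -> Callable[..., ReturnT]:
    cache_key = f.__name__
    assert cache_key not in FUNCTION_NAME_TO_PER_REQUEST_RESULT
    FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key] = {}

    def wrapper(key: int, *args: Any) -> ReturnT:
        if key in FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key]:
            return FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key][key]
        result = f(key, *args)
        FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key][key] = result
        return result

    return wrapper


def flush_per_request_caches() -> None:
    for cache_key in FUNCTION_NAME_TO_PER_REQUEST_RESULT:
        FUNCTION_NAME_TO_PER_REQUEST_RESULT[cache_key] = {}


call_count = 0


@return_same_value_during_entire_request
def get_expensive_value(realm_id: int) -> str:
    # Hypothetical stand-in for a memcached or database lookup.
    global call_count
    call_count += 1
    return f"value-for-{realm_id}"


assert get_expensive_value(7) == "value-for-7"
assert get_expensive_value(7) == "value-for-7"  # served from the per-request cache
assert call_count == 1

flush_per_request_caches()  # what middleware does between requests
get_expensive_value(7)  # recomputed after the flush
assert call_count == 2
```

Note that caching by a single integer key (rather than by the full argument tuple) is what keeps the flush logic trivial, and the module-level `assert` prevents two decorated functions from accidentally sharing a `__name__`.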