zulip

Commit Graph

Author	SHA1	Message	Date
Steve Howell	61a9f701bd	cache: Use a single cache entry for cross-realm bots. The cross-realm bots rarely change, and there are only a few of them, so we just query them all at once and put them in the cache. Also, we put the dictionaries in the cache, instead of the user objects, since there is nothing time-sensitive about the dictionaries, and they are small. This saves us a little time computing the avatar url and things like that, not to mention marshalling costs. This commit also fixes a theoretical bug where we would have stale cache entries if somebody somehow modified the cross-realm bots without bumping KEY_PREFIX. Internally we no longer pre-fetch the realm objects for the bots, but we don't get overly precise about picking individual fields from UserProfile, since we rarely hit the database and since we don't store raw ORM objects in the cache. The test diffs make it look like we are hitting the cache an extra time, but the tests weren't counting bulk fetches. Now we only use a single key for all bots rather than a key per bot.	2023-07-25 23:08:52 -07:00
Sahil Batra	bb3945a32f	models: Remove select_related call in get_active_users. We do not use any related fields for the UserProfile objects fetched by get_active_users, so we can simply remove the select_related call. The user object from get_active_users was used to get the realm, but since get_active_users is called from a realm object, we can directly use that realm object. This change also leads to some changes in the cache code, where we now pass the realm to the function instead of selecting it from the UserProfile object.	2023-07-20 10:44:39 -07:00
Steve Howell	3599b1662e	cache: Eliminate transformed_bulk_cached_fetch. Its two callers now just directly call generic_bulk_cached_fetch with the explicit `lambda obj: obj` helpers.	2023-07-19 11:07:33 -07:00
Steve Howell	d19c1f7438	message fetching: Avoid duplicate cache layers. This code removes a lot of complexity with very likely positive overall impact on system performance and negligible downside. We already cache display recipients on a per-user level, so there's no need for another cache layer on top of that that keys them with recipient ids. We avoid strange things where Alice/Bob and Bob/Charlie get put into the top layer cache and then we still have a cache miss on Alice/Charlie despite the lower level cache being able to support per-user lookups. This change does introduce an extra database round trip if any of our messages have a huddle, but the query is extremely cheap, and we can always try to cache that function more directly or try to re-use some of our other huddle-based caches. As part of this, we clean up the names for the lower-level per-user cache of display recipients, and we simplify the cache keys. We also stop passing in a full Recipient object to the `bulk_get_huddle_user_ids` functions. The local impact of this change should be easy to measure (at least approximately), since we use this function every time a user gets messages via the /messages endpoint.	2023-07-19 11:07:33 -07:00
Anders Kaseorg	052984bc14	utils: Remove make_safe_digest wrapper. It’s unclear what was supposed to be “safe” about this wrapper. The hashlib API is fine without it, and we don’t want to encourage further use of SHA-1. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-19 10:54:05 -07:00
Steve Howell	89381a8072	cache: Eliminate get-stream-by-name cache. We remove the cache functionality for the get_realm_stream function, and we also change it to return a thin Stream object (instead of calling select_related with no arguments). The main goal here is to remove code complexity, as we have been prone to at least one caching validation bug related to how Realm and UserGroup interact. That particular bug was more theoretical than practical in terms of its impact, to be clear. Even if we were to be perfectly disciplined about only caching thin stream objects and always making sure to delete cache entries when stream data changed, we would still be prone to ugly situations like having transactions get rolled back before we delete the cache entry. The do_deactivate_stream function is a perfect example of where we have to consider the best time to unset the cache. If you unset it too early, then you are prone to races where somebody else churns the cache right before you update the database. If you unset it too late, then you can have an invalid entry after a rollback or deadlock situation. If you just eliminate the cache as a moving part, that whole debate is moot. As the lack of test changes here indicates, we rarely fetch streams by name any more in critical sections of our code. The one place where we fetch by name is in loading the home page, but that is only when you specify a stream name. And, of course, that only causes about an extra millisecond of time.	2023-07-11 13:45:40 -07:00
Ujjawal Modi	a361c23aac	alert_words: Refactor the code to flush alert_words cache. Subsequent commits will add "on_delete=models.RESTRICT" relationships, which will result in the AlertWord objects being deleted after Realm has been deleted from the database. In order to handle this, we update realm_alert_words_cache_key, realm_alert_words_automaton_cache_key, and flush_realm_alert_words functions to accept realm_id as parameter instead of realm object, so that the code for flushing the cache works even after the realm is deleted. This change is fine because eventually only realm_id is used by these functions and there is no need of the complete realm object.	2023-06-28 18:03:32 -07:00
Ujjawal Modi	f7346f36fc	attachments: Refactor code for flushing used_upload_space cache. Subsequent commits will add "on_delete=models.RESTRICT" relationships, which will result in the Attachment objects being deleted after Realm has been deleted from the database. In order to handle this, we update get_realm_used_upload_space_cache_key function to accept realm_id as parameter instead of realm object, so that the code for flushing the cache works even after the realm is deleted. This change is fine because eventually only realm_id is used by this function and there is no need of the complete realm object.	2023-06-28 18:03:32 -07:00
Ujjawal Modi	535a088d0b	bots: Refactor code for flushing bots cache. Subsequent commits will add "on_delete=models.RESTRICT" relationships, which will result in the UserProfile objects being deleted after Realm has been deleted from the database. In order to handle this, we update bot_dicts_in_realm_cache_key function to accept realm_id as parameter instead of realm object, so that the code for flushing the cache works even after the realm is deleted. This change is fine because eventually only realm_id is used by this function and there is no need of the complete realm object.	2023-06-28 18:03:32 -07:00
Anders Kaseorg	9db3451333	Remove statsd support. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-04-25 19:58:16 -07:00
Zixuan James Li	24f24d236d	cache: Use QuerySetAny for isinstance check. Previously, `QuerySet` did not support isinstance checks, since it is defined to be generic in django-stubs. After a recent update, such a check is possible by using `QuerySetAny`, a non-generic alias of `QuerySet`. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2023-03-17 08:38:20 -07:00
Anders Kaseorg	d3efd4c095	python: Import F, Q, QuerySet from their canonical module. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-03-05 14:46:28 -08:00
Sahil Batra	0ed5f76063	settings: Add backend code for using user email_address_visibility setting. This commit updates the code to use the user-level email_address_visibility setting instead of the realm-level one to set or update the value of the UserProfile.email field and to send the emails to clients. Major changes are: - UserProfile.email field is set while creating the user according to RealmUserDefault.email_address_visibility. - UserProfile.email field is updated according to change in the setting. - 'email_address_visibility' is added to person objects in user add event and in avatar change event. - client_gravatar can be different for different users when computing avatar_url for messages and user objects since email available to clients is dependent on user-level setting. - For bots, email_address_visibility is set to EVERYONE while creating them irrespective of realm-default value. - Test changes are basically setting user-level setting instead of realm setting and modifying the checks accordingly.	2023-02-10 17:35:49 -08:00
Anders Kaseorg	df001db1a9	black: Reformat with Black 23. Black 23 enforces some slightly more specific rules about empty line counts and redundant parenthesis removal, but the result is still compatible with Black 22. (This does not actually upgrade our Python environment to Black 23 yet.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-02-02 10:40:13 -08:00
Anders Kaseorg	f7e97b1180	ruff: Fix PLW0602 Using global but no assignment is done. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-04 16:25:07 -08:00
Alex Vandiver	8f6f38c97c	cache: Decline to store querysets, with an error. As we have seen no further cases of this in production since #23215, increase the severity to an error, and switch from returning a list (which is not type-safe if the function declares a QuerySet return) to returning the QuerySet without caching. Failing to store the result in the cache, with an error, seems superior to raising an exception; in both cases the next request will redo the work, but we are guaranteed a worse user experience if we 500 the request. Ref https://github.com/zulip/zulip/pull/23215#discussion_r994186493	2022-11-29 16:45:11 -08:00
Anders Kaseorg	73c4da7974	ruff: Fix N818 exception name should be named with an Error suffix. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-17 16:52:00 -08:00
Anders Kaseorg	46955da3a0	ruff: Fix ANN204 missing return type annotation for __init__. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-16 09:29:11 -08:00
Lauryn Menard	19a4c3907f	decorator: Put back check for dev env in `ignore_unhashable_lru_cache`. Prior to `53231aa`, the `ignore_unhashable_lru_cache` decorator had a check for the development environment so that changes could be seen on refresh. Puts that check back in IgnoreUnhashableLruCacheWrapper class.	2022-11-07 12:17:59 -08:00
Alex Vandiver	c328de3372	cache: Log a warning when attempting to store a whole QuerySet. As noted in the previous commit, this causes bloat in memcached, for no purpose. Log a warning when `cache_with_key` sees a QuerySet returned from the function it is decorating.	2022-10-12 22:25:48 -07:00
Alex Vandiver	204f1b58e8	cache: Drop realm_id from `realm_user_dict_fields`. Storing this key is superfluous, as it will be the same for all users, and definitionally already known to fetch the cache for the realm. It is also not currently used by the callsites that read rows from the cache.	2022-10-12 22:25:48 -07:00
Zixuan James Li	6c7b2d621e	typing: Avoid redefinition of incompatible QuerySets. The pattern of using the same variable to apply filters or alter the `QuerySet` in other ways might produce `QuerySet`s with incompatible types. This behavior is not allowed by mypy. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2022-07-07 11:27:43 -07:00
Anders Kaseorg	53231aa9d9	decorator: Type cache_info, cache_clear for ignore_unhashable_lru_cache. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-06-27 10:20:05 -07:00
Anders Kaseorg	d5fea08b8a	cache: Remove needless monkey patching. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-06-07 09:37:43 -07:00
Anders Kaseorg	fd16f97d6b	python: Excise None from pointlessly nullable booleans. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-04-27 12:40:14 -07:00
Alex Vandiver	351bdfaf78	preview: Use cache only as a non-durable cache, not an IPC. The `get_link_embed_data` / `link_embed_data_from_cache` pair as introduced in `c93f1d4eda` uses the cache as a temporary store inside of the `embed_links` worker; this means that it must be durable storage, or the worker will stall and re-fetch the same links to preview them. Switch to plumbing through the fetched URL embed data as a parameter to the Markdown evaluation which uses them, rather than using the cache as an intermediary. This frees up the cache to be merely a non-durable cache. As a side-effect, this removes get_cache_with_key, and link_embed_data_from_cache which was its only callsite.	2022-04-15 14:48:12 -07:00
Alex Vandiver	aaa58a49db	cache: Make the cache_name=None behaviour clearer. `django.core.cache.cache` is equal to `django.core.cache.caches["default"]`; the latter is more understandable in context.	2022-04-15 14:48:12 -07:00
Zixuan James Li	f21746ba0b	cache: Strengthen types of cache decorators with ParamSpec. This demonstrates a way to resolve the long-standing issue of typing higher-order identity functions without using `cast` and in a type-safe manner for decorators in `cache.py`. Signed-off-by: Zixuan James Li <359101898@qq.com>	2022-04-14 12:44:35 -07:00
Anders Kaseorg	ad5f0c05b5	python: Remove default "utf8" argument for encode(), decode(). Partially generated by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-08-02 15:53:52 -07:00
Mateusz Mandera	d45f3eecaa	models: Add optional realm_id argument to get_system_bot.	2021-07-26 15:31:10 -07:00
Anders Kaseorg	1ae56e466b	cache: Fix typing for post_save and post_delete flush handlers. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-07-16 13:14:04 -07:00
Vishnu KS	b9066886d9	cache: Properly flush stream from cache after the stream is deleted. The previous logic was incorrect and was not flushing the stream from cache after deletion. ``` stream = get_realm_stream("Verona", realm.id) stream.delete() get_realm_stream("Verona", realm.id) ``` In the above example, the last line of code would have returned the stream from cache instead of throwing a Stream.DoesNotExist error. This is fixed in the commit. I have verified that this commit indeed fixes the issue by verifying that calling get_realm_stream again after deleting the stream results in a Stream.DoesNotExist error.	2021-07-06 17:21:59 -07:00
Abhijeet Prasad Bodas	b7fcb0275c	cache: Use `id`s instead of `UserProfile`s for get_muting_users. This will make it easier to call this function in the message send codepath.	2021-06-07 13:41:37 -07:00
Vishnu KS	5db53029a5	api: Include is_billing_admin as an attribute in user response. This is sufficiently useful that it should be made available to clients.	2021-06-03 10:27:07 -07:00
Anders Kaseorg	544bbd5398	docs: Fix capitalization mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-10 09:57:26 -07:00
Mateusz Mandera	1a8ad796f8	models: Replace __id syntax with _id where possible. model__id syntax implies needing a JOIN on the model table to fetch the id. That's usually redundant, because the first table in the query simply has a 'model_id' column, so the id can be fetched directly. Django is actually smart enough to not do those redundant joins, but we should still avoid this misguided syntax. The exceptions are ManyToMany fields and queries doing a backward relationship lookup. If "streams" is a many-to-many relationship, then streams_id is invalid - streams__id syntax is needed. If "y" is a foreign key field from X to Y: class X: y = models.ForeignKey(Y) then object x of class X has the field x.y_id, but y of class Y doesn't have y.x_id. Thus Y queries need to be done like Y.objects.filter(x__id__in=some_list)	2021-04-22 14:53:00 -07:00
Abhijeet Prasad Bodas	b140c17441	mute user: Cache list of muter IDs. This commit defines a new function `get_muting_users` which will return a list of IDs of users who have muted a given user. Whenever someone mutes/unmutes a user, the cache will be flushed, and subsequently when that user sends a message, the cache will be populated with the list of people who have muted them (maybe empty). This data is a good candidate for caching because- 1. The function will later be called from the message send codepath, and we try to minimize database queries there. 2. The entries will be pretty tiny. 3. The entries won't churn too much. An average user will send messages much more frequently than get muted/unmuted, and the first time penalty of hitting the db and populating the cache should ideally get amortized by avoiding several DB lookups on subsequent message sends. The actual code to call this function will be written in further commits.	2021-04-13 09:08:47 -07:00
Mateusz Mandera	82d6d925e5	cache: Delete user_profile_by_email_cache_key. This is no longer used in any important place, get_user_profile_by_email is meant to be used only in manage.py shell now and thus there's no point in this function being cached.	2021-03-25 00:47:42 -07:00
Mateusz Mandera	f147c42f9d	actions: Change caching of create_mirror_user_if_needed. Emails are not unique, so we can only sensibly cache using keys formed with both email and realm. This requires adding a new cache key function for caching by delivery email - user_profile_delivery_email_cache_key.	2021-03-25 00:47:42 -07:00
Anders Kaseorg	6eb1705068	cache: Strengthen ignore_unhashable_lru_cache decorator type. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-15 17:05:28 -08:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	3a8cf869db	python: Convert os.open(…, O_EXCL) to open(…, "x"). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-11-09 14:31:01 -08:00
Mateusz Mandera	cbeeadab16	delete_realm: Register a post_delete Realm handler. By registering a post_delete handler to clear appropriate caches in a nicer way, we can get rid of the ugly flush-memcached call in the delete_realm command.	2020-10-30 11:43:03 -07:00
Anders Kaseorg	7c4f68d9cf	python: Skip unnecessary decode before BeautifulSoup parsing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-30 11:36:38 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Anders Kaseorg	b7b7475672	python: Use standard secrets module to generate random tokens. There are three functional side effects: • Correct an insignificant but mathematically offensive bias toward repeated characters in generate_api_key introduced in commit 47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased from 190.52864 bits to 190.53428 bits. • Use the base32 alphabet in confirmation.models.generate_key; its entropy is reduced from 124.07820 bits to the documented 120 bits, but now it uses 1 syscall instead of 24. • Use the base32 alphabet in get_bigbluebutton_url; its entropy is reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall instead of 10. (The base32 alphabet is A-Z 2-7. We could probably replace all of these with plain secrets.token_urlsafe, since I expect most callers can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without problems.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-09 15:52:57 -07:00
Dinesh	c64888048f	puppeteer: Rename CASPER_TESTS env variable to PUPPETEER_TESTS. Also modified a few comments to match the changes.	2020-09-09 13:38:39 -04:00
arpit551	af3a34fbca	cache: Used lru_cache from functools instead of django.utils.lru_cache. Django 3.0 removed private Python 2 compatibility APIs so used lru_cache() directly from functools. We cast lru_cache to Any to avoid attr-defined error in mypy since we are adding extra field, 'key_prefix', to this object later.	2020-08-14 11:34:04 -07:00
Steve Howell	c44500175d	database: Remove short_name from UserProfile. A few major themes here: - We remove short_name from UserProfile and add the appropriate migration. - We remove short_name from various cache-related lists of fields. - We allow import tools to continue to write short_name to their export files, and then we simply ignore the field at import time. - We change functions like do_create_user, create_user_profile, etc. - We keep short_name in the /json/bots API. (It actually gets turned into an email.) - We don't modify our LDAP code much here.	2020-07-17 11:15:15 -07:00

221 Commits
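
Note on commit b7b7475672: the entropy figures quoted in that message are consistent with the usual formula for a token of independent, uniformly chosen symbols, entropy = length × log2(alphabet size). The sketch below reproduces several of them. The token lengths (32, 24, 10) and alphabet sizes (62, 36, 32) are assumptions inferred from the quoted numbers, not values stated in the commit message, and the helper name is hypothetical. The pre-fix figure for generate_api_key (190.52864 bits) comes from a biased distribution and is not reproduced here.

```python
import math


def uniform_token_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of `length` symbols drawn independently and uniformly
    from an alphabet of `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)


# Assumed parameters (inferred from the quoted figures, not from the source code):
# 32 symbols over a 62-character alphabet (A-Z a-z 0-9).
print(f"{uniform_token_entropy_bits(32, 62):.5f}")  # 190.53428
# 24 symbols over a 36-character alphabet versus the base32 alphabet (A-Z 2-7).
print(f"{uniform_token_entropy_bits(24, 36):.5f}")  # 124.07820
print(f"{uniform_token_entropy_bits(24, 32):.5f}")  # 120.00000
# 10 symbols over a 36-character alphabet versus the base32 alphabet.
print(f"{uniform_token_entropy_bits(10, 36):.5f}")  # 51.69925
print(f"{uniform_token_entropy_bits(10, 32):.5f}")  # 50.00000
```

Because base32 has exactly 32 symbols, each character carries exactly 5 bits, which is why the base32 figures come out as round numbers.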