187 Commits

Author SHA1 Message Date
Tim Abbott 51eeb0a3ee cache: Add missing : in test-backend key prefixes.
Previously, these cache keys looked like:
:1:9c26164d3a393e316e0f8210efe270e08710d45astream_by_realm_and_name:...

Now, they look like this:
:1:9c26164d3a393e316e0f8210efe270e08710d45a:stream_by_realm_and_name:...
2019-03-18 10:56:50 -07:00
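The effect of the missing delimiter in 51eeb0a3ee can be illustrated with a minimal sketch. The prefix value is copied from the commit message; the key suffix is a made-up example, since the message elides it with "...".

```python
# Prefix value copied from the commit message; the key suffix below is a
# hypothetical example, because the message elides the real one with "...".
KEY_PREFIX_HASH = "9c26164d3a393e316e0f8210efe270e08710d45a"
key = "stream_by_realm_and_name:example"

old_style = KEY_PREFIX_HASH + key        # prefix runs straight into the key name
new_style = KEY_PREFIX_HASH + ":" + key  # prefix is delimited by ":"

# With the delimiter in place, the prefix can be split off unambiguously.
prefix, _, rest = new_style.partition(":")
```

Without the colon, splitting on the first ":" yields the prefix fused with the key name, which is what the old keys looked like.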
Vishnu Ks 94ae2dc24e models: Cache currently_used_upload_space_bytes function. 2019-03-04 18:46:13 -08:00
Bennet Sunder 7c5f316cb8 alert_words: Performance improvements in looking for alert_words.
This commit leverages the ahocorasick algorithm to build a set of user_ids
that have their alert_words present in the message. It runs in time linear
in the length of the input message, rather than in the number of
alert_words. This is after building an ahocorasick Automaton, which takes
O(number of alert_words in the entire realm) and is usually cached.
2019-03-01 15:36:39 -08:00
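The approach described in 7c5f316cb8 can be sketched in pure Python. This is a stand-in written for illustration, not the automaton implementation the commit uses; the function names and the `{user_id: [words]}` input shape are assumptions, and it matches plain substrings without any word-boundary checks.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Automaton = Tuple[List[Dict[str, int]], List[int], List[Set[int]]]

def build_automaton(alert_words: Dict[int, List[str]]) -> Automaton:
    # goto[n] maps a character to a child node, fail[n] is the failure link,
    # and out[n] holds the user_ids whose alert words end at node n.
    goto: List[Dict[str, int]] = [{}]
    fail: List[int] = [0]
    out: List[Set[int]] = [set()]
    for user_id, words in alert_words.items():
        for word in words:
            node = 0
            for ch in word.lower():
                if ch not in goto[node]:
                    goto.append({})
                    fail.append(0)
                    out.append(set())
                    goto[node][ch] = len(goto) - 1
                node = goto[node][ch]
            out[node].add(user_id)
    # Breadth-first pass to fill in failure links, shallow nodes first.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit matches that end at the longest proper suffix.
            out[child] |= out[fail[child]]
            queue.append(child)
    return goto, fail, out

def user_ids_with_alert_words(automaton: Automaton, content: str) -> Set[int]:
    goto, fail, out = automaton
    node = 0
    found: Set[int] = set()
    # One pass over the message; the number of alert words does not matter here.
    for ch in content.lower():
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        found |= out[node]
    return found

automaton = build_automaton({1: ["deploy", "outage"], 2: ["ploy"], 3: ["zebra"]})
matched = user_ids_with_alert_words(automaton, "Deploying now")
```

Building the automaton is the expensive step, which is why the commit notes that it is usually cached; the per-message scan then touches each character once, apart from failure-link hops.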
Anders Kaseorg f0ecb93515 zerver core: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:41:24 -08:00
Wyatt Hoodes 8eac361fb5 docs: Refactor BS work with use of cache_with_key.
Refactor the potentially expensive work done by Beautiful Soup into a
function that is called by the alter_content function, so that we can
cache the result.  Saves a significant portion of the runtime of
loading of all of our /help/ and /api/ documentation pages (e.g. 12ms
for /api).

Fixes #11088.

Tweaked by tabbott to use the URL path as the cache key, clean up
argument structure, and use a clearer name for the function.
2019-01-28 15:21:52 -08:00
Tim Abbott a0da4f6d30 python: Clean up various if False blocks.
Most of these are now-unnecessary typing imports; some are just
improved comments for those with other mypy motivations.
2018-12-17 11:14:47 -08:00
Vishnu Ks 8a1794caa3 message: Store the value of first_visible_message_id in Realm table.
This eliminates a bunch of potentially buggy caching code, with no
material negative side effects.
2018-12-12 15:11:17 -08:00
Tim Abbott e603237010 email: Convert accounts code to use delivery_email.
A key part of this is the new helper, get_user_by_delivery_email.  Its
verbose name is important for clarity; it should help avoid blind
copy-pasting of get_user (which we'll also want to rename).
Unfortunately, it requires detailed understanding of the context to
figure out which one to use; each is used in about half of call sites.

Another important note is that this PR doesn't migrate get_user calls
in the tests except where not doing so would cause the tests to fail.
This probably deserves a follow-up refactor to avoid bugs here.
2018-12-06 16:21:38 -08:00
Tim Abbott 209dd5db67 actions: Add a function for changing realm subdomains.
This is initial work, which will help us establish habits of using a
well-tested approach for renaming a Zulip organization (since as part
of https://github.com/zulip/zulip-mobile/issues/3142, we'll likely
need to make this function do more).
2018-11-15 14:39:14 -08:00
Pragati Agrawal d5df0377cc settings_users: Support guest user in admin-user-table.
This supports guest user in the user-info-form-modal as well as in the
role section of the admin-user-table.

With some fixes by Tim Abbott and Shubham Dhama.
2018-10-29 12:33:35 -07:00
Steve Howell 76deb30312 preview: Hash cache keys for preview urls.
We don't want really long urls to lead to truncated
keys, or we could theoretically have two different
urls end up with each other's previews.

Also, this suppresses warnings about exceeding the
250 char limit.

Finally, this gives the key a proper prefix.
2018-10-14 09:28:57 -07:00
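A minimal sketch of the idea in 76deb30312: hash the URL so the cache key has a fixed, short length regardless of how long the URL is. The `preview_url:` prefix and the helper name are illustrative assumptions, not necessarily what the codebase uses.

```python
import hashlib

def preview_url_cache_key(url: str) -> str:
    # A SHA-1 hex digest is always 40 characters, so the key stays well under
    # the 250 character limit mentioned in the commit message.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return "preview_url:" + digest  # hypothetical prefix

# Two long URLs that agree on their first 250 characters.
long_a = "https://example.com/" + "a" * 300 + "?page=1"
long_b = "https://example.com/" + "a" * 300 + "?page=2"
key_a = preview_url_cache_key(long_a)
key_b = preview_url_cache_key(long_b)
```

If the raw URLs were used as keys and truncated at 250 characters, these two would collide; the hashed keys stay distinct.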
Yago González f6219745de users: Get all API keys via wrapper method.
Now reading API keys from a user is done with the get_api_key wrapper
method, rather than directly fetching it from the user object.

Also, every place where an action should be done for each API key is now
using get_all_api_keys. For the moment, this method returns a single-item
list containing the specified user's API key.

This commit is the first step towards allowing users to have multiple API
keys.
2018-08-08 16:35:17 -07:00
Tim Abbott 6f7e12ea19 docs: Add subsystem documentation for caching. 2018-07-31 17:00:45 -07:00
Anders Kaseorg 195cc78470 zerver/lib/cache.py: Avoid shelling out for mkdir.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-07-19 10:43:37 -07:00
Shubham Dhama 01555e8772 streams: Handle guest user ids for stream settings changes' events. 2018-06-04 11:35:37 -07:00
Vishnu Ks 372e9740ac events: Add date_joined to user_dict. 2018-05-17 07:49:35 -07:00
Steve Howell 4332fd64f7 Add submessages to message payloads. 2018-05-16 15:13:33 -07:00
Aditya Bansal a68376e2ba zerver/lib: Change use of typing.Text to str. 2018-05-12 15:22:39 -07:00
Tim Abbott 707af5ab56 cache: Remove a now-unnecessary TODO.
We solved the problem the TODO raised by using a different type
annotation syntax, and I'm not sure whether that refactor would
actually improve the code.
2018-03-16 11:32:14 -07:00
neiljp (Neil Pilgrim) 966ca7015f mypy: Finalize migration of cache.py to python3 function annotation.
- Use forward declarations of some types from models.py to avoid cycles.
- Remove cache.py from linter rule exclude list to ensure it stays that way.
2018-03-16 11:29:12 -07:00
neiljp (Neil Pilgrim) 005cb6bd03 mypy: Improve [get_]cache_with_key typing & use py3 annotation. 2018-03-16 11:29:12 -07:00
Tim Abbott 3d8e45f1cb cache: Avoid caching /help/ documentation page content.
This should make it a lot less annoying to edit these pages locally,
without regressing the test performance which motivated the cache.
2018-03-05 09:26:58 -08:00
Umair Khan 0eca2e102d cache: Add ignore_unhashable_lru_cache function.
This is a wrapper over the lru_cache function. It adds the following
features on top of lru_cache:

    * It will not cache result of functions with unhashable arguments.
    * It will clear cache whenever zerver.lib.cache.KEY_PREFIX changes.
2018-02-09 18:14:08 -08:00
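The two behaviors listed in 0eca2e102d could be sketched roughly as follows. This is an illustration rather than the actual implementation: `cache_state` is a hypothetical stand-in for zerver.lib.cache.KEY_PREFIX, and the helper `_is_hashable` is invented for this example.

```python
from functools import lru_cache, wraps
from typing import Any, Callable, Dict, Tuple

# Hypothetical stand-in for zerver.lib.cache.KEY_PREFIX.
cache_state = {"key_prefix": "prefix-a:"}

def _is_hashable(args: Tuple[Any, ...], kwargs: Dict[str, Any]) -> bool:
    try:
        hash((args, tuple(kwargs.values())))
    except TypeError:
        return False
    return True

def ignore_unhashable_lru_cache(maxsize: int = 128) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        cached = lru_cache(maxsize=maxsize)(func)
        last_prefix = cache_state["key_prefix"]

        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            nonlocal last_prefix
            # Clear the cache whenever the key prefix has changed.
            if cache_state["key_prefix"] != last_prefix:
                cached.cache_clear()
                last_prefix = cache_state["key_prefix"]
            # Bypass lru_cache entirely for unhashable arguments.
            if not _is_hashable(args, kwargs):
                return func(*args, **kwargs)
            return cached(*args, **kwargs)

        return wrapper
    return decorator

calls = []

@ignore_unhashable_lru_cache()
def describe(value: Any) -> str:
    calls.append(value)
    return repr(value)

describe(1)
describe(1)  # served from the cache
hashable_calls = len(calls)
describe([1])
describe([1])  # lists are unhashable, so both calls run the function
unhashable_calls = len(calls) - hashable_calls
cache_state["key_prefix"] = "prefix-b:"
describe(1)  # prefix changed, so the cached result is discarded
calls_after_prefix_change = len(calls)
```

Checking hashability up front, rather than catching TypeError around the cached call, avoids re-running the function when the TypeError comes from the function body itself.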
rht 9161f8c39b zerver/lib: Remove u prefix from strings. 2018-02-05 12:12:58 -08:00
Vishnu Ks 036dc53d20 messages: Rename last_visible_message_id to first_visible_message_id. 2018-01-22 19:53:44 -08:00
Vishnu Ks e6d3f8895f messages: Create function to calculate first visible message id. 2018-01-04 08:52:40 -05:00
rht ee546a33a3 zerver/lib: Use python 3 syntax for typing.
Edited by tabbott to improve various line-wrapping decisions.
2017-11-28 17:15:14 -08:00
rht 229a8b38c0 zerver/lib: Use Python 3 syntax for typing for several files.
This adds a number of annotations that had been missed in previous
passes.
2017-11-28 17:02:24 -08:00
Tim Abbott 646ba5b9e5 bulk_get_users: Fix issues with users in multiple realms.
The previous implementation had a subtle caching bug: because it was
sharing its cache with the `get_user_profile_by_email` cache, if a
user happened to have an email in that cache, we'd return it, even
though that user didn't match `base_query`.

This causes `get_cross_realm_users` to no longer have a problematic
caching bug.
2017-11-27 14:34:45 -08:00
rht fef7d6ba09 zerver/lib: Remove u prefix from strings.
License: Apache-2.0
Signed-off-by: rht <rhtbot@protonmail.com>
2017-11-03 15:34:37 -07:00
neiljp (Neil Pilgrim) 304e411944 mypy: Remove unused FuncT TypeVar in cache.py. 2017-10-31 00:02:17 -07:00
Tim Abbott f2e3e779eb mypy: Properly annotate generic_bulk_cached_fetch.
Along with fixing some minor bugs, this requires extracting out the
default functions so that we can do type: ignores on them properly.

While we're at it, we switch to the Python 3 syntax.
2017-10-28 10:07:15 -07:00
Tim Abbott 73c27e1277 cache: Fix type aliasing of cached_objects.
Previously, it was converted from a CompressedItemT to an ItemT
without changing the variable name.
2017-10-28 10:01:44 -07:00
Tim Abbott 94c1da7025 cache: Move generic_bulk_cached_fetch typevars up a bit. 2017-10-28 10:00:43 -07:00
neiljp (Neil Pilgrim) c063ba72a2 mypy: Improve typing of cache_with_key and cache decorators.
Fixes #1348.
2017-10-28 08:57:49 -07:00
Steve Howell df93a99b50 Cache only one row per message.
Before this change, we populated two cache entries for each
message that we sent.  The entries were largely redundant,
with the only difference being whether we sent the content
as raw markdown or as the rendered HTML.

This commit makes it so we only have one cache entry per
message, and it includes both content and rendered_content.

One legacy source of confusion here is that `content`
changes meaning when you're on the front end.  Here is the
situation going forward:

    database:
        content = raw
        rendered_content = rendered

    cache entry:
        content = raw
        rendered_content = rendered

    payload for the frontend:
        content = raw (for apply_markdown=False)
        content = rendered (for apply_markdown=True)
2017-10-26 16:35:28 -07:00
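The layout described in df93a99b50 can be sketched as one cache entry that carries both forms of the content, with the frontend payload choosing between them. The function names and the `id` field are hypothetical; only `content`, `rendered_content` and `apply_markdown` come from the commit message.

```python
from typing import Any, Dict

def build_cache_entry(message_id: int, content: str, rendered_content: str) -> Dict[str, Any]:
    # A single cache entry per message, holding both raw and rendered content.
    return {
        "id": message_id,
        "content": content,
        "rendered_content": rendered_content,
    }

def payload_for_frontend(cache_entry: Dict[str, Any], apply_markdown: bool) -> Dict[str, Any]:
    # Copy so the cached entry itself is never mutated.
    payload = dict(cache_entry)
    rendered = payload.pop("rendered_content")
    if apply_markdown:
        # On the frontend, "content" means the rendered HTML in this case.
        payload["content"] = rendered
    return payload

entry = build_cache_entry(1, "**hi**", "<p><strong>hi</strong></p>")
raw_payload = payload_for_frontend(entry, apply_markdown=False)
html_payload = payload_for_frontend(entry, apply_markdown=True)
```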
Steve Howell 14d2d4e506 Fix bug in flush_user_profile().
Every time we updated a UserProfile object, we were calling
delete_display_recipient_cache(), which churns the cache and
does an extra database hop to find subscriptions.  This was
due to saying `updated_fields` instead of `update_fields`.

This made us prone to cache churn for fields like UserProfile.pointer
that are fairly volatile.

Now we use the helper function changed().  To prevent the
opposite problem, we use all the fields that could invalidate
the cache.
2017-10-25 11:30:56 -07:00
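The bug fixed in 14d2d4e506 comes down to reading a keyword that is never present. A rough sketch, assuming the flush function receives the save signal's keyword arguments as a dict; `buggy_changed` and the field lists are written for illustration, and only `pointer` is taken from the commit message.

```python
from typing import Any, Dict, List

def changed(kwargs: Dict[str, Any], fields: List[str]) -> bool:
    update_fields = kwargs.get("update_fields")
    if update_fields is None:
        # A full save gives no field list, so assume anything may have changed.
        return True
    return any(field in update_fields for field in fields)

def buggy_changed(kwargs: Dict[str, Any], fields: List[str]) -> bool:
    # Reproduces the typo: "updated_fields" is never in kwargs, so this
    # always behaves as if every field had changed.
    update_fields = kwargs.get("updated_fields")
    if update_fields is None:
        return True
    return any(field in update_fields for field in fields)

# A save that only touched the volatile pointer field.
signal_kwargs = {"update_fields": ["pointer"]}
# Hypothetical list of fields that would invalidate the display recipient cache.
display_recipient_fields = ["email", "full_name"]

flush_with_bug = buggy_changed(signal_kwargs, display_recipient_fields)
flush_with_fix = changed(signal_kwargs, display_recipient_fields)
```

With the typo, every save triggers the flush; with the fix, a pointer-only update leaves the cache alone.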
Steve Howell c8875693c8 Extract changed() helper in flush_user_profile().
The verbose style of `changed` is partly to appease mypy.
2017-10-25 11:29:09 -07:00
Steve Howell b94c062368 Make the realm user cache include non-active users.
This is a preparatory commit that adds non-active users to
the realm user cache.  It mostly involves name changes and
removing an `is_active` filter from the relevant DB query.

The only consumer of this cache is `get_raw_user_data`, which
now filters on `is_active` in a dictionary comprehension (but
this will get moved around a bit in a subsequent commit).
2017-10-25 11:18:30 -07:00
rht 035ed93111 zerver/lib: remove `import six`. 2017-09-27 19:10:28 -07:00
rht 2e12fe5e2e zerver/lib: Remove print_function. 2017-09-27 18:05:45 -07:00
rht f43e54d352 zerver/lib: Remove absolute_import. 2017-09-27 10:00:39 -07:00
Juliana Bacelar 928dd06cc8 linter: Add lint rule banning 'import os.path' 2017-09-22 10:32:21 -07:00
Steve Howell 8ad7133351 Cache active_user_ids() more directly.
We now have a dedicated cache for active_user_ids() that only
stores a list of user_ids.

Before this commit, active_user_ids() used a cache of UserProfile
dictionaries, so it incurred unnecessary deserialization costs for
all the user fields that it sliced away in a list comprehension.

Because the cache is skinnier here, we also need to invalidate it
less frequently.  Basically, all we care about is new users, realm
deactivations, and user deactivations.

It's hard to measure how much this will improve performance, because
the speedup for any operation here is pretty minor, but we use this
function a lot, so hopefully it will make the overall system more
healthy.
2017-09-20 10:31:33 -07:00
Steve Howell 26735eeeac Only require realm_id for get_active_user_dicts_in_realm().
This is a preparatory commit that will eventually allow us
to avoid fetching realm info that we don't need, in other
parts of the codebase.
2017-09-20 10:31:33 -07:00
Steve Howell 0966bf1a48 Simplify get_stream_cache_key().
Before this commit, we could pass in either a Realm object
or a realm_id to get_stream_cache_key().  Now we consistently
pass it a realm_id.
2017-09-20 10:31:33 -07:00
Tim Abbott b8e7369dee mypy: Remove type: ignores not needed in Python 3. 2017-08-25 11:04:20 -07:00
Tim Abbott eeabed9119 models: Add new get_user_profile_by_api_key helper.
This results in a slight performance increase.
2017-08-24 23:17:08 -07:00
Abhijeet Kaur af7e08acb0 bots: Add UI to view bot types of existing bots in "Your bots".
Tweaked by tabbott for more standard internationalization.
2017-06-15 10:08:31 -07:00
Eklavya Sharma 690b6025fb mypy: Fix return type of a function. 2017-05-24 18:43:51 -07:00
Konstantin Gukov dd76222a3f Fetch system bots using new get_system_bot function.
This eliminates a bunch of uninteresting calls to
get_user_profile_by_email.
2017-05-23 10:30:40 -07:00
Vishnu Ks bdf7c6c02f models: Add get_user function.
This is intended to replace get_user_profile_by_email.
2017-05-22 11:26:44 -07:00
Tim Abbott 0990246289 avatar: Fix memcached query loop fetching bots.
Similar to the related issue with users, the new avatar storage had
accidentally added database queries in a loop to this code path.
2017-05-09 22:33:27 -07:00
Tim Abbott 8e47dc73bd avatar: Fix loop doing database queries in register.
Due to the refactoring of the avatar URL codepath that added realm IDs
to the URLs, we ended up calling `get_user_profile_by_email` inside
`get_avatar_url`, which in turn was called in a loop over all users
in a realm.

Needless to say, this resulted in a significant performance problem.

We fix this issue by passing in the data needed to compute the avatar
URL, rather than looking it up by email address.
2017-05-09 22:33:27 -07:00
Aditya Bansal 821be4519c pep8: Add compliance with rule E261 to cache.py. 2017-05-07 23:21:50 -07:00
Umair Khan 4d543217ba cache: Take hash of KEY_PREFIX to limit key size.
Key size of Memcached records should be less than 256.
2017-05-05 18:23:40 +05:00
hackerkid bf3b2ac673 Include timezone in user_dict fields.
Tweaked by tabbott to avoid adding timezone to bot dicts, since bots
don't need a timezone.
2017-04-14 10:33:55 -07:00
Rishi Gupta 128c431f14 cache.py: Change realm_alert_words_cache_key to use Realm.string_id. 2017-03-13 14:17:14 -07:00
Raghav Jajodia a3a03bd6a5 mypy: Added Dict, List and Set imports.
Fixed mypy errors associated with the upgrade.
2017-03-04 14:33:44 -08:00
Harshit Bansal 9d5be410af page_params: Modify `bot_list` to hold active as well as inactive bots.
Modify the `bot_list` to hold all the bots owned by a user
irrespective of whether the bot is active or inactive. Also
include the `is_active` field in `active_bot_dict_fields` to
distinguish between inactive and active bots.
2017-02-26 23:56:51 -08:00
Steve Howell fa31ad35c9 Fix display of changed avatars in old messages (page_params).
Our client code will now receive avatar_url in
page_params.people_list during page load, so it will be
able to use more current urls for old messages (the client
already had some logic for that and was just missing the
data).

We also add avatar_url to the realm_user/add event.

When we change the avatar, we make sure to always send a
realm_user/update event (even for bots).

We also needed to add avatar_version and
avatar_source to our active users cache.
2017-02-22 07:57:03 -08:00
Steve Howell 3a04831793 Add avatar_version to active_bot_dict_fields. 2017-02-17 10:19:56 -08:00
Umair Khan c585fa6eb4 change-email: Delete display recipient cache. 2017-02-07 21:49:31 -08:00
Umair Khan 41aa07adb6 change-email: Delete email caches on email change. 2017-02-07 18:43:26 -08:00
Tim Abbott e9158dd520 lint: Clean up E121 PEP-8 rule. 2017-01-23 21:02:39 -08:00
Tim Abbott 42612161df tornado: Remove unused caching code. 2017-01-19 16:36:31 -08:00
Juan Verhook cfa9c2eaf2 mypy: Update zerver directory to use Text 2016-12-29 09:12:15 -08:00
Robert Hönig 0917493588 mypy: Convert zerver/lib to use typing.Text. 2016-12-25 10:33:45 -08:00
Sampriti Panda c0326d1938 Add lint rule to disallow python calls with versions (e.g: python2, python3)
Fixes #2435
2016-12-19 08:00:48 -08:00
Umair Khan 770a899239 Django 1.10: Use single cache prefix for casper tests.
There is a change in Django 1.10 due to which whenever the password
of the user is changed the session hash changes. This change affects
us because we cache user profile objects and these cached objects need
to be refreshed. However, the signal sent by Django in which objects are
refreshed fails to refresh the cache for Tornado because it uses a
different cache prefix.

Note: Backend tests are not affected because they don't rely on Tornado.
2016-12-14 22:40:33 -08:00
Igor Tokarev c93f1d4eda Add oembed/Open Graph/Meta tags data retrieval from inline links.
This change adds support for displaying inline open graph previews for
links posted into Zulip.

It is designed to interact correctly with message editing.

This adds the new settings.INLINE_URL_EMBED_PREVIEW setting to control
whether this feature is enabled.

By default, this setting is currently disabled, so that we can burn it
in for a bit before it impacts users more broadly.

Eventually, we may want to make this manageable via a (set of?)
per-realm settings.  E.g. I can imagine a realm wanting to be able to
enable/disable it for certain URLs.
2016-12-07 17:40:18 -08:00
Bickio 6b0df43463 pep8: Fix E125. 2016-11-30 20:03:29 -08:00
Umair Khan 0536aeba4d Django 1.10: Use same cache prefix for JS tests.
Previously, the key prefix was based on the process id, so the JS tests
couldn't properly flush user profiles from the cache, as our application
spans multiple processes. This problem becomes apparent in the
json_change_settings view: after the user_profile is changed, the tornado
views continue to get the cached user profile corresponding to their
process id.
2016-11-26 15:10:50 -08:00
Umair Khan d81446805c Django 1.10: Use `caches` object to access cache. 2016-11-04 10:06:00 -07:00
Steve Howell ac994fdd51 Move three functions from models.py to lib/cache.py.
I move these three functions to lib/cache.py:

    to_dict_cache_key_id
    to_dict_cache_key
    flush_message

This will prepare us for a more significant refactoring that
eventually breaks down some circular dependencies with
Message and bugdown.
2016-10-04 11:31:20 -07:00
Tim Abbott 22fd7ba02a avatar: Move avatar hash computations to their own file. 2016-10-02 21:19:10 -07:00
Tim Abbott 8442e9249c Reflow annotation for generic_bulk_cached_fetch.
This is a test of mypy's new support for annotating functions that
take lots of arguments.
2016-09-27 20:36:56 -07:00
Tim Abbott 2e6aad669c cache: Add a basic annotation for cache_with_key. 2016-09-10 11:57:08 -07:00
Umair Khan 8e5e6a06f2 Delete cache entry for user profile.
Our flush functions update user profile cache entries which can cause
confusing race conditions (see e.g. #1257).  To resolve this, we move
all the user_profile flush functions to delete the entry instead of
updating it -- it will then be fetched as part of the next request
that needs to access the user object.

There are still races here, and there is perhaps an argument that a
better fix for this would be to re-fetch the object and then put it
into the cache, but this resolves the main cache correctness problem
we had with the previous implementation.

Fixes: #1322.
2016-07-28 13:43:14 -07:00
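The delete-on-flush strategy from 8e5e6a06f2 can be sketched with plain dictionaries standing in for the database and the cache. All names here are illustrative, including the key format.

```python
from typing import Dict

# Plain dicts standing in for the database and the remote cache.
database: Dict[int, Dict[str, str]] = {1: {"full_name": "Old Name"}}
cache: Dict[str, Dict[str, str]] = {}

def user_profile_cache_key(user_id: int) -> str:
    # Hypothetical key format.
    return "user_profile_by_id:%d" % (user_id,)

def get_user_profile(user_id: int) -> Dict[str, str]:
    key = user_profile_cache_key(user_id)
    if key not in cache:
        # Cache miss: fetch from the database and repopulate.
        cache[key] = dict(database[user_id])
    return cache[key]

def flush_user_profile(user_id: int) -> None:
    # Delete rather than overwrite; the next reader refetches the row.
    cache.pop(user_profile_cache_key(user_id), None)

get_user_profile(1)  # populates the cache
database[1]["full_name"] = "New Name"
flush_user_profile(1)
refetched = get_user_profile(1)
```

As the commit message notes, this does not remove every race, but the flush path no longer writes a possibly stale object into the cache.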
Eklavya Sharma 674f6999e1 Improve annotations of decorators. 2016-07-22 11:14:33 -07:00
Tim Abbott 014a13df7c cache: Fix echoing of mkdir command to console. 2016-07-18 14:25:13 -07:00
Taranjeet Singh ba3f9de9a9 zerver/lib/cache.py: Move remote_cache_prefix to var directory.
This commit ensures the var directory exists before it's needed in both
development and production environments.
2016-07-18 14:13:02 -07:00
Tim Abbott 72e948d19a Remove now-unused message_cache_key message cache.
Originally this cache was used to transmit data from Django to Tornado
(and also for general message caching purposes), but now nothing
actually reads from this cache, so we can eliminate it.
2016-07-08 17:58:56 -07:00
Eklavya Sharma 7ca1e658b5 zerver/lib/cache.py: Change some TypeVars to Any.
Change ItemT and CompressedItemT to Any.
See https://github.com/python/mypy/issues/1721.
2016-06-27 16:50:50 +05:30
Eklavya Sharma f82b28e835 zerver/lib/cache.py: Fix get_cache_backend's annotation. 2016-06-11 09:12:58 -07:00
Eklavya Sharma 0b2d1c30e9 zerver/lib/cache.py: Replace Any with appropriate models.
Due to a cyclic dependency issue, functions having models as parameters
were annotated as Any.
That issue is fixed by importing models inside an `if False:` block,
so that mypy sees them but they are not imported at runtime.
2016-06-11 09:12:58 -07:00
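The pattern described in 0b2d1c30e9 looks roughly like the sketch below. The function is a hypothetical example, and a string annotation is used here so that the name is never evaluated at runtime. The standard library also provides `typing.TYPE_CHECKING` for the same purpose.

```python
from types import SimpleNamespace

if False:
    # Never executed at runtime, so the import cycle is avoided; per the
    # commit message, mypy still sees the import and can use the type.
    from zerver.models import UserProfile

def user_profile_cache_key(user_profile: "UserProfile") -> str:
    # Hypothetical key format, for illustration only.
    return "user_profile_by_id:%s" % (user_profile.id,)

# Any object with an `id` attribute works at runtime, because the annotation
# is a string and is not evaluated.
example_key = user_profile_cache_key(SimpleNamespace(id=7))
```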
Eklavya Sharma ff4e95d941 Improve generic_bulk_cached_fetch annotation using TypeVars. 2016-06-11 09:12:42 -07:00
Eklavya Sharma d27a0e162a zerver/lib/cache.py: update_user_profile_caches return type is None.
In update_user_profile_caches, the return type in the annotation was
marked as Any.  Change that to None, because nothing is returned by
that function.
2016-06-11 09:11:52 -07:00
Eklavya Sharma 53084fe03c Use text_type as type of cache keys and update users.
This changes the type annotations for the cache keys in Zulip to be
consistently text_type, and updates the annotations for values that
are used as cache keys across the codebase.
2016-06-11 09:10:34 -07:00
Eklavya Sharma d3b80d94a2 Use appropriate string types and correctly encode/decode them. 2016-06-11 17:34:23 +05:30
Eklavya Sharma 286d23734a zerver/lib/cache.py: Remove unneeded return statements. 2016-06-09 16:57:11 -07:00
Ashish Kumar 31bf6b8259 Type annotation of zerver/models.py
[Substantially revised by tabbott]

This probably still has some bugs in it, but having mostly complete
annotations for models.py will help a lot for the annotations folks
are adding to other files.
2016-06-02 23:28:34 -07:00
Ashish Kumar cad342aff6 Correct annotation of generic_bulk_cached_fetch in zerver/lib/cache.py.
Previously, object_ids was tagged as an int, but it is called from
models.py with a string, so we make it an Any.
2016-06-01 14:00:49 -07:00
Eklavya Sharma 1ea6171179 Fix an annotation in zerver/lib/cache.py.
This is done to make annotations in zerver/lib/actions.py work correctly.
2016-05-25 15:11:48 -07:00
Tim Abbott 2409ac9b2f cache: Add type annotations to active_*_dict_fields. 2016-05-10 11:48:03 -07:00
Tim Abbott 0161d2fddd Cleanup guardian-based complexity in get_realm_user_dicts.
The old code for this lookup was unnecessarily complicated because we
were working around Guardian, where the `is_realm_admin` check was
extremely expensive.
2016-05-09 10:12:35 -07:00
Tim Abbott 2a2cbd60c3 cache: Fix fragile active_bot_dicts_in_realm caching model.
The issue here is similar to that in the previous commit.
2016-05-09 10:12:35 -07:00
Tim Abbott fbc7e977ac cache: Fix fragile active_user_dicts_in_realm caching model.
Previously we relied on having two matching lists of fields for the
get_active_user_dicts_in_realm, one in the actual code and the other
in the caching system.  By unifying these lists to have a single
source, we eliminate a class of caching bugs we might otherwise
regularly introduce.
2016-05-09 10:12:35 -07:00
Ashish Kumar 31408d639e Type annotation of zerver/lib/cache.py. 2016-04-29 14:43:48 -07:00
Tim Abbott 552caf661a Caching: Fix 'update_fields' not being present in .delete() 2016-04-20 15:12:53 -07:00