Commit Graph

1420 Commits

Author SHA1 Message Date
Anders Kaseorg c4748298bb ruff: Fix PERF102 Using only the keys/values of a dict.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Sahil Batra ae72151ec1 streams: Pass stream_weekly_traffic field in stream objects.
This commit adds code to pass stream traffic data using
the "stream_weekly_traffic" field in stream objects.

We already include the traffic data in Subscription objects,
but the traffic data does not depend on the user to stream
relationship and is stream-only information, so it's better
to include it in Stream objects. We may remove the traffic
data and other stream information fields for Subscription
objects in future.

This will help clients to correctly display the stream
traffic data in case where client receives a stream
creation event and no subscription event, for an already
existing stream which the user did not have access to before.
2023-08-06 18:06:42 -07:00
Sahil Batra 2533e64be6 streams: Remove get_client_data function.
This commit changes the code to not use get_client_data
function and instead use `stream_to_dict` function to
get the stream data in a dictionary form. This is a
prep commit add stream traffic data to Stream objects.
2023-08-06 18:02:47 -07:00
Alex Vandiver b67108c8c6 retention: Prevent deletion of partially-archived messages.
Previously, this code:
```python3
old_archived_attachments = ArchivedAttachment.objects.annotate(
    has_other_messages=Exists(
        Attachment.objects.filter(id=OuterRef("id"))
        .exclude(messages=None)
        .exclude(scheduled_messages=None)
    )
).filter(messages=None, create_time__lt=delta_weeks_ago, has_other_messages=False)
```

...protected from removal any ArchivedAttachment objects where there
was an Attachment which had _both_ a message _and_ a scheduled
message, instead of _either_ a message _or_ a scheduled message.
Since files are removed from disk when the ArchivedAttachment rows are
deleted, this meant that if an upload was referenced in two messages,
and one was deleted, the file was permanently deleted when the
ArchivedMessage and ArchivedAttachment were cleaned up, despite being
still referenced in live Messages and Attachments.

Switch from `.exclude(messages=None).exclude(scheduled_messages=None)`
to `.exclude(messages=None, scheduled_messages=None)` which "OR"s
those conditions appropriately.

Pull the relevant test into its own file, and expand it significantly
to cover this, and other, corner cases.
2023-08-06 13:40:02 -07:00
Ujjawal Modi c8bcb422f5 streams: Rename `can_remove_subscribers_group_id` parameter.
Earlier the API endpoints related to streams accepts and returns a
field `can_remove_subscribers_group_id` which represents the ID
of user_group whose members can remove subscribers from stream.

This commit renames this field to `can_remove_subscribers_group`.
2023-07-25 18:33:04 -07:00
Zixuan James Li 000761ac0c realm_playgrounds: Replace url_prefix with url_template.
Dropping support for url_prefix for RealmPlayground, the server now uses
url_template instead only for playground creation, retrieval and audit
logging upon removal.

This does the necessary handling so that url_template is expanded with
the extracted code.

Fixes #25723.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-24 17:40:59 -07:00
Zixuan James Li 641f60305d realm_playgrounds: Add url_template field.
As an intermediate step before we fully support url_template for realm
playgrounds, we populate url_template in the backend ensuring that all
the new entries will be validated. With a later backfilling migration,

we prepare the database such that all the records will have a valid URL
template.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-24 10:29:40 -07:00
Anders Kaseorg 2ae285af7c ruff: Fix PLR1714 Consider merging multiple comparisons.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-23 15:21:33 -07:00
Lauryn Menard 1cccdd8103 realm-settings: Make default_code_block_language empty string as default.
Updates the realm field `default_code_block_language` to have a default
value of an empty string instead of None. Also updates the web-app to
check for the empty string and not `null` to indicate no default is set.

This means that both new realms and existing realms that have no default
set will have the same value for this setting: an empty string.

Previously, new realms would have None if no default was set, while realms
that had set and then unset a value for this field would have an empty
string when no default was set.
2023-07-21 18:54:02 +02:00
Sahil Batra c11cf8eb54 users: Directly access id of foreign keys instead of full object.
We used to access the complete objects for UserProfile foreign
keys like "bot_owner" and "default_sending_stream", where we only
needed ID of them.

This commit fixes some of such instances and now we directly get
the id using "bot_owner_id" and "default_sending_stream_id" so
that we can avoid the unnecessary complexity of accessing the
complete object.
2023-07-20 10:44:39 -07:00
Sahil Batra 3e09a21929 models: Pass realm and bot_owner as args to select_related.
This commit updates the select_related calls in queries to get
UserProfile objects in get_user, get_user_by_delivery_email,
get_user_profile_by_id, get_user_profile_by_id_in_realm and
get_user_profile_by_api_key functions to pass "realm" and
"bot_owner" as arguments to select_related call.

These functions are used in different parts of code to get
the UserProfile object and realm is accessed using the user
object at many places.

"bot_owner" field is also used in some places like to check
whether a bot can access a stream, to check whether a user
can change modify another user, in webhooks code to send the
message to the bot owner, and in tests as well. There can be
some places where the bot owner is not required and in most
such cases the code would only be accessed for human users,
which means the bot_owner will be null for these cases and
would avoid complexity and performance issues.

Note that previously, no arguments were passed to select_related
and thus only realm field was fetched during the query.
2023-07-20 10:44:39 -07:00
Sahil Batra 71c66cd75c models: Pass realm as arg to select_related in get_system_bot.
This commit updates the select_related calls in queries to
get UserProfile object in get_syste_bot function pass "realm"
as argument to select_related call.

The "get_system_bot" call function is mostly used to get cross
realm bot which are used as senders to send messages.
The fields like default_events_register_stream and recipient
are not required for these cases. The bot_owner field is used
to check access to a stream to send message but the cross-realm
bots are handled differently and the bot_owner check is not
required.

Also, note that "realm" is the only non-null foreign key field
in UserProfile object, so select_related() was only fetching
realm object previously as well. But we should still pass
"realm" as argument in select_related call so that we can make
sure that only required fields are selected in case we add
more foreign keys to UserProfile in future.
2023-07-20 10:44:39 -07:00
Sahil Batra 584026b21f models: Pass realm as arg to select_related in get_user_profile_by_email.
This commit updates select_related call in get_user_profile_by_email
to pass "realm" as argument.

This function is intended to be used for manual manage.py shell
work so we just keep the behavior same as before as "realm" is
the only non-null related field in UserProfile.
2023-07-20 10:44:39 -07:00
Sahil Batra bb3945a32f models: Remove select_related call in get_active_users.
We do not use any related fields for the UserProfile objects
fetched by get_active_users, so we can simply remove the
select_related call.

The user object from get_active_users was used to get realm
but since get_active_users called from a realm object we can
directly use that realm object. This change also leads to
some changes in the cache code where we now pass the realm
to the function instead of selecting it from UserProfile object.
2023-07-20 10:44:39 -07:00
Steve Howell d19c1f7438 message fetching: Avoid duplicate cache layers.
This code removes a lot of complexity with very likely
positive overall impact on system performance and
negligible downside.

We already cache display recipients on a per-user
level, so there's no need for another cache layer on
top of that that keys them with recipient ids.

We avoid strange things where Alice/Bob and Bob/Charlie
get put into the top layer cache and then we still have
a cache miss on Alice/Charlie despite the lower level
cache being able to support per-user lookups.

This change does introduce an extra database round trip
if any of our messages have a huddle, but the query is
extremely cheap, and we can always try to cache that
function more directly or try to re-use some of our
other huddle-based caches.

As part of this, we clean up the names for the
lower-level per-user cache of display recipients, and
we simplify the cache keys.

We also stop passing in a full Recipient object to the
`bulk_get_huddle_user_ids` functions.

The local impact of this change should be easy to
measure (at least approximately), since we use this
function every time a user gets messages via the
/messages endpoint.
2023-07-19 11:07:33 -07:00
Steve Howell 03557a5568 huddles: Find huddle user ids more efficiently.
We restrict the columns, avoid quadratic looping,
and don't bother with order_by.

We also return the user ids (per recipient) as
sets, since that's how the only caller uses the
info (albeit implicitly via set.union accepting
a list).
2023-07-19 11:07:33 -07:00
Anders Kaseorg 052984bc14 utils: Remove make_safe_digest wrapper.
It’s unclear what was supposed to be “safe” about this wrapper.  The
hashlib API is fine without it, and we don’t want to encourage further
use of SHA-1.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 10:54:05 -07:00
Anders Kaseorg 143baa4243 python: Convert translated positional {} fields to {named} fields.
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-18 15:19:07 -07:00
Prakhar Pratyush 4c9d26ce17 mention: Send notifications for @topic wildcard mentions.
This commit completes the notifications part of the @topic
wildcard mention feature.

Notifications are sent to the topic participants for the
@topic wildcard mention.
2023-07-17 09:39:24 -07:00
Steve Howell b742f1241f realm emoji: Use a single cache for all lookups.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.

Cache misses should be very infrequent per realm.

If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.

I also renamed the two related functions.  I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.

Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
2023-07-17 09:35:53 -07:00
Steve Howell e988cf9b0a emoji cache: Don't join to UserProfile table.
We only need author id, and anything else in the table
would be possibly stale anyway.
2023-07-17 09:35:53 -07:00
Anders Kaseorg 7e707270f0 models: Convert deprecated index_together option to indexes.
index_together is slated for removal in Django 5.1:
https://docs.djangoproject.com/en/4.2/internals/deprecation/#deprecation-removed-in-5-1

We set the optional index names to match the previously generated
index names to avoid adding new migrations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-12 07:12:43 -07:00
Sahil Batra 2e4f7f6336 user_groups: Remove "@" from name of role-based system groups.
This commit removes "@" from name of role-based system groups
since we have added a restricion on having user group names
starting with "@" in the previous commit as they look odd in
mention syntax.

We also add a migration in this commit to update the name of
role-based system groups in existing realms to remove "@"
from the name. This migration also updates the names of
non-system user groups by removing the invalid prefixes
from their names and if there is a group already with that
name, we insted name the group as "group:{group_id}".

Fixes #26148.
2023-07-11 13:46:02 -07:00
Sahil Batra 929bf1243e user_groups: Disallow certain prefixes in group name.
We do not allow user group names to start with "@", "role:",
"user:", "stream:" and "channel:".

Group names starting with "@" look odd in mentions and
"role:", "user:" and "stream:" prefixes are reserved for
system groups which will be used in the new groups-based
permission model. We do not allow "channel:" prefix for
now just to be safe in a case where we use it instead of
"stream:" prefix for stream based groups in future.

Fixes part of #26148.
2023-07-11 13:46:02 -07:00
Sahil Batra ea3a7a9e6f user_groups: Add API restrictions for long user group names.
Previously we had database level restriction on length of
user group names. Now we add the same restriction to API
level as well, so we can return a better error response.
2023-07-11 13:46:02 -07:00
Steve Howell 89381a8072 cache: Eliminate get-stream-by-name cache.
We remove the cache functionality for the
get_realm_stream function, and we also change it to
return a thin Stream object (instead of calling
select_related with no arguments).

The main goal here is to remove code complexity, as we
have been prone to at least one caching validation bug
related to how Realm and UserGroup interact. That
particular bug was more theoretical than practical in
terms of its impact, to be clear.

Even if we were to be perfectly disciplined about only
caching thin stream objects and always making sure to
delete cache entries when stream data changed, we would
still be prone to ugly situations like having
transactions get rolled back before we delete the cache
entry. The do_deactivate_stream is a perfect example of
where we have to consider the best time to unset the
cache. If you unset it too early, then you are prone to
races where somebody else churns the cache right before
you update the database. If you set it too late, then
you can have an invalid entry after a rollback or
deadlock situation. If you just eliminate the cache as
a moving part, that whole debate is moot.

As the lack of test changes here indicates, we rarely
fetch streams by name any more in critical sections of
our code.

The one place where we fetch by name is in loading the
home page, but that is **only** when you specify a
stream name. And, of course, that only causes about an
extra millisecond of time.
2023-07-11 13:45:40 -07:00
Steve Howell 046e4c715b cache: Use DB for all bulk get-stream-by-name queries.
This changes bulk_get_streams so that it just uses the
database all the time.  Also, we avoid calling
select_related(), so that we just get back thin and
tidy Stream objects with simple queries.

About not caching any more:

It's actually pretty rare that we fetch streams by name
in the main application. It's usually API requests that
send in stream names to find more info about streams.

It also turns out that for large queries (>= ~30 rows
for my measurements) it's more efficent to hit the
database than memcached. The database is super fast at
scale; it's just the startup cost of having Django
construct the query, and then having the database do
query planning or whatever, that slows us down. I don't
know the exact bottleneck, but you can clearly measure
that one-row queries are slow (on the order of a full
millisecond or so) but the marginal cost of additional
rows is minimal assuming you have a decent index (20
microseconds per row on my droplet).

All the query-count changes in the tests revolve around
unsubscribing somebody from a stream, and that's a
particularly odd use case for bulk_get_streams, since
you generally unsubscribe from a single stream at a
time. If there are some use cases where you do want to
unsubscribe from multiple streams, we should move
toward passing in stream ids, at least from the
application. And even if we don't do that, our cost for
most queries is a couple milliseconds.
2023-07-11 13:45:40 -07:00
Zixuan James Li 3349ac9f86 user_groups: Audit UserGroup group based setting changes.
This add audit log entries when any group based setting of a user group
is updated. We store both the old and new values in extra_data, along
with the name of that setting. Entries populated during user group creation
are hardcoded to track "can_mention_group".

Potentially we can adjust "set_defaults_for_group_settings" so that it
populates realm audit logs with it, but that is out of scope for this change.

We use an atomic transaction so that the audit logs are committed
together with the updates.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li 71de14ab43 models: Add modified_user_group.
This also adds the supported event types for changes to UserGroup.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Prakhar Pratyush 179d5cb37d mention: Replace 'wildcards' with 'stream_wildcards'.
This prep commit replaces the 'wildcard' keyword in the codebase
with 'stream_wildcard' at some places for better readability, as
we plan to introduce 'topic_wildcards' as a part of the
'@topic mention' project.

Currently, 'wildcards = ["all", "everyone", "stream"]' which is an
alias to mention everyone in the stream, hence better renamed as
'stream_wildcards'.

Eventually, we will have:
'stream_wildcard' as an alias to mention everyone in the stream.
'topic_wildcard' as an alias to mention everyone in the topic.
'wildcard' refers to 'stream_wildcard' and 'topic_wildcard' as a whole.
2023-07-03 22:03:17 -07:00
Sahil Batra 2763f9b575 user_groups: Add can_mention_group setting.
This commit adds a new can_mention_group setting which will be
used to determine who can mention a particular group.

Fixes a part of #25927.
2023-06-30 17:28:33 -07:00
Ujjawal Modi a361c23aac alert_words: Refactor the code to flush alert_words cache.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the AlertWord
objects being deleted after Realm has been deleted from
the database.

In order to handle this, we update realm_alert_words_cache_key,
realm_alert_words_automaton_cache_key, and flush_realm_alert_words
functions to accept realm_id as parameter instead of realm
object, so that  the code for flushing the cache works even
after the realm is deleted. This change is fine because
eventually only realm_id is used by these functions and there
is no need of the complete realm object.
2023-06-28 18:03:32 -07:00
Ujjawal Modi f7346f36fc attachments: Refactor code for flushing used_upload_space cache.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the Attachment
objects being deleted after Realm has been deleted from
the database.

In order to handle this, we update
get_realm_used_upload_space_cache_key function to accept
realm_id as parameter instead of realm object, so that
the code for flushing the cache works even after the
realm is deleted. This change is fine because eventually
only realm_id is used by this function and there is no
need of the complete realm object.
2023-06-28 18:03:32 -07:00
Ujjawal Modi 535a088d0b bots: Refactor code for flushing bots cache.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the UserProfile
objects being deleted after Realm has been deleted from
the database.

In order to handle this, we update bot_dicts_in_realm_cache_key
function to accept realm_id as parameter instead of realm
object, so that  the code for flushing the cache works even
after the realm is deleted. This change is fine because
eventually only realm_id is used by this function and there is
no need of the complete realm object.
2023-06-28 18:03:32 -07:00
Ujjawal Modi fd0434a052 realm_emoji: Refactor code for flushing realm_emoji cache.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the RealmEmoji
objects being deleted after Realm has been deleted from
the database.

In order to handle this, we update get_realm_emoji_dicts,
get_realm_emoji_cache_key, get_active_realm_emoji_cache_key,
get_realm_emoji_uncached and get_active_realm_emoji_uncached
functions to accept realm_id as parameter instead of realm
object, so that the code for flushing the cache works even
after the realm is deleted. This change is fine because
eventually only realm_id is used by these functions and
there is no need of the complete realm object.
2023-06-28 18:03:32 -07:00
Sahil Batra 138a67d97e types: Add default_group_name field to GroupPermissionSetting type.
This commit adds default_group_name field to GroupPermissionSetting
type which will be used to store the name of the default group for
that setting which would in most cases be one of the role-based
system groups. This will be helpful when we would have multiple
settings and we would need to set the defaults while creating
realm and streams.
2023-06-28 18:03:32 -07:00
Lauryn Menard d3f7cfccbc zerver: Update comments with "private message" or "PM".
Updates comments/doc-strings that use "private message" or "PM" in
files in the `/zerver` directory to instead use "direct message".
2023-06-23 11:24:13 -07:00
Anders Kaseorg c09e7d6407 codespell: Correct “requestor” to “requester”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-06-20 16:17:55 -07:00
Zixuan James Li 5901ffb0ab models: Fix typo in comment.
"This consistent ordering is important to prevent to prevent"
->
"This consistent ordering is important to prevent"
2023-06-15 17:50:34 -04:00
Prakhar Pratyush 134058b06d settings: Configure 'enable_followed_topic_audible_notifications'.
This commit makes it possible for users to control the
audible desktop notifications for messages sent to followed topics
via a global notification setting.

There is no support for configuring this setting through the UI yet.
2023-06-13 18:01:41 -07:00
Prakhar Pratyush a848c744c3 settings: Configure 'enable_followed_topic_desktop_notifications'.
This commit makes it possible for users to control the
visual desktop notifications for messages sent to followed topics
via a global notification setting.

There is no support for configuring this setting through the UI yet.
2023-06-13 18:01:41 -07:00
Prakhar Pratyush e71d3ada87 settings: Add wildcard mention notifications for the followed topics.
This commit makes it possible for users to control the wildcard
mention notifications for messages sent to followed topics
via a global notification setting.

There is no support for configuring this setting
through the UI yet.
2023-06-13 18:01:41 -07:00
Prakhar Pratyush d73c715dc2 settings: Add push notifications for the followed topics.
This commit makes it possible for users to control
the push notifications for messages sent to followed topics
via a global notification setting.

There is no support for configuring this setting
through the UI yet.
2023-06-13 18:01:41 -07:00
Prakhar Pratyush 5e5538886f settings: Add email notifications for the followed topics.
This commit makes it possible for users to control
the email notifications for messages sent to followed topics
via a global notification setting.
Although there is no support for configuring this setting
through the UI yet.

Add five new fields to the UserBaseSettings class for
the "followed topic notifications" feature, similar to
stream notifications. But this commit consists only of
the implementation of email notifications.
2023-06-13 18:01:41 -07:00
Zixuan Li e39e04c3ce
migration: Add `extra_data_json` for audit log models.
Note that we use the DjangoJSONEncoder so that we have builtin support
for parsing Decimal and datetime.

During this intermediate state, the migration that creates
extra_data_json field has been run. We prepare for running the backfilling
migration that populates extra_data_json from extra_data.

This change implements double-write, which is important to keep the
state of extra data consistent. For most extra_data usage, this is
handled by the overriden `save` method on `AbstractRealmAuditLog`, where
we either generates extra_data_json using orjson.loads or
ast.literal_eval.

While backfilling ensures that old realm audit log entries have
extra_data_json populated, double-write ensures that any new entries
generated will also have extra_data_json set. So that we can then safely
rename extra_data_json to extra_data while ensuring the non-nullable
invariant.

For completeness, we additionally set RealmAuditLog.NEW_VALUE for
the USER_FULL_NAME_CHANGED event. This cannot be handled with the
overridden `save`.

This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795

Note that extra_data_json at this point is not used yet. So the test
cases do not need to switch to testing extra_data_json. This is later
done after we rename extra_data_json to extra_data.

Double-write for the remote server audit logs is special, because we only
get the dumped bytes from an external source. Luckily, none of the
payload carries extra_data that is not generated using orjson.dumps for
audit logs of event types in SYNC_BILLING_EVENTS. This can be verified
by looking at:

`git grep -A 6 -E "event_type=.*(USER_CREATED|USER_ACTIVATED|USER_DEACTIVATED|USER_REACTIVATED|USER_ROLE_CHANGED|REALM_DEACTIVATED|REALM_REACTIVATED)"`

Therefore, we just need to populate extra_data_json doing an
orjson.loads call after a None-check.

Co-authored-by: Zixuan James Li <p359101898@gmail.com>
2023-06-07 12:14:43 -07:00
Sahil Batra 48e99657ad events: Remove realm_community_topic_editing_limit_seconds.
This commit removes realm_community_topic_editing_limit_seconds
field from register response since topic edit limit is now
controlled by move_messages_within_streams_limit_seconds
setting.
We also remove DEFAULT_COMMUNITY_TOPIC_EDITING_LIMIT_SECONDS
constant since it is no longer used.
2023-05-25 17:26:21 -07:00
Alex Vandiver c978bfaa32 models: Add a unique index on UserProfile.api_key.
This prevents `get_user_profile_by_api_key` from doing a sequential
scan.

Doing this requires moving the generation of initial api_key values
into the column definition, so that even bare calls to
`UserProfile.objects.create` (e.g. from tests) call appropriately
generate a random initial value.
2023-05-19 11:11:04 -07:00
Sahil Batra 7f01b3fb63 users: Set tos_version to -1 for users who have not logged-in yet.
We now set tos_version to "-1" for imported users and the ones
created using API or using other methods like LDAP, SCIM and
management commands. This value will help us to allow users to
change email address visibility setting during first login.
2023-05-16 13:52:56 -07:00
Lauryn Menard 90cc2716f0 scheduled-messages: Update API dicts for `failed` boolean field.
Adds the `failed` boolean from the ScheduledMessage to the API dict
returned by scheduled message events and register response, and by
fetching the user's scheduled messages.

`failed` will only be true when the server has tried to send the
scheduled message and failed due to an error.
2023-05-12 15:48:59 -07:00
Tim Abbott 602e4c2aa3 scheduled_messages: Add focused scheduled message indexes. 2023-05-09 13:48:28 -07:00