Commit Graph

1446 Commits

Author SHA1 Message Date
Lalit 2b566c778b user_settings: Add new `web_stream_unreads_count_display_policy` field.
This is a backend change that will help us support the new "Show unread counts for"
user display setting.
2023-09-13 18:45:45 -07:00
Alex Vandiver 5874c6542f migrations: Remove indexes on Message without realm_id.
These indexes should no longer be necessary after the changes in the
previous commit.
2023-09-11 15:00:37 -07:00
Alex Vandiver b94402152d models: Always search Messages with a realm_id or id limit.
Unless there is a limit on `id`, always provide a `realm_id` limit as
well.  We also notate which index is expected to be used in each
query.
2023-09-11 15:00:37 -07:00
Alex Vandiver 3518d31797 migrations: Add indexes with realm_id.
This is designed to help PostgreSQL have better specificity and
locality in its indexes.  Subsequent commits will adjust the code to
make sure that we use these indexes rather than the `realm_id`-less
versions.

We do not add a `realm_id` variation to the full-text index, since
it is a GIN index; multi-column GIN indexes are not terribly
performant, require the `btree_gin` extension for `int` types (which
requires superuser privileges on PostgreSQL 12 and earlier), and
cannot be consistently added concurrently on running instances.

After all indexes have been made, we also run `CREATE STATISTICS` in
order to give PostgreSQL the opportunity to realize that recipient and
sender are highly correlated with message realm, allowing it to
estimate that `(realm_id, recipient_id)` is likely as specific as
matching a given `recipient_id`, instead of as likely as matching
`realm_id` times matching a `recipient_id`.  Finally, those statistics
must be filled by `ANALYZE zerver_message`, which is run last.
2023-09-11 15:00:37 -07:00
Alex Vandiver d6745209f2 django: Use .exists() instead of .count() when possible. 2023-09-11 15:00:37 -07:00
Anders Kaseorg 2cd018ce57 models: Remove duplicate index definition for date_sent.
Commit cf0eb46afc added this to let
Django understand the CREATE INDEX CONCURRENTLY statement that had
been hidden in a RunSQL query in migration 0244.  However, migration
0245 explained that same index to Django in a different way by setting
db_index=True.  Move that to 0244 where the index is actually created,
using SeparateDatabaseAndState.

Also remove the part of the SQL in 0245 that was mirrored by dummy
state_operations, and replace it with real operations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-07 16:44:43 -07:00
Tim Abbott 6c83bbcbdb settings: Disallow everyone group for new setting.
This is important because the "guests" value isn't one that we'd
expect anyone to pick intentionally, and in particular isn't an
available option for the similar/adjacent "email invitations" setting.
2023-09-07 14:21:01 -07:00
Ujjawal Modi ec49c3acc8 invites: Rename `can_invite_others_to_realm` local variables.
This commit rename the existing setting `Who can invite users to this
organization` to `Who can send email invitations to new users` and
also renames all the variables related to this setting that do not
require a change to the API.

This was done for better code readability as a new setting
`Who can create invite links` will be added in future commits.
2023-09-07 14:21:01 -07:00
Ujjawal Modi f67cef8885 invite: Add new setting for "Who can create multiuse invite links".
This commit does the backend changes required for adding a realm
setting based on groups permission model and does the API changes
required for the new setting `Who can create multiuse invite link`.
2023-09-07 14:21:01 -07:00
Ujjawal Modi 9eccb4336e types: Add id_field_name field to GroupPermissionSetting type.
This commit adds id_field_name field to GroupPermissionSetting
type which will be used to store the string formed by concatenation
of setting_name and `_id`.
2023-09-07 14:21:01 -07:00
Sahil Batra 5a8416ff6a message: Do not pass "sender__realm" to select_related.
We have modified the code to directly fetch realm from Message
object instead of "sender" field and thus we no longer need to
fetch "sender__realm" using select_related.
2023-08-23 11:38:32 -07:00
Sahil Batra 7295028194 message: Access realm object directly from message.
We can directly get the realm object from Message object now
and there is no need to get the realm object from "sender"
field of Message object.

After this change, we would not need to fetch "sender__realm"
field using "select_related" and instead only passing "realm"
to select_related when querying Message objects would be enough.

This commit also updates a couple of cases to directly access
realm ID from message object and not message.sender. Although
we have fetched sender object already, so accessing realm_id
from message directly or from message.sender should not matter,
but we can be consistent to directly get realm from Message
object whenever possible.
2023-08-23 11:38:32 -07:00
Lauryn Menard 438bcc1585 email-templates: Remove followup_day from EMAIL_TYPES.
Now that we're using the new templates for the onboarding emails,
remove "followup_day1" and "followup_day2" from the EMAIL_TYPES
that are used for scheduled emails.
2023-08-18 16:25:48 -07:00
Lauryn Menard 5e29e025c5 email-templates: Add zulip_onboarding_topics email templates.
The "followup_day2" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "zulip_onboarding_topics".

Because any existing scheduled emails that use the "followup_day2"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
2023-08-18 16:25:48 -07:00
Lauryn Menard c491bef07b email-templates: Add account_registered email templates.
The "followup_day1" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "account_registered".

Because any existing scheduled emails that use the "followup_day1"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
2023-08-18 16:25:48 -07:00
Zixuan James Li 30495cec58 migration: Rename extra_data_json to extra_data in audit log models.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.

- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.

- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.

- We remove None-checks for extra_data because it is now no longer
nullable.

- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.

- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.

- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.

Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-08-16 17:18:14 -07:00
Zixuan James Li 37660dd0e7 linkifier: Support reordering linkifiers.
This adds API support to reorder linkifiers and makes sure that the
returned lists of linkifiers from `GET /events`, `POST /register`, and
`GET /realm/linkifiers` are always sorted with the order that they
should processed when rendering linkifiers.

We set the new `order` field to the ID with the migration. This
preserves the order of the existing linkifiers.

New linkifiers added will always be ordered the last. When reordering,
the `order` field of all linkifiers in the same realm is updated, in
a manner similar to how we implement ordering for
`custom_profile_fields`.
2023-08-14 15:21:48 -07:00
Steve Howell 51db22c86c per-request caches: Add per_request_cache library.
We have historically cached two types of values
on a per-request basis inside of memory:

    * linkifiers
    * display recipients

Both of these caches were hand-written, and they
both actually cache values that are also in memcached,
so the per-request cache essentially only saves us
from a few memcached hits.

I think the linkifier per-request cache is a necessary
evil. It's an important part of message rendering, and
it's not super easy to structure the code to just get
a single value up front and pass it down the stack.

I'm not so sure we even need the display recipient
per-request cache any more, as we are generally pretty
smart now about hydrating recipient data in terms of
how the code is organized. But I haven't done thorough
research on that hypotheseis.

Fortunately, it's not rocket science to just write
a glorified memoize decorator and tie it into key
places in the code:

    * middleware
    * tests (e.g. asserting db counts)
    * queue processors

That's what I did in this commit.

This commit definitely reduces the amount of code
to maintain. I think it also gets us closer to
possibly phasing out this whole technique, but that
effort is beyond the scope of this PR. We could
add some instrumentation to the decorator to see
how often we get a non-trivial number of saved
round trips to memcached.

Note that when we flush linkifiers, we just use
a big hammer and flush the entire per-request
cache for linkifiers, since there is only ever
one realm in the cache.
2023-08-11 11:09:34 -07:00
Steve Howell f8ec00b895 mypy: Improve type checks for user display recipients. 2023-08-10 18:13:43 -07:00
Steve Howell 5b569ab865 cache: Stringify stream recipients without the cache.
We generally want to avoid extra moving parts when we
stringify objects. We also want to phase out the use
of get_display_recipient for streams.

Note that we still hit get_display_recipient to
stringify DM and huddle objects, and it's kind of ugly
how we do it, but that's outside the scope of my
current PR.
2023-08-10 18:13:43 -07:00
Prakhar Pratyush c4e4737cc6 notification_trigger: Rename `private_message` to `direct_message`.
This commit renames the 'PRIVATE_MESSAGE' attribute of the
'NotificationTriggers' class to 'DIRECT_MESSAGE'.

Custom migration to update the existing value in the database.

It includes 'TODO/compatibility' code to support the old
notification trigger value 'private_message' in the
push notification queue during the Zulip server upgrades.

Earlier 'private_message' was one of the possible values for the
'trigger' property of the '[`POST /zulip-outgoing-webhook`]' response;
Update the docs to reflect the change in the above-mentioned trigger
value.
2023-08-10 17:41:49 -07:00
Sahil Batra 36f8aba7db message: Pass args to select_related call for Message objects.
This commit adds code to pass all the required arguments to
select_related call for Message objects such that only the
required related fields are fetched from the database.

Previously, we did not pass any arguments to select_related,
so all the directly and indirectly related fields were fetched
when many of them were actually not being used and made the
query unnecessarily complex.
2023-08-10 17:35:43 -07:00
Sahil Batra ab488010b3 models: Pass args to select_related in get_stream_by_id_in_realm.
This commit updates the code to pass "realm" and "recipient" as
arguments to select_related call in get_stream_by_id_in_realm.

Previously, since there was no arguments, it fetched
can_remove_subscribers_group and the related fields of
"Realm" model as well which were not being used, but
did not fetch "recipient" as it is a nullable field.
2023-08-10 17:35:43 -07:00
Sahil Batra 91a58d026b models: Remove get_huddle_recipient and use get_or_create_huddle.
This commit removes get_huddle_recipient function and we now use
get_or_create_huddle in get_recipient_from_user_profiles.

As a result of this change, we do not fetch the recipient from
Huddle object but instead get it using the "id" and "recipient_id"
fields available from Huddle object like we do for a personal
message. This change allows us to not fetch recipient object
using select_related when querying the Huddle object.
2023-08-10 17:35:43 -07:00
Sahil Batra 2c28b49680 models: Fetch "recipient" object when along with "Huddle" object.
We now fetch recipient object when querying "Huddle" object in
get_or_create_huddle_backend as this query is eventually used
to get the recipient object only in get_huddle_recipient.

This commit also updates the select_related call in the code to
populate Huddle objects in cache to pass "Recipient" as argument.
Previously no argument was passed to select_related and thus no
related objects were being fetched, with no non-null related fields
being present.
2023-08-10 17:35:43 -07:00
Anders Kaseorg 562a79ab76 ruff: Fix PERF401 Use a list comprehension to create a transformed list.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Anders Kaseorg c4748298bb ruff: Fix PERF102 Using only the keys/values of a dict.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Sahil Batra ae72151ec1 streams: Pass stream_weekly_traffic field in stream objects.
This commit adds code to pass stream traffic data using
the "stream_weekly_traffic" field in stream objects.

We already include the traffic data in Subscription objects,
but the traffic data does not depend on the user to stream
relationship and is stream-only information, so it's better
to include it in Stream objects. We may remove the traffic
data and other stream information fields for Subscription
objects in future.

This will help clients to correctly display the stream
traffic data in case where client receives a stream
creation event and no subscription event, for an already
existing stream which the user did not have access to before.
2023-08-06 18:06:42 -07:00
Sahil Batra 2533e64be6 streams: Remove get_client_data function.
This commit changes the code to not use get_client_data
function and instead use `stream_to_dict` function to
get the stream data in a dictionary form. This is a
prep commit add stream traffic data to Stream objects.
2023-08-06 18:02:47 -07:00
Alex Vandiver b67108c8c6 retention: Prevent deletion of partially-archived messages.
Previously, this code:
```python3
old_archived_attachments = ArchivedAttachment.objects.annotate(
    has_other_messages=Exists(
        Attachment.objects.filter(id=OuterRef("id"))
        .exclude(messages=None)
        .exclude(scheduled_messages=None)
    )
).filter(messages=None, create_time__lt=delta_weeks_ago, has_other_messages=False)
```

...protected from removal any ArchivedAttachment objects where there
was an Attachment which had _both_ a message _and_ a scheduled
message, instead of _either_ a message _or_ a scheduled message.
Since files are removed from disk when the ArchivedAttachment rows are
deleted, this meant that if an upload was referenced in two messages,
and one was deleted, the file was permanently deleted when the
ArchivedMessage and ArchivedAttachment were cleaned up, despite being
still referenced in live Messages and Attachments.

Switch from `.exclude(messages=None).exclude(scheduled_messages=None)`
to `.exclude(messages=None, scheduled_messages=None)` which "OR"s
those conditions appropriately.

Pull the relevant test into its own file, and expand it significantly
to cover this, and other, corner cases.
2023-08-06 13:40:02 -07:00
Ujjawal Modi c8bcb422f5 streams: Rename `can_remove_subscribers_group_id` parameter.
Earlier the API endpoints related to streams accepts and returns a
field `can_remove_subscribers_group_id` which represents the ID
of user_group whose members can remove subscribers from stream.

This commit renames this field to `can_remove_subscribers_group`.
2023-07-25 18:33:04 -07:00
Zixuan James Li 000761ac0c realm_playgrounds: Replace url_prefix with url_template.
Dropping support for url_prefix for RealmPlayground, the server now uses
url_template instead only for playground creation, retrieval and audit
logging upon removal.

This does the necessary handling so that url_template is expanded with
the extracted code.

Fixes #25723.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-24 17:40:59 -07:00
Zixuan James Li 641f60305d realm_playgrounds: Add url_template field.
As an intermediate step before we fully support url_template for realm
playgrounds, we populate url_template in the backend ensuring that all
the new entries will be validated. With a later backfilling migration,

we prepare the database such that all the records will have a valid URL
template.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-24 10:29:40 -07:00
Anders Kaseorg 2ae285af7c ruff: Fix PLR1714 Consider merging multiple comparisons.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-23 15:21:33 -07:00
Lauryn Menard 1cccdd8103 realm-settings: Make default_code_block_language empty string as default.
Updates the realm field `default_code_block_language` to have a default
value of an empty string instead of None. Also updates the web-app to
check for the empty string and not `null` to indicate no default is set.

This means that both new realms and existing realms that have no default
set will have the same value for this setting: an empty string.

Previously, new realms would have None if no default was set, while realms
that had set and then unset a value for this field would have an empty
string when no default was set.
2023-07-21 18:54:02 +02:00
Sahil Batra c11cf8eb54 users: Directly access id of foreign keys instead of full object.
We used to access the complete objects for UserProfile foreign
keys like "bot_owner" and "default_sending_stream", where we only
needed ID of them.

This commit fixes some of such instances and now we directly get
the id using "bot_owner_id" and "default_sending_stream_id" so
that we can avoid the unnecessary complexity of accessing the
complete object.
2023-07-20 10:44:39 -07:00
Sahil Batra 3e09a21929 models: Pass realm and bot_owner as args to select_related.
This commit updates the select_related calls in queries to get
UserProfile objects in get_user, get_user_by_delivery_email,
get_user_profile_by_id, get_user_profile_by_id_in_realm and
get_user_profile_by_api_key functions to pass "realm" and
"bot_owner" as arguments to select_related call.

These functions are used in different parts of code to get
the UserProfile object and realm is accessed using the user
object at many places.

"bot_owner" field is also used in some places like to check
whether a bot can access a stream, to check whether a user
can change modify another user, in webhooks code to send the
message to the bot owner, and in tests as well. There can be
some places where the bot owner is not required and in most
such cases the code would only be accessed for human users,
which means the bot_owner will be null for these cases and
would avoid complexity and performance issues.

Note that previously, no arguments were passed to select_related
and thus only realm field was fetched during the query.
2023-07-20 10:44:39 -07:00
Sahil Batra 71c66cd75c models: Pass realm as arg to select_related in get_system_bot.
This commit updates the select_related calls in queries to
get UserProfile object in get_syste_bot function pass "realm"
as argument to select_related call.

The "get_system_bot" call function is mostly used to get cross
realm bot which are used as senders to send messages.
The fields like default_events_register_stream and recipient
are not required for these cases. The bot_owner field is used
to check access to a stream to send message but the cross-realm
bots are handled differently and the bot_owner check is not
required.

Also, note that "realm" is the only non-null foreign key field
in UserProfile object, so select_related() was only fetching
realm object previously as well. But we should still pass
"realm" as argument in select_related call so that we can make
sure that only required fields are selected in case we add
more foreign keys to UserProfile in future.
2023-07-20 10:44:39 -07:00
Sahil Batra 584026b21f models: Pass realm as arg to select_related in get_user_profile_by_email.
This commit updates select_related call in get_user_profile_by_email
to pass "realm" as argument.

This function is intended to be used for manual manage.py shell
work so we just keep the behavior same as before as "realm" is
the only non-null related field in UserProfile.
2023-07-20 10:44:39 -07:00
Sahil Batra bb3945a32f models: Remove select_related call in get_active_users.
We do not use any related fields for the UserProfile objects
fetched by get_active_users, so we can simply remove the
select_related call.

The user object from get_active_users was used to get realm
but since get_active_users called from a realm object we can
directly use that realm object. This change also leads to
some changes in the cache code where we now pass the realm
to the function instead of selecting it from UserProfile object.
2023-07-20 10:44:39 -07:00
Steve Howell d19c1f7438 message fetching: Avoid duplicate cache layers.
This code removes a lot of complexity with very likely
positive overall impact on system performance and
negligible downside.

We already cache display recipients on a per-user
level, so there's no need for another cache layer on
top of that that keys them with recipient ids.

We avoid strange things where Alice/Bob and Bob/Charlie
get put into the top layer cache and then we still have
a cache miss on Alice/Charlie despite the lower level
cache being able to support per-user lookups.

This change does introduce an extra database round trip
if any of our messages have a huddle, but the query is
extremely cheap, and we can always try to cache that
function more directly or try to re-use some of our
other huddle-based caches.

As part of this, we clean up the names for the
lower-level per-user cache of display recipients, and
we simplify the cache keys.

We also stop passing in a full Recipient object to the
`bulk_get_huddle_user_ids` functions.

The local impact of this change should be easy to
measure (at least approximately), since we use this
function every time a user gets messages via the
/messages endpoint.
2023-07-19 11:07:33 -07:00
Steve Howell 03557a5568 huddles: Find huddle user ids more efficiently.
We restrict the columns, avoid quadratic looping,
and don't bother with order_by.

We also return the user ids (per recipient) as
sets, since that's how the only caller uses the
info (albeit implicitly via set.union accepting
a list).
2023-07-19 11:07:33 -07:00
Anders Kaseorg 052984bc14 utils: Remove make_safe_digest wrapper.
It’s unclear what was supposed to be “safe” about this wrapper.  The
hashlib API is fine without it, and we don’t want to encourage further
use of SHA-1.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 10:54:05 -07:00
Anders Kaseorg 143baa4243 python: Convert translated positional {} fields to {named} fields.
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-18 15:19:07 -07:00
Prakhar Pratyush 4c9d26ce17 mention: Send notifications for @topic wildcard mentions.
This commit completes the notifications part of the @topic
wildcard mention feature.

Notifications are sent to the topic participants for the
@topic wildcard mention.
2023-07-17 09:39:24 -07:00
Steve Howell b742f1241f realm emoji: Use a single cache for all lookups.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.

Cache misses should be very infrequent per realm.

If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.

I also renamed the two related functions.  I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.

Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
2023-07-17 09:35:53 -07:00
Steve Howell e988cf9b0a emoji cache: Don't join to UserProfile table.
We only need author id, and anything else in the table
would be possibly stale anyway.
2023-07-17 09:35:53 -07:00
Anders Kaseorg 7e707270f0 models: Convert deprecated index_together option to indexes.
index_together is slated for removal in Django 5.1:
https://docs.djangoproject.com/en/4.2/internals/deprecation/#deprecation-removed-in-5-1

We set the optional index names to match the previously generated
index names to avoid adding new migrations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-12 07:12:43 -07:00
Sahil Batra 2e4f7f6336 user_groups: Remove "@" from name of role-based system groups.
This commit removes "@" from name of role-based system groups
since we have added a restricion on having user group names
starting with "@" in the previous commit as they look odd in
mention syntax.

We also add a migration in this commit to update the name of
role-based system groups in existing realms to remove "@"
from the name. This migration also updates the names of
non-system user groups by removing the invalid prefixes
from their names and if there is a group already with that
name, we insted name the group as "group:{group_id}".

Fixes #26148.
2023-07-11 13:46:02 -07:00
Sahil Batra 929bf1243e user_groups: Disallow certain prefixes in group name.
We do not allow user group names to start with "@", "role:",
"user:", "stream:" and "channel:".

Group names starting with "@" look odd in mentions and
"role:", "user:" and "stream:" prefixes are reserved for
system groups which will be used in the new groups-based
permission model. We do not allow "channel:" prefix for
now just to be safe in a case where we use it instead of
"stream:" prefix for stream based groups in future.

Fixes part of #26148.
2023-07-11 13:46:02 -07:00