Commit Graph

17002 Commits

Author SHA1 Message Date
Sahil Batra 584026b21f models: Pass realm as arg to select_related in get_user_profile_by_email.
This commit updates select_related call in get_user_profile_by_email
to pass "realm" as argument.

This function is intended to be used for manual manage.py shell
work so we just keep the behavior same as before as "realm" is
the only non-null related field in UserProfile.
2023-07-20 10:44:39 -07:00
Sahil Batra 577555e529 send_custom_email: Pass realm as arg to select_related.
This commit updates the select_related calls in queries
to get UserProfile objects in send_custom_email code to
pass "realm" as argument to select_related call.

Also, note that "realm" is the only non-null foreign key
field in UserProfile object, so select_related() was only
fetching realm object previously as well. But we should
still pass "realm" as argument in select_related call so
that we can make sure that only required fields are selected
in case we add more foreign keys to UserProfile in future.
2023-07-20 10:44:39 -07:00
Sahil Batra 3ae0b4f913 dev_login: Pass realm as arg to select_related calls.
This commit updates the select_related calls in queries to
get UserProfile objects in dev_login code to pass "realm"
as argument to select_related call.

Also, note that "realm" is the only non-null foreign key field
in UserProfile object, so select_related() was only fetching
realm object previously as well. But we should still pass "realm"
as argument in select_related call so that we can make sure that
only required fields are selected in case we add more foreign
keys to UserProfile in future.
2023-07-20 10:44:39 -07:00
Sahil Batra c0029319f9 users: Pass realm as arg to select_related in fetch_users_by_id.
This commit updates select_related call to pass "realm" as
argument in select_related call in fetch_users_by_id function
as we only require realm for the UserProfile objects fetched
using fetch_users_by_id.

Also, note that "realm" is the only non-null foreign key field
in UserProfile object, so select_related() was only fetching
realm object previously as well. But we should still pass "realm"
as argument in select_related call so that we can make sure that
only required fields are selected in case we add more foreign
keys to UserProfile in future.
2023-07-20 10:44:39 -07:00
Sahil Batra bb3945a32f models: Remove select_related call in get_active_users.
We do not use any related fields for the UserProfile objects
fetched by get_active_users, so we can simply remove the
select_related call.

The user object from get_active_users was used to get realm
but since get_active_users called from a realm object we can
directly use that realm object. This change also leads to
some changes in the cache code where we now pass the realm
to the function instead of selecting it from UserProfile object.
2023-07-20 10:44:39 -07:00
Sahil Batra eda3879733 soft_deactivation: Remove select_related call.
This commit removes select_related call from
get_soft_deactivated_users_for_catch_up as
we do not use any related fields for the
UserProfile objects fetched using this call.
2023-07-20 10:44:39 -07:00
Sahil Batra 290973585c cache_helpers: Remove select_related call for Client.
There are no foreign keys for Client and so there are
no related objects to select using select_related.
2023-07-20 10:44:39 -07:00
Lauryn Menard 072a62d5b5 api-changelog: Add entry for feature level 193 updates.
Adds an entry to the API changelog for feature level 193. The
changes documented here are from commit 3802503a8a.
2023-07-20 09:52:28 -07:00
Alex Vandiver d957559371 uploads: Allow uploads to set storage class.
Uploads are well-positioned to use S3's "intelligent tiering" storage
class.  Add a setting to let uploaded files to declare their desired
storage class at upload time, and document how to move existing files
to the same storage class.
2023-07-19 16:19:34 -07:00
Alex Vandiver 871a668dd2 reactions: Add error code for duplicate addition/removal. 2023-07-19 16:18:31 -07:00
Anders Kaseorg 29bdaaf5b5 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Anders Kaseorg 195efb3802 name_restrictions: Update disposable_email_domains usage.
‘blocklist’ was added in 0.0.35 (with backwards compatibility for the
old name), and type annotations were added in 0.0.91 (with only the
new name).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Anders Kaseorg d87eea1a67 ruff: Fix B034 `re.split`, `re.sub` should pass keyword arguments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Anders Kaseorg 50e6cba1af ruff: Fix UP032 Use f-string instead of `format` call.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Anders Kaseorg 0efc662eab ruff: Fix RUF015 Prefer `next(iter(…))` over `list(…)[0]`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Anders Kaseorg 9bb3d15a79 openapi: Switch to new openapi_core validation API.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Steve Howell 3599b1662e cache: Eliminate transformed_bulk_cached_fetch.
Its two callers now just directly call
generic_bulk_cached_fetch with the explicit `lambda
obj: obj` helpers.
2023-07-19 11:07:33 -07:00
Steve Howell d19c1f7438 message fetching: Avoid duplicate cache layers.
This code removes a lot of complexity with very likely
positive overall impact on system performance and
negligible downside.

We already cache display recipients on a per-user
level, so there's no need for another cache layer on
top of that that keys them with recipient ids.

We avoid strange things where Alice/Bob and Bob/Charlie
get put into the top layer cache and then we still have
a cache miss on Alice/Charlie despite the lower level
cache being able to support per-user lookups.

This change does introduce an extra database round trip
if any of our messages have a huddle, but the query is
extremely cheap, and we can always try to cache that
function more directly or try to re-use some of our
other huddle-based caches.

As part of this, we clean up the names for the
lower-level per-user cache of display recipients, and
we simplify the cache keys.

We also stop passing in a full Recipient object to the
`bulk_get_huddle_user_ids` functions.

The local impact of this change should be easy to
measure (at least approximately), since we use this
function every time a user gets messages via the
/messages endpoint.
2023-07-19 11:07:33 -07:00
Steve Howell b85d3dd65b recipient caches: Split up bulk-fetching.
The only overlap between how we fetched streams and
users was to share some really complicated data
structures.

We can also short-circuit some logic if a message
batch is either all-stream or all-DM.
2023-07-19 11:07:33 -07:00
Steve Howell 03557a5568 huddles: Find huddle user ids more efficiently.
We restrict the columns, avoid quadratic looping,
and don't bother with order_by.

We also return the user ids (per recipient) as
sets, since that's how the only caller uses the
info (albeit implicitly via set.union accepting
a list).
2023-07-19 11:07:33 -07:00
Anders Kaseorg 052984bc14 utils: Remove make_safe_digest wrapper.
It’s unclear what was supposed to be “safe” about this wrapper.  The
hashlib API is fine without it, and we don’t want to encourage further
use of SHA-1.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 10:54:05 -07:00
Anders Kaseorg 143baa4243 python: Convert translated positional {} fields to {named} fields.
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-18 15:19:07 -07:00
Anders Kaseorg db6323b2c4 python: Remove unused arguments for translated format strings.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-18 10:38:46 -07:00
Alex Vandiver b188e6fa04 management: Add a reactivate-stream command.
Fixes #601.
2023-07-17 17:42:54 -07:00
David Rosa 4626a40589 lib-markdown: Support relative links to `/#drafts` and `/#scheduled`.
- Adds `message_handle_match` function to handle new pattern for
  relative help links to "Drafts" and "Scheduled messages" for logged-in
  users: `{relative|message|drafts}` and `{relative|message|scheduled}`.
2023-07-17 17:25:25 -07:00
Alex Vandiver 54395612c7 export: Skip crossrealm bots, if they are in the exported realm.
This prevents them from being duplicated in the crossrealm users.
2023-07-17 17:22:57 -07:00
Alex Vandiver 207cfe49cf import: Merge mirrordummy users _before_ recipients are stripped out.
`remove_denormalized_recipient_column_from_data` removes the
`recipient` data from `zerver_userprofile`, but did not remove it from
`zerver_userprofile_mirrordummy`, which was later appended to the list
of `zerver_userprofile` objects.  This led to failure when inserting,
as the mirrordummy objects still tried to reference their previous
`recipient_id`s.

Move the merging of the two sets earlier, before we call
`remove_denormalized_recipient_column_from_data`.
2023-07-17 17:22:57 -07:00
Alex Vandiver cfda414277 export: Include huddles subscription from mirrordummy users.
If there are two huddles, with users A + B + C + D and A + B + C, and
user D is deleted, it is replaced with a mirrordummy user.  If
mirrordummy subscriptions are not included in exports, then the two
huddles have duplicate member sets, and will not be able to be
imported successfully.

Include huddle subscriptions for mirrordummy users in exports.
2023-07-17 17:22:57 -07:00
Alex Vandiver 003fa7adda message_edit: Lock the Message row in check_update_message.
Fundamentally, we should take a write lock on the message, check its
validity for a change, and then make and commit that change.
Previously, `check_update_message` did not operate in a transaction,
but `do_update_message` did -- which led to the ordering:

 - `check_update_message` reads Message, not in a transaction
 - `check_update_message` verifies properties of the Message
 - `do_update_message` starts a transaction
 - `do_update_message` takes a read lock on UserMessage
 - `do_update_message` writes on UserMessage
 - `do_update_message` writes Message
 - `do_update_message` commits

This leads to race conditions, where the `check_update_message` may
have verified based on stale data, and `do_update_message` may
improperly overwrite it; as well as deadlocks, where
other (properly-written) codepaths take a write lock on Message
_before_ updating UserMessage, and thus deadlock with
`do_update_message`.

Change `check_update_message` to open a transaction, and take the
write lock when first accessing the Message row.  We update the
comment above `do_update_message` to clarify this expectation.

The new ordering is thus:

 - `check_update_message` starts a transaction
 - `check_update_message` takes a write lock on Message
 - `check_update_message` verifies properties of the Message
 - `do_update_message` writes on UserMessage
 - `do_update_message` writes Message
 - `check_update_message` commits
2023-07-17 10:53:38 -07:00
abdullahm1 a0fb4feebf integrations: Replace use of 'subject' to 'topic'.
Fixes #25974
2023-07-17 10:35:51 -07:00
Prakhar Pratyush 21a5818765 mention: Soft-reactivate users receiving @topic mention notifications.
The long-term idle topic participants are soft-reactivated
after email/push notifications are sent due to @topic mention.

The reason being that, generally, @topic mentions are going to
reach a small set of users who have a decent chance of being
reactivated by the notifications.
2023-07-17 09:39:24 -07:00
Prakhar Pratyush 4c9d26ce17 mention: Send notifications for @topic wildcard mentions.
This commit completes the notifications part of the @topic
wildcard mention feature.

Notifications are sent to the topic participants for the
@topic wildcard mention.
2023-07-17 09:39:24 -07:00
Steve Howell 67cdf1a7b4 emojis: Use get_emoji_data.
The previous function was poorly named, asked for a
Realm object when realm_id sufficed, and returned a
tuple of strings that had different semantics.

I also avoid calling it duplicate times in a couple
places, although it was probably rarely the case that
both invocations actually happened if upstream
validations were working.

Note that there is a TypedDict called EmojiInfo, so I
chose EmojiData here.  Perhaps a better name would be
TinyEmojiData or something.

I also simplify the reaction tests with a verify
helper.
2023-07-17 09:35:53 -07:00
Steve Howell b742f1241f realm emoji: Use a single cache for all lookups.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.

Cache misses should be very infrequent per realm.

If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.

I also renamed the two related functions.  I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.

Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
2023-07-17 09:35:53 -07:00
Steve Howell e988cf9b0a emoji cache: Don't join to UserProfile table.
We only need author id, and anything else in the table
would be possibly stale anyway.
2023-07-17 09:35:53 -07:00
Zixuan James Li e8a6f6a313 integrations: Fix broken screenshots configuration.
Along with the fix, we add a test case to ensure that this never happens
again.
2023-07-17 09:23:01 -07:00
Zixuan Li a0cf624eaa
migrations: Backfill extra_data_json for audit log entries.
This migration is reasonably complex because of various anomalies in existing
data.

Note that there are cases when extra_data does not contain data that is
proper json with possibly single quotes. Thus we need to use
"ast.literal_eval" to cover that.

There is also a special case for "event_type == USER_FULL_NAME_CHANGED",
where extra_data is a plain str. This event_type is only used for
RealmAuditLog, so the zilencer migration script does not need to handle
it.

The migration does not handle "event_type == REALM_DISCOUNT_CHANGED"
because ast.literal_eval only allow Python literals. We expect the admin
to populate the jsonified extra_data for extra_data_json manually
beforehand.

This chunks the backfilling migration to reduce potential block time.

The migration for zilencer is mostly similar to the one for zerver; except that
the backfill helper is added in a wrapper and unrelated events are
removed.

**Logging and error recovery**

We print out a warning when the extra_data_json field of an entry
would have been overwritten by a value inconsistent with what we derived
from extra_data. Usually this only happens when the extra_data was
corrupted before this migration. This prevents data loss by backing up
possibly corrupted data in extra_data_json with the keys
"inconsistent_old_extra_data" and "inconsistent_old_extra_data_json".
More roundtrips to the database are needed for inconsistent data, which are
expected to be infrequent.

This also outputs messages when there are audit log entries with decimals,
indicating that such entries are not backfilled. Do note that audit log
entries with decimals are not populated with "inconsistent_old_extra_data_*"
in the JSONField, because they are not overwritten.

For such audit log entries with "extra_data_json" marked as inconsistent,
we skip them in the migration.  Because when we have discovered anomalies in a
previous run, there is no need to overwrite them again nesting the extra keys
we added to it.

**Testing**

We create a migration test case utilizing the property of bulk_create
that it doesn't call our modified save method.

We extend ZulipTestCase to support verifying console output at the test
case level. The implementation is crude but the use case should be rare
enough that we don't need it to be too elaborate.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-15 09:43:23 -07:00
Satyam Bansal b29ec4d62e integrations: Update documentation for Grafana Integration.
The process for creating an integration has changed since
Grafana 8.3.
2023-07-13 16:56:01 -07:00
Sahil Batra 75b61a8261 streams: Send stream creation events when subscribing guests.
We did not send the stream creation events when subscribing
guests to public streams while we do send them when subscribing
non-admin users to private streams.

This commit adds code to send the stream creation events when
subscribing guests to public streams, so the clients can know
that the stream exists and fixes the bug where client tries
to process a subscription add event for a stream which it does
not know about.
2023-07-13 14:04:51 -07:00
Sahil Batra 2c02d94b85 openapi: Fix description for stream creation event.
This commit updates description for stream creation event
to mention that the event is also sent when user gains
access to a stream either due to being subscribed to it
or if a private stream is made public.
2023-07-13 14:04:51 -07:00
Zixuan James Li e9e18454d2 user_groups: Populate membership audit logs during realm creation.
This tracks user group membership changes when the realm is first set
up, either through an import or not. This happens when we add users to
the system user groups by their roles.

For an imported realm, we do extra handling when the data doesn't include
user groups. This gets audited as well.
2023-07-13 11:55:38 -07:00
Zixuan James Li 1af50548ae import_realm: Fix broken stream group-based settings backfill.
Django seems to have an aggressive check on the type of a field when
setting it through an relation, requiring the argument to be a UserGroup in
our case.

Reference:
02966a30dd/django/db/models/base.py (L537-L546)
2023-07-13 11:55:38 -07:00
Alex Vandiver be960f4142 missed-message: Lock ScheduledMessageNotificationEmail rows.
This prevents the rows from being deleted out from under the worker
while it is sending emails.
2023-07-13 11:50:42 -07:00
Alex Vandiver d87895a3ef missed-message: Merge before calling handle_missedmessage_emails.
The MissedMessage queue worker is the single callsite of
`handle_missedmessage_emails`, which immediately transforms the list
of events into a dict keyed by message-id.

Skip the intermediate list step, and use defaultdict and a dataclass
to simplify and make explicit the pieces.  This removes the unused
user_profile_id and message_id pieces of the data structure.
2023-07-13 11:50:42 -07:00
Alex Vandiver c7d9a4784e missed-message: Remove unnecessary select_related().
This was added in ebb4eab0f99d; neither the `user_profile` nor the
`message` attribute are read off of the object.
2023-07-13 11:50:42 -07:00
Steve Howell 4533ff3671 onboarding: Rename variable to cutoff_date.
This is just as clear in terms of intent, and it's robust to
us tweaking the number of weeks.
2023-07-13 11:46:34 -07:00
Prakhar Pratyush 0891f9f65a mention: Determine @topic mention during message rendering.
This commit adds a boolean field `mentions_topic_wildcard`
to the `MessageRenderingResult` dataclass.

The field is set to true only if message rendering determines
the message has an actual topic wildcard mention in it (and not,
e.g., topic wildcard mention syntax inside a code block).

The rendered content for topic wildcard mention is
'<span class="topic-mention">{wildcard}</span>'.

The 'topic-mention' class is the identifier for the wildcard
mention being a topic wildcard mention.

We don't use 'data-user-id="*"' and "user-mention" class for
topic wildcard mentions and eventually plan to remove them for
stream wildcard mentions too in a separate mini-project.
2023-07-13 11:34:48 -07:00
Prakhar Pratyush 806d8f2dc7 test_markdown: Merge similar tests into a single test case.
This prep commit merges separate tests for '**@all**',
'**@stream**' and '**@everyone**' stream wildcard mentions
into a single test named 'test_mention_stream_wildcard'.

Similarly, it merges separate tests for '@all', '@stream',
and '@everyone' stream wildcard mentions into a single test
named 'test_mention_at_stream_wildcard'.

The aim is to finally have two separate tests for stream and
topic wildcard mentions (when we introduce topic wildcards)
instead of having separate tests for each mention text
(i.e. all, everyone, stream, topic).
2023-07-13 11:34:48 -07:00
Prakhar Pratyush c0c30bc5f7 topic_mentions: Fetch users to be notified of @topic mentions.
This commit adds the 'topic_wildcard_mention_user_ids' and
'topic_wildcard_mention_in_followed_topic_user_ids'
attributes to the 'RecipientInfoResult' dataclass.

Only topic participants are notified of @topic mentions.

Topic participants are anyone who sent a message to a topic
or reacted to a message on the topic.

'topic_wildcard_mention_in_followed_topic_user_ids' stores the
ids of the topic participants who follow the topic and have
enabled the wildcard mention notifications for followed topics.

'topic_wildcard_mention_user_ids' stores the ids of the topic
participants for whom 'user_allows_notifications_in_StreamTopic'
with setting 'wildcard_mentions_notify' returns True.
2023-07-13 11:34:48 -07:00
Prakhar Pratyush 1df63ed448 mention: Add 'has_topic_wildcards' to 'MentionData'.
This commit adds a 'has_topic_wildcards' instance variable
to the 'MentionData' class for the detection of
- possible topic wildcards mentions.

Fixes part of #22829.

Co-authored-by: Prakhar Pratyush <prakhar841301@gmail.com>
Co-authored-by: orientor <aditya.verma@students.iiit.ac.in>
2023-07-13 11:34:48 -07:00
Prakhar Pratyush 3f6b41e4be test_notifications: Update tests to cover the corner case properly.
This commit updates the existing tests in 'test_email_notifications'
and 'test_push_notifications' to properly configure user settings
and visibility policies before running the actual tests.

Earlier, the tests were passing, but the corner case expected
to be covered wasn't covered.

This should have been included in
d80779435a.
2023-07-13 11:34:48 -07:00
Prakhar Pratyush 2869de8026 test_notifications: Remove unnecessary comments.
These comments should not have been included in
a8fd9eb701.

We covered the case "Private message should soft reactivate
the user" earlier in the test. So the comment was rightly added
there.

During stream wildcard or group mention, no such personal mention
is involved; hence, the comments are not needed.
2023-07-13 11:34:48 -07:00
Prakhar Pratyush 2b42df4ef1 mention: Replace 'wildcard' with 'stream_wildcard'.
This is a prep commit to replace 'wildcard' with 'stream_wildcard'.

This wasn't included in 179d5cb because we didn't decide to
use a different rendered_content for topic wildcard mention,
i.e., ''<span class="user-mention topic-mention">{wildcard}</span>'.

Our intention was not to create separate tests for both stream
and topic wildcard mentions, as they were expected to have the
same rendered content format.
2023-07-13 11:34:48 -07:00
Steve Howell 418057048a onboarding: Backfill unread messages up to 12 weeks old.
Previously this limit was 1 week, which was fine for busy
organizations, but for organizations that send a few messages a week,
or have occasional bursts of activity but the last one was a few weeks
ago, this should give a significantly better new user experience.

There are still caps like 1000 messages total and 20
unread, but we're a bit more flexible about time.
2023-07-13 10:40:12 -07:00
Steve Howell 890732a88f soft activation: Avoid QuerySet and use List instead. 2023-07-13 08:09:14 -07:00
Anders Kaseorg 7e707270f0 models: Convert deprecated index_together option to indexes.
index_together is slated for removal in Django 5.1:
https://docs.djangoproject.com/en/4.2/internals/deprecation/#deprecation-removed-in-5-1

We set the optional index names to match the previously generated
index names to avoid adding new migrations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-12 07:12:43 -07:00
nimish c238327899 settings: Change "Display settings" to "Preferences".
This includes changing the URL to #settings/preferences, with a
transparent redirect so that existing links, like the one from Welcome
Bot, continue to work.
2023-07-12 07:09:03 -07:00
Anders Kaseorg 63be67af80 logging_util: Remove dependence on get_current_request.
Pass the HttpRequest explicitly through the two webhooks that log to
the webhook loggers.

get_current_request is now unused, so remove it (in the same commit
for test coverage reasons).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-11 22:23:47 -07:00
Anders Kaseorg f66e2c3112 sentry: Remove dependence on get_current_request.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-11 22:23:47 -07:00
abdullahm1 5a90f9c404 tests: Use time_machine for testing scheduled message delivery. 2023-07-11 17:34:58 -07:00
Lauryn Menard 3dfdbbc775 welcome-emails: Separate followup_day1 email from other welcome emails.
The initial followup_day1 email confirms that the new user account
has been successfully created and should be sent to the user
independently of an organization's setting for send_welcome_emails.

Here we separate out the followup_day1 email into a separate function
from enqueue_welcome_emails and create a helper function for setting
the shared welcome email sender information.

The followup_day1 email is still a scheduled email so that the initial
account creation and log-in process for the user remains unchanged.

Fixes #25268.
2023-07-11 14:15:52 -07:00
Lauryn Menard 0e1acd595b welcome-emails: Use followup_day2 for scheduled email tests.
The followup_day2 email is scheduled with a delay as a welcome email
and is therefore more likely to exist as a scheduled email in these
deactivation cases.
2023-07-11 14:15:52 -07:00
Lauryn Menard dd59e83d54 welcome-emails: Make some code comments and docstrings more evergreen. 2023-07-11 14:15:52 -07:00
Lauryn Menard c323afd9d7 test-example: Revise comment with number of emails generated.
Updates comment to not include the number of emails generated so
that it doesn't need to be updated every time a new email is added.
The current count in the comment is already out-of-date.
2023-07-11 14:15:52 -07:00
Zixuan James Li 84723654c8 webhooks: Use 200 status code for unknown events.
Because the third party might not be expecting a 400 from our
webhooks, we now instead use 200 status code for unknown events,
while sending back the error to Sentry. Because it is no longer an error
response, the response type should now be "success".

Fixes #24721.
2023-07-11 13:51:37 -07:00
Sahil Batra 2e4f7f6336 user_groups: Remove "@" from name of role-based system groups.
This commit removes "@" from name of role-based system groups
since we have added a restricion on having user group names
starting with "@" in the previous commit as they look odd in
mention syntax.

We also add a migration in this commit to update the name of
role-based system groups in existing realms to remove "@"
from the name. This migration also updates the names of
non-system user groups by removing the invalid prefixes
from their names and if there is a group already with that
name, we insted name the group as "group:{group_id}".

Fixes #26148.
2023-07-11 13:46:02 -07:00
Sahil Batra 929bf1243e user_groups: Disallow certain prefixes in group name.
We do not allow user group names to start with "@", "role:",
"user:", "stream:" and "channel:".

Group names starting with "@" look odd in mentions and
"role:", "user:" and "stream:" prefixes are reserved for
system groups which will be used in the new groups-based
permission model. We do not allow "channel:" prefix for
now just to be safe in a case where we use it instead of
"stream:" prefix for stream based groups in future.

Fixes part of #26148.
2023-07-11 13:46:02 -07:00
Sahil Batra ea3a7a9e6f user_groups: Add API restrictions for long user group names.
Previously we had database level restriction on length of
user group names. Now we add the same restriction to API
level as well, so we can return a better error response.
2023-07-11 13:46:02 -07:00
Steve Howell 89381a8072 cache: Eliminate get-stream-by-name cache.
We remove the cache functionality for the
get_realm_stream function, and we also change it to
return a thin Stream object (instead of calling
select_related with no arguments).

The main goal here is to remove code complexity, as we
have been prone to at least one caching validation bug
related to how Realm and UserGroup interact. That
particular bug was more theoretical than practical in
terms of its impact, to be clear.

Even if we were to be perfectly disciplined about only
caching thin stream objects and always making sure to
delete cache entries when stream data changed, we would
still be prone to ugly situations like having
transactions get rolled back before we delete the cache
entry. The do_deactivate_stream is a perfect example of
where we have to consider the best time to unset the
cache. If you unset it too early, then you are prone to
races where somebody else churns the cache right before
you update the database. If you set it too late, then
you can have an invalid entry after a rollback or
deadlock situation. If you just eliminate the cache as
a moving part, that whole debate is moot.

As the lack of test changes here indicates, we rarely
fetch streams by name any more in critical sections of
our code.

The one place where we fetch by name is in loading the
home page, but that is **only** when you specify a
stream name. And, of course, that only causes about an
extra millisecond of time.
2023-07-11 13:45:40 -07:00
Steve Howell 046e4c715b cache: Use DB for all bulk get-stream-by-name queries.
This changes bulk_get_streams so that it just uses the
database all the time.  Also, we avoid calling
select_related(), so that we just get back thin and
tidy Stream objects with simple queries.

About not caching any more:

It's actually pretty rare that we fetch streams by name
in the main application. It's usually API requests that
send in stream names to find more info about streams.

It also turns out that for large queries (>= ~30 rows
for my measurements) it's more efficent to hit the
database than memcached. The database is super fast at
scale; it's just the startup cost of having Django
construct the query, and then having the database do
query planning or whatever, that slows us down. I don't
know the exact bottleneck, but you can clearly measure
that one-row queries are slow (on the order of a full
millisecond or so) but the marginal cost of additional
rows is minimal assuming you have a decent index (20
microseconds per row on my droplet).

All the query-count changes in the tests revolve around
unsubscribing somebody from a stream, and that's a
particularly odd use case for bulk_get_streams, since
you generally unsubscribe from a single stream at a
time. If there are some use cases where you do want to
unsubscribe from multiple streams, we should move
toward passing in stream ids, at least from the
application. And even if we don't do that, our cost for
most queries is a couple milliseconds.
2023-07-11 13:45:40 -07:00
Steve Howell adb548c7a2 stream creation: Avoid stream.realm references.
We want to avoid Django going back to the database to
get a realm object that the caller already has.

It's actually currently the case that we often
pre-fetch realm objects when we get stream objects
using get_stream (using a call to select_related() with
no arguments), but that is an expensive operation that
we want to avoid going forward.

This commit prepares us to just fetch slim objects.
2023-07-11 13:45:40 -07:00
Satyam Bansal 328c104424 integrations: Separate issue milestoned events in GitHub Integration.
This commit creates separate events for issue milestoned and
demilestoned notifications. This allows the end-users to choose
whether they want these notifications or not.

Fixes #25793.
2023-07-11 08:58:31 -07:00
Satyam Bansal 34f31ab9d2 integrations: Improve GitHub issue milestoned notifications.
Earlier, the notifications had no information about the milestone
that was added or removed.
2023-07-11 08:58:31 -07:00
Satyam Bansal 1c567ae616 integrations: Add issue demilestoned fixture to GitHub Integration. 2023-07-11 08:58:31 -07:00
Satyam Bansal f8ac308ec2 integrations: Add issue milestoned fixture to GitHub Integration. 2023-07-11 08:58:31 -07:00
Zixuan James Li 3349ac9f86 user_groups: Audit UserGroup group based setting changes.
This add audit log entries when any group based setting of a user group
is updated. We store both the old and new values in extra_data, along
with the name of that setting. Entries populated during user group creation
are hardcoded to track "can_mention_group".

Potentially we can adjust "set_defaults_for_group_settings" so that it
populates realm audit logs with it, but that is out of scope for this change.

We use an atomic transaction so that the audit logs are committed
together with the updates.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li 4d0b7fe682 user_groups: Audit UserGroup properties changes.
This add audit log entries when the name or description of a user group
is updated. We store both the old and new values in extra_data. We wrap
the functions inside an atomic transaction so that the audit logs and
the updates are committed together.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li 3035854dca user_groups: Audit UserGroup supergroup memberships changes.
This is mostly the same as tracking subgroup changes, except that now
modified_user_group is the subgroup.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li ad698d597a user_groups: Audit UserGroup subgroup memberships changes.
It's worth noting that instead of adding another field to the
RealmAuditLog model, we store the modified subgroup ids in extra_data as
a JSON encoded dict with the key "subgroup_ids". We don't create audit
log entries for supergroup changes at this point.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li 44781ddfa9 user_groups: Audit UserGroup memberships changes.
This also add audit log entries during user creation and role change,
because we modify system group memberships there.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li 63f5936207 user_groups: Audit UserGroup creation.
We also create RealmAuditLog entries for the initial memberships that
get added along with the creation of a UserGroup. System user groups are
not created with members so no audit logs are populated for that.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Zixuan James Li 71de14ab43 models: Add modified_user_group.
This also adds the supported event types for changes to UserGroup.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-11 08:56:55 -07:00
Alex Vandiver a076d49be7 sentry: Reduce http timeout.
This helps reduce the impact on busy uwsgi processes in case there are
slow timeout failures of Sentry servers.  The p99 is less than 300ms,
and p99.9 per day peaks at around 1s, so this will not affect more
than .1% of requests in normal operation.

This is not a complete solution (see #26229); it is merely stop-gap
mitigation.
2023-07-10 13:46:16 -07:00
Lauryn Menard 3d8090a116 sentry-webhook: Revise documentation page to be clearer. 2023-07-10 13:43:28 -07:00
Steve Howell b31bbc6148 signup: Clean up add_new_user_history.
Various cleanups:

    * clean up comments
    * improve names for constants and variables
    * express first ORM query as a single statement
    * use set differences to simplify logic
    * avoid all the reversing churn
    * avoid early-exit idiom since this function is so small

Note that it's plausible that we should just combine the two
queries and let the database exclude the already-used ids,
but that felt a little risky for now.  As I mentioned on
Zulip, I think the one-week window has dubious value, but
I am biased by having wasted time chasing down a test
flake related to the time window.
2023-07-10 13:41:28 -07:00
Steve Howell 225e826fb2 deactivate streams: Remove unused "log" parameter. 2023-07-10 13:41:28 -07:00
Steve Howell bc3afe9127 default stream groups: Make deleting streams efficient.
This pulls one query out the loop, and then it makes
another query a bulk query, and then it finally eliminates
an unnecessary query at the end.
2023-07-10 13:41:28 -07:00
Steve Howell 87d1208d53 tests: Improve test for default stream groups. 2023-07-10 13:41:28 -07:00
Steve Howell 1156a50109 signup: Avoid bloated Stream objects for default streams.
Basically, I eliminate the use of select_all() in a query
that still makes a single round trip.  We have good test
enforcement that Django never needs to lazily fetch
objects off the Stream object. (It used to be common
to fetch stream.realm a while back, but we upgraded
bulk_add_subscription, in particular, a while back.)
2023-07-10 13:41:28 -07:00
Steve Howell 8894ff89ac signup: Extract set_up_streams_for_new_human_user.
We extract code from process_new_human_user with
no modifications.

This has all the best outcomes of extracting a function:

    * better profile info
    * easier to test for query counts (signup gets real noisy)
    * simplifies a long, messy function

It has no real drawbacks, since the helper function doesn't need
to pass back any intermediate state to the parent for the rest
of what the parent does.

When you profile test_signup and test_invite, with a decent
sample size, the set_up_streams_for_new_human_user function
does about 20% of the work for process_new_human_user, which
is a lot considering that most tests don't create a ton of
pre-registered or default streams.
2023-07-10 13:41:28 -07:00
Steve Howell d6ef94f63f page load: Improve default_streams performance.
At least as measured by test_events.py, which has over 1000
calls to fetch initial data for page loads, this should
be about a 10% improvement in how much time the server
spends fetching data.

We mostly avoid a select_related() query that did this nastiness:

    INNER JOIN "zerver_realm" ON ("zerver_stream"."realm_id" = "zerver_realm"."id")
    INNER JOIN "zerver_usergroup" ON ("zerver_stream"."can_remove_subscribers_group_id" = "zerver_usergroup"."id")
    INNER JOIN "zerver_realm" T4 ON ("zerver_usergroup"."realm_id" = T4."id")
    INNER JOIN "zerver_usergroup" T5 ON ("zerver_usergroup"."can_mention_group_id" = T5."id")
    INNER JOIN "zerver_realm" T6 ON (T5."realm_id" = T6."id")
    INNER JOIN "zerver_usergroup" T7 ON (T5."can_mention_group_id" = T7."id")
    INNER JOIN "zerver_realm" T8 ON (T7."realm_id" = T8."id")
    INNER JOIN "zerver_usergroup" T9 ON (T7."can_mention_group_id" = T9."id")
    INNER JOIN "zerver_realm" T10 ON (T9."realm_id" = T10."id")
    INNER JOIN "zerver_usergroup" T11 ON (T9."can_mention_group_id" = T11."id")
    WHERE "zerver_stream"."id" IN (SELECT U0."stream_id" FROM "zerver_defaultstream" U0 WHERE U0."realm_id" = 2

Future commits will address the codepath for creating users.
2023-07-10 13:41:28 -07:00
Steve Howell 763b5e0741 default streams: Extract library functions.
I created zerver/lib/default_streams.py, so that various
views and events.py don't have to awkwardly reach into
an "actions" file.

I copied over two functions verbatim from actions/default_streams.py:

    get_default_streams_for_realm
    streams_to_dicts_sorted

The latter only remains as an internal detail in the new library.

I also created two new helpers:

    get_default_stream_ids_for_realm:

        This is both faster and easier to use in all the places
        where we only need to get a set of default stream ids.

    get_default_streams_for_realm_as_dicts:

        This just wraps the prior calls to
        streams_to_dicts_sorted(get_default_streams_for_realm(...)),
        and it doesn't yet address the slowness of the underlying
        code.

        All the "real" code should be functionally the same.

        In a few tests I now use this wrapper instead of
        calling get_default_streams_for_realm, just to get
        slightly deeper coverage.
2023-07-10 13:41:28 -07:00
Lauryn Menard d84fd73db4 markdown-processor: Update insertion_index check for multiple classes.
Updates find_proper_insertion_index to check for the inline image
classes as matching at least one of the classes in the element's
attrib["class"] so that cases where an inline preview image has
multiple classes, like YouTube video previews, will have the
correct insertion index.

Fixes #26186.
2023-07-07 11:07:45 -04:00
Alex Vandiver ff53ee8e28 markdown: Only attempt to adjust /wiki/File: paths on Wikipedia. 2023-07-06 17:50:25 -07:00
Lalit 46b582689a tests: Improve automated tests for submessages.
Added an additional test case to `test_submessages.py` for testing the
message object containing `submessages` meta data.

Previous to this commit we were never validating the `submessage` schema
in the `message` objects.

Fixes #25896.
2023-07-06 16:35:46 -07:00
Zixuan James Li eebe46ad1c test_classes: Do not necessary wrap test cases in a transaction.
By relocating helper methods into a mixin class, we can be more flexible
with managing transactions in test cases, without always forcing the
django.test.TestCase behavior of always putting the test case into an
atomic transaction.

We include a check for side effects in ZulipTransactionTestCase. It only
checks for the set of row ids in all tables before and after each test.
It is not a comprehensive check for side effects, but should be
sufficient for the basics without much performance overhead.
2023-07-06 11:44:50 -07:00
arghyadeep10 1808cdec90 uploads: Improve file not found message.
It replaces the "File not found." text with:
"This file does not exist or has been deleted."

At present when a file is deleted it results in a confusing
experience when looking at the "File not found." message.
In order to clarify the situation is not a bug, the message
has been replaced with a better alternative.

Fixes part of Issue #23739.
2023-07-06 09:32:41 -07:00
Prakhar Pratyush 179d5cb37d mention: Replace 'wildcards' with 'stream_wildcards'.
This prep commit replaces the 'wildcard' keyword in the codebase
with 'stream_wildcard' at some places for better readability, as
we plan to introduce 'topic_wildcards' as a part of the
'@topic mention' project.

Currently, 'wildcards = ["all", "everyone", "stream"]' which is an
alias to mention everyone in the stream, hence better renamed as
'stream_wildcards'.

Eventually, we will have:
'stream_wildcard' as an alias to mention everyone in the stream.
'topic_wildcard' as an alias to mention everyone in the topic.
'wildcard' refers to 'stream_wildcard' and 'topic_wildcard' as a whole.
2023-07-03 22:03:17 -07:00
Prakhar Pratyush d80779435a tests: Add the missing tests.
This commit adds the missing tests for
'followed_topic_wildcard_mention'.

These tests should have been included in
b052c8980e.
2023-07-03 22:03:17 -07:00
Prakhar Pratyush 0bf6eb6786 notifications: Fix 'get_gcm_alert' and 'get_apns_alert_subtitle'.
The 'get_gcm_alert' and 'get_apns_alert_subtitle' functions
don't include the case when the trigger is
'NotificationTriggers.FOLLOWED_TOPIC_WILDCARD_MENTION'.

This commit updates the functions to include
'NotificationTriggers.FOLLOWED_TOPIC_WILDCARD_MENTION'.
2023-07-03 22:03:17 -07:00