Commit Graph

7817 Commits

Author SHA1 Message Date
Sahil Batra ada2991f1c users: Send stream creation/deletion events on role change.
We now send stream creation and stream deletion events on
changing a user's role because a user can gain or lose
access to some streams on changing their role.
2023-08-25 12:56:36 -07:00
Sahil Batra 5e3c39ea4f streams: Extract some code out of do_get_streams in a new function.
This commit extracts the code which queries the required streams
to a new function "get_user_streams". The new functions returns
the list of "Stream" object and not dictionaries and then
do_get_streams function converts it into list of dictionaries.

This change is important because we would use the new function
in further commit where we want list of "Stream" objects and
not list of dictionaries.
2023-08-25 12:56:36 -07:00
Sahil Batra 5e1eb3cd44 events: Fix applying stream creation events in apply_event.
There was a bug in apply_event code where only a stream which
is not private is added to the "never_subscribed" data after
a stream creation event. Instead, it should be added to the
"never_subscribed" data irrespective of permission policy of
the stream as we already send stream creation events only to
those users who can access the stream. Due to the current
bug, private streams were not being added to "never_subscribed"
data in apply_event for admins as well. This commit fixes it
and also makes sure the "never_subscribed" list is sorted
which was not done before and was also a bug.

The bugs mentioned above were unnoticed as the tests did not
cover these cases and this commit also adds tests for those
cases.
2023-08-25 12:56:36 -07:00
Sahil Batra b92af18928 register: Include web-public streams in "streams" field of response.
The "streams" field in "/register" response did not include web-public
streams for non-admin users but the data for those are eventually
included in the subscriptions data sent using "subscriptions",
"unsubscribed" and "never_subscribed" fields.

This commit adds code to include the web-public streams in "streams"
field as well as everyone can access those and will make the "streams"
data complete.
2023-08-25 12:56:36 -07:00
Mateusz Mandera f3a3047484 bulk_access_messages_expect_usermessage: Fix function name and comments.
The name and docstring were just wrong, having a UserMessage row isn't
sufficient for having message access and is actually only relevant in a
private stream with private history. The function is only used in a
single place anyway, in bulk_access_messages.

The comment mentioning this function in handle_remove_push_notification
can be tweaked to just not mention any function specifically and just
say why we're not checking message access.
2023-08-25 14:10:27 -04:00
Mateusz Mandera c908b518ef CVE-2023-32678: Prevent unauthorized editing/deletion in priv streams.
Users who used to be subscribed to a private stream and have been
removed from it since retain the ability to edit messages/topics, and
delete messages that they used to have access to, if other relevant
organization permissions allow these actions. For example, a user may be
able to edit or delete their old messages they posted in such a private
stream. An administrator will be able to delete old messages (that they
had access to) from the private stream.

We fix this by fixing the logic in has_message_access (which lies at the
core of our message access checks - access_message() and
bulk_access_messages())
to not rely on only a UserMessage row for checking access but also
verify stream type and subscription status.
2023-08-25 14:10:27 -04:00
Zixuan James Li a081428ad2 user_groups: Make locks required for updating user group memberships.
**Background**

User groups are expected to comply with the DAG constraint for the
many-to-many inter-group membership. The check for this constraint has
to be performed recursively so that we can find all direct and indirect
subgroups of the user group to be added.

This kind of check is vulnerable to phantom reads which is possible at
the default read committed isolation level because we cannot guarantee
that the check is still valid when we are adding the subgroups to the
user group.

**Solution**

To avoid having another transaction concurrently update one of the
to-be-subgroup after the recursive check is done, and before the subgroup
is added, we use SELECT FOR UPDATE to lock the user group rows.

The lock needs to be acquired before a group membership change is about
to occur before any check has been conducted.

Suppose that we are adding subgroup B to supergroup A, the locking protocol
is specified as follows:

1. Acquire a lock for B and all its direct and indirect subgroups.
2. Acquire a lock for A.

For the removal of user groups, we acquire a lock for the user group to
be removed with all its direct and indirect subgroups. This is the special
case A=B, which is still complaint with the protocol.

**Error handling**

We currently rely on Postgres' deadlock detection to abort transactions
and show an error for the users. In the future, we might need some
recovery mechanism or at least better error handling.

**Notes**

An important note is that we need to reuse the recursive CTE query that
finds the direct and indirect subgroups when applying the lock on the
rows. And the lock needs to be acquired the same way for the addition and
removal of direct subgroups.

User membership change (as opposed to user group membership) is not
affected. Read-only queries aren't either. The locks only protect
critical regions where the user group dependency graph might violate
the DAG constraint, where users are not participating.

**Testing**

We implement a transaction test case targeting some typical scenarios
when an internal server error is expected to happen (this means that the
user group view makes the correct decision to abort the transaction when
something goes wrong with locks).

To achieve this, we add a development view intended only for unit tests.
It has a global BARRIER that can be shared across threads, so that we
can synchronize them to consistently reproduce certain potential race
conditions prevented by the database locks.

The transaction test case lanuches pairs of threads initiating possibly
conflicting requests at the same time. The tests are set up such that exactly N
of them are expected to succeed with a certain error message (while we don't
know each one).

**Security notes**

get_recursive_subgroups_for_groups will no longer fetch user groups from
other realms. As a result, trying to add/remove a subgroup from another
realm results in a UserGroup not found error response.

We also implement subgroup-specific checks in has_user_group_access to
keep permission managing in a single place. Do note that the API
currently don't have a way to violate that check because we are only
checking the realm ID now.
2023-08-24 17:21:08 -07:00
Zixuan James Li 9f7fab4213 user_groups: Extract has_user_group_access helper.
Similar to has_message, we can maintain a helper dedicated to managing
access to user groups. Future permission related changes should be added
here.
2023-08-24 17:21:08 -07:00
Zixuan James Li 8792cfbadf user_groups: Return a QuerySet for recursive subgroups query.
This makes it more consistent with other recursive queries and allow
better composability.
2023-08-24 17:21:08 -07:00
Zixuan James Li a3f4341934 user_groups: Make for_read required.
We want to make the callers be more explicit about the use of the
user group being accessed, so that the later implemented database lock
can be benefited from the visibility.
2023-08-24 17:21:08 -07:00
Zixuan James Li 37b3507b86 user_groups: Reduce necessary nesting inside try-block.
The error only occurs when we do the get call.
2023-08-24 17:21:08 -07:00
Samuel 3ce7b77092 typing: Add typing constants to the post register api response.
Adds typing notification constants to the response given by
`POST /register`. Until now, these were hardcoded by clients
based on the documentation for implementing typing notifications
in the main endpoint description for `api/set-typing-status`.

This change also reflects updating the web-app frontend code
to use the new constants from the register response.

Co-authored-by: Samuel Kabuya <samuel.mwangikabuya@kibo.school>
Co-authored-by: Wilhelmina Asante <wilhelmina.asante@kibo.school>
2023-08-23 16:36:44 -07:00
evykassirer b9303a6506 push notifications: Use hex_codepoint_to_emoji.
Now that we have this util function, we can
use it here. No functional changes.
2023-08-23 16:18:15 -07:00
evykassirer 0289beb784 emoji: Match emoji sequences in markdown.
Fixes #11767.

Previously multi-character emoji sequences weren't matched in the
emoji regex, so we'd convert the characters to separate images,
breaking the intended display.

This change allows us to match the full emoji sequence, and
therefore show the correct image.
2023-08-23 16:18:15 -07:00
Sahil Batra 5a8416ff6a message: Do not pass "sender__realm" to select_related.
We have modified the code to directly fetch realm from Message
object instead of "sender" field and thus we no longer need to
fetch "sender__realm" using select_related.
2023-08-23 11:38:32 -07:00
Sahil Batra 58aecbe443 message: Pass realm as argument to wildcard_mention_allowed.
We do not want to access realm from "sender" field so that
we do not need to pass "sender__realm" argument to
select_related call when querying messages. We can instead
pass realm as argument to wildcard_mention_allowed.
2023-08-23 11:38:32 -07:00
Sahil Batra df2407f97a message: Access realm from SendMessageRequest object directly.
We store realm object in SendMessageRequest object, so we can
access it directly instead of getting it from "sender" field.
2023-08-23 11:38:32 -07:00
Sahil Batra 7295028194 message: Access realm object directly from message.
We can directly get the realm object from Message object now
and there is no need to get the realm object from "sender"
field of Message object.

After this change, we would not need to fetch "sender__realm"
field using "select_related" and instead only passing "realm"
to select_related when querying Message objects would be enough.

This commit also updates a couple of cases to directly access
realm ID from message object and not message.sender. Although
we have fetched sender object already, so accessing realm_id
from message directly or from message.sender should not matter,
but we can be consistent to directly get realm from Message
object whenever possible.
2023-08-23 11:38:32 -07:00
Alex Vandiver adc987dc43 send_email: Use a consistent order when sending custom emails to users. 2023-08-23 10:49:34 -07:00
David Rosa 0349152f0f help: Rename edit-or-delete-a-message.md and update links. 2023-08-22 14:50:23 -07:00
Sahil Batra 7137eba222 streams: Don't compute traffic data for sub objects in zephyr realm.
We set stream_weekly_traffic field to "null" for Subscription
objects in zephyr mirror realm as we do not need stream traffic
data in zephyr mirror realm. This makes the subscription data
consistent with steams data.

This commit also udpates test to check never_subscribed data for
zephyr mirror realm.
2023-08-21 15:21:58 -07:00
Sahil Batra 6776e380b2 stream_traffic: Update get_streams_traffic to return None for zephyr realm.
Instead of having a "realm.is_zephyr_mirror_realm" check for every
get_streams_traffic call, this commit udpates get_streams_traffic to
accept realm as parameter and return "None" for zephyr mirror realm.
2023-08-21 15:21:58 -07:00
Lauryn Menard 5e29e025c5 email-templates: Add zulip_onboarding_topics email templates.
The "followup_day2" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "zulip_onboarding_topics".

Because any existing scheduled emails that use the "followup_day2"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
2023-08-18 16:25:48 -07:00
Lauryn Menard c491bef07b email-templates: Add account_registered email templates.
The "followup_day1" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "account_registered".

Because any existing scheduled emails that use the "followup_day1"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
2023-08-18 16:25:48 -07:00
Anders Kaseorg 36dde99308 ruff: Appease SIM118 `"class" not in uncle.keys()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-17 17:05:34 -07:00
Anders Kaseorg ec00c2970f ruff: Fix PYI032 Prefer `object` for the second parameter to `__eq__`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-17 17:05:34 -07:00
Anders Kaseorg 53e8c0c497 ruff: Fix E721 Do not compare types, use `isinstance()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-17 17:05:34 -07:00
Zixuan James Li 30495cec58 migration: Rename extra_data_json to extra_data in audit log models.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.

- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.

- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.

- We remove None-checks for extra_data because it is now no longer
nullable.

- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.

- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.

- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.

Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-08-16 17:18:14 -07:00
Sahil Batra 35d5609996 bots: Remove private stream subscriptions on changing bot owner.
We remove bot's subscriptions for private streams to which the
new owner is not subscribed and keep the ones to which the new
owner is subscribed on changing owner.

This commit also changes the code for sending subscription
remove events to use transaction.on_commit since we call
the function inside a transactopn in do_change_bot_owner and
this also requires some changes in tests in test_events.
2023-08-16 15:37:37 -07:00
Prakhar Pratyush 379a08eb1e message_send: Fix wildcard_mentioned flag unset for few participants.
For topic wildcard mentions, the 'wildcard_mentioned' flag is set
for those user messages having 'user_profile_id' in
'topic_participant_user_ids', i.e. all topic participants.

Earlier, the flag was set if the 'user_profile_id' exists in
'all_topic_wildcard_mention_user_ids'.
'all_topic_wildcard_mention_user_ids' contains the ids of those
users who are topic participants and have enabled notifications
for '@topic' mentions.

The earlier approach was incorrect, as it would set the
'wildcard_mentioned' flag only for those topic participants
who have enabled the notifications for '@topic' mention instead
of setting the flag for all the topic participants.

The bug was introduced in 4c9d26c.
2023-08-16 11:31:56 -07:00
Tim Abbott ea83c911e9 Revert "narrow: Fix topic highlighting issue with apostrophes in search results."
This reverts commit baede93f69.

This failed tests after rebasing it on top of
5151dd7ff8.
2023-08-15 17:51:03 -07:00
Akshat baede93f69 narrow: Fix topic highlighting issue with apostrophes in search results.
This commit addresses the issue where the topic highlighting
in search results was offset by one character when an
apostrophe was present. The problem stemmed from the disparity
in HTML escaping generated by the function `func.escape_html` which
is used to obtain `topic_matches` differs from the escaping performed
by the function `django.utils.html.escape` for apostrophes (').

func.escape_html | django.utils.html.escape
-----------------+--------------------------
      &#39;      |           &#x27;

To fix this SQL query is changed to return the HTML-escaped
topic name generated by the function `func.escape_html`.

Fixes: #25633.
2023-08-15 17:29:20 -07:00
Joelute eb78264162 navbar_alerts: Delay showing "Complete the organization profile" banner.
Currently, we are displaying the "Complete the organization profile"
banner immediately after the organization was created. It's important to
strongly encourage orgs to configure their profile, so we should delay
showing the banner if the profile has not been configured after 15 days.
Thus also allows the users to check out Zulip and see how it works before
configuring the organization settings.

Fixes: #24122.
2023-08-15 10:46:33 -07:00
David Rosa 6aee4bb768 help: Document how to view all direct messages in the desktop/web app. 2023-08-15 10:10:31 -07:00
Alex Vandiver 570ff08fde topic: Set a max batch_size on bulk_upate call.
The number of affected objects may be quite high, and they are
selected by `id IN (...)` query, and updated with a giant `CASE`.
This turns out to be quadratic, and can cause large queries to take
hours, in a state where they cannot be terminated, when PostgreSQL >11
tries to JIT the query.

Set a batch_size as a stopgap performance fix before moving to
`.update()` as a real fix.
2023-08-14 13:33:20 -07:00
Steve Howell 51db22c86c per-request caches: Add per_request_cache library.
We have historically cached two types of values
on a per-request basis inside of memory:

    * linkifiers
    * display recipients

Both of these caches were hand-written, and they
both actually cache values that are also in memcached,
so the per-request cache essentially only saves us
from a few memcached hits.

I think the linkifier per-request cache is a necessary
evil. It's an important part of message rendering, and
it's not super easy to structure the code to just get
a single value up front and pass it down the stack.

I'm not so sure we even need the display recipient
per-request cache any more, as we are generally pretty
smart now about hydrating recipient data in terms of
how the code is organized. But I haven't done thorough
research on that hypotheseis.

Fortunately, it's not rocket science to just write
a glorified memoize decorator and tie it into key
places in the code:

    * middleware
    * tests (e.g. asserting db counts)
    * queue processors

That's what I did in this commit.

This commit definitely reduces the amount of code
to maintain. I think it also gets us closer to
possibly phasing out this whole technique, but that
effort is beyond the scope of this PR. We could
add some instrumentation to the decorator to see
how often we get a non-trivial number of saved
round trips to memcached.

Note that when we flush linkifiers, we just use
a big hammer and flush the entire per-request
cache for linkifiers, since there is only ever
one realm in the cache.
2023-08-11 11:09:34 -07:00
Steve Howell 751b8b5bb5 tests: Flush per-request caches automatically for query counts. 2023-08-11 11:09:34 -07:00
Steve Howell 549891266d tests: Add assert_memcached_count.
We use a specific name to distinguish from other caches
like per-request caches.
2023-08-11 11:09:34 -07:00
Steve Howell 0eea42b48c tests: Remove spurious nocoverage directive. 2023-08-11 11:00:57 -07:00
Steve Howell f8ec00b895 mypy: Improve type checks for user display recipients. 2023-08-10 18:13:43 -07:00
Steve Howell 1b7880fc21 push notifications: Go to the DB for streams.
We want to phase out the use of get_display_recipient
for streams, and this is the last place that I
eliminate it. The next commit will eliminate the
dead code and make mypy types tighter.

This change will make push notifications slightly
slower in some situations, but we avoid all the
complexity of a cache, and this code tends to run
offline.

We could always make this code a bit more efficient
by being a little smarter about what data we fetch
up front. For example, get_apns_alert_title gets
called by a function that already has the stream
name. It's just a bit of a pain to refactor when
you have all the DM codepath mucked up with the
stream codepath.
2023-08-10 18:13:43 -07:00
Steve Howell 8295b0d46e tests: Simplify how we get active streams.
There's no need for the complexity and extra round
trips to call get_display_recipient in a testing
context.

We also eliminate the unnecessary call to check_string.

This function is poorly named, but that's a sweep
for another day.
2023-08-10 18:13:43 -07:00
Steve Howell a54760da0e tests: Add assert_message_stream_name
The get_display_recipient helper is a clumsy way to get
stream names, and it's not even representative of how
most of our code retrieves stream names.

The new helper also double-checks that the Stream
object has the correct recipient id.
2023-08-10 18:13:43 -07:00
Steve Howell 7c864db8f2 email mirror: Avoid silly email lookup.
We can search by id, which is more resilient and still
hits a cache.
2023-08-10 18:13:43 -07:00
Steve Howell 257b32a4a4 narrow urls: Avoid complicated optional types.
We no longer have to reason about the 12 possible
ways of invoking get_narrow_url. We also avoid
double computation in a couple places.

Finally, we get stricter type checks by just inlining
the calls.
2023-08-10 18:13:43 -07:00
Steve Howell 233486f7b3 push notifications: Rename variable. 2023-08-10 18:13:43 -07:00
Steve Howell 6be2a08ed8 cache: Avoid cache spam for push notifications.
We don't need to call get_display_recipient for
non-stream messages.

I will rename display_recipient in the next commit;
if I were to combine the steps the diff would be too
hard to read.
2023-08-10 18:13:43 -07:00
Prakhar Pratyush 860eee94fd notifications: Rename 'pm' to 'dm' in 'RecipientInfoResult' dataclass.
This commit renames the keyword 'pm' to 'dm' in the
'pm_mention_email_disabled_user_ids' and
'pm_mention_push_disabled_user_ids' attributes of the
'RecipientInfoResult' dataclass.

'pm' and 'dm' are the acronyms for 'private message' and
'direct message' respectively.

It includes 'TODO/compatibility' code to support the old format
fields in the tornado queues during the Zulip server upgrades.
2023-08-10 17:41:49 -07:00
Prakhar Pratyush c4e4737cc6 notification_trigger: Rename `private_message` to `direct_message`.
This commit renames the 'PRIVATE_MESSAGE' attribute of the
'NotificationTriggers' class to 'DIRECT_MESSAGE'.

Custom migration to update the existing value in the database.

It includes 'TODO/compatibility' code to support the old
notification trigger value 'private_message' in the
push notification queue during the Zulip server upgrades.

Earlier 'private_message' was one of the possible values for the
'trigger' property of the '[`POST /zulip-outgoing-webhook`]' response;
Update the docs to reflect the change in the above-mentioned trigger
value.
2023-08-10 17:41:49 -07:00
Prakhar Pratyush 3675a44471 push_notifications: Add the missing compatibility code.
This commit adds 'TODO/compatibility' code to support the
old notification trigger values in the push notification queue
during the Zulip server upgrades.

In f4fa82e, we renamed the following notification triggers:
* 'wildcard_mentioned' to 'stream_wildcard_mentioned'
* 'followed_topic_wildcard_mentioned' to
'stream_wildcard_mentioned_in_followed_topic'.

This should have been added in f4fa82e.
2023-08-10 17:41:49 -07:00