We now send stream creation and stream deletion events on
changing a user's role because a user can gain or lose
access to some streams on changing their role.
This commit extracts the code which queries the required streams
to a new function "get_user_streams". The new functions returns
the list of "Stream" object and not dictionaries and then
do_get_streams function converts it into list of dictionaries.
This change is important because we would use the new function
in further commit where we want list of "Stream" objects and
not list of dictionaries.
There was a bug in apply_event code where only a stream which
is not private is added to the "never_subscribed" data after
a stream creation event. Instead, it should be added to the
"never_subscribed" data irrespective of permission policy of
the stream as we already send stream creation events only to
those users who can access the stream. Due to the current
bug, private streams were not being added to "never_subscribed"
data in apply_event for admins as well. This commit fixes it
and also makes sure the "never_subscribed" list is sorted
which was not done before and was also a bug.
The bugs mentioned above were unnoticed as the tests did not
cover these cases and this commit also adds tests for those
cases.
The "streams" field in "/register" response did not include web-public
streams for non-admin users but the data for those are eventually
included in the subscriptions data sent using "subscriptions",
"unsubscribed" and "never_subscribed" fields.
This commit adds code to include the web-public streams in "streams"
field as well as everyone can access those and will make the "streams"
data complete.
The name and docstring were just wrong, having a UserMessage row isn't
sufficient for having message access and is actually only relevant in a
private stream with private history. The function is only used in a
single place anyway, in bulk_access_messages.
The comment mentioning this function in handle_remove_push_notification
can be tweaked to just not mention any function specifically and just
say why we're not checking message access.
Users who used to be subscribed to a private stream and have been
removed from it since retain the ability to edit messages/topics, and
delete messages that they used to have access to, if other relevant
organization permissions allow these actions. For example, a user may be
able to edit or delete their old messages they posted in such a private
stream. An administrator will be able to delete old messages (that they
had access to) from the private stream.
We fix this by fixing the logic in has_message_access (which lies at the
core of our message access checks - access_message() and
bulk_access_messages())
to not rely on only a UserMessage row for checking access but also
verify stream type and subscription status.
**Background**
User groups are expected to comply with the DAG constraint for the
many-to-many inter-group membership. The check for this constraint has
to be performed recursively so that we can find all direct and indirect
subgroups of the user group to be added.
This kind of check is vulnerable to phantom reads which is possible at
the default read committed isolation level because we cannot guarantee
that the check is still valid when we are adding the subgroups to the
user group.
**Solution**
To avoid having another transaction concurrently update one of the
to-be-subgroup after the recursive check is done, and before the subgroup
is added, we use SELECT FOR UPDATE to lock the user group rows.
The lock needs to be acquired before a group membership change is about
to occur before any check has been conducted.
Suppose that we are adding subgroup B to supergroup A, the locking protocol
is specified as follows:
1. Acquire a lock for B and all its direct and indirect subgroups.
2. Acquire a lock for A.
For the removal of user groups, we acquire a lock for the user group to
be removed with all its direct and indirect subgroups. This is the special
case A=B, which is still complaint with the protocol.
**Error handling**
We currently rely on Postgres' deadlock detection to abort transactions
and show an error for the users. In the future, we might need some
recovery mechanism or at least better error handling.
**Notes**
An important note is that we need to reuse the recursive CTE query that
finds the direct and indirect subgroups when applying the lock on the
rows. And the lock needs to be acquired the same way for the addition and
removal of direct subgroups.
User membership change (as opposed to user group membership) is not
affected. Read-only queries aren't either. The locks only protect
critical regions where the user group dependency graph might violate
the DAG constraint, where users are not participating.
**Testing**
We implement a transaction test case targeting some typical scenarios
when an internal server error is expected to happen (this means that the
user group view makes the correct decision to abort the transaction when
something goes wrong with locks).
To achieve this, we add a development view intended only for unit tests.
It has a global BARRIER that can be shared across threads, so that we
can synchronize them to consistently reproduce certain potential race
conditions prevented by the database locks.
The transaction test case lanuches pairs of threads initiating possibly
conflicting requests at the same time. The tests are set up such that exactly N
of them are expected to succeed with a certain error message (while we don't
know each one).
**Security notes**
get_recursive_subgroups_for_groups will no longer fetch user groups from
other realms. As a result, trying to add/remove a subgroup from another
realm results in a UserGroup not found error response.
We also implement subgroup-specific checks in has_user_group_access to
keep permission managing in a single place. Do note that the API
currently don't have a way to violate that check because we are only
checking the realm ID now.
We want to make the callers be more explicit about the use of the
user group being accessed, so that the later implemented database lock
can be benefited from the visibility.
Adds typing notification constants to the response given by
`POST /register`. Until now, these were hardcoded by clients
based on the documentation for implementing typing notifications
in the main endpoint description for `api/set-typing-status`.
This change also reflects updating the web-app frontend code
to use the new constants from the register response.
Co-authored-by: Samuel Kabuya <samuel.mwangikabuya@kibo.school>
Co-authored-by: Wilhelmina Asante <wilhelmina.asante@kibo.school>
Fixes#11767.
Previously multi-character emoji sequences weren't matched in the
emoji regex, so we'd convert the characters to separate images,
breaking the intended display.
This change allows us to match the full emoji sequence, and
therefore show the correct image.
We have modified the code to directly fetch realm from Message
object instead of "sender" field and thus we no longer need to
fetch "sender__realm" using select_related.
We do not want to access realm from "sender" field so that
we do not need to pass "sender__realm" argument to
select_related call when querying messages. We can instead
pass realm as argument to wildcard_mention_allowed.
We can directly get the realm object from Message object now
and there is no need to get the realm object from "sender"
field of Message object.
After this change, we would not need to fetch "sender__realm"
field using "select_related" and instead only passing "realm"
to select_related when querying Message objects would be enough.
This commit also updates a couple of cases to directly access
realm ID from message object and not message.sender. Although
we have fetched sender object already, so accessing realm_id
from message directly or from message.sender should not matter,
but we can be consistent to directly get realm from Message
object whenever possible.
We set stream_weekly_traffic field to "null" for Subscription
objects in zephyr mirror realm as we do not need stream traffic
data in zephyr mirror realm. This makes the subscription data
consistent with steams data.
This commit also udpates test to check never_subscribed data for
zephyr mirror realm.
Instead of having a "realm.is_zephyr_mirror_realm" check for every
get_streams_traffic call, this commit udpates get_streams_traffic to
accept realm as parameter and return "None" for zephyr mirror realm.
The "followup_day2" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "zulip_onboarding_topics".
Because any existing scheduled emails that use the "followup_day2"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
The "followup_day1" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "account_registered".
Because any existing scheduled emails that use the "followup_day1"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.
- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.
- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.
- We remove None-checks for extra_data because it is now no longer
nullable.
- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.
- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.
- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.
Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
We remove bot's subscriptions for private streams to which the
new owner is not subscribed and keep the ones to which the new
owner is subscribed on changing owner.
This commit also changes the code for sending subscription
remove events to use transaction.on_commit since we call
the function inside a transactopn in do_change_bot_owner and
this also requires some changes in tests in test_events.
For topic wildcard mentions, the 'wildcard_mentioned' flag is set
for those user messages having 'user_profile_id' in
'topic_participant_user_ids', i.e. all topic participants.
Earlier, the flag was set if the 'user_profile_id' exists in
'all_topic_wildcard_mention_user_ids'.
'all_topic_wildcard_mention_user_ids' contains the ids of those
users who are topic participants and have enabled notifications
for '@topic' mentions.
The earlier approach was incorrect, as it would set the
'wildcard_mentioned' flag only for those topic participants
who have enabled the notifications for '@topic' mention instead
of setting the flag for all the topic participants.
The bug was introduced in 4c9d26c.
This commit addresses the issue where the topic highlighting
in search results was offset by one character when an
apostrophe was present. The problem stemmed from the disparity
in HTML escaping generated by the function `func.escape_html` which
is used to obtain `topic_matches` differs from the escaping performed
by the function `django.utils.html.escape` for apostrophes (').
func.escape_html | django.utils.html.escape
-----------------+--------------------------
' | '
To fix this SQL query is changed to return the HTML-escaped
topic name generated by the function `func.escape_html`.
Fixes: #25633.
Currently, we are displaying the "Complete the organization profile"
banner immediately after the organization was created. It's important to
strongly encourage orgs to configure their profile, so we should delay
showing the banner if the profile has not been configured after 15 days.
Thus also allows the users to check out Zulip and see how it works before
configuring the organization settings.
Fixes: #24122.
The number of affected objects may be quite high, and they are
selected by `id IN (...)` query, and updated with a giant `CASE`.
This turns out to be quadratic, and can cause large queries to take
hours, in a state where they cannot be terminated, when PostgreSQL >11
tries to JIT the query.
Set a batch_size as a stopgap performance fix before moving to
`.update()` as a real fix.
We have historically cached two types of values
on a per-request basis inside of memory:
* linkifiers
* display recipients
Both of these caches were hand-written, and they
both actually cache values that are also in memcached,
so the per-request cache essentially only saves us
from a few memcached hits.
I think the linkifier per-request cache is a necessary
evil. It's an important part of message rendering, and
it's not super easy to structure the code to just get
a single value up front and pass it down the stack.
I'm not so sure we even need the display recipient
per-request cache any more, as we are generally pretty
smart now about hydrating recipient data in terms of
how the code is organized. But I haven't done thorough
research on that hypotheseis.
Fortunately, it's not rocket science to just write
a glorified memoize decorator and tie it into key
places in the code:
* middleware
* tests (e.g. asserting db counts)
* queue processors
That's what I did in this commit.
This commit definitely reduces the amount of code
to maintain. I think it also gets us closer to
possibly phasing out this whole technique, but that
effort is beyond the scope of this PR. We could
add some instrumentation to the decorator to see
how often we get a non-trivial number of saved
round trips to memcached.
Note that when we flush linkifiers, we just use
a big hammer and flush the entire per-request
cache for linkifiers, since there is only ever
one realm in the cache.
We want to phase out the use of get_display_recipient
for streams, and this is the last place that I
eliminate it. The next commit will eliminate the
dead code and make mypy types tighter.
This change will make push notifications slightly
slower in some situations, but we avoid all the
complexity of a cache, and this code tends to run
offline.
We could always make this code a bit more efficient
by being a little smarter about what data we fetch
up front. For example, get_apns_alert_title gets
called by a function that already has the stream
name. It's just a bit of a pain to refactor when
you have all the DM codepath mucked up with the
stream codepath.
There's no need for the complexity and extra round
trips to call get_display_recipient in a testing
context.
We also eliminate the unnecessary call to check_string.
This function is poorly named, but that's a sweep
for another day.
The get_display_recipient helper is a clumsy way to get
stream names, and it's not even representative of how
most of our code retrieves stream names.
The new helper also double-checks that the Stream
object has the correct recipient id.
We no longer have to reason about the 12 possible
ways of invoking get_narrow_url. We also avoid
double computation in a couple places.
Finally, we get stricter type checks by just inlining
the calls.
We don't need to call get_display_recipient for
non-stream messages.
I will rename display_recipient in the next commit;
if I were to combine the steps the diff would be too
hard to read.
This commit renames the keyword 'pm' to 'dm' in the
'pm_mention_email_disabled_user_ids' and
'pm_mention_push_disabled_user_ids' attributes of the
'RecipientInfoResult' dataclass.
'pm' and 'dm' are the acronyms for 'private message' and
'direct message' respectively.
It includes 'TODO/compatibility' code to support the old format
fields in the tornado queues during the Zulip server upgrades.
This commit renames the 'PRIVATE_MESSAGE' attribute of the
'NotificationTriggers' class to 'DIRECT_MESSAGE'.
Custom migration to update the existing value in the database.
It includes 'TODO/compatibility' code to support the old
notification trigger value 'private_message' in the
push notification queue during the Zulip server upgrades.
Earlier 'private_message' was one of the possible values for the
'trigger' property of the '[`POST /zulip-outgoing-webhook`]' response;
Update the docs to reflect the change in the above-mentioned trigger
value.
This commit adds 'TODO/compatibility' code to support the
old notification trigger values in the push notification queue
during the Zulip server upgrades.
In f4fa82e, we renamed the following notification triggers:
* 'wildcard_mentioned' to 'stream_wildcard_mentioned'
* 'followed_topic_wildcard_mentioned' to
'stream_wildcard_mentioned_in_followed_topic'.
This should have been added in f4fa82e.