The 'startinline' option is utilized in the `CodeHilite` instantiation
to indicate that the provided PHP code snippet should be highlighted
even if it doesn't start with the opening tag or marker of the
associated programming language (which will rarely be the case in
Zulip, since one discusses a section of a file much more often than a
whole file).
Fixes: https://chat.zulip.org/#narrow/stream/137-feedback/topic/php.20syntax.20highlighting.20should.20not.20require.20.60.3C.3Fphp.60.
Signed-off-by: Akshat <akshat25iiit@gmail.com>
This commit updates the 'get_apns_alert_subtitle' function to
return a common subtitle, i.e., "{full_name} mentioned everyone:"
for wildcard mentions.
The triggers for the stream or topic wildcard mentions include:
* NotificationTriggers.TOPIC_WILDCARD_MENTION_IN_FOLLOWED_TOPIC
* NotificationTriggers.STREAM_WILDCARD_MENTION_IN_FOLLOWED_TOPIC
* NotificationTriggers.TOPIC_WILDCARD_MENTION
* NotificationTriggers.STREAM_WILDCARD_MENTION
We now send stream creation and stream deletion events on
changing a user's role because a user can gain or lose
access to some streams on changing their role.
This commit extracts the code which queries the required streams
to a new function "get_user_streams". The new functions returns
the list of "Stream" object and not dictionaries and then
do_get_streams function converts it into list of dictionaries.
This change is important because we would use the new function
in further commit where we want list of "Stream" objects and
not list of dictionaries.
There was a bug in apply_event code where only a stream which
is not private is added to the "never_subscribed" data after
a stream creation event. Instead, it should be added to the
"never_subscribed" data irrespective of permission policy of
the stream as we already send stream creation events only to
those users who can access the stream. Due to the current
bug, private streams were not being added to "never_subscribed"
data in apply_event for admins as well. This commit fixes it
and also makes sure the "never_subscribed" list is sorted
which was not done before and was also a bug.
The bugs mentioned above were unnoticed as the tests did not
cover these cases and this commit also adds tests for those
cases.
The "streams" field in "/register" response did not include web-public
streams for non-admin users but the data for those are eventually
included in the subscriptions data sent using "subscriptions",
"unsubscribed" and "never_subscribed" fields.
This commit adds code to include the web-public streams in "streams"
field as well as everyone can access those and will make the "streams"
data complete.
The name and docstring were just wrong, having a UserMessage row isn't
sufficient for having message access and is actually only relevant in a
private stream with private history. The function is only used in a
single place anyway, in bulk_access_messages.
The comment mentioning this function in handle_remove_push_notification
can be tweaked to just not mention any function specifically and just
say why we're not checking message access.
Users who used to be subscribed to a private stream and have been
removed from it since retain the ability to edit messages/topics, and
delete messages that they used to have access to, if other relevant
organization permissions allow these actions. For example, a user may be
able to edit or delete their old messages they posted in such a private
stream. An administrator will be able to delete old messages (that they
had access to) from the private stream.
We fix this by fixing the logic in has_message_access (which lies at the
core of our message access checks - access_message() and
bulk_access_messages())
to not rely on only a UserMessage row for checking access but also
verify stream type and subscription status.
**Background**
User groups are expected to comply with the DAG constraint for the
many-to-many inter-group membership. The check for this constraint has
to be performed recursively so that we can find all direct and indirect
subgroups of the user group to be added.
This kind of check is vulnerable to phantom reads which is possible at
the default read committed isolation level because we cannot guarantee
that the check is still valid when we are adding the subgroups to the
user group.
**Solution**
To avoid having another transaction concurrently update one of the
to-be-subgroup after the recursive check is done, and before the subgroup
is added, we use SELECT FOR UPDATE to lock the user group rows.
The lock needs to be acquired before a group membership change is about
to occur before any check has been conducted.
Suppose that we are adding subgroup B to supergroup A, the locking protocol
is specified as follows:
1. Acquire a lock for B and all its direct and indirect subgroups.
2. Acquire a lock for A.
For the removal of user groups, we acquire a lock for the user group to
be removed with all its direct and indirect subgroups. This is the special
case A=B, which is still complaint with the protocol.
**Error handling**
We currently rely on Postgres' deadlock detection to abort transactions
and show an error for the users. In the future, we might need some
recovery mechanism or at least better error handling.
**Notes**
An important note is that we need to reuse the recursive CTE query that
finds the direct and indirect subgroups when applying the lock on the
rows. And the lock needs to be acquired the same way for the addition and
removal of direct subgroups.
User membership change (as opposed to user group membership) is not
affected. Read-only queries aren't either. The locks only protect
critical regions where the user group dependency graph might violate
the DAG constraint, where users are not participating.
**Testing**
We implement a transaction test case targeting some typical scenarios
when an internal server error is expected to happen (this means that the
user group view makes the correct decision to abort the transaction when
something goes wrong with locks).
To achieve this, we add a development view intended only for unit tests.
It has a global BARRIER that can be shared across threads, so that we
can synchronize them to consistently reproduce certain potential race
conditions prevented by the database locks.
The transaction test case lanuches pairs of threads initiating possibly
conflicting requests at the same time. The tests are set up such that exactly N
of them are expected to succeed with a certain error message (while we don't
know each one).
**Security notes**
get_recursive_subgroups_for_groups will no longer fetch user groups from
other realms. As a result, trying to add/remove a subgroup from another
realm results in a UserGroup not found error response.
We also implement subgroup-specific checks in has_user_group_access to
keep permission managing in a single place. Do note that the API
currently don't have a way to violate that check because we are only
checking the realm ID now.
We want to make the callers be more explicit about the use of the
user group being accessed, so that the later implemented database lock
can be benefited from the visibility.
Adds typing notification constants to the response given by
`POST /register`. Until now, these were hardcoded by clients
based on the documentation for implementing typing notifications
in the main endpoint description for `api/set-typing-status`.
This change also reflects updating the web-app frontend code
to use the new constants from the register response.
Co-authored-by: Samuel Kabuya <samuel.mwangikabuya@kibo.school>
Co-authored-by: Wilhelmina Asante <wilhelmina.asante@kibo.school>
Fixes#11767.
Previously multi-character emoji sequences weren't matched in the
emoji regex, so we'd convert the characters to separate images,
breaking the intended display.
This change allows us to match the full emoji sequence, and
therefore show the correct image.
We have modified the code to directly fetch realm from Message
object instead of "sender" field and thus we no longer need to
fetch "sender__realm" using select_related.
We do not want to access realm from "sender" field so that
we do not need to pass "sender__realm" argument to
select_related call when querying messages. We can instead
pass realm as argument to wildcard_mention_allowed.
We can directly get the realm object from Message object now
and there is no need to get the realm object from "sender"
field of Message object.
After this change, we would not need to fetch "sender__realm"
field using "select_related" and instead only passing "realm"
to select_related when querying Message objects would be enough.
This commit also updates a couple of cases to directly access
realm ID from message object and not message.sender. Although
we have fetched sender object already, so accessing realm_id
from message directly or from message.sender should not matter,
but we can be consistent to directly get realm from Message
object whenever possible.
We set stream_weekly_traffic field to "null" for Subscription
objects in zephyr mirror realm as we do not need stream traffic
data in zephyr mirror realm. This makes the subscription data
consistent with steams data.
This commit also udpates test to check never_subscribed data for
zephyr mirror realm.
Instead of having a "realm.is_zephyr_mirror_realm" check for every
get_streams_traffic call, this commit udpates get_streams_traffic to
accept realm as parameter and return "None" for zephyr mirror realm.
The "followup_day2" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "zulip_onboarding_topics".
Because any existing scheduled emails that use the "followup_day2"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
The "followup_day1" email template name is not clear or descriptive
about the purpose of the email. Creates a duplicate of those email
template files with the template name "account_registered".
Because any existing scheduled emails that use the "followup_day1"
templates will need to be updated before the current templates can
be removed, we don't do a simple file rename here.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.
- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.
- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.
- We remove None-checks for extra_data because it is now no longer
nullable.
- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.
- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.
- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.
Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
We remove bot's subscriptions for private streams to which the
new owner is not subscribed and keep the ones to which the new
owner is subscribed on changing owner.
This commit also changes the code for sending subscription
remove events to use transaction.on_commit since we call
the function inside a transactopn in do_change_bot_owner and
this also requires some changes in tests in test_events.
For topic wildcard mentions, the 'wildcard_mentioned' flag is set
for those user messages having 'user_profile_id' in
'topic_participant_user_ids', i.e. all topic participants.
Earlier, the flag was set if the 'user_profile_id' exists in
'all_topic_wildcard_mention_user_ids'.
'all_topic_wildcard_mention_user_ids' contains the ids of those
users who are topic participants and have enabled notifications
for '@topic' mentions.
The earlier approach was incorrect, as it would set the
'wildcard_mentioned' flag only for those topic participants
who have enabled the notifications for '@topic' mention instead
of setting the flag for all the topic participants.
The bug was introduced in 4c9d26c.
This commit addresses the issue where the topic highlighting
in search results was offset by one character when an
apostrophe was present. The problem stemmed from the disparity
in HTML escaping generated by the function `func.escape_html` which
is used to obtain `topic_matches` differs from the escaping performed
by the function `django.utils.html.escape` for apostrophes (').
func.escape_html | django.utils.html.escape
-----------------+--------------------------
' | '
To fix this SQL query is changed to return the HTML-escaped
topic name generated by the function `func.escape_html`.
Fixes: #25633.
Currently, we are displaying the "Complete the organization profile"
banner immediately after the organization was created. It's important to
strongly encourage orgs to configure their profile, so we should delay
showing the banner if the profile has not been configured after 15 days.
Thus also allows the users to check out Zulip and see how it works before
configuring the organization settings.
Fixes: #24122.
The number of affected objects may be quite high, and they are
selected by `id IN (...)` query, and updated with a giant `CASE`.
This turns out to be quadratic, and can cause large queries to take
hours, in a state where they cannot be terminated, when PostgreSQL >11
tries to JIT the query.
Set a batch_size as a stopgap performance fix before moving to
`.update()` as a real fix.
We have historically cached two types of values
on a per-request basis inside of memory:
* linkifiers
* display recipients
Both of these caches were hand-written, and they
both actually cache values that are also in memcached,
so the per-request cache essentially only saves us
from a few memcached hits.
I think the linkifier per-request cache is a necessary
evil. It's an important part of message rendering, and
it's not super easy to structure the code to just get
a single value up front and pass it down the stack.
I'm not so sure we even need the display recipient
per-request cache any more, as we are generally pretty
smart now about hydrating recipient data in terms of
how the code is organized. But I haven't done thorough
research on that hypotheseis.
Fortunately, it's not rocket science to just write
a glorified memoize decorator and tie it into key
places in the code:
* middleware
* tests (e.g. asserting db counts)
* queue processors
That's what I did in this commit.
This commit definitely reduces the amount of code
to maintain. I think it also gets us closer to
possibly phasing out this whole technique, but that
effort is beyond the scope of this PR. We could
add some instrumentation to the decorator to see
how often we get a non-trivial number of saved
round trips to memcached.
Note that when we flush linkifiers, we just use
a big hammer and flush the entire per-request
cache for linkifiers, since there is only ever
one realm in the cache.
We want to phase out the use of get_display_recipient
for streams, and this is the last place that I
eliminate it. The next commit will eliminate the
dead code and make mypy types tighter.
This change will make push notifications slightly
slower in some situations, but we avoid all the
complexity of a cache, and this code tends to run
offline.
We could always make this code a bit more efficient
by being a little smarter about what data we fetch
up front. For example, get_apns_alert_title gets
called by a function that already has the stream
name. It's just a bit of a pain to refactor when
you have all the DM codepath mucked up with the
stream codepath.
There's no need for the complexity and extra round
trips to call get_display_recipient in a testing
context.
We also eliminate the unnecessary call to check_string.
This function is poorly named, but that's a sweep
for another day.
The get_display_recipient helper is a clumsy way to get
stream names, and it's not even representative of how
most of our code retrieves stream names.
The new helper also double-checks that the Stream
object has the correct recipient id.
We no longer have to reason about the 12 possible
ways of invoking get_narrow_url. We also avoid
double computation in a couple places.
Finally, we get stricter type checks by just inlining
the calls.
We don't need to call get_display_recipient for
non-stream messages.
I will rename display_recipient in the next commit;
if I were to combine the steps the diff would be too
hard to read.
This commit renames the keyword 'pm' to 'dm' in the
'pm_mention_email_disabled_user_ids' and
'pm_mention_push_disabled_user_ids' attributes of the
'RecipientInfoResult' dataclass.
'pm' and 'dm' are the acronyms for 'private message' and
'direct message' respectively.
It includes 'TODO/compatibility' code to support the old format
fields in the tornado queues during the Zulip server upgrades.
This commit renames the 'PRIVATE_MESSAGE' attribute of the
'NotificationTriggers' class to 'DIRECT_MESSAGE'.
Custom migration to update the existing value in the database.
It includes 'TODO/compatibility' code to support the old
notification trigger value 'private_message' in the
push notification queue during the Zulip server upgrades.
Earlier 'private_message' was one of the possible values for the
'trigger' property of the '[`POST /zulip-outgoing-webhook`]' response;
Update the docs to reflect the change in the above-mentioned trigger
value.
This commit adds 'TODO/compatibility' code to support the
old notification trigger values in the push notification queue
during the Zulip server upgrades.
In f4fa82e, we renamed the following notification triggers:
* 'wildcard_mentioned' to 'stream_wildcard_mentioned'
* 'followed_topic_wildcard_mentioned' to
'stream_wildcard_mentioned_in_followed_topic'.
This should have been added in f4fa82e.
This commit adds code to pass all the required arguments to
select_related call for Message objects such that only the
required related fields are fetched from the database.
Previously, we did not pass any arguments to select_related,
so all the directly and indirectly related fields were fetched
when many of them were actually not being used and made the
query unnecessarily complex.
This commit adds code to pass all the required arguments to
select_related call for Message objects such that only the
required related fields are fetched from the database.
Previously, we did not pass any arguments to select_related,
so all the directly and indirectly related fields were fetched
when many of them were actually not being used and made the
query unnecessarily complex.
This commit adds code to pass all the required arguments to
select_related call for Subscription objects such that only
the required related fields are fetched from the database.
Previously, we did not pass any arguments to select_related,
so all the directly and indirectly related fields were fetched
when many of them were actually not being used and made the
query unnecessarily complex.
This commit adds code to pass all the required arguments to
select_related call for MissedMessageEmailAddress such that
only the required related fields are fetched from the database.
Previously, we did not pass any arguments to select_related,
so all the directly and indirectly related fields were fetched
when many of them were actually not being used and made the
query unnecessarily complex.
We only need ID of the recipient and not the full object, so we
directly access ID using "stream.recipient_id" instead of using
the complete recipient object.
This commit removes get_huddle_recipient function and we now use
get_or_create_huddle in get_recipient_from_user_profiles.
As a result of this change, we do not fetch the recipient from
Huddle object but instead get it using the "id" and "recipient_id"
fields available from Huddle object like we do for a personal
message. This change allows us to not fetch recipient object
using select_related when querying the Huddle object.
We now fetch recipient object when querying "Huddle" object in
get_or_create_huddle_backend as this query is eventually used
to get the recipient object only in get_huddle_recipient.
This commit also updates the select_related call in the code to
populate Huddle objects in cache to pass "Recipient" as argument.
Previously no argument was passed to select_related and thus no
related objects were being fetched, with no non-null related fields
being present.
- Adds instructions block and relative link to Starred messages.
- Adds "Toggle starred messages counter" subheading.
- Adds "Searching for messages" as a related article.
The set of objects in the `users` object can be very large (in some
cases, literally every object in the database) and making them into a
giant `id in (...)` to handle the one tiny corner case which we never
use is silly.
Switch the `--users` codepath to returning a QuerySet as well, so it
can be composed. We pass a QuerySet into send_custom_email as well,
so it can ensure that the realm is `select_related` in as well, no
matter how the QuerySet was generated.
Substituting the rendered body via Jinja2 means that it cannot
perform any interpolation itself. While the string replacement is
hacky, it is the only solution which avoids running Jinja2 more than
once, and also allows the user-supplied content to have Jinja2
substitutions in it.
In this commit, we introduce a new option in the stream creation
UI - a 'Default stream for new users' checkbox. By default, the
checkbox is set to 'off' and is only visible to admins. This
allow admins to easily designate a stream as the default stream
for new users during stream creation.
Fixes#24048.
Adds a test for when a value for a user's custom profile field is
removed and not set to a new value. The omission of this event in
the tests was noted as a possibility in #22103, which updated the
API documentation for these events having `null` for the field
value.
When adding the test discovered that the events logic was not
deleting the field from the user object and instead setting it to
`None`, so fixes that logic as well. There was a similar bug fixed
in commit 96c61a1a41 for when custom profile fields are removed
from a realm.
When applying realm_user update events, some of the event fields
for the person object were being updated to the same value in a
loop. Unnests those calls from the loop over the existing fields
so that they are only updated once.
The original nesting was introduced in commit 649fccde6b and
was expanded in other additions to the logic for these events.
This commit adds code to pass stream traffic data using
the "stream_weekly_traffic" field in stream objects.
We already include the traffic data in Subscription objects,
but the traffic data does not depend on the user to stream
relationship and is stream-only information, so it's better
to include it in Stream objects. We may remove the traffic
data and other stream information fields for Subscription
objects in future.
This will help clients to correctly display the stream
traffic data in case where client receives a stream
creation event and no subscription event, for an already
existing stream which the user did not have access to before.
This commit changes the code to not use get_client_data
function and instead use `stream_to_dict` function to
get the stream data in a dictionary form. This is a
prep commit add stream traffic data to Stream objects.
This commit adds stream_to_dict method which is same as
Stream.to_dict method as of now. This is a prep commit
to include stream traffic data in stream objects.
I move the helper user_ids_to_users to the only
place that it's used, and then I simplify it to
do a direct database query.
These endpoints aren't hit often enough to justify
caching complexity, and for really large user groups,
hitting the cache can actually be counterproductive.
Particularly when you add new users to an existing
group, the bulk of the cost is sending out
notification messages to users.
The only change to the test is that I added an
assertion on the query count.
The cross-realm bots rarely change, and there are only
a few of them, so we just query them all at once and
put them in the cache.
Also, we put the dictionaries in the cache, instead of
the user objects, since there is nothing time-sensitive
about the dictionaries, and they are small. This saves
us a little time computing the avatar url and things
like that, not to mention marshalling costs.
This commit also fixes a theoretical bug where we would
have stale cache entries if somebody somehow modified
the cross-realm bots without bumping KEY_PREFIX.
Internally we no longer pre-fetch the realm objects for
the bots, but we don't get overly precise about picking
individual fields from UserProfile, since we rarely hit
the database and since we don't store raw ORM objects
in the cache.
The test diffs make it look like we are hitting the
cache an extra time, but the tests weren't counting
bulk fetches. Now we only use a single key for all
bots rather a key per bot.
The bulk_get_users() function was only being used to
get cross-realm bots.
It appears that it was introduced in
f02e5b90f6 for that
specific use case.
Now we make the function more specific and test it more
accurately.
We also eliminate a lot of janky code and comments,
including some code that never had test coverage.
Incidentally, it appears that we did not have any code
to invalidate the cache keys here, and that is still
the case. In practice I assume people rarely
re-configure their cross-realm bots unless they are
upgrading the server, and then KEY_PREFIX comes into
play. 25fd4c5508 seems
to have caused that hopefully harmless regression.
A further step will be to make this cache more coarse,
since there are only a few cross-realm bots. The next
commit will hopefully simplify the code and address the
validation pitfall.
Earlier the API endpoints related to user_group accepts and returns a
field `can_mention_group_id` which represents the ID
of user_group whose members can mention the group.
This commit renames this field to `can_mention_group`.
Earlier the API endpoints related to streams accepts and returns a
field `can_remove_subscribers_group_id` which represents the ID
of user_group whose members can remove subscribers from stream.
This commit renames this field to `can_remove_subscribers_group`.
Dropping support for url_prefix for RealmPlayground, the server now uses
url_template instead only for playground creation, retrieval and audit
logging upon removal.
This does the necessary handling so that url_template is expanded with
the extracted code.
Fixes#25723.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This commit removes the stray strings used to refer to
various types of notification triggers.
We use the attributes of the 'NotificationTriggers' class instead.
Updates the realm field `default_code_block_language` to have a default
value of an empty string instead of None. Also updates the web-app to
check for the empty string and not `null` to indicate no default is set.
This means that both new realms and existing realms that have no default
set will have the same value for this setting: an empty string.
Previously, new realms would have None if no default was set, while realms
that had set and then unset a value for this field would have an empty
string when no default was set.
Expands support for the message ID operand for id" operator to be either
a string or an integer. Previously, this operand was always validated as
a string.
Restore the default django.utils.log.AdminEmailHandler when
ERROR_REPORTING is enabled. Those with more sophisticated needs can
turn it off and use Sentry or a Sentry-compatible system.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We used to access the complete objects for UserProfile foreign
keys like "bot_owner" and "default_sending_stream", where we only
needed ID of them.
This commit fixes some of such instances and now we directly get
the id using "bot_owner_id" and "default_sending_stream_id" so
that we can avoid the unnecessary complexity of accessing the
complete object.
This commit updates code to pass "realm" and "bot_owner" args to
select_related call in get_users. We pass "realm" and "bot_owner"
args to get_users because the caches which this function is used
to populate are used for get_user and get_user_profile_by_api_key
functions and they also select both these fields when querying for
UserProfile objects.
This commit updates the select_related calls in queries to
get UserProfile objects in get_user function called in
management commands to pass "realm" as argument to
select_related call.
There are some management commands like deactivate_user,
change_full_name, etc. which might need fields like
"default_sending_stream" when changing full name of a bot
or something similar, but we don't think that would happen
often and we can afford to have a DB round trip to get
these fields if needed.
Also, note that "realm" is the only non-null foreign key
field in UserProfile object, so select_related() was only
fetching realm object previously as well. But we should
still pass "realm" as argument in select_related call so
that we can make sure that only required fields are
selected in case we add more foreign keys to UserProfile
in future.
This commit updates select_related call to pass "realm" as
argument in select_related call in fetch_users_by_id function
as we only require realm for the UserProfile objects fetched
using fetch_users_by_id.
Also, note that "realm" is the only non-null foreign key field
in UserProfile object, so select_related() was only fetching
realm object previously as well. But we should still pass "realm"
as argument in select_related call so that we can make sure that
only required fields are selected in case we add more foreign
keys to UserProfile in future.
We do not use any related fields for the UserProfile objects
fetched by get_active_users, so we can simply remove the
select_related call.
The user object from get_active_users was used to get realm
but since get_active_users called from a realm object we can
directly use that realm object. This change also leads to
some changes in the cache code where we now pass the realm
to the function instead of selecting it from UserProfile object.
This commit removes select_related call from
get_soft_deactivated_users_for_catch_up as
we do not use any related fields for the
UserProfile objects fetched using this call.
Uploads are well-positioned to use S3's "intelligent tiering" storage
class. Add a setting to let uploaded files to declare their desired
storage class at upload time, and document how to move existing files
to the same storage class.
‘blocklist’ was added in 0.0.35 (with backwards compatibility for the
old name), and type annotations were added in 0.0.91 (with only the
new name).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This code removes a lot of complexity with very likely
positive overall impact on system performance and
negligible downside.
We already cache display recipients on a per-user
level, so there's no need for another cache layer on
top of that that keys them with recipient ids.
We avoid strange things where Alice/Bob and Bob/Charlie
get put into the top layer cache and then we still have
a cache miss on Alice/Charlie despite the lower level
cache being able to support per-user lookups.
This change does introduce an extra database round trip
if any of our messages have a huddle, but the query is
extremely cheap, and we can always try to cache that
function more directly or try to re-use some of our
other huddle-based caches.
As part of this, we clean up the names for the
lower-level per-user cache of display recipients, and
we simplify the cache keys.
We also stop passing in a full Recipient object to the
`bulk_get_huddle_user_ids` functions.
The local impact of this change should be easy to
measure (at least approximately), since we use this
function every time a user gets messages via the
/messages endpoint.
The only overlap between how we fetched streams and
users was to share some really complicated data
structures.
We can also short-circuit some logic if a message
batch is either all-stream or all-DM.
We restrict the columns, avoid quadratic looping,
and don't bother with order_by.
We also return the user ids (per recipient) as
sets, since that's how the only caller uses the
info (albeit implicitly via set.union accepting
a list).
It’s unclear what was supposed to be “safe” about this wrapper. The
hashlib API is fine without it, and we don’t want to encourage further
use of SHA-1.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
- Adds `message_handle_match` function to handle new pattern for
relative help links to "Drafts" and "Scheduled messages" for logged-in
users: `{relative|message|drafts}` and `{relative|message|scheduled}`.
`remove_denormalized_recipient_column_from_data` removes the
`recipient` data from `zerver_userprofile`, but did not remove it from
`zerver_userprofile_mirrordummy`, which was later appended to the list
of `zerver_userprofile` objects. This led to failure when inserting,
as the mirrordummy objects still tried to reference their previous
`recipient_id`s.
Move the merging of the two sets earlier, before we call
`remove_denormalized_recipient_column_from_data`.
If there are two huddles, with users A + B + C + D and A + B + C, and
user D is deleted, it is replaced with a mirrordummy user. If
mirrordummy subscriptions are not included in exports, then the two
huddles have duplicate member sets, and will not be able to be
imported successfully.
Include huddle subscriptions for mirrordummy users in exports.
The long-term idle topic participants are soft-reactivated
after email/push notifications are sent due to @topic mention.
The reason being that, generally, @topic mentions are going to
reach a small set of users who have a decent chance of being
reactivated by the notifications.
This commit completes the notifications part of the @topic
wildcard mention feature.
Notifications are sent to the topic participants for the
@topic wildcard mention.
The previous function was poorly named, asked for a
Realm object when realm_id sufficed, and returned a
tuple of strings that had different semantics.
I also avoid calling it duplicate times in a couple
places, although it was probably rarely the case that
both invocations actually happened if upstream
validations were working.
Note that there is a TypedDict called EmojiInfo, so I
chose EmojiData here. Perhaps a better name would be
TinyEmojiData or something.
I also simplify the reaction tests with a verify
helper.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.
Cache misses should be very infrequent per realm.
If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.
I also renamed the two related functions. I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.
Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
This migration is reasonably complex because of various anomalies in existing
data.
Note that there are cases when extra_data does not contain data that is
proper json with possibly single quotes. Thus we need to use
"ast.literal_eval" to cover that.
There is also a special case for "event_type == USER_FULL_NAME_CHANGED",
where extra_data is a plain str. This event_type is only used for
RealmAuditLog, so the zilencer migration script does not need to handle
it.
The migration does not handle "event_type == REALM_DISCOUNT_CHANGED"
because ast.literal_eval only allow Python literals. We expect the admin
to populate the jsonified extra_data for extra_data_json manually
beforehand.
This chunks the backfilling migration to reduce potential block time.
The migration for zilencer is mostly similar to the one for zerver; except that
the backfill helper is added in a wrapper and unrelated events are
removed.
**Logging and error recovery**
We print out a warning when the extra_data_json field of an entry
would have been overwritten by a value inconsistent with what we derived
from extra_data. Usually this only happens when the extra_data was
corrupted before this migration. This prevents data loss by backing up
possibly corrupted data in extra_data_json with the keys
"inconsistent_old_extra_data" and "inconsistent_old_extra_data_json".
More roundtrips to the database are needed for inconsistent data, which are
expected to be infrequent.
This also outputs messages when there are audit log entries with decimals,
indicating that such entries are not backfilled. Do note that audit log
entries with decimals are not populated with "inconsistent_old_extra_data_*"
in the JSONField, because they are not overwritten.
For such audit log entries with "extra_data_json" marked as inconsistent,
we skip them in the migration. Because when we have discovered anomalies in a
previous run, there is no need to overwrite them again nesting the extra keys
we added to it.
**Testing**
We create a migration test case utilizing the property of bulk_create
that it doesn't call our modified save method.
We extend ZulipTestCase to support verifying console output at the test
case level. The implementation is crude but the use case should be rare
enough that we don't need it to be too elaborate.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This tracks user group membership changes when the realm is first set
up, either through an import or not. This happens when we add users to
the system user groups by their roles.
For an imported realm, we do extra handling when the data doesn't include
user groups. This gets audited as well.
Django seems to have an aggressive check on the type of a field when
setting it through an relation, requiring the argument to be a UserGroup in
our case.
Reference:
02966a30dd/django/db/models/base.py (L537-L546)
The MissedMessage queue worker is the single callsite of
`handle_missedmessage_emails`, which immediately transforms the list
of events into a dict keyed by message-id.
Skip the intermediate list step, and use defaultdict and a dataclass
to simplify and make explicit the pieces. This removes the unused
user_profile_id and message_id pieces of the data structure.
This commit adds a boolean field `mentions_topic_wildcard`
to the `MessageRenderingResult` dataclass.
The field is set to true only if message rendering determines
the message has an actual topic wildcard mention in it (and not,
e.g., topic wildcard mention syntax inside a code block).
The rendered content for topic wildcard mention is
'<span class="topic-mention">{wildcard}</span>'.
The 'topic-mention' class is the identifier for the wildcard
mention being a topic wildcard mention.
We don't use 'data-user-id="*"' and "user-mention" class for
topic wildcard mentions and eventually plan to remove them for
stream wildcard mentions too in a separate mini-project.
This commit adds the 'topic_wildcard_mention_user_ids' and
'topic_wildcard_mention_in_followed_topic_user_ids'
attributes to the 'RecipientInfoResult' dataclass.
Only topic participants are notified of @topic mentions.
Topic participants are anyone who sent a message to a topic
or reacted to a message on the topic.
'topic_wildcard_mention_in_followed_topic_user_ids' stores the
ids of the topic participants who follow the topic and have
enabled the wildcard mention notifications for followed topics.
'topic_wildcard_mention_user_ids' stores the ids of the topic
participants for whom 'user_allows_notifications_in_StreamTopic'
with setting 'wildcard_mentions_notify' returns True.
This commit adds a 'has_topic_wildcards' instance variable
to the 'MentionData' class for the detection of
- possible topic wildcards mentions.
Fixes part of #22829.
Co-authored-by: Prakhar Pratyush <prakhar841301@gmail.com>
Co-authored-by: orientor <aditya.verma@students.iiit.ac.in>
This includes changing the URL to #settings/preferences, with a
transparent redirect so that existing links, like the one from Welcome
Bot, continue to work.