Uploads are well-positioned to use S3's "intelligent tiering" storage
class. Add a setting to let uploaded files to declare their desired
storage class at upload time, and document how to move existing files
to the same storage class.
‘blocklist’ was added in 0.0.35 (with backwards compatibility for the
old name), and type annotations were added in 0.0.91 (with only the
new name).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This code removes a lot of complexity with very likely
positive overall impact on system performance and
negligible downside.
We already cache display recipients on a per-user
level, so there's no need for another cache layer on
top of that that keys them with recipient ids.
We avoid strange things where Alice/Bob and Bob/Charlie
get put into the top layer cache and then we still have
a cache miss on Alice/Charlie despite the lower level
cache being able to support per-user lookups.
This change does introduce an extra database round trip
if any of our messages have a huddle, but the query is
extremely cheap, and we can always try to cache that
function more directly or try to re-use some of our
other huddle-based caches.
As part of this, we clean up the names for the
lower-level per-user cache of display recipients, and
we simplify the cache keys.
We also stop passing in a full Recipient object to the
`bulk_get_huddle_user_ids` functions.
The local impact of this change should be easy to
measure (at least approximately), since we use this
function every time a user gets messages via the
/messages endpoint.
The only overlap between how we fetched streams and
users was to share some really complicated data
structures.
We can also short-circuit some logic if a message
batch is either all-stream or all-DM.
We restrict the columns, avoid quadratic looping,
and don't bother with order_by.
We also return the user ids (per recipient) as
sets, since that's how the only caller uses the
info (albeit implicitly via set.union accepting
a list).
It’s unclear what was supposed to be “safe” about this wrapper. The
hashlib API is fine without it, and we don’t want to encourage further
use of SHA-1.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
- Adds `message_handle_match` function to handle new pattern for
relative help links to "Drafts" and "Scheduled messages" for logged-in
users: `{relative|message|drafts}` and `{relative|message|scheduled}`.
`remove_denormalized_recipient_column_from_data` removes the
`recipient` data from `zerver_userprofile`, but did not remove it from
`zerver_userprofile_mirrordummy`, which was later appended to the list
of `zerver_userprofile` objects. This led to failure when inserting,
as the mirrordummy objects still tried to reference their previous
`recipient_id`s.
Move the merging of the two sets earlier, before we call
`remove_denormalized_recipient_column_from_data`.
If there are two huddles, with users A + B + C + D and A + B + C, and
user D is deleted, it is replaced with a mirrordummy user. If
mirrordummy subscriptions are not included in exports, then the two
huddles have duplicate member sets, and will not be able to be
imported successfully.
Include huddle subscriptions for mirrordummy users in exports.
Fundamentally, we should take a write lock on the message, check its
validity for a change, and then make and commit that change.
Previously, `check_update_message` did not operate in a transaction,
but `do_update_message` did -- which led to the ordering:
- `check_update_message` reads Message, not in a transaction
- `check_update_message` verifies properties of the Message
- `do_update_message` starts a transaction
- `do_update_message` takes a read lock on UserMessage
- `do_update_message` writes on UserMessage
- `do_update_message` writes Message
- `do_update_message` commits
This leads to race conditions, where the `check_update_message` may
have verified based on stale data, and `do_update_message` may
improperly overwrite it; as well as deadlocks, where
other (properly-written) codepaths take a write lock on Message
_before_ updating UserMessage, and thus deadlock with
`do_update_message`.
Change `check_update_message` to open a transaction, and take the
write lock when first accessing the Message row. We update the
comment above `do_update_message` to clarify this expectation.
The new ordering is thus:
- `check_update_message` starts a transaction
- `check_update_message` takes a write lock on Message
- `check_update_message` verifies properties of the Message
- `do_update_message` writes on UserMessage
- `do_update_message` writes Message
- `check_update_message` commits
The long-term idle topic participants are soft-reactivated
after email/push notifications are sent due to @topic mention.
The reason being that, generally, @topic mentions are going to
reach a small set of users who have a decent chance of being
reactivated by the notifications.
This commit completes the notifications part of the @topic
wildcard mention feature.
Notifications are sent to the topic participants for the
@topic wildcard mention.
The previous function was poorly named, asked for a
Realm object when realm_id sufficed, and returned a
tuple of strings that had different semantics.
I also avoid calling it duplicate times in a couple
places, although it was probably rarely the case that
both invocations actually happened if upstream
validations were working.
Note that there is a TypedDict called EmojiInfo, so I
chose EmojiData here. Perhaps a better name would be
TinyEmojiData or something.
I also simplify the reaction tests with a verify
helper.
The active realm emoji are just a subset of all your
realm emoji, so just use a single cache entry per
realm.
Cache misses should be very infrequent per realm.
If a realm has lots of deactivated realm emoji, then
there's a minor expense to deserialize them, but that
is gonna be dwarfed by all the other more expensive
operations in message-send.
I also renamed the two related functions. I erred on
the side of using somewhat verbose names, as we don't
want folks to confuse the two use cases. Fortunately
there are somewhat natural affordances to use one or
the other, and mypy helps too.
Finally, I use realm_id instead of realm in places
where we don't need the full Realm object.
This migration is reasonably complex because of various anomalies in existing
data.
Note that there are cases when extra_data does not contain data that is
proper json with possibly single quotes. Thus we need to use
"ast.literal_eval" to cover that.
There is also a special case for "event_type == USER_FULL_NAME_CHANGED",
where extra_data is a plain str. This event_type is only used for
RealmAuditLog, so the zilencer migration script does not need to handle
it.
The migration does not handle "event_type == REALM_DISCOUNT_CHANGED"
because ast.literal_eval only allow Python literals. We expect the admin
to populate the jsonified extra_data for extra_data_json manually
beforehand.
This chunks the backfilling migration to reduce potential block time.
The migration for zilencer is mostly similar to the one for zerver; except that
the backfill helper is added in a wrapper and unrelated events are
removed.
**Logging and error recovery**
We print out a warning when the extra_data_json field of an entry
would have been overwritten by a value inconsistent with what we derived
from extra_data. Usually this only happens when the extra_data was
corrupted before this migration. This prevents data loss by backing up
possibly corrupted data in extra_data_json with the keys
"inconsistent_old_extra_data" and "inconsistent_old_extra_data_json".
More roundtrips to the database are needed for inconsistent data, which are
expected to be infrequent.
This also outputs messages when there are audit log entries with decimals,
indicating that such entries are not backfilled. Do note that audit log
entries with decimals are not populated with "inconsistent_old_extra_data_*"
in the JSONField, because they are not overwritten.
For such audit log entries with "extra_data_json" marked as inconsistent,
we skip them in the migration. Because when we have discovered anomalies in a
previous run, there is no need to overwrite them again nesting the extra keys
we added to it.
**Testing**
We create a migration test case utilizing the property of bulk_create
that it doesn't call our modified save method.
We extend ZulipTestCase to support verifying console output at the test
case level. The implementation is crude but the use case should be rare
enough that we don't need it to be too elaborate.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
We did not send the stream creation events when subscribing
guests to public streams while we do send them when subscribing
non-admin users to private streams.
This commit adds code to send the stream creation events when
subscribing guests to public streams, so the clients can know
that the stream exists and fixes the bug where client tries
to process a subscription add event for a stream which it does
not know about.
This commit updates description for stream creation event
to mention that the event is also sent when user gains
access to a stream either due to being subscribed to it
or if a private stream is made public.
This tracks user group membership changes when the realm is first set
up, either through an import or not. This happens when we add users to
the system user groups by their roles.
For an imported realm, we do extra handling when the data doesn't include
user groups. This gets audited as well.
Django seems to have an aggressive check on the type of a field when
setting it through an relation, requiring the argument to be a UserGroup in
our case.
Reference:
02966a30dd/django/db/models/base.py (L537-L546)
The MissedMessage queue worker is the single callsite of
`handle_missedmessage_emails`, which immediately transforms the list
of events into a dict keyed by message-id.
Skip the intermediate list step, and use defaultdict and a dataclass
to simplify and make explicit the pieces. This removes the unused
user_profile_id and message_id pieces of the data structure.
This commit adds a boolean field `mentions_topic_wildcard`
to the `MessageRenderingResult` dataclass.
The field is set to true only if message rendering determines
the message has an actual topic wildcard mention in it (and not,
e.g., topic wildcard mention syntax inside a code block).
The rendered content for topic wildcard mention is
'<span class="topic-mention">{wildcard}</span>'.
The 'topic-mention' class is the identifier for the wildcard
mention being a topic wildcard mention.
We don't use 'data-user-id="*"' and "user-mention" class for
topic wildcard mentions and eventually plan to remove them for
stream wildcard mentions too in a separate mini-project.
This prep commit merges separate tests for '**@all**',
'**@stream**' and '**@everyone**' stream wildcard mentions
into a single test named 'test_mention_stream_wildcard'.
Similarly, it merges separate tests for '@all', '@stream',
and '@everyone' stream wildcard mentions into a single test
named 'test_mention_at_stream_wildcard'.
The aim is to finally have two separate tests for stream and
topic wildcard mentions (when we introduce topic wildcards)
instead of having separate tests for each mention text
(i.e. all, everyone, stream, topic).
This commit adds the 'topic_wildcard_mention_user_ids' and
'topic_wildcard_mention_in_followed_topic_user_ids'
attributes to the 'RecipientInfoResult' dataclass.
Only topic participants are notified of @topic mentions.
Topic participants are anyone who sent a message to a topic
or reacted to a message on the topic.
'topic_wildcard_mention_in_followed_topic_user_ids' stores the
ids of the topic participants who follow the topic and have
enabled the wildcard mention notifications for followed topics.
'topic_wildcard_mention_user_ids' stores the ids of the topic
participants for whom 'user_allows_notifications_in_StreamTopic'
with setting 'wildcard_mentions_notify' returns True.
This commit adds a 'has_topic_wildcards' instance variable
to the 'MentionData' class for the detection of
- possible topic wildcards mentions.
Fixes part of #22829.
Co-authored-by: Prakhar Pratyush <prakhar841301@gmail.com>
Co-authored-by: orientor <aditya.verma@students.iiit.ac.in>
This commit updates the existing tests in 'test_email_notifications'
and 'test_push_notifications' to properly configure user settings
and visibility policies before running the actual tests.
Earlier, the tests were passing, but the corner case expected
to be covered wasn't covered.
This should have been included in
d80779435a.
These comments should not have been included in
a8fd9eb701.
We covered the case "Private message should soft reactivate
the user" earlier in the test. So the comment was rightly added
there.
During stream wildcard or group mention, no such personal mention
is involved; hence, the comments are not needed.
This is a prep commit to replace 'wildcard' with 'stream_wildcard'.
This wasn't included in 179d5cb because we didn't decide to
use a different rendered_content for topic wildcard mention,
i.e., ''<span class="user-mention topic-mention">{wildcard}</span>'.
Our intention was not to create separate tests for both stream
and topic wildcard mentions, as they were expected to have the
same rendered content format.
Previously this limit was 1 week, which was fine for busy
organizations, but for organizations that send a few messages a week,
or have occasional bursts of activity but the last one was a few weeks
ago, this should give a significantly better new user experience.
There are still caps like 1000 messages total and 20
unread, but we're a bit more flexible about time.
This includes changing the URL to #settings/preferences, with a
transparent redirect so that existing links, like the one from Welcome
Bot, continue to work.
Pass the HttpRequest explicitly through the two webhooks that log to
the webhook loggers.
get_current_request is now unused, so remove it (in the same commit
for test coverage reasons).
Signed-off-by: Anders Kaseorg <anders@zulip.com>