Commit Graph

75 Commits

Author SHA1 Message Date
Alex Vandiver ca2ca030d2 migrations: Backfill missing RealmAuditLog entries for subscriptions.
Backfill subscription realm audit log SUBSCRIPTION_CREATED events for
users which are currently subscribed but don't have any subscription
events, presumably due to some historical bug.  This is important
because those rows are necessary when reactivating a user who is
currently soft-deactivated.

For each stream, we find the subscribed users who have no
subscription-related realm audit log entries, and create a
`backfill=True` subscription audit log entry which is the latest it
could have been, based on UserMessage rows.  We then optionally insert
a `DEACTIVATION` if the current subscription is not active.
2023-05-15 16:09:44 -07:00
Alex Vandiver 4f2417cfc4 soft_reactivation: Add a partial index to speed up event lookups.
The full auditlog table is moderately large, and the previously-chosen
index (on `modified_user_id`) is not terribly specific.
2023-04-28 12:43:34 -07:00
Alex Vandiver a56da4be76 soft_deactivation: Only fetch necessary columns.
Existing tests verify that this does not add more queries.
2023-04-28 12:43:34 -07:00
Alex Vandiver ae7485a96e soft_deactivation: Do not bother to fetch stream data as well.
This prefetch is unnecessary and makes this query load more data than
needed.

Existing tests verify that this does not add more queries.
2023-04-28 12:43:34 -07:00
Anders Kaseorg d3efd4c095 python: Import F, Q, QuerySet from their canonical module.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-05 14:46:28 -08:00
Anders Kaseorg df001db1a9 black: Reformat with Black 23.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.

(This does not actually upgrade our Python environment to Black 23
yet.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-02 10:40:13 -08:00
Anders Kaseorg b0e569f07c ruff: Fix SIM102 nested `if` statements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-23 11:18:36 -08:00
Zixuan James Li 27af5865b0 soft_deactivation: Tighten function signatures with generic QuerySet.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-07 11:28:13 -07:00
Zixuan James Li ab1bbdda65 typing: Broaden type annotations for QuerySet compatibility.
To explain the rationale of this change, for example, there is
`get_user_activity_summary` which accepts either a `Collection[UserActivity]`,
where `QuerySet[T]` is not strictly `Sequence[T]` because its slicing behavior
is different from the `Protocol`, making `Collection` necessary.

Similarily, we should have `Iterable[T]` instead of `List[T]` so that
`QuerySet[T]` will also be an acceptable subtype, or `Sequence[T]` when we
also expect it to be indexed.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-07 11:27:42 -07:00
Zixuan James Li 5285fbb4d0 typing: Fix type annotation for missing messages in soft reactivation.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-06-28 16:05:24 -07:00
Zixuan James Li 3e95b59f2e soft_deactivation: Add a helper for queuing soft_reactivation.
Signed-off-by: Zixuan James Li <359101898@qq.com>
2022-05-27 14:28:52 -07:00
Zixuan James Li a8fd9eb701 email_notifications: Soft reactivate mentioned users.
Signed-off-by: Zixuan James Li <359101898@qq.com>
2022-04-27 16:43:54 -07:00
Alex Vandiver 6b6dcf6ce1 soft_deactivate: Handle multiple SUBSCRIPTION_DEACTIVATEDs.
Race conditions in stream unsubscription may lead to multiple
back-to-back SUBSCRIPTION_DEACTIVATED RealmAuditLog entries for the
same stream.  The current logic constructs duplicate UserMessage
entries for such, which then later fail to insert.

Keep a set of message-ids that have been prep'd to be inserted, so
that we don't duplicate them if there is a duplicated
SUBSCRIPTION_DEACTIVATED row.  This also renames the `message` local
variable, which otherwise overrode the `message` argument of a
different type.
2021-11-10 12:19:25 -08:00
PIG208 7d1c475f69 typing: Use assertions for function arguments.
Utilize the assert_is_not_None helper to eliminate errors of
'Argument x to "Foo" has incompatible type "Optional[Bar]"...'
2021-07-26 14:48:45 -07:00
PIG208 66b1a4e7ca backend: Add None-checks with assertions and if-elses.
This fixes a batch of mypy errors of the following format:
'Item "None" of "Optional[Something]" has no attribute "abc"'
2021-07-24 15:00:21 -07:00
Anders Kaseorg 544bbd5398 docs: Fix capitalization mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-10 09:57:26 -07:00
Mateusz Mandera 1a8ad796f8 models: Replace __id syntax with _id where possible.
model__id syntax implies needing a JOIN on the model table to fetch the
id. That's usually redundant, because the first table in the query
simply has a 'model_id' column, so the id can be fetched directly.
Django is actually smart enough to not do those redundant joins, but we
should still avoid this misguided syntax.

The exceptions are ManytoMany fields and queries doing a backward
relationship lookup. If "streams" is a many-to-many relationship, then
streams_id is invalid - streams__id syntax is needed. If "y" is a
foreign fields from X to Y:
class X:
  y = models.ForeignKey(Y)

then object x of class X has the field x.y_id, but y of class Y doesn't
have y.x_id. Thus Y queries need to be done like
Y.objects.filter(x__id__in=some_list)
2021-04-22 14:53:00 -07:00
Alex Vandiver 11177a40da soft_deactivate: Log and continue on failure to catch up a user.
There exists a logic bug (see #18236) which causes duplicate
usermessage rows to be inserted.  Currently, this stops catch-up for
all users.

Catch and record the exception for each affected user, so we at least
make catch-up progress on other users.
2021-04-22 14:38:03 -07:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Tim Abbott 3242fc7388 soft_deactivation: Fix typo in logging output. 2020-09-28 12:12:04 -07:00
palash 0c18113910 soft_deactivation: Change root logger to zulip.soft_deactivation.
Update logger in the following files using this logger:
test_soft_deactivation, test_home, test_push_notifications
2020-09-28 12:12:00 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Tim Abbott 1c5aa10147 soft_deactivation: Fix buggy error handling.
There is no such thing as `max()` on the manager object.  We meant
.last().

Introduced in 37189e1f9d, so my bug, in
a rare untested code path.
2020-05-06 10:46:54 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Tim Abbott 8f50062e49 soft_deactivation: Fix incorrect logging function.
Using logging.info() rather than logger.info() meant that our
zulip.soft_deactivation logger configuration (which, in particular,
included not logging to the console) was not active on this log line,
resulting in the `manage.py soft_deactivate_users` cron job sending
emails every time it ran.

Fixes #13750.
2020-01-28 17:17:43 -08:00
Tim Abbott b14b18b76f soft_deactivation: Remove 'email' from logging.
The value wasn't correct with EMAIL_ADDRESS_VISIBILITY_ADMINS, and in
any case we really just need the user ID.
2019-11-15 17:06:51 -08:00
Tim Abbott ddd3a36536 soft deactivation: Remove useless conditional.
Due to my misreading the code and a sloppy search, I thought in
8218bf101c that
all_stream_subscription_logs didn't filter for streams.

While changing this, we'll switch to using `.modified_stream_id` for
potentially better performance.
2019-05-08 14:40:33 -07:00
Tim Abbott 3ecdabdc77 soft_deactivation: Add temporary nocoverage to fix CI. 2019-05-06 16:14:31 -07:00
Tim Abbott 1af4f8fe20 soft_deactivation: Add some explanatory comments.
This function still doesn't make sense in the way I'd like it to, but
this at least records what algorithm we're trying to implement.
2019-05-05 18:33:15 -07:00
Tim Abbott eb97e9fae0 soft_deactivation: Fix buggy handling of race condition.
If a soft deactivated user had a subscription double-toggled without
any new messages being sent in between, add_missing_messages might
incorrectly process those two subscription changes in the wrong order.

Fortunately, the failure mode was usually to throw this exception:

django.db.utils.IntegrityError: duplicate key value violates unique
constraint "zerver_usermessage_user_profile_id_message_id_4936d0df_uniq"
DETAIL:  Key (user_profile_id, message_id)=(4, 57) already exists.

Our unit tests actually had this precise setup some fraction of the
time, because a bit of the test setup code subscribed+unsubscribed the
target user without sending any messages in between, resulting in a
test failure something like 50% of the time.

The original exception was hard to reproduce reliably originally
(resulting in an extremely annoying nondetermnistic test failure), but
is easily reproducible by changing the "id" to "-id" in this change to
always mis-order the processing of those RealmAuditLog events.
2019-05-05 18:29:20 -07:00
Tim Abbott 8218bf101c soft_deactivation: Fix buggy handling of other streams.
Previously, our soft-deactivation logic incorrectly did not filter the
set of stream subscription changes to look at to only include the
target stream.

This could result in unspecified buggy behavior.
2019-05-05 18:29:20 -07:00
Tim Abbott 8e995deab5 soft_deactivation: Make the stream_messages mutation logic clearer.
This changes our code from something that's pretty nasty bad behavior
mutating objects during a loop to only somewhat bad behavior.
2019-05-05 18:29:20 -07:00
Tim Abbott a1d2b73790 soft_deactivation: Clarify loop logic around stream_messages.
Break will do the same thing as continue here, as each iteration will
have the same result, and it's also worth explaining why this isn't
one layer up in the loop setup.
2019-05-05 18:29:20 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Puneeth Chaganti d75d2c9974 soft-deactivation: Run catch-up when "auto" deactivate is run.
When soft deactivation is run for in "auto" mode (no emails are
specified and all users inactive for specified number of days are
deactivated), catch-up is also run in the "auto" mode if
AUTO_CATCH_UP_SOFT_DEACTIVATED_USERS is True.

Automatically catching up soft-deactivated users periodically would
ensure a good user experience for returning users, but on some servers
we may want to turn off this option to save on some disk space.

Fixes #8858, at least for the default configuration, by eliminating
the situation where there are a very large number of messages to recover.
2019-03-14 11:53:15 -07:00
Puneeth Chaganti cf65136002 soft-deactivation: Add code to catch up soft deactivated users. 2019-03-13 17:23:14 -07:00
Puneeth Chaganti 52afbe5e8d soft-deactivation: Rename maybe_catch_up_soft_deactivated_user.
Rename `maybe_catch_up_soft_deactivated_user` to
`reactivate_user_if_soft_deactivated`.
2019-03-13 17:16:22 -07:00
Puneeth Chaganti 82d9789d93 soft-deactivation: Paginate bulk creation of UserMessage rows.
A user who has been soft deactivated for a long time might have 10Ks of message
history that was "soft deactivated". It might take a minute or more to add
UserMessage rows for all of these messages, causing timeouts. So, we paginate
the creation of these UserMessage rows.
2019-03-13 17:16:22 -07:00
Puneeth Chaganti 8c5a0f585b soft-deactivation: Clarify that value being fetched is recipient_id. 2019-03-13 17:16:22 -07:00
Puneeth Chaganti 4d77ffe2cb soft-deactivation: Refactor fetching list of existing UserMessages. 2019-03-13 17:16:22 -07:00
Puneeth Chaganti 8c5425e33a soft-deactivation: Remove unnecessary select_related calls.
It will become more clear in the upcoming commits, but these calls
aren't actually useful as we're only extracting a few IDs.
2019-03-13 17:15:50 -07:00
Tim Abbott 051bbf9f35 docs: Add some more links to new soft deactivation documentation. 2019-03-07 17:48:54 -08:00
Anders Kaseorg f0ecb93515 zerver core: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:41:24 -08:00
Tim Abbott 37189e1f9d soft deactivation: Handle case where a user has no message history.
I'm aware of at least one case where this happened with some imported
data history; better to not have that crash.
2018-12-16 18:52:20 -08:00
Tim Abbott f47f263655 soft deactivation: Avoid giant transaction.
The previous logic for soft deactivation ended up doing a giant
transaction in the case that there were thousands of users to
deactivate; this was messy and potentially buggy.

The batched transactions were useful for RealmAuditLog management,
however.  So the right solution is to do reasonably sized batches
(e.g. 100 users).
2018-12-16 18:52:19 -08:00