Commit Graph

149 Commits

Author SHA1 Message Date
Zixuan James Li 30495cec58 migration: Rename extra_data_json to extra_data in audit log models.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.

- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.

- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.

- We remove None-checks for extra_data because it is now no longer
nullable.

- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.

- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.

- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.

Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-08-16 17:18:14 -07:00
Anders Kaseorg 143baa4243 python: Convert translated positional {} fields to {named} fields.
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-18 15:19:07 -07:00
Zixuan Li e39e04c3ce
migration: Add `extra_data_json` for audit log models.
Note that we use the DjangoJSONEncoder so that we have builtin support
for parsing Decimal and datetime.

During this intermediate state, the migration that creates
extra_data_json field has been run. We prepare for running the backfilling
migration that populates extra_data_json from extra_data.

This change implements double-write, which is important to keep the
state of extra data consistent. For most extra_data usage, this is
handled by the overriden `save` method on `AbstractRealmAuditLog`, where
we either generates extra_data_json using orjson.loads or
ast.literal_eval.

While backfilling ensures that old realm audit log entries have
extra_data_json populated, double-write ensures that any new entries
generated will also have extra_data_json set. So that we can then safely
rename extra_data_json to extra_data while ensuring the non-nullable
invariant.

For completeness, we additionally set RealmAuditLog.NEW_VALUE for
the USER_FULL_NAME_CHANGED event. This cannot be handled with the
overridden `save`.

This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795

Note that extra_data_json at this point is not used yet. So the test
cases do not need to switch to testing extra_data_json. This is later
done after we rename extra_data_json to extra_data.

Double-write for the remote server audit logs is special, because we only
get the dumped bytes from an external source. Luckily, none of the
payload carries extra_data that is not generated using orjson.dumps for
audit logs of event types in SYNC_BILLING_EVENTS. This can be verified
by looking at:

`git grep -A 6 -E "event_type=.*(USER_CREATED|USER_ACTIVATED|USER_DEACTIVATED|USER_REACTIVATED|USER_ROLE_CHANGED|REALM_DEACTIVATED|REALM_REACTIVATED)"`

Therefore, we just need to populate extra_data_json doing an
orjson.loads call after a None-check.

Co-authored-by: Zixuan James Li <p359101898@gmail.com>
2023-06-07 12:14:43 -07:00
Zixuan James Li 28ec7baaef zilencer: Make analytics bouncer forward-compatible with JSONField.
This adds support to accepting extra_data being dict from remote
servers' RealmAuditLog entries. So that it is forward-compatible with
servers that have migrated to use JSONField for RealmAuditLog just in
case. This prepares us for migrating zilencer's audit log models to use
JSONField for extra_data.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-06-05 17:38:10 -07:00
Mateusz Mandera 2a45429a51 zilencer: Delete duplicate remote push registrations.
This fixes existing instances of the bug fixed in the previous commit.

Fixes #24969.
2023-04-13 15:17:20 -07:00
Mateusz Mandera ade2225f08 zilencer: Avoid creating duplicate remote push registrations.
Servers that had upgraded from a Zulip server version that did not yet
support the user_uuid field to one that did could end up with some
mobile devices having two push notifications registrations, one with a
user_id and the other with a user_uuid.

Fix this issue by sending both user_id and user_uuid, and clearing
2023-04-13 15:17:20 -07:00
Prakhar Pratyush e45623fccc python: Update tuple handling pattern; returned by a delete() query.
This commit updates the pattern for dealing with tuples
returned by the delete() query.

The '(num_deleted, ignored) = ModelName.objects.filter().delete()'
pattern is preferred due to better readability.

We avoid the pattern '(num_deleted, _)' because Django uses _
for translation, which may lead to future bugs.
2023-03-27 16:18:23 -07:00
Anders Kaseorg bd884c88ed Fix typos caught by typos.
https://github.com/crate-ci/typos

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-03 11:09:50 -08:00
Anders Kaseorg 69e94b5991 ruff: Fix C413 Unnecessary `list` call around `sorted()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-03 12:10:15 -07:00
Zixuan James Li eae3e1c3cc zilencer: Tighten type annotations of views.
`remote_server_path` allows us to get rid of all the `validate_entity`
calls in `zilencer.views` and remove all the `Union` type annotations
in the signatures of the authenticated view functions.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-08-13 14:53:52 -07:00
Zixuan James Li af88417847 decorator: Extract validate_remote_server.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-08-13 14:33:59 -07:00
Anders Kaseorg 2b1b070fda zilencer: Check remote server API keys with constant-time comparison.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-08-09 16:02:37 -07:00
Zixuan James Li b7bb30f3cb zilencer: Avoid redefinition of row_objects.
Mypy disallows redefinition of a variable with different types.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-07-15 14:00:56 -07:00
Mateusz Mandera 0677c90170 zilencer: Change push bouncer API to accept uuids as user identifier.
This is the first step to making the full switch to self-hosted servers
use user uuids, per issue #18017. The old id format is still supported
of course, for backward compatibility.

This commit is separate in order to allow deploying *just* the bouncer
API change to production first.
2022-03-14 17:47:30 -07:00
Alex Vandiver f531f3a27f push_notifications: Drop FCM retries to 2, not 10.
This reverts bc15085098 (which provided
not justification for its change) and moves further, down to 2 retries
from the default of 5.

10 retries, with exponential backoff, is equivalent to sleeping 2^11
seconds, or just about 34 minutes (though the code uses a jitter which
may make this up to 51 minutes).  This is an unreasonable amount of
time to spend in this codepath -- as only one worker is used, and it
is single-threaded, this could effectively block all missed message
notifications for half an hour or longer.

This is also necessary because messages sent through the push bouncer
are sent synchronously; the sending server uses a 30-second timeout,
set in PushBouncerSession.  Having retries which linger longer than
this can cause duplicate messages; the sending server will time out
and re-queue the message in RabbitMQ, while the push bouncer's request
will continue, and may succeed.

Limit to 2 retries (APNS currently uses 3), and results expected max
of 4 seconds of sleep, potentially up to 6.  If this fails, there
exists another retry loop above it, at the RabbitMQ layer (either
locally, or via the remote server's queue), which will result in up to
3 additional retries -- all told, the request will me made to FCM up
to 12 times.
2022-03-08 12:52:58 -08:00
Lauryn Menard 3be622ffa7 backend: Add request as parameter to json_success.
Adds request as a parameter to json_success as a refactor towards
making `ignored_parameters_unsupported` functionality available
for all API endpoints.

Also, removes any data parameters that are an empty dict or
a dict with the generic success response values.
2022-02-04 15:16:56 -08:00
Eeshan Garg 3bc0f8c6f9 zilencer: Add endpoint for deactivating remote server registration. 2022-01-21 14:57:04 -08:00
Alex Vandiver 1b395b6403 zilencer: Truncate APNS notifications correctly.
APNs payloads nest the zulip-custom data further than the top level,
as Android notifications do.  This led to APNs data silently never
being truncated; this case was not caught in tests because the mocks
provided the wrong data for the APNs structure.

Adjust to look in the appropriate place within the APNs data, and
truncate that.
2022-01-03 15:24:16 -08:00
Eeshan Garg 4cc35c339b migrations: Backfill audit log entries for remote server creation.
This is a follow-up to #20408.
2022-01-03 12:58:00 -08:00
Mateusz Mandera 4153b5c517 remote_server: Improve uuid validation at the server/register endpoint.
As explained in the comments in the code, just doing UUID(string) and
catching ValueError is not enough, because the uuid library sometimes
tries to modify the string to convert it into a valid UUID:

>>> a = '18cedb98-5222-5f34-50a9-fc418e1ba972'
>>> uuid.UUID(a, version=4)
UUID('18cedb98-5222-4f34-90a9-fc418e1ba972')
2021-12-31 11:18:01 -08:00
Mateusz Mandera e48120fd12 remote_server: Validate zulip_org_id submitted by registering server.
zulip_org_id is supposed to be a UUID, so we want to actually validate
the format, not only check the length.
2021-12-28 10:11:34 -08:00
Alex Vandiver 6c14978cd1 zilencer: Truncate "remove" notifications from remote servers.
This is 4d055a6695, but for notifications which are received from
remote hosts.
2021-11-10 13:39:35 -08:00
Alex Vandiver 111ee64e36 push_notifications: Pass down the remote server and user-id for logs.
This makes logging more consistent between FCM and APNs codepaths, and
makes clear which user-ids are for local users, and which are opaque
integers namespaced from some remote zulip server.
2021-10-19 22:04:24 -07:00
Alex Vandiver 5bcd3c01cb push_notifications: Add log line with user-id, UUID, and devices.
Being able to determine how many distinct users are getting push
notifications per remote host is useful, as is the distribution of
device counts.  This parallels the log line in
handle_push_notification for push notifications from local realms,
handled via the event queue.
2021-10-19 22:04:24 -07:00
Mateusz Mandera 0af7c84c99 push_notifs: Log the number of devices notification was sent to. 2021-09-29 15:50:06 -07:00
PIG208 dcbb2a78ca python: Migrate most json_error => JsonableError.
JsonableError has two major benefits over json_error:
* It can be raised from anywhere in the codebase, rather than
  being a return value, which is much more convenient for refactoring,
  as one doesn't potentially need to change error handling style when
  extracting a bit of view code to a function.
* It is guaranteed to contain the `code` property, which is helpful
  for API consistency.

Various stragglers are not updated because JsonableError requires
subclassing in order to specify custom data or HTTP status codes.
2021-06-30 16:22:38 -07:00
Anders Kaseorg e7ed907cf6 python: Convert deprecated Django ugettext alias to gettext.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-15 18:01:34 -07:00
Anders Kaseorg f0e655f1d8 request: Rename validator parameter of REQ to json_validator.
This makes it much more clear that this feature does JSON encoding,
which previously was only indicated in the documentation.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-07 14:13:06 -07:00
Anders Kaseorg 93d2ae8092 request: Remove redundant str_validator=check_string from REQ().
REQ(str_validator=check_string) is equivalent to the default behavior
of REQ().

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-04-07 14:13:03 -07:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Hashir Sarwar b885678881 push_notifications: Simplify `if device exists` checks. 2020-08-31 17:31:41 -07:00
Anders Kaseorg f364d06fb5 python: Convert percent formatting to .format for translated strings.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-15 16:24:46 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 1f565a9f41 timezone: Use standard library datetime.timezone.utc consistently.
datetime.timezone is available in Python ≥ 3.2.  This also lets us
remove a pytz dependency from the PostgreSQL scripts.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-05 09:34:17 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Tim Abbott 6407d0b1f9 push_notifications: Clear PushDeviceToken on API key change.
This includes adding a new endpoint to the push notification bouncer
interface, and code to call it appropriately after resetting a user's
personal API key.

When we add support for a user having multiple API keys, we may need
to add an additional key here to support removing keys associated with
just one client.
2019-11-19 15:37:43 -08:00
Anders Kaseorg cafac83676 request: Tighten type checking on REQ.
Then, find and fix a predictable number of previous misuses.

With a small change by tabbott to preserve backwards compatibility for
sending `yes` for the `forged` field.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-11-13 12:35:55 -08:00
Anders Kaseorg b0a7b33f9b push_notifications: Declare token of type str, not bytes.
Declaring a CharField of type bytes made no sense.

Signed-off-by: Anders Kaseorg <andersk@zulipchat.com>
2019-11-12 23:21:20 -08:00
Rishi Gupta 360cd7f147 remote data: Send RealmAuditLog data. 2019-10-08 17:27:29 -07:00
Rishi Gupta 48dc1d1128 remote data: Refactor remote_server_post_analytics to be more generic.
One small change in behavior is that this creates an array with all the
row_objects at once, rather than creating them 1000 at a time.

That should be fine, given that the client batches these in units of
10000 anyway, and so we're just creating 10K rows of a relatively
small data structure in Python code here.
2019-10-06 16:55:41 -07:00
Anders Kaseorg 5d063910ff zilencer: Clean up type ignores.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-09 17:42:33 -07:00
Tim Abbott bcc6949461 zilencer: Add better error handling for IntegrityError.
This provides a clean warning and 40x error, rather than a 500, for
this corner case which is very likely user error.

The test here is awkward because we have to work around
https://github.com/zulip/zulip/issues/12362.
2019-05-20 17:53:43 -07:00
Anders Kaseorg 9a9de156c3 lint: Fix calls to _() on computed strings.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:23:03 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Greg Price 49fd2e65de push notif: Add GCM options to bouncer API; empty for now.
The first use case for this will be setting `priority`,
coming up shortly.
2019-02-08 09:40:43 -08:00
Anders Kaseorg 4e21cc0152 views: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:23:43 -08:00
Tim Abbott 55ead5b77f zilencer: Fix buggy validation of installation_counts upload.
This was a simple copy-paste error.  It's probably worth a bit more
work on code duplication in this code path.
2019-02-02 11:51:22 -08:00
Tim Abbott 022c8beaf5 analytics: Add APIs for submitting analytics to another server.
This adds a new API for sending basic analytics data (number of users,
number of messages sent) from a Zulip server to the Zulip Cloud
central analytics database, which will make it possible for servers to
elect to have their usage numbers counted in published stats on the
size of the Zulip ecosystem.
2019-02-01 22:03:52 -08:00