When a self-hosted Zulip server does a data export and then import
process into a different hosting environment (i.e. not sharing the
RemoteZulipServer with the original, we'll have various things that
fail where we look up the RemoteRealm by UUID and find it but the
RemoteZulipServer it is associated with is the wrong one.
Right now, we ask user to contact support via an error page but
might develop UI to help user do the migration directly.
When a remote server uploads statistics, we update the
LicenseLedger using the audit logs uploaded.
We iterate over the RemoteRealmAuditlog data for the concerned
realm starting from the event_time of the last LicenseLedger
created for that customer and update the ledger based on each event.
If the RemoteRealmAuditLog has stale data, it means the server
stopped or never uploaded data. We raise MissingDataError in such
cases when a user action led to calculating licenses count from
stale data.
1. When we get data and it includes realm info, we should automatically
link the new records with the appropriate RemoteRealm.
2. For old records, when we receive realm data, we have an opportunity
to update those old record to link them to the right RemoteRealm.
This logic doesn't need to always run, just after a remote server
upgrade, since that's when this shift in remote server behavior will
occur.
This is a prep commit to return, for each remote realm, the 'uuid',
'can_push', and 'expected_end_timestamp'.
This data will be used in 'initialize_push_notifications'.
This consists of the following pieces:
1. Makes servers using the bouncer send realm_uuid in requests for token
registration. (Sidenote: realm_uuid is already sent in the "send
notification" codepath as of
48db4bf854)
2. This allows the bouncer to tie RemotePushDeviceToken to the
RemoteRealm with matching realm_uuid at registration time.
3. Introduce handling of some potential weird edge cases around the
realm_uuid and RemoteRealm objects in get_remote_realm_helper.
This may happen if there are multiple servers with the same UUID
submitting data (e.g. if they were cloned after initial creation), or
if there is one server, but `./manage.py clear_analytics_tables` was
used to truncate the analytics tables.
In the case of `clear_analytics_tables`, the data submitted likely has
identical historical values with new remote `id` values; preserving
the originally-submitted contemporaneous data is the best option. For
the case of submissions from multiple servers, there is no completely
sensible outcome, so the best we can do is detect the case and move
on.
Since we have a lock on the RemoteZulipServer, we know that no other
inserts are happening, so counting before and after will return the
true number of rows inserted (which `bulk_create` cannot do in the
face of `ignore_conflicts`[^1]). We compare this to the expected
number of new inserted rows to detect dropped duplicates.
[^1]: See https://code.djangoproject.com/ticket/30138.
This does not ensure that we do not mix data from multiple servers
sharing a UUID -- if one has more `RemoteRealmCount` rows,
and the other has more `RemoteInstalltionCount` rows, the end result
will still be some rows from each server, across the two tables.
It does ensure that we will not alternate rows between two servers
if both requests are processed at the same time.
It also causes submissions to be all-or-nothing in the event of
integrity errors. This is not necessarily beneficial, as forward
progress is generally useful -- but the integrity errors are resolved
in the subsequent commit.
Add the new model for recording basic information about Realms on remote
server, to go with the other analytics data. Also adds necessary changes
to the bouncer endpoint and the send_analytics_to_push_bouncer()
function to submit such Realm information.
This parameter appeared here on the function definition,
but because it lacked a `REQ` call it didn't actually connect
to any parameter passed in the HTTP request.
It doesn't make any sense on this endpoint anyway -- presumably
it was copy-pasted from its "register" counterpart -- so just cut it.
We'll need this information in order to properly direct APNs
notifications. Happily, the Zulip server always sends it when
registering an APNs token; and it appears it always has done so
since the commit:
cddee49e7 Add support infrastructure for push notification bouncer service.
back in 2016. So there's no compatibility issue from requiring it.
This missing `REQ` call has meant we just drop this parameter:
even though the remote Zulip server passes it (for all APNs tokens),
we never notice and never store it. Fix that.
_default_manager is the same as objects on most of our models. But
when a model class is stored in a variable, the type system doesn’t
know which model the variable is referring to, so it can’t know that
objects even exists (Django doesn’t add it if the user added a custom
manager of a different name). django-stubs used to incorrectly assume
it exists unconditionally, but it no longer does.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.
- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.
- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.
- We remove None-checks for extra_data because it is now no longer
nullable.
- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.
- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.
- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.
Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Note that we use the DjangoJSONEncoder so that we have builtin support
for parsing Decimal and datetime.
During this intermediate state, the migration that creates
extra_data_json field has been run. We prepare for running the backfilling
migration that populates extra_data_json from extra_data.
This change implements double-write, which is important to keep the
state of extra data consistent. For most extra_data usage, this is
handled by the overriden `save` method on `AbstractRealmAuditLog`, where
we either generates extra_data_json using orjson.loads or
ast.literal_eval.
While backfilling ensures that old realm audit log entries have
extra_data_json populated, double-write ensures that any new entries
generated will also have extra_data_json set. So that we can then safely
rename extra_data_json to extra_data while ensuring the non-nullable
invariant.
For completeness, we additionally set RealmAuditLog.NEW_VALUE for
the USER_FULL_NAME_CHANGED event. This cannot be handled with the
overridden `save`.
This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795
Note that extra_data_json at this point is not used yet. So the test
cases do not need to switch to testing extra_data_json. This is later
done after we rename extra_data_json to extra_data.
Double-write for the remote server audit logs is special, because we only
get the dumped bytes from an external source. Luckily, none of the
payload carries extra_data that is not generated using orjson.dumps for
audit logs of event types in SYNC_BILLING_EVENTS. This can be verified
by looking at:
`git grep -A 6 -E "event_type=.*(USER_CREATED|USER_ACTIVATED|USER_DEACTIVATED|USER_REACTIVATED|USER_ROLE_CHANGED|REALM_DEACTIVATED|REALM_REACTIVATED)"`
Therefore, we just need to populate extra_data_json doing an
orjson.loads call after a None-check.
Co-authored-by: Zixuan James Li <p359101898@gmail.com>
This adds support to accepting extra_data being dict from remote
servers' RealmAuditLog entries. So that it is forward-compatible with
servers that have migrated to use JSONField for RealmAuditLog just in
case. This prepares us for migrating zilencer's audit log models to use
JSONField for extra_data.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Servers that had upgraded from a Zulip server version that did not yet
support the user_uuid field to one that did could end up with some
mobile devices having two push notifications registrations, one with a
user_id and the other with a user_uuid.
Fix this issue by sending both user_id and user_uuid, and clearing
This commit updates the pattern for dealing with tuples
returned by the delete() query.
The '(num_deleted, ignored) = ModelName.objects.filter().delete()'
pattern is preferred due to better readability.
We avoid the pattern '(num_deleted, _)' because Django uses _
for translation, which may lead to future bugs.
`remote_server_path` allows us to get rid of all the `validate_entity`
calls in `zilencer.views` and remove all the `Union` type annotations
in the signatures of the authenticated view functions.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>