Just like deactivated realms should be excluded, so should locally
deleted realms.
In particular, failure to exclude locally deleted realms breaks
handle_customer_migration_from_server_to_realms.
This was a bug introduced in 4715a058b0, where
get_realms_info_for_push_bouncer() was incorrectly called here. That
function is meant to be called on a self-hosted server, while this
handle_... call happens on the bouncer - so in production it returns
all zulipchat realms.
With the way handle_... is being called right now, there's no reason
for it to take an argument with a list of realms. It should just
fetch the relevant RemoteRealm entries itself, given the server argument.
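A minimal sketch of what that reworked call might look like, assuming the
RemoteRealm/RemoteZulipServer models from zilencer.models and flag names
like realm_deactivated/realm_locally_deleted (the exact field names here
are assumptions):

    from zilencer.models import RemoteRealm, RemoteZulipServer

    def handle_customer_migration_from_server_to_realms(
        server: RemoteZulipServer,
    ) -> None:
        # Fetch the relevant RemoteRealm rows directly, excluding realms the
        # remote server has deactivated or reported as locally deleted.
        remote_realms = RemoteRealm.objects.filter(
            server=server,
            realm_deactivated=False,
            realm_locally_deleted=False,
        )
        # ... migration logic over remote_realms ...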
Requests to these endpoints are about a specified user, and therefore
also have a notion of the RemoteRealm for the request. Until now
these endpoints weren't getting the realm_uuid value, because it wasn't
used - but now it is needed for updating .last_request_datetime on the
RemoteRealm.
For the RemoteRealm case, we can only set this in endpoints where the
remote server sends us the realm_uuid. So we're missing that for the
endpoints:
- remotes/push/unregister and remotes/push/unregister/all
- remotes/push/test_notification
This should be added in a follow-up commit.
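A hedged sketch of the bookkeeping this enables; the helper name is
invented for illustration, and only endpoints that receive realm_uuid can
do this:

    from django.utils.timezone import now as timezone_now

    from zilencer.models import RemoteRealm, RemoteZulipServer

    def update_remote_realm_last_request(
        server: RemoteZulipServer, realm_uuid: str | None
    ) -> None:
        # Illustrative: bump .last_request_datetime on the RemoteRealm the
        # request is about, when the remote server told us which realm that is.
        if realm_uuid is None:
            return
        RemoteRealm.objects.filter(server=server, uuid=realm_uuid).update(
            last_request_datetime=timezone_now()
        )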
Earlier, the 'handle_customer_migration_from_server_to_realms'
function was called during the send analytics step.
It resulted in an error for customers with multiple Zulip servers -
one for testing and the others not - sharing a push bouncer
registration.
When run on a test instance, the migration step caused customers to
have their legacy plan migrated to a test realm, resulting in them
losing their legacy plan.
This commit moves the migration step to run during the plan management
login step. This reduces the chances of losing the legacy
plan, as we expect customers to only verify that the 8.0 upgrade works
and not bother trying to log in to plan management from their test instance.
This protects us from incorrectly handling situations where someone
tested an upgrade to 8.0 on a backup on a separate hostname, and
left the test system live while upgrading the main system, in a way
that results in duplicate RemoteRealm objects that are all marked as
locally deleted.
Further work is required to figure out how to avoid the original
duplication problem.
It seems most correct to answer the question of whether push
notifications are working specifically for the exact set of realms
that the server self-reported to us as existing.
Sending data on any additional realms that were not referenced in the
request (if that's somehow possible without them being locally
deleted) is likely to only be confusing.
And the client should reasonably be able to expect to get a response
covering exactly the realms it told us about.
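A rough sketch of that contract, with realm_push_status standing in for
whatever computes the per-realm values; the response is keyed by exactly
the UUIDs the server sent:

    from typing import Any

    from zilencer.models import RemoteRealm, RemoteZulipServer

    def realm_push_status(remote_realm: RemoteRealm) -> dict[str, Any]:
        # Stand-in for the real computation of can_push / expected_end_timestamp.
        return {"can_push": True, "expected_end_timestamp": None}

    def build_realms_response(
        server: RemoteZulipServer, realm_uuids: list[str]
    ) -> dict[str, dict[str, Any]]:
        # Keyed strictly by the UUIDs the remote server self-reported in the
        # request; other RemoteRealm rows for this server are not included.
        remote_realms = RemoteRealm.objects.filter(server=server, uuid__in=realm_uuids)
        return {
            str(remote_realm.uuid): realm_push_status(remote_realm)
            for remote_realm in remote_realms
        }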
Old RemotePushDeviceTokens were created without this attribute. But when
processing a notification, if we have remote_realm, we can take the
opportunity to set it for all the registrations for this user.
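A minimal sketch of that backfill, assuming registrations are keyed by
server and user_uuid (field names here are assumptions):

    from uuid import UUID

    from zilencer.models import RemotePushDeviceToken, RemoteRealm, RemoteZulipServer

    def backfill_remote_realm_on_registrations(
        server: RemoteZulipServer, user_uuid: UUID, remote_realm: RemoteRealm
    ) -> None:
        # Older registrations predate the remote_realm column; while handling a
        # notification where the realm is known, attach it to all of this
        # user's registrations that are still missing it.
        RemotePushDeviceToken.objects.filter(
            server=server, user_uuid=user_uuid, remote_realm__isnull=True
        ).update(remote_realm=remote_realm)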
This moves the function that computes can_push and
expected_end_timestamp out of RemoteRealmBillingSession,
because we might use it for RemoteZulipServer
as well, and also renames it.
We need to update 'last_audit_log_update' before calling the
'sync_license_ledger_if_needed' method to avoid 'MissingDataError'
due to 'has_stale_audit_log' being True.
Also, we wrapped the code block that creates audit logs,
updates 'last_audit_log_update', and syncs LicenseLedger in
an atomic operation.
This lets us rely on 'last_audit_log_update' to assume
'RemoteRealmAuditLog' and 'LicenseLedger' are up-to-date.
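A hedged sketch of that ordering, assuming last_audit_log_update lives on
RemoteZulipServer and a billing session exposing
sync_license_ledger_if_needed; parameter names are illustrative:

    from django.db import transaction
    from django.utils.timezone import now as timezone_now

    from zilencer.models import RemoteRealmAuditLog, RemoteZulipServer

    def store_audit_logs_and_sync_ledger(
        server: RemoteZulipServer,
        audit_log_rows: list[RemoteRealmAuditLog],
        billing_session,
    ) -> None:
        # Create the audit logs, bump last_audit_log_update, and only then sync
        # the LicenseLedger, all in one transaction, so has_stale_audit_log is
        # False during the sync and a partial failure cannot leave the ledger
        # out of step with the logs.
        with transaction.atomic():
            RemoteRealmAuditLog.objects.bulk_create(audit_log_rows)
            server.last_audit_log_update = timezone_now()
            server.save(update_fields=["last_audit_log_update"])
            billing_session.sync_license_ledger_if_needed()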
- The server sends the list of registrations it believes it has with
the bouncer.
- The bouncer includes in the response the registrations that it doesn't
actually have, and that the server should therefore delete (sketched below).
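A rough sketch of the bouncer side, with the payload shape assumed for
illustration (each reported registration identified by its kind/token
pair):

    from zilencer.models import RemotePushDeviceToken, RemoteZulipServer

    def find_stale_registrations(
        server: RemoteZulipServer, reported: list[dict[str, object]]
    ) -> list[dict[str, object]]:
        # Anything the server reported that the bouncer cannot find is
        # returned, so the server can delete its local copy.
        existing = set(
            RemotePushDeviceToken.objects.filter(server=server).values_list(
                "kind", "token"
            )
        )
        return [
            reg for reg in reported if (reg["kind"], reg["token"]) not in existing
        ]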
When a self-hosted Zulip server does a data export and then an import
into a different hosting environment (i.e. not sharing the
RemoteZulipServer with the original), various things fail where we
look up the RemoteRealm by UUID and find it, but the RemoteZulipServer
it is associated with is the wrong one.
Right now, we ask the user to contact support via an error page, but we
might develop UI to help users do the migration directly.
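A minimal sketch of the guard, with the exception name invented for
illustration:

    from zilencer.models import RemoteRealm, RemoteZulipServer

    class RemoteRealmServerMismatchError(Exception):
        # Illustrative; the real flow surfaces an error page asking the user
        # to contact support.
        pass

    def check_remote_realm_matches_server(
        remote_realm: RemoteRealm, server: RemoteZulipServer
    ) -> None:
        # After an export/import into a new hosting environment, the realm
        # UUID can resolve to a RemoteRealm tied to a different
        # RemoteZulipServer than the one making the request.
        if remote_realm.server_id != server.id:
            raise RemoteRealmServerMismatchError()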
When a remote server uploads statistics, we update the
LicenseLedger using the audit logs uploaded.
We iterate over the RemoteRealmAuditLog data for the concerned
realm, starting from the event_time of the last LicenseLedger
created for that customer, and update the ledger based on each event.
If the RemoteRealmAuditLog data is stale, it means the server
stopped or never uploaded data. We raise MissingDataError in such
cases, when a user action led to calculating the license count from
stale data.
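A hedged sketch of that flow; the field and helper names here are
stand-ins, not the actual implementation:

    from zilencer.models import RemoteRealm, RemoteRealmAuditLog

    class MissingDataError(Exception):
        # Stand-in for the real exception raised on stale audit log data.
        pass

    def apply_event_to_ledger(last_ledger, audit_log):
        # Stand-in for the real per-event LicenseLedger update.
        return last_ledger

    def replay_audit_logs_into_ledger(
        remote_realm: RemoteRealm, last_ledger, audit_log_is_stale: bool
    ):
        # Refuse user-triggered license calculations on stale data; otherwise
        # replay audit log events newer than the latest LicenseLedger entry
        # for this customer, updating the ledger per event.
        if audit_log_is_stale:
            raise MissingDataError("RemoteRealmAuditLog data is stale for this server")
        new_events = RemoteRealmAuditLog.objects.filter(
            remote_realm=remote_realm, event_time__gt=last_ledger.event_time
        ).order_by("event_time")
        for audit_log in new_events:
            last_ledger = apply_event_to_ledger(last_ledger, audit_log)
        return last_ledger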
1. When we get data and it includes realm info, we should automatically
link the new records with the appropriate RemoteRealm.
2. For old records, when we receive realm data, we have an opportunity
to update those old records to link them to the right RemoteRealm.
This logic doesn't need to always run, just after a remote server
upgrade, since that's when this shift in remote server behavior will
occur.
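A minimal sketch of the backfill half, assuming the upload lets us build
a map from the remote server's realm ids to RemoteRealm rows (the mapping
and field names are assumptions):

    from zilencer.models import RemoteRealm, RemoteRealmAuditLog, RemoteZulipServer

    def backfill_remote_realm_links(
        server: RemoteZulipServer, realm_id_to_remote_realm: dict[int, RemoteRealm]
    ) -> None:
        # Older rows only carry a realm_id; once we know which RemoteRealm
        # each id corresponds to, link them. New rows get the link at
        # creation time instead.
        for realm_id, remote_realm in realm_id_to_remote_realm.items():
            RemoteRealmAuditLog.objects.filter(
                server=server, realm_id=realm_id, remote_realm=None
            ).update(remote_realm=remote_realm)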
This is a prep commit to return, for each remote realm, the 'uuid',
'can_push', and 'expected_end_timestamp'.
This data will be used in 'initialize_push_notifications'.
This consists of the following pieces:
1. Makes servers using the bouncer send realm_uuid in requests for token
registration. (Sidenote: realm_uuid is already sent in the "send
notification" codepath as of
48db4bf854)
2. Allows the bouncer to tie the RemotePushDeviceToken to the
RemoteRealm with matching realm_uuid at registration time.
3. Introduces handling of some potential weird edge cases around the
realm_uuid and RemoteRealm objects in get_remote_realm_helper (sketched below).
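A rough approximation of what such a helper might check; the real
signature and behavior of get_remote_realm_helper may differ:

    from zilencer.models import RemoteRealm, RemoteZulipServer

    def lookup_remote_realm(
        server: RemoteZulipServer, realm_uuid: str | None
    ) -> RemoteRealm | None:
        # Edge cases: an old server that sends no realm_uuid, an unknown UUID,
        # or a UUID registered to a different RemoteZulipServer.
        if realm_uuid is None:
            return None
        try:
            remote_realm = RemoteRealm.objects.get(uuid=realm_uuid)
        except RemoteRealm.DoesNotExist:
            return None
        if remote_realm.server_id != server.id:
            # The UUID exists but belongs to another server; treat it as
            # unresolvable rather than attaching the token to it.
            return None
        return remote_realm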
This may happen if there are multiple servers with the same UUID
submitting data (e.g. if they were cloned after initial creation), or
if there is one server, but `./manage.py clear_analytics_tables` was
used to truncate the analytics tables.
In the case of `clear_analytics_tables`, the data submitted likely has
identical historical values with new remote `id` values; preserving
the originally-submitted contemporaneous data is the best option. For
the case of submissions from multiple servers, there is no completely
sensible outcome, so the best we can do is detect the case and move
on.
Since we have a lock on the RemoteZulipServer, we know that no other
inserts are happening, so counting before and after will return the
true number of rows inserted (which `bulk_create` cannot do in the
face of `ignore_conflicts`[^1]). We compare this to the expected
number of new inserted rows to detect dropped duplicates.
[^1]: See https://code.djangoproject.com/ticket/30138.
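A sketch of the counting approach under the lock, using RemoteRealmCount
as the example table; the surrounding locking and error handling here are
illustrative:

    from django.db import transaction

    from zilencer.models import RemoteRealmCount, RemoteZulipServer

    def insert_realm_count_rows(
        server: RemoteZulipServer, rows: list[RemoteRealmCount]
    ) -> int:
        # With the RemoteZulipServer row locked, no other request can insert
        # rows for this server, so count-before vs. count-after of a
        # bulk_create(ignore_conflicts=True) reveals how many submitted rows
        # were dropped as duplicates. Returns the number of dropped rows.
        with transaction.atomic():
            RemoteZulipServer.objects.select_for_update().get(id=server.id)
            before = RemoteRealmCount.objects.filter(server=server).count()
            RemoteRealmCount.objects.bulk_create(rows, ignore_conflicts=True)
            after = RemoteRealmCount.objects.filter(server=server).count()
        return len(rows) - (after - before)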
This does not ensure that we do not mix data from multiple servers
sharing a UUID -- if one has more `RemoteRealmCount` rows,
and the other has more `RemoteInstallationCount` rows, the end result
will still be some rows from each server, across the two tables.
It does ensure that we will not alternate rows between two servers
if both requests are processed at the same time.
It also causes submissions to be all-or-nothing in the event of
integrity errors. This is not necessarily beneficial, as forward
progress is generally useful -- but the integrity errors are resolved
in the subsequent commit.