Migrate all `ids` of anything which does not have a foreign key from
the Message or UserMessage table (and would thus require walking
those) to be `bigint`. This is done by removing explicit
`BigAutoField`s, trading them for explicit `AutoField`s on the tables
to not be migrated, while updating `DEFAULT_AUTO_FIELD` to the new
default.
In general, the tables adjusted in this commit are small tables -- at
least compared to Messages and UserMessages.
Many-to-many tables without their own model class are adjusted by a
custom Operation, since they do not automatically pick up migrations
when `DEFAULT_AUTO_FIELD` changes[^1].
Note that this does multiple scans over tables to update foreign
keys[^2]. Large installs may wish to hand-optimize this using the
output of `./manage.py sqlmigrate` to join multiple `ALTER TABLE`
statements into one, to speed up the migration. This is unfortunately
not possible to do generically, as constraint names may differ between
installations.
This leaves the following primary keys as non-`bigint`:
- `auth_group.id`
- `auth_group_permissions.id`
- `auth_permission.id`
- `django_content_type.id`
- `django_migrations.id`
- `otp_static_staticdevice.id`
- `otp_static_statictoken.id`
- `otp_totp_totpdevice.id`
- `two_factor_phonedevice.id`
- `zerver_archivedmessage.id`
- `zerver_client.id`
- `zerver_message.id`
- `zerver_realm.id`
- `zerver_recipient.id`
- `zerver_userprofile.id`
[^1]: https://code.djangoproject.com/ticket/32674
[^2]: https://code.djangoproject.com/ticket/24203
This helps prevent wraparound on exceedingly large and old installs,
particularly Zulip Cloud. These are relatively simple migrations
since they are not referenced by any other tables; however, they are
quite large, and are actively used from Django by running servers,
making this not a migration which is possible to run without stopping
the server.
Use the escape hatch in the previous commit to temporarily pause
analytics writes while the migration happens. This should make the
migration transparent to users, at the small cost of an artificial dip
in statistics (specifically, to push notification counts, and unread
message counts) while the migration runs.
When profiling the database queries for the remote support view,
getting the user counts for remote realms was identifed as an
expensive query. Adds an index on RemoteRealmAuditLog to improve
this relatively common query for remote billing information.
When profiling the database query in `remote_activity.py`,
push_forwarded_count was identified as an expensive part of
the overall work. Adds an index on RemoteInstallationCount
so this is more efficient.
For the RemoteRealm case, we can only set this in endpoints where the
remote server sends us the realm_uuid. So we're missing that for the
endpoints:
- remotes/push/unregister and remotes/push/unregister/all
- remotes/push/test_notification
This should be added in a follow-up commit.
Also avoid prompting for full name time more than once.
Adds TOS version field to Remote server user.
Co-authored-by: Karl Stolley <karl@zulip.com>
Co-authored-by: Aman Agrawal <amanagr@zulip.com>
The way the flow goes now is this:
1. The user initiaties login via "Billing" in the gear menu.
2. That takes them to `/self-hosted-billing/` (possibly with a
`next_page` param if we use that for some gear menu options).
3. The server queries the bouncer to give the user a link with a signed
access token.
4. The user is redirected to that link (on `selfhosting.zulipchat.com`).
Now we have two cases, either the user is logging in for the first time
and already did in the past.
If this is the first time, we have:
5. The user is asked to fill in their email in a form that's shown,
pre-filled with the value provided inside the signed access token.
They POST this to the next endpoint.
6. The next endpoint sends a confirmation email to that address and asks
the user to go check their email.
7. The user clicks the link in their email is taken to the
from_confirmation endpoint.
8. Their initial RemoteBillingUser is created, a new signed link like in
(3) is generated and they're transparently taken back to (4),
where now that they have a RemoteBillingUser, they're handled
just like a user who already logged in before:
If the user already logged in before, they go straight here:
9. "Confirm login" page - they're shown their information (email and
full_name), can update
their full name in the form if they want. They also accept ToS here
if necessary. They POST this form back to
the endpoint and finally have a logged in session.
10. They're redirected to billing (or `next_page`) now that they have
access.
For the last form (with Full Name and ToS consent field), this pretty
shamelessly re-uses and directly renders the
corporate/remote_realm_billing_finalize_login_confirmation.html
template. That's probably good in terms of re-use, but calls for a
clean-up commit that will generalize the name of this template and the
classes/ids in the HTML.
If the RemoteRealmAuditLog has stale data, it means the server
stopped or never uploaded data. We raise MissingDataError in such
cases when a user action led to calculating licenses count from
stale data.
This consists of the following pieces:
1. Makes servers using the bouncer send realm_uuid in requests for token
registration. (Sidenote: realm_uuid is already sent in the "send
notification" codepath as of
48db4bf854)
2. This allows the bouncer to tie RemotePushDeviceToken to the
RemoteRealm with matching realm_uuid at registration time.
3. Introduce handling of some potential weird edge cases around the
realm_uuid and RemoteRealm objects in get_remote_realm_helper.
The recent #27818 naïvely added unique indexes, despite there being a
large number of existing violations. This makes the migration
impossible to deploy.
Update the migration to de-duplicate rows, dropping all but the
first-by-id of each unique set. This is equivalent to what
dd954749be does with `ignore_conflicts`. We update the migration,
rather than making a new one, as any server which has somehow
successfully applied the migration apparently did not need to
de-duplicate anything.
This applies f299f31340 but for the push bouncer receiving side.
This is particularly important as we start relying on the unique
constraints, via `ON CONFLICT ... IGNORE`, in subsequent commits.
Fixes: #12362.
Add the new model for recording basic information about Realms on remote
server, to go with the other analytics data. Also adds necessary changes
to the bouncer endpoint and the send_analytics_to_push_bouncer()
function to submit such Realm information.
There is no reason to have an index on just `realm_id` or `remote_id`,
as those values mean nothing outside of the scope of a specific
`server_id`. Remove those never-used single-column indexes from the
two tables that have them.
By contrast, the pair of `server_id` and `remote_id` is quite useful
and specific -- it is a unique pair, and every POST of statistics from
a remote host requires looking up the highest `remote_id` for a given
`server_id`, which (without this index) is otherwise a quite large
scan.
Add a unique constraint, which (in PostgreSQL) is implemented as a
unique index.
_default_manager is the same as objects on most of our models. But
when a model class is stored in a variable, the type system doesn’t
know which model the variable is referring to, so it can’t know that
objects even exists (Django doesn’t add it if the user added a custom
manager of a different name). django-stubs used to incorrectly assume
it exists unconditionally, but it no longer does.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.
- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.
- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.
- We remove None-checks for extra_data because it is now no longer
nullable.
- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.
- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.
- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.
Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This migration is reasonably complex because of various anomalies in existing
data.
Note that there are cases when extra_data does not contain data that is
proper json with possibly single quotes. Thus we need to use
"ast.literal_eval" to cover that.
There is also a special case for "event_type == USER_FULL_NAME_CHANGED",
where extra_data is a plain str. This event_type is only used for
RealmAuditLog, so the zilencer migration script does not need to handle
it.
The migration does not handle "event_type == REALM_DISCOUNT_CHANGED"
because ast.literal_eval only allow Python literals. We expect the admin
to populate the jsonified extra_data for extra_data_json manually
beforehand.
This chunks the backfilling migration to reduce potential block time.
The migration for zilencer is mostly similar to the one for zerver; except that
the backfill helper is added in a wrapper and unrelated events are
removed.
**Logging and error recovery**
We print out a warning when the extra_data_json field of an entry
would have been overwritten by a value inconsistent with what we derived
from extra_data. Usually this only happens when the extra_data was
corrupted before this migration. This prevents data loss by backing up
possibly corrupted data in extra_data_json with the keys
"inconsistent_old_extra_data" and "inconsistent_old_extra_data_json".
More roundtrips to the database are needed for inconsistent data, which are
expected to be infrequent.
This also outputs messages when there are audit log entries with decimals,
indicating that such entries are not backfilled. Do note that audit log
entries with decimals are not populated with "inconsistent_old_extra_data_*"
in the JSONField, because they are not overwritten.
For such audit log entries with "extra_data_json" marked as inconsistent,
we skip them in the migration. Because when we have discovered anomalies in a
previous run, there is no need to overwrite them again nesting the extra keys
we added to it.
**Testing**
We create a migration test case utilizing the property of bulk_create
that it doesn't call our modified save method.
We extend ZulipTestCase to support verifying console output at the test
case level. The implementation is crude but the use case should be rare
enough that we don't need it to be too elaborate.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Note that we use the DjangoJSONEncoder so that we have builtin support
for parsing Decimal and datetime.
During this intermediate state, the migration that creates
extra_data_json field has been run. We prepare for running the backfilling
migration that populates extra_data_json from extra_data.
This change implements double-write, which is important to keep the
state of extra data consistent. For most extra_data usage, this is
handled by the overriden `save` method on `AbstractRealmAuditLog`, where
we either generates extra_data_json using orjson.loads or
ast.literal_eval.
While backfilling ensures that old realm audit log entries have
extra_data_json populated, double-write ensures that any new entries
generated will also have extra_data_json set. So that we can then safely
rename extra_data_json to extra_data while ensuring the non-nullable
invariant.
For completeness, we additionally set RealmAuditLog.NEW_VALUE for
the USER_FULL_NAME_CHANGED event. This cannot be handled with the
overridden `save`.
This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795
Note that extra_data_json at this point is not used yet. So the test
cases do not need to switch to testing extra_data_json. This is later
done after we rename extra_data_json to extra_data.
Double-write for the remote server audit logs is special, because we only
get the dumped bytes from an external source. Luckily, none of the
payload carries extra_data that is not generated using orjson.dumps for
audit logs of event types in SYNC_BILLING_EVENTS. This can be verified
by looking at:
`git grep -A 6 -E "event_type=.*(USER_CREATED|USER_ACTIVATED|USER_DEACTIVATED|USER_REACTIVATED|USER_ROLE_CHANGED|REALM_DEACTIVATED|REALM_REACTIVATED)"`
Therefore, we just need to populate extra_data_json doing an
orjson.loads call after a None-check.
Co-authored-by: Zixuan James Li <p359101898@gmail.com>
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.
(This does not actually upgrade our Python environment to Black 23
yet.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This is the first step to making the full switch to self-hosted servers
use user uuids, per issue #18017. The old id format is still supported
of course, for backward compatibility.
This commit is separate in order to allow deploying *just* the bouncer
API change to production first.