zulip

Commit Graph

Author	SHA1	Message	Date
Mateusz Mandera	f8616fa013	analytics: Send ZULIP_MERGE_BASE to the bouncer.	2024-06-23 07:44:11 -07:00
Alex Vandiver	50c3dd88e6	models: Migrate ids of all non-Message-related tables to bigint. Migrate all `ids` of anything which does not have a foreign key from the Message or UserMessage table (and would thus require walking those) to be `bigint`. This is done by removing explicit `BigAutoField`s, trading them for explicit `AutoField`s on the tables to not be migrated, while updating `DEFAULT_AUTO_FIELD` to the new default. In general, the tables adjusted in this commit are small tables -- at least compared to Messages and UserMessages. Many-to-many tables without their own model class are adjusted by a custom Operation, since they do not automatically pick up migrations when `DEFAULT_AUTO_FIELD` changes[^1]. Note that this does multiple scans over tables to update foreign keys[^2]. Large installs may wish to hand-optimize this using the output of `./manage.py sqlmigrate` to join multiple `ALTER TABLE` statements into one, to speed up the migration. This is unfortunately not possible to do generically, as constraint names may differ between installations. This leaves the following primary keys as non-`bigint`: - `auth_group.id` - `auth_group_permissions.id` - `auth_permission.id` - `django_content_type.id` - `django_migrations.id` - `otp_static_staticdevice.id` - `otp_static_statictoken.id` - `otp_totp_totpdevice.id` - `two_factor_phonedevice.id` - `zerver_archivedmessage.id` - `zerver_client.id` - `zerver_message.id` - `zerver_realm.id` - `zerver_recipient.id` - `zerver_userprofile.id` [^1]: https://code.djangoproject.com/ticket/32674 [^2]: https://code.djangoproject.com/ticket/24203	2024-06-05 11:48:27 -07:00
Alex Vandiver	4f4725f810	analytics: Migrate models' id columns to bigint. This helps prevent wraparound on exceedingly large and old installs, particularly Zulip Cloud. These are relatively simple migrations since they are not referenced by any other tables; however, they are quite large, and are actively used from Django by running servers, making this not a migration which is possible to run without stopping the server. Use the escape hatch in the previous commit to temporarily pause analytics writes while the migration happens. This should make the migration transparent to users, at the small cost of an artificial dip in statistics (specifically, to push notification counts, and unread message counts) while the migration runs.	2024-06-05 11:48:27 -07:00
Alex Vandiver	b557297dd2	zilencer: Drop data which is no longer sent by remote servers.	2024-06-03 12:35:35 -07:00
Alex Vandiver	7607969f27	zilencer: Add a unique index on RemoteRealm counts with RemoteRealm objects.	2024-05-06 16:34:01 -07:00
Anders Kaseorg	570f3dd447	python: Reformat with Ruff formatter. https://docs.astral.sh/ruff/formatter/ Signed-off-by: Anders Kaseorg <anders@zulip.com>	2024-02-29 17:07:16 -08:00
Lauryn Menard	e5c92a1ee6	remote-billing: Add index to RemoteRealmAuditLog for billing events. When profiling the database queries for the remote support view, getting the user counts for remote realms was identified as an expensive query. Adds an index on RemoteRealmAuditLog to improve this relatively common query for remote billing information.	2024-02-22 16:30:06 -08:00
Lauryn Menard	47a5459637	zilencer: Add index on RemoteInstallationCount for remote activity. When profiling the database query in `remote_activity.py`, push_forwarded_count was identified as an expensive part of the overall work. Adds an index on RemoteInstallationCount so this is more efficient.	2024-02-01 12:01:16 -08:00
Mateusz Mandera	cbfbdd7337	zilencer: Add last_request_datetime to RemoteRealm + RemoteZulipServer. For the RemoteRealm case, we can only set this in endpoints where the remote server sends us the realm_uuid. So we're missing that for the endpoints: - remotes/push/unregister and remotes/push/unregister/all - remotes/push/test_notification This should be added in a follow-up commit.	2024-01-05 13:09:09 -08:00
Mateusz Mandera	fb5137f8b5	zilencer: Handle deleted realms nicely at server/analytics.	2023-12-15 09:18:26 -08:00
Tim Abbott	1757b88760	billing: Offer release announcement subscriptions. Also avoid prompting for full name more than once. Adds TOS version field to Remote server user. Co-authored-by: Karl Stolley <karl@zulip.com> Co-authored-by: Aman Agrawal <amanagr@zulip.com>	2023-12-14 10:51:16 -08:00
Mateusz Mandera	651590c49a	remote_billing: Store acting users in remote user audit logs.	2023-12-14 08:11:04 -08:00
Tim Abbott	6308e07e53	billing: Standardize remote server plan type IDs. This will likely save us at least one headache.	2023-12-13 16:40:44 -08:00
Aman Agrawal	b4e4ca14d5	models: Store `is_system_bot_realm` information for `RemoteRealm`. This will help us filter out system bot realm and control feature access to it.	2023-12-11 13:23:49 -08:00
Mateusz Mandera	c800951966	remote_billing: Add some useful fields to Remote...User models. These are useful for auditing and follow what we have for UserProfile. And is_active will be used in the future when we add user deactivation.	2023-12-11 09:39:24 -08:00
Mateusz Mandera	423aebf98e	remote_billing: Implement confirmation flow for RemoteRealm auth. The way the flow goes now is this: 1. The user initiates login via "Billing" in the gear menu. 2. That takes them to `/self-hosted-billing/` (possibly with a `next_page` param if we use that for some gear menu options). 3. The server queries the bouncer to give the user a link with a signed access token. 4. The user is redirected to that link (on `selfhosting.zulipchat.com`). Now we have two cases: either the user is logging in for the first time or already did in the past. If this is the first time, we have: 5. The user is asked to fill in their email in a form that's shown, pre-filled with the value provided inside the signed access token. They POST this to the next endpoint. 6. The next endpoint sends a confirmation email to that address and asks the user to go check their email. 7. The user clicks the link in their email and is taken to the from_confirmation endpoint. 8. Their initial RemoteBillingUser is created, a new signed link like in (3) is generated and they're transparently taken back to (4), where now that they have a RemoteBillingUser, they're handled just like a user who already logged in before. If the user already logged in before, they go straight here: 9. "Confirm login" page - they're shown their information (email and full_name), can update their full name in the form if they want. They also accept ToS here if necessary. They POST this form back to the endpoint and finally have a logged in session. 10. They're redirected to billing (or `next_page`) now that they have access.	2023-12-10 16:15:28 -08:00
Anders Kaseorg	f86becfc94	remote_server: Send API feature level along with Zulip version. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-09 12:01:22 -08:00
Mateusz Mandera	abdfdeffe4	remote_billing: Implement confirmation flow for legacy servers. For the last form (with Full Name and ToS consent field), this pretty shamelessly re-uses and directly renders the corporate/remote_realm_billing_finalize_login_confirmation.html template. That's probably good in terms of re-use, but calls for a clean-up commit that will generalize the name of this template and the classes/ids in the HTML.	2023-12-08 23:49:10 -08:00
Prakhar Pratyush	ed9b0d330d	stripe: Raise 'MissingDataError' while fetching license count. If the RemoteRealmAuditLog has stale data, it means the server stopped or never uploaded data. We raise MissingDataError in such cases when a user action led to calculating licenses count from stale data.	2023-12-08 12:58:21 -08:00
Tim Abbott	5e721f4605	migrations: Add recently added indexes concurrently.	2023-12-05 18:22:23 -08:00
Mateusz Mandera	d631c76747	zilencer: Add some indexes on Remote* models. These are for making fix_remote_realm_foreign_keys more efficient.	2023-12-05 16:49:00 -08:00
Mateusz Mandera	250b52e3dc	remote_billing: Add a "confirm login" page in RemoteRealm auth flow.	2023-12-05 11:34:57 -08:00
Mateusz Mandera	7f33d6f0ea	zilencer: Tie RemotePushDeviceToken to RemoteRealm at registration. This consists of the following pieces: 1. Makes servers using the bouncer send realm_uuid in requests for token registration. (Sidenote: realm_uuid is already sent in the "send notification" codepath as of `48db4bf854`) 2. This allows the bouncer to tie RemotePushDeviceToken to the RemoteRealm with matching realm_uuid at registration time. 3. Introduce handling of some potential weird edge cases around the realm_uuid and RemoteRealm objects in get_remote_realm_helper.	2023-12-03 09:51:45 -08:00
Aman Agrawal	4d60c3a96c	models: Allow realm_id to be blank. We cannot provide realm_id for some remote session logs.	2023-11-30 11:22:19 -08:00
Aman Agrawal	2795f11e3f	models: Add org_type to RemoteZulipServer. This is required to save sponsorship data for remote servers.	2023-11-29 19:04:32 -08:00
Mateusz Mandera	9b1a495e2c	zilencer: Sync name and authentication_methods on RemoteRealm.	2023-11-29 15:54:38 -08:00
Alex Vandiver	98b68d7034	zilencer: Remove duplicates before adding unique indexes. The recent #27818 naïvely added unique indexes, despite there being a large number of existing violations. This makes the migration impossible to deploy. Update the migration to de-duplicate rows, dropping all but the first-by-id of each unique set. This is equivalent to what `dd954749be` does with `ignore_conflicts`. We update the migration, rather than making a new one, as any server which has somehow successfully applied the migration apparently did not need to de-duplicate anything.	2023-11-28 15:01:10 -08:00
Mateusz Mandera	02d5740f0f	remote_realm: Add syncing of org_type.	2023-11-28 14:41:16 -08:00
Alex Vandiver	150c64ddd0	zilencer: Enforce uniqueness of server_id + remote_id. This was previously just an index (not a unique one). Enforce this data constraint.	2023-11-28 09:46:48 -08:00
Alex Vandiver	49263ba69f	migrations: Keep the existing constraints until the new ones are made. This removes a window where more violations could enter, and also a period where indexes which may be useful are lacking.	2023-11-21 21:02:37 -05:00
Alex Vandiver	ae836ae007	zilencer: Apply partial unique constraints for null subgroups. This applies `f299f31340` but for the push bouncer receiving side. This is particularly important as we start relying on the unique constraints, via `ON CONFLICT ... IGNORE`, in subsequent commits. Fixes: #12362.	2023-11-21 11:44:55 -08:00
Alex Vandiver	9bc41ca040	zilencer: Store the last-reported server version when storing analytics. Servers since `216d2ec1bf` (version 2.0.0) have submitted this, but we have never stored it.	2023-11-20 14:36:27 -08:00
Mateusz Mandera	48db4bf854	counts: Add new mobile_pushes RemoteRealmCount stats. This requires a bit of complexity to avoid a name collision in COUNT_STATS with the RemoteInstallationCount stats with the same name.	2023-11-10 16:09:11 -08:00
Mateusz Mandera	1312c7ccd7	zilencer: Add mechanism to update RemoteRealm when Realm is changed. This requires a migration to allow RemoteRealmAuditLog.remote_id to be NULL, and to add a RemoteRealmAuditLog.remote_realm.	2023-11-08 15:54:22 -08:00
Mateusz Mandera	76e0511481	zilencer: Add new model RemoteRealm and send the data to the bouncer. Add the new model for recording basic information about Realms on remote server, to go with the other analytics data. Also adds necessary changes to the bouncer endpoint and the send_analytics_to_push_bouncer() function to submit such Realm information.	2023-11-08 15:54:22 -08:00
Greg Price	f109e3b598	push_notifs: Backfill ios_app_id on bouncer.	2023-11-07 16:19:42 -08:00
Mateusz Mandera	2ecd7abc0d	zilencer: Make BaseRemoteCount.remote_id field nullable.	2023-11-01 17:26:10 -07:00
Mateusz Mandera	3cbb651942	zilencer: Remove index on RemoteInstallationCount.remote_id. As in `902498ec4f`, we shouldn't need an index on remote_id alone - only (server_id, remote_id) together.	2023-10-20 10:07:06 -07:00
Alex Vandiver	902498ec4f	zilencer: Update remoterealm indexes. There is no reason to have an index on just `realm_id` or `remote_id`, as those values mean nothing outside of the scope of a specific `server_id`. Remove those never-used single-column indexes from the two tables that have them. By contrast, the pair of `server_id` and `remote_id` is quite useful and specific -- it is a unique pair, and every POST of statistics from a remote host requires looking up the highest `remote_id` for a given `server_id`, which (without this index) is otherwise a quite large scan. Add a unique constraint, which (in PostgreSQL) is implemented as a unique index.	2023-09-14 09:30:16 -07:00
Anders Kaseorg	0ce6dcb905	mypy: Upgrade mypy from 1.4.1 to 1.5.1. _default_manager is the same as objects on most of our models. But when a model class is stored in a variable, the type system doesn’t know which model the variable is referring to, so it can’t know that objects even exists (Django doesn’t add it if the user added a custom manager of a different name). django-stubs used to incorrectly assume it exists unconditionally, but it no longer does. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-09-07 17:51:42 -07:00
Zixuan James Li	30495cec58	migration: Rename extra_data_json to extra_data in audit log models. This migration applies under the assumption that extra_data_json has been populated for all existing and coming audit log entries. - This removes the manual conversions back and forth for extra_data throughout the codebase including the orjson.loads(), orjson.dumps(), and str() calls. - The custom handler used for converting Decimal is removed since DjangoJSONEncoder handles that for extra_data. - We remove None-checks for extra_data because it is now no longer nullable. - Meanwhile, we want the bouncer to support processing RealmAuditLog entries for remote servers before and after the JSONField migration on extra_data. - Since now extra_data should always be a dict for the newer remote server, which is now migrated, the test cases are updated to create RealmAuditLog objects by passing a dict for extra_data before sending over the analytics data. Note that while JSONField allows for non-dict values, a proper remote server always passes a dict for extra_data. - We still test out the legacy extra_data format because not all remote servers have migrated to use JSONField extra_data. This verifies that support for extra_data being a string or None has not been dropped. Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2023-08-16 17:18:14 -07:00
Anders Kaseorg	2ae285af7c	ruff: Fix PLR1714 Consider merging multiple comparisons. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-23 15:21:33 -07:00
Zixuan Li	a0cf624eaa	migrations: Backfill extra_data_json for audit log entries. This migration is reasonably complex because of various anomalies in existing data. Note that there are cases where extra_data is not proper JSON and may use single quotes. Thus we need to use "ast.literal_eval" to cover that. There is also a special case for "event_type == USER_FULL_NAME_CHANGED", where extra_data is a plain str. This event_type is only used for RealmAuditLog, so the zilencer migration script does not need to handle it. The migration does not handle "event_type == REALM_DISCOUNT_CHANGED" because ast.literal_eval only allows Python literals. We expect the admin to populate the jsonified extra_data for extra_data_json manually beforehand. This chunks the backfilling migration to reduce potential block time. The migration for zilencer is mostly similar to the one for zerver, except that the backfill helper is added in a wrapper and unrelated events are removed. Logging and error recovery: We print out a warning when the extra_data_json field of an entry would have been overwritten by a value inconsistent with what we derived from extra_data. Usually this only happens when the extra_data was corrupted before this migration. This prevents data loss by backing up possibly corrupted data in extra_data_json with the keys "inconsistent_old_extra_data" and "inconsistent_old_extra_data_json". More roundtrips to the database are needed for inconsistent data, which are expected to be infrequent. This also outputs messages when there are audit log entries with decimals, indicating that such entries are not backfilled. Do note that audit log entries with decimals are not populated with "inconsistent_old_extra_data_" in the JSONField, because they are not overwritten. For such audit log entries with "extra_data_json" marked as inconsistent, we skip them in the migration, because when we have discovered anomalies in a previous run, there is no need to overwrite them again, nesting the extra keys we added to it. Testing: We create a migration test case utilizing the property of bulk_create that it doesn't call our modified save method. We extend ZulipTestCase to support verifying console output at the test case level. The implementation is crude but the use case should be rare enough that we don't need it to be too elaborate. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2023-07-15 09:43:23 -07:00
Anders Kaseorg	7e707270f0	models: Convert deprecated index_together option to indexes. index_together is slated for removal in Django 5.1: https://docs.djangoproject.com/en/4.2/internals/deprecation/#deprecation-removed-in-5-1 We set the optional index names to match the previously generated index names to avoid adding new migrations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-12 07:12:43 -07:00
Zixuan Li	e39e04c3ce	migration: Add `extra_data_json` for audit log models. Note that we use the DjangoJSONEncoder so that we have builtin support for parsing Decimal and datetime. During this intermediate state, the migration that creates extra_data_json field has been run. We prepare for running the backfilling migration that populates extra_data_json from extra_data. This change implements double-write, which is important to keep the state of extra data consistent. For most extra_data usage, this is handled by the overridden `save` method on `AbstractRealmAuditLog`, where we generate extra_data_json using either orjson.loads or ast.literal_eval. While backfilling ensures that old realm audit log entries have extra_data_json populated, double-write ensures that any new entries generated will also have extra_data_json set, so that we can then safely rename extra_data_json to extra_data while ensuring the non-nullable invariant. For completeness, we additionally set RealmAuditLog.NEW_VALUE for the USER_FULL_NAME_CHANGED event. This cannot be handled with the overridden `save`. This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795 Note that extra_data_json at this point is not used yet. So the test cases do not need to switch to testing extra_data_json. This is later done after we rename extra_data_json to extra_data. Double-write for the remote server audit logs is special, because we only get the dumped bytes from an external source. Luckily, none of the payload carries extra_data that is not generated using orjson.dumps for audit logs of event types in SYNC_BILLING_EVENTS. This can be verified by looking at: `git grep -A 6 -E "event_type=.*(USER_CREATED\|USER_ACTIVATED\|USER_DEACTIVATED\|USER_REACTIVATED\|USER_ROLE_CHANGED\|REALM_DEACTIVATED\|REALM_REACTIVATED)"` Therefore, we just need to populate extra_data_json doing an orjson.loads call after a None-check. Co-authored-by: Zixuan James Li <p359101898@gmail.com>	2023-06-07 12:14:43 -07:00
Anders Kaseorg	0628c3cac8	migrations: Import BaseDatabaseSchemaEditor from its canonical module. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-03-05 14:46:28 -08:00
Anders Kaseorg	df001db1a9	black: Reformat with Black 23. Black 23 enforces some slightly more specific rules about empty line counts and redundant parenthesis removal, but the result is still compatible with Black 22. (This does not actually upgrade our Python environment to Black 23 yet.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-02-02 10:40:13 -08:00
Zixuan James Li	d5517932cd	typing: Use BaseDatabaseSchemaEditor in place of DatabaseSchemaEditor. This is a part of #18777. Signed-off-by: Zixuan James Li <359101898@qq.com>	2022-05-30 14:18:53 -07:00
Mateusz Mandera	f90beae616	zilencer: Drop the index from RemotePushDeviceToken.user_id. The index isn't used, because our unique_index entries provide better indexes for the queries.	2022-03-14 17:47:30 -07:00
Mateusz Mandera	0677c90170	zilencer: Change push bouncer API to accept uuids as user identifier. This is the first step to making the full switch to self-hosted servers use user uuids, per issue #18017. The old id format is still supported of course, for backward compatibility. This commit is separate in order to allow deploying just the bouncer API change to production first.	2022-03-14 17:47:30 -07:00
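
Note on the extra_data backfill commits (a0cf624eaa, e39e04c3ce): the messages describe decoding legacy extra_data strings with a JSON parser and falling back to ast.literal_eval when the stored value is a Python-style repr. A minimal sketch of that fallback follows, not the actual migration code. The helper name parse_legacy_extra_data and the choice to return an empty dict for None are assumptions made for illustration, and the standard-library json module stands in for orjson so the example has no third-party dependency.

```python
import ast
import json
from typing import Any, Optional


def parse_legacy_extra_data(raw: Optional[str]) -> Any:
    """Hypothetical helper: decode a legacy extra_data string.

    Tries JSON first, then falls back to Python literal syntax
    (for example, a dict repr that uses single quotes).
    """
    if raw is None:
        # Assumption for this sketch: a missing value becomes an empty dict.
        return {}
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # ast.literal_eval accepts only Python literals; something like
        # "{'x': Decimal('1')}" raises ValueError and needs separate handling.
        return ast.literal_eval(raw)


print(parse_legacy_extra_data('{"role": 100}'))
print(parse_legacy_extra_data("{'role': 100}"))
```

Because ast.literal_eval rejects anything that is not a plain literal, values such as Decimal reprs fail to parse, which is consistent with the commit message's remark that such entries are reported rather than backfilled.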

83 Commits