Commit Graph

858 Commits

Author SHA1 Message Date
roanster007 02d0566dc5 refactor: Rename `Huddle` Django model class to `DirectMessageGroup`.
This commit renames the "Huddle" Django model class to
"DirectMessageGroup", while maintaining the same table --
"zerver_huddle".

Fixes part of #28640.
2024-07-07 21:31:30 -07:00
Alex Vandiver f52a93bc14 upload: Stop requiring callers pass in the file size.
This can be calculated because we have the contents.
2024-07-07 14:40:07 -07:00
Alex Vandiver 0a296b2a6e upload: Start storing content-type for new uploads. 2024-07-07 14:40:07 -07:00
Mateusz Mandera 00b8cce50e push_notifs: Rename PushDeviceToken.GCM to FCM. 2024-06-17 18:22:59 -07:00
Alex Vandiver 50c3dd88e6 models: Migrate ids of all non-Message-related tables to bigint.
Migrate all `ids` of anything which does not have a foreign key from
the Message or UserMessage table (and would thus require walking
those) to be `bigint`.  This is done by removing explicit
`BigAutoField`s, trading them for explicit `AutoField`s on the tables
to not be migrated, while updating `DEFAULT_AUTO_FIELD` to the new
default.

In general, the tables adjusted in this commit are small tables -- at
least compared to Messages and UserMessages.

Many-to-many tables without their own model class are adjusted by a
custom Operation, since they do not automatically pick up migrations
when `DEFAULT_AUTO_FIELD` changes[^1].

Note that this does multiple scans over tables to update foreign
keys[^2].  Large installs may wish to hand-optimize this using the
output of `./manage.py sqlmigrate` to join multiple `ALTER TABLE`
statements into one, to speed up the migration.  This is unfortunately
not possible to do generically, as constraint names may differ between
installations.

This leaves the following primary keys as non-`bigint`:
- `auth_group.id`
- `auth_group_permissions.id`
- `auth_permission.id`
- `django_content_type.id`
- `django_migrations.id`
- `otp_static_staticdevice.id`
- `otp_static_statictoken.id`
- `otp_totp_totpdevice.id`
- `two_factor_phonedevice.id`
- `zerver_archivedmessage.id`
- `zerver_client.id`
- `zerver_message.id`
- `zerver_realm.id`
- `zerver_recipient.id`
- `zerver_userprofile.id`

[^1]: https://code.djangoproject.com/ticket/32674
[^2]: https://code.djangoproject.com/ticket/24203
2024-06-05 11:48:27 -07:00
Alex Vandiver 4f4725f810 analytics: Migrate models' id columns to bigint.
This helps prevent wraparound on exceedingly large and old installs,
particularly Zulip Cloud.  These are relatively simple migrations
since they are not referenced by any other tables; however, they are
quite large, and are actively used from Django by running servers,
making this not a migration which is possible to run without stopping
the server.

Use the escape hatch in the previous commit to temporarily pause
analytics writes while the migration happens.  This should make the
migration transparent to users, at the small cost of an artificial dip
in statistics (specifically, to push notification counts, and unread
message counts) while the migration runs.
2024-06-05 11:48:27 -07:00
Alex Vandiver 6c17cca208 zilencer: Drop unwanted data that old servers might still send. 2024-06-03 12:35:35 -07:00
Alex Vandiver 09e9c75ec6 analytics: Remove `active_users` and `active_users_log` metrics.
Both of these are inaccurate, not currently used anywhere, and have
been superseded by the `active_users_audit` metric.
2024-06-03 12:35:35 -07:00
Alex Vandiver 0100440a86 analytics: Make active_users_audit into a RealmCount.
With `realm_active_humans` no longer dependent on the per-user rows,
there is no reason to preserve them -- any measure of "was a user
active" should look directly at the much richer RealmAuditLog.  This
removes the bulk of the UserCount table, since the remaining rows all
require user interaction of some sort to produce rows.
2024-06-03 12:35:35 -07:00
Alex Vandiver 195defb031 analytics: Rewrite realm_active_humans::day query.
This makes it no longer dependent on active_users_audit:is_bot:day,
which subsequent commits will make a RealmCount, not UserCount, query.
This folds the same behaviour of `active_users_audit` directly into
the query; however, only running over active users, using the index
from the earlier commit, and using the new `DISTINCT ON` formulation
make this a fast query compared to `active_users_audit:is_bot:day` +
the old `realm_active_humans::day`.
2024-06-03 12:35:35 -07:00
Alex Vandiver e638ae44a8 analytics: Use a DISTINCT ON rather than a self-join.
This produces a query which is more comprehensible, is 2x faster when
limited to a realm, and has equivalent speed when performing the full
table scan.
2024-06-03 12:35:35 -07:00
Alex Vandiver 7ad967ebc7 test_counts: Create audit log entries when creating users. 2024-06-03 12:35:35 -07:00
Lauryn Menard 5892e48ba4 analytics: Update "Messages sent by client" chart for Flutter app.
Updates the labels in the "Messages sent by client" analytics chart
for the user-agent/client names for the Flutter mobile app, which
can be "ZulipFlutter" or "ZulipMobile/flutter".

Fixes #28220.
2024-05-28 10:18:40 -07:00
Alex Vandiver f246b82f67 puppet: Factor out pattern of writing a nagios state file atomically. 2024-05-24 11:31:25 -07:00
Alex Vandiver 88be3246a0 management: Move commands to all use ZulipBaseCommand. 2024-05-24 10:30:16 -07:00
Sahil Batra 7b42c802b1 invites: Add include_realm_default_subscriptions parameter.
This commit adds include_realm_default_subscriptions parameter
to the invite endpoints and the corresponding field in
PreregistrationUser and MultiuseInvite objects. This field will
be used to subscribe the new users to the default streams at the
time of account creation and not to the streams that were default
when sending the invite.
2024-05-14 14:20:07 -07:00
Mateusz Mandera 9406bfbc0a analytics: Store realm disk space used as a CountStat.
Fixes #29632.

The issue description explains this well:

We currently recalculate `currently_used_upload_space_bytes` every file
upload, by dint of calling `flush_used_upload_space_cache`  on
save/delete, and then immediately calling
`user_profile.realm.currently_used_upload_space_bytes()` in
`notify_attachment_update`.  Since this walks the Attachments table,
recalculating this can take seconds in large realms.

Switch this to using a CountStat, so we don't need to walk significant
chunks of the Attachment table when we upload an attachment.  This will
also give us a historical daily graph of usage.
2024-05-09 10:54:44 -07:00
Mahhheshh 1198785c62 analytics: Improve do_increment_logging_stat performance.
The previous implementation using Django's `get_or_create` for
`do_increment_logging_stat` involved two separate database queries,
potentially leading to race conditions.

Use an `ON CONFLICT ... DO UPDATE` (aka "upsert") query, which
eliminates race conditions and improves performance.  This is mildly
complicated due to the different unique indexes across the various
tables, and the need for bug-for-bug compatibility with the previous
implementation.

Fixes #28947.

Co-authored-by: Alex Vandiver <alexmv@zulip.com>
2024-05-06 16:34:01 -07:00
Mahhheshh 218c7ae8cd test_counts: Add test to do_incremental_logging_stat.
Adds an test, to count the number of database queries made by
`do_incremental_logging_stat` function.
2024-05-06 16:34:01 -07:00
Alex Vandiver 9dfaa83aa8 invites: Remove invites worker, make confirmation object in-process.
The "invites" worker exists to do two things -- make a Confirmation
object, and send the outgoing email.  Making the Confirmation object
in a background process from where the PreregistrationUser is created
temporarily leaves the PreregistrationUser in invalid state, and
results in 500's, and the user not immediately seeing the sent
invitation.  That the "invites" worker also wants to create the
Confirmation object means that "resending" an invite invalidates the
URL in the previous email, which can be confusing to the user.

Moving the Confirmation creation to the same transaction solves both
of these issues, and leaves the "invites" worker with nothing to do
but send the email; as such, we remove it entirely, and use the
existing "email_senders" worker to send the invites.  The volume of
invites is small enough that this will not affect other uses of that
worker.

Fixes: #21306
Fixes: #24275
2024-05-02 14:23:04 -07:00
Alex Vandiver d863aa56de invites: Lock the realm when determining invitation counts.
This prevents users from hammering the invitation endpoint, causing
races, and inviting more users than they should otherwise be allowed
to.

Doing this requires that we not raise InvitationError when we have
partially succeeded; that behaviour is left to the one callsite of
do_invite_users.

Reported by Lakshit Agarwal (@chiekosec).
2024-05-02 14:23:04 -07:00
Sahil Batra e78d0aacaf tests: Use NamedUserGroup for queries. 2024-04-26 17:03:09 -07:00
Sahil Batra a96c8b8352 groups: Use NamedUserGroup for all queries. 2024-04-26 17:03:09 -07:00
Alex Vandiver 11dd6791c4 management: Provide a common lockfile dir, and a decorator for it.
Factor out the repeated pattern of taking a lock, or immediately
aborting with a message if it cannot be acquired.  The exit code in
that situation is changed to be exit code 1, rather than the successful
0; we are likely missing new work since that process started.

We move the lockfiles to a common directory under `/srv/zulip-locks`
rather than muddy up `/home/zulip/deployments`.
2024-04-24 14:40:28 -07:00
Lauryn Menard 91ffb548cc streams: Update translated errors for stream to channel rename.
Updates translated JsonableError strings that relate to streams
to use channel instead of stream. Separated from other error string
updates as this is a dense area of changes for this rename.

Part of stream to channel rename project.
2024-04-24 14:35:05 -07:00
Lauryn Menard d0a62020ff stats: Update translated strings for stream to channel rename.
Updates the labels for the "Messages sent by recipient type" chart
to use "channel", and updates the error message that would be sent
for the "messages_sent_by_stream" chart (that has not yet been
implemented) for a missing channel/stream.

Part of the stream to channel rename project.
2024-04-24 14:35:05 -07:00
John Lu a5cf0ec526
refactor: Replace HUDDLE with DIRECT_MESSAGE_GROUP.
Replaced HUDDLE attribute with DIRECT_MESSAGE_GROUP using VS Code search,
part of a general renaming of the object class.

Fixes part of #28640.

Co-authored-by: JohnLu2004 <JohnLu10212004@gmail.com>
2024-03-21 16:39:33 -07:00
Mateusz Mandera 634015411a update_analytics_count: Use a correct lock mechanism.
Adds a re-usable lockfile_nonblocking helper to context_managers.

Relying on naive `os.mkdir` is not enough especially now that the
successful operation of this command is necessary for push notifications
to work for many servers.

We can't use `lockfile` context manager from
`zerver.lib.context_managers`, because we want the custom behavior of
failing if the lock can't be acquired, instead of waiting.
That's because if an instance of this gets stuck, we don't want to start
queueing up more processes waiting forever whenever the cronjob runs
again and fail->exit is preferrable instead.
2024-03-05 10:21:14 -08:00
Anders Kaseorg 87992b8b29 ruff: Fix PERF403 Use a dictionary comprehension instead of a for-loop.
This is a preview rule, not yet enabled by default.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-03-01 09:30:04 -08:00
Anders Kaseorg 570f3dd447 python: Reformat with Ruff formatter.
https://docs.astral.sh/ruff/formatter/

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-29 17:07:16 -08:00
Lauryn Menard cf82d3316b push-bouncer: Exclude LoggingCountStats with partial data.
LoggingCountStats with a daily duration and that are directly stored
on the RealmCount table (not via aggregation in process_count_stat),
can be in a state, after the hourly cron job to update analytics
counts, where the logged value will be live-updated later, because
the end time for the stat is still in the future.

As these logging counts are designed to be used on the self-hosted
installation for either debugging or rate limiting, sending these
partial/incomplete counts to the bouncer has low value.
2024-02-26 17:53:12 -08:00
Tim Abbott d0c276d863 corporate: Fix billing_session variable reuse confusion.
The previous logic incorrectly used the server-level number of users
even when a (presumably smaller) realm-level count was available.

Fixes a bug introduced in 2e1ed4431a.
2024-02-21 17:51:30 -08:00
Anders Kaseorg a4938d3760 page_params: Parse page_params and state_data with Zod.
This establishes a runtime check that their types continue to reflect
reality going forward.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-17 00:02:38 -08:00
Anders Kaseorg 688a9be556 page_params: Remove unused remote.
It’s been unused since its introduction in commit
ebdd55814c.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-08 10:08:15 -08:00
Anders Kaseorg c23f6a786d page_params: Remove unused for_installation.
It’s been unused since its introduction in commit
1af7fc7344 (#9458).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-02-08 10:08:15 -08:00
Tim Abbott 51542eb55e stats: Fix bad query plan for remote counts.
We don't have an index on `(server_id, id)`, and in any case, we have
a stronger guarantee that `remote_id` is time-sorted, from the
construction of the analytics tables, than that the `id`s given these
entries when uploaded are time-sorted.
2024-02-06 18:06:17 -08:00
Lauryn Menard df2f4b6469 corporate: Move support and activity views to /corporate.
View functions in `analytics/views/support.py` are moved to
`corporate/views/support.py`.

Shared activity functions in `analytics/views/activity_common.py`
are moved to `corporate/lib/activity.py`, which was also renamed
from `corporate/lib/analytics.py`.
2024-01-30 10:06:48 -08:00
Prakhar Pratyush edec29e0b6 support: Add support to configure fixed_price plan. 2024-01-29 11:23:20 -08:00
Anders Kaseorg 93198a19ed requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2024-01-29 10:41:54 -08:00
Tim Abbott 7b4afb35e7 analytics: Fix improperly scoped billing import. 2024-01-26 09:33:30 -08:00
Lauryn Menard bfd9eec4b3 activity: Add totals row as sticky footer to activity charts.
Updates the total row for the installation and remote activity
charts to be in the table footer. Makes the footer class sticky
to the bottom of the view so that it is always visible on the
chart.

Also, updates the installation activity column for revenue to
be formatted as a dollar string, since this formatting was being
applied in the updated total row.
2024-01-26 09:04:39 -08:00
Lauryn Menard dcae35196c remote-activity: Prefetch LicenseLedger entry for current plan.
To estimate the annual recurring revenue for remote server and
remote realm CustomerPlans, we prefetch the current license
ledger as part of the CustomerPlan query.

Also adds a select related to the remote realm audit log query
so that we don't go to the database for the remote realm ID.

With the test added in the previous commit, the query count for
the remote activity view goes from 27 to 11, as we are no longer
hitting the database multiple times for every current plan or
for every remote realm with audit log data.

Refactors get_customer_plan_renewal_amount so that a license
ledger is always passed and make_end_of_cycle_updates_if_needed
does not need to be called.
2024-01-25 15:35:44 -08:00
Lauryn Menard 842dcb6546 analytics-tests: Build robust test data for remote activity view.
Creates multiple remote servers and remote realms with active
plan data and audit logs.

Shows multiple small queries to get revenue data from plan license
ledger, as well as remote server and remote realm IDs.
2024-01-25 15:35:44 -08:00
Lauryn Menard 1249b5929e remote-activity: Update column data for remote realms.
Combines the IDs for the remote realm and remote server in the
first column.

Moves the Realm name to the second column.

For the host/hostname column, displays the remote realm host if
it is a remote realm row. Otherwise, the remote server hostname
is displayed.
2024-01-22 11:55:31 -08:00
Lauryn Menard fca9ff1ae7 support: Show date for start of next billing cycle for current plan.
Instead of showing the next invoice date for the plan, show the
date for the next billing cycle start (e.g. the next plan renewal
charge date), except for plans currently on a free trial.

For plans on a free trial, the next plan renewal date will be when
the free trial ends, which is stored as the next invoice date on
the plan.
2024-01-22 10:09:00 -08:00
Lauryn Menard 76b26612a0 remote-activity: Get user counts for all servers and realms.
Instead of querying the database for every remote server and realm
in the remote activity chart, we now get the server and realm data
for the installation in two queries.
2024-01-19 11:46:13 -08:00
Lauryn Menard 536aef854c remote-activity: Display rows for remote realms.
Adds columns for remote realm ID, name and organization type. If
a remote server has remote realms attached that are not marked
as deactivated by the remote server, then there will be a row in
the chart for each remote realm (which duplicates some remote
server data).

Updates the plan data, revenue and user counts to be for the realm
if present and otherwise for the server.

Updates the user counts to be total users and guest users, instead
of non guest and guest users.

The total row for mobile data (users and pushes forwarded) sums
each remote server's data once, so while the column duplicates
data, the total row should be an accurate total for the installation.

Adds 5 queries to the remote activity page test. One is for the
additional query for the remote realm plans. The other four are
getting the remote realm object and then the user count data for
the two remote realms in the test.
2024-01-19 11:46:13 -08:00
Lauryn Menard 8310d1f0be remote-activity: Use constants for columns when iterating over rows.
This will help with updating this function for updating/changing
the columns in the chart as only the constant needs to be updated
and not every instance the integer is used in the loops over the
rows of data.
2024-01-19 11:46:13 -08:00
Lauryn Menard 440b60b261 remote-activity: Remove analytics users column. 2024-01-19 11:46:13 -08:00
Lauryn Menard 030f899195 realm-activity: Merge chart for human and bot users.
Merges the two charts remaining to have just one chart for the
realm activity view.

Removes the columns for different clients and adds a column to
show/sort by the user type (human or bot type).

Deletes templates/analytics/activity.html because it is no
longer used for any activity pages/views.
2024-01-16 09:43:42 -08:00