push bouncer: Submit basic metadata unconditionally.

These metadata are essentially all publicly available anyway, and
making uploading them unconditional will simplify some things.

The documentation is not quite accurate in that it claims the server
will upload some metadata that is not actually uploaded yet (but will
be soon). This seems harmless.
Tim Abbott 2023-11-28 14:20:24 -08:00
parent 2765c63f56
commit 7db15176f3
5 changed files with 110 additions and 32 deletions


@ -96,7 +96,7 @@ class Command(BaseCommand):
)
logger.info("Finished updating analytics counts through %s", fill_to_time)
if settings.PUSH_NOTIFICATION_BOUNCER_URL and settings.SUBMIT_USAGE_STATISTICS:
if settings.PUSH_NOTIFICATION_BOUNCER_URL:
# Skew 0-10 minutes based on a hash of settings.ZULIP_ORG_ID, so
# that each server will report in at a somewhat consistent time.
assert settings.ZULIP_ORG_ID
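
The comment above describes skewing each server's report by 0-10 minutes based on a hash of `settings.ZULIP_ORG_ID`. The sketch below shows one way such a stable per-server delay could be derived. It is not the code from this commit: the function name `report_delay_seconds`, the choice of SHA-256, and the 600-second bound are assumptions made for illustration.

```python
import hashlib


def report_delay_seconds(org_id: str, max_delay_seconds: int = 600) -> int:
    """Map an organization ID to a stable delay in [0, max_delay_seconds).

    Hypothetical helper: the same org_id always yields the same delay, so a
    given server reports at a consistent offset, while different servers are
    spread across the window.
    """
    digest = hashlib.sha256(org_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % max_delay_seconds
```

A caller would then sleep for `report_delay_seconds(settings.ZULIP_ORG_ID)` seconds before contacting the bouncer. Using a cryptographic hash rather than Python's built-in `hash()` keeps the value stable across processes, since `hash()` on strings is randomized per interpreter run by default.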


@ -147,6 +147,13 @@ _Released 2023-11-16_
#### Upgrade notes for 8.0
- Installations using the [Mobile Push Notifications
Service][mobile-push] now regularly upload [basic
metadata][mobile-push-metadata] about the organizations hosted by
the installation to the Mobile Push Notifications
Service. Previously, basic metadata was uploaded only when uploading
usage statistics was also enabled via the `SUBMIT_USAGE_STATISTICS`
setting.
- This release contains several expensive migrations, most notably
`0472_add_message_realm_id_indexes.py`,
`0485_alter_usermessage_flags_and_add_index.py`, and
@ -166,6 +173,8 @@ _Released 2023-11-16_
controls. This behavior was removed to simplify hosting multiple
organizations with different LDAP configuration preferences.
[mobile-push-metadata]: ../production/mobile-push-notifications.md#uploading-usage-statistics
## Zulip Server 7.x series
### Zulip Server 7.5


@ -18,20 +18,19 @@ follows:
outgoing HTTP proxy](deployment.md#customizing-the-outgoing-http-proxy)
first.
1. Decide whether to upload basic usage statistics. Systems using the Mobile
Push Notifications Service will, by default, submit basic usage statistics
(e.g. Zulip version, number of users, number of messages sent) to the service.
These statistics help Zulip's maintainers understand how many people are
self-hosting Zulip in order to allocate resources towards supporting
self-hosted installations.
1. Decide whether to share usage statistics with the Zulip team.
Our use of these statistics is governed by the same [Terms of
Service](https://zulip.com/policies/terms) and [Privacy
Policy](https://zulip.com/policies/privacy) that covers the Mobile Push
Notifications Service itself. If your organization does not want to submit these
statistics, you can disable this feature during setup or at any time by setting
By default, Zulip installations using the Mobile Push Notifications
Service submit additional usage statistics that help Zulip's
maintainers allocate resources towards supporting self-hosted
installations ([details](#uploading-usage-statistics)). You can
disable submitting usage statistics now or at any time by setting
`SUBMIT_USAGE_STATISTICS=False` in `/etc/zulip/settings.py`.
Note that all systems using the service upload [basic
metadata](#uploading-basic-metadata) about the organizations hosted
by the installation.
1. Uncomment the
`PUSH_NOTIFICATION_BOUNCER_URL = 'https://push.zulipchat.com'` line
in your `/etc/zulip/settings.py` file (i.e., remove the `#` at the
@ -105,6 +104,8 @@ and privacy in mind:
a given notification to the appropriate set of mobile devices.
These user ID numbers are opaque to the Push Notification
Service and Kandra Labs.
- [Basic organization metadata](#uploading-basic-metadata) and
[optional usage statistics](#uploading-usage-statistics)
- The Push Notification Service receives (but does not store) the
contents of individual mobile push notifications:
@ -135,11 +136,71 @@ and privacy in mind:
source and available as part of the
[Zulip server project on GitHub](https://github.com/zulip/zulip).
- The push notification forwarding servers are professionally managed
by a small team of security expert engineers.
by a small team of security-sensitive engineers.
If you have any questions about the security model, [contact Zulip
support](https://zulip.com/help/contact-support).
### Uploading basic metadata
All Zulip installations running Zulip 8.0 or greater that are
registered for the Mobile Push Notifications Service regularly upload
to the service basic metadata about the organizations hosted by the
installation. (Older Zulip servers upload these metadata only if
[uploading usage statistics](#uploading-usage-statistics) is enabled).
Uploaded metadata consists of, for each organization hosted by the
installation:
- A subset of the basic metadata returned by the unauthenticated [`GET
/server_settings` API
endpoint](https://zulip.com/api/get-server-settings).
The purpose of that API endpoint is to serve the minimal data
needed by the Zulip mobile apps in order to:
- Verify that a given URL is indeed a valid Zulip server URL
- Present a correct login form, offering only the supported features
and authentication methods for that organization and Zulip server
version.
Most of the metadata it returns is necessarily displayed to anyone
with network access to the Zulip server on the login and signup
pages for your Zulip organization as well.
(Some fields returned by this endpoint, like the organization icon
and description, are not included in uploaded metadata.)
- The [organization type](https://zulip.com/help/organization-type)
and creation date.
- The number of user accounts with each role.
Our use of uploaded metadata is governed by the same [Terms of
Service](https://zulip.com/policies/terms) and [Privacy
Policy](https://zulip.com/policies/privacy) that covers the Mobile
Push Notifications Service itself.
### Uploading usage statistics
By default, Zulip installations that register for the Mobile Push
Notifications Service upload the following usage statistics. You can
disable these uploads any time by setting
`SUBMIT_USAGE_STATISTICS=False` in `/etc/zulip/settings.py`.
- Totals for messages sent and read with subtotals for various
combinations of clients and integrations.
- Totals for active users under a few definitions (1day, 7day, 15day)
and related statistics.
Some of the graphs on your server's [usage statistics
page](https://zulip.com/help/analytics) can be generated from these
statistics.
Our use of uploaded usage statistics is governed by the same [Terms of
Service](https://zulip.com/policies/terms) and [Privacy
Policy](https://zulip.com/policies/privacy) that covers the Mobile
Push Notifications Service itself.
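
For illustration, the settings for an installation that uses the service but opts out of usage statistics might look like the excerpt below. Both setting names and the URL are taken from the documentation text in this diff; the comments are explanatory and the excerpt itself is not part of the commit.

```python
# /etc/zulip/settings.py (illustrative excerpt)

# Registers the installation with the Mobile Push Notifications Service.
# As of this commit, basic organization metadata is uploaded whenever this
# setting is configured.
PUSH_NOTIFICATION_BOUNCER_URL = "https://push.zulipchat.com"

# Opts out of the optional usage statistics only; it no longer affects
# whether basic metadata is uploaded.
SUBMIT_USAGE_STATISTICS = False
```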
## Rate limits
The Mobile Push Notifications Service API has a very high default rate


@ -240,27 +240,30 @@ def send_analytics_to_push_bouncer() -> None:
logger.warning(e.msg, exc_info=True)
return
# Gather only entries with IDs greater than the last ID received by the push bouncer.
# We don't re-send old data that's already been submitted.
last_acked_realm_count_id = result["last_realm_count_id"]
last_acked_installation_count_id = result["last_installation_count_id"]
last_acked_realmauditlog_id = result["last_realmauditlog_id"]
# Gather only entries with IDs greater than the last ID received by the push bouncer.
# We don't re-send old data that's already been submitted.
(realm_count_data, installation_count_data, realmauditlog_data) = build_analytics_data(
realm_count_query=RealmCount.objects.filter(id__gt=last_acked_realm_count_id),
installation_count_query=InstallationCount.objects.filter(
if settings.SUBMIT_USAGE_STATISTICS:
installation_count_query = InstallationCount.objects.filter(
id__gt=last_acked_installation_count_id
),
)
realm_count_query = RealmCount.objects.filter(id__gt=last_acked_realm_count_id)
else:
installation_count_query = InstallationCount.objects.none()
realm_count_query = RealmCount.objects.none()
(realm_count_data, installation_count_data, realmauditlog_data) = build_analytics_data(
realm_count_query=realm_count_query,
installation_count_query=installation_count_query,
realmauditlog_query=RealmAuditLog.objects.filter(
event_type__in=RealmAuditLog.SYNCED_BILLING_EVENTS, id__gt=last_acked_realmauditlog_id
),
)
record_count = len(realm_count_data) + len(installation_count_data) + len(realmauditlog_data)
if record_count == 0:
logger.info("No new records to report.")
return
request = {
"realm_counts": orjson.dumps(realm_count_data).decode(),
"installation_counts": orjson.dumps(installation_count_data).decode(),
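
The selection logic in this hunk can be sketched without Django as follows. This is a simplified model and not the commit's code: plain lists of dicts stand in for the `RealmCount`, `InstallationCount`, and `RealmAuditLog` querysets, and the function name `select_rows_to_send` and the `last_acked` key names are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, int]


def select_rows_to_send(
    submit_usage_statistics: bool,
    realm_counts: List[Row],
    installation_counts: List[Row],
    audit_log_rows: List[Row],
    last_acked: Dict[str, int],
) -> Dict[str, List[Row]]:
    """Pick the rows to upload, mirroring the gating shown in the diff."""

    def newer(rows: List[Row], key: str) -> List[Row]:
        # Only rows with IDs above the last acknowledged ID are re-sent.
        return [row for row in rows if row["id"] > last_acked[key]]

    if submit_usage_statistics:
        realm = newer(realm_counts, "realm_count")
        installation = newer(installation_counts, "installation_count")
    else:
        # Analogous to RealmCount.objects.none() / InstallationCount.objects.none().
        realm, installation = [], []

    return {
        "realm_counts": realm,
        "installation_counts": installation,
        # Audit log rows are selected regardless of the statistics setting.
        "realmauditlog_rows": newer(audit_log_rows, "realmauditlog"),
    }
```

With `submit_usage_statistics=False`, both count lists come back empty while newer audit log rows are still returned, which matches the test comment later in this commit ("With this setting off, we don't send RealmCounts and InstallationCounts").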


@ -969,8 +969,13 @@ class AnalyticsBouncerTest(BouncerTestCase):
self.assertEqual(InstallationCount.objects.count(), 1)
self.assertEqual(RealmAuditLog.objects.filter(id__gt=audit_log_max_id).count(), 2)
with self.settings(SUBMIT_USAGE_STATISTICS=False):
# With this setting off, we don't send RealmCounts and InstallationCounts.
send_analytics_to_push_bouncer()
check_counts(2, 2, 0, 0, 1)
send_analytics_to_push_bouncer()
check_counts(2, 2, 1, 1, 1)
check_counts(3, 3, 1, 1, 1)
self.assertEqual(
list(
@ -1015,7 +1020,7 @@ class AnalyticsBouncerTest(BouncerTestCase):
do_deactivate_realm(zephyr_realm, acting_user=None)
send_analytics_to_push_bouncer()
check_counts(3, 3, 1, 1, 4)
check_counts(4, 4, 1, 1, 4)
zephyr_remote_realm = RemoteRealm.objects.get(uuid=zephyr_realm.uuid)
self.assertEqual(zephyr_remote_realm.host, zephyr_realm.host)
@ -1059,7 +1064,7 @@ class AnalyticsBouncerTest(BouncerTestCase):
# Test having no new rows
send_analytics_to_push_bouncer()
check_counts(4, 3, 1, 1, 4)
check_counts(5, 5, 1, 1, 4)
# Test only having new RealmCount rows
RealmCount.objects.create(
@ -1075,14 +1080,14 @@ class AnalyticsBouncerTest(BouncerTestCase):
value=9,
)
send_analytics_to_push_bouncer()
check_counts(5, 4, 3, 1, 4)
check_counts(6, 6, 3, 1, 4)
# Test only having new InstallationCount rows
InstallationCount.objects.create(
property=realm_stat.property, end_time=end_time + datetime.timedelta(days=1), value=6
)
send_analytics_to_push_bouncer()
check_counts(6, 5, 3, 2, 4)
check_counts(7, 7, 3, 2, 4)
# Test only having new RealmAuditLog rows
# Non-synced event
@ -1094,7 +1099,7 @@ class AnalyticsBouncerTest(BouncerTestCase):
extra_data={"data": "foo"},
)
send_analytics_to_push_bouncer()
check_counts(7, 5, 3, 2, 4)
check_counts(8, 8, 3, 2, 4)
# Synced event
RealmAuditLog.objects.create(
realm=user.realm,
@ -1106,7 +1111,7 @@ class AnalyticsBouncerTest(BouncerTestCase):
},
)
send_analytics_to_push_bouncer()
check_counts(8, 6, 3, 2, 5)
check_counts(9, 9, 3, 2, 5)
# Now create an InstallationCount with a property that's not supposed
# to be tracked by the remote server - since the bouncer itself tracks
@ -1126,7 +1131,7 @@ class AnalyticsBouncerTest(BouncerTestCase):
)
# The analytics endpoint call counts increase by 1, but the actual RemoteCounts remain unchanged,
# since syncing the data failed.
check_counts(9, 7, 3, 2, 5)
check_counts(10, 10, 3, 2, 5)
forbidden_installation_count.delete()
(realm_count_data, installation_count_data, realmauditlog_data) = build_analytics_data(
@ -1160,7 +1165,7 @@ class AnalyticsBouncerTest(BouncerTestCase):
],
)
# Only the request counts go up -- all of the other rows' duplicates are dropped
check_counts(10, 8, 3, 2, 5)
check_counts(11, 11, 3, 2, 5)
# Test that only valid org_type values are accepted - integers defined in OrgTypeEnum.
realms_data = [dict(realm) for realm in get_realms_info_for_push_bouncer()]