The previous query suffered from bad corner cases when the user had
received a large number of direct messages but sent very few,
comparatively. This mean that the first half of the UNION would
retrieve a very large number of UserMessage rows, requiring fetching a
large number of Message rows, merely to throw them away upon
determining that the recipient was the current user.
Instead of merging two queries of "last 1k received" + "last 1k sent",
we instead make better use of the UserMessage rows to find "last 1k
sent or received." This may change the list of recipients, as large
disparities in sent/received messages may result in pushing the
most-recently-sent users off of the list. These are likely uncommon
edge cases, however -- and the disparity is the whole reason for the
performance problem.
This also provides more correct answers. In the case where a user's
1001'th message sent was to person A today, but my most recent message
received was from them yesterday, the previous plan would show the
message I received yesterday message-id as the max, and not the more
recent message I sent today.
While we could theoretically raise the `RECENT_CONVERSATIONS_LIMIT` to
more frequently match the same recipient list as previously, this
increases the cost of the most common cases unreasonably. With a
1000-message limit, the common cases are slightly faster, and the tail
latencies are very much improved; raising `RECENT_CONVERSATIONS_LIMIT`
would increase the result similarity to the old algorithm, at the cost
of the p50 and p75.
| | Old | New |
| ------ | ------- | ------- |
| Mean | 0.05287 | 0.02520 |
| p50 | 0.00695 | 0.00556 |
| p75 | 0.05592 | 0.03351 |
| p90 | 0.14645 | 0.08026 |
| p95 | 0.20181 | 0.10906 |
| p99 | 0.30691 | 0.16014 |
| p99.9 | 0.57894 | 0.19521 |
| max | 22.0610 | 0.22184 |
On the whole, however, the much more bounded worst case are worth the
small changes to the resultset.
This is preparatory work towards adding a Topic model.
We plan to use the local variable name as 'topic' for
the Topic model objects.
Currently, we use *topic as the local variable name for
topic names.
We rename local variables of the form *topic to *topic_name
so that we don't need to think about type collisions in
individual code paths where we might want to talk about both
Topic objects and strings for the topic name.
This is preparatory work towards adding a Topic model.
We plan to use the local variable name as 'topic' for
the Topic model objects.
Currently, we use *topic as the local variable name for
topic names.
We rename local variables of the form *topic to *topic_name
so that we don't need to think about type collisions in
individual code paths where we might want to talk about both
Topic objects and strings for the topic name.
This is preparatory work towards adding a Topic model.
We plan to use the local variable name as 'topic' for
the Topic model objects.
Currently, we use *topic as the local variable name for
topic names.
We rename local variables of the form *topic to *topic_name
so that we don't need to think about type collisions in
individual code paths where we might want to talk about both
Topic objects and strings for the topic name.
Requests to these endpoint are about a specified user, and therefore
also have a notion of the RemoteRealm for these requests. Until now
these endpoints weren't getting the realm_uuid value, because it wasn't
used - but now it is needed for updating .last_request_datetime on the
RemoteRealm.
`<time:1234567890123>` causes a "signed integer is greater than
maximum" exception from dateutil.parser; datetime also cannot handle
it ("year 41091 is out of range") but that is a ValueError which is
already caught.
Catch the OverflowError thrown by dateutil.
boto3 has two different modalities of making API calls -- through
resources, and through clients. Resources are a higher-level
abstraction, and thus more generally useful, but some APIs are only
accessible through clients. It is possible to get to a client object
from a resource, but not vice versa.
Use `get_bucket(...).meta.client` when we need direct access to the
client object for more complex API calls; this lets all of the
configuration for how to access S3 to sit within `get_bucket`. Client
objects are not bound to only one bucket, but we get to them based on
the bucket we will be interacting with, for clarity.
We removed the cached session object, as it serves no real purpose.
e883ab057f started caching the boto client, which we had identified
as slow call. e883ab057f went further, calling
`get_boto_client().generate_presigned_url()` once and caching that
result.
This makes the inner cache on the client useless. Remove it.
If we `.distinct("delivery_email")` then we must also
`.order_by("delivery_email")`; adc987dc43 added the `.order_by`
call, which broke the newsletter codepath, since it did not contain
the `delivery_email` in the ordering fields.
Add a flag to distinct on emails in `send_custom_email`.
For remote servers, we cannot advertise `List-Unsubscribe=One-Click`,
which is specified in RFC 8058[^1] to mean that the `List-Unsubscribe`
URL supports a POST request with no arguments to unsubscribe. Because
we show an interstitial and confirmation page, as this is not just a
mailing list which is disabled if you click the link, it does not
support the mail system performing the unsubscribe for the user.
Remove the inaccurate header for remote servers.
[^1]: https://datatracker.ietf.org/doc/html/rfc8058
612f2c73d6 started passing add_context to
`send_custom_server_email`, but did not make it make use of it.
Also add the `hostname` as a built-in value, since that is most likely
the most useful property.
This fixes the exception case on the initial
`/api/v1/remotes/server/analytics/status` case. Other exceptions from
`send_to_push_bouncer` are allowed to escape.
Co-authored-by: Alex Vandiver <alexmv@zulip.com>
Previously, passing a url longer than 200 characters for
jitsi_server_url caused a low-level failure at DB level. This
commit adds this restriction at API level.
Fixes part of #27355.
Saying `**options: str` is a lie, since it contains bools. We pluck
out the two bools that we need properly typed because we will be
pushing them into function calls, and type them explicitly as bools.
If the exception was because the channel closed, attempting to NAK the
events will just raise another error, and is pointless, as the server
already marked the pending events as NAK'd.
This ensures determinism in these tests doing mock_send.assert_called
with - avoids producing test flakes due to a different order of
retrieval of these objects from the database.
- The server sends the list of registrations it believes to have with
the bouncer.
- The bouncer includes in the response the registrations that it doesn't
actually have and therefore the server should delete.
This commit creates a RealmAuditlog entry with a new event_type
'RealmAuditLog.REALM_IMPORTED' after the realm is reactivated.
It contains user count data (using realm_user_count_by_role)
stored in extra_data.
This helps to have an accurate user count data for the billing
system if someone tries to signup just after doing an import.
This is a rename of the previous
enqueue_register_realm_with_push_bouncer_if_needed but is clearer
about the fact that this will also upload audit logs if available.
Given that most of the use cases for realms-only code path would
really like to upload audit logs too, and the others would likely
produce a better user experience if they upoaded audit logs, we
should just have a single main code path here i.e.
'send_analytics_to_push_bouncer'.
We still only upload usage statistics according to documented
option, and only from the analytics cron job.
The error handling takes place in 'send_analytics_to_push_bouncer'
itself.
When a self-hosted Zulip server does a data export and then import
process into a different hosting environment (i.e. not sharing the
RemoteZulipServer with the original, we'll have various things that
fail where we look up the RemoteRealm by UUID and find it but the
RemoteZulipServer it is associated with is the wrong one.
Right now, we ask user to contact support via an error page but
might develop UI to help user do the migration directly.
This commit adds code to not include original details of senders like
name, email and avatar url in the message objects sent through events
and in the response of endpoint used to fetch messages.
This is the last major commit for the project to add support for
limiting guest access to an entire organization.
Fixes#10970.
This commit sets the client capability value to not pass
unknown users data in the webapp and also does some changes
to avoid errors while loading the web-app home page.
This commit only does some basic webapp changes to not show
inaccessible users in sidebar and we would need need more
changes to make the web-app work as expected which will be
done in further commits.
Adds `user.realm.string_id` as the realm name to the base payload
for notifications. Uses this realm name in the body of the alert
in the `apns_data`.
Changes the event string from "test-by-device-token" to "test".
Fixes#28075.
Earlier, the event sent when an onboarding step (hotspot till now)
is marked as read generated an event with type='hotspots' and
'hotspots' named array in it.
This commit renames the type to 'onboarding_steps' and the array
to 'onboarding_steps' to reflect the fact that it'll also contain
data for elements other than hotspots.
This commit adds a 'type' field to the objects
in 'hotspots' array sent in 'hotspots' events.
We have explicitly added this field as we eventually
plan to have two type of onboarding steps, 'hotspots'
and 'one_time_notice'.
This will help clients to easily identify them.
This commit adds a new endpoint 'users/me/onboarding_steps'
deprecating the older 'users/me/hotspots' to mark hotspot as read.
We also renamed the view `mark_hotspot_as_read` to
`mark_onboarding_step_as_read`.
Reason: Our plan is to make this endpoint flexible to support
other types of UI elements not just restricted to hotspots.
This prep commit moves the 'rename_indexes_constraints'
function to 'lib/migrate' as we're going to re-use it for
the 'UserHotspot' to 'OnboardingStep' table rename operation.
In general, this function would be helpful in migrations
involving table rename operations, subject to the caution
mentioned in the function via comments.
This commit adds code to include original name, email and avatar
for inaccessible users which can happen when a user sends message
to an unsubscribed stream.
We rename "intro_gear" to "intro_personal" because after the menu
was split into help menu, main menu and personal menu, the "Settings"
option now resides inside the personal menu.
Fixes#27878.
This creates a valid registration, for two reasons:
1. Avoid the need to run "manage.py register_server" in dev env to
register, when wanting to to test stuff with
`PUSH_NOTIFICATION_BOUNCER_URL = "http://localhost:9991"`.
2. Avoid breaking RemoteRealm syncing, due to duplicate registrations
(first set of registrations that gets set up with the dummy
RemoteZulipServer in populate_db, and the second that gets set up via
the regular syncing mechanism with the new RemoteZulipServer created
during register_server).
This is a prep commit to return, for each remote realm, the 'uuid',
'can_push', and 'expected_end_timestamp'.
This data will be used in 'initialize_push_notifications'.
Implements a nice redirect flow to give a good UX for users attempting
to access a remote billing page with an expired RemoteRealm session e.g.
/realm/some-uuid/sponsorship - perhaps through their browser
history or just their session expired while they were doing things in
this billing system.
The logic has a few pieces:
1. get_remote_realm_from_session, if the user doesn't have a
identity_dict will raise RemoteBillingAuthenticationError.
2. If the user has an identity_dict, but it's expired, then
get_identity_dict_from_session inside of get_remote_realm_from_session
will raise RemoteBillingIdentityExpiredError.
3. The decorator authenticated_remote_realm_management_endpoint
catches that exception and uses some general logic, described in more
detail in the comments in the code, to figure out the right URL to
redirect them to. Something like:
https://theirserver.example.com/self-hosted-billing/?next_page=...
where the next_page param is determined based on parsing request.path
to see what kind of endpoint they're trying to access.
4. The remote_server_billing_entry endpoint is tweaked to also send
its uri scheme to the bouncer, so that the bouncer can know whether
to do the redirect on http or https.
This consists of the following pieces:
1. Makes servers using the bouncer send realm_uuid in requests for token
registration. (Sidenote: realm_uuid is already sent in the "send
notification" codepath as of
48db4bf854)
2. This allows the bouncer to tie RemotePushDeviceToken to the
RemoteRealm with matching realm_uuid at registration time.
3. Introduce handling of some potential weird edge cases around the
realm_uuid and RemoteRealm objects in get_remote_realm_helper.
This default setup will be more realistic, matching the ordinary
conditions for a modern server.
Especially needed as we add bouncer code that will expect to have
RemoteRealm entries for realm_uuid values for which it receives
requests.
This reduces the query time by an order of magnitude, since it is able
to switch from a raw `stream_id` index to an index over all of
`realm_id, property, end_time`.