Having a non-identity `cache_transformer` is no different from running
it on every row of the query_function. Simplify understanding of the
codepath used in caching by merging the pieces of code.
Rather than pass around a list of message objects in-memory, we
instead keep the same constructed QuerySet which includes the later
propagated messages (if any), and use that same query to pick out
affected Attachment objects, rather than limiting to the set of ids.
This is not necessarily a win -- the list of message-ids *may* be very
long, and thus the query may be more concise, easier to send to
PostgreSQL, and faster for PostgreSQL to parse. However, the list of
ids is almost certainly better-indexed.
After processing the move, the QuerySet must be re-defined as a search
of ids (and possibly a very long list of such), since there is no
other way which is guaranteed to correctly single out the moved
messages. At this point, it is mostly equivalent to the list of
Message objects, and certainly takes no less memory.
Rather than use `bulk_update()` to batch-move chunks of messages, use
a single SQL query to move the messages. This is much more efficient
for large topic moves. Since the `edit_history` field is not yet
JSON (see #26496) this requires that PostgreSQL cast the current data
into `jsonb`, append the new data (also cast to `jsonb`), and then
re-cast that as text.
For single-message moves, this _increases_ the SQL query count by one,
since we have to re-query for the updated data from the database after
the bulk update. However, this is overall still a performance
improvement, which improves to 2x or 3x for larger topic moves. Below
is a table of duration in seconds to run `do_update_message` to move a
topic to a new stream, based on messages in the topic, for before and
after this change:
| Topic size | Before | After |
| ---------- | -------- | ------- |
| 1 | 0.1036 | 0.0868 |
| 2 | 0.1108 | 0.0925 |
| 5 | 0.1139 | 0.0959 |
| 10 | 0.1218 | 0.0972 |
| 20 | 0.1310 | 0.1098 |
| 50 | 0.1759 | 0.1366 |
| 100 | 0.2307 | 0.1662 |
| 200 | 0.3880 | 0.2229 |
| 500 | 0.7676 | 0.4052 |
| 1000 | 1.3990 | 0.6848 |
| 2000 | 2.9706 | 1.3370 |
| 5000 | 7.5218 | 3.2882 |
| 10000 | 14.0272 | 5.4434 |
This applies access restrictions in SQL, so that individual messages
do not need to be walked one-by-one. It only functions for stream
messages.
Use of this method significantly speeds up checks if we moved "all
visible messages" in a topic, since we no longer need to walk every
remaining message in the old topic to determine that at least one was
visible to the user. Similarly, it significantly speeds up merging
into existing topics, since it no longer must walk every message in
the new topic to determine if the user could see at least one.
Finally, it unlocks the ability to bulk-update only messages the user
has access to, in a single query (see subsequent commit).
The problem was that earlier this was just an uncaught JsonableError,
leading to a full traceback getting spammed to the admins.
The prior commit introduced a clear .code for this error on the bouncer
side, meaning the self-hosted server can now detect that and handle it
nicely, by just logging.error about it and also take the opportunity to
adjust the realm.push_notifications_... flags.
The endpoint was lacking validation that the authentication_methods dict
submitted by the user made sense. So e.g. it allowed submitting a
nonsense key like NoSuchBackend or modifying the realm's configured
authentication methods for a backend that's not enabled on the server,
which should not be allowed.
Both were ultimately harmless, because:
1. Submitting NoSuchBackend would luckily just trigger a KeyError inside
the transaction.atomic() block in do_set_realm_authentication_methods
so it would actually roll back the database changes it was trying to
make. So this couldn't actually create some weird
RealmAuthenticationMethod entries.
2. Silently enabling or disabling e.g. GitHub for a realm when GitHub
isn't enabled on the server doesn't really change anything. And this
action is only available to the realm's admins to begin with, so
there's no attack vector here.
test_supported_backends_only_updated wasn't actually testing anything,
because the state it was asserting:
```
self.assertFalse(github_auth_enabled(realm))
self.assertTrue(dev_auth_enabled(realm))
self.assertFalse(password_auth_enabled(realm))
```
matched the desired state submitted to the API...
```
result = self.client_patch(
"/json/realm",
{
"authentication_methods": orjson.dumps(
{"Email": False, "Dev": True, "GitHub": False}
).decode()
},
)
```
so we just replace it with a new test that tests the param validation.
This commit updates the API to check the permission to subscribe other
users while creating multi-use invites. The API will raise error if
the user passes the "stream_ids" parameter (even when it contains only
default streams) and the calling user does not have permission to
subscribe others to streams.
We did not add this before as we only allowed admins to create
multiuse invites, but now we have added a setting which can be used
to allow users with other roles as well to create multiuse invites.
1e5c49ad82 added support for shared channels -- but some users may
only currently exist in DMs or MPIMs, and not in channel membership.
Walk the list of MPIM subscriptions and messages, as well as DM users,
and add any such users to the set of mirror dummy users.
This leads to significant speedups. In a test, with 100 random unique
event classes, the old code processed a batch of 100 rows (on average
66-ish unique in the batch) in 0.45 seconds. Doing this in a single
query processes the same batch in 0.0076 seconds.
Earlier, after a successful POST request on find accounts page
users were redirected to a URL with the emails (submitted via form)
as URL parameters. Those raw emails in the URL were used to
display on a template.
We no longer redirect to such a URL; instead, we directly render
a template with emails passed as a context variable.
Fixes part of #3128
As explained in the comment, this is to prevent bugs where some strange
combination of codepaths could end up calling do_login without basic
validation of e.g. the subdomain. The usefulness of this will be
extended with the upcoming commit to add the ability to configure custom
code to wrap authenticate() calls in. This will help ensure that some
codepaths don't slip by the mechanism, ending up logging in a user
without the chance for the custom wrapper to run its code.
This test is ancient and patches so much that it's almost unreadable,
while being redundant considering we have comprehensive tests via the
SocialAuthBase subclasses. The one missing case was the one with the
backend we disabled. We replace that with a proper
test_social_auth_backend_disabled test in SocialAuthBase.
This is preparatory work towards adding a Topic model.
We plan to use the local variable name as 'topic' for
the Topic model objects.
Currently, we use *topic as the local variable name for
topic names.
We rename local variables of the form *topic to *topic_name
so that we don't need to think about type collisions in
individual code paths where we might want to talk about both
Topic objects and strings for the topic name.
This is preparatory work towards adding a Topic model.
We plan to use the local variable name as 'topic' for
the Topic model objects.
Currently, we use *topic as the local variable name for
topic names.
We rename local variables of the form *topic to *topic_name
so that we don't need to think about type collisions in
individual code paths where we might want to talk about both
Topic objects and strings for the topic name.
We return expected_end_timestamp as "None" for the plans to be
downgraded if number of users is not more than MAX_USERS_WITHOUT_PLAN
since they will be downgraded to self-managed plan and would
have push notifications enabled.
Requests to these endpoint are about a specified user, and therefore
also have a notion of the RemoteRealm for these requests. Until now
these endpoints weren't getting the realm_uuid value, because it wasn't
used - but now it is needed for updating .last_request_datetime on the
RemoteRealm.
For the RemoteRealm case, we can only set this in endpoints where the
remote server sends us the realm_uuid. So we're missing that for the
endpoints:
- remotes/push/unregister and remotes/push/unregister/all
- remotes/push/test_notification
This should be added in a follow-up commit.
os.path.getmtime needs to be mock.patched or otherwise the success of
the test depends on the filesystem state and breaks if version.py hasn't
been modified in a while.
`<time:1234567890123>` causes a "signed integer is greater than
maximum" exception from dateutil.parser; datetime also cannot handle
it ("year 41091 is out of range") but that is a ValueError which is
already caught.
Catch the OverflowError thrown by dateutil.
This protects us from incorrectly handling situations where someone
tested and upgrade to 8.0 for a backup on a separate hostname, and
left the test system live while upgrading the main system, in a way
that results in duplicate RemoteRealm objects that are all marked as
locally deleted.
Further word is required to figure out how to avoid the original
duplication problem.
Earlier, 'topic' parameter length for
'/users/me/subscriptions/muted_topics' and '/user_topics' endpoints
were not validated before DB operations which resulted in exception:
'DataError: value too long for type character varying(60)'.
This commit adds validation for the topic name length to be
capped at 'max_topic_length' characters.
The doc is updated to suggest clients that the topic name should
have a maximum length of 'max_topic_length'.
Fixes#27796.
Old RemotePushDeviceTokens were created without this attribute. But when
processing a notification, if we have remote_realm, we can take the
opportunity to to set this for all the registrations for this user.
This moves the function which computes can_push and
expected_end_timestamp outside RemoteRealmBillingSession
because we might use this function for RemoteZulipServer
as well and also renames it.
This fixes the exception case on the initial
`/api/v1/remotes/server/analytics/status` case. Other exceptions from
`send_to_push_bouncer` are allowed to escape.
Co-authored-by: Alex Vandiver <alexmv@zulip.com>
Previously, passing a url longer than 200 characters for
jitsi_server_url caused a low-level failure at DB level. This
commit adds this restriction at API level.
Fixes part of #27355.
While the query parameter is properly excaped when inlined into the
template (and thus is not an XSS), it can still produce content which
misleads the user via carefully-crafted query parameter.
Validate that the parameter looks like an email address.
Thanks to jinjo2 for reporting this, via HackerOne.
Saying `**options: str` is a lie, since it contains bools. We pluck
out the two bools that we need properly typed because we will be
pushing them into function calls, and type them explicitly as bools.
This ensures determinism in these tests doing mock_send.assert_called
with - avoids producing test flakes due to a different order of
retrieval of these objects from the database.
- The server sends the list of registrations it believes to have with
the bouncer.
- The bouncer includes in the response the registrations that it doesn't
actually have and therefore the server should delete.
This commit creates a RealmAuditlog entry with a new event_type
'RealmAuditLog.REALM_IMPORTED' after the realm is reactivated.
It contains user count data (using realm_user_count_by_role)
stored in extra_data.
This helps to have an accurate user count data for the billing
system if someone tries to signup just after doing an import.
Given that most of the use cases for realms-only code path would
really like to upload audit logs too, and the others would likely
produce a better user experience if they upoaded audit logs, we
should just have a single main code path here i.e.
'send_analytics_to_push_bouncer'.
We still only upload usage statistics according to documented
option, and only from the analytics cron job.
The error handling takes place in 'send_analytics_to_push_bouncer'
itself.
Earlier, it was passing tests because the deffered_work queue
that calls send_realms_only_to_push_bouncer didn't update the
realms propery based on response received from bouncer.
This prep commit removes the invalid "dummy-uuid" used, as any
call to send_realms_only_to_push_bouncer will update realms
properties too.
We return an empty realms array as the realm is created midway in
do_create_realm, so the uuid is not already available. Also, our
intent here is not to verify the behaviour of the
send_realms_only_to_push_bouncer function because we'll have
separate tests for that. Here, we verify that deffered_work event
was sent and eventually it made call to send_to_push_bouncer
with appropriate data.
When a self-hosted Zulip server does a data export and then import
process into a different hosting environment (i.e. not sharing the
RemoteZulipServer with the original, we'll have various things that
fail where we look up the RemoteRealm by UUID and find it but the
RemoteZulipServer it is associated with is the wrong one.
Right now, we ask user to contact support via an error page but
might develop UI to help user do the migration directly.
This commit adds code to not include original details of senders like
name, email and avatar url in the message objects sent through events
and in the response of endpoint used to fetch messages.
This is the last major commit for the project to add support for
limiting guest access to an entire organization.
Fixes#10970.
Adds `user.realm.string_id` as the realm name to the base payload
for notifications. Uses this realm name in the body of the alert
in the `apns_data`.
Changes the event string from "test-by-device-token" to "test".
Fixes#28075.
Earlier, the event sent when an onboarding step (hotspot till now)
is marked as read generated an event with type='hotspots' and
'hotspots' named array in it.
This commit renames the type to 'onboarding_steps' and the array
to 'onboarding_steps' to reflect the fact that it'll also contain
data for elements other than hotspots.
This commit adds a new endpoint 'users/me/onboarding_steps'
deprecating the older 'users/me/hotspots' to mark hotspot as read.
We also renamed the view `mark_hotspot_as_read` to
`mark_onboarding_step_as_read`.
Reason: Our plan is to make this endpoint flexible to support
other types of UI elements not just restricted to hotspots.
This commit adds code to include original name, email and avatar
for inaccessible users which can happen when a user sends message
to an unsubscribed stream.
This commit adds code to not allow Zulip Cloud organizations that are not
on the Plus plan to change the "can_access_all_users_group" setting.
Fixes#27877.
1. When we get data and it includes realm info, we should automatically
link the new records with the appropriate RemoteRealm.
2. For old records, when we receive realm data, we have an opportunity
to update those old record to link them to the right RemoteRealm.
This logic doesn't need to always run, just after a remote server
upgrade, since that's when this shift in remote server behavior will
occur.
This is a prep commit to return, for each remote realm, the 'uuid',
'can_push', and 'expected_end_timestamp'.
This data will be used in 'initialize_push_notifications'.
This consists of the following pieces:
1. Makes servers using the bouncer send realm_uuid in requests for token
registration. (Sidenote: realm_uuid is already sent in the "send
notification" codepath as of
48db4bf854)
2. This allows the bouncer to tie RemotePushDeviceToken to the
RemoteRealm with matching realm_uuid at registration time.
3. Introduce handling of some potential weird edge cases around the
realm_uuid and RemoteRealm objects in get_remote_realm_helper.
This default setup will be more realistic, matching the ordinary
conditions for a modern server.
Especially needed as we add bouncer code that will expect to have
RemoteRealm entries for realm_uuid values for which it receives
requests.
[squash]: Update sponsorsip and question boxes for Cloud.
[squash]: Update tabs subtitles.
[squash]: Content for info boxes for self-hosted plans.
[squash]: Adjust content to fit design.
portico: Tweak /plans text.
This reduces the query time by an order of magnitude, since it is able
to switch from a raw `stream_id` index to an index over all of
`realm_id, property, end_time`.
These metadata are essentially all publicily available anyway, and
making uploading them unconditional will simplify some things.
The documentation is not quite accurate in that it claims the server
will upload some metadata that is not actually uploaded yet (but will
by soon). This seems harmless.
Currently, the sender names for outgoing emails sent by Zulip
are hardcoded. It should be configurable for self-hosted systems.
This commit makes the 'Zulip' part a variable in the following
email sender names: 'Zulip Account Security', 'Zulip Digest',
and 'Zulip Notifications' by introducing a settings variable
'SERVICE_NAME' with the default value as f"{EXTERNAL_HOST} Zulip".
Fixes: #23857
This commit updates get_fake_email_domain to accept realm.host as
argument instead of the Realm object since we only use realm.host
to get the fake email domain.
This is a preparatory commit for the limited guest feature as we
would be sending the fake email of the message sender in message
event object to a guest user who cannot access the sender and
there we would need to compute the fake email.
549dd8a4c4 changed the regex that we build to contain whitespace for
readability, and strip that back out before returning it.
Unfortunately, this also serves to strip out whitespace in the source
linkifier, causing it to not match expected strings.
Revert 549dd8a4c4.
Fixes: #27854.
Adds details about the requested organization URL and type to the
registration confirmation email that's sent when creating a new
Zulip organization.
Fixes#25899.
If the request's `Accept:` header signals a preference for serving
images over text, return an image representing the 404/403 instead of
serving a `text/html` response.
Fixes: #23739.
Previously, we weren't able to mute the cross realm bots. This was
because, for muting the users, we access only those profiles which are
in realm, excluding the cross realm system bots.
This is fixed by replacing the access_user_by_id method with a new
method access_user_by_id_including_cross_realm for this specific test.
Fixes#27823
Earlier, for the push notifications having latex math
like "$$1 \oplus 0 = 1$$, the notification had the math
included multiple times.
This commit fixes the incorrect behavior by replacing
the KaTeX with the raw LaTeX source.
Fixes part of #25289.
This commit refactors the current hotspot subsytem to use a more
robust dataclass `Hotspot` defined in `lib/hotspots.py`. This fixes
mypy errors as well as make code more readable.
This commit introduces non-intro hotspots.
They are a bit different than intro hotspots in the
following ways:
* All the non-intro hotspots are sent at once instead of
sending them one by one like intro hotspots.
* They only activate when a specific event occurs,
unlike intro hotspot where they activate after the
previous hotspot is read.
Now, the topic wildcard mention follows the following
rules:
* If the topic has less than 15 participants , anyone
can use @ topic mentions.
* For more than 15, the org setting 'wildcard_mention_policy'
determines who can use @ topic mentions.
Earlier, topic wildcard mentions followed the same restriction
as stream wildcard mentions, which was incorrect.
Fixes part of #27700.
This commit updates the backend code to allow changing
can_access_all_users_group setting in development environment
and also adds a dropdown in webapp UI which is only shown in
development environment.
This makes it possible for a self-hosted realm administrator to
directly access a logged-page on the push notifications bouncer
service, enabling billing, support contacts, and other administrator
for enterprise customers to be managed without manual setup.
We previously did not allow setting signup_notifications_stream and
notifications_stream settings to private streams that admin is not
subscribed to, even when admins have access to metadata of all the
streams in the realm and can see them in the dropdown options as well.
This commit fixes it to allow admins to set these settings to private
streams that the admin is not subscribed to.
Guests might lose access to deactivated users if the user
is not involved in any DM with guest. This commit adds
code to send "realm_user/remove" events for such cases.
We now send user creation events to recipient users
when sending DMs if recipients gain access to either
sender or other pariticpating users in the DM.
This commit adds code to send "realm_user/remove" event
when a guest user loses access to a user due to the user
being unsubscribed from one or more streams.
This commit adds code to send user creation events to
guests who gain access to new subscribers and to the
new guest subscribers who gain access to existing
stream subscribers.
The presence and user status update events are only sent to accessible
users, i.e. guests do not receive presence and user status updates for
users they cannot access.
This commit adds code to make sure that update events for changing
a user's role, email, etc. are not sent to guests who cannot access
the modified user.
We do not send the original user data in user creation events
to guests if user access is restricted in realm, as they would
receive the information about user if user is subscribed to some
common streams after account creation.
This commit adds code to update access_user_by_id to raise
error if guest tries to access an inaccessible user.
One notable behavioral change due to this is that we do
not allow guest to mute or unmute a deactivated user if
that user was not involved in DMs.
This may happen if there are multiple servers with the same UUID
submitting data (e.g. if they were cloned after initial creation), or
if there is one server, but `./manage.py clear_analytics_tables` was
used to truncate the analytics tables.
In the case of `clear_analytics_tables`, the data submitted likely has
identical historical values with new remote `id` values; preserving
the originally-submitted contemporaneous data is the best option. For
the case of submissions from multiple servers, there is no completely
sensible outcome, so the best we can do is detect the case and move
on.
Since we have a lock on the RemoteZulipServer, we know that no other
inserts are happening, so counting before and after will return the
true number of rows inserted (which `bulk_create` cannot do in the
face of `ignore_conflicts`[^1]). We compare this to the expected
number of new inserted rows to detect dropped duplicates.
[^1]: See https://code.djangoproject.com/ticket/30138.
Earlier, for the emails having latex math like
"$$d^* = +\infty$$", the bad rendering led to the math
being included multiple times in the email body.
This was due to displaying KaTeX HTML without the CSS.
This commit fixes the incorrect behavior by replacing
the KaTeX with the raw LaTex source.
Fixes part of #25289.
This is a useful helper using the same API as
send_analytics_to_push_bouncer(), but uploading only realms info. This
is useful to upload realms info without the risk of taking a long time
to process the request due to too much of the *Count analytics data.
The original behavior of this setting was to disable LDAP
authentication for any realms not configured to use it. This was an
arbitrary choice, and its only value was to potentially help catch
typos for users who are lazy about testing their configuration.
Since it makes it a very inconvenient to potentially host multiple
organizations with different LDAP configurations, remove that
behavior.
This makes it possible to send notifications to more than one app ID
from the same server: for example, the main Zulip mobile app and the
new Flutter-based app, which has a separate app ID for use through its
beta period so that it can be installed alongside the existing app.
This commit adds code to send stream deletion events when
unsubscribing non-admin users from private streams and
when unsubscribing guests from public streams since
non-admins cannot access unsubscribed private streams
and guests cannot access unsubscribed public streams.
It was discovered by the Zulip development team that active users who
had previously been subscribed to a stream incorrectly continued being
able to use the Zulip API to access metadata for that stream. As a
result, users who had been removed from a stream, but still had an
account in the organization, could still view metadata for that
stream (including the stream name, description, settings, and an email
address used to send emails into the stream via the incoming email
integration). This potentially allowed users to see changes to a
stream’s metadata after they had lost access to the stream.
This bug was present in all Zulip releases prior to today's Zulip
Server 7.5.
This commit adds new API endpoint to get stream email which is
used by the web-app as well to get the email when a user tries
to open the stream email modal.
The stream email is returned only to the users who have access
to it. Specifically for private streams only subscribed users
have access to its email. And for public streams, all non-guest
users and only subscribed guests have access to its email.
All users can access email of web-public streams.
This commit removes "email_address" field from Subscription objects
and we would instead a new endpoint in next commit to get email
address for stream with proper access check.
This change also fixes the bug where we would include email address
for the unsubscribed private stream as well when user did not have
permission to send message to the stream, and having email allowed
the unsubscribed user to send message to the stream.
Note that the unsubscribed user can still send message to the stream
if the user had noted down the email before being unsubscribed
and the stream token is not changed after unsubscribing the user.
We now pass bogus data for inaccessible users when sending
the users data in "realm_users" field of "register" response
or when using endpoints like "GET /users" to get data of
all the users in realm.
We would add a client capability field in future commits
such that new clients would receive data only for accessible
users and they can form the bogus data by themselves.
This commit adds new setting for controlling who can access
all users in the realm which would have "Everyone" and
"Members only" option.
Fixes part of #10970.
This is a CountStat for tracking how many mobile notifications the
server requested.
1. On a self-hosted server, that means requesting from the push bouncer.
2. On a server that's its own push bouncer, that's just the number
directly sent.
This number has room for inaccuracy due to incrementing by the number of
user devices on a self-hosted server, as it doesn't account for errors
that may occur in the GCM/APNs low-level sending codepaths on the bouncer.
Also tests that a server that's its own push bouncer correctly
increments its mobile_pushes_sent::day CountStat, by basing it on the
values returned from the send_apple/android_push_notification functions
which tell us the actual number of successfully sent notifications.
Since the return values of send_..._push_notification are now
used in those codepaths, we need to tweak our mocks in some unrelated
tests to set up some return value to avoid errors.
Rename the existing 'wildcard_mentioned' flag to
'stream_wildcard_mentioned'.
The 'wildcard_mentioned' flag is deprecated and exists for
backwards compatibility.
We have two separate flags for stream and topic wildcard mentions,
i.e., 'stream_wildcard_mentioned' and 'topic_wildcard_mentioned',
respectively.
* stream wildcard mentions: `@all`, `@everyone`, and `@stream`
* topic wildcard mentions: `@topic`
The `wildcard_mentioned` flag is included in the events and
API response if either `stream_wildcard_mentioned` or
`topic_wildcard_mentioned` is set.