This commit removes "@" from name of role-based system groups
since we have added a restricion on having user group names
starting with "@" in the previous commit as they look odd in
mention syntax.
We also add a migration in this commit to update the name of
role-based system groups in existing realms to remove "@"
from the name. This migration also updates the names of
non-system user groups by removing the invalid prefixes
from their names and if there is a group already with that
name, we insted name the group as "group:{group_id}".
Fixes#26148.
We do not allow user group names to start with "@", "role:",
"user:", "stream:" and "channel:".
Group names starting with "@" look odd in mentions and
"role:", "user:" and "stream:" prefixes are reserved for
system groups which will be used in the new groups-based
permission model. We do not allow "channel:" prefix for
now just to be safe in a case where we use it instead of
"stream:" prefix for stream based groups in future.
Fixes part of #26148.
Previously we had database level restriction on length of
user group names. Now we add the same restriction to API
level as well, so we can return a better error response.
We remove the cache functionality for the
get_realm_stream function, and we also change it to
return a thin Stream object (instead of calling
select_related with no arguments).
The main goal here is to remove code complexity, as we
have been prone to at least one caching validation bug
related to how Realm and UserGroup interact. That
particular bug was more theoretical than practical in
terms of its impact, to be clear.
Even if we were to be perfectly disciplined about only
caching thin stream objects and always making sure to
delete cache entries when stream data changed, we would
still be prone to ugly situations like having
transactions get rolled back before we delete the cache
entry. The do_deactivate_stream is a perfect example of
where we have to consider the best time to unset the
cache. If you unset it too early, then you are prone to
races where somebody else churns the cache right before
you update the database. If you set it too late, then
you can have an invalid entry after a rollback or
deadlock situation. If you just eliminate the cache as
a moving part, that whole debate is moot.
As the lack of test changes here indicates, we rarely
fetch streams by name any more in critical sections of
our code.
The one place where we fetch by name is in loading the
home page, but that is **only** when you specify a
stream name. And, of course, that only causes about an
extra millisecond of time.
This changes bulk_get_streams so that it just uses the
database all the time. Also, we avoid calling
select_related(), so that we just get back thin and
tidy Stream objects with simple queries.
About not caching any more:
It's actually pretty rare that we fetch streams by name
in the main application. It's usually API requests that
send in stream names to find more info about streams.
It also turns out that for large queries (>= ~30 rows
for my measurements) it's more efficent to hit the
database than memcached. The database is super fast at
scale; it's just the startup cost of having Django
construct the query, and then having the database do
query planning or whatever, that slows us down. I don't
know the exact bottleneck, but you can clearly measure
that one-row queries are slow (on the order of a full
millisecond or so) but the marginal cost of additional
rows is minimal assuming you have a decent index (20
microseconds per row on my droplet).
All the query-count changes in the tests revolve around
unsubscribing somebody from a stream, and that's a
particularly odd use case for bulk_get_streams, since
you generally unsubscribe from a single stream at a
time. If there are some use cases where you do want to
unsubscribe from multiple streams, we should move
toward passing in stream ids, at least from the
application. And even if we don't do that, our cost for
most queries is a couple milliseconds.
This add audit log entries when any group based setting of a user group
is updated. We store both the old and new values in extra_data, along
with the name of that setting. Entries populated during user group creation
are hardcoded to track "can_mention_group".
Potentially we can adjust "set_defaults_for_group_settings" so that it
populates realm audit logs with it, but that is out of scope for this change.
We use an atomic transaction so that the audit logs are committed
together with the updates.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This prep commit replaces the 'wildcard' keyword in the codebase
with 'stream_wildcard' at some places for better readability, as
we plan to introduce 'topic_wildcards' as a part of the
'@topic mention' project.
Currently, 'wildcards = ["all", "everyone", "stream"]' which is an
alias to mention everyone in the stream, hence better renamed as
'stream_wildcards'.
Eventually, we will have:
'stream_wildcard' as an alias to mention everyone in the stream.
'topic_wildcard' as an alias to mention everyone in the topic.
'wildcard' refers to 'stream_wildcard' and 'topic_wildcard' as a whole.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the AlertWord
objects being deleted after Realm has been deleted from
the database.
In order to handle this, we update realm_alert_words_cache_key,
realm_alert_words_automaton_cache_key, and flush_realm_alert_words
functions to accept realm_id as parameter instead of realm
object, so that the code for flushing the cache works even
after the realm is deleted. This change is fine because
eventually only realm_id is used by these functions and there
is no need of the complete realm object.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the Attachment
objects being deleted after Realm has been deleted from
the database.
In order to handle this, we update
get_realm_used_upload_space_cache_key function to accept
realm_id as parameter instead of realm object, so that
the code for flushing the cache works even after the
realm is deleted. This change is fine because eventually
only realm_id is used by this function and there is no
need of the complete realm object.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the UserProfile
objects being deleted after Realm has been deleted from
the database.
In order to handle this, we update bot_dicts_in_realm_cache_key
function to accept realm_id as parameter instead of realm
object, so that the code for flushing the cache works even
after the realm is deleted. This change is fine because
eventually only realm_id is used by this function and there is
no need of the complete realm object.
Subsequent commits will add "on_delete=models.RESTRICT"
relationships, which will result in the RealmEmoji
objects being deleted after Realm has been deleted from
the database.
In order to handle this, we update get_realm_emoji_dicts,
get_realm_emoji_cache_key, get_active_realm_emoji_cache_key,
get_realm_emoji_uncached and get_active_realm_emoji_uncached
functions to accept realm_id as parameter instead of realm
object, so that the code for flushing the cache works even
after the realm is deleted. This change is fine because
eventually only realm_id is used by these functions and
there is no need of the complete realm object.
This commit adds default_group_name field to GroupPermissionSetting
type which will be used to store the name of the default group for
that setting which would in most cases be one of the role-based
system groups. This will be helpful when we would have multiple
settings and we would need to set the defaults while creating
realm and streams.
This commit makes it possible for users to control the
audible desktop notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting through the UI yet.
This commit makes it possible for users to control the
visual desktop notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting through the UI yet.
This commit makes it possible for users to control the wildcard
mention notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting
through the UI yet.
This commit makes it possible for users to control
the push notifications for messages sent to followed topics
via a global notification setting.
There is no support for configuring this setting
through the UI yet.
This commit makes it possible for users to control
the email notifications for messages sent to followed topics
via a global notification setting.
Although there is no support for configuring this setting
through the UI yet.
Add five new fields to the UserBaseSettings class for
the "followed topic notifications" feature, similar to
stream notifications. But this commit consists only of
the implementation of email notifications.
Note that we use the DjangoJSONEncoder so that we have builtin support
for parsing Decimal and datetime.
During this intermediate state, the migration that creates
extra_data_json field has been run. We prepare for running the backfilling
migration that populates extra_data_json from extra_data.
This change implements double-write, which is important to keep the
state of extra data consistent. For most extra_data usage, this is
handled by the overriden `save` method on `AbstractRealmAuditLog`, where
we either generates extra_data_json using orjson.loads or
ast.literal_eval.
While backfilling ensures that old realm audit log entries have
extra_data_json populated, double-write ensures that any new entries
generated will also have extra_data_json set. So that we can then safely
rename extra_data_json to extra_data while ensuring the non-nullable
invariant.
For completeness, we additionally set RealmAuditLog.NEW_VALUE for
the USER_FULL_NAME_CHANGED event. This cannot be handled with the
overridden `save`.
This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795
Note that extra_data_json at this point is not used yet. So the test
cases do not need to switch to testing extra_data_json. This is later
done after we rename extra_data_json to extra_data.
Double-write for the remote server audit logs is special, because we only
get the dumped bytes from an external source. Luckily, none of the
payload carries extra_data that is not generated using orjson.dumps for
audit logs of event types in SYNC_BILLING_EVENTS. This can be verified
by looking at:
`git grep -A 6 -E "event_type=.*(USER_CREATED|USER_ACTIVATED|USER_DEACTIVATED|USER_REACTIVATED|USER_ROLE_CHANGED|REALM_DEACTIVATED|REALM_REACTIVATED)"`
Therefore, we just need to populate extra_data_json doing an
orjson.loads call after a None-check.
Co-authored-by: Zixuan James Li <p359101898@gmail.com>
This commit removes realm_community_topic_editing_limit_seconds
field from register response since topic edit limit is now
controlled by move_messages_within_streams_limit_seconds
setting.
We also remove DEFAULT_COMMUNITY_TOPIC_EDITING_LIMIT_SECONDS
constant since it is no longer used.
This prevents `get_user_profile_by_api_key` from doing a sequential
scan.
Doing this requires moving the generation of initial api_key values
into the column definition, so that even bare calls to
`UserProfile.objects.create` (e.g. from tests) call appropriately
generate a random initial value.
We now set tos_version to "-1" for imported users and the ones
created using API or using other methods like LDAP, SCIM and
management commands. This value will help us to allow users to
change email address visibility setting during first login.
Adds the `failed` boolean from the ScheduledMessage to the API dict
returned by scheduled message events and register response, and by
fetching the user's scheduled messages.
`failed` will only be true when the server has tried to send the
scheduled message and failed due to an error.
Previously, it seemed possible for the scheduled messages API to try
to send infinite copies of a message if we had the very poor luck of a
persistent failure happening after a message was sent.
The failure_message field supports being able to display what happened
in the scheduled messages modal, though that's not exposed to the API
yet.
Fixes#25414.
We add Attachment.scheduled_messages relation to track ScheduledMessages
which reference the attachment.
The import bits can be done after merging this, by updating #25345.
This will help us remove scheduled message and reminder logic
from `/messages` code path.
Removes `deliver_at`/`defer_until` and `tz_guess` parameters. And
adds the `scheduled_delivery_timestamp` instead. Also updates the
scheduled message dicts to return `scheduled_delivery_timestamp`.
Also, revises some text in `/delete-scheduled-message` endpoint
and in the `ScheduledMessage` schema in the API documentation.
Updates the objects in the API for scheduled messages so that those
for stream messages return the `to` property as an integer since it
is always the unique stream ID and so that those for direct messages
do not have a `topic` property since direct messages never have a
topic.
Also makes small update so that web app scheduled messages overlay
has the correct stream ID.
This commit adds ORG_TYPE_IDS constant field to Realm class
such that it can be used when we want to validate the org_type
passed in request. This was previously defined in realm.py, but
we move it inside Realm class such that we can use it at other
places as well.
This implements the core of the rewrite described in:
For the backend data model for UserPresence to one that supports much
more efficient queries and is more correct around handling of multiple
clients. The main loss of functionality is that we no longer track
which Client sent presence data (so we will no longer be able to say
using UserPresence "the user was last online on their desktop 15
minutes ago, but was online with their phone 3 minutes ago"). If we
consider that information important for the occasional investigation
query, we have can construct that answer data via UserActivity
already. It's not worth making Presence much more expensive/complex
to support it.
For slim_presence clients, this sends the same data format we sent
before, albeit with less complexity involved in constructing it. Note
that we at present will always send both last_active_time and
last_connected_time; we may revisit that in the future.
This commit doesn't include the finalizing migration, which drops the
UserPresenceOld table.
The way to deploy is to start the backfill migration with the server
down and then start the server *without* the user_presence queue worker,
to let the migration finish without having new data interfering with it.
Once the migration is done, the queue worker can be started, leading to
the presence data catching up to the current state as the queue worker
goes over the queued up events and updating the UserPresence table.
Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
This removes the validator argument for 0423_realmfilter_url_template,
which do not really alter the database schema. It otherwise fails
the migration because the filter_format_validator function is removed.
Migration 0094_realm_filter_url_validator is modified because we can no
longer refer to filter_format_validator.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This swaps out url_format_string from all of our APIs and replaces it
with url_template. Note that the documentation changes in the following
commits will be squashed with this commit.
We change the "url_format" key to "url_template" for the
realm_linkifiers events in event_schema, along with updating
LinkifierDict. "url_template" is the name chosen to normalize
mixed usages of "url_format_string" and "url_format" throughout
the backend.
The markdown processor is updated to stop handling the format string
interpolation and delegate the task template expansion to the uri_template
library instead.
This change affects many test cases. We mostly just replace "%(name)s"
with "{name}", "url_format_string" with "url_template" to make sure that
they still pass. There are some test cases dedicated for testing "%"
escaping, which aren't relevant anymore and are subject to removal.
But for now we keep most of them as-is, and make sure that "%" is always
escaped since we do not use it for variable substitution any more.
Since url_format_string is not populated anymore, a migration is created
to remove this field entirely, and make url_template non-nullable since
we will always populate it. Note that it is possible to have
url_template being null after migration 0422 and before 0424, but
in practice, url_template will not be None after backfilling and the
backend now is always setting url_template.
With the removal of url_format_string, RealmFilter model will now be cleaned
with URL template checks, and the old checks for escapes are removed.
We also modified RealmFilter.clean to skip the validation when the
url_template is invalid. This avoids raising mulitple ValidationError's
when calling full_clean on a linkifier. But we might eventually want to
have a more centric approach to data validation instead of having
the same validation in both the clean method and the validator.
Fixes#23124.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This will later be used to expand matching linkifier patterns.
Making it nullable for now, but we will make it required in
the APIs.
As a part of this transition, we temporarily make url_format_string
nullable as well, which will be later removed. This allows us to
switch to populating url_template without caring about passing
url_format_string.
Note that the validators are imported in the migration because Django
otherwise diffs it and considers the schema to be different, generating
a migration, failing the "tools/test-migrations" test.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
For endpoints with a `type` parameter to indicate whether the message
is a stream or direct message, `POST /typing` and `POST /messages`,
adds support for passing "direct" as the preferred value for direct
messages, group and 1-on-1.
Maintains support for "private" as a deprecated value to indicate
direct messages.
Fixes#24960.
So far, we've used the BitField .authentication_methods on Realm
for tracking which backends are enabled for an organization. This
however made it a pain to add new backends (requiring altering the
column and a migration - particularly troublesome if someone wanted to
create their own custom auth backend for their server).
Instead this will be tracked through the existence of the appropriate
rows in the RealmAuthenticationMethods table.