Commit Graph

664 Commits

Author SHA1 Message Date
Zixuan James Li 30495cec58 migration: Rename extra_data_json to extra_data in audit log models.
This migration applies under the assumption that extra_data_json has
been populated for all existing and coming audit log entries.

- This removes the manual conversions back and forth for extra_data
throughout the codebase including the orjson.loads(), orjson.dumps(),
and str() calls.

- The custom handler used for converting Decimal is removed since
DjangoJSONEncoder handles that for extra_data.

- We remove None-checks for extra_data because it is now no longer
nullable.

- Meanwhile, we want the bouncer to support processing RealmAuditLog entries for
remote servers before and after the JSONField migration on extra_data.

- Since now extra_data should always be a dict for the newer remote
server, which is now migrated, the test cases are updated to create
RealmAuditLog objects by passing a dict for extra_data before
sending over the analytics data. Note that while JSONField allows for
non-dict values, a proper remote server always passes a dict for
extra_data.

- We still test out the legacy extra_data format because not all
remote servers have migrated to use JSONField extra_data.
This verifies that support for extra_data being a string or None has not
been dropped.

Co-authored-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-08-16 17:18:14 -07:00
Zixuan James Li 011b4c1f7a populate_db: Populate linkifiers.
The curl examples of reordering linkifiers require there to be some
linkifiers in the database to be reordered. This adjusts some test cases
so they do not assume that there is no linkifier in the test db.
2023-08-14 15:21:48 -07:00
Anders Kaseorg 562a79ab76 ruff: Fix PERF401 Use a list comprehension to create a transformed list.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-08-07 17:23:55 -07:00
Anders Kaseorg 2ae285af7c ruff: Fix PLR1714 Consider merging multiple comparisons.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-23 15:21:33 -07:00
Sahil Batra 62f01edee3 populate_db: Pass realm as arg to select_related calls.
This commit updates the select_related calls in queries to
get UserProfile objects in populate_db code to pass "realm"
as argument to select_related call.

Also, note that "realm" is the only non-null foreign key field
in UserProfile object, so select_related() was only fetching
realm object previously as well. But we should still pass "realm"
as argument in select_related call so that we can make sure that
only required fields are selected in case we add more foreign
keys to UserProfile in future.
2023-07-20 10:44:39 -07:00
Anders Kaseorg 32a8151ce8 profile_request: Support only synchronous responses for now.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-19 16:14:59 -07:00
Anders Kaseorg 143baa4243 python: Convert translated positional {} fields to {named} fields.
Translators benefit from the extra information in the field names, and
need the reordering freedom that isn’t available with multiple
positional fields.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-18 15:19:07 -07:00
Steve Howell 67cdf1a7b4 emojis: Use get_emoji_data.
The previous function was poorly named, asked for a
Realm object when realm_id sufficed, and returned a
tuple of strings that had different semantics.

I also avoid calling it duplicate times in a couple
places, although it was probably rarely the case that
both invocations actually happened if upstream
validations were working.

Note that there is a TypedDict called EmojiInfo, so I
chose EmojiData here.  Perhaps a better name would be
TinyEmojiData or something.

I also simplify the reaction tests with a verify
helper.
2023-07-17 09:35:53 -07:00
Zixuan Li a0cf624eaa
migrations: Backfill extra_data_json for audit log entries.
This migration is reasonably complex because of various anomalies in existing
data.

Note that there are cases when extra_data does not contain data that is
proper json with possibly single quotes. Thus we need to use
"ast.literal_eval" to cover that.

There is also a special case for "event_type == USER_FULL_NAME_CHANGED",
where extra_data is a plain str. This event_type is only used for
RealmAuditLog, so the zilencer migration script does not need to handle
it.

The migration does not handle "event_type == REALM_DISCOUNT_CHANGED"
because ast.literal_eval only allow Python literals. We expect the admin
to populate the jsonified extra_data for extra_data_json manually
beforehand.

This chunks the backfilling migration to reduce potential block time.

The migration for zilencer is mostly similar to the one for zerver; except that
the backfill helper is added in a wrapper and unrelated events are
removed.

**Logging and error recovery**

We print out a warning when the extra_data_json field of an entry
would have been overwritten by a value inconsistent with what we derived
from extra_data. Usually this only happens when the extra_data was
corrupted before this migration. This prevents data loss by backing up
possibly corrupted data in extra_data_json with the keys
"inconsistent_old_extra_data" and "inconsistent_old_extra_data_json".
More roundtrips to the database are needed for inconsistent data, which are
expected to be infrequent.

This also outputs messages when there are audit log entries with decimals,
indicating that such entries are not backfilled. Do note that audit log
entries with decimals are not populated with "inconsistent_old_extra_data_*"
in the JSONField, because they are not overwritten.

For such audit log entries with "extra_data_json" marked as inconsistent,
we skip them in the migration.  Because when we have discovered anomalies in a
previous run, there is no need to overwrite them again nesting the extra keys
we added to it.

**Testing**

We create a migration test case utilizing the property of bulk_create
that it doesn't call our modified save method.

We extend ZulipTestCase to support verifying console output at the test
case level. The implementation is crude but the use case should be rare
enough that we don't need it to be too elaborate.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-07-15 09:43:23 -07:00
Anders Kaseorg 7e707270f0 models: Convert deprecated index_together option to indexes.
index_together is slated for removal in Django 5.1:
https://docs.djangoproject.com/en/4.2/internals/deprecation/#deprecation-removed-in-5-1

We set the optional index names to match the previously generated
index names to avoid adding new migrations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-07-12 07:12:43 -07:00
Anders Kaseorg c09e7d6407 codespell: Correct “requestor” to “requester”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-06-20 16:17:55 -07:00
Zixuan Li e39e04c3ce
migration: Add `extra_data_json` for audit log models.
Note that we use the DjangoJSONEncoder so that we have builtin support
for parsing Decimal and datetime.

During this intermediate state, the migration that creates
extra_data_json field has been run. We prepare for running the backfilling
migration that populates extra_data_json from extra_data.

This change implements double-write, which is important to keep the
state of extra data consistent. For most extra_data usage, this is
handled by the overriden `save` method on `AbstractRealmAuditLog`, where
we either generates extra_data_json using orjson.loads or
ast.literal_eval.

While backfilling ensures that old realm audit log entries have
extra_data_json populated, double-write ensures that any new entries
generated will also have extra_data_json set. So that we can then safely
rename extra_data_json to extra_data while ensuring the non-nullable
invariant.

For completeness, we additionally set RealmAuditLog.NEW_VALUE for
the USER_FULL_NAME_CHANGED event. This cannot be handled with the
overridden `save`.

This addresses: https://github.com/zulip/zulip/pull/23116#discussion_r1040277795

Note that extra_data_json at this point is not used yet. So the test
cases do not need to switch to testing extra_data_json. This is later
done after we rename extra_data_json to extra_data.

Double-write for the remote server audit logs is special, because we only
get the dumped bytes from an external source. Luckily, none of the
payload carries extra_data that is not generated using orjson.dumps for
audit logs of event types in SYNC_BILLING_EVENTS. This can be verified
by looking at:

`git grep -A 6 -E "event_type=.*(USER_CREATED|USER_ACTIVATED|USER_DEACTIVATED|USER_REACTIVATED|USER_ROLE_CHANGED|REALM_DEACTIVATED|REALM_REACTIVATED)"`

Therefore, we just need to populate extra_data_json doing an
orjson.loads call after a None-check.

Co-authored-by: Zixuan James Li <p359101898@gmail.com>
2023-06-07 12:14:43 -07:00
Zixuan James Li 28ec7baaef zilencer: Make analytics bouncer forward-compatible with JSONField.
This adds support to accepting extra_data being dict from remote
servers' RealmAuditLog entries. So that it is forward-compatible with
servers that have migrated to use JSONField for RealmAuditLog just in
case. This prepares us for migrating zilencer's audit log models to use
JSONField for extra_data.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-06-05 17:38:10 -07:00
Lauryn Menard 154af5bb6b scheduled-messages: Remove ID from create scheduled message.
Part of splitting creating and editing scheduled messages.
Should be merged with final commit in series. Breaks tests.

Removes `scheduled_message_id` parameter from the create scheduled
message path.
2023-05-26 18:05:55 -07:00
Lauryn Menard 7af5ceb1c5 scheduled-messages: Add direct scheduled message to populate_db.
Prep commit for splitting create/edit endpoint for scheduled
messages.

Because of `test-api` runs the tests in alphabetical order based on
the `operationId`, we need two scheduled messages in the test database.
The first for the curl example delete (delete-scheduled-message) and
the second for the curl example update (update-scheduled-message).
2023-05-26 18:05:55 -07:00
Alex Vandiver 9f231322c9 workers: Pass down if they are running multi-threaded.
This allows them to decide for themselves if they should enable
timeouts.
2023-05-16 14:05:01 -07:00
Anders Kaseorg d0481be3e5 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-05-10 19:44:47 -07:00
Lauryn Menard 15c6d67e9c populate-db: Add scheduled message to test database.
Prep commit for adding the scheduled-message endpoints to the API
documentation.

Adds a scheduled message for Iago in the test database so that it
can be deleted in the delete cURL example in the api-test suite.
2023-04-28 17:25:00 -07:00
Tim Abbott 027b67be80 presence: Rewrite the backend data model.
This implements the core of the rewrite described in:

For the backend data model for UserPresence to one that supports much
more efficient queries and is more correct around handling of multiple
clients.  The main loss of functionality is that we no longer track
which Client sent presence data (so we will no longer be able to say
using UserPresence "the user was last online on their desktop 15
minutes ago, but was online with their phone 3 minutes ago").  If we
consider that information important for the occasional investigation
query, we have can construct that answer data via UserActivity
already.  It's not worth making Presence much more expensive/complex
to support it.

For slim_presence clients, this sends the same data format we sent
before, albeit with less complexity involved in constructing it.  Note
that we at present will always send both last_active_time and
last_connected_time; we may revisit that in the future.

This commit doesn't include the finalizing migration, which drops the
UserPresenceOld table.
The way to deploy is to start the backfill migration with the server
down and then start the server *without* the user_presence queue worker,
to let the migration finish without having new data interfering with it.
Once the migration is done, the queue worker can be started, leading to
the presence data catching up to the current state as the queue worker
goes over the queued up events and updating the UserPresence table.

Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
2023-04-26 14:26:47 -07:00
Mateusz Mandera 2a45429a51 zilencer: Delete duplicate remote push registrations.
This fixes existing instances of the bug fixed in the previous commit.

Fixes #24969.
2023-04-13 15:17:20 -07:00
Mateusz Mandera ade2225f08 zilencer: Avoid creating duplicate remote push registrations.
Servers that had upgraded from a Zulip server version that did not yet
support the user_uuid field to one that did could end up with some
mobile devices having two push notifications registrations, one with a
user_id and the other with a user_uuid.

Fix this issue by sending both user_id and user_uuid, and clearing
2023-04-13 15:17:20 -07:00
Alex Vandiver d888bb3df2 error-bot: Remove ERROR_BOT support.
This isn't sufficiently useful to keep the added complexity.  Users
should use the email error reporting, or set up Sentry error
reporting.
2023-04-13 14:59:58 -07:00
Prakhar Pratyush e45623fccc python: Update tuple handling pattern; returned by a delete() query.
This commit updates the pattern for dealing with tuples
returned by the delete() query.

The '(num_deleted, ignored) = ModelName.objects.filter().delete()'
pattern is preferred due to better readability.

We avoid the pattern '(num_deleted, _)' because Django uses _
for translation, which may lead to future bugs.
2023-03-27 16:18:23 -07:00
Zixuan James Li cf9b95b95a user_groups: rename create_user_group to create_user_group_in_database.
To avoid people calling "create_user_group" instead of
"check_add_user_group", we rename it to make its purpose clearer.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-03-27 09:05:00 -07:00
Zixuan James Li 0f5d6432a4 user_groups: Move create_user_group to zerver.actions.user_groups.
Since this function creates a new user group into the database,
it is more appropriate to have it not as a generic "lib" function
but as an "action".

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2023-03-27 09:05:00 -07:00
Anders Kaseorg 3bfbfb014a zilencer: Switch a log message back from %r to %s.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-08 22:56:55 -08:00
Anders Kaseorg 2d9b2a2a05 models: Remove type prefixes from __str__ values.
The Django convention is for __repr__ to include the type and __str__
to omit it.  In fact its default __repr__ implementation for models
automatically adds a type prefix to __str__, which has resulted in the
type being duplicated:

    >>> UserProfile.objects.first()
    <UserProfile: <UserProfile: emailgateway@zulip.com <Realm: zulipinternal 1>>>

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-08 22:56:55 -08:00
Anders Kaseorg aa577a554b populate_db: Import timedelta from its canonical module.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-05 14:46:28 -08:00
Anders Kaseorg 0628c3cac8 migrations: Import BaseDatabaseSchemaEditor from its canonical module.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-05 14:46:28 -08:00
Anders Kaseorg 43b4f10578 run-dev: Drop .py from script name.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-03-03 18:02:37 -08:00
Anders Kaseorg ed069ebe0e docs: Remove spaces before commas.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-22 17:17:25 -08:00
Alex Vandiver 7feda75c5f populate_db: Temporarily remove post-delete signal on AlertWord.
The post-delete signal on AlertWord clears the realm cache; when it is
called repeatedly, this results in re-fetching the realm object O(n)
times, where n scales by number of users in the database.

Disconnect this cache-clearing signal before removing the AlertWord
entries, and reconnect it afterwards.  This is not thread-safe, but
this section is single-threaded.  It is also probably unnecessary to
re-connect the signal, as rest of `./manage.py populate_db` does not
delete AlertWord objects, but cleanliness dictates doing the
re-connection.

This drops the time to repeatedly run:
    python3 ./manage.py populate_db --num-messages=0 --extra-users=1000

...from 47 seconds to 36 seconds.
2023-02-17 13:59:11 -08:00
Sahil Batra 9d1dc20e6e settings: Remove realm-level email_address_visibility setting.
This was replaced by the new user-level version in recent commits.

Fixes #20035.
Fixes #18149.
2023-02-10 17:40:33 -08:00
Sahil Batra 0ed5f76063 settings: Add backend code for using user email_address_visibility setting.
This commits update the code to use user-level email_address_visibility
setting instead of realm-level to set or update the value of UserProfile.email
field and to send the emails to clients.

Major changes are -

- UserProfile.email field is set while creating the user according to
RealmUserDefault.email_address_visbility.

- UserProfile.email field is updated according to change in the setting.

- 'email_address_visibility' is added to person objects in user add event
and in avatar change event.

- client_gravatar can be different for different users when computing
avatar_url for messages and user objects since email available to clients
is dependent on user-level setting.

- For bots, email_address_visibility is set to EVERYONE while creating
them irrespective of realm-default value.

- Test changes are basically setting user-level setting instead of realm
setting and modifying the checks accordingly.
2023-02-10 17:35:49 -08:00
Anders Kaseorg 6992d3297a ruff: Fix PIE810 Call `startswith` once with a `tuple`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-08 16:40:35 -08:00
Anders Kaseorg da3cf5ea7a ruff: Fix RSE102 Unnecessary parentheses on raised exception.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-04 16:34:55 -08:00
Anders Kaseorg df001db1a9 black: Reformat with Black 23.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.

(This does not actually upgrade our Python environment to Black 23
yet.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-02-02 10:40:13 -08:00
Anders Kaseorg ba78bee8c4 ruff: Fix DTZ005 `datetime.datetime.now()` without `tz` argument.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-04 16:25:07 -08:00
Anders Kaseorg bd884c88ed Fix typos caught by typos.
https://github.com/crate-ci/typos

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-01-03 11:09:50 -08:00
Zixuan James Li b3aba796f1 user_groups: Track acting user for user group creation.
This is a prep-commit for populating RealmAuditLogs for changes made to
UserGroup.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-12-13 14:58:58 -08:00
Anders Kaseorg 73c4da7974 ruff: Fix N818 exception name should be named with an Error suffix.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-17 16:52:00 -08:00
Anders Kaseorg 69e94b5991 ruff: Fix C413 Unnecessary `list` call around `sorted()`.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-11-03 12:10:15 -07:00
Anders Kaseorg 1385a827c2 python: Clean up getattr, setattr, delattr calls with literal names.
These were useful as a transitional workaround to ignore type errors
that only show up with django-stubs, while avoiding errors about
unused type: ignore comments without django-stubs.  Now that the
django-stubs transition is complete, switch to type: ignore comments
so that mypy will tell us if they become unnecessary.  Many already
have.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-10-10 08:40:28 -07:00
Mateusz Mandera 00b3546c9f models: Add denormalized .realm column to Message.
This commit adds the OPTIONAL .realm attribute to Message
(and ArchivedMessage), with the server changes for making new Messages
have this set. Old Messages still have to be migrated to backfill this,
before it can be non-nullable.

Appropriate test changes to correctly set .realm for Messages the tests
manually create are included here as well.
2022-10-07 10:09:38 -07:00
Sahil Batra 2bf70fe4db custom_profile_field: Add "Pronouns" custom field type.
This commit adds "Pronouns" custom profile field type. We also
add "Pronouns" type field in the development environment
2022-10-06 17:56:26 -07:00
Zixuan James Li 4c3c976174 models: Implicitly type model fields with django-stubs.
Previously, we type the model fields with explicit type annotations
manually with the approximate types. This was because the lack of types
for Django.

django-stubs provides more specific types for all these fields that
incompatible with our previous approximate annotations. So now we can
remove the inline type annotations and rely on the types defined in the
stubs. This allows mypy to infer the types of the model fields for us.

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
2022-10-05 16:15:56 -07:00
Mateusz Mandera 065b59213b models: Rename get_huddle to get_or_create_huddle.
Small follow-up to d86e4ac34d.
get_ makes it sound like it doesn't have side-effects, when these are
actually much like the django ORM .get_or_create function.
2022-09-27 10:42:03 -07:00
Mateusz Mandera 0799ec1a43 populate_db: Limit user_profiles for private messages to zulip realm.
These are used for creating huddles and private messages (and some
UserPresence objects). It'd be really weird, and potentially create some
Messages that break our assumptions, for this to end up involving users
in multiple realms.
I believe currently this hasn't been happening, because when
this line runs, there are only users in "zulip" realm and system bots in
"zulipinternal" - but the query has been excluding bots already.

Still, this query should be explicit about grabbing users from a single
realm. This will also be helpful for the work adding the denormalized
Message.realm field - so that the realm of Message objects that get
manually created in generate_and_send_messages can be simply set to
"zulip" with confidence that it's correct.
2022-09-23 09:59:10 -07:00
Anders Kaseorg 8dd72497a7 zilencer: Mark remote_server_dispatch request argument positional-only.
In the presence of **kwargs, this is required by the Concatenate type
expected by default_never_cache_responses.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2022-09-19 13:52:35 -07:00
Tim Abbott 467d4dfb0f zilencer: Fix missing decorators on remote_server_dispatch.
In 5c49e4ba06, we neglected to include
the CSRF and caching decorators required for all API views in the new
remote_server_dispatch function.

I'm not sure why our automated tests didn't catch this, but this made
the remote server API endpoints nonfunctional in a production
environment.
2022-09-12 14:16:37 -07:00