Commit Graph

14769 Commits

Author SHA1 Message Date
Lauryn Menard dcfe9d0dd8 api_documentation: Clarify `update_message` event description.
Moves `flags` field to top part of object description because
it is always included in the event.

If a field is present only for certain types of message updates,
the description begins by stating when the field is present:
"Only present if ...".

These fields are organized by the type of message update:
stream, stream and/or topic, topic, content.

If a field is not present due to a special event, the description
ends by stating when the field is not present:
"Not present if ...".

Adds documentation for fields currently required to be returned
with any `update_message` event.
2022-01-05 14:45:19 -08:00
Mateusz Mandera 30ccb76e19 do_delete_user: Preserve date_joined value of the user. 2022-01-04 15:42:03 -08:00
Mateusz Mandera 444bb6d0e9 do_delete_user: Create RealmAuditLog entries. 2022-01-04 15:42:03 -08:00
Mateusz Mandera 5939329485 do_delete_user: Add migration to fix bugged UserProfiles.
do_delete_users had two bugs:
1. Creating the replacement dummy users
with active=True
2. Creating the replacement dummy users with email domain set to
realm.uri, which may not be a valid email domain.
Prior commits fixed the bugs, and this migration fixes the pre-existing
objects.
2022-01-04 15:42:03 -08:00
Mateusz Mandera 208c0c3034 do_delete_user: Use get_fake_email_domain for dummy user email domain.
Otherwise the dummy user can be created with an invalid email domain -
e.g. in development environment with the domain
"@http://localhost:9991". get_fake_email_domain exists exactly for
handling these kinds of scenarios.
2022-01-04 15:42:03 -08:00
Mateusz Mandera dffdeb48e7 do_delete_user: Make the replacement dummy user inactive.
Otherwise, the dummy user will show up in the user list in the right
sidebar.
2022-01-04 15:42:03 -08:00
Alex Vandiver fc13dd6f3d user_groups: Don't use access_user_group_by_id for notifications.
Stop using `access_user_group_by_id` in notifications codepaths, as it
is meant to be used to check for _write_ access, not read
access (which is not limited).  In the notification codepaths, there
are no ACLs to apply, and the ID is known-good; just load it
directly. The `for_mention` flag is removed, as it was not used in the
mention codepaths at all, only the notification ones.
2022-01-04 14:45:04 -08:00
Mateusz Mandera 868ed17661 remote_server: Handle invalid server uuid being given authing to API.
get_remote_server_by_uuid (called in validate_api_key) raises
ValidationError when given an invalid UUID due to how Django handles
UUIDField. We don't want that exception and prefer the ordinary
DoesNotExist exception to be raised.
2022-01-04 14:40:49 -08:00
Alex Vandiver 1b395b6403 zilencer: Truncate APNS notifications correctly.
APNs payloads nest the zulip-custom data further than the top level,
as Android notifications do.  This led to APNs data silently never
being truncated; this case was not caught in tests because the mocks
provided the wrong data for the APNs structure.

Adjust to look in the appropriate place within the APNs data, and
truncate that.
2022-01-03 15:24:16 -08:00
Eeshan Garg 0b5324f345 corporate: Add helper for deactivating remote server registrations. 2022-01-03 14:02:48 -08:00
Abhijeet Prasad Bodas 15e8717847 notifications: Don't enqueue notifications for bots.
This replaces the temporary (and testless) fix in
24b1439e93 with a more permanent
fix.

Instead of checking if the user is a bot just before
sending the notifications, we now just don't enqueue
notifications for bots. This is done by sending a list
of bot IDs to the event_queue code, just like other
lists which are used for creating NotificationData objects.

Credit @andersk for the test code in `test_notification_data.py`.
2022-01-03 09:55:06 -08:00
Mateusz Mandera 4153b5c517 remote_server: Improve uuid validation at the server/register endpoint.
As explained in the comments in the code, just doing UUID(string) and
catching ValueError is not enough, because the uuid library sometimes
tries to modify the string to convert it into a valid UUID:

>>> a = '18cedb98-5222-5f34-50a9-fc418e1ba972'
>>> uuid.UUID(a, version=4)
UUID('18cedb98-5222-4f34-90a9-fc418e1ba972')
2021-12-31 11:18:01 -08:00
Steve Howell a9271e7a99 performance: Cache stream lookups in MentionBackend.
This is useful when you subscribe a bunch of folks
to a stream and need to send them all PMs telling
them about the new subscription.
2021-12-30 11:28:15 -08:00
Steve Howell 4adcaf92f7 refactor: Attach get_stream_name_map to MentionData.
This diff looks slightly noisy, but the main chunk of
code that we moved here has the same logic as before,
and it just gets realm_id from MentionBackend now, instead
of having our markdown processor have to supply it.

We basically want MentionData to be the gatekeeper of
mention data, and then we delegate backend tasks to
MentionBackend.

Soon we will add a cache to MentionBacked, which will
justify this change a bit more.
2021-12-30 11:28:15 -08:00
Steve Howell 05eb4cfa5f mypy: Fix argument type for get_active_streams.
We now make it mandatory to pass in the Realm object.

If this function was ever called with None, I am scared
to know what the expected results were at the time of
writing.
2021-12-30 11:28:15 -08:00
Steve Howell 0359c083d1 refactor: Extract get_linkable_streams.
This is a one-liner with two purposes:

    * We want the comment to explain the business rule.

    * We want to just work in id space.
2021-12-30 11:28:15 -08:00
Steve Howell c4bd4496dd peformance: Cache user mentions for multiple PMs.
It's slightly annoying to plumb Optional[MentionBackend]
down the stack, but it's a one-time change.

I tried to make the cache code relatively unobtrusive
for the single-message use case.

We should be able to eliminate redundant stream queries
using similar techniques.

I considered caching at the level of rendering the message
itself, but this involves nearly as much plumbing, and
you have to account for the fact that several users on
your realm may have distinct default languages (French,
Spanish, Russian, etc.), so you would not eliminate as
many query hops. Also, if multiple streams were involved,
users would get slightly different messages based on
their prior subscriptions.
2021-12-30 11:28:15 -08:00
Steve Howell c6448263c3 refactor: Add MentionBackend.
We will eventually use this to avoid redundant
queries.

The diff is slightly noisy here, but there are no
logic changes.
2021-12-30 11:28:15 -08:00
Steve Howell a22f49bf83 refactor: Extract UserFilter.
This is setting us up for future commits.
2021-12-30 11:28:15 -08:00
Steve Howell ea252ab53e refactor: Convert FullNameInfo to a dataclass.
As part of this we no longer query for email, which
is a vestige of when we used emails to identify users
on the frontend.
2021-12-30 11:28:15 -08:00
Steve Howell f5fc348786 mypy: Add explicit types for dbdata references.
When our handlers specifically reference self.md.zulip_db_data,
we now use an explicit type.

We probably want a more robust solution here, such as a semgrep
rule.
2021-12-30 11:28:15 -08:00
Steve Howell df84892aad markdown: Convert DbData to a dataclass. 2021-12-30 11:28:15 -08:00
Steve Howell 4e551f8279 refactor: Introduce get_stream_name_map.
We only need a name -> id map, and the FullNameInfo
type was a lie.
2021-12-30 11:28:15 -08:00
Steve Howell c04a8097f3 mypy: Add EmojiInfo type.
We now serialize still_url as None for non-animated emojis,
instead of omitting the field. The webapp does proper checks
for falsiness here.  The mobile app does not yet use the field
(to my knowledge).

We bump the API version here. More discussion here:

https://chat.zulip.org/#narrow/stream/378-api-design/topic/still_url/near/1302573
2021-12-30 11:28:14 -08:00
Steve Howell a6201b430f tests: Improve checks for subscribing users.
We now check both the notification messages for
all three of Hamlet's peers.

And we count queries.
2021-12-30 11:23:25 -08:00
Steve Howell fd925e6045 streams: Add id to user mentions for stream notifications. 2021-12-30 11:23:25 -08:00
Lauryn Menard a16fcd3172 tests: Improve testing helper event schema for `update_message`.
Further clarifies the fields returned by `update_message` event
for the type of change (content, topic and/or stream).

Follow-up task from #20587.
2021-12-30 08:35:35 -08:00
BIKI DAS ad61d06cea
python: Remove unnecessary list comprehension.
`all` can take a generator, not just a list.  

Using a generator expression here is simpler and faster.
2021-12-30 06:51:50 -08:00
Anders Kaseorg b0b8f84949 test_console_output: Avoid appending to bytes in a loop.
Appending to bytes in a loop leads to a quadratic slowdown since
Python doesn’t optimize this for bytes like it does for str.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-29 16:50:08 -08:00
parth 4edf029ad5 invitations: Don't notify now-deactivated users.
While accepting an invitation from a user, there was no condition in
place to check if the user sending the invitation was now
now-deactivated.

Skip sending notifications about newly-joined users to users who are
now disabled.

Fixes #18569.
2021-12-29 16:21:19 -08:00
Steve Howell 1e4593b2ae performance: Avoid Recipient lookup.
We don't have to go to the database to get the Recipient
fields for `user_profile.recipient`.

See also 85ed6f332a from a little
over a year ago--it's very similar.
2021-12-28 12:15:02 -08:00
Steve Howell 01ebb2c85f refactor: Pass realm to bulk_remove_subscriptions.
We made a very similar change to bulk_add_subscriptions
earlier in the year.
2021-12-28 12:15:02 -08:00
Steve Howell ebbd5f168b refactor: Pass realm to notify_subscriptions_removed. 2021-12-28 12:15:02 -08:00
Steve Howell 966d88a78a stream colors: Fix stream color assignment.
The bug here probably didn't come up too much in
practice, but if we were adding a user to multiple
streams when they already had used all N available
colors, all the new streams would be assigned the same
color, since the size of used_colors would stay at N,
thwarting our little modulo-len hackery.

It's not a terrible bug, since users can obviously
customize their stream colors as they see fit.

Usually when we are adding a user to multiple streams,
the users are fairly new, and thus don't have many
existing streams, so I have never heard this bug
reported in the field.

Anyway, assigning the colors in bulk seems to make more
sense, and I added some tests.

For the situations where all the colors have already
been used, I didn't put a ton of thought into exactly
which repeated colors we want to choose; instead, I
just ensure they're different modulo 24. It's possible
that we should just have more than 24 canned colors, or
we should just assign the same default color every time
and let users change it themselves (once they've gone
beyond the 24, to be clear). Or maybe we can just do
something smarter here. I don't have enough time for a
deep dive on this issue.
2021-12-28 12:15:02 -08:00
Steve Howell fe3295d395 performance: Avoid monster query for existing subs.
Part of our codepath for subscribing users involves
fetching the users' existing subscriptions to make sure
we can do things like properly report to the clients
that the users were already subscribed.  This codepath
used to be coupled to code that helped users maintain
unique stream colors.

Suppose you are creating a new stream, and you are
importing users from an older stream with 15k
subscribers, and each of your users is subscribed to
about 20 streams.

The prior code, instead of filtering on recipient_id,
would literally look at every subscription for every
user, which was kind of crazy if you didn't understand
the pick-stream-color complications.

Before this commit, we would fetch 300k rows with 15
columns each (granted, all but one of the columns are
bool/int). That's a total of 4.5 million tiny objects
that we had to glom into Django ORM objects and slice
and dice.

After this commit, we would fetch exactly zero rows
for the are-they-already-subscribed logic.

Yes, ZERO.

If we were to mistakenly try to re-add the same 15k
subscribers to the new stream (under the new code), we
will now fetch 15k Sub rows instead of 300k.

It is worth looking at the prior commit. We go through
great pains to ensure that users get new stream colors
when we invite them to a stream, and we still fetch a
bunch of data for that. Instead of 4.5 million cells,
it's more like 600k cells (2 columns per row), and it's
less than that insofar as some users may only
have 24 distinct colors among their many streams.
It's a lot of work.
2021-12-28 12:15:02 -08:00
Steve Howell f638fd6f72 performance: Get used stream colors in separate trip.
This commit sets us up for the next commit, which will
save us a very expensive query.

If you are adding 15k users to a stream, and each user
has about 20 existing streams, then we need to retrieve
300k rows from the database to figure out which stream
colors they already have.  We don't need all the extra
fields from Subscription, so now we get just the two
values we need for making a color map.

In the next commit we'll eliminate the other use case
for the big query, and I will explain in greater
depth how splitting out the color-picking code can
be a huge win. It is possible that some product decisions
could make this codepath easier. We could also do some
engineering specific to stream colors, such as caching
which colors users have already used.

This does cost us an extra round trip to the database.
2021-12-28 12:15:02 -08:00
Steve Howell 56da570422 code cleanup: Remove unused parameter in pick_color. 2021-12-28 12:15:02 -08:00
Abhijeet Prasad Bodas aa18e797a8 test_event_queue: Generalize some helpers.
This will later allow us to also use these when
writing new tests for bots.
2021-12-28 10:59:04 -08:00
Abhijeet Prasad Bodas acdce4df47 actions: Fix misleading comment about wildcard mentions.
Having the `wildcard_mentions_notify` setting turned on does
not necessarily mean that the user will receive notification
for that message. There is more nuance to this, as explained
in the updated comment.
2021-12-28 10:58:54 -08:00
Eeshan Garg 2393342e03 webhooks/jira: Handle anomalous payloads properly.
We recently ran into a payload in production that didn't contain
an event type at all. A payload where we can't figure out the event
type is quite rare. Instead of letting these payloads run amok, we
should raise a more informative exception for such unusual payloads.
If we encounter too many of these, then we can choose to conduct a
deeper investigation on a case-by-case basis.

With some changes by Tim Abbott.
2021-12-28 10:56:25 -08:00
Mateusz Mandera c5c3ab66d6 remote_server: Migrate RemoteZulipServer.uuid to be UUIDField.
Given that these values are uuids, it's better to use UUIDField which is
meant for exactly that, rather than an arbitrary CharField.

This requires modifying some tests to use valid uuids.
2021-12-28 10:11:34 -08:00
Mateusz Mandera e48120fd12 remote_server: Validate zulip_org_id submitted by registering server.
zulip_org_id is supposed to be a UUID, so we want to actually validate
the format, not only check the length.
2021-12-28 10:11:34 -08:00
Steve Howell d62b39450e performance: Optimize send_subscription_add_events.
We avoid repeating the same calculations over and
over again for the same stream.

This helps, but the real bottleneck in this function
is that send_event usually takes at least a millisecond,
and that adds up quickly if you're doing something
like subscribing 5k users to a new stream.
2021-12-28 09:33:16 -08:00
Anders Kaseorg bc69f213a0 requirements: Upgrade Python requirements.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg 60eed65832 scim: Placate mypy 0.930.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg c8dd90f32b bot_config: Placate mypy 0.930.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg 575932f4e0 actions: Placate mypy 0.930.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg 95cddff39b test_scim: Placate mypy 0.930.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg f45b245f74 test_urls: Fix get_callback_string logic.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg 48190cf744 test_timezone: Fix ambiguous_abbrevs type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg c4c28e06d9 test_openapi: Replace convert_regex_to_url_pattern.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg e3a8f992d5 test_openapi: Fix __wrapped__ accesses.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg d40f3d54f1 test_console_output: Implement the entire TextIO contract.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg 702ce071f4 python: Accept Optional[FrameType] in signal handlers.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-28 09:31:55 -08:00
Anders Kaseorg 591bd3f4a1 webhooks: Rename Yo App to Yo.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 1d3520db12 webhooks: Remove space from UptimeRobot.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 68c99511a2 webhooks: Fix TeamCity capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 65868b09eb webhooks: Add missing space in Review Board.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg c02c053ec3 webhooks: Fix Mailchimp capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg dc72f79a83 webhooks: Fix Canarytokens pluralization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg cd8a01587b webhooks: Fix Jotform capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Anders Kaseorg 3ca2f8ca1e webhooks: Fix Clubhouse capitalization.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-22 14:05:17 -08:00
Shlok Patel b1436aed9c production: Create stream in an atomic transaction.
To avoid the window between stream creation and creation of the
Recipient object, we create the stream in an atomic transaction.

Fixes #20127
2021-12-21 15:45:45 -08:00
Anders Kaseorg dc18aadeb2 test_classes: Type kwargs for client_get and friends.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-17 08:03:52 -08:00
Anders Kaseorg 27977eddeb export: Use tar -C to switch directories.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-17 08:01:53 -08:00
Anders Kaseorg 6855df0abb export_single_user: Fix usage with relative --output directory.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-17 08:01:53 -08:00
Anders Kaseorg 0daf32310e export_single_user: Refuse to overwrite a nonempty directory.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-12-17 08:01:53 -08:00
Steve Howell 3138f7a73c mypy: Fix content types.
This got by mypy due to Message being an Any type.
2021-12-16 20:35:56 -05:00
Steve Howell 0b0faa46b4 mypy: Use object type for checker return values. 2021-12-16 19:52:35 -05:00
Tim Abbott e152f255f5 test_upload: Remove GIF file extension test.
This change should have been in the previous commit.
2021-12-16 16:16:34 -08:00
Tim Abbott 22b5e105e6 upload: Remove incorrect animated GIF asserts.
GIF files can be `.GIF`, and also we determine the file format by
inspecting the image data, so there's no reason to have this
assertion.

(The code for serving still images does not rely on the file being a
GIF.)
2021-12-16 16:13:00 -08:00
Sahil Batra 1b23cbdf3e do_change_user_role: Use transaction.atomic. 2021-12-16 14:24:30 -08:00
Sahil Batra 168f241ff0 do_create_user: Use transaction.atomic.
Have kept process_new_human_user out of
the atomic block because it involves many
different operations and also sends events.
Tried enclosing event in on_commit but that
would need many changes in the tests, so have
skipped it for now.
2021-12-16 14:24:30 -08:00
Lauryn Menard 9321095406 tests: Update event tests for `do_update_message` action.
Updates testing helpers in `event_schema.py` for `do_update_message` so
that all stream message fields are present in any edits / updates to
stream messages. Adds verfication tests of events returned from private
message edits and from stream message content-only and topic-only edits.
2021-12-16 11:01:31 -08:00
Lauryn Menard 3b72da8a7c api: Include `stream_id` field for all edits to stream messages.
Updates the `update_message` event type to always include a `stream_id`
field when the message being edited is a stream message. This change
aligns with the current definition of the `\get-events` endpoint
in the OpenAPI documentation.
2021-12-16 11:01:31 -08:00
Tim Abbott ed01e16f60 send_custom_email: Fix dry run with --remote-servers. 2021-12-14 23:19:00 -08:00
Tim Abbott af27675857 send_custom_email: Add support for emailing remote server contacts.
This isn't a fully reusable tool, since it has copy about terms of
service, but it's at least readily modified and has tests.
2021-12-14 18:11:23 -08:00
Tim Abbott f287606198 send_custom_email: Make options a mandatory kwarg. 2021-12-14 18:11:23 -08:00
Alex Vandiver 4b1fd209be send_email: Don't abort on an EmailNotDeliveredException.
It is better to press on, than stop halfway through due to a user
whose email no longer works.  The exception is already logged, which
is sufficient here, as this is generally run interactively.
2021-12-14 17:07:34 -08:00
Alex Vandiver 45736aea3c email: Don't send overly-long "To" addresses.
This parallels b7fa41601d, but with "To"
addresses, not "From" addresses.
2021-12-14 15:37:12 -08:00
Alex Vandiver c55c46706d tests: Fold two tests into TestSendEmail.
These fundamentally tested send_email, not build_email, and thus
belong in TestSendEmail, not TestBuildEmail.  They also duplicated the
code in test_send_email_exceptions; reuse it.
2021-12-14 15:37:12 -08:00
Alex Vandiver bfd7254f17 tests: Rename build-email test, expand it for expected behavior.
The key to test is that it flips to the shorter form when it would
get too long.
2021-12-14 15:37:12 -08:00
Alex Vandiver e43373cc1f video_calls: Drop VIDEO_ZOOM_TESTING_ configurations.
These are no longer needed.
2021-12-13 15:17:34 -08:00
Alex Vandiver 5ccbd0eade ifttt: Ensure topic and body are strings, and not dicts / arrays. 2021-12-13 14:59:00 -08:00
Steve Howell 16db496871 export tests: Verify files for single-user exports. 2021-12-13 12:29:19 -08:00
Steve Howell 3c63ebde15 export tests: Extract ExportFile class.
This is just moving code around.
2021-12-13 12:29:19 -08:00
Steve Howell eb0114cdee export tests: Add verify_attachment_json.
This allows verify_uploads to use the database
as the authoritative source for what attachments
we need to look for when we're verifying the
images got exported properly, while still
also verifying attachment.json is correct.
2021-12-13 12:29:19 -08:00
Steve Howell 24009cb7d3 export tests: Clean up emoji setup.
We can't use the normal RealmEmoji from the
test database.

Also, we now use an actual action function to
set up emojis for our own purposes.
2021-12-13 12:29:17 -08:00
Steve Howell c6cdf98b66 export tests: Rename method to export_realm. 2021-12-13 12:25:19 -08:00
Steve Howell c79c95d55e export tests: Split function for uploading files.
This will give us flexibility for the single-user
tests.
2021-12-13 12:25:19 -08:00
Steve Howell a215a14c00 export tests: Use verify_uploads() for s3, too. 2021-12-13 12:25:19 -08:00
Steve Howell 3f5c15320b export tests: Extract verify_uploads. 2021-12-13 12:25:19 -08:00
Steve Howell 6b5a90bbd1 tests: Extract verify_emojis. 2021-12-13 12:25:19 -08:00
Steve Howell 302ef32c5b export tests: Extract verify_realm_logo_and_icon. 2021-12-13 12:25:19 -08:00
Steve Howell b4c089d3b8 export tests: Improve how we check avatars.
We avoid code duplication, and we iterate
over all records to see if files exist.
2021-12-13 12:25:19 -08:00
Steve Howell 0c02d89bf3 export tests: Avoid passing back path_id from setup. 2021-12-13 12:25:19 -08:00
Steve Howell d3ea369057 export tests: Clean up emoji checks. 2021-12-13 12:25:19 -08:00
Steve Howell fd94ba1579 tests: Avoid returning original_avatar_path_id.
The way we check for avatars is kind of clumsy for
realms.  Ideally we would just check all users
in the realm.
2021-12-13 12:25:19 -08:00
Steve Howell dbf1ae989d tests: Avoid relying on setup data (test_image).
It is better for the verifying code to just explicitly
ensure that the exported file bytes match the bytes
in the test image.  This introduces a tiny bit more
of I/O.
2021-12-13 12:25:19 -08:00
Steve Howell 53ffb8152f tests: Use read_test_image_file helper. 2021-12-13 12:25:19 -08:00
Steve Howell 186c446458 tests: Create export files for specific user.
We no longer hackily look for the first message ever
sent within the realm.
2021-12-13 12:25:19 -08:00
Steve Howell 2debb5e5e6 tests: Add assertions for upload path_ids. 2021-12-13 12:25:19 -08:00
Steve Howell 035c90df68 export tests: Avoid full_data concept.
It's easier to read the code without the intermediate
full_data dictionary that obscures where the files live.

We also avoid some unnecessary file i/o in the tests.
2021-12-13 12:25:19 -08:00
Steve Howell 275653ad2a tests: Move helpers to module level.
(This is a pure code move apart from removing "self"
in a few places.)
2021-12-13 12:25:19 -08:00
Steve Howell 6e3e3a7bff export tests: Remove unnecessary setUp method.
I cargo-culted this in a recent commit.
2021-12-13 12:25:11 -08:00
Steve Howell 08376da7af tests: Remove dead testing code for 2nd message batch. 2021-12-13 12:25:05 -08:00
Steve Howell d63e12c233 tests: Check more tables for user exports.
We do a sanity check for every table
that gets written to user.json as part of
the single-user export.

If we add more tables to the single-user export,
the test that I modified here will now ask
the author to add a new checker function, which
means we should always have at least a basic
sanity check for every exported table as long
as we stay in this new paradigm.

We also remove a little bit of old code that
became redundant.
2021-12-12 11:16:12 -08:00
Mateusz Mandera 74f4e3e914 do_change_realm_subdomain: Use transaction.atomic. 2021-12-11 10:39:07 -08:00
Mateusz Mandera 1692b2e81b do_reactivate_realm: Use transaction.atomic. 2021-12-11 10:39:07 -08:00
Mateusz Mandera 466e0bcdb3 do_set_realm_signup_notifications_stream: Use transaction.atomic. 2021-12-11 10:39:07 -08:00
Mateusz Mandera cddfd2cc92 do_set_realm_notifications_stream: Use transaction.atomic. 2021-12-11 10:39:07 -08:00
Mateusz Mandera 4999f68ba9 do_set_realm_message_editing: Use transaction.atomic. 2021-12-11 10:39:07 -08:00
Mateusz Mandera dc9aac9253 do_set_realm_authentication_methods: Use transaction.atomic. 2021-12-11 10:38:14 -08:00
Steve Howell 21ab5e3a55 tests: Register checkers for user export test. 2021-12-11 13:06:41 -05:00
Steve Howell 7df86f3614 tests: Tweak assertion for streams. 2021-12-11 13:06:41 -05:00
Steve Howell 6be3fbde1d tests: Split out single-user tests.
I dropped a minor assertion that was kind of redundant.
2021-12-11 13:06:41 -05:00
Steve Howell b2d83a8300 tests: Split out SingleUserExportTest.
This is mostly moving code, plus I now just
call shutil.rmtree directly.
2021-12-11 13:06:41 -05:00
Tim Abbott ee77c6365a portico: Use /help/ style pages for displaying policies.
This replaces the TERMS_OF_SERVICE and PRIVACY_POLICY settings with
just a POLICIES_DIRECTORY setting, in order to support settings (like
Zulip Cloud) where there's more policies than just those two.

With minor changes by Eeshan Garg.
2021-12-10 17:56:12 -08:00
Tim Abbott 95854d9d94 terms: Rename and tweak FIRST_TIME_TERMS_OF_SERVICE_TEMPLATE.
We do s/TOS/TERMS_OF_SERVICE/ on the name, and while we're at it,
remove the assumed zerver/ namespace for the template, which isn't
correct -- Zulip Cloud related content should be in the corporate/
directory.
2021-12-10 17:56:12 -08:00
Tim Abbott 31842e1377 export: Fix empty realm_icons directory in single-user exports. 2021-12-10 12:05:34 -08:00
Tim Abbott 7dd543bee5 export_single_user: Fix extra leading tmp/ in tarball. 2021-12-10 12:05:34 -08:00
Steve Howell 2902f8b931 tests: Ensure stream senders get a UserMessage row.
We now complain if a test author sends a stream message
that does not result in the sender getting a
UserMessage row for the message.

This is basically 100% equivalent to complaining that
the author failed to subscribe the sender to the stream
as part of the test setup, as far as I can tell, so the
AssertionError instructs the author to subscribe the
sender to the stream.

We exempt bots from this check, although it is
plausible we should only exempt the system bots like
the notification bot.

I considered auto-subscribing the sender to the stream,
but that can be a little more expensive than the
current check, and we generally want test setup to be
explicit.

If there is some legitimate way than a subscribed human
sender can't get a UserMessage, then we probably want
an explicit test for that, or we may want to change the
backend to just write a UserMessage row in that
hypothetical situation.

For most tests, including almost all the ones fixed
here, the author just wants their test setup to
realistically reflect normal operation, and often devs
may not realize that Cordelia is not subscribed to
Denmark or not realize that Hamlet is not subscribed to
Scotland.

Some of us don't remember our Shakespeare from high
school, and our stream subscriptions don't even
necessarily reflect which countries the Bard placed his
characters in.

There may also be some legitimate use case where an
author wants to simulate sending a message to an
unsubscribed stream, but for those edge cases, they can
always set allow_unsubscribed_sender to True.
2021-12-10 09:40:04 -08:00
Tim Abbott 1c180a9f57 documentation: Avoid potential unused variable code path.
These variables can be unset if the `os.path.exists` check fails.

That should be rare, since we've previously checked the files do
exist before getting here.
2021-12-09 17:51:52 -08:00
Tim Abbott 4cb189fc63 settings: Rename TOS_VERSION to TERMS_OF_SERVICE_VERSION.
The previous version was appropriate in a setting where it was only
used for Zulip Cloud, but it's definitely clearer to spell it out.
2021-12-09 17:51:16 -08:00
odunybrad 90aa45a316 emoji: Add database-level uniqueness constraint for RealmEmoji.
While races here are unlikely, it is most correct to enforce this
invariant at the database layer, and having a database-level
constraint makes the models file a bit more readable.
2021-12-09 17:48:53 -08:00
Steve Howell 9a39ca217f user export: Show less info for recipients.
For PM and huddles, show full names but no
emails or other crufty fields.
2021-12-09 17:20:01 -08:00
Steve Howell 6a5c407b05 user export: Be more selective about exported messages. 2021-12-09 17:20:01 -08:00
Steve Howell fa654fd7a0 user export: Ignore realm icon and logo.
These are not considered to be "personal"
info, even if you upload them, so we
don't export them.

Generally the only folks who upload
these are admins, who can easily get
them in other ways. In fact, anybody
can get these via the app.
2021-12-09 17:20:01 -08:00
Eeshan Garg 5aaeb1a432 use_cases: Rename /for/companies to /for/business. 2021-12-09 17:16:52 -08:00
Steve Howell 8f991f8eb1 export: Make sure messages are sorted **across** files.
We now ensure that all message ids are sorted BEFORE
we split them into batches.

We now do a few extra "slim" queries to get message
ids up front.

But, now, when we divide them into batches, we no
longer run 2 or 3 different complicated queries in
a loop. We just basically hydrate our message ids,
so `write_message_partials` should be easy to reason
about.

This change also means that for tiny realms with
< 1000 messages you will always have just one
json file, since we aggregate the ids from the
queries before batching.
2021-12-09 12:22:34 -08:00
Steve Howell cef0e11816 export: Add get_id_list_gently_from_database.
This is slightly overkill for the single-user use
case, but for small queries it's barely any overhead,
and it's a nice abstraction.
2021-12-09 12:22:34 -08:00
Steve Howell 8ea320812f user exports: Chunkify messages in sorted order.
This accomplishes a few things:

    * It extracts `chunkify` rather than having us
      clumsily track chunking-related stuff in a
      big loop that is doing other stuff.

    * It makes it so that all message ids
      in message-000001.json < message-000002.json.

    * It makes it easier for us to customize
      the messages we send to a single user
      (coming soon).

BTW we probably have a slicker version of chunkify
somewhere in our codebase, but I couldn't remember
where.
2021-12-09 12:22:34 -08:00
Steve Howell 2a73964e16 user export: Add reactions.
We may eventually try to attach these to the messages
in the message-NNNNNN.json files, but for now they're
fine in user.json.
2021-12-09 12:22:34 -08:00
Mateusz Mandera 93e18fe289 migrations: Remove disallowed characters from topics.
Following b3c58f454f, we want to clean up
old topics that may contain the disallowed characters. The Message table
is large, so we go in batches, making sure we limit topic fetches and
UPDATE query to no more than BATCH_SIZE Message rows per query.
2021-12-09 09:51:06 -08:00
Steve Howell f810833df5 export: Improve export_usermessages_batch.
We no longer jankily read our input file into
an "output" variable.  Instead, we do things
in a type-safe way.
2021-12-09 08:36:40 -08:00
Steve Howell 5c1e8cb8dc mypy: Add MessagePartial TypedDict. 2021-12-09 08:36:40 -08:00
Steve Howell 09c57a3f9f export: Log more consistently and sort ids.
Now all file writes go through our three
helper functions, and we consistently
write a single log message after the file
gets written.

I killed off write_message_exports, since
all but one of its callers can call
write_table_data, which automatically
sorts data.  In particular, our Message
and UserMessage data will now be sorted
by ids.
2021-12-09 08:36:40 -08:00
Steve Howell 6ec49951c6 minor: Avoid creating intermediate list for message_ids.
This probably just postpones the list creation until
Django builds the "IN" query, but semantically it's
good to work in sets where we don't have any
meaningful ordering of the list that gets used.
2021-12-08 16:12:54 -08:00
Steve Howell f8ed099d3c export: Sort table data for most tables.
This affects most of our tables, but it excludes
table(s) like Message that go through kind of
unique codepaths.
2021-12-08 16:12:54 -08:00
Steve Howell a1d3f12e53 refactor: Extract write_table_data().
The immediate benefit of this is stronger mypy
checks (avoiding the ugly union caused by message
files).

The subsequent commit will add sorting.

We have test coverage on all these lines insofar
as if you comment out the lines, tests will
explode (i.e. more than superficial line
coverage).
2021-12-08 16:12:54 -08:00
Steve Howell c76ca2d0df export: Sort records.json files by path. 2021-12-08 16:12:54 -08:00
Steve Howell 2ef38e3d48 refactor: Extract write_records_json_file. 2021-12-08 16:12:54 -08:00
Steve Howell b79cfc19ab user export: Broaden query for RealmAuditLog.
We now check acting_user as well as modified_user
to see if a row pertains to our exported user.
2021-12-08 16:01:38 -08:00
Steve Howell 927b04368e minor: Use virtual_parent for custom fetchers.
The distinction here wasn't super meaningful
due to the way we order our "elif" statements,
but we want to reserver "normal_parent" for the
majority of use cases, where you simply tell
the Config what the "foreign_key" is.
2021-12-08 15:58:07 -08:00
Steve Howell 50120a9387 export: Remove config parameter for custom fetchers. 2021-12-08 15:58:07 -08:00
Steve Howell 54a3a423e5 mypy: Fix CustomFetch=Any hack. 2021-12-08 15:58:07 -08:00
Steve Howell 4128b52ac5 export: Rename custom fetchers. 2021-12-08 15:58:07 -08:00
Steve Howell a2c4931316 exports: Use realm for RealmAuditLog in realm exports.
For realm-wide exports, there is no reason to query
inefficiently against a list of modified users.

We move the Config out of the common child configs.
2021-12-08 15:58:07 -08:00
Steve Howell 8dd3c1038f exports: Rename parent_key to include_rows.
Even though Django usually treats foo__in
and foo_id__in identically for filters where
foo is a ForeignKey type, we want to insist
on somewhat more consistent syntax, because
we have the odd combo of type and type_id
in Recipient, where type_id is kinda like a
foreign key, but not a ForeignKey.

So we assert for now that all our include_rows
values end in "_id__in".
2021-12-08 15:58:07 -08:00
Steve Howell 02207f47d5 minor: Move code blocks to be alphabetical. 2021-12-08 15:58:07 -08:00