Commit Graph

14536 Commits

Author SHA1 Message Date
Mateusz Mandera 93e18fe289 migrations: Remove disallowed characters from topics.
Following b3c58f454f, we want to clean up
old topics that may contain the disallowed characters. The Message table
is large, so we go in batches, making sure we limit topic fetches and
UPDATE query to no more than BATCH_SIZE Message rows per query.
2021-12-09 09:51:06 -08:00
Steve Howell f810833df5 export: Improve export_usermessages_batch.
We no longer jankily read our input file into
an "output" variable.  Instead, we do things
in a type-safe way.
2021-12-09 08:36:40 -08:00
Steve Howell 5c1e8cb8dc mypy: Add MessagePartial TypedDict. 2021-12-09 08:36:40 -08:00
Steve Howell 09c57a3f9f export: Log more consistently and sort ids.
Now all file writes go through our three
helper functions, and we consistently
write a single log message after the file
gets written.

I killed off write_message_exports, since
all but one of its callers can call
write_table_data, which automatically
sorts data.  In particular, our Message
and UserMessage data will now be sorted
by ids.
2021-12-09 08:36:40 -08:00
Steve Howell 6ec49951c6 minor: Avoid creating intermediate list for message_ids.
This probably just postpones the list creation until
Django builds the "IN" query, but semantically it's
good to work in sets where we don't have any
meaningful ordering of the list that gets used.
2021-12-08 16:12:54 -08:00
Steve Howell f8ed099d3c export: Sort table data for most tables.
This affects most of our tables, but it excludes
table(s) like Message that go through kind of
unique codepaths.
2021-12-08 16:12:54 -08:00
Steve Howell a1d3f12e53 refactor: Extract write_table_data().
The immediate benefit of this is stronger mypy
checks (avoiding the ugly union caused by message
files).

The subsequent commit will add sorting.

We have test coverage on all these lines insofar
as if you comment out the lines, tests will
explode (i.e. more than superficial line
coverage).
2021-12-08 16:12:54 -08:00
Steve Howell c76ca2d0df export: Sort records.json files by path. 2021-12-08 16:12:54 -08:00
Steve Howell 2ef38e3d48 refactor: Extract write_records_json_file. 2021-12-08 16:12:54 -08:00
Steve Howell b79cfc19ab user export: Broaden query for RealmAuditLog.
We now check acting_user as well as modified_user
to see if a row pertains to our exported user.
2021-12-08 16:01:38 -08:00
Steve Howell 927b04368e minor: Use virtual_parent for custom fetchers.
The distinction here wasn't super meaningful
due to the way we order our "elif" statements,
but we want to reserver "normal_parent" for the
majority of use cases, where you simply tell
the Config what the "foreign_key" is.
2021-12-08 15:58:07 -08:00
Steve Howell 50120a9387 export: Remove config parameter for custom fetchers. 2021-12-08 15:58:07 -08:00
Steve Howell 54a3a423e5 mypy: Fix CustomFetch=Any hack. 2021-12-08 15:58:07 -08:00
Steve Howell 4128b52ac5 export: Rename custom fetchers. 2021-12-08 15:58:07 -08:00
Steve Howell a2c4931316 exports: Use realm for RealmAuditLog in realm exports.
For realm-wide exports, there is no reason to query
inefficiently against a list of modified users.

We move the Config out of the common child configs.
2021-12-08 15:58:07 -08:00
Steve Howell 8dd3c1038f exports: Rename parent_key to include_rows.
Even though Django usually treats foo__in
and foo_id__in identically for filters where
foo is a ForeignKey type, we want to insist
on somewhat more consistent syntax, because
we have the odd combo of type and type_id
in Recipient, where type_id is kinda like a
foreign key, but not a ForeignKey.

So we assert for now that all our include_rows
values end in "_id__in".
2021-12-08 15:58:07 -08:00
Steve Howell 02207f47d5 minor: Move code blocks to be alphabetical. 2021-12-08 15:58:07 -08:00
Steve Howell aae9f1b6f5 export: Make Config errors more clear. 2021-12-08 15:58:07 -08:00
Nikhil Maske 091772b534 hotspots: Remove intro_reply hotspot.
Zulip shows two guides on How to reply, first one by
the welcome bot and second one is intro_reply hotspot.
To simply and avoid redundancy, intro_reply hotspot is
removed.

Fixes #20482.
2021-12-07 21:55:59 -08:00
Nipunn Koorapati 0ca49bc93a emoji reactions: Order reactions query results by id.
Force postgres to give reactions in ID order - which
is generally chronological order. Results in frontend
displaying reactions in said order.

Fixes #20060.
2021-12-07 15:02:46 -08:00
Eeshan Garg 3714a30e63 stream notifications: Add helper for silent user mention syntax.
In many of our stream notification messages, we make use of the
same silent user mention syntax, the template for which was always
hardcoded. This commit adds a helper function that all relevant
callers can call to get the right syntax when mentioning users.

Thanks to Tim Abbott for this suggestion!
2021-12-07 14:53:50 -08:00
Eeshan Garg 8ebe05f644 streams: Add RealmAuditLog entry for message retention updates. 2021-12-07 14:53:50 -08:00
Eeshan Garg d2901892e2 streams: Add notifications for message retention policy updates.
This is a part of #20289.
2021-12-07 14:53:50 -08:00
S-Abhishek 186d1a83e9 narrow_banner: Move empty narrow messages to handlebar templates.
Removed existing empty narrow divs from app/home.html and created
a new javascript module to dynamically load empty narrow messages
using handlebar template.

Fixes #18797
2021-12-07 13:38:48 -08:00
Steve Howell 6381c2e535 tests: Make sure import doesn't corrupt original realm.
The original intention of this was to prevent coding
errors with realm getters that don't, um, filter
on realm.

Unfortunately, you can still write a broken realm getter
that forgets to filter on realm, but which returns a
Set, and the new safeguards won't see any difference.

We could make all the getters return sorted lists
instead, but that's for another day.

This code does serve another purpose, which is to
prevet egregious bugs in the import itself.
2021-12-07 12:27:01 -08:00
Steve Howell fea659eacd tests: Extract get_getters. 2021-12-07 12:27:01 -08:00
Steve Howell 5803057589 tests: Make some helpers class-level.
This is somewhat tactical in nature. I want to
extract a huge chunk of code that minorly depends
on these helpers.
2021-12-07 12:27:01 -08:00
Steve Howell 29bd1e8bd3 tests: Avoid clutter within long list of getters.
The diff here is ugly, but to summarize:

    BEFORE IMPORT:
        define get_user_id
        define get_huddle_hashes

    AFTER IMPORT AND MAKING GETTERS:
        check realm id
        define assert_realm_values
        verify emoji codes
        check huddle hashes
2021-12-07 12:27:01 -08:00
Steve Howell 93761cd237 tests: Add getter decorator for import test. 2021-12-07 12:27:01 -08:00
Steve Howell 5892748c7b tests: Avoid lambdas in import test. 2021-12-07 12:27:01 -08:00
Steve Howell 54a6c82282 tests: Avoid equal flag for huddle hashes.
There's no need to complexify the codepath
for all the normal use cases.
2021-12-07 12:27:01 -08:00
Steve Howell 6d09eab285 export: Export file images for single users.
We don't have automated test coverage on this yet,
but below are the results from manual testing.

Note that we include the realm icon and logo even
though they were not created by Cordelia.

    ./manage.py export_single_user cordelia@zulip.com

    $ (cd /tmp/zulip-export-4v3mo802/ && find .)
    .
    ./emoji
    ./emoji/2
    ./emoji/2/emoji
    ./emoji/2/emoji/images
    ./emoji/2/emoji/images/3.jpg
    ./emoji/records.json
    ./messages-000001.json
    ./realm_icons
    ./realm_icons/2
    ./realm_icons/2/night_logo.original
    ./realm_icons/2/night_logo.png
    ./realm_icons/2/icon.png
    ./realm_icons/2/icon.original
    ./realm_icons/records.json
    ./avatars
    ./avatars/2
    ./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.original
    ./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.png
    ./avatars/records.json
    ./uploads
    ./uploads/2
    ./uploads/2/68
    ./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt
    ./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt/denver.jpg
    ./uploads/2/96
    ./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim
    ./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim/denver.jpg
    ./uploads/records.json
    ./user.json
2021-12-07 11:16:52 -08:00
Steve Howell b8d9143318 export: Validate emoji paths.
(We lift the RealmEmoji query to be used by
both local and S3 storage helpers.)
2021-12-07 11:16:52 -08:00
Steve Howell ef6d9b10d2 refactor: Extract get_emoji_path. 2021-12-07 11:16:52 -08:00
Steve Howell 5a41904201 export: Add handle_system_bots flag.
We will set this to False for single-user exports.
2021-12-07 11:16:52 -08:00
Steve Howell 0e19deb558 exports: Limit s3 upload exports with path_id checks. 2021-12-07 11:16:52 -08:00
Steve Howell f6cbf931ae refactor: Pass attachments to export_uploads_from_local.
The next commit will use attachments in the s3 path.
2021-12-07 11:16:52 -08:00
Steve Howell 03f40a64d4 refactor: Pass valid_hashes to export_files_from_s3. 2021-12-07 11:16:52 -08:00
Steve Howell 15bc677f35 export: Pass users to export_avatars_from_local. 2021-12-07 11:16:52 -08:00
Eeshan Garg 79e9ba13e2 billing: Add do_change_remote_server_plan_type.
This is a part of the plumbing we need to support billing for
self-hosted customers.

With documentation changes from tabbott.
2021-12-07 10:25:37 -08:00
Eeshan Garg 2cdaae681d actions: Rename do_change_plan_type -> do change_realm_plan_type.
We will soon be adding an equivalent function for RemoteZulipServer,
so it makes sense to rename this function to be more descriptive.
2021-12-06 16:18:53 -08:00
Steve Howell 42ecabe967 export: Add check_metadata flag. 2021-12-06 15:09:37 -08:00
Steve Howell 0166f13d83 s3 exports: Validate user metadata for all assets.
This preps us to download assets for just a single
user.
2021-12-06 15:09:37 -08:00
Steve Howell 946ab22bba refactor: Lift users query to caller.
This preps us to reuse this code for single users
(after a few more subsequent changes).
2021-12-06 15:09:37 -08:00
Steve Howell b0e5c1d3b9 export: Remove paranoid assertion.
There are tactical reasons to remove this assertion.

Basically, the reason it's safe to remove is that it's
been around a long time and we would have seen this
operationally. Also, the check to make sure that the
S3 filename thingy matches the avatar hash is a much
stronger check.

We will soon restore a stronger version of this check
that applies to all of our asset types (emojis/avatars/etc.).
2021-12-06 15:09:37 -08:00
Steve Howell 59951ae52b refactor: Move metadata checks for s3 export.
This technically broadens the check for user_profile_id,
but we write that metadata on every record.
2021-12-06 15:09:37 -08:00
Steve Howell a27a6a4548 refactor: Inline _check_key_metadata.
This was only called in one place.
2021-12-06 15:09:37 -08:00
Steve Howell db39948be5 refactor: Pass in flavor to export_files_from_s3. 2021-12-06 15:09:37 -08:00
Steve Howell f4354c896b refactor: Pass object_key into export_files_from_s3.
This makes it easier to read the calling code and see
the big picture of how the four asset types are
organized.

I also handle uploads first, to be similar to the local
code.

This code is well tested--you can modify any of the callers
to pass in a wrong value of `object_key` and get a failing
test.
2021-12-06 15:09:37 -08:00
Tim Abbott 8aafce5619 test_signup: Fix test failures with week old test database.
The comment explains in more detail, but basically we'd skip
exercising a bit of code in the signup code path if there were no
messages in the last week, resulting in the query count not matching.
2021-12-06 14:08:37 -08:00