zulip

Commit Graph

Author	SHA1	Message	Date
Steve Howell	8f991f8eb1	export: Make sure messages are sorted across files. We now ensure that all message ids are sorted BEFORE we split them into batches. We now do a few extra "slim" queries to get message ids up front. But, now, when we divide them into batches, we no longer run 2 or 3 different complicated queries in a loop. We just basically hydrate our message ids, so `write_message_partials` should be easy to reason about. This change also means that for tiny realms with < 1000 messages you will always have just one json file, since we aggregate the ids from the queries before batching.	2021-12-09 12:22:34 -08:00
Steve Howell	cef0e11816	export: Add get_id_list_gently_from_database. This is slightly overkill for the single-user use case, but for small queries it's barely any overhead, and it's a nice abstraction.	2021-12-09 12:22:34 -08:00
Steve Howell	8ea320812f	user exports: Chunkify messages in sorted order. This accomplishes a few things: * It extracts `chunkify` rather than having us clumsily track chunking-related stuff in a big loop that is doing other stuff. * It makes it so that all message ids in message-000001.json < message-000002.json. * It makes it easier for us to customize the messages we send to a single user (coming soon). BTW we probably have a slicker version of chunkify somewhere in our codebase, but I couldn't remember where.	2021-12-09 12:22:34 -08:00
Steve Howell	2a73964e16	user export: Add reactions. We may eventually try to attach these to the messages in the message-NNNNNN.json files, but for now they're fine in user.json.	2021-12-09 12:22:34 -08:00
Mateusz Mandera	93e18fe289	migrations: Remove disallowed characters from topics. Following `b3c58f454f`, we want to clean up old topics that may contain the disallowed characters. The Message table is large, so we go in batches, making sure we limit topic fetches and UPDATE query to no more than BATCH_SIZE Message rows per query.	2021-12-09 09:51:06 -08:00
Steve Howell	f810833df5	export: Improve export_usermessages_batch. We no longer jankily read our input file into an "output" variable. Instead, we do things in a type-safe way.	2021-12-09 08:36:40 -08:00
Steve Howell	5c1e8cb8dc	mypy: Add MessagePartial TypedDict.	2021-12-09 08:36:40 -08:00
Steve Howell	09c57a3f9f	export: Log more consistently and sort ids. Now all file writes go through our three helper functions, and we consistently write a single log message after the file gets written. I killed off write_message_exports, since all but one of its callers can call write_table_data, which automatically sorts data. In particular, our Message and UserMessage data will now be sorted by ids.	2021-12-09 08:36:40 -08:00
Steve Howell	6ec49951c6	minor: Avoid creating intermediate list for message_ids. This probably just postpones the list creation until Django builds the "IN" query, but semantically it's good to work in sets where we don't have any meaningful ordering of the list that gets used.	2021-12-08 16:12:54 -08:00
Steve Howell	f8ed099d3c	export: Sort table data for most tables. This affects most of our tables, but it excludes table(s) like Message that go through kind of unique codepaths.	2021-12-08 16:12:54 -08:00
Steve Howell	a1d3f12e53	refactor: Extract write_table_data(). The immediate benefit of this is stronger mypy checks (avoiding the ugly union caused by message files). The subsequent commit will add sorting. We have test coverage on all these lines insofar as if you comment out the lines, tests will explode (i.e. more than superficial line coverage).	2021-12-08 16:12:54 -08:00
Steve Howell	c76ca2d0df	export: Sort records.json files by path.	2021-12-08 16:12:54 -08:00
Steve Howell	2ef38e3d48	refactor: Extract write_records_json_file.	2021-12-08 16:12:54 -08:00
Steve Howell	b79cfc19ab	user export: Broaden query for RealmAuditLog. We now check acting_user as well as modified_user to see if a row pertains to our exported user.	2021-12-08 16:01:38 -08:00
Steve Howell	927b04368e	minor: Use virtual_parent for custom fetchers. The distinction here wasn't super meaningful due to the way we order our "elif" statements, but we want to reserve "normal_parent" for the majority of use cases, where you simply tell the Config what the "foreign_key" is.	2021-12-08 15:58:07 -08:00
Steve Howell	50120a9387	export: Remove config parameter for custom fetchers.	2021-12-08 15:58:07 -08:00
Steve Howell	54a3a423e5	mypy: Fix CustomFetch=Any hack.	2021-12-08 15:58:07 -08:00
Steve Howell	4128b52ac5	export: Rename custom fetchers.	2021-12-08 15:58:07 -08:00
Steve Howell	a2c4931316	exports: Use realm for RealmAuditLog in realm exports. For realm-wide exports, there is no reason to query inefficiently against a list of modified users. We move the Config out of the common child configs.	2021-12-08 15:58:07 -08:00
Steve Howell	8dd3c1038f	exports: Rename parent_key to include_rows. Even though Django usually treats foo__in and foo_id__in identically for filters where foo is a ForeignKey type, we want to insist on somewhat more consistent syntax, because we have the odd combo of type and type_id in Recipient, where type_id is kinda like a foreign key, but not a ForeignKey. So we assert for now that all our include_rows values end in "_id__in".	2021-12-08 15:58:07 -08:00
Steve Howell	02207f47d5	minor: Move code blocks to be alphabetical.	2021-12-08 15:58:07 -08:00
Steve Howell	aae9f1b6f5	export: Make Config errors more clear.	2021-12-08 15:58:07 -08:00
Nikhil Maske	091772b534	hotspots: Remove intro_reply hotspot. Zulip shows two guides on how to reply: the first from the welcome bot and the second from the intro_reply hotspot. To simplify and avoid redundancy, the intro_reply hotspot is removed. Fixes #20482.	2021-12-07 21:55:59 -08:00
Nipunn Koorapati	0ca49bc93a	emoji reactions: Order reactions query results by id. Force postgres to give reactions in ID order - which is generally chronological order. Results in frontend displaying reactions in said order. Fixes #20060.	2021-12-07 15:02:46 -08:00
Eeshan Garg	3714a30e63	stream notifications: Add helper for silent user mention syntax. In many of our stream notification messages, we make use of the same silent user mention syntax, the template for which was always hardcoded. This commit adds a helper function that all relevant callers can call to get the right syntax when mentioning users. Thanks to Tim Abbott for this suggestion!	2021-12-07 14:53:50 -08:00
Eeshan Garg	8ebe05f644	streams: Add RealmAuditLog entry for message retention updates.	2021-12-07 14:53:50 -08:00
Eeshan Garg	d2901892e2	streams: Add notifications for message retention policy updates. This is a part of #20289.	2021-12-07 14:53:50 -08:00
S-Abhishek	186d1a83e9	narrow_banner: Move empty narrow messages to handlebar templates. Removed existing empty narrow divs from app/home.html and created a new javascript module to dynamically load empty narrow messages using handlebar template. Fixes #18797	2021-12-07 13:38:48 -08:00
Steve Howell	6381c2e535	tests: Make sure import doesn't corrupt original realm. The original intention of this was to prevent coding errors with realm getters that don't, um, filter on realm. Unfortunately, you can still write a broken realm getter that forgets to filter on realm, but which returns a Set, and the new safeguards won't see any difference. We could make all the getters return sorted lists instead, but that's for another day. This code does serve another purpose, which is to prevent egregious bugs in the import itself.	2021-12-07 12:27:01 -08:00
Steve Howell	fea659eacd	tests: Extract get_getters.	2021-12-07 12:27:01 -08:00
Steve Howell	5803057589	tests: Make some helpers class-level. This is somewhat tactical in nature. I want to extract a huge chunk of code that minorly depends on these helpers.	2021-12-07 12:27:01 -08:00
Steve Howell	29bd1e8bd3	tests: Avoid clutter within long list of getters. The diff here is ugly, but to summarize: BEFORE IMPORT: define get_user_id define get_huddle_hashes AFTER IMPORT AND MAKING GETTERS: check realm id define assert_realm_values verify emoji codes check huddle hashes	2021-12-07 12:27:01 -08:00
Steve Howell	93761cd237	tests: Add getter decorator for import test.	2021-12-07 12:27:01 -08:00
Steve Howell	5892748c7b	tests: Avoid lambdas in import test.	2021-12-07 12:27:01 -08:00
Steve Howell	54a6c82282	tests: Avoid equal flag for huddle hashes. There's no need to complexify the codepath for all the normal use cases.	2021-12-07 12:27:01 -08:00
Steve Howell	6d09eab285	export: Export file images for single users. We don't have automated test coverage on this yet, but below are the results from manual testing. Note that we include the realm icon and logo even though they were not created by Cordelia. ./manage.py export_single_user cordelia@zulip.com $ (cd /tmp/zulip-export-4v3mo802/ && find .) . ./emoji ./emoji/2 ./emoji/2/emoji ./emoji/2/emoji/images ./emoji/2/emoji/images/3.jpg ./emoji/records.json ./messages-000001.json ./realm_icons ./realm_icons/2 ./realm_icons/2/night_logo.original ./realm_icons/2/night_logo.png ./realm_icons/2/icon.png ./realm_icons/2/icon.original ./realm_icons/records.json ./avatars ./avatars/2 ./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.original ./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.png ./avatars/records.json ./uploads ./uploads/2 ./uploads/2/68 ./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt ./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt/denver.jpg ./uploads/2/96 ./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim ./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim/denver.jpg ./uploads/records.json ./user.json	2021-12-07 11:16:52 -08:00
Steve Howell	b8d9143318	export: Validate emoji paths. (We lift the RealmEmoji query to be used by both local and S3 storage helpers.)	2021-12-07 11:16:52 -08:00
Steve Howell	ef6d9b10d2	refactor: Extract get_emoji_path.	2021-12-07 11:16:52 -08:00
Steve Howell	5a41904201	export: Add handle_system_bots flag. We will set this to False for single-user exports.	2021-12-07 11:16:52 -08:00
Steve Howell	0e19deb558	exports: Limit s3 upload exports with path_id checks.	2021-12-07 11:16:52 -08:00
Steve Howell	f6cbf931ae	refactor: Pass attachments to export_uploads_from_local. The next commit will use attachments in the s3 path.	2021-12-07 11:16:52 -08:00
Steve Howell	03f40a64d4	refactor: Pass valid_hashes to export_files_from_s3.	2021-12-07 11:16:52 -08:00
Steve Howell	15bc677f35	export: Pass users to export_avatars_from_local.	2021-12-07 11:16:52 -08:00
Eeshan Garg	79e9ba13e2	billing: Add do_change_remote_server_plan_type. This is a part of the plumbing we need to support billing for self-hosted customers. With documentation changes from tabbott.	2021-12-07 10:25:37 -08:00
Eeshan Garg	2cdaae681d	actions: Rename do_change_plan_type -> do_change_realm_plan_type. We will soon be adding an equivalent function for RemoteZulipServer, so it makes sense to rename this function to be more descriptive.	2021-12-06 16:18:53 -08:00
Steve Howell	42ecabe967	export: Add check_metadata flag.	2021-12-06 15:09:37 -08:00
Steve Howell	0166f13d83	s3 exports: Validate user metadata for all assets. This preps us to download assets for just a single user.	2021-12-06 15:09:37 -08:00
Steve Howell	946ab22bba	refactor: Lift users query to caller. This preps us to reuse this code for single users (after a few more subsequent changes).	2021-12-06 15:09:37 -08:00
Steve Howell	b0e5c1d3b9	export: Remove paranoid assertion. There are tactical reasons to remove this assertion. Basically, the reason it's safe to remove is that it's been around a long time and we would have seen this operationally. Also, the check to make sure that the S3 filename thingy matches the avatar hash is a much stronger check. We will soon restore a stronger version of this check that applies to all of our asset types (emojis/avatars/etc.).	2021-12-06 15:09:37 -08:00
Steve Howell	59951ae52b	refactor: Move metadata checks for s3 export. This technically broadens the check for user_profile_id, but we write that metadata on every record.	2021-12-06 15:09:37 -08:00
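
Commits 8ea320812f and 8f991f8eb1 above describe gathering message ids up front, sorting them, and only then splitting them into batches, so that every id in message-000001.json is smaller than every id in message-000002.json. A minimal sketch of that idea follows. It is not the code from those commits: the `chunkify` signature, the hypothetical `batch_sorted_ids` helper, and the default batch size of 1000 are all assumptions made for illustration.

```python
from typing import Iterator, List, Set


def chunkify(ids: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most chunk_size ids (assumed signature)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(ids), chunk_size):
        yield ids[start : start + chunk_size]


def batch_sorted_ids(id_sets: List[Set[int]], chunk_size: int = 1000) -> List[List[int]]:
    """Hypothetical helper: merge ids from several queries, sort, then batch.

    Sorting BEFORE batching is what guarantees that ids never interleave
    across output files.
    """
    all_ids = sorted(set().union(*id_sets))
    return list(chunkify(all_ids, chunk_size))


# Two overlapping result sets, batched two ids at a time.
batches = batch_sorted_ids([{5, 1, 9}, {3, 9, 2}], chunk_size=2)
```

With fewer ids than `chunk_size`, the result is a single batch, which matches the note in 8f991f8eb1 that tiny realms end up with just one JSON file.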

14540 Commits