zulip

Commit Graph

Author	SHA1	Message	Date
Vishnu Ks	719abbd352	test_classes: Move rm_tree to test_classes.	2019-04-04 13:51:52 -07:00
Tim Abbott	12d5e870c5	tests: Fix import test failure. Broken in `4d08461ab1`.	2019-02-12 17:46:55 -08:00
Anders Kaseorg	3127fb4dbd	zerver/tests: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:43:03 -08:00
Tim Abbott	8a90441d2f	slack import: Import long-inactive users as long-term idle. This avoids creating UserMessage rows for long-inactive users in organizations with many thousands of users.	2018-12-16 18:52:20 -08:00
Rishi Gupta	8a95526ced	billing: Always transition to Realm.LIMITED via do_change_plan_type. Fixes a bug in import_realm where secondary attributes like message visibility weren't being set, and also makes bugs like this less likely in the future. Also, putting the plan_type change at the end of import_realm, so that future restrictions to LIMITED realms don't affect the import process.	2018-12-13 13:26:24 -08:00
rht	e59ff6e6db	slack import: Eliminate need to load all messages into memory. This works by yielding messages sorted based on timestamp. Because the Slack exports are broken into files by date, it's convenient to do a 2-layer sorting process, where we open all the files for a given day, and then sort their messages by timestamp before yielding them. Fixes #10930.	2018-12-05 12:20:50 -08:00
Steve Howell	d86dd165da	gitter/slack/hipchat: Remove "subject" from conversions. We (lexically) remove "subject" from the conversion code. The `build_message` helper calls `set_topic_name` under the hood, so things still have "subject" in the JSON. There was good code coverage on `build_message`.	2018-11-12 15:47:11 -08:00
Steve Howell	30c493ed24	slack import: Generate message_id/reaction_id with NEXT_ID. This avoids the need to pass tuples of ints around, which is pretty brittle.	2018-10-29 13:24:50 -07:00
Steve Howell	2f58eb1057	slack import: Extract process_message_files(). This is mostly an extraction, but it does change the way we calculate `content`. We append the markdown links from ALL files to any content that came in the message itself. Separating this out also allows us to add more test coverage for the extracted code.	2018-10-29 13:24:50 -07:00
Steve Howell	00f822a26a	conversion: Generate attachment_ids with helpers.	2018-10-29 13:24:50 -07:00
Steve Howell	5cb60f7bea	conversions: Use subscriber_map for Slack/Gitter. We now use subscriber_map for building UserMessage rows in Slack/Gitter conversions. This is mostly designed to simplify the code, rather than having to scan the entire list of subscribers for each message. I am guessing this will improve performance for most conversions. We sort small lists on every message, in order to be deterministic, but the sorting cost is probably more than offset by avoiding the O(N) scans across all subscriptions. Also, it's probably negligible in the grand scheme of things, compared to JSON parsing, file I/O, etc. This commit also fixes some typos with mentioned_users_id -> mentioned_user_ids and cleans up a test a bit as well.	2018-10-29 13:24:50 -07:00
Steve Howell	5194701787	conversions: Use NEXT_ID for usermessage_id. This is mostly complicated due to the way that the Slack import passes around tuples of ids to maintain four different parallel sequences.	2018-10-29 13:24:50 -07:00
Rhea Parekh	3ff339c294	slack import: Add support for uploads in messages through 'files' keyword. It appears that Slack just changed their export format, and now uses this `files` list for user-uploaded files.	2018-08-10 16:20:36 -07:00
Rhea Parekh	18a4904437	import: Move 'build_attachment' to import_util.	2018-08-07 16:45:42 -07:00
Rhea Parekh	b6ccc0bc52	import: Move 'build_defaultstream' to import_util.	2018-08-07 16:45:42 -07:00
Rhea Parekh	bee3964f14	import: Move 'build_usermessages' to import_util.	2018-08-07 16:45:42 -07:00
Rhea Parekh	87cc1a6280	import: Move 'build_subscription' and 'build_recipient' to import_util.	2018-08-07 16:35:56 -07:00
Rhea Parekh	1117455a90	import: Move 'ZerverFieldsT' and 'build_zerver_realm' to import_util.	2018-08-07 16:35:56 -07:00
Rhea Parekh	b8e1e8b31d	import: Add slack import files in zerver/data_import directory.	2018-08-01 11:52:14 -07:00
Rhea Parekh	4bbccd8287	import: Import RealmAuditLog when `zerver_realmauditlog` is missing. * If `zerver_realmauditlog` is present in the exported data, `RealmAuditLog` would be imported normally. * If it is not present, the `create_subscription_events` function would create the `subscription_created` events for RealmAuditLog. The reason this function is in the `import_realm` module and not in the individual export tool scripts (like Slack) is that this function would be common to all export tools. This fixes #9846 for users who have not already done an import of their organization from Slack. Fixes #9846.	2018-07-10 16:00:19 +05:30
Anders Kaseorg	d8ba378050	test_slack_importer: Remove backslashes wrongly copied from JSON data https://github.com/houstondatavis/slack-export/blob/master/users.json JSON or JavaScript decodes "\/" to / (and some encoders always write "\/" to avoid accidentally creating a </script> tag), while Python assumes "\/" is a typo for "\\/" and decodes it to \/. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-07-03 16:54:46 +02:00
Anders Kaseorg	037f696d26	Enable pycodestyle W605 (invalid escape sequence). The only changes visible at the AST level, checked using https://github.com/asottile/astpretty, are zerver/lib/test_fixtures.py: '\x1b\\[(1\|0)m' ↦ '\\x1b\\[(1\|0)m' '\\[[X\| ]\\] (\\d+_.+)\n' ↦ '\\[[X\| ]\\] (\\d+_.+)\\n' which is fine because re treats '\\x1b' and '\\n' the same way as '\x1b' and '\n'. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2018-07-03 16:54:46 +02:00
Rhea Parekh	6b7b6b38ad	slack import: Write messages batch-wise. Messages can be bulky, and storing them in a single data structure can cause a memory error. In this commit, the messages are written to a file batch-wise, thus avoiding the memory error.	2018-07-01 07:08:13 -07:00
Rhea Parekh	7f6c174099	slack import: Add 'id_list' field in channel_message_to_zerver_message. The id_list would help to store the associated max ID state between subsequent calls, which will help in batch-wise processing of the messages.	2018-07-01 07:08:13 -07:00
Rhea Parekh	af20ef4789	slack import: Save messages within convert_slack_workspace_messages. Previously, the messages were being stored in an output file from outside the function 'convert_slack_workspace_messages', but now we store them from inside that function. This will help in processing and saving the messages batch-wise so as to avoid a memory error. Reactions are returned separately from 'convert_slack_workspace_messages' rather than 'message_json'. Also updated the test for 'convert_slack_workspace_messages' and added an additional test for reactions.	2018-07-01 07:08:13 -07:00
Rhea Parekh	2f88ca7446	slack import: Import skype and phone data of users.	2018-07-01 07:05:40 -07:00
Aditya Bansal	2f3b2fbf59	zerver/tests: Change use of typing.Text to str.	2018-05-10 14:19:49 -07:00
Tim Abbott	ff9371d63c	slack import: Fix issues with Slack empty files. Fixes #9217.	2018-04-25 10:20:55 -07:00
Tim Abbott	c4b886d8ae	import: Split out import.py into its own module. This should make it a bit easier to find the code.	2018-04-23 15:21:12 -07:00
Preston Hansen	e168f9938c	tests: Refactor use of test and webhook data fixtures.	2018-04-19 21:50:29 -07:00
Preston Hansen	76d6c71595	tests: Move zerver/fixtures to zerver/tests/fixtures for clarity. Fixes #9153.	2018-04-19 21:50:17 -07:00
Tim Abbott	1410a1e460	slack import: Remove unnecessary zerver_realm_skeleton.json. This was stored as a fixture file under zerver/fixtures, which caused problems, since we don't show that directory under production (as it's part of the test system). The simplest emergency fix here would be to just move the file, but when looking at it, it's clear that we don't need or want a fixture file here; we want a Python object, so we just do that. A valuable follow-up improvement to this block would be to create an actual new Realm object (not saved to the database), and dump it with the same code we use in the export tool; that should handle the vast majority of these correctly. Fixes #9123.	2018-04-18 10:33:53 -07:00
Rhea Parekh	7c0c3930a8	slack importer: Thread avatar downloads.	2018-04-15 19:53:01 +05:30
Rhea Parekh	f7398cbb09	slack import: Implement custom profile fields. Add custom profile fields in the slack converted data 'realm' file. Added tests for the custom profile fields. Fixes #8928	2018-04-10 13:28:53 -07:00
Rhea Parekh	852e8516b4	slack import: Add custom profile fields. Build CustomProfileField and CustomProfileFieldValue for every user and process the field type after getting an entire list of the custom fields.	2018-04-10 13:28:53 -07:00
rht	7a8655cc50	Slack importer: Add test for Slack channel mention to Zulip stream mention.	2018-04-09 10:47:39 -07:00
Rhea Parekh	2baa9bc16e	Import: Add subdomain in the import script. Also remove user input of subdomain in the slack data conversion script.	2018-04-06 09:12:56 -07:00
Rhea Parekh	1bba6cc4ce	slack importer: Support custom emoji reactions.	2018-04-01 23:24:35 -07:00
Rhea Parekh	c650b8fa3e	slack importer: Add zerver_realmemoji.	2018-04-01 23:24:35 -07:00
Rhea Parekh	b133d175a7	slack importer: Change 'get_user_data' function implementation. Change the 'get_user_data' function to a more general function to get data from the Slack API using legacy tokens. Also, change the error handling, since for an invalid token the response status is 200 but the response has an error field in it. For example, go to the following link with an invalid token: https://slack.com/api/emoji.list?token=xoxp-249056023425	2018-04-01 23:24:35 -07:00
Rhea Parekh	220ad6a386	slack importer: Map standard reactions. As mentioned in https://get.slack.help/hc/en-us/articles/202931348-Use-emoji-and-emoticons, slack supports the standard emoji codes (https://www.webpagefx.com/tools/emoji-cheat-sheet/) and majority of them are already supported in Zulip.	2018-04-01 23:24:35 -07:00
Rhea Parekh	8a028142d8	slack importer: Remove id allocation function and its implementation. Remove allocation ID function from slack import script. All the IDs count will start from 0. Hence the ID List returned by the allocation function is of no use, and we remove its implementation. (example: get_total_messages_and_attachments function is of no use anymore, hence we remove it)	2018-04-01 23:10:55 -07:00
Rhea Parekh	d147bd25d0	import script: Change file path of the upload in the import script. In importing avatars, we use the implementation where the 'avatar_path' is separately calculated using realm and user ID and then the content of the path provided in the avatar's 'records.json' is copied to this 'avatar_path'. Similarly, here for the uploads, 's3_file_name' is separately calculated using the realm ID and uploaded file name and then the content of the path provided in the upload's 'records.json' is copied to this 's3_file_name'.	2018-04-01 23:04:14 -07:00
Rhea Parekh	6f3c87006b	slack importer: Move output folder being extracted from /tmp to var/.	2018-03-16 11:12:58 -07:00
Rhea Parekh	4b66a2d0dc	slack importer: Add function to fetch and save uploads.	2018-03-16 11:12:58 -07:00
Rhea Parekh	e62945eb86	slack importer: Implement changes in script due to zerver_attachment.	2018-03-16 11:12:58 -07:00
Rhea Parekh	8e2d930644	slack importer: Implement changes in script due to user upload object.	2018-03-16 11:12:58 -07:00
Rhea Parekh	b0851eb20b	slack importer: Add helper functions to build attachment object.	2018-03-16 11:12:58 -07:00
Rhea Parekh	68af6e4b7a	slack importer: Add helper functions to build user uploads object.	2018-03-16 11:12:58 -07:00
Rhea Parekh	90a3ffc5c0	slack importer: Include only slack's purpose field in description.	2018-03-15 23:50:32 -07:00
Rhea Parekh	8a4f307c43	slack importer: Change topic for imported content.	2018-03-15 23:50:32 -07:00
Rhea Parekh	a5b0957e5d	slack importer: Set domain name in 'do_convert_data'. The domain name is being set in the helper function 'slack_workspace_to_realm', but it should be set in the main function 'do_convert_data', as we need it in other child functions of 'do_convert_data'.	2018-03-15 18:34:51 -07:00
Rhea Parekh	d4374880d5	slack importer: Clear 'output_dir' at the beginning of test. If the test fails, the 'output_dir' would not be deleted and hence it would give an error when we run the tests next time, as 'do_convert_data' expects an empty 'output_dir'. Also the unzipped data file should be removed if the test fails at 'do_convert_data'.	2018-03-08 07:53:09 -08:00
Rhea Parekh	7878cc53a3	slack importer: Cleanup build_subscription.	2018-03-07 14:07:24 -08:00
Rhea Parekh	f947194e4c	slack importer: Cleanup build_avatar.	2018-03-07 14:07:24 -08:00
Rhea Parekh	5efe05d5b6	slack importer: Cleanup build_zerver_usermessage.	2018-03-07 14:07:24 -08:00
Rhea Parekh	c2d4b49bf3	slack importer: Use precomputed value in 'channel_message_to_zerver_message'. Use the List of all messages in this helper function instead of using the messages channel-wise.	2018-03-06 14:07:09 -08:00
Rhea Parekh	3e4086d1b8	slack importer: Use precomputed value in 'get_total_messages_and_usermessages'. Use the List of all messages in this helper function instead of using the messages channel-wise.	2018-03-06 14:07:09 -08:00
Rhea Parekh	10c73ae577	slack importer: Add function to precompute all the messages. The messages were first being read and passed to the helper functions channel-wise. This function makes a list of all the messages in all the channels beforehand, which is then passed to the helper functions.	2018-03-06 14:07:09 -08:00
Rhea Parekh	6a07897d3a	slack importer: Fetch and save the avatars. This gets the avatar of size 512 px and saves it in the user's avatar directory with both the extensions '.png' and '.original'.	2018-03-01 16:49:37 -08:00
Rhea Parekh	7076a49e04	slack importer: Pass avatar data through the import script.	2018-03-01 16:41:49 -08:00
Rhea Parekh	30b9d35d5e	slack importer: Add helper functions to get user avatars. Here, we create the slack avatar url using the user data and build the avatar object. Added Tests for the same.	2018-03-01 16:38:55 -08:00
Rhea Parekh	95df8452be	slack importer: Remove function 'get_user_avatar_source'. slack avatar urls have the format: 'https://ca.slack-edge.com/<team_id>-<user_id>-<avatar_hash>-<size>' For any url of this form, if the user hasn't uploaded an image, Slack uses default gravatar, but we don't have a way of knowing if Slack has used the uploaded image or the custom gravatar eg: https://ca.slack-edge.com/T5YFFM2QY-U6006P1CN-gd41c3c33cbe-512. Hence, avatar_source should be mapped to 'U'.	2018-03-01 16:38:55 -08:00
Rhea Parekh	3bb14a867b	slack importer: Change 'invite_only' mapping in streams. 'invite_only' should always be true for the slack's standard export plan as the private channels are not supported in it.	2018-02-25 09:22:01 -08:00
Rhea Parekh	a2f6f4ba1c	slack importer: Handle case where messages have no users.	2018-02-25 09:20:55 -08:00
Rhea Parekh	15a6f62fe7	slack importer: Refactor defaultstream handling. The check for the channel ('general' and 'random') must be added before the 'build_defaultstream' function is called and then the id is incremented. Otherwise, the id appended at the end of the second defaultstream object, which would be greater than the total number of defaultstream objects, would crash at 'defaultstream_id_list[defaultstream_id]', which is a parameter of 'build_defaultstream'. Added tests to prevent the same.	2018-02-25 09:20:55 -08:00
Rhea Parekh	aff9099c3b	slack importer: Get domain name from settings.EXTERNAL_HOST.	2018-02-21 08:58:27 -08:00
Rhea Parekh	b702bbe5a1	slack importer: Allocate ids in a single db query. We use the command 'select nextval('sequence') from generate_series(1, increment_number)' which returns a list of allocated values for the ids. This list is used to assign ids to the to be converted objects.	2018-02-19 08:55:50 -08:00
Rhea Parekh	5dfacfcfca	slack importer: Change 'allocate_ids' to return a list of ids. Update the callers of this function to process the list and add tests for the same.	2018-02-18 20:47:45 -08:00
Rhea Parekh	6addf79edb	slack importer: Test import in existing database with fixtures. Check in sample slack dataset fixtures, test data conversion and import of this converted data into an existing database.	2018-02-09 12:17:10 -08:00
Rhea Parekh	be05bccb5b	slack importer: optimize allocation of id range before import.	2018-02-09 12:17:10 -08:00
Rhea Parekh	c0e30079f6	slack importer: Get user data from a get request to slack users api. The fresh imported data shows that the users emails are not included in the data. However, the data received from the older method of slack (which is using legacy tokens) contains the email data of the users.	2018-02-09 12:17:10 -08:00
Rhea Parekh	48640fd28f	slack importer: Suppress logger output from the unit tests.	2018-02-08 16:21:35 -08:00
Rhea Parekh	83a7fd84ab	slack importer: Import primary owner user first. According to https://get.slack.help/hc/en-us/articles/201912948-Owners-and-Administrators, only one Primary owner of a Slack organisation exists. This allocates the first id to the Primary owner and hence makes sure that the primary owner is imported first. Added tests for the same.	2018-02-06 14:48:30 -08:00
Rhea Parekh	052e3e1540	slack importer: Change organization admin mappings. Map 'Primary owner', 'owner' and 'admin' to 'organization admin'. Added tests for the same.	2018-02-06 14:48:30 -08:00
Rhea Parekh	b3b6023230	slack importer: Always map 'is_staff' to false in user data. "staff" is only for server administrators, which doesn't exist in Slack. Hence, this should always be false.	2018-02-06 14:48:29 -08:00
Rhea Parekh	eb7a9675a4	slack importer: Add unit tests.	2018-02-05 14:46:39 -08:00
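
The escaping difference described in commit d8ba378050 can be illustrated with a short sketch. The URL below is a made-up placeholder, not a value from the test fixtures, and a raw string stands in for text pasted directly from a JSON file into Python source.

```python
import json

# In JSON, "\/" is a valid escape for a plain forward slash.
decoded = json.loads(r'"https:\/\/example.com\/avatar.png"')

# The same characters pasted into Python source keep their backslashes.
# A raw string is used here to show that result without triggering the
# invalid-escape warning (pycodestyle W605) mentioned in commit 037f696d26.
pasted = r"https:\/\/example.com\/avatar.png"

print(decoded)
print(pasted)

# Removing the stray backslashes makes the pasted text match the decoded JSON.
cleaned = pasted.replace("\\/", "/")
```

This is why test data copied from a JSON export needs its `\/` sequences rewritten to `/` before it is used as a Python string literal.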


127 Commits
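
The two-layer sorting approach described in commit e59ff6e6db can be sketched roughly as follows. This is not the code from that commit: the function name `iter_messages_by_timestamp` is hypothetical, and the sketch assumes an export layout of `<channel>/<YYYY-MM-DD>.json` files, each holding a JSON list of messages with a numeric-string `ts` field.

```python
import json
import os
from collections import defaultdict
from typing import Any, Dict, Iterator, List


def iter_messages_by_timestamp(export_dir: str) -> Iterator[Dict[str, Any]]:
    """Yield messages in timestamp order, holding one day in memory at a time."""
    # Layer 1: group the per-channel files by day. File names are assumed
    # to be ISO dates, so sorting them as strings is chronological.
    files_by_day: Dict[str, List[str]] = defaultdict(list)
    for channel in sorted(os.listdir(export_dir)):
        channel_dir = os.path.join(export_dir, channel)
        if not os.path.isdir(channel_dir):
            continue
        for name in os.listdir(channel_dir):
            if name.endswith(".json"):
                files_by_day[name].append(os.path.join(channel_dir, name))

    for day in sorted(files_by_day):
        # Layer 2: load every channel's file for this day, then sort
        # that day's messages by timestamp before yielding them.
        day_messages: List[Dict[str, Any]] = []
        for path in files_by_day[day]:
            with open(path) as f:
                day_messages.extend(json.load(f))
        day_messages.sort(key=lambda message: float(message["ts"]))
        yield from day_messages
```

Because the function is a generator, a caller can write messages out in batches while iterating, and only one day's worth of messages is resident at any time.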