We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
This is mostly an extraction, but it does change the
way we calculate `content`. We append the markdown
links from ALL files to any content that came in the
message itself.
Separating this out also allows us to add more
test coverage for the extracted code.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.
This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.
I am guessing this will improve performance for most
conversions. We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions. Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.
This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.
* If `zerver_realmauditlog` is present in the exported data,
`RealmAuditLog` would be imported normally.
* If it is not present, `create_subscription_events`
function in would create the `subscription_created`
events for RealmAuditLog. The reason this function
is in `import_realm` module and not in the individual
export tool scripts (like Slack) is because this
function would be common for all export tools.
This fixes#9846 for users who have not already done an import of
their organization from Slack.
Fixes#9846.
https://github.com/houstondatavis/slack-export/blob/master/users.json
JSON or JavaScript decodes "\/" to / (and some encoders always write
"\/" to avoid accidentally creating a </script> tag), while Python
assumes "\/" is a typo for "\\/" and decodes it to \/.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Messages can be bulky, and storing them in a single
data structure can cause a memory error.
In this commit, the messages are written to a file
batch-wise, thus avoiding the memory error.
Previously, the messages where being stored in a output file from
outside the function 'convert_slack_workspace_messages', but
now we store it from the inside the mentioned function.
This will help in processing and saving the messages batch-wise
so as to avoid a memory error.
Reactions are returned separately from 'convert_slack_workspace_messages'
rather than 'message_json'.
Also updated test for 'convert_slack_workspace_messages' and an additional
test for reactions is added.
This was stored as a fixture file under zerver/fixtures, which caused
problems, since we don't show that directory under production (as its
part of the test system).
The simplest emergency fix here would be to just move the file, but
when looking at it, it's clear that we don't need or want a fixture
file here; we want a Python object, so we just do that.
A valuable follow-up improvement to this block would be to create an
actual new Realm object (not saved to the database), and dump it the
same code we use in the export tool; that should handle the vast
majority of these correctly.
Fixes#9123.
Change 'get_user_data' function to a more general function
to get data from the slack api using legacy tokens.
Also, change the error handling as upon invalid token,
the response is 200, but the response has an error
field in it.
For eg. Go to the following link with invalid token:
https://slack.com/api/emoji.list?token=xoxp-249056023425
Remove allocation ID function from slack import script. All the IDs
count will start from 0. Hence the ID List returned
by the allocation function is of no use, and we remove its implementation.
(example: get_total_messages_and_attachments function is of no use anymore,
hence we remove it)
In importing avatars, we use the implementation where the 'avatar_path'
is seperately calculated using realm and user ID and then the content
of the path provided in the avatar's 'records.json' are copied to this
'avatar_path'.
Similary, here for the uploads, 's3_file_name' is seperately calculated
using the realm ID and uploaded file name and then the content of the
path provided in upload's 'records.json' are copied to this 's3_file_name'.
The domain name is being set in the helper function
'slack_workspace_to_realm', but it should be set in the main function
'do_convert_data', as we need it in other child functions of
'do_convert_data'.
if the test fails, the 'output_dir' would not be deleted and
hence it would give an error when we run the tests next time,
as 'do_convert_data' expects an empty 'output_dir'.
Also the unzipped data file should be removed if the test fails
at 'do_convert_data'.