In very old Slack workspaces, slackbot can appear as "Slackbot", and
the import script only checks for "slackbot" (case sensitive). This
breaks the import--it throws the assert that immediately follows the
test. I don't know how common this is, but it definitely affected our
import.
The simple fix is to compare against a lowercased-version of the
user's full name.
The Slack import process would incorrectly issue
CustomProfileFieldValue entries with a value of "" for users who
didn't have a given CustomProfileField (especially common for the
"skype" and "phone" fields). This had no user-visible effect, but
certainly added some clutter in the database.
This works by yielding messages sorted based on timestamp. Because
the Slack exports are broken into files by date, it's convenient to do
a 2-layer sorting process, where we open all the files for a given
day, and then sort their messages by timestamp before yielding them.
Fixes#10930.
We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
This is mostly an extraction, but it does change the
way we calculate `content`. We append the markdown
links from ALL files to any content that came in the
message itself.
Separating this out also allows us to add more
test coverage for the extracted code.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.
This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.
I am guessing this will improve performance for most
conversions. We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions. Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.
This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.
The missing fields are checked by `full_clean()` method.
The datetime field errors are ignored as they are fixed
in the `import_realm` script. The field that are
allowed to be null are not included while building
this object.
The 'last_modified' value in emoji records is
needed for uploading the file to the S3 backend.
We set the same in the function 'import_uploads_s3'.
We also have to remove the keyword 'last_modified'
while building the RealmEmoji dict, as it is not
a field which exists in RealmEmoji objects.