This updates our error handling of invalid Slack API tokens (and other
networking error handling) to mostly make sense:
* A token that doesn't start with `xoxp-` gives an extended error early.
* An AssertionError for the codebase is correctly declared as such.
* We check for token shape errors before querying the Slack API.
We could still do useful work to raise custom exception classes here.
Thanks to @stavrospat for raising this issue.
This commits reduces the number of values returned by
channel_to_zerver_stream function by setting the values
directly in realm dict and returning it instead.
This renames references to user avatars, bot avatars, or organization
icons to profile pictures. The string in the UI are updated,
in addition to the help files, comments, and documentation. Actual
variable/function names, changelog entries, routes, and s3 buckets are
left as-is in order to avoid introducing bugs.
Fixes#11824.
In very old Slack workspaces, slackbot can appear as "Slackbot", and
the import script only checks for "slackbot" (case sensitive). This
breaks the import--it throws the assert that immediately follows the
test. I don't know how common this is, but it definitely affected our
import.
The simple fix is to compare against a lowercased-version of the
user's full name.
The Slack import process would incorrectly issue
CustomProfileFieldValue entries with a value of "" for users who
didn't have a given CustomProfileField (especially common for the
"skype" and "phone" fields). This had no user-visible effect, but
certainly added some clutter in the database.
This works by yielding messages sorted based on timestamp. Because
the Slack exports are broken into files by date, it's convenient to do
a 2-layer sorting process, where we open all the files for a given
day, and then sort their messages by timestamp before yielding them.
Fixes#10930.
We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
This is mostly an extraction, but it does change the
way we calculate `content`. We append the markdown
links from ALL files to any content that came in the
message itself.
Separating this out also allows us to add more
test coverage for the extracted code.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.
This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.
I am guessing this will improve performance for most
conversions. We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions. Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.
This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.