Users will only be able to login via GitHub, because imported users
get GitHub's generated noreply email addresses - so this should be the
only auth method enabled at first, to avoid confusion.
Only ["id"] is accessed on the dicts (representing the external tool
users). Given that for some tools the id may be under a different name
etc. due to different user dicts format, it's best to just pass those
ids to the function so that it can stay generalized and not reliant
on a specific user dict format.
get_timestamp_from_message was extracted in the previous commit. We can
deduplicate and the code a bit cleaner by using it where appropriate
instead of message["ts"].
message["ts"] is slack-specific. For this to be a general util function
it needs to take a callable that will grab a timestamp from the message
dict (which has varying formats depending on what we're importing from).
It is apparently possible to have a mention of a user who is not (or
no longer?) in the `users.bson` table.
Skip such mention for the purposes of Zulip import; there's nothing
better for us to do.
This is likely an error somewhere in rocketchat's MongoDB "eventual
consistency," but there is no problem with skipping the chunks at this
step.
In the one case where this was observed so far, the upload-id was not
referenced in any message -- if it is referenced and has chunks, but
has no metadata, we will fail later, at that reference.
We construct model instances in the import tool solely for the purpose
of serializing them with the `model_to_dict` helper that returns a
dictionary. Passing `float` to these models' DateTimeField is not
accepted by the type checker. Modifying the dictionary instead avoids
this typing issue.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Now that we can assume Python 3.6+, we can use the
email.headerregistry module to replace hacky manual email address
parsing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Add none-checks, rename variables (to avoid redefinition of
the same variable with different types error), add necessary
type annotations.
This is a part of #18777.
Signed-off-by: Zixuan James Li <359101898@qq.com>
history_public_to_subscribers wasn't explicitly set when creating
streams via build_stream, thus relying on the model's default of False.
This lead to public streams being created with that value set to False,
which doesn't make sense.
We can solve this by inferring the correct value based on invite_only in
the build_stream funtion itself - rather than needing to add a flag
argument to it.
This commit also includes a migration to fix public stream with the
wrong history_public_to_subscribers value.
Fixes#21784.
4815f6e28b tried to de-duplicate bot
email addresses, but instead caused duplicates to crash:
```
Traceback (most recent call last):
File "./manage.py", line 157, in <module>
execute_from_command_line(sys.argv)
File "./manage.py", line 122, in execute_from_command_line
utility.execute()
File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/management/commands/convert_slack_data.py", line 59, in handle
do_convert_data(path, output_dir, token, threads=num_threads)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 1320, in do_convert_data
) = slack_workspace_to_realm(
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 141, in slack_workspace_to_realm
) = users_to_zerver_userprofile(slack_data_dir, user_list, realm_id, int(NOW), domain_name)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 248, in users_to_zerver_userprofile
email = get_user_email(user, domain_name)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 406, in get_user_email
return SlackBotEmail.get_email(user["profile"], domain_name)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 85, in get_email
email_prefix += cls.duplicate_email_count[email]
TypeError: can only concatenate str (not "int") to str
```
Fix the stringification, make it case-insensitive, append with a dash
for readability, and add tests for all of the above.
This resolves the issues reported in #20108, major chunk of which were
due to the incomplete support for importing the livechat streams/messages
in the tool. So, it's best not to import any livechat streams/messages for
now until a complete support for importing the same is developed.
This commit adds functionality to import messages from the
Discussions having direct channels as their parent. As we don't
have topics in the PMs, the messages are imported in interleaved
form in the imported direct channels/PMs.
This was completely unsupported earlier and would have resulted in
an error.
These changes are all independent of each other; I just didn’t feel
like making dozens of commits for them.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
While the STREAM_LINK_REGEX and STREAM_TOPIC_LINK_REGEX
identifies the stream and topic mentions in the content
correctly (tested by printing out the matches), the
stream/topic mentions are still not linked to the
corresponding streams/topics for imported messages, as
a `zulip_message` instance is required for linking these
mentions to actual streams/topics (see `StreamPattern`
class in `markdown/__init__.py`) which is not provided
while processing the markdown for imported messages.
Slack bot emails generated by us can be duplicate for two bots.
If such a case occur, append a counter to the email to make it
unique.
For maintaining the counter of duplicate emails and the final
email assigned to each bot, a class based approach is used with
static variables and static (class) methods. This keeps all the
data related to slack bot emails at the same place and easily
accessible from anywhere inside the module (without defining any
class object and passing it around).
Fixes: #16793
This commit allows to import the following from rocketchat:
* All users
* All public/private channels
* All teams and its public/private channels
* All discussion rooms as topics in their parent channel
* All the messages in all the channels
* All private conversations
* Reactions on messages (except for custom emojis)
* Mentions in messages (except @all, @here mentions)
Sometimes the Slack import zip file we get isn't quite the canonical
form that Slack produces -- often because the user has unzip'd it,
looked at it, and re-zip'd it, resulting in extra nested directories
and the like.
For such cases, support passing in a path to an unpacked Slack export
tree.