We use a truncated SHA256 of the id and a server-side secret to make
emoji have non-guessable filenames, while also making collisions
unlikely.
We also adjust the Slack import to use the same SHA-based name,
instead of taking the same name as it had in Slack.
This commit performs a sweep on the first batch of non API
files to rename "huddle" to "direct_message_group`.
It also renames variables and methods of type -
"huddle_message" to "group_direct_message".
This is a part of #28640
Replaced HUDDLE attribute with DIRECT_MESSAGE_GROUP using VS Code search,
part of a general renaming of the object class.
Fixes part of #28640.
Co-authored-by: JohnLu2004 <JohnLu10212004@gmail.com>
It is possible to have multiple users with the same email address --
for instance, when two users are guests in shared channels via two
different other Slack instances.
Combine those Slack user-ids into one Zulip user, by their user-id;
otherwise, we run into problems during import due to duplicate keys.
1e5c49ad82 added support for shared channels -- but some users may
only currently exist in DMs or MPIMs, and not in channel membership.
Walk the list of MPIM subscriptions and messages, as well as DM users,
and add any such users to the set of mirror dummy users.
The previous function was poorly named, asked for a
Realm object when realm_id sufficed, and returned a
tuple of strings that had different semantics.
I also avoid calling it duplicate times in a couple
places, although it was probably rarely the case that
both invocations actually happened if upstream
validations were working.
Note that there is a TypedDict called EmojiInfo, so I
chose EmojiData here. Perhaps a better name would be
TinyEmojiData or something.
I also simplify the reaction tests with a verify
helper.
This is a follow-up to 4c8915c8e4, for
the case when the `team:read` permission is missing, which causes the
`team.info` call itself to fail. The error message supplies
information about the provided and missing permissions -- but it also
still sends the `X-OAuth-Scopes` header which we normall read, so we can
use that as normal.
`./manage.py import` does not take a tarball; it takes a directory.
Making a separate tarball is a waste of CPU time and disk, as it is
never used.
This was included in the commit of the initial Slack conversion code
in 5b37c5562b and propagated from there into every conversion tool.
Remove the unnecessary tarball creation.
The naive solution #23465 creates situations where the same user can have
multiple reactions as the base emojis are not unique, e.g. +1::skin2
and +1::skin4 would both reduce to +1 but the userlists are separate.
This solution handles the reduction, merges the same-base reactions,
and deduplicates the userlist.
Co-authored-by: Alex Vandiver <alexmv@zulip.com>
Co-authored-by: rht <rhtbot@protonmail.com>
Previously, emoji.json was read from
"$ZULIP_PATH/node_modules/emoji-datasource-google/emoji.json".
This path doesn't exist in production when installing from scratch from
a release tarball. And so, we ensure emoji.json exists by copying it to
`static/generated/emoji`.
With tweaks to comments by tabbott.
Fixes: #23469
This commit adds the OPTIONAL .realm attribute to Message
(and ArchivedMessage), with the server changes for making new Messages
have this set. Old Messages still have to be migrated to backfill this,
before it can be non-nullable.
Appropriate test changes to correctly set .realm for Messages the tests
manually create are included here as well.
build_message has a lot of arguments, so it's hard to verify correctness
of callers that just try to get the order right. It's much clearer to be
explicit via kwargs. mattermost.py and rocketchat.py already do this, so
let's bring slack.py and gitter.py up to par.
Because Slack emoji naming is different from Zulip's.
According to https://emojipedia.org/slack/, Slack's emoji shortcodes are
derived from https://github.com/iamcal/emoji-data.
There are probably some deviations from that dataset, but this PR should
at least catch the ones that are identical to iamcal's.
Only ["id"] is accessed on the dicts (representing the external tool
users). Given that for some tools the id may be under a different name
etc. due to different user dicts format, it's best to just pass those
ids to the function so that it can stay generalized and not reliant
on a specific user dict format.
get_timestamp_from_message was extracted in the previous commit. We can
deduplicate and the code a bit cleaner by using it where appropriate
instead of message["ts"].
message["ts"] is slack-specific. For this to be a general util function
it needs to take a callable that will grab a timestamp from the message
dict (which has varying formats depending on what we're importing from).
Now that we can assume Python 3.6+, we can use the
email.headerregistry module to replace hacky manual email address
parsing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
4815f6e28b tried to de-duplicate bot
email addresses, but instead caused duplicates to crash:
```
Traceback (most recent call last):
File "./manage.py", line 157, in <module>
execute_from_command_line(sys.argv)
File "./manage.py", line 122, in execute_from_command_line
utility.execute()
File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/management/commands/convert_slack_data.py", line 59, in handle
do_convert_data(path, output_dir, token, threads=num_threads)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 1320, in do_convert_data
) = slack_workspace_to_realm(
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 141, in slack_workspace_to_realm
) = users_to_zerver_userprofile(slack_data_dir, user_list, realm_id, int(NOW), domain_name)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 248, in users_to_zerver_userprofile
email = get_user_email(user, domain_name)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 406, in get_user_email
return SlackBotEmail.get_email(user["profile"], domain_name)
File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 85, in get_email
email_prefix += cls.duplicate_email_count[email]
TypeError: can only concatenate str (not "int") to str
```
Fix the stringification, make it case-insensitive, append with a dash
for readability, and add tests for all of the above.