The previous system would crash with some files (because for some
reason the comment count was 1 but there was no "initial comment") and
also the file comment and file name were sorta redundant.
The 'make_new_dir' bool value was used to create a new directory
every time True is passed. Now that avatars and uploads directory
are being created seperately, we don't need this anymore.
The domain name is being set in the helper function
'slack_workspace_to_realm', but it should be set in the main function
'do_convert_data', as we need it in other child functions of
'do_convert_data'.
The messages were first being read and passed to the helper
functions channel wise.
This function makes a list of all the messages in the all the channels
beforehand which would be used to pass in the helper functions.
slack avatar urls have the format:
'https://ca.slack-edge.com/<team_id>-<user_id>-<avatar_hash>-<size>'
For any url of this form, if the user hasn't uploaded an image,
Slack uses default gravatar, but we don't have a way of knowing if Slack
has used the uploaded image or the custom gravatar
eg: https://ca.slack-edge.com/T5YFFM2QY-U6006P1CN-gd41c3c33cbe-512.
Hence, avatar_source should be mapped to 'U'.
The check for the channel ('general' and 'random') must be added before
'build_defaultstream' function is called and then the id is incremented.
Otherwise, the id appended at the end of second defaultstream object, which would be
greater than the total number of defaultstream objects would crash at
'defaultstream_id_list[defaultstream_id]' which is a paramater of 'build_defaultstream'.
Added tests to prevent the same.
The total number of stream objects are allocated to
total_users. They should be allocated to the total_channels.
This passed the tests as the total number of users in the test
where greater than the total number of channels.
We use the command
'select nextval('sequence') from generate_series(1, increment_number)'
which returns a list of allocated values for the ids.
This list is used to assign ids to the to be converted objects.
The fresh imported data shows that the users emails are not included
in the data. However, the data received from the older method of slack
(which is using legacy tokens) contains the email data of the users.
This is to break the bigger functions into smaller
ones and hence, makes unit testing easier.
Also includes renaming the stream's topic name to 'from slack'.
Previously we had a problem of id clashes while importing converted
slack data into an existing zulip instance with realms which are actively
populating the database.
This counts the total objects to be imported and does a db transaction
to increase the SEQUENCE number for that table by that number,
and hence allocates a range of ids for the to be converted slack data
objects.