Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
If the #random channel in Slack is deactivated, we should follow
Zulip's data model of not allowing deactivated, default streams.
This had apparently happened in zulipchat.com for a few organizations,
resulting in weird exceptions trying to invite new users.
Slack has disabled creation of legacy tokens, which means we have to use other
tokens for importing the data. Thus, we shouldn't throw an error if the token
doesn't match the legacy token format.
Since we do not have any other validation for those tokens yet, we log a warning
but still try to continue with the import assuming that the token has the right
scopes.
See https://api.slack.com/changelog/2020-02-legacy-test-token-creation-to-retire.
**Features:**
Improving `./manage.py convert_gitter_data`
- If messages have been post-processed to add a 'room' field, we
create as many streams as existing rooms.
- Messages with a 'room' field go to the corresponding stream.
- This modification is backward compatible. I.e.
+ messages that have no 'room' field go to the default stream/topic
+ messages that do, go to a specific stream
**Implementation:**
- adding a map `stream_map` to map room names to stream ids
- create as many streams as room field messages + 1 default streamFeatures:
- If messages have been post-processed to add a 'room' field to messages,
we create as many streams as existing rooms.
- Up to renaming of the default stream/topic, this modification is
backwards compatible.
I.e. messages that have no 'room' field go to the default stream/topic
messages that do, go to a specific stream
Implementation:
- adding a map stream_map to map room names to stream ids
- create as many streams as room field messages + 1 default stream
Takes advantage of https://github.com/minrk/archive-gitter/pull/5.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.
Update the backend to use emoji_codes.json too instead of the three
separate JSON files.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This updates our error handling of invalid Slack API tokens (and other
networking error handling) to mostly make sense:
* A token that doesn't start with `xoxp-` gives an extended error early.
* An AssertionError for the codebase is correctly declared as such.
* We check for token shape errors before querying the Slack API.
We could still do useful work to raise custom exception classes here.
Thanks to @stavrospat for raising this issue.
Previously, we skipped setting the list of subscribers to the channel,
which could result in problems if any messages had been posted there
in the past (e.g. because the channel used to have members, but now
doesn't). It could be correct to skip importing dead channels
altogether, but probably simpler is to just set an empty subscriber list.
Previously, our logic to handle Mattermost's "replies" feature didn't
copy the right fields for private messages, where `channel_members` is
included on the message body rather than a `channel` name.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
It's not clear to me how this is intended to work in Mattermost's
system in that they don't document this behavior, but some users have
`null` as their list of teams, and presumably are not meant to be
included in any team at all.
This commits reduces the number of values returned by
channel_to_zerver_stream function by setting the values
directly in realm dict and returning it instead.
This renames references to user avatars, bot avatars, or organization
icons to profile pictures. The string in the UI are updated,
in addition to the help files, comments, and documentation. Actual
variable/function names, changelog entries, routes, and s3 buckets are
left as-is in order to avoid introducing bugs.
Fixes#11824.
We do not anticipate our UI for showing stream descriptions looking
reasonable for multi-line descriptions, so we should just ban creating
them.
Given the frontend changes, multi-line descriptions are only likely to
show up from importing content from other tools, in which case
replacing newlines with spaces is cleaner than the alternative.
This commit does the following three things:
1. Update stream model to accomodate rendered description.
2. Render and save the stream rendered description on update.
3. Render and save stream descriptions on creation.
Further, the stream's rendered description is also sent whenever the
stream's description is being sent.
This is preparatory work for eliminating the use of the
non-authoritative marked.js markdown parser for stream descriptions.
We've had this code oscillated a few times; the original comparison
was added as part of Stride import but broke HipChat import.
c34a8f2e69 fixed HipChat import but
regressed Stride.
This change fixes this for both HipChat + Stride.
In very old Slack workspaces, slackbot can appear as "Slackbot", and
the import script only checks for "slackbot" (case sensitive). This
breaks the import--it throws the assert that immediately follows the
test. I don't know how common this is, but it definitely affected our
import.
The simple fix is to compare against a lowercased-version of the
user's full name.
Now, if you pass an api_key, we'll initialize the public room
subscribers to be whatever they were at the time the import happened.
Also, document the situation on the caveats section.
The slim_mode setting had been incorrectly configured to skip
"deleted" users, resulting in bugs where private messages with deleted
users would not be imported.
Apparently, hc-migrate can generate emoticons.json files with a
somewhat different format. Assuming that other files are in the
normal format, we should be able to handle it like this.
See report in #11135.
Apparently, some methods of exporting from HipChat do not include an
emoticons.json file. We could test for this using the
`include_emoticons` field in `metadata.json`, but we currently don't
even bother to read that file. Rather than changing that, we just
print a warning and proceed. This is arguably better anyway, in that
often not having emoticons.json is the result of user error when
exporting, and it's nice to flag that this is happening.
Fixes#11135.
Our HipChat conversion tool didn't properly handle basic avatar
images, resulting in only the medium-size avatar images being imported
properly. This fixes that bug by asking the import tool to do the
thumbnailing for the basic avatar image (from the .original file) as
well as the medium avatar image.
The Slack import process would incorrectly issue
CustomProfileFieldValue entries with a value of "" for users who
didn't have a given CustomProfileField (especially common for the
"skype" and "phone" fields). This had no user-visible effect, but
certainly added some clutter in the database.
This works by yielding messages sorted based on timestamp. Because
the Slack exports are broken into files by date, it's convenient to do
a 2-layer sorting process, where we open all the files for a given
day, and then sort their messages by timestamp before yielding them.
Fixes#10930.
The previous implementation used run_parallel incorrectly, passing it
a set of very small jobs (each was to download a single file), which
meant that we'd end up forking once for every file to download.
This correct implementation sends each of N threads 1/N of the files
to download, which is more consistent with the goal of distributing
the download work between N threads.
For messages with strange senders, we don't import
messages. Basically, we only import a message if
it has sender with an id that maps to a non-deleted
user.
We now account for streams having users that
may be deleted. We do a couple things:
- use a loop instead of map
- only pass in users to hipchat_subscriber
- early-exit if there are not users
- skip owner/members logic for public streams
Normal hipchat exports use integer ids for their
users and "rooms," which we just borrowed during
conversion.
Atlassian Stride uses stride UUIDs for these instead, but otherwise
has the same export format.
We now introduce IdMapper to handle external ids
that aren't integer. The IdMapper will map UUID
ids to ints and remember them. For ints it just
leaves them alone.
Fixes#10805.
We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
This was a pretty nasty error, where we were accidentally accessing
the parent list in this inner loop function.
This appears to have been introduced as a refactoring bug in
7822ef38c2.
The last_modified field is intended to support setting the
orig-last-modified field in the S3 backend when importing, basically
to keep track of this bit of pre-export data for debugging. In the
event that it isn't available, the correct thing to do is not write
out an invalid `last_modified` field; we should just not write it out
at all.
This is mostly an extraction, but it does change the
way we calculate `content`. We append the markdown
links from ALL files to any content that came in the
message itself.
Separating this out also allows us to add more
test coverage for the extracted code.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.
This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.
I am guessing this will improve performance for most
conversions. We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions. Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.
This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.
We now have all three third party
conversions (Gitter/Slack/Hipchat)
go through build_user_message().
Hipchat was already using this helper.
We also avoid callers having to pass in
an id to build_user_message().