Commit Graph

316 Commits

Author SHA1 Message Date
Priyansh Garg 0db9b7287b data_import: Import Rocket.Chat messages from direct discussions.
This commit adds functionality to import messages from the
Discussions having direct channels as their parent. As we don't
have topics in the PMs, the messages are imported in interleaved
form in the imported direct channels/PMs.

This was completely unsupported earlier and would have resulted in
an error.
2021-11-01 17:09:11 -07:00
Priyansh Garg 5f1e246230 data_import: Import wildcard mention data from Rocket.Chat. 2021-11-01 17:06:15 -07:00
Priyansh Garg 26f16b9eec data_import: Separate logic for naming Rocket.Chat streams and topics. 2021-11-01 16:48:25 -07:00
rht 58b19761b8 slack import: Fix requests.get usage of get_slack_api_data.
We also rewrite the tests using the `responses` module to avoid the
problematic mocking that made this bug possible.

Fixes #19833.
2021-10-07 11:46:23 -07:00
rht d8e1409fe5 Slack import: Use Python ZipFile to unzip.
This should handle the case when non-ASCII Unicode folder names are
created on Windows.

Fixes #19899.
2021-10-07 09:24:19 -07:00
Anders Kaseorg 4206e5f00b python: Remove locally dead code.
These changes are all independent of each other; I just didn’t feel
like making dozens of commits for them.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-19 01:51:37 -07:00
Priyansh Garg 54452fef6c data_import: Fix channel mentions in Rocket.Chat import.
While the STREAM_LINK_REGEX and STREAM_TOPIC_LINK_REGEX
identifies the stream and topic mentions in the content
correctly (tested by printing out the matches), the
stream/topic mentions are still not linked to the
corresponding streams/topics for imported messages, as
a `zulip_message` instance is required for linking these
mentions to actual streams/topics (see `StreamPattern`
class in `markdown/__init__.py`) which is not provided
while processing the markdown for imported messages.
2021-08-09 06:38:26 -07:00
Priyansh Garg aed4e48da7 data_import: Import attachments from Rocket.Chat. 2021-08-09 06:38:26 -07:00
Priyansh Garg 65e28907cb data_import: Import custom emoji from Rocket.Chat. 2021-08-09 06:38:26 -07:00
Priyansh Garg 4815f6e28b data_import: Make slack bot emails unique.
Slack bot emails generated by us can be duplicate for two bots.
If such a case occur, append a counter to the email to make it
unique.

For maintaining the counter of duplicate emails and the final
email assigned to each bot, a class based approach is used with
static variables and static (class) methods. This keeps all the
data related to slack bot emails at the same place and easily
accessible from anywhere inside the module (without defining any
class object and passing it around).

Fixes: #16793
2021-08-03 16:18:14 -07:00
Anders Kaseorg 5483ebae37 python: Convert "".format to Python 3.6 f-strings.
Generated automatically by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
Anders Kaseorg ad5f0c05b5 python: Remove default "utf8" argument for encode(), decode().
Partially generated by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
Anders Kaseorg 3665deb93a python: Remove unnecessary intermediate lists.
Generated automatically by pyupgrade.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-02 15:53:52 -07:00
rht 1bbd36d181 slack_import: Remove obsolete SlackImportAttachment placeholder.
This was introduced in f4ad464d82, and
incompletely removed in e037c2f93e649c28a71c02559b5ae7a3333f42a8; here
we finish removing it.
2021-08-02 13:13:28 -07:00
Priyansh Garg 044fe547d3 data_import: Add huddle import support for Rocket.Chat. 2021-07-28 15:45:54 -07:00
Priyansh Garg 24dd0ff96c data_import: Add rocket chat import tool.
This commit allows to import the following from rocketchat:
* All users
* All public/private channels
* All teams and its public/private channels
* All discussion rooms as topics in their parent channel
* All the messages in all the channels
* All private conversations
* Reactions on messages (except for custom emojis)
* Mentions in messages (except @all, @here mentions)
2021-07-28 15:28:56 -07:00
Priyansh Garg a21a280054 data_import: Rename mattermost_user to user_handler.
This logic can be readily reused for new data import tools.
2021-07-15 14:28:36 -07:00
Priyansh Garg 94a2be06f3 markdown: Use a shared variable for IMAGE_EXTENSION. 2021-07-02 11:22:55 -07:00
Priyansh Garg 5b2e21965c data_import: Add import attachments support for Mattermost.
Add support for importing message attachments from Mattermost.

Fixes: #18959
2021-07-02 11:19:45 -07:00
Alex Vandiver ff9126ac1e data_import: Protect better against bad Slack tokens.
An invalid token would be treated the same as a token with no scopes;
differentiate these better.
2021-05-27 22:46:58 -07:00
Alex Vandiver 94e4f33b29 data_import: Support importing from Slack conversions in a directory.
Sometimes the Slack import zip file we get isn't quite the canonical
form that Slack produces -- often because the user has unzip'd it,
looked at it, and re-zip'd it, resulting in extra nested directories
and the like.

For such cases, support passing in a path to an unpacked Slack export
tree.
2021-05-27 22:46:58 -07:00
Alex Vandiver 8228ea2a17 import_data: Do some quick verification of Slack import formats. 2021-05-27 22:46:58 -07:00
Anders Kaseorg 544bbd5398 docs: Fix capitalization mistakes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-05-10 09:57:26 -07:00
Sumanth V Rao 40228972b9 models/realm: Add a model for storing realm playground information.
Tweaked exports.py to add the config object there so that our export
tool can include the table when exporting. Also includes all the
changes required to import the new table from the exported data.

Helper function `get_realm_playgrounds` added to fetch all
playgrounds in a realm.

Tests amended.
2021-04-07 08:20:53 +05:30
Cyril Pletinckx ba7da6d5c0 import/export: Fix deprecated authentication method for Slack.
The query string parameter authentication method is now deprecated for
newly created Slack applications since the 24th of February[1].  This
causes Slack imports to fail, claiming that the token has none of the
required scopes.

Two methods can be used to solve this problem: either include the
authentication token in the header of an HTTP GET request, or include
it in the body of an HTTP POST request. The former is preferred, as
the code was already written to use HTTP GET requests.

Change the way the parameters are passed to the "requests.get" method
calls, to pass the token via the `Authorization` header.

[1] https://api.slack.com/changelog/2020-11-no-more-tokens-in-querystrings-for-newly-created-apps

Fixes: #17408.
2021-03-08 12:56:37 -08:00
Anders Kaseorg a1ba3ca066 import_util: Strengthen get_users type using a Protocol.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-15 17:05:28 -08:00
Anders Kaseorg 6e4c3e41dc python: Normalize quotes with Black.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Aman Agrawal c685d36821 hipchat_import: Remove tool from codebase.
Remove functions and scripts used by HipChat import tool and
those which will no longer be required in future.
2020-12-23 08:28:49 -08:00
Tim Abbott ed498e2f8e import: Import mattermost admins as Zulip owners.
Otherwise, we violate the invariant that all organizations have an owner.
2020-12-17 18:45:45 -08:00
Alex Vandiver 7c849fa940 slack: Check token access scopes before importing.
The Slack API always (even for failed requests) puts the access scopes
of the token passed in, into "X-OAuth-Scopes"[1], which can be used to
determine if any are missing -- and if so, which.

[1] https://api.slack.com/legacy/oauth-scopes#working-with-scopes
2020-12-15 11:33:15 -08:00
Tim Abbott 067cd3a97a docs: Remove incorrect references to chat.zulip.org.
Most of these are Help Center links that should be pointing to the
production Help Center.
2020-10-29 16:46:40 -07:00
Anders Kaseorg 4e9d587535 python: Pass query parameters as a dict when making GET requests.
This provides automatic URL-encoding.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-27 13:47:02 -07:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Anders Kaseorg 9a2aad58d0 import_util: Migrate from run_parallel to multiprocessing.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-09-14 16:22:23 -07:00
Anders Kaseorg b7b7475672 python: Use standard secrets module to generate random tokens.
There are three functional side effects:

• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.

• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.

• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.

(The base32 alphabet is A-Z 2-7.  We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-09 15:52:57 -07:00
Anders Kaseorg f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg a276eefcfe python: Rewrite dict() as {}.
Suggested by the flake8-comprehensions plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg 61d0417e75 python: Replace ujson with orjson.
Fixes #6507.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:55:12 -07:00
Anders Kaseorg 768f9f93cd docs: Capitalize Markdown consistently.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:23:06 -07:00
Anders Kaseorg 60a25b2721 docs: Fix spelling errors caught by codespell.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:23:06 -07:00
Alex Vandiver 2928bbc8bd logging: Report stack_info on logging.exception calls.
The exception trace only goes from where the exception was thrown up
to where the `logging.exception` call is; any context as to where
_that_ was called from is lost, unless `stack_info` is passed as well.
Having the stack is particularly useful for Sentry exceptions, which
gain the full stack trace.

Add `stack_info=True` on all `logging.exception` calls with a
non-trivial stack; we omit `wsgi.py`.  Adjusts tests to match.
2020-08-11 10:16:54 -07:00
Steve Howell c44500175d database: Remove short_name from UserProfile.
A few major themes here:

    - We remove short_name from UserProfile
      and add the appropriate migration.

    - We remove short_name from various
      cache-related lists of fields.

    - We allow import tools to continue to
      write short_name to their export files,
      and then we simply ignore the field
      at import time.

    - We change functions like do_create_user,
      create_user_profile, etc.

    - We keep short_name in the /json/bots
      API.  (It actually gets turned into
      an email.)

    - We don't modify our LDAP code much
      here.
2020-07-17 11:15:15 -07:00
Steve Howell 0b65abcdf5 pointer: Remove pointer from UserProfile.
Most of the changes here are just that we no
longer need to provide a value for pointer
when we create UserProfile objects.
2020-07-03 13:08:40 +00:00
Anders Kaseorg 3ffed617a2 mypy: Type simple generators as Iterator, not Iterable.
A generator that yields values without receiving or returning them is
an Iterator.  Although every Iterator happens to be iterable, Iterable
is a confusing annotation for generators because a generator is only
iterable once.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-23 11:29:54 -07:00
Anders Kaseorg 74c17bf94a python: Convert more percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Now including %d, %i, %u, and multi-line strings.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg 1ed2d9b4a0 logging: Use logging.exception and exc_info for unexpected exceptions.
logging.exception() and logging.debug(exc_info=True),
etc. automatically include a traceback.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg 0d6c771baf python: Guard against default value mutation with read-only types.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:31:27 -07:00
Anders Kaseorg 91a86c24f5 python: Replace None defaults with empty collections where appropriate.
Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦
AbstractSet) to guard against accidental mutation of the default
value.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:31:27 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Anders Kaseorg 6480deaf27 python: Convert more "".format to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus --keep-percent-format, with more
restrictions patched out.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 14:48:09 -07:00
Tim Abbott 71078adc50 docs: Update URLs to use https://zulip.com.
We're migrating to using the cleaner zulip.com domain, which involves
changing all of our links from ReadTheDocs and other places to point
to the cleaner URL.
2020-06-08 18:10:45 -07:00
sahil839 2f7d684a84 slack_import: Map slack owners to zulip realm owners.
Slack owners and primary owners will be mapped to zulip
realm owners on import.

Previously, we mapped the owner and primary owner roles of slack
to realm admins in zulip. As we have added ROLE_REALM_OWNER in
8bbc074, we now map slack owners and primary owners to owners in
zulip.

Tests are modified for checking all the 3 cases-
 - Slack workspace primary owner
 - Slack workspace owner
 - Slack workspace admin

This commit also has docs changes in 'import-from-slack.md'.
2020-06-08 16:22:54 -07:00
Anders Kaseorg 8dd83228e7 python: Convert "".format to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus --keep-percent-format, but with the
NamedTuple changes reverted (see commit
ba7906a3c6, #15132).

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-08 15:31:20 -07:00
Tim Abbott 496c08e26c slack import: Fix DefaultStream import of deactivated #random.
If the #random channel in Slack is deactivated, we should follow
Zulip's data model of not allowing deactivated, default streams.

This had apparently happened in zulipchat.com for a few organizations,
resulting in weird exceptions trying to invite new users.
2020-05-12 17:18:57 -07:00
Rohitt Vashishtha 9506be0f4f slack-import: Downgrade Slack legacy-token check failure to warning.
Slack has disabled creation of legacy tokens, which means we have to use other
tokens for importing the data. Thus, we shouldn't throw an error if the token
doesn't match the legacy token format.

Since we do not have any other validation for those tokens yet, we log a warning
but still try to continue with the import assuming that the token has the right
scopes.

See https://api.slack.com/changelog/2020-02-legacy-test-token-creation-to-retire.
2020-05-11 13:41:50 -07:00
Cyril Cohen 0d6f80059b
gitter import: Subscribe every user to every stream. 2020-05-05 21:31:35 -07:00
Cyril Cohen 5598f8f6b0 gitter: Support importing data from multiple Gitter rooms.
**Features:**
Improving `./manage.py convert_gitter_data`
- If messages have been post-processed to add a 'room' field, we
  create as many streams as existing rooms.
- Messages with a 'room' field go to the corresponding stream.
- This modification is backward compatible. I.e.
  + messages that have no 'room' field go to the default stream/topic
  + messages that do, go to a specific stream

**Implementation:**
- adding a map `stream_map` to map room names to stream ids
- create as many streams as room field messages + 1 default streamFeatures:
- If messages have been post-processed to add a 'room' field to messages,
  we create as many streams as existing rooms.
- Up to renaming of the default stream/topic, this modification is
  backwards compatible.
  I.e. messages that have no 'room' field go to the default stream/topic
       messages that do, go to a specific stream

Implementation:
- adding a map stream_map to map room names to stream ids
- create as many streams as room field messages + 1 default stream

Takes advantage of https://github.com/minrk/archive-gitter/pull/5.
2020-05-02 10:30:18 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Anders Kaseorg 1cf63eb5bf python: Whitespace fixes from autopep8.
Generated by autopep8, with the setup.cfg configuration from #14532.
I’m not sure why pycodestyle didn’t already flag these.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-21 17:58:09 -07:00
Siddharth Varshney e03176b272 help: Add doc for setting profile picture back to gravatar. 2020-04-16 20:27:52 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Stefan Weil d2fa058cc1
text: Fix some typos (most of them found and fixed by codespell).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-27 17:25:56 -07:00
Anders Kaseorg e257253e64 emoji_codes: Replace JS module with JSON module.
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.

Update the backend to use emoji_codes.json too instead of the three
separate JSON files.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-12 10:09:12 -08:00
Vishnu KS df5345705c import: Support importing team icon from slack. 2020-02-03 14:09:05 -08:00
Tim Abbott 122e11c678 slack import: Fix handling of messages sent by user U00. 2020-01-25 22:47:49 -08:00
Tim Abbott e052ec58db slack import: Improve error messages around invalid tokens.
This updates our error handling of invalid Slack API tokens (and other
networking error handling) to mostly make sense:
* A token that doesn't start with `xoxp-` gives an extended error early.
* An AssertionError for the codebase is correctly declared as such.
* We check for token shape errors before querying the Slack API.

We could still do useful work to raise custom exception classes here.

Thanks to @stavrospat for raising this issue.
2020-01-22 14:48:32 -08:00
Tlazypanda 6945ced76f slack import: Map Slack guest users to Zulip guests.
Slack's Single-User Guest and Multi-User Guest users should be
imported as Zulip guests during data import.

Fixes #13255.
2019-11-12 12:12:59 -08:00
Tim Abbott aad99ce951 mattermost import: Fix handling of channels with no subscribers.
Previously, we skipped setting the list of subscribers to the channel,
which could result in problems if any messages had been posted there
in the past (e.g. because the channel used to have members, but now
doesn't).  It could be correct to skip importing dead channels
altogether, but probably simpler is to just set an empty subscriber list.
2019-11-04 18:10:37 -08:00
Tim Abbott dc682da47a mattermost: Handle replies to private messages.
Previously, our logic to handle Mattermost's "replies" feature didn't
copy the right fields for private messages, where `channel_members` is
included on the message body rather than a `channel` name.
2019-11-04 18:10:37 -08:00
Vishnu KS 1585ad7bf4 mattermost: Add support for exporting DMs and huddles. 2019-10-10 16:37:03 -07:00
Rishi Gupta e10361a832 models: Replace is_guest and is_realm_admin with UserProfile.role.
This new data model will be more extensible for future work on
features like a primary administrator.
2019-10-06 16:24:37 -07:00
Mateusz Mandera dbe508bb91 models: Migration of Message.pub_date to date_sent, part 2.
Fixes #1727.

With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
2019-10-05 19:01:34 -07:00
Vishnu KS a21856c569 mattermost: Rename user_id to sender_user_id in process_raw_message_batch. 2019-09-25 20:06:47 +05:30
Vishnu KS 23d70bb685 mattermost: Rename get_recipient_id to get_recipient_id_from_receiver_name. 2019-09-25 20:06:04 +05:30
Vishnu Ks c4af0b7bc4 mattermost: Support importing messages without team name.
Mattermost doesn't place private messages within a particular team,
which is what this is needed for.
2019-09-18 11:57:37 -07:00
Vishnu Ks bf5f531e90 import_util: Support huddles in SubscriberHandler. 2019-09-18 11:53:13 -07:00
Vishnu KS d434c0ee88 slack: Remove unnecessary comments.
Remove comments that tries to explain code that is already readable.
Also remove some todo comments that has been already taken care of.
2019-08-26 14:10:19 -07:00
Vishnu KS 99d34fd11d slack: Rename default_channels to slack_default_channels. 2019-08-26 14:10:19 -07:00
Vishnu KS b919514f7f slack: Rename customprofilefield_id to custom_profile_field_id. 2019-08-26 14:10:19 -07:00
Vishnu KS c31355f9c1 slack: Rename custom_field_id_count to custom_profile_field_value_id_count. 2019-08-26 14:10:19 -07:00
Vishnu KS 138c659c97 slack: Rename slack_custom_field_name_to_zulip_custom_field_id.
Rename custom_field_map to
slack_custom_field_name_to_zulip_custom_field_id.
2019-08-26 14:10:19 -07:00
Vishnu KS 9560736d86 slack: Rename slack_user_id_to_custom_profile_fields.
Renames slack_user_custom_field_map to
slack_user_id_to_custom_profile_fields for readability.
2019-08-26 14:10:19 -07:00
Vishnu KS 01a51c8f4e slack: Rename added_recipient to slack_recipient_name_to_zulip_recipient_id. 2019-08-26 14:10:19 -07:00
Vishnu KS 9d51a1b527 slack: Rename added_users to slack_user_id_to_zulip_user_id. 2019-08-26 14:10:19 -07:00
Vishnu KS 3650f19692 slack: Lookup dir_name key in dict instead of in dict_keys.
No reason to do the lookup in O(n) when we can do
it in average O(1) time complexity.
2019-08-26 14:10:19 -07:00
Vishnu Ks 1e5c49ad82 slack: Support importing shared channels. 2019-08-26 14:10:19 -07:00
Vishnu Ks e09a29f4d3 slack: Refactor get_slack_api_data to accept multiple query params. 2019-08-26 14:10:19 -07:00
Tim Abbott 69505c30a4 mattermost: Handle users who aren't on any team correctly.
It's not clear to me how this is intended to work in Mattermost's
system in that they don't document this behavior, but some users have
`null` as their list of teams, and presumably are not meant to be
included in any team at all.
2019-08-19 16:06:39 -07:00
Anders Kaseorg e417d3a040 slack_message_conversion: Clean up type ignores.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-08-09 16:39:16 -07:00
Tim Abbott 9827801569 slack import: Improve readability of user recipient object allocation.
This loop management tweak makes it a bit more obvious what's
happening in this block of code.
2019-07-30 14:46:14 -07:00
Vishnu KS ff3871fc63 slack_import: Clean up return values of channels_to_zerver_stream.
This commits reduces the number of values returned by
channel_to_zerver_stream function by setting the values
directly in realm dict and returning it instead.
2019-07-30 14:46:14 -07:00
Vishnu Ks 6110f495df slack_import: Support importing pms. 2019-07-30 14:46:14 -07:00
Vishnu Ks 5e6d86c8c4 slack_import: Support importing multiparty IMs. 2019-07-09 15:03:28 -07:00
Vishnu Ks 443439d388 slack_import: Support importing private slack channels. 2019-06-28 11:03:32 -07:00
Vishnu Ks 196388cee3 slack_import: Extract processing channels into a seperate function. 2019-06-28 11:00:59 -07:00
Vishnu Ks 55bf44152a import: Handle hidden_by_limit case for files in slack import.
Fixes #12011
2019-05-30 12:01:09 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Vishnu Ks 02c92e55a2 import: Add tool for importing teams from mattermost. 2019-04-05 17:53:03 -07:00
Vishnu Ks f517f72dd2 import: Make use of is_mirror_dummy in build_user_profile. 2019-04-04 13:51:52 -07:00
Vishnu Ks bd4c3b3ebb import: Move make_user_messages to import_util.py. 2019-04-04 13:51:52 -07:00
Vishnu Ks d921fd25e4 import: Move SubscriberHandler to import_util. 2019-03-20 11:29:51 -07:00
Vishnu Ks 6e3720e0b7 gitter: Fix minor comment typo in build_userprofile. 2019-03-20 10:12:18 -07:00
Ben Muschol d526ff00f2 settings: Rename "user avatar" to "profile picture"
This renames references to user avatars, bot avatars, or organization
icons to profile pictures. The string in the UI are updated,
in addition to the help files, comments, and documentation. Actual
variable/function names, changelog entries, routes, and s3 buckets are
left as-is in order to avoid introducing bugs.

Fixes #11824.
2019-03-15 13:29:56 -07:00
Tim Abbott 412d35900f slack import: Fix handling of tombstone files.
Apparently, the mode attribute is not always present.
2019-03-13 14:39:20 -07:00
Tim Abbott 49680a4503 slack import: Skip processing tombstone files.
The tombstone files undocumented feature of Slack's export format
appears sometimes and has no real data, so we just need to skip these.

Fixes #11619.
2019-03-13 12:43:11 -07:00
Tim Abbott cbc62b8e07 streams: Prevent creation of multi-line stream descriptions.
We do not anticipate our UI for showing stream descriptions looking
reasonable for multi-line descriptions, so we should just ban creating
them.

Given the frontend changes, multi-line descriptions are only likely to
show up from importing content from other tools, in which case
replacing newlines with spaces is cleaner than the alternative.
2019-02-20 12:28:00 -08:00
Rishi Gupta e183c316dd help: Rename help/change-your-avatar to help/set-your-avatar. 2019-02-13 17:50:39 -08:00
Anders Kaseorg 56a675d5ec export: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:25:27 -08:00
Hemanth V. Alluri 73d26c8b28 streams: Render and store the stream description from the backend.
This commit does the following three things:
    1. Update stream model to accomodate rendered description.
    2. Render and save the stream rendered description on update.
    3. Render and save stream descriptions on creation.

Further, the stream's rendered description is also sent whenever the
stream's description is being sent.

This is preparatory work for eliminating the use of the
non-authoritative marked.js markdown parser for stream descriptions.
2019-02-01 22:24:18 -08:00
Tim Abbott 9b25f8789f hipchat: Fix handling of user IDs in Stride import.
We've had this code oscillated a few times; the original comparison
was added as part of Stride import but broke HipChat import.
c34a8f2e69 fixed HipChat import but
regressed Stride.

This change fixes this for both HipChat + Stride.
2019-01-31 12:40:05 -08:00
Pragati Agrawal e1772b3b8f tools: Upgrade Pycodestyle and fix new linter errors.
Here, we are upgrading pycodestyle version from 2.4.0 to 2.5.0.

Fixes: #11396.
2019-01-31 12:21:41 -08:00
Matthew Wegner 370cf1a2cb import: Normalize Slackbot String Comparison.
In very old Slack workspaces, slackbot can appear as "Slackbot", and
the import script only checks for "slackbot" (case sensitive).  This
breaks the import--it throws the assert that immediately follows the
test.  I don't know how common this is, but it definitely affected our
import.

The simple fix is to compare against a lowercased-version of the
user's full name.
2019-01-28 14:59:41 -08:00
Tim Abbott 4c603990d2 hipchat: Use HTML2Text for the content.
While the result is by no means perfect, it's significantly cleaner
than what we had before this.
2019-01-09 16:59:45 -08:00
Tim Abbott 53436766c1 hipchat: Improve import of public room subscribers.
Now, if you pass an api_key, we'll initialize the public room
subscribers to be whatever they were at the time the import happened.

Also, document the situation on the caveats section.
2019-01-09 16:50:00 -08:00
Tim Abbott 035138dd98 hipchat: Refactor code for building subscriptions.
This moves the filtering of invite-only into the caller, and also
adjusts the indentation.
2019-01-09 16:50:00 -08:00
Tim Abbott c34a8f2e69 hipchat: Fix importing of private messages.
Apparently a stupid typing issue meant that we broke this a few weeks
ago.
2019-01-09 16:50:00 -08:00
Tim Abbott b7fb919dfa hipchat: Don't enable slim_mode by default.
For small organizations, generally one prefers the non-slim_mode
default behavior.
2019-01-04 11:30:25 -08:00
Tim Abbott 492230f405 hipchat import: Always include deleted users in import.
The slim_mode setting had been incorrectly configured to skip
"deleted" users, resulting in bugs where private messages with deleted
users would not be imported.
2019-01-04 11:23:02 -08:00
Tim Abbott 41b7d9a4c8 hipchat: Handle unusual emoticons.json format.
Apparently, hc-migrate can generate emoticons.json files with a
somewhat different format.  Assuming that other files are in the
normal format, we should be able to handle it like this.

See report in #11135.
2018-12-29 18:59:31 -08:00
Tim Abbott 44f117ac72 hipchat: Handle case where emoticons.json is not in export.
Apparently, some methods of exporting from HipChat do not include an
emoticons.json file.  We could test for this using the
`include_emoticons` field in `metadata.json`, but we currently don't
even bother to read that file.  Rather than changing that, we just
print a warning and proceed.  This is arguably better anyway, in that
often not having emoticons.json is the result of user error when
exporting, and it's nice to flag that this is happening.

Fixes #11135.
2018-12-29 18:54:50 -08:00
Tim Abbott c995e8e2ae import: Ensure presence of basic avatar images for HipChat.
Our HipChat conversion tool didn't properly handle basic avatar
images, resulting in only the medium-size avatar images being imported
properly.  This fixes that bug by asking the import tool to do the
thumbnailing for the basic avatar image (from the .original file) as
well as the medium avatar image.
2018-12-27 17:47:09 -08:00
Tim Abbott 8a90441d2f slack import: Import long-inactive users as long-term idle.
This avoids creating UserMessage rows for long-inactive users in
organizations with many thousands of users.
2018-12-16 18:52:20 -08:00
Tim Abbott a6ca95dfc4 slack import: Fix all messages being imported to one channel.
This was an ugly variable-escape-from-loop regression introduced in
e59ff6e6db.
2018-12-12 17:54:37 -08:00
Tim Abbott d6217eb862 slack import: Fix empty values for custom profile fields.
The Slack import process would incorrectly issue
CustomProfileFieldValue entries with a value of "" for users who
didn't have a given CustomProfileField (especially common for the
"skype" and "phone" fields).  This had no user-visible effect, but
certainly added some clutter in the database.
2018-12-12 12:58:27 -08:00
Tim Abbott e9900b2bdf gitter: Do something reasonable with invalid fullnames. 2018-12-12 10:07:52 -08:00
rht e59ff6e6db slack import: Eliminate need to load all messages into memory.
This works by yielding messages sorted based on timestamp.  Because
the Slack exports are broken into files by date, it's convenient to do
a 2-layer sorting process, where we open all the files for a given
day, and then sort their messages by timestamp before yielding them.

Fixes #10930.
2018-12-05 12:20:50 -08:00
Tim Abbott 48a3975ec0 import: Avoid unnecessary forks when downloading attachments.
The previous implementation used run_parallel incorrectly, passing it
a set of very small jobs (each was to download a single file), which
meant that we'd end up forking once for every file to download.

This correct implementation sends each of N threads 1/N of the files
to download, which is more consistent with the goal of distributing
the download work between N threads.
2018-12-02 13:50:27 -08:00
Tim Abbott 00826486bd hipchat: Fix typo in logging output. 2018-11-26 16:44:31 -08:00
Steve Howell 38f81d5d20 hipchat: Skip public stream subs in slim mode. 2018-11-26 16:37:30 -08:00
Steve Howell c2e9f5eb0a hipchat: Limit messages in slim mode.
For messages with strange senders, we don't import
messages.  Basically, we only import a message if
it has sender with an id that maps to a non-deleted
user.
2018-11-26 16:37:30 -08:00
Steve Howell 3a7788217e hipchat: Skip really long messages. 2018-11-26 16:37:30 -08:00
Steve Howell e57a932692 hipchat: Fix avatars.
This code was not reading any avatars because
it was not referencing 'User' to get to the avatar,
and it was not re-mapping user ids for some reason.
2018-11-26 16:37:30 -08:00
Steve Howell ad35e371fe hipchat: Support slim_mode flag.
We now skip deleted users.  There is a flag
here that's hard coded to True--we may decide
later to make this a command line option.
2018-11-26 16:37:30 -08:00
Steve Howell bd1e96cf63 hipchat: Rework stream/subscriber logic.
We now account for streams having users that
may be deleted.  We do a couple things:

    - use a loop instead of map
    - only pass in users to hipchat_subscriber
    - early-exit if there are not users
    - skip owner/members logic for public streams
2018-11-26 16:37:30 -08:00
Steve Howell 1335dfd295 hipchat: Handle messages with missing recipients.
If a message is for a stream or user that we didn't
load, then we just skip it.
2018-11-26 16:37:30 -08:00
Steve Howell ff68757358 hipchat: Just skip over missing attachments.
It seems like we get a lot of exports with bad
attachment data, and some folks don't necessarily
care, so we just skip for now.
2018-11-26 16:37:30 -08:00
Steve Howell ea26372083 hipchat: Make conversion work with UUID ids from Stride.
Normal hipchat exports use integer ids for their
users and "rooms," which we just borrowed during
conversion.

Atlassian Stride uses stride UUIDs for these instead, but otherwise
has the same export format.

We now introduce IdMapper to handle external ids
that aren't integer.  The IdMapper will map UUID
ids to ints and remember them.  For ints it just
leaves them alone.

Fixes #10805.
2018-11-14 23:22:40 -08:00
Steve Howell aff84cd1e9 hipchat: Skip attachments without paths.
This is a short term workaround.  Some variants
of HipChat exports are missing `path`, and we just
punt for now.
2018-11-14 23:14:13 -08:00
Steve Howell d86dd165da gitter/slack/hipchat: Remove "subject" from conversions.
We (lexically) remove "subject" from the conversion code.  The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.

There was good code coverage on `build_message`.
2018-11-12 15:47:11 -08:00
Tim Abbott e88998e6d4 import: Fix buggy handling of avatars in Slack conversion.
This was a pretty nasty error, where we were accidentally accessing
the parent list in this inner loop function.

This appears to have been introduced as a refactoring bug in
7822ef38c2.
2018-11-08 15:03:39 -08:00
Tim Abbott 8b661f2f03 slack import: Correctly detect the commenting user.
Fixes #10772.
2018-11-06 13:14:23 -08:00
Tim Abbott 81a4c846f4 hipchat: Set s3_path for exported emoji.
This fixes an issue where the import process would fail when importing
to a server using the S3 backend.
2018-11-06 13:02:04 -08:00
Tim Abbott 539e84e9a1 hipchat import: Stop setting last_modified=None.
The last_modified field is intended to support setting the
orig-last-modified field in the S3 backend when importing, basically
to keep track of this bit of pre-export data for debugging.  In the
event that it isn't available, the correct thing to do is not write
out an invalid `last_modified` field; we should just not write it out
at all.
2018-11-06 12:50:36 -08:00
Tim Abbott d54af3cb5b hipchat import: Handle deactivated users without an email address.
We saw this in a recent HipChat import data set.
2018-11-01 10:09:19 -07:00
Steve Howell 30c493ed24 slack import: Generate message_id/reaction_id with NEXT_ID.
This avoids the need to pass tuples of ints around, which
is pretty brittle.
2018-10-29 13:24:50 -07:00
Steve Howell 2f58eb1057 slack import: Extract process_message_files().
This is mostly an extraction, but it does change the
way we calculate `content`.  We append the markdown
links from ALL files to any content that came in the
message itself.

Separating this out also allows us to add more
test coverage for the extracted code.
2018-10-29 13:24:50 -07:00