Commit Graph

61 Commits

Author SHA1 Message Date
Anders Kaseorg 1ed2d9b4a0 logging: Use logging.exception and exc_info for unexpected exceptions.
logging.exception() and logging.debug(exc_info=True),
etc. automatically include a traceback.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg 0d6c771baf python: Guard against default value mutation with read-only types.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:31:27 -07:00
Anders Kaseorg 91a86c24f5 python: Replace None defaults with empty collections where appropriate.
Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦
AbstractSet) to guard against accidental mutation of the default
value.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:31:27 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Stefan Weil d2fa058cc1
text: Fix some typos (most of them found and fixed by codespell).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-27 17:25:56 -07:00
Vishnu KS 1585ad7bf4 mattermost: Add support for exporting DMs and huddles. 2019-10-10 16:37:03 -07:00
Rishi Gupta e10361a832 models: Replace is_guest and is_realm_admin with UserProfile.role.
This new data model will be more extensible for future work on
features like a primary administrator.
2019-10-06 16:24:37 -07:00
Mateusz Mandera dbe508bb91 models: Migration of Message.pub_date to date_sent, part 2.
Fixes #1727.

With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
2019-10-05 19:01:34 -07:00
Vishnu Ks bf5f531e90 import_util: Support huddles in SubscriberHandler. 2019-09-18 11:53:13 -07:00
Vishnu Ks 5e6d86c8c4 slack_import: Support importing multiparty IMs. 2019-07-09 15:03:28 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Vishnu Ks f517f72dd2 import: Make use of is_mirror_dummy in build_user_profile. 2019-04-04 13:51:52 -07:00
Vishnu Ks bd4c3b3ebb import: Move make_user_messages to import_util.py. 2019-04-04 13:51:52 -07:00
Vishnu Ks d921fd25e4 import: Move SubscriberHandler to import_util. 2019-03-20 11:29:51 -07:00
Tim Abbott cbc62b8e07 streams: Prevent creation of multi-line stream descriptions.
We do not anticipate our UI for showing stream descriptions looking
reasonable for multi-line descriptions, so we should just ban creating
them.

Given the frontend changes, multi-line descriptions are only likely to
show up from importing content from other tools, in which case
replacing newlines with spaces is cleaner than the alternative.
2019-02-20 12:28:00 -08:00
Anders Kaseorg 56a675d5ec export: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:25:27 -08:00
Hemanth V. Alluri 73d26c8b28 streams: Render and store the stream description from the backend.
This commit does the following three things:
    1. Update stream model to accomodate rendered description.
    2. Render and save the stream rendered description on update.
    3. Render and save stream descriptions on creation.

Further, the stream's rendered description is also sent whenever the
stream's description is being sent.

This is preparatory work for eliminating the use of the
non-authoritative marked.js markdown parser for stream descriptions.
2019-02-01 22:24:18 -08:00
Tim Abbott 035138dd98 hipchat: Refactor code for building subscriptions.
This moves the filtering of invite-only into the caller, and also
adjusts the indentation.
2019-01-09 16:50:00 -08:00
Tim Abbott c995e8e2ae import: Ensure presence of basic avatar images for HipChat.
Our HipChat conversion tool didn't properly handle basic avatar
images, resulting in only the medium-size avatar images being imported
properly.  This fixes that bug by asking the import tool to do the
thumbnailing for the basic avatar image (from the .original file) as
well as the medium avatar image.
2018-12-27 17:47:09 -08:00
Tim Abbott 8a90441d2f slack import: Import long-inactive users as long-term idle.
This avoids creating UserMessage rows for long-inactive users in
organizations with many thousands of users.
2018-12-16 18:52:20 -08:00
Tim Abbott 48a3975ec0 import: Avoid unnecessary forks when downloading attachments.
The previous implementation used run_parallel incorrectly, passing it
a set of very small jobs (each was to download a single file), which
meant that we'd end up forking once for every file to download.

This correct implementation sends each of N threads 1/N of the files
to download, which is more consistent with the goal of distributing
the download work between N threads.
2018-12-02 13:50:27 -08:00
Steve Howell d86dd165da gitter/slack/hipchat: Remove "subject" from conversions.
We (lexically) remove "subject" from the conversion code.  The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.

There was good code coverage on `build_message`.
2018-11-12 15:47:11 -08:00
Tim Abbott e88998e6d4 import: Fix buggy handling of avatars in Slack conversion.
This was a pretty nasty error, where we were accidentally accessing
the parent list in this inner loop function.

This appears to have been introduced as a refactoring bug in
7822ef38c2.
2018-11-08 15:03:39 -08:00
Steve Howell 00f822a26a conversion: Generate attachment_ids with helpers. 2018-10-29 13:24:50 -07:00
Steve Howell 5cb60f7bea conversions: Use subscriber_map for Slack/Gitter.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.

This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.

I am guessing this will improve performance for most
conversions.  We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions.  Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.

This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.
2018-10-29 13:24:50 -07:00
Steve Howell adb458a5df refactor: Use build_user_message for Slack/Gitter.
We now have all three third party
conversions (Gitter/Slack/Hipchat)
go through build_user_message().

Hipchat was already using this helper.

We also avoid callers having to pass in
an id to build_user_message().
2018-10-29 13:24:50 -07:00
Steve Howell 5194701787 conversions: Use NEXT_ID for usermessage_id.
This is mostly complicated due to the way that the
Slack import passes around tuples of ids to maintain
four different parallel sequences.
2018-10-29 13:24:50 -07:00
Steve Howell 78f6e3ac7d hipchat import: Fix data issues with PMs.
We now set the is_private flag on UserMessage
rows for PMs and set their subject to ''.
2018-10-25 09:11:36 -05:00
Steve Howell 6e8ae2e3fd hipchat import: Support private stream subscribers.
We now create private stream subscriptions that are
based off of `members` and `owner` from room data
in `rooms.json`.
2018-10-25 08:31:01 -05:00
Steve Howell 25f532ca2f refactor: Break up build_subscriptions.
Having two smaller functions should make it
easier to customize the behavior for each specific
use case.  The only reason they were ever coupled
was to keep ids in sequence, but the recent NEXT_ID
changes make that a non-issue now.
2018-10-25 08:31:01 -05:00
Steve Howell 2ed9fbd25b conversions: Use NEXT_ID for recipient and subscription ids.
The NEXT_ID scheme seems pretty robust, so I'm fixing a
few easy places.
2018-10-25 08:31:01 -05:00
Steve Howell 481488a35e Extract make_subscriber_map().
We extract this function and put it in the shared
library `import_util.py`.

Also, we make it one time higher up in the call
stack, rather than re-building it for every batch
of messages.  I doubt this was super expensive, but
there's no reason to repeatedly execute this.
2018-10-23 17:27:37 -05:00
Steve Howell d1ff903534 refactor: Rename build_user -> build_user_profile.
This makes greps less confusing.
2018-10-23 17:27:37 -05:00
Steve Howell ff61c56f47 hipchat import: Add NotificationMessage support. 2018-10-17 12:11:08 -07:00
Tim Abbott f9b6eeb488 import: Migrate from json to ujson for better perf.
We expect to get better memory performace from
ujson than json.

We also do a better job of closing file handles.

This likely fixes #10377.
2018-10-17 12:11:08 -07:00
Steve Howell 8accc60ca7 import_util: Support multiple message ids for attachments. 2018-10-13 16:47:44 -07:00
Steve Howell 23d7b3d2cc import: De-dup create_converted_data_files helper. 2018-10-13 16:47:41 -07:00
Steve Howell 4b82326376 hipchat import: Support guest users.
We simplify the code for is_realm_admin
and set is_guest as well.

I verified that build_user() is not used
by Slack/Gitter, so the extra argument there
should be fine.

Fixes #10639
2018-10-11 15:28:58 -07:00
Steve Howell 4da664817b hipchat conversion: Add messages. 2018-10-02 16:55:16 -07:00
Steve Howell f296d60dad hipchat conversion: Add emoji support. 2018-10-02 16:55:16 -07:00
Steve Howell 9518b1344a hipchat conversion: Process avatars.
This processes the avatar payloads that we
get in users.json.
2018-10-02 16:55:16 -07:00
Steve Howell c0f15c3860 hipchat conversion: Include deactivated users/streams.
We now include deleted/deactivated data from the old system.
2018-10-02 16:55:16 -07:00
Steve Howell faea26783b Create convert_hipchat_data.
This is a very early version of a tool to convert Hipchat
tar files into data files that can be used by the Zulip
import process.

We include the most fundamental entities--users and
streams.  Customers who don't care about past messages
or customizations could start an instance off of this
and start communicating.

Of course, there are a lot of things missing in the
initial version:

    * messages!
    * file assets -- avatars, emojis, attachments
    * probably lots of other minor things

We currently ignore any incoming dates from Hipchat data
and just use the current time.  This is consistent with
other imports.

We also don't have any docs yet, although the process
will be extremely similar to the "Slack" process:

    https://zulipchat.com/help/import-from-slack

Also, there's a comment at the top of convert_hipchat_data.py
that describes how to test this in dev mode.

I tested this by following the steps in the comment above.
The users just "show up" in /devlogin, so that's nice, and
you can send messages to other users.  To verify the stream
data you have to go into the gear menu and click on "All
Streams", then you can subscribe and send a message.

Production users will need to get new passwords and
re-subscribe to streams.  We will probably auto-subscribe
all users to public streams.
2018-10-02 16:55:16 -07:00
Rhea Parekh 7822ef38c2 import: Change absolute path of downloaded avatars in records.json to relative path. 2018-09-09 09:18:18 -04:00
Rhea Parekh f70b9a3eba import: Move 'build_message' to import_util. 2018-08-19 22:27:13 -07:00
Rhea Parekh a5bc701181 import: Move 'build_stream' to import_util. 2018-08-19 22:27:13 -07:00