Commit Graph

70 Commits

Author SHA1 Message Date
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Anders Kaseorg 9a2aad58d0 import_util: Migrate from run_parallel to multiprocessing.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-09-14 16:22:23 -07:00
Anders Kaseorg f91d287447 python: Pre-fix a few spots for better Black formatting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-03 17:51:09 -07:00
Anders Kaseorg a276eefcfe python: Rewrite dict() as {}.
Suggested by the flake8-comprehensions plugin.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-09-02 11:15:41 -07:00
Anders Kaseorg 61d0417e75 python: Replace ujson with orjson.
Fixes #6507.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:55:12 -07:00
Alex Vandiver 2928bbc8bd logging: Report stack_info on logging.exception calls.
The exception trace only goes from where the exception was thrown up
to where the `logging.exception` call is; any context as to where
_that_ was called from is lost, unless `stack_info` is passed as well.
Having the stack is particularly useful for Sentry exceptions, which
gain the full stack trace.

Add `stack_info=True` on all `logging.exception` calls with a
non-trivial stack; we omit `wsgi.py`.  Adjusts tests to match.
2020-08-11 10:16:54 -07:00
Steve Howell c44500175d database: Remove short_name from UserProfile.
A few major themes here:

    - We remove short_name from UserProfile
      and add the appropriate migration.

    - We remove short_name from various
      cache-related lists of fields.

    - We allow import tools to continue to
      write short_name to their export files,
      and then we simply ignore the field
      at import time.

    - We change functions like do_create_user,
      create_user_profile, etc.

    - We keep short_name in the /json/bots
      API.  (It actually gets turned into
      an email.)

    - We don't modify our LDAP code much
      here.
2020-07-17 11:15:15 -07:00
Steve Howell 0b65abcdf5 pointer: Remove pointer from UserProfile.
Most of the changes here are just that we no
longer need to provide a value for pointer
when we create UserProfile objects.
2020-07-03 13:08:40 +00:00
Anders Kaseorg 3ffed617a2 mypy: Type simple generators as Iterator, not Iterable.
A generator that yields values without receiving or returning them is
an Iterator.  Although every Iterator happens to be iterable, Iterable
is a confusing annotation for generators because a generator is only
iterable once.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-23 11:29:54 -07:00
Anders Kaseorg 1ed2d9b4a0 logging: Use logging.exception and exc_info for unexpected exceptions.
logging.exception() and logging.debug(exc_info=True),
etc. automatically include a traceback.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg 0d6c771baf python: Guard against default value mutation with read-only types.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:31:27 -07:00
Anders Kaseorg 91a86c24f5 python: Replace None defaults with empty collections where appropriate.
Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦
AbstractSet) to guard against accidental mutation of the default
value.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:31:27 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Stefan Weil d2fa058cc1
text: Fix some typos (most of them found and fixed by codespell).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-27 17:25:56 -07:00
Vishnu KS 1585ad7bf4 mattermost: Add support for exporting DMs and huddles. 2019-10-10 16:37:03 -07:00
Rishi Gupta e10361a832 models: Replace is_guest and is_realm_admin with UserProfile.role.
This new data model will be more extensible for future work on
features like a primary administrator.
2019-10-06 16:24:37 -07:00
Mateusz Mandera dbe508bb91 models: Migration of Message.pub_date to date_sent, part 2.
Fixes #1727.

With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
2019-10-05 19:01:34 -07:00
Vishnu Ks bf5f531e90 import_util: Support huddles in SubscriberHandler. 2019-09-18 11:53:13 -07:00
Vishnu Ks 5e6d86c8c4 slack_import: Support importing multiparty IMs. 2019-07-09 15:03:28 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Vishnu Ks f517f72dd2 import: Make use of is_mirror_dummy in build_user_profile. 2019-04-04 13:51:52 -07:00
Vishnu Ks bd4c3b3ebb import: Move make_user_messages to import_util.py. 2019-04-04 13:51:52 -07:00
Vishnu Ks d921fd25e4 import: Move SubscriberHandler to import_util. 2019-03-20 11:29:51 -07:00
Tim Abbott cbc62b8e07 streams: Prevent creation of multi-line stream descriptions.
We do not anticipate our UI for showing stream descriptions looking
reasonable for multi-line descriptions, so we should just ban creating
them.

Given the frontend changes, multi-line descriptions are only likely to
show up from importing content from other tools, in which case
replacing newlines with spaces is cleaner than the alternative.
2019-02-20 12:28:00 -08:00
Anders Kaseorg 56a675d5ec export: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:25:27 -08:00
Hemanth V. Alluri 73d26c8b28 streams: Render and store the stream description from the backend.
This commit does the following three things:
    1. Update stream model to accomodate rendered description.
    2. Render and save the stream rendered description on update.
    3. Render and save stream descriptions on creation.

Further, the stream's rendered description is also sent whenever the
stream's description is being sent.

This is preparatory work for eliminating the use of the
non-authoritative marked.js markdown parser for stream descriptions.
2019-02-01 22:24:18 -08:00
Tim Abbott 035138dd98 hipchat: Refactor code for building subscriptions.
This moves the filtering of invite-only into the caller, and also
adjusts the indentation.
2019-01-09 16:50:00 -08:00
Tim Abbott c995e8e2ae import: Ensure presence of basic avatar images for HipChat.
Our HipChat conversion tool didn't properly handle basic avatar
images, resulting in only the medium-size avatar images being imported
properly.  This fixes that bug by asking the import tool to do the
thumbnailing for the basic avatar image (from the .original file) as
well as the medium avatar image.
2018-12-27 17:47:09 -08:00
Tim Abbott 8a90441d2f slack import: Import long-inactive users as long-term idle.
This avoids creating UserMessage rows for long-inactive users in
organizations with many thousands of users.
2018-12-16 18:52:20 -08:00
Tim Abbott 48a3975ec0 import: Avoid unnecessary forks when downloading attachments.
The previous implementation used run_parallel incorrectly, passing it
a set of very small jobs (each was to download a single file), which
meant that we'd end up forking once for every file to download.

This correct implementation sends each of N threads 1/N of the files
to download, which is more consistent with the goal of distributing
the download work between N threads.
2018-12-02 13:50:27 -08:00
Steve Howell d86dd165da gitter/slack/hipchat: Remove "subject" from conversions.
We (lexically) remove "subject" from the conversion code.  The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.

There was good code coverage on `build_message`.
2018-11-12 15:47:11 -08:00
Tim Abbott e88998e6d4 import: Fix buggy handling of avatars in Slack conversion.
This was a pretty nasty error, where we were accidentally accessing
the parent list in this inner loop function.

This appears to have been introduced as a refactoring bug in
7822ef38c2.
2018-11-08 15:03:39 -08:00
Steve Howell 00f822a26a conversion: Generate attachment_ids with helpers. 2018-10-29 13:24:50 -07:00
Steve Howell 5cb60f7bea conversions: Use subscriber_map for Slack/Gitter.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.

This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.

I am guessing this will improve performance for most
conversions.  We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions.  Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.

This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.
2018-10-29 13:24:50 -07:00
Steve Howell adb458a5df refactor: Use build_user_message for Slack/Gitter.
We now have all three third party
conversions (Gitter/Slack/Hipchat)
go through build_user_message().

Hipchat was already using this helper.

We also avoid callers having to pass in
an id to build_user_message().
2018-10-29 13:24:50 -07:00
Steve Howell 5194701787 conversions: Use NEXT_ID for usermessage_id.
This is mostly complicated due to the way that the
Slack import passes around tuples of ids to maintain
four different parallel sequences.
2018-10-29 13:24:50 -07:00
Steve Howell 78f6e3ac7d hipchat import: Fix data issues with PMs.
We now set the is_private flag on UserMessage
rows for PMs and set their subject to ''.
2018-10-25 09:11:36 -05:00
Steve Howell 6e8ae2e3fd hipchat import: Support private stream subscribers.
We now create private stream subscriptions that are
based off of `members` and `owner` from room data
in `rooms.json`.
2018-10-25 08:31:01 -05:00
Steve Howell 25f532ca2f refactor: Break up build_subscriptions.
Having two smaller functions should make it
easier to customize the behavior for each specific
use case.  The only reason they were ever coupled
was to keep ids in sequence, but the recent NEXT_ID
changes make that a non-issue now.
2018-10-25 08:31:01 -05:00
Steve Howell 2ed9fbd25b conversions: Use NEXT_ID for recipient and subscription ids.
The NEXT_ID scheme seems pretty robust, so I'm fixing a
few easy places.
2018-10-25 08:31:01 -05:00
Steve Howell 481488a35e Extract make_subscriber_map().
We extract this function and put it in the shared
library `import_util.py`.

Also, we make it one time higher up in the call
stack, rather than re-building it for every batch
of messages.  I doubt this was super expensive, but
there's no reason to repeatedly execute this.
2018-10-23 17:27:37 -05:00
Steve Howell d1ff903534 refactor: Rename build_user -> build_user_profile.
This makes greps less confusing.
2018-10-23 17:27:37 -05:00
Steve Howell ff61c56f47 hipchat import: Add NotificationMessage support. 2018-10-17 12:11:08 -07:00
Tim Abbott f9b6eeb488 import: Migrate from json to ujson for better perf.
We expect to get better memory performace from
ujson than json.

We also do a better job of closing file handles.

This likely fixes #10377.
2018-10-17 12:11:08 -07:00
Steve Howell 8accc60ca7 import_util: Support multiple message ids for attachments. 2018-10-13 16:47:44 -07:00
Steve Howell 23d7b3d2cc import: De-dup create_converted_data_files helper. 2018-10-13 16:47:41 -07:00