Commit Graph

330 Commits

Author SHA1 Message Date
Anders Kaseorg 61d0417e75 python: Replace ujson with orjson.
Fixes #6507.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:55:12 -07:00
Anders Kaseorg 03d2540899 export: Post-process authentication_methods BitHandler field to list.
A BitHandler object is not JSON serializable, and orjson enforces
this.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:47:13 -07:00
Anders Kaseorg 2cf2547b27 export: Add missing datetime fields for post-processing.
datetime objects are not ordinarily JSON serializable.  While both
ujson and orjson have special cases to serialize datetime objects,
they do it in different ways.  So we want to fix the post-processing
code to do its job.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:47:13 -07:00
Anders Kaseorg 60a25b2721 docs: Fix spelling errors caught by codespell.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-08-11 10:23:06 -07:00
Tim Abbott 6130a61be0 export: Only print .s with percent_callback to console.
The S3 data export tool's upload code path uses this nice boto
callback feature for showing a progress bar, which is nice for the
management command.  It's spammy/broken in production and the backend
tests, so we change percent_callback to be a parameter passed in so
that it can only be used in the contexts where it makes sense.
2020-07-30 13:14:53 -07:00
Hemanth V. Alluri 0e893b9045 models/drafts: Add a model for storing Draft messages.
Also add a Draft object-to-dictionary conversion method.
The following commits will provide an API around this
model using which our clients can sync drafts across each
other (if they so wish too). As of making this commit, we
haven't finalized exactly how our clients will use this.

See https://chat.zulip.org/#narrow/stream/2-general/topic/drafts
For some of the discussion around this model and in general,
around this feature.

Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
2020-07-28 17:18:35 -07:00
Steve Howell 318c55e030 export: Export AlertWord table. 2020-07-16 08:50:31 -07:00
Steve Howell 9662af842f export: Remove stream sanity check.
We also remove the post_process_data option.

The sanity check is just overkill at this point,
since the mechanism to find streams is very
direct due to a recent commit.
2020-07-09 11:34:00 -07:00
Steve Howell 54c596cfc4 export: Just export all streams in a realm.
Before this change we would only export streams
that had actual subscribers, which is usually
harmless, but it was mostly a relic of a one
time migration that we did when we were cleaning
up some dirty data in some of our very early
databases (circa 2016).

Now we work down the table hierarchy in a
more natural way:

    - get Streams in Realm
    - get Recipients matching above Streams
    - get Subscriptions matching above Recipients

Note that for per-user exports, I kept the
same logic (users -> subscriptions -> recipients ->
streams) we had before.

One subtle detail here is that we make our
final Config blocks--which build the final
version of Recipient/Subscription--now hang
off of realm_config.

Fixes #15146.
2020-07-09 11:33:55 -07:00
Anders Kaseorg 74c17bf94a python: Convert more percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Now including %d, %i, %u, and multi-line strings.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-14 23:27:22 -07:00
Anders Kaseorg 57a80856a5 python: Convert more "".format to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus --keep-percent-format.

Now including %d, %i, %u, and multi-line strings.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-13 15:39:00 -07:00
Anders Kaseorg 365fe0b3d5 python: Sort imports with isort.
Fixes #2665.

Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.

Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start.  I expect this change will increase pressure for us to split
those files, which isn't a bad thing.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-11 16:45:32 -07:00
Anders Kaseorg 69730a78cc python: Use trailing commas consistently.
Automatically generated by the following script, based on the output
of lint with flake8-comma:

import re
import sys

last_filename = None
last_row = None
lines = []

for msg in sys.stdin:
    m = re.match(
        r"\x1b\[35mflake8    \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
    )
    if m:
        filename, row_str, col_str, err = m.groups()
        row, col = int(row_str), int(col_str)

        if filename == last_filename:
            assert last_row != row
        else:
            if last_filename is not None:
                with open(last_filename, "w") as f:
                    f.writelines(lines)

            with open(filename) as f:
                lines = f.readlines()
            last_filename = filename
        last_row = row

        line = lines[row - 1]
        if err in ["C812", "C815"]:
            lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
        elif err in ["C819"]:
            assert line[col - 2] == ","
            lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")

if last_filename is not None:
    with open(last_filename, "w") as f:
        f.writelines(lines)

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-06-11 16:04:12 -07:00
Graham Bleaney 461d5b1a3e pysa: Introduce sanitizers, models, and inline marking safe.
This commit adds three `.pysa` model files: `false_positives.pysa`
for ruling out false positive flows with `Sanitize` annotations,
`req_lib.pysa` for educating pysa about Zulip's `REQ()` pattern for
extracting user input, and `redirects.pysa` for capturing the risk
of open redirects within Zulip code. Additionally, this commit
introduces `mark_sanitized`, an identity function which can be used
to selectively clear taint in cases where `Sanitize` models will not
work. This commit also puts `mark_sanitized` to work removing known
false postive flows.
2020-06-11 12:57:49 -07:00
Anders Kaseorg 67e7a3631d python: Convert percent formatting to Python 3.6 f-strings.
Generated by pyupgrade --py36-plus.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-06-10 15:02:09 -07:00
whoodes cea7d713cd requirements: Upgrade boto to boto3.
Fixes: #3490

Contributors include:

Author:    whoodes <hoodesw@hawaii.edu>
Author:    zhoufeng1989 <zhoufengloop@gmail.com>
Author:    rht <rhtbot@protonmail.com>
2020-05-26 23:18:07 -07:00
Anders Kaseorg bdc365d0fe logging: Pass format arguments to logging.
https://docs.python.org/3/howto/logging.html#optimization

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-05-02 10:18:02 -07:00
Wyatt Hoodes 5a58b7c549 data exports: Keep deleted export in UI table.
It makes sense to keep a deleted export in the table,
along with the time of deletion, for auditing reasons.
2020-04-30 13:00:59 -07:00
Wyatt Hoodes 82e7ad8e25 data exports: Handle pending and failed exports.
Prior to this change, there were reports of 500s in
production due to `export.extra_data` being a
Nonetype.  This was reproducible using the s3
backend in development when a row was created in
the `RealmAuditLog` table, but the export failed in
the `DeferredWorker`.  This left an entry lying
about that was never updated with an `extra_data`
field.

To fix this, we catch any exceptions in the
`DeferredWorker`, and then update `extra_data` to
encode the failure.  We also fix the fact that we
never updated the export UI table with pending exports.

These changes also negated the use for the somewhat
hacky `clear_success_banner` logic.
2020-04-30 13:00:59 -07:00
Abhishek-Balaji 052368bd3e alert_words: Move alert_words from UserProfile to separate model.
Previously, alert words were a JSON list of strings stored in a
TextField on user_profile.  That hacky model reflected the fact that
they were an early prototype feature.

This commit migrates from that to a separate table, 'AlertWord'.  The
new AlertWord has user_profile, word, id and realm(denormalization so
we can provide a nice index for fetching all the alert words in a
realm).

This transition requires moving the logic for flushing the Alert Words
caches to their own independent feature.

Note that this commit should not be cherry-picked without the
following commit, which fixes case-sensitivity issues with Alert Words.
2020-04-27 11:29:50 -07:00
Anders Kaseorg fead14951c python: Convert assignment type annotations to Python 3.6 style.
This commit was split by tabbott; this piece covers the vast majority
of files in Zulip, but excludes scripts/, tools/, and puppet/ to help
ensure we at least show the right error messages for Xenial systems.

We can likely further refine the remaining pieces with some testing.

Generated by com2ann, with whitespace fixes and various manual fixes
for runtime issues:

-    invoiced_through: Optional[LicenseLedger] = models.ForeignKey(
+    invoiced_through: Optional["LicenseLedger"] = models.ForeignKey(

-_apns_client: Optional[APNsClient] = None
+_apns_client: Optional["APNsClient"] = None

-    notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
-    signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)
+    signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE)

-    author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)
+    author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE)

-    bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)
+    bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL)

-    default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
-    default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)
+    default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE)

-descriptors_by_handler_id: Dict[int, ClientDescriptor] = {}
+descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {}

-worker_classes: Dict[str, Type[QueueProcessingWorker]] = {}
-queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {}
+worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {}
+queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {}

-AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None
+AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-22 11:02:32 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Graham Bleaney 5dca599481 export: Harden s3 export against directory traversal.
This commit modifies 'zerver/lib/export.py' to raise an exception
in the presence of a suspected attempt at directory traversal.
2020-03-25 16:39:17 -07:00
Anders Kaseorg 39f9abeb3f python: Convert json.loads(f.read()) to json.load(f).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-24 10:46:32 -07:00
Vishnu KS 51f5701879 export: Canonicalize the email of cross realm bot to default value.
Fixes #13496
2020-02-19 14:44:50 -08:00
Vishnu Ks 5dfd4ea38d export: Remove unused parameter from _get_exported_s3_record. 2020-02-03 14:09:05 -08:00
Vishnu Ks 2ea53a347a import: Support importing realm icon and logo.
Fixes #11216
2020-02-03 14:09:05 -08:00
Ryan Rehman 3dc7d60ffe muting: Record DateTime when a Topic is muted.
This includes the necessary migration to add
the date_muted field to the MutedTopic class
and populates it with a hard coded value.
2020-02-02 20:49:53 -08:00
Hashir Sarwar 0cabacb8ab export: Fix data export parallelization.
This improves the approach of creating multiple parallel processes by
using subprocess.Popen() instead of run_parallel() and
subprocess.call() while exporting an organization's message
history.  This prevents forking twice for individual subprocess.

While this has some performance benefit, the main reason to fix this
is that it fixes an issue with the data export web UI introduced in
run_parallel forks exited).

Fixes #12904.
2020-01-07 13:23:18 -08:00
Mateusz Mandera 9077bbfefd models: Add MissedMessageEmailAddress class.
Preparatory commit for making the email mirror use the database instead
of redis for missed message addresses.

This model will represent missed message email addresses, which
currently have their data stored in redis.
The redis data will be converted and migrated into these models and
the email mirror will start using them in the main commit.
2020-01-07 12:46:55 -08:00
Mateusz Mandera dbe508bb91 models: Migration of Message.pub_date to date_sent, part 2.
Fixes #1727.

With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
2019-10-05 19:01:34 -07:00
Tim Abbott 96726c00ce export: Fix broken URLs in UI with S3 backend.
Apparently, the Zulip notifications (and resulting emails) were
correct, but the download links inside the Zulip UI were incorrectly
not including S3 prefix on the URL, making them not work.

While we're at this, we rewrite the somewhat convoluted previous
system for formatting the data export output.
2019-09-24 13:56:49 -07:00
Wyatt Hoodes 6f6efa516d exports: Refactor extra_data to export_data. 2019-08-12 17:51:46 -07:00
Wyatt Hoodes 7a2a1f29ad exports: Refactor event_time to export_time timestamp.
The time of the event was incorrectly being sent
as a datetime object.
2019-08-12 17:51:46 -07:00
Wyatt Hoodes 11db0c23fb exports: Update extra_data field to a JSON structure.
We add the `deleted_timestamp` key to the new `extra_data`
dictionary.
2019-08-07 12:04:28 -07:00
Wyatt Hoodes bbbea9ec87 events: Rewrite system for managing realm exports.
This feature is intended to cover all of our ways of exporting a
realm, not just the initial "public export" feature, so we should name
things appropriately for that goal.

Additionally, we don't want to include data exports in page_params;
the original implementation was actually buggy and would have.
2019-07-26 16:38:52 -07:00
Wyatt Hoodes ef02de4834 public_export: Add endpoint for returning all REALM_EXPORTED objects. 2019-07-26 15:52:02 -07:00
Wyatt Hoodes e331a758c3 python: Migrate open statements to use with.
This is low priority, but it's nice to be consistently using the best
practice pattern.

Fixes: #12419.
2019-07-20 15:48:52 -07:00
Wyatt Hoodes 0f31053b0c export.py: Call upload_backend only when we know which backend to use.
Importing at the top of the file causes conflictions on the
backend when deciding whether to go down the S3 path, or the
`LOCAL_UPLOADS_DIR` path.
2019-07-10 17:48:54 -07:00
Wyatt Hoodes af4eb8c0d5 export/upload: Refactor tarball upload logic to upload_backend.
The conditional block containing the tarball upload logic for both S3
and local uploads was deconstructed and moved to the more appropriate
location within `zerver/lib/upload.py`.
2019-07-03 15:40:35 -07:00
Wyatt Hoodes 8efb7b903b export.py: Have do_export_realm handle the export tarball.
This change is preliminary refactoring in order to improve the test
mocking strategy related to `test_realm_export.py`.

What this allows is the ability to simply mock a return value from
`do_export_realm`.  We can then use that value as a dummy url to
ensure a file has been served and can be retrieved.
2019-07-03 15:40:35 -07:00
Mateusz Mandera a2cce62c1c retention: Use new ArchiveTransaction model.
We add a new model, ArchiveTransaction, to tie archived objects together
in a coherent way, according to the batches in which they are archived.
This enables making a better system for restoring from archive, and it
seems just more sensible to tie the archived objects in this way, rather
the somewhat vague setting of archive_timestamp to each object using
timezone_now().
2019-06-26 12:05:59 -07:00
Tim Abbott 544f9c74ce export: Use outbox emoji for managing who is exported.
This is a little more unambiguous.
2019-06-17 16:10:28 -07:00
Mateusz Mandera 29529cf2e7 retention: Add ArchivedSubMessage model. 2019-05-29 16:26:11 -07:00
Mateusz Mandera 292b4bb0d7 retention: Add ArchivedReaction model. 2019-05-29 16:26:11 -07:00
Wyatt Hoodes 5c82c52b52 export.py: Clean up redundant import statements.
There existed duplicate import statements for the S3 backend as a
result of the prior refactoring work.
2019-05-27 20:13:56 -07:00
Wyatt Hoodes c0ef6c2fc6 export: Add LOCAL_UPLOADS_DIR support to the export feature.
A unique path was created using the `LOCAL_UPLOADS_DIR` backend, similar
to the code used in `LocalUploadBackend`.  The exported tarball was
copied to the directory, and an nginx url was created to serve the file
publicly.

Tweaked by tabbott to output an actual URL.
2019-05-27 20:06:35 -07:00
Vishnu Ks 21e7763886 export: Remove unnecessary query from export_partial_message_files.
The query is not required anymore after the refactoring done
while merging #12225.
2019-05-21 14:10:29 -07:00
Wyatt Hoodes 4dd8c133a9 export: Rename `--upload-to-s3` to be `--upload`.
The upload option will no longer be limited to strictly S3 uploads. This
commit serves as a preliminary step for supporting LOCAL_UPLOADS_DIR as
part of the public only export feature.
2019-05-20 19:59:57 -07:00
Vishnu Ks 06983298ba export: Add support for exporting realm with member consent.
This lets us handle directly in our tooling the user experience that
we document for exporting a realm with member consent (before, it
required unpleasant manual work).
2019-05-15 12:35:32 -07:00
Tim Abbott edb956091f export: Add a blank line in S3 upload output.
This should be more readable.
2019-04-30 16:37:23 -07:00
Tim Abbott 8b5d2e9631 export: Return the S3 URL we uploaded data to.
This will make it possible to access that URL from the caller for the
data export tool.
2019-04-26 17:22:02 -07:00
Anders Kaseorg 643bd18b9f lint: Fix code that evaded our lint checks for string % non-tuple.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-04-23 15:21:37 -07:00
Tim Abbott cb6c1e7a92 export: Fix log line about zerver_scheduledemail_users. 2019-04-23 13:52:02 -07:00
Wyatt Hoodes bafcf3c664 export.py: Add 'delete after upload' option for removing tarball.
This allows removal of the local tarball upon a succesful s3 upload.

A part of the public-only-realm-export webapp feature.
2019-04-12 10:50:06 -07:00
Wyatt Hoodes 0db7d6c31b export.py: Refactor './manage.py export' core logic.
This commit serves as the first step in supporting "public export" as a
webapp feature.  The refactoring was done as a means to allow calling
the export logic from elsewhere in the codebase.
2019-04-12 10:50:06 -07:00
Raymond Akornor 89351cdd19 send_email: Add ScheduledEmail support for multiple recipients.
Follow up on 92dc363. This modifies the ScheduledEmail model
and send_future_email to properly support multiple recipients.

Tweaked by tabbott to add some useful explanatory comments and fix
issues with the migration.
2019-03-15 11:02:12 -07:00
Tim Abbott bc3b864754 export: Add a bunch of comments to our export tool. 2019-02-28 12:20:08 -08:00
Anders Kaseorg 649235cfec python: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-22 16:54:36 -08:00
Tim Abbott 0c0aec3cc9 export: Fix finding manage.py to export usermessages.
We were using a hardcoded relative path, which doesn't work if you're
not running this from the root of the Zulip checkout.

As part of fixing this, we need to make `LOCAL_UPLOADS_DIR` an
absolute path.

Fixes #11581.
2019-02-15 11:32:36 -08:00
Abdelhadi Dyouri 4ac2db56f8 export: Correctly treat emoji author field as optional.
While we likely will eventually want to make every custom emoji have
an author, that's not the data model today.

Fixes #11518.
2019-02-13 16:12:06 -08:00
Wyatt Hoodes 9c68a97472 import/export: Use separate analytics.json for analytics data.
This helps keep the realm.json small and easy to process; previously,
almost the entire size of that file was the analytics data.

We implement this by refactoring the analytics Config objects into a
separate subroutine that writes to a separate file, plus the
corresponding import code.

Manual testing was performed by exporting the 'analytics' realm, and
importing back to a newly created 'test' realm.  The 'test' realm was
then exported and the json files were inspected.  The data appeared
consistent with no abnormalities.

Fixes: #11220.
2019-02-04 10:59:24 -08:00
Anders Kaseorg 56a675d5ec export: Remove unused imports.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-02-02 17:25:27 -08:00
Tim Abbott 022c8beaf5 analytics: Add APIs for submitting analytics to another server.
This adds a new API for sending basic analytics data (number of users,
number of messages sent) from a Zulip server to the Zulip Cloud
central analytics database, which will make it possible for servers to
elect to have their usage numbers counted in published stats on the
size of the Zulip ecosystem.
2019-02-01 22:03:52 -08:00
Rishi Gupta 85f7ac8172 analytics: Remove Anomaly model. 2019-02-01 18:48:18 -08:00
Anders Kaseorg 601b5eb036 export: Avoid hardcoded paths in /tmp.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2019-01-15 16:05:51 -08:00
Tim Abbott 848b2f687c export: Add support for public-streams-only exports.
Previously, this wasn't an explicit feature of the export tool.

Note that the current version still includes metadata on private
streams and private message recipients, just not their messages.
2019-01-07 16:52:02 -08:00
Tim Abbott 6eda129741 export: Export and import analytics table data.
This should eliminate the need to do manual analytics work when
importing organizations imported/exported using the zulip -> zulip
import/export tools.
2019-01-04 16:22:18 -08:00
Tim Abbott 48ccb3ad18 import: Move realm_tables to the appropriate file.
These had ended up in the wrong place when we split export from
import.
2019-01-04 16:22:18 -08:00
Steve Howell a8301ca14a status: Add UserStatus model and core library for away status. 2019-01-02 09:12:03 -08:00
rht a0dbcde063 export_files_from_s3: Move saving s3 object to local file to a separate function.
This refactor makes upgrading boto to boto3 easier.
Based on 24bf813e8a
2018-12-07 11:37:46 -08:00
rht 0ddb242583 export_files_from_s3: get s3 object info in dict to a separate function.
This refactor makes upgrading boto to boto3 easier.
Based on 24bf813e8a
2018-12-07 11:37:46 -08:00
rht 1cecf0f142 export_files_from_s3: Move checking for s3 oject's metadata to a separate function.
This refactor makes upgrading boto to boto3 easier.
Based on 24bf813e8a
2018-12-07 11:37:46 -08:00
Tim Abbott fc1c146d31 export: Remove assertion on current working directory.
This command hasn't made deep assumptions about CWD for a long time,
and this enables users to run it through a symlink (etc.).

Fixes #10961.
2018-12-06 11:05:40 -08:00
Anders Kaseorg 1d15d72775 zerver/lib/export.py: Avoid shelling out for cp, rm, ln.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2018-11-28 17:28:17 -08:00
Tim Abbott feee76eb23 export: Fix exporting files with S3 upload backend.
At some point as part of the process of supporting renumbering data,
we changed the structure of our file uploads to expect `path` to match
`s3_path`, with both having the relative path within the overall
hierarchy (including the realm ID).  This change updates the more
rarely-used S3 export code path to use that model, fixing a crash when
messages reference an Attachment object with a rewritten path_id.
2018-09-20 20:14:19 -07:00
Tim Abbott e2bd03365e import: Fix handling of recipient IDs for welcome bot.
If any user had sent the reply to the welcome bot recommended by our
tutorial, then the Zulip export/import process didn't work properly,
because we weren't including (and then remapping) the recipient ID for
sending PMs to the cross-realm bots.  This commit fixes that gap, by
recording the necessary data on the export side, and doing the
appropriate remapping on the import side.
2018-09-20 17:55:17 -07:00
Tim Abbott e04156eef3 export: Fix error messages for stream list mismatches.
The previous error messages for this were written for a tool only to
be used by a couple people, and didn't make clear what potential
causes were.  Tweak these to provide greater clarity about what's
going on.

The main cause of these errors appearing in practice was fixed in
7ea5987e5d, but nothing strongly
prevents a similar issue from being introduced in the future.

Fixes #10078.
2018-07-30 22:32:26 -07:00
Tim Abbott db1260fb93 export: Clean up comments on why tables are not currently exported. 2018-07-23 08:28:20 -07:00
Rhea Parekh c42d6b6983 export: Remove 'zerver_pushdevicetoken' from the to be imported list.
PushDevicetoken is automatically created when a user logs
in a server from mobile. This shouldn't be imported.
2018-07-23 08:21:00 -07:00
Rhea Parekh f01ff28e03 export: Export BotStorageData and BotConfigData. 2018-07-23 08:21:00 -07:00
Rhea Parekh 2978e025df import: Import UserGroup. 2018-07-23 08:21:00 -07:00
Rhea Parekh 98a7762a51 export: Export user groups. 2018-07-23 08:20:58 -07:00
Rhea Parekh 0fcf6d9a40 export: Export Service. 2018-07-23 08:20:58 -07:00
Rhea Parekh 6eab6446fc export: Export MutedTopic. 2018-07-23 08:20:58 -07:00
Rhea Parekh 8897e187c0 export: Export UserHotspot. 2018-07-23 08:20:58 -07:00
Rhea Parekh c182a0c7a0 export: export RealmAuditLog. 2018-07-10 15:53:15 +05:30
Rhea Parekh 838ab2fce5 export: Add variable MESSAGE_BATCH_CHUNK_SIZE in export.py.
Also use this variable in slack_data_to_zulip_data.
2018-07-01 07:08:13 -07:00
Rhea Parekh a2a74d9271 export: The records.json IDs should be integer.
In records the IDs like the realm_id and user_profile_id
of 'records.json' should be integers. This was missing in the
S3 backend and this commit fixes that.

Added tests for this as well.
2018-06-18 23:06:09 +05:30
Rhea Parekh b2e971b9b1 tests: Add tests for the export file's records.
For the emojis, In 'records.json', the record should contain
the attribute 'file_name', which was missing in the S3 backend.
This commit adds this attribute, as well as tests for the
records of uploads, avatars and emojis in both local and S3 backend.
2018-06-18 09:19:24 -07:00
Neil Pilgrim ba55d22fdb mypy: Improve MessageOutput typing in export.py.
See the comments above for why this is the correct list of options.
2018-06-14 15:22:56 -07:00
Tim Abbott b9b81cf658 export: Rename ALL_ZERVER_TABLES to ALL_ZULIP_TABLES.
They don't all start with zerver, now :).
2018-05-31 10:47:27 -07:00
Tim Abbott 42aea68df3 export: Automate validation of ALL_ZERVER_TABLES.
This should help make it explicit whenever we add a new table to Zulip
that we need to correctly categorize it for whether it will be
included in the data export, or not.
2018-05-31 10:47:27 -07:00
Tim Abbott 328136344a import: Fix typo in zerver_customprofilefieldvalue table name.
Apparently, we were doing this slightly wrong.
2018-05-31 10:47:27 -07:00
Rhea Parekh 468afe4840 export: Support export of Custom emojis.
Export of RealmEmoji should also include the image
file of those emojis.

Here, we export emojis both for local and S3 backend
in a method with is similar to attachments and avatars.

Added tests for the same.
2018-05-27 21:54:20 -07:00
Rhea Parekh 7a8b853708 Export: Support export of reactions.
We get the reactions from the messages exported.
2018-05-27 21:54:20 -07:00
Tim Abbott 4e70c9402a export: Fix path logic for exporting avatars with S3 backend.
Apparently, we missed this when we converted the export format to use
longer path names for avatars.
2018-05-25 12:04:34 -07:00
Rhea Parekh c24c249b8c export: Support export of Custom Profile Field. 2018-05-23 09:07:26 -07:00
Aditya Bansal a68376e2ba zerver/lib: Change use of typing.Text to str. 2018-05-12 15:22:39 -07:00
Tim Abbott 0a39eb2a58 export: Convert a bunch of error cases to AssertionError.
This reflects the fact that these are just defensive programming (we
don't expect them to ever happen) and also nicely makes these lines
not show up in our missing test coverage reports.
2018-05-09 20:49:13 -07:00
Tim Abbott c4b886d8ae import: Split out import.py into its own module.
This should make it a bit easier to find the code.
2018-04-23 15:21:12 -07:00
Rhea Parekh 035c440ff3 import script: Support import custom profile fields.
Import of Custom profile fields is only supported for slack
import script for now.
2018-04-09 10:45:35 -07:00
Rhea Parekh ed7127c8b4 import script: Delete medium sized avatars if it exists.
Deletion of medium sized image is done if it exists before calling the
function 'ensure_medium_avatar_image', to avoid potentially confusing
problems with left-over medium-size avatar images from a previous run
being used when repeatedly importing the same realm in a development
environment..

Fixes #8949.
2018-04-08 07:04:24 -07:00
Rhea Parekh e037c2f93e import script: Fix upload links.
Rendered content is None for Slack imports, hence it is replaced only
for Zulip->Zulip imports.

Fixes #8959.
2018-04-07 20:01:20 -07:00
Rhea Parekh b3f951d2cf import script: User profile ids should be allocated before allocating bot ids. 2018-04-07 13:28:33 +05:30
Rhea Parekh 2baa9bc16e Import: Add subdomain in the import script.
Also remove user input of subdomain in the slack data
conversion script.
2018-04-06 09:12:56 -07:00
Rhea Parekh f4ad464d82 import script: Fix broken links to attachments.
The comments explain this pretty well, but basically because we
rewrite the realm ID during the import process, we need to edit all
the message bodies that link to an attachment to instead link to the
post-processed URL where that file will be hosted on the new server.

Fixes #8926.
2018-04-04 10:05:15 -07:00
Rhea Parekh 5a9cea4134 import script: re map foreign key of UserProfile.last_active_message_id. 2018-04-04 08:53:09 -07:00
Rhea Parekh ed36314042 import script: Fix 're_map_foreign_keys' logging error. 2018-04-04 08:53:09 -07:00
Rhea Parekh 877c7760b7 import script: re_map Attachment foreign keys. 2018-04-04 08:53:09 -07:00
Rhea Parekh 1bba6cc4ce slack importer: Support custom emoji reactions. 2018-04-01 23:24:35 -07:00
Rhea Parekh 00c1f25b58 import script: Support custom emojis.
'processing_emojis' check is added in the 'import_uploads'
function, so that the emoji files present in the to be imported
data file can be uploaded.

The procedure of saving emoji files in slack importer is same as
saving attachments and avatars, and the import has the similar
procedure too.
2018-04-01 23:24:35 -07:00
Rhea Parekh 6f867fee40 import script: Support import of reactions. 2018-04-01 23:24:33 -07:00
Rhea Parekh d147bd25d0 import script: Change file path of the upload in the import script.
In importing avatars, we use the implementation where the 'avatar_path'
is seperately calculated using realm and user ID and then the content
of the path provided in the avatar's 'records.json' are copied to this
'avatar_path'.

Similary, here for the uploads, 's3_file_name' is seperately calculated
using the realm ID and uploaded file name and then the content of the
path provided in upload's 'records.json' are copied to this 's3_file_name'.
2018-04-01 23:04:14 -07:00
Rhea Parekh ff34d07fa0 import script: Add function to update model ids after allocation.
Add function 'update_model_ids' to remove repetitive code.
2018-04-01 22:29:23 -07:00
Rhea Parekh a2ecdeb28d import script: re_map minor foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 078453554e import script: re_map Message foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 93aabcb81c import script: re_map Subscription foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 9ef7870c5a import script: re_map Recipient foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 4537223ba7 import script: re_map UserProfile foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 1314e7d247 import script: re_map Stream foreign keys.
'recipient_field' is added as a bool variable in the function
'update_id_map' to update the recipient foreign keys.

Recipient Foreign Key is equal to the UserProfile ID, if the
type is 1, and the same is equal to Stream ID, if the type is 2.
Hence a check is added in the 'update_id_map' field for this.
2018-04-01 22:29:23 -07:00
Rhea Parekh 8624ba4132 import script: re_map Realm foreign keys.
All the objects with realm ID as the foreign keys need to
be remapped with updated with the allocated ID.
Also the ID of the realm object itself is updated with the allocated
ID.
2018-04-01 22:29:23 -07:00
Rhea Parekh 2b0ee472af import script: Refactor re_map_foreign_keys.
The 'id_field' bool variable is added to the function just to check
if the field is the ID of that object, and not the foreign key relation.
For foreign key field names, a "_id" has to be added after the field name,
however we don't need that for the ID field of the object.
2018-04-01 22:29:23 -07:00
Rhea Parekh cd0871bae4 Import script: Add id allocation functions. 2018-04-01 22:29:23 -07:00
neiljp (Neil Pilgrim) 704c33331c mypy: Add explicit Optional for default=None parameters in export.py. 2018-03-28 12:31:51 -07:00
Rhea Parekh 0f183981e6 Import script: Make sure medium avatars exist during import.
During a slack import, we don't have medium-size avatars already
available in the export data set (and possibly also with a normal
import/export?).  The medium size avatar can be created by the
'ensure_medium_avatar_image' function, which checks if the medium
image exists, and if it doesn't, it creates the image.

This commit was substantially edited by tabbott to get rid of an
undefined variable bug, avoid initializing the upload backend classes
in a loop, and add some TODO notes on things that could be improved
later.
2018-03-01 16:48:06 -08:00
neiljp (Neil Pilgrim) 3cb12230b2 mypy: Annotate email_gateway_bot in export_files_from_s3(). 2018-02-13 11:40:52 -08:00
Greg Price cad4083987 export: Fix an unnecessary Any.
This was introduced a few weeks ago in
ed4054d11 "Import script: Check and add system bots after every import."
2018-01-30 15:34:47 -08:00
Rhea Parekh ed4054d110 Import script: Check and add system bots after every import.
This checks for the existing system bots and adds them if they
aren't included in the import.
2017-12-27 07:52:45 -05:00
greysome fb7ee942c4 mypy: Use Python 3 type syntax in zerver/lib/export.py 2017-12-26 08:30:33 -05:00
Tim Abbott 8b935f4e99 settings: Add setting for SYSTEM_BOT_REALM.
This fixes some subtle JavaScript exceptions we've been getting in
zulipchat.com, caused by the system bot realm there not being "zulip"
interacting with get_cross_realm_users.
2017-11-27 14:46:07 -08:00
rht 3f4bf2d22f zerver/lib: Use python 3 syntax for typing.
Extracted from a larger commit by tabbott because these changes will
not create significant merge conflicts.
2017-11-21 20:56:40 -08:00
rht 09af29b051 zerver/lib: Text-wrap long lines exceeding 110. 2017-11-15 10:58:03 -08:00
rht e311842a1b zerver/lib: Remove inheritance from object. 2017-11-06 08:53:48 -08:00
rht fef7d6ba09 zerver/lib: Remove u prefix from strings.
License: Apache-2.0
Signed-off-by: rht <rhtbot@protonmail.com>
2017-11-03 15:34:37 -07:00
Tim Abbott be619fe881 lint: Wrap many very long lines in the Python codebase.
This decreases the maximum line length in our Python codebase to 130.
2017-10-26 17:31:58 -07:00
Shekh Ataul d239f77966 refactor: Replace mkdir_p functions with Python 3 builtin.
This didn't exist in Python 2, but it does in Python 3, so we get to
reap the rewards of dropping Python 2 support.

Fixes #7082.
2017-10-25 11:06:11 -07:00
derAnfaenger cfadb43b93 codebase: Remove multiple whitespaces after comma. 2017-10-18 10:04:23 -07:00
rht 691598a88b py3: Remove "from six.moves import range".
This is no longer required, since in Python 3, this is what the range
built-in does.
2017-10-17 23:28:14 -07:00
rht 1da3c400e3 realm import: Convert the authentication_methods from list to bitfield.
This properly reflects how this is stored in the DB.

Tweaked by tabbott to use a proper function.
2017-10-17 21:32:20 -07:00
Tim Abbott c69c38b14e export: Fix importing/exporting of user avatars.
We apparently failed to update the export code for handling what
directories avatar files should live in during the earlier process.

Fixes #7052.
2017-10-17 21:15:58 -07:00
Umair Khan 60b8cba7df django: Bump version to 1.11.5. 2017-10-03 08:27:06 -07:00
derAnfaenger d1afab7199 Replace deprecated Logging.warn calls with Logging.warning. 2017-10-02 11:11:42 +02:00
rht 2e12fe5e2e zerver/lib: Remove print_function. 2017-09-27 18:05:45 -07:00
rht f43e54d352 zerver/lib: Remove absolute_import. 2017-09-27 10:00:39 -07:00
Vishnu Ks f3c04f711d lib: Remove unused get_user_profile_by_email import in export.py. 2017-07-20 16:50:23 -07:00
Vishnu Ks 3cbc0cc2eb lib: Use get_system_bot in do_import_realm(export.py). 2017-07-18 17:14:05 -07:00
Rishi Gupta aa845e7f60 models: Replace ScheduledJob with ScheduledEmail.
ScheduledJob was written for much more generality than it ended up being
used for. Currently it is used by send_future_email, and nothing
else. Tailoring the model to emails in particular will make it easier to do
things like selectively clear emails when people unsubscribe from particular
email types, or seamlessly handle using the same email on multiple realms.
2017-07-17 16:05:38 -07:00
Vaida d5517bae36 Delete the old zulip.com "referrals" system.
This system hasn't been in active use for several years, and had some 
problems with it's design.  So it makes sense to just remove it to declutter
the codebase.

Fixes #5655.
2017-07-07 14:59:18 -07:00
Christian Hudon 8ab6a23a30 Fix most strict-optional issues in export.py. 2017-05-24 18:50:59 -07:00