Commit Graph

376 Commits

Author SHA1 Message Date
Rhea Parekh b3f951d2cf import script: User profile ids should be allocated before allocating bot ids. 2018-04-07 13:28:33 +05:30
Rhea Parekh 2baa9bc16e Import: Add subdomain in the import script.
Also remove user input of subdomain in the slack data
conversion script.
2018-04-06 09:12:56 -07:00
Rhea Parekh f4ad464d82 import script: Fix broken links to attachments.
The comments explain this pretty well, but basically because we
rewrite the realm ID during the import process, we need to edit all
the message bodies that link to an attachment to instead link to the
post-processed URL where that file will be hosted on the new server.

Fixes #8926.
2018-04-04 10:05:15 -07:00
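The link rewrite described in this commit could look roughly like the sketch below. It assumes a hypothetical `path_map` from the attachment paths recorded in the export to the paths used after the realm ID is rewritten; the function name and example paths are illustrative and not taken from the import script.

```python
from typing import Dict


def fix_upload_links(content: str, path_map: Dict[str, str]) -> str:
    """Rewrite every old attachment path found in a message body.

    `path_map` (hypothetical) maps the path recorded in the export to the
    path where the file is hosted once the realm ID has been rewritten.
    """
    for old_path, new_path in path_map.items():
        content = content.replace(old_path, new_path)
    return content


# Illustrative paths only.
path_map = {"/user_uploads/1/ab/file.txt": "/user_uploads/7/ab/file.txt"}
body = "See [file.txt](/user_uploads/1/ab/file.txt)"
fixed = fix_upload_links(body, path_map)
```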
Rhea Parekh 5a9cea4134 import script: re map foreign key of UserProfile.last_active_message_id. 2018-04-04 08:53:09 -07:00
Rhea Parekh ed36314042 import script: Fix 're_map_foreign_keys' logging error. 2018-04-04 08:53:09 -07:00
Rhea Parekh 877c7760b7 import script: re_map Attachment foreign keys. 2018-04-04 08:53:09 -07:00
Rhea Parekh 1bba6cc4ce slack importer: Support custom emoji reactions. 2018-04-01 23:24:35 -07:00
Rhea Parekh 00c1f25b58 import script: Support custom emojis.
A 'processing_emojis' check is added to the 'import_uploads'
function so that emoji files present in the data being imported
can be uploaded.

The Slack importer saves emoji files the same way it saves
attachments and avatars, and the import follows a similar
procedure.
2018-04-01 23:24:35 -07:00
Rhea Parekh 6f867fee40 import script: Support import of reactions. 2018-04-01 23:24:33 -07:00
Rhea Parekh d147bd25d0 import script: Change file path of the upload in the import script.
In importing avatars, we use the implementation where the 'avatar_path'
is separately calculated using realm and user ID and then the content
of the path provided in the avatar's 'records.json' is copied to this
'avatar_path'.

Similarly, here for the uploads, 's3_file_name' is separately calculated
using the realm ID and uploaded file name and then the content of the
path provided in the upload's 'records.json' is copied to this 's3_file_name'.
2018-04-01 23:04:14 -07:00
Rhea Parekh ff34d07fa0 import script: Add function to update model ids after allocation.
Add function 'update_model_ids' to remove repetitive code.
2018-04-01 22:29:23 -07:00
Rhea Parekh a2ecdeb28d import script: re_map minor foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 078453554e import script: re_map Message foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 93aabcb81c import script: re_map Subscription foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 9ef7870c5a import script: re_map Recipient foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 4537223ba7 import script: re_map UserProfile foreign keys. 2018-04-01 22:29:23 -07:00
Rhea Parekh 1314e7d247 import script: re_map Stream foreign keys.
'recipient_field' is added as a bool variable in the function
'update_id_map' to update the recipient foreign keys.

The Recipient foreign key is the UserProfile ID if the type is 1,
and the Stream ID if the type is 2. Hence a check for this is
added in the 'update_id_map' function.
2018-04-01 22:29:23 -07:00
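A minimal sketch of the type-dependent re-mapping described in this commit. The type values 1 and 2 come from the commit message; the dictionary keys (`type`, `type_id`), the constant names and the `id_maps` layout are assumptions made for illustration.

```python
from typing import Dict

# Recipient type values as stated in the commit message.
PERSONAL = 1  # type_id refers to a UserProfile ID
STREAM = 2    # type_id refers to a Stream ID


def remap_recipient(recipient: Dict[str, int],
                    id_maps: Dict[str, Dict[int, int]]) -> None:
    """Replace recipient['type_id'] with the newly allocated ID, in place."""
    if recipient["type"] == PERSONAL:
        table = "user_profile"
    elif recipient["type"] == STREAM:
        table = "stream"
    else:
        return  # other recipient types are out of scope for this sketch
    recipient["type_id"] = id_maps[table][recipient["type_id"]]


# The same old ID maps to different new IDs depending on the type.
id_maps = {"user_profile": {10: 501}, "stream": {10: 902}}
personal = {"type": PERSONAL, "type_id": 10}
stream = {"type": STREAM, "type_id": 10}
remap_recipient(personal, id_maps)
remap_recipient(stream, id_maps)
```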
Rhea Parekh 8624ba4132 import script: re_map Realm foreign keys.
All the objects with realm ID as a foreign key need to
be remapped to the allocated ID.
Also the ID of the realm object itself is updated with the allocated
ID.
2018-04-01 22:29:23 -07:00
Rhea Parekh 2b0ee472af import script: Refactor re_map_foreign_keys.
The 'id_field' bool variable is added to the function just to check
if the field is the ID of that object, and not the foreign key relation.
For foreign key field names, a "_id" has to be added after the field name,
however we don't need that for the ID field of the object.
2018-04-01 22:29:23 -07:00
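The `id_field` distinction can be sketched as follows. The helper name, its signature and the row layout are hypothetical; only the "_id" suffix rule comes from the commit message.

```python
from typing import Any, Dict, List


def re_map_field(rows: List[Dict[str, Any]], field_name: str,
                 id_map: Dict[int, int], id_field: bool = False) -> None:
    """Rewrite one column of exported rows using old-ID -> new-ID pairs.

    Foreign keys are stored under "<field_name>_id"; the object's own
    primary key is stored under the bare field name, hence `id_field`.
    """
    key = field_name if id_field else field_name + "_id"
    for row in rows:
        old_id = row[key]
        # IDs missing from the map are left untouched in this sketch.
        row[key] = id_map.get(old_id, old_id)


rows = [{"id": 1, "realm_id": 3}]
re_map_field(rows, "realm", {3: 8})               # foreign key column
re_map_field(rows, "id", {1: 40}, id_field=True)  # the object's own ID
```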
Rhea Parekh cd0871bae4 Import script: Add id allocation functions. 2018-04-01 22:29:23 -07:00
neiljp (Neil Pilgrim) 704c33331c mypy: Add explicit Optional for default=None parameters in export.py. 2018-03-28 12:31:51 -07:00
Rhea Parekh 0f183981e6 Import script: Make sure medium avatars exist during import.
During a slack import, we don't have medium-size avatars already
available in the export data set (and possibly also with a normal
import/export?).  The medium size avatar can be created by the
'ensure_medium_avatar_image' function, which checks if the medium
image exists, and if it doesn't, it creates the image.

This commit was substantially edited by tabbott to get rid of an
undefined variable bug, avoid initializing the upload backend classes
in a loop, and add some TODO notes on things that could be improved
later.
2018-03-01 16:48:06 -08:00
neiljp (Neil Pilgrim) 3cb12230b2 mypy: Annotate email_gateway_bot in export_files_from_s3(). 2018-02-13 11:40:52 -08:00
Greg Price cad4083987 export: Fix an unnecessary Any.
This was introduced a few weeks ago in
ed4054d11 "Import script: Check and add system bots after every import."
2018-01-30 15:34:47 -08:00
Rhea Parekh ed4054d110 Import script: Check and add system bots after every import.
This checks for the existing system bots and adds them if they
aren't included in the import.
2017-12-27 07:52:45 -05:00
greysome fb7ee942c4 mypy: Use Python 3 type syntax in zerver/lib/export.py 2017-12-26 08:30:33 -05:00
Tim Abbott 8b935f4e99 settings: Add setting for SYSTEM_BOT_REALM.
This fixes some subtle JavaScript exceptions we've been getting in
zulipchat.com, caused by the system bot realm there not being "zulip"
interacting with get_cross_realm_users.
2017-11-27 14:46:07 -08:00
rht 3f4bf2d22f zerver/lib: Use python 3 syntax for typing.
Extracted from a larger commit by tabbott because these changes will
not create significant merge conflicts.
2017-11-21 20:56:40 -08:00
rht 09af29b051 zerver/lib: Text-wrap long lines exceeding 110. 2017-11-15 10:58:03 -08:00
rht e311842a1b zerver/lib: Remove inheritance from object. 2017-11-06 08:53:48 -08:00
rht fef7d6ba09 zerver/lib: Remove u prefix from strings.
License: Apache-2.0
Signed-off-by: rht <rhtbot@protonmail.com>
2017-11-03 15:34:37 -07:00
Tim Abbott be619fe881 lint: Wrap many very long lines in the Python codebase.
This decreases the maximum line length in our Python codebase to 130.
2017-10-26 17:31:58 -07:00
Shekh Ataul d239f77966 refactor: Replace mkdir_p functions with Python 3 builtin.
This didn't exist in Python 2, but it does in Python 3, so we get to
reap the rewards of dropping Python 2 support.

Fixes #7082.
2017-10-25 11:06:11 -07:00
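The built-in referred to here is presumably `os.makedirs` with `exist_ok=True` (an assumption; the commit message does not name it). A short sketch of the behaviour that replaces a hand-written `mkdir_p` helper:

```python
import os
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, "exports", "avatars")

# exist_ok=True creates intermediate directories and does not raise when
# the target already exists, which is what a mkdir_p helper provided.
os.makedirs(target, exist_ok=True)
os.makedirs(target, exist_ok=True)  # second call is a no-op
```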
derAnfaenger cfadb43b93 codebase: Remove multiple whitespaces after comma. 2017-10-18 10:04:23 -07:00
rht 691598a88b py3: Remove "from six.moves import range".
This is no longer required, since in Python 3, this is what the range
built-in does.
2017-10-17 23:28:14 -07:00
rht 1da3c400e3 realm import: Convert the authentication_methods from list to bitfield.
This properly reflects how this is stored in the DB.

Tweaked by tabbott to use a proper function.
2017-10-17 21:32:20 -07:00
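A generic sketch of packing a list of enabled flags into an integer bitfield. The flag names, their order and the function name are hypothetical; the real conversion depends on how the realm model defines its flags.

```python
from typing import List


def flags_to_bitfield(all_flags: List[str], enabled: List[str]) -> int:
    """Pack named flags into an integer, one bit per flag.

    The position of a name in `all_flags` decides which bit it occupies.
    """
    value = 0
    for position, name in enumerate(all_flags):
        if name in enabled:
            value |= 1 << position
    return value


# Hypothetical flag names, for illustration only.
ALL_METHODS = ["Google", "Email", "GitHub", "LDAP"]
packed = flags_to_bitfield(ALL_METHODS, ["Email", "LDAP"])
```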
Tim Abbott c69c38b14e export: Fix importing/exporting of user avatars.
We apparently failed to update the export code for handling what
directories avatar files should live in during the earlier process.

Fixes #7052.
2017-10-17 21:15:58 -07:00
Umair Khan 60b8cba7df django: Bump version to 1.11.5. 2017-10-03 08:27:06 -07:00
derAnfaenger d1afab7199 Replace deprecated Logging.warn calls with Logging.warning. 2017-10-02 11:11:42 +02:00
rht 2e12fe5e2e zerver/lib: Remove print_function. 2017-09-27 18:05:45 -07:00
rht f43e54d352 zerver/lib: Remove absolute_import. 2017-09-27 10:00:39 -07:00
Vishnu Ks f3c04f711d lib: Remove unused get_user_profile_by_email import in export.py. 2017-07-20 16:50:23 -07:00
Vishnu Ks 3cbc0cc2eb lib: Use get_system_bot in do_import_realm(export.py). 2017-07-18 17:14:05 -07:00
Rishi Gupta aa845e7f60 models: Replace ScheduledJob with ScheduledEmail.
ScheduledJob was written for much more generality than it ended up being
used for. Currently it is used by send_future_email, and nothing
else. Tailoring the model to emails in particular will make it easier to do
things like selectively clear emails when people unsubscribe from particular
email types, or seamlessly handle using the same email on multiple realms.
2017-07-17 16:05:38 -07:00
Vaida d5517bae36 Delete the old zulip.com "referrals" system.
This system hasn't been in active use for several years, and had some
problems with its design.  So it makes sense to just remove it to declutter
the codebase.

Fixes #5655.
2017-07-07 14:59:18 -07:00
Christian Hudon 8ab6a23a30 Fix most strict-optional issues in export.py. 2017-05-24 18:50:59 -07:00
Christian Hudon 14e871ce9c Change order of arguments so output_dir is not optional. Helps mypy too. 2017-05-24 17:32:21 -07:00
Konstantin Gukov dd76222a3f Fetch system bots using new get_system_bot function.
This eliminates a bunch of uninteresting calls to
get_user_profile_by_email.
2017-05-23 10:30:40 -07:00
Aditya Bansal b822e75a4b pep8: Add compliance with rule E261 to export.py. 2017-05-18 03:00:32 +05:30
hackerkid c4f0fa97a8 Replace timezone.make_aware with timezone_make_aware. 2017-04-16 12:28:56 -07:00
hackerkid 6ddee006bd Replace timezone.is_naive with timezone_is_naive. 2017-04-16 12:28:56 -07:00
hackerkid 55c3d12078 Replace timezone.utc with timezone_utc. 2017-04-16 12:28:56 -07:00
Harshit Bansal ac2172e233 models: Rename RealmAlias model to RealmDomain.
Includes a migration.
2017-04-04 15:48:03 -07:00
Rishi Gupta 3aae6cd421 Change if(realm.domain == zulip.com) checks to use Realm.string_id. 2017-03-13 14:17:14 -07:00
Rishi Gupta 5dc683ba8d Use Realm.string_id instead of Realm.domain when logging. 2017-03-13 09:42:14 -07:00
Raghav Jajodia a3a03bd6a5 mypy: Added Dict, List and Set imports.
Fixed mypy errors associated with the upgrade.
2017-03-04 14:33:44 -08:00
Rishi Gupta 95f5c96bec Canonicalize how we convert timestamps to UTC datetimes.
No change in behavior with this commit, just making it easier to write a
future lint rule.
2017-03-01 23:03:56 -08:00
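One canonical way to turn a UNIX timestamp into a timezone-aware UTC datetime with the standard library is sketched below. The helper name is an assumption; the commit message does not say which helper the codebase settled on.

```python
import datetime


def timestamp_to_utc_datetime(timestamp: float) -> datetime.datetime:
    """Convert seconds since the epoch to a timezone-aware UTC datetime."""
    return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)


epoch = timestamp_to_utc_datetime(0)
```

Routing every conversion through a single helper like this is what makes a lint rule against ad hoc conversions practical.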
Eklavya Sharma dd0e1f6a4c Use correct string type in boto function parameters.
boto's stubs have been updated in mypy 0.4.7, which has given us
more information about what type of strings are expected as
parameters in various functions.
2017-02-06 22:37:37 -08:00
Tim Abbott 4e171ce787 lint: Clean up E126 PEP-8 rule. 2017-01-23 22:06:13 -08:00
Tim Abbott d6e38e2a5c lint: Clean up E123 PEP-8 rule. 2017-01-23 21:34:26 -08:00
Tim Abbott e9158dd520 lint: Clean up E121 PEP-8 rule. 2017-01-23 21:02:39 -08:00
JefftheBest1 a549ed6e65 Removed accommodate typos 2017-01-12 04:53:31 -08:00
Rishi Gupta cf762eaf84 Change X.realm.id to X.realm_id across codebase.
This makes the pattern in the Zulip codebase clearer, and thus
decreases the risk of accidentally doing database queries.
2017-01-03 16:46:26 -08:00
nikolay abc2ff4a06 pep8: Fix many rule E128 violations.
[Tweaked by tabbott to adjust some approaches used in wrapping]
2016-12-03 13:33:31 -08:00
Sidhant Bhavnani 8c0c12c1d9 pep8: Fix E303 violations. 2016-12-02 15:34:11 -08:00
Alex Huang c8ddea16c3 pep8: Fix E122. 2016-12-01 23:16:35 -08:00
Rafid Aslam 41bd88d5ed pep8: Fix E301 pep8 violations.
Fix "E301: expected (1 or 2) blank line" pep8 violations.
2016-11-29 08:51:44 -08:00
Rafid Aslam 7a2282986a pep8: Fix E225 pep8 violations. 2016-11-28 15:21:15 -08:00
Umair Khan cfded8b5af Django 1.10: Resolve QuerySet returned by model_to_dict.
Django 1.10 resolves ManyToManyField into a QuerySet in model_to_dict.
This commit further resolves the QuerySet to the primary keys.
2016-11-09 15:29:58 -08:00
Tim Abbott 5bea2f5e20 Remove unused AVATAR_FROM_SYSTEM code.
This is some of the code we'd need if we wanted to have Zulip generate
avatars for things.  Since it is so little useful code, and it's not
clear we will need this feature ever, we can remove this code to make
the codebase less confusing.  It'd be easy to dig this out of history
if we ever want it.

Fixes #2101.
2016-10-22 19:48:50 -07:00
Tim Abbott 22fd7ba02a avatar: Move avatar hash computations to their own file. 2016-10-02 21:19:10 -07:00
Sahil Dua 058587da77 Remove extra new lines at the ends of Zulip authored files.
Fixes #1627.

[tweaked by tabbott to avoid patching third-party modules, for now]
2016-09-26 21:05:24 -07:00
Steve Howell ffec98d85c Annotate zerver/lib/export.py.
This adds the remaining annotations to lib/export.py.
2016-09-12 08:21:46 -07:00
Steve Howell 62dd86bcce mypy: Set Path to str instead of text_type in export.py.
Using text_type for Path just breaks a lot of calls to
the core Python libraries that still want strings.
2016-09-12 08:21:46 -07:00
Tim Abbott f0a65fe52a export_messages_single_user: Fix missing order_by.
This fixes a nasty bug where exporting messages sent by a single user
might only contain some of the messages in the event that the
unspecified sort order by the database didn't happen to be sorted by
message ID.
2016-08-17 22:18:58 -07:00
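One way an unordered query can drop rows is ID-based batching, sketched here with `sqlite3`. That the export used exactly this loop is an assumption (the real code goes through the Django ORM), and the table and function names are illustrative.

```python
import sqlite3
from typing import Iterator, List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany("INSERT INTO message VALUES (?, ?)",
                 [(i, "msg %d" % i) for i in (5, 1, 4, 2, 3)])


def export_in_chunks(chunk_size: int) -> Iterator[List[int]]:
    """Yield message IDs in batches, resuming after the last ID seen."""
    min_id = -1
    while True:
        rows = conn.execute(
            # ORDER BY guarantees each batch holds the lowest remaining IDs,
            # so advancing min_id to the batch maximum never skips a row.
            "SELECT id FROM message WHERE id > ? ORDER BY id LIMIT ?",
            (min_id, chunk_size)).fetchall()
        if not rows:
            break
        ids = [row[0] for row in rows]
        yield ids
        min_id = max(ids)


chunks = list(export_in_chunks(2))
```

Without the `ORDER BY`, a batch could contain a high ID early on, and every lower ID not yet fetched would then be skipped.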
Steve Howell c12bd853f7 export: Add basic export tests. (fixes #1584) 2016-08-16 13:38:37 -07:00
Steve Howell ab053845aa export: Fetch messages with two queries. 2016-08-16 13:38:37 -07:00
Steve Howell a5f651b81a export: Add some user_id filtering to do_export_realm().
This commit only addresses tables that currently derive from
user_profile_config in get_realm_config:

   zerver_userpresence
   zerver_useractivity
   zerver_useractivityinterval
   zerver_subscription
   zerver_recipient
   zerver_stream
   zerver_huddle

It also introduces an entry in realm.json for a virtual
table called "zerver_userprofile_mirrordummy" for dummy users,
which include prior dummy users and users excluded from the call
to do_export_realm().

Note that this feature is not yet exposed in the management command.
2016-08-16 13:38:37 -07:00
Tim Abbott 4a46b879ee import_uploads_s3: Fix setting of content-type. 2016-08-13 11:26:53 -07:00
Tim Abbott 8edc5110cd export: Set re_map_foreign_keys verbose default to False.
Otherwise, this is super spammy when doing a large import.
2016-08-13 11:26:16 -07:00
Tim Abbott 8b170665e4 export: Add assertion that ./manage.py exists in current directory.
Otherwise we'll fail starting the UserMessage export, later on.
2016-08-13 10:13:11 -07:00
Tim Abbott 856ab48ec6 export: Fix stream export sanity check. 2016-08-13 10:02:34 -07:00
Steve Howell e932b03999 export: Clean up path.join/makedirs for avatars/uploads. 2016-08-13 09:31:49 -07:00
Steve Howell 12c25e7d5c export: Filter attachments by message id. 2016-08-13 09:31:49 -07:00
Steve Howell 0f493d5000 export: Return msg ids from export_partial_message_files(). 2016-08-13 09:31:49 -07:00
Steve Howell 0c3e98fa91 export: Introduce attachment.json file.
Now attachment data gets written to its own json file.  We are
splitting this out so that it will be easier for us to cross-check
attachments against messages without holding up writing a lot
of the other realm data.  (message cross-checking is coming soon)
2016-08-12 18:59:14 -07:00
Steve Howell ea0a7d87c8 export: Refactor how we fetch attachment data.
This commit doesn't change any behavior; it just moves fetching
attachments out of the Config scheme and into its own method.
This prepares us to start writing attachment data to its own
file and cross-checking against message ids (coming soon).
2016-08-12 18:59:14 -07:00
Steve Howell fba7a9ca21 export: Unify top-down export configuration.
We now just have a single configuration get_realm_config() that
handles most of the top-down realm export tables.  (It basically
does everything not related to messages or uploads/avatars.)

Unifying the configs allows us to be more strict in our
configuration about checking for anomalies.  In the future
we may need to loosen up some of those restrictions again,
but for now we are picky and paranoid.
2016-08-12 15:27:23 -07:00
Steve Howell 5a5353b846 export: Fetch stream data only for stream recipients.
Fetch stream data only for stream recipients, instead of
getting streams via realm_id.

(This change is kind of moot for now, since our stream recipients
include all possible stream recipients in the realm, but this
sets us up for when we start restricting users that we export
within the realm.)
2016-08-12 15:27:22 -07:00
Steve Howell 7a429d1e30 export: Add sanity_check_stream_data(). 2016-08-12 15:27:22 -07:00
Steve Howell ec86e475b4 export: Add Config.post_process_data 2016-08-12 15:12:01 -07:00
Steve Howell 0c2c331905 export: Flip how we fetch stream subscription data.
We now get stream subscriptions BEFORE stream recipients.
2016-08-12 15:12:01 -07:00
Steve Howell 70a916aae3 export: Flip how we fetch user subscription data.
We now get user subscriptions BEFORE user recipients.
2016-08-12 15:12:01 -07:00
Steve Howell 2a2ce6ada1 export: Remove hard-to-maintain code comment.
Subsequent changes are gonna make the top-down/bottom-up comment
no longer valid.
2016-08-12 15:12:01 -07:00
Steve Howell 6fdd42c08b export: Create convenient soft links. 2016-08-12 10:48:33 -07:00
Steve Howell 70b68ddcc3 export: Use a config for export_single_user(). 2016-08-12 10:37:41 -07:00
Steve Howell c69a5bdec3 export: Handle more tables via export_from_config().
This commit introduces the ability to do custom fetches
and to essentially use temp tables for intermediate results.

(The temp table stuff deals with recipients/subscriptions
having three different flavors--user, stream, and huddle.)
2016-08-12 10:37:35 -07:00
Steve Howell f471a1779e export: Handle simple exports with export_from_config().
This handles the simple tables that don't need custom fetches.
2016-08-12 09:54:57 -07:00
Steve Howell 682155778d export: Add export_with_config().
Subsequent commits will start to use this.
2016-08-12 09:54:57 -07:00
Steve Howell b0e6d20321 export: Write stats.txt for `./manage.py export <realm>`. 2016-08-12 09:06:10 -07:00
Steve Howell df3aa39be3 export: Extract write_data_to_file(). 2016-08-11 15:51:22 -07:00
Steve Howell f29b32bbb2 export: Clarify message exporting code.
The function to create the message partial files has been
renamed to export_partial_message_files().  It now gets its own
list of user profile ids and recipient ids from the response,
so that we can de-clutter do_export_realm().
2016-08-11 15:51:22 -07:00
Steve Howell 5cd915694a export: Extract launch_user_message_subprocesses().
This is the last in a series of commits that makes it
so that do_export_realm() mostly delegates work out
to other functions.
2016-08-11 15:21:30 -07:00
Steve Howell b383f5ca5d export: Extract fetch_user_profile_cross_realm(). 2016-08-11 15:21:30 -07:00
Steve Howell fee2106c6f export: Extract fetch_huddle_objects().
This also removes the dead codepath for include_private=False.
2016-08-11 15:21:30 -07:00
Steve Howell a6235f6a60 export: Add comments to export_single_user().
(This is a bit of a prefactoring to hopefully create a nice
diff in a subsequent commit.)
2016-08-11 15:21:30 -07:00
Steve Howell 6e7fe76cf4 export: s/avatar_bucket/processing_avatars
The name avatar_bucket was confusing for a boolean, and
in some places it was used for non-S3 paths.

I considered the more concise 'is_avatar', but that
was still confusing when you are processing multiple
files, because you think it's a calculated property
on one file instead of an overall codepath switch.

I also considered splitting up some functions, but
there is a lot of common logic between handling
file uploads and avatars that's not trivial to extract
into helpers, especially on the S3 side.
2016-08-11 15:21:30 -07:00
Steve Howell 3dab366733 export: Clean up names of upload/avatar export functions.
I did some minor moving around of code that made us have
one fewer function without any additional conditional
logic. The names are more explicit about saying
"from_local" and "from_s3".  Also, there is less clutter
now in do_export_realm(), which is evolving into more of
a dispatcher and less of a worker.
2016-08-11 15:21:30 -07:00
Steve Howell d62a351107 export: Add sanity_check_output(). 2016-08-11 15:21:30 -07:00
Steve Howell 06b0df5efc export: Remove spurious select_related() call for Client. 2016-08-10 14:16:17 -07:00
Steve Howell cb59a11f0a export: Extract get_primary_ids(). 2016-08-10 14:16:17 -07:00
Steve Howell 90e9083b81 export: Extract filter_by_realm(). 2016-08-10 14:16:17 -07:00
Steve Howell 4b6b1b8ad4 export: Extract filter_by_users(). 2016-08-10 14:16:17 -07:00
Steve Howell db9edfce34 export: Use DATE_FIELDS in fix_datetime_fields().
Now we only call this once per table and use DATE_FIELDS to
look up the data fields.
2016-08-10 14:16:17 -07:00
Steve Howell 35c59fc4d7 export: Clean up export_messages().
This is pretty minor cleanup, but it makes it a little more
explicit what we're writing to the shard file, and it allows
us to use a more specific mypy type when calling
floatify_datetime_fields.
2016-08-10 14:16:17 -07:00
Steve Howell 1d1f36c0b8 export: Always use subprocesses to export UserMessage.
We no longer have an in-process code path to export
UserMessage rows.  We want to only maintain the
subprocess code, which we'll always use in production,
and which will work fine in dev.
2016-08-10 14:16:17 -07:00
Steve Howell 78bbefbf94 export: Create import_attachments. 2016-08-10 14:16:17 -07:00
Steve Howell 7ec6a394fe export: Filter Attachment objects by realm. 2016-08-09 16:47:14 -07:00
Steve Howell cecfaa7761 export: Extract import_message_data(). 2016-08-09 16:47:14 -07:00
Steve Howell 5386ed280e export: Extract update_id_map().
We also use a vanilla dictionary instead of a defaultdict, so
that we explicitly initialize what tables are being re-mapped.
2016-08-09 16:47:14 -07:00
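The benefit of a plain dictionary over a `defaultdict` can be sketched like this. The table names and the exact signature of `update_id_map` are assumptions for illustration.

```python
from typing import Dict

# Only tables listed here may be re-mapped; the names are illustrative.
id_maps: Dict[str, Dict[int, int]] = {"client": {}, "user_profile": {}}


def update_id_map(table: str, old_id: int, new_id: int) -> None:
    """Record an old-ID -> new-ID pair for an explicitly registered table."""
    if table not in id_maps:
        raise KeyError("table %r was not initialized for re-mapping" % table)
    id_maps[table][old_id] = new_id


update_id_map("client", 3, 17)
try:
    update_id_map("cleint", 4, 18)  # typo in the table name
    typo_detected = False
except KeyError:
    typo_detected = True
```

With a `defaultdict`, the misspelled table name would silently create a new, unused map instead of failing.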
Steve Howell 217ef8a4d2 export: Split fix_foreign_keys() into two functions.
We now have convert_to_id_fields for the simple case, and
re_map_foreign_keys for the more complex case. I also
renamed some parameters and variables.
2016-08-09 16:47:14 -07:00
Steve Howell dd88ffccfd export: Extract make_raw() in lib/export.py. 2016-08-09 15:58:27 -07:00
Steve Howell 09fa343bdd export: Use DATE_FIELDS in floatify_datetime_fields.
This avoids a little bit of code duplication, plus it should
make it a little easier to add new date fields in the future.
2016-08-09 15:58:27 -07:00
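A sketch of the table-driven approach this commit describes. `DATE_FIELDS` and `floatify_datetime_fields` are named in the commit message, but the table entries, column names and signature shown here are assumptions.

```python
import datetime
from typing import Any, Dict, List

# Table name -> datetime columns. The entries here are illustrative.
DATE_FIELDS: Dict[str, List[str]] = {
    "zerver_message": ["pub_date"],
    "zerver_userprofile": ["date_joined", "last_login"],
}


def floatify_datetime_fields(data: Dict[str, List[Dict[str, Any]]],
                             table: str) -> None:
    """Replace datetime values with float UNIX timestamps, in place."""
    for row in data[table]:
        for field in DATE_FIELDS[table]:
            value = row[field]
            if value is not None:
                row[field] = value.timestamp()


one_day = datetime.datetime(1970, 1, 2, tzinfo=datetime.timezone.utc)
data = {"zerver_message": [{"id": 1, "pub_date": one_day}]}
floatify_datetime_fields(data, "zerver_message")
```

Adding a new date column then only requires a new entry in `DATE_FIELDS`.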
Steve Howell c14ab3c91f export: Add annotations to zerver/lib/export.py.
I also fixed some small things like removing unnecessary return
statements, and adding a TODO.

In some cases I explicitly cast stuff at run-time to set() or
str() to appease mypy, as well as make it clear to somebody
reading the code that the callee might not respect ordering
or tolerate unicode.
2016-08-09 15:58:27 -07:00
Steve Howell f18cc4ae3a export: Added export_avatars_local_helper(). 2016-08-09 15:58:27 -07:00
Tim Abbott 6264ff7039 Add new Zulip realm import/export tool.
The previous export tool would only work properly for small realms,
and was missing a number of important features:
* Export of avatars and uploads from S3
* Export of presence data, activity data, etc.
* Faithful export/import of timestamps
* Parallel export of messages
* Not OOM killing for large realms

The new tool runs as a pair of documented management commands, and
solves all of those problems.

Also we add a new management command for exporting the data of an
individual user.
2016-08-08 14:58:18 -07:00