This improves the approach of creating multiple parallel processes by
using subprocess.Popen() instead of run_parallel() and
subprocess.call() while exporting an organization's message
history. This prevents forking twice for individual subprocess.
While this has some performance benefit, the main reason to fix this
is that it fixes an issue with the data export web UI introduced in
run_parallel forks exited).
Fixes#12904.
Preparatory commit for making the email mirror use the database instead
of redis for missed message addresses.
This model will represent missed message email addresses, which
currently have their data stored in redis.
The redis data will be converted and migrated into these models and
the email mirror will start using them in the main commit.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
Apparently, the Zulip notifications (and resulting emails) were
correct, but the download links inside the Zulip UI were incorrectly
not including S3 prefix on the URL, making them not work.
While we're at this, we rewrite the somewhat convoluted previous
system for formatting the data export output.
This feature is intended to cover all of our ways of exporting a
realm, not just the initial "public export" feature, so we should name
things appropriately for that goal.
Additionally, we don't want to include data exports in page_params;
the original implementation was actually buggy and would have.
The conditional block containing the tarball upload logic for both S3
and local uploads was deconstructed and moved to the more appropriate
location within `zerver/lib/upload.py`.
This change is preliminary refactoring in order to improve the test
mocking strategy related to `test_realm_export.py`.
What this allows is the ability to simply mock a return value from
`do_export_realm`. We can then use that value as a dummy url to
ensure a file has been served and can be retrieved.
We add a new model, ArchiveTransaction, to tie archived objects together
in a coherent way, according to the batches in which they are archived.
This enables making a better system for restoring from archive, and it
seems just more sensible to tie the archived objects in this way, rather
the somewhat vague setting of archive_timestamp to each object using
timezone_now().
A unique path was created using the `LOCAL_UPLOADS_DIR` backend, similar
to the code used in `LocalUploadBackend`. The exported tarball was
copied to the directory, and an nginx url was created to serve the file
publicly.
Tweaked by tabbott to output an actual URL.
The upload option will no longer be limited to strictly S3 uploads. This
commit serves as a preliminary step for supporting LOCAL_UPLOADS_DIR as
part of the public only export feature.
This lets us handle directly in our tooling the user experience that
we document for exporting a realm with member consent (before, it
required unpleasant manual work).
This commit serves as the first step in supporting "public export" as a
webapp feature. The refactoring was done as a means to allow calling
the export logic from elsewhere in the codebase.
Follow up on 92dc363. This modifies the ScheduledEmail model
and send_future_email to properly support multiple recipients.
Tweaked by tabbott to add some useful explanatory comments and fix
issues with the migration.
We were using a hardcoded relative path, which doesn't work if you're
not running this from the root of the Zulip checkout.
As part of fixing this, we need to make `LOCAL_UPLOADS_DIR` an
absolute path.
Fixes#11581.
This helps keep the realm.json small and easy to process; previously,
almost the entire size of that file was the analytics data.
We implement this by refactoring the analytics Config objects into a
separate subroutine that writes to a separate file, plus the
corresponding import code.
Manual testing was performed by exporting the 'analytics' realm, and
importing back to a newly created 'test' realm. The 'test' realm was
then exported and the json files were inspected. The data appeared
consistent with no abnormalities.
Fixes: #11220.
This adds a new API for sending basic analytics data (number of users,
number of messages sent) from a Zulip server to the Zulip Cloud
central analytics database, which will make it possible for servers to
elect to have their usage numbers counted in published stats on the
size of the Zulip ecosystem.
Previously, this wasn't an explicit feature of the export tool.
Note that the current version still includes metadata on private
streams and private message recipients, just not their messages.
This should eliminate the need to do manual analytics work when
importing organizations imported/exported using the zulip -> zulip
import/export tools.
At some point as part of the process of supporting renumbering data,
we changed the structure of our file uploads to expect `path` to match
`s3_path`, with both having the relative path within the overall
hierarchy (including the realm ID). This change updates the more
rarely-used S3 export code path to use that model, fixing a crash when
messages reference an Attachment object with a rewritten path_id.
If any user had sent the reply to the welcome bot recommended by our
tutorial, then the Zulip export/import process didn't work properly,
because we weren't including (and then remapping) the recipient ID for
sending PMs to the cross-realm bots. This commit fixes that gap, by
recording the necessary data on the export side, and doing the
appropriate remapping on the import side.
The previous error messages for this were written for a tool only to
be used by a couple people, and didn't make clear what potential
causes were. Tweak these to provide greater clarity about what's
going on.
The main cause of these errors appearing in practice was fixed in
7ea5987e5d, but nothing strongly
prevents a similar issue from being introduced in the future.
Fixes#10078.
In records the IDs like the realm_id and user_profile_id
of 'records.json' should be integers. This was missing in the
S3 backend and this commit fixes that.
Added tests for this as well.
For the emojis, In 'records.json', the record should contain
the attribute 'file_name', which was missing in the S3 backend.
This commit adds this attribute, as well as tests for the
records of uploads, avatars and emojis in both local and S3 backend.
This should help make it explicit whenever we add a new table to Zulip
that we need to correctly categorize it for whether it will be
included in the data export, or not.
Export of RealmEmoji should also include the image
file of those emojis.
Here, we export emojis both for local and S3 backend
in a method with is similar to attachments and avatars.
Added tests for the same.
This reflects the fact that these are just defensive programming (we
don't expect them to ever happen) and also nicely makes these lines
not show up in our missing test coverage reports.
Deletion of medium sized image is done if it exists before calling the
function 'ensure_medium_avatar_image', to avoid potentially confusing
problems with left-over medium-size avatar images from a previous run
being used when repeatedly importing the same realm in a development
environment..
Fixes#8949.
The comments explain this pretty well, but basically because we
rewrite the realm ID during the import process, we need to edit all
the message bodies that link to an attachment to instead link to the
post-processed URL where that file will be hosted on the new server.
Fixes#8926.
'processing_emojis' check is added in the 'import_uploads'
function, so that the emoji files present in the to be imported
data file can be uploaded.
The procedure of saving emoji files in slack importer is same as
saving attachments and avatars, and the import has the similar
procedure too.
In importing avatars, we use the implementation where the 'avatar_path'
is seperately calculated using realm and user ID and then the content
of the path provided in the avatar's 'records.json' are copied to this
'avatar_path'.
Similary, here for the uploads, 's3_file_name' is seperately calculated
using the realm ID and uploaded file name and then the content of the
path provided in upload's 'records.json' are copied to this 's3_file_name'.
'recipient_field' is added as a bool variable in the function
'update_id_map' to update the recipient foreign keys.
Recipient Foreign Key is equal to the UserProfile ID, if the
type is 1, and the same is equal to Stream ID, if the type is 2.
Hence a check is added in the 'update_id_map' field for this.
All the objects with realm ID as the foreign keys need to
be remapped with updated with the allocated ID.
Also the ID of the realm object itself is updated with the allocated
ID.
The 'id_field' bool variable is added to the function just to check
if the field is the ID of that object, and not the foreign key relation.
For foreign key field names, a "_id" has to be added after the field name,
however we don't need that for the ID field of the object.
During a slack import, we don't have medium-size avatars already
available in the export data set (and possibly also with a normal
import/export?). The medium size avatar can be created by the
'ensure_medium_avatar_image' function, which checks if the medium
image exists, and if it doesn't, it creates the image.
This commit was substantially edited by tabbott to get rid of an
undefined variable bug, avoid initializing the upload backend classes
in a loop, and add some TODO notes on things that could be improved
later.
This fixes some subtle JavaScript exceptions we've been getting in
zulipchat.com, caused by the system bot realm there not being "zulip"
interacting with get_cross_realm_users.
ScheduledJob was written for much more generality than it ended up being
used for. Currently it is used by send_future_email, and nothing
else. Tailoring the model to emails in particular will make it easier to do
things like selectively clear emails when people unsubscribe from particular
email types, or seamlessly handle using the same email on multiple realms.
This system hasn't been in active use for several years, and had some
problems with it's design. So it makes sense to just remove it to declutter
the codebase.
Fixes#5655.
boto's stubs have been updated in mypy 0.4.7, which has given us
more information about what type of strings are expected as
parameters in various functions.
This is some of the code we'd need if we wanted to have Zulip generate
avatars for things. Since it is so little useful code, and it's not
clear we will need this feature ever, we can remove this code to make
the codebase less confusing. It'd be easy to dig this out of history
if we ever want it.
Fixes#2101.