Commit Graph

14553 Commits

Author SHA1 Message Date
Steve Howell b2d83a8300 tests: Split out SingleUserExportTest.
This is mostly moving code, plus I now just
call shutil.rmtree directly.
2021-12-11 13:06:41 -05:00
Tim Abbott ee77c6365a portico: Use /help/ style pages for displaying policies.
This replaces the TERMS_OF_SERVICE and PRIVACY_POLICY settings with
just a POLICIES_DIRECTORY setting, in order to support settings (like
Zulip Cloud) where there's more policies than just those two.

With minor changes by Eeshan Garg.
2021-12-10 17:56:12 -08:00
Tim Abbott 95854d9d94 terms: Rename and tweak FIRST_TIME_TERMS_OF_SERVICE_TEMPLATE.
We do s/TOS/TERMS_OF_SERVICE/ on the name, and while we're at it,
remove the assumed zerver/ namespace for the template, which isn't
correct -- Zulip Cloud related content should be in the corporate/
directory.
2021-12-10 17:56:12 -08:00
Tim Abbott 31842e1377 export: Fix empty realm_icons directory in single-user exports. 2021-12-10 12:05:34 -08:00
Tim Abbott 7dd543bee5 export_single_user: Fix extra leading tmp/ in tarball. 2021-12-10 12:05:34 -08:00
Steve Howell 2902f8b931 tests: Ensure stream senders get a UserMessage row.
We now complain if a test author sends a stream message
that does not result in the sender getting a
UserMessage row for the message.

This is basically 100% equivalent to complaining that
the author failed to subscribe the sender to the stream
as part of the test setup, as far as I can tell, so the
AssertionError instructs the author to subscribe the
sender to the stream.

We exempt bots from this check, although it is
plausible we should only exempt the system bots like
the notification bot.

I considered auto-subscribing the sender to the stream,
but that can be a little more expensive than the
current check, and we generally want test setup to be
explicit.

If there is some legitimate way than a subscribed human
sender can't get a UserMessage, then we probably want
an explicit test for that, or we may want to change the
backend to just write a UserMessage row in that
hypothetical situation.

For most tests, including almost all the ones fixed
here, the author just wants their test setup to
realistically reflect normal operation, and often devs
may not realize that Cordelia is not subscribed to
Denmark or not realize that Hamlet is not subscribed to
Scotland.

Some of us don't remember our Shakespeare from high
school, and our stream subscriptions don't even
necessarily reflect which countries the Bard placed his
characters in.

There may also be some legitimate use case where an
author wants to simulate sending a message to an
unsubscribed stream, but for those edge cases, they can
always set allow_unsubscribed_sender to True.
2021-12-10 09:40:04 -08:00
Tim Abbott 1c180a9f57 documentation: Avoid potential unused variable code path.
These variables can be unset if the `os.path.exists` check fails.

That should be rare, since we've previously checked the files do
exist before getting here.
2021-12-09 17:51:52 -08:00
Tim Abbott 4cb189fc63 settings: Rename TOS_VERSION to TERMS_OF_SERVICE_VERSION.
The previous version was appropriate in a setting where it was only
used for Zulip Cloud, but it's definitely clearer to spell it out.
2021-12-09 17:51:16 -08:00
odunybrad 90aa45a316 emoji: Add database-level uniqueness constraint for RealmEmoji.
While races here are unlikely, it is most correct to enforce this
invariant at the database layer, and having a database-level
constraint makes the models file a bit more readable.
2021-12-09 17:48:53 -08:00
Steve Howell 9a39ca217f user export: Show less info for recipients.
For PM and huddles, show full names but no
emails or other crufty fields.
2021-12-09 17:20:01 -08:00
Steve Howell 6a5c407b05 user export: Be more selective about exported messages. 2021-12-09 17:20:01 -08:00
Steve Howell fa654fd7a0 user export: Ignore realm icon and logo.
These are not considered to be "personal"
info, even if you upload them, so we
don't export them.

Generally the only folks who upload
these are admins, who can easily get
them in other ways. In fact, anybody
can get these via the app.
2021-12-09 17:20:01 -08:00
Eeshan Garg 5aaeb1a432 use_cases: Rename /for/companies to /for/business. 2021-12-09 17:16:52 -08:00
Steve Howell 8f991f8eb1 export: Make sure messages are sorted **across** files.
We now ensure that all message ids are sorted BEFORE
we split them into batches.

We now do a few extra "slim" queries to get message
ids up front.

But, now, when we divide them into batches, we no
longer run 2 or 3 different complicated queries in
a loop. We just basically hydrate our message ids,
so `write_message_partials` should be easy to reason
about.

This change also means that for tiny realms with
< 1000 messages you will always have just one
json file, since we aggregate the ids from the
queries before batching.
2021-12-09 12:22:34 -08:00
Steve Howell cef0e11816 export: Add get_id_list_gently_from_database.
This is slightly overkill for the single-user use
case, but for small queries it's barely any overhead,
and it's a nice abstraction.
2021-12-09 12:22:34 -08:00
Steve Howell 8ea320812f user exports: Chunkify messages in sorted order.
This accomplishes a few things:

    * It extracts `chunkify` rather than having us
      clumsily track chunking-related stuff in a
      big loop that is doing other stuff.

    * It makes it so that all message ids
      in message-000001.json < message-000002.json.

    * It makes it easier for us to customize
      the messages we send to a single user
      (coming soon).

BTW we probably have a slicker version of chunkify
somewhere in our codebase, but I couldn't remember
where.
2021-12-09 12:22:34 -08:00
Steve Howell 2a73964e16 user export: Add reactions.
We may eventually try to attach these to the messages
in the message-NNNNNN.json files, but for now they're
fine in user.json.
2021-12-09 12:22:34 -08:00
Mateusz Mandera 93e18fe289 migrations: Remove disallowed characters from topics.
Following b3c58f454f, we want to clean up
old topics that may contain the disallowed characters. The Message table
is large, so we go in batches, making sure we limit topic fetches and
UPDATE query to no more than BATCH_SIZE Message rows per query.
2021-12-09 09:51:06 -08:00
Steve Howell f810833df5 export: Improve export_usermessages_batch.
We no longer jankily read our input file into
an "output" variable.  Instead, we do things
in a type-safe way.
2021-12-09 08:36:40 -08:00
Steve Howell 5c1e8cb8dc mypy: Add MessagePartial TypedDict. 2021-12-09 08:36:40 -08:00
Steve Howell 09c57a3f9f export: Log more consistently and sort ids.
Now all file writes go through our three
helper functions, and we consistently
write a single log message after the file
gets written.

I killed off write_message_exports, since
all but one of its callers can call
write_table_data, which automatically
sorts data.  In particular, our Message
and UserMessage data will now be sorted
by ids.
2021-12-09 08:36:40 -08:00
Steve Howell 6ec49951c6 minor: Avoid creating intermediate list for message_ids.
This probably just postpones the list creation until
Django builds the "IN" query, but semantically it's
good to work in sets where we don't have any
meaningful ordering of the list that gets used.
2021-12-08 16:12:54 -08:00
Steve Howell f8ed099d3c export: Sort table data for most tables.
This affects most of our tables, but it excludes
table(s) like Message that go through kind of
unique codepaths.
2021-12-08 16:12:54 -08:00
Steve Howell a1d3f12e53 refactor: Extract write_table_data().
The immediate benefit of this is stronger mypy
checks (avoiding the ugly union caused by message
files).

The subsequent commit will add sorting.

We have test coverage on all these lines insofar
as if you comment out the lines, tests will
explode (i.e. more than superficial line
coverage).
2021-12-08 16:12:54 -08:00
Steve Howell c76ca2d0df export: Sort records.json files by path. 2021-12-08 16:12:54 -08:00
Steve Howell 2ef38e3d48 refactor: Extract write_records_json_file. 2021-12-08 16:12:54 -08:00
Steve Howell b79cfc19ab user export: Broaden query for RealmAuditLog.
We now check acting_user as well as modified_user
to see if a row pertains to our exported user.
2021-12-08 16:01:38 -08:00
Steve Howell 927b04368e minor: Use virtual_parent for custom fetchers.
The distinction here wasn't super meaningful
due to the way we order our "elif" statements,
but we want to reserver "normal_parent" for the
majority of use cases, where you simply tell
the Config what the "foreign_key" is.
2021-12-08 15:58:07 -08:00
Steve Howell 50120a9387 export: Remove config parameter for custom fetchers. 2021-12-08 15:58:07 -08:00
Steve Howell 54a3a423e5 mypy: Fix CustomFetch=Any hack. 2021-12-08 15:58:07 -08:00
Steve Howell 4128b52ac5 export: Rename custom fetchers. 2021-12-08 15:58:07 -08:00
Steve Howell a2c4931316 exports: Use realm for RealmAuditLog in realm exports.
For realm-wide exports, there is no reason to query
inefficiently against a list of modified users.

We move the Config out of the common child configs.
2021-12-08 15:58:07 -08:00
Steve Howell 8dd3c1038f exports: Rename parent_key to include_rows.
Even though Django usually treats foo__in
and foo_id__in identically for filters where
foo is a ForeignKey type, we want to insist
on somewhat more consistent syntax, because
we have the odd combo of type and type_id
in Recipient, where type_id is kinda like a
foreign key, but not a ForeignKey.

So we assert for now that all our include_rows
values end in "_id__in".
2021-12-08 15:58:07 -08:00
Steve Howell 02207f47d5 minor: Move code blocks to be alphabetical. 2021-12-08 15:58:07 -08:00
Steve Howell aae9f1b6f5 export: Make Config errors more clear. 2021-12-08 15:58:07 -08:00
Nikhil Maske 091772b534 hotspots: Remove intro_reply hotspot.
Zulip shows two guides on How to reply, first one by
the welcome bot and second one is intro_reply hotspot.
To simply and avoid redundancy, intro_reply hotspot is
removed.

Fixes #20482.
2021-12-07 21:55:59 -08:00
Nipunn Koorapati 0ca49bc93a emoji reactions: Order reactions query results by id.
Force postgres to give reactions in ID order - which
is generally chronological order. Results in frontend
displaying reactions in said order.

Fixes #20060.
2021-12-07 15:02:46 -08:00
Eeshan Garg 3714a30e63 stream notifications: Add helper for silent user mention syntax.
In many of our stream notification messages, we make use of the
same silent user mention syntax, the template for which was always
hardcoded. This commit adds a helper function that all relevant
callers can call to get the right syntax when mentioning users.

Thanks to Tim Abbott for this suggestion!
2021-12-07 14:53:50 -08:00
Eeshan Garg 8ebe05f644 streams: Add RealmAuditLog entry for message retention updates. 2021-12-07 14:53:50 -08:00
Eeshan Garg d2901892e2 streams: Add notifications for message retention policy updates.
This is a part of #20289.
2021-12-07 14:53:50 -08:00
S-Abhishek 186d1a83e9 narrow_banner: Move empty narrow messages to handlebar templates.
Removed existing empty narrow divs from app/home.html and created
a new javascript module to dynamically load empty narrow messages
using handlebar template.

Fixes #18797
2021-12-07 13:38:48 -08:00
Steve Howell 6381c2e535 tests: Make sure import doesn't corrupt original realm.
The original intention of this was to prevent coding
errors with realm getters that don't, um, filter
on realm.

Unfortunately, you can still write a broken realm getter
that forgets to filter on realm, but which returns a
Set, and the new safeguards won't see any difference.

We could make all the getters return sorted lists
instead, but that's for another day.

This code does serve another purpose, which is to
prevet egregious bugs in the import itself.
2021-12-07 12:27:01 -08:00
Steve Howell fea659eacd tests: Extract get_getters. 2021-12-07 12:27:01 -08:00
Steve Howell 5803057589 tests: Make some helpers class-level.
This is somewhat tactical in nature. I want to
extract a huge chunk of code that minorly depends
on these helpers.
2021-12-07 12:27:01 -08:00
Steve Howell 29bd1e8bd3 tests: Avoid clutter within long list of getters.
The diff here is ugly, but to summarize:

    BEFORE IMPORT:
        define get_user_id
        define get_huddle_hashes

    AFTER IMPORT AND MAKING GETTERS:
        check realm id
        define assert_realm_values
        verify emoji codes
        check huddle hashes
2021-12-07 12:27:01 -08:00
Steve Howell 93761cd237 tests: Add getter decorator for import test. 2021-12-07 12:27:01 -08:00
Steve Howell 5892748c7b tests: Avoid lambdas in import test. 2021-12-07 12:27:01 -08:00
Steve Howell 54a6c82282 tests: Avoid equal flag for huddle hashes.
There's no need to complexify the codepath
for all the normal use cases.
2021-12-07 12:27:01 -08:00
Steve Howell 6d09eab285 export: Export file images for single users.
We don't have automated test coverage on this yet,
but below are the results from manual testing.

Note that we include the realm icon and logo even
though they were not created by Cordelia.

    ./manage.py export_single_user cordelia@zulip.com

    $ (cd /tmp/zulip-export-4v3mo802/ && find .)
    .
    ./emoji
    ./emoji/2
    ./emoji/2/emoji
    ./emoji/2/emoji/images
    ./emoji/2/emoji/images/3.jpg
    ./emoji/records.json
    ./messages-000001.json
    ./realm_icons
    ./realm_icons/2
    ./realm_icons/2/night_logo.original
    ./realm_icons/2/night_logo.png
    ./realm_icons/2/icon.png
    ./realm_icons/2/icon.original
    ./realm_icons/records.json
    ./avatars
    ./avatars/2
    ./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.original
    ./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.png
    ./avatars/records.json
    ./uploads
    ./uploads/2
    ./uploads/2/68
    ./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt
    ./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt/denver.jpg
    ./uploads/2/96
    ./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim
    ./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim/denver.jpg
    ./uploads/records.json
    ./user.json
2021-12-07 11:16:52 -08:00
Steve Howell b8d9143318 export: Validate emoji paths.
(We lift the RealmEmoji query to be used by
both local and S3 storage helpers.)
2021-12-07 11:16:52 -08:00