zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	72d2e5df15	isort: Enable black profile. Our isort configuration was almost Black-compatible, but we were missing ensure_newline_before_comments. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-02 11:00:07 -07:00
Anders Kaseorg	60a25b2721	docs: Fix spelling errors caught by codespell. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-08-11 10:23:06 -07:00
Alex Vandiver	39368cad3a	tornado: Extract functions called from django into one module. This makes clearer the separation of concerns.	2020-08-10 16:55:56 -07:00
Steve Howell	3b2c881ce6	tests: Decouple test_retention and test_reactions. We generally want to avoid having two sibling test suites depend on each other, unless there's a real compelling reason to share code. (And if there is code to share, we can usually promote it to either test_helpers or ZulipTestCase, as I did here.) This commit is also prep for the next commit, where I try to simplify all of the helpers in EmojiReactionBase. Especially now that we have f-strings, it is usually better to just call api_post explicitly than to obscure the mechanism with thin wrappers around api_post. Our url schemes are pretty stable, so it's unlikely that the helpers are actually gonna prevent future busywork.	2020-07-17 11:04:54 -07:00
Mateusz Mandera	0c6497d43a	retention: Add restore_retention_policy_deletions_for_stream function.	2020-06-24 10:40:38 -07:00
Mateusz Mandera	7a03e2a7fe	retention: Replace Realm.message_retention_days None value with -1. To be more consistent with the meaning in the Stream model, and to make it easier to have a reasonable settings API, we get rid of the None value for Realm.message_retention_days in favor of the value -1 to represent the "don't delete messages" default policy.	2020-06-24 10:33:21 -07:00
Aman Agrawal	cda7b2f539	deletion: Add support for bulk message deletion events. This is designed to have no user-facing change unless the client declares bulk_message_deletion in its client_capabilities. Clients that do so will receive a single bulk event for bulk deletions of messages within a single conversation (topic or PM thread). Backend implementation of #15285.	2020-06-14 22:34:00 -07:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \\|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
Mateusz Mandera	b234fe8ccb	retention: Pass optional realm argument to move_messages_to_archive. This allows having the realm field of ArchiveTransaction set instead of NULL when using move_messages_to_archive.	2020-05-16 14:46:56 -07:00
Mateusz Mandera	812ac4714f	retention: Optimize fetching of realms and streams with retention policy.	2020-05-07 16:28:05 -07:00
Anders Kaseorg	fead14951c	python: Convert assignment type annotations to Python 3.6 style. This commit was split by tabbott; this piece covers the vast majority of files in Zulip, but excludes scripts/, tools/, and puppet/ to help ensure we at least show the right error messages for Xenial systems. We can likely further refine the remaining pieces with some testing. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: - invoiced_through: Optional[LicenseLedger] = models.ForeignKey( + invoiced_through: Optional["LicenseLedger"] = models.ForeignKey( -_apns_client: Optional[APNsClient] = None +_apns_client: Optional["APNsClient"] = None - notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) + author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) - bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) + bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) - default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) - default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) -descriptors_by_handler_id: Dict[int, ClientDescriptor] = {} +descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {} -worker_classes: Dict[str, Type[QueueProcessingWorker]] = {} -queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {} +worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {} +queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {} -AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None +AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-22 11:02:32 -07:00
Anders Kaseorg	1cf63eb5bf	python: Whitespace fixes from autopep8. Generated by autopep8, with the setup.cfg configuration from #14532. I’m not sure why pycodestyle didn’t already flag these. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-21 17:58:09 -07:00
Anders Kaseorg	c734bbd95d	python: Modernize legacy Python 2 syntax with pyupgrade. Generated by `pyupgrade --py3-plus --keep-percent-format` on all our Python code except `zthumbor` and `zulip-ec2-configure-interfaces`, followed by manual indentation fixes. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-09 16:43:22 -07:00
Stefan Weil	d2fa058cc1	text: Fix some typos (most of them found and fixed by codespell). Signed-off-by: Stefan Weil <sw@weilnetz.de>	2020-03-27 17:25:56 -07:00
Steve Howell	5e2a32c936	tests: Use users in send_*_message. This commit mostly makes our tests less noisy, since emails are no longer an important detail of sending messages (they're not even really used in the API). It also sets us up to have more scrutiny on delivery_email/email in the future for things that actually matter. (This is a prep commit for something along those lines, kind of hard to explain the full plan.)	2020-03-07 18:30:13 -08:00
Mateusz Mandera	2d544250b7	events: Add block for compatibility with old delete_message events.	2020-03-03 15:52:42 -08:00
Mateusz Mandera	7db3d4560f	do_delete_messages: Archive the messages in bulk. The test added in this commit shows 37 queries - compared to 181 without the change to the function. That seems very much worth it.	2020-02-27 23:12:32 -08:00
Tim Abbott	7ccc8373e2	bugdown: Fix logic for extracting attachment path_id. In `3892a8afd8`, we restructured the system for managing uploaded files to a much cleaner model where we just do parsing inside bugdown. That new model had potentially buggy handling of cases around both relative URLs and URLs starting with `realm.host`. We address this by further rewriting the handling of attachments to avoid regular expressions entirely, instead relying on urllib for parsing, and having bugdown output `path_id` values, so that there's no need for any conversions between formats outside bugdown. The check_attachment_reference_change function for processing message updates is significantly simplified in the process. The new check on the hostname has the side effect of requiring us to fix some previously weird/buggy test data. Co-Author-By: Anders Kaseorg <anders@zulipchat.com> Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>	2019-12-12 20:30:26 -08:00
Rohitt Vashishtha	3892a8afd8	messages: Set has_attachment correctly using Bugdown. Previously, we would naively set has_attachment just by searching the whole messages for strings like `/user_uploads/...`. We now prevent running do_claim_attachments for messages that obviously do not have an attachment in them that we previously ran. For example: attachments in codeblocks or attachments that otherwise do not match our link syntax. The new implementation runs that check on only the urls that bugdown determines should be rendered. We also refactor some Attachment tests in test_messages to test this change. The new method is: 1. Create a list of potential_attachment_urls in Bugdown while rendering. 2. Loop over this list in do_claim_attachments for the actual claiming. For saving: 3. If we claimed an attachment, set message.has_attachment to True. For updating: 3. If claimed_attachment != message.has_attachment: update has_attachment. We do not modify the logic for 'unclaiming' attachments when editing.	2019-12-11 11:03:44 -08:00
Mateusz Mandera	bbf2474bd0	tests: setUp overrides should call super().setUp(). MigrationsTestCase is intentionally omitted from this, since migrations tests are different in their nature and so whatever setUp() ZulipTestCase may do in the future, MigrationsTestCase may not necessarily want to replicate.	2019-10-19 17:27:01 -07:00
Mateusz Mandera	dbe508bb91	models: Migration of Message.pub_date to date_sent, part 2. Fixes #1727. With the server down, apply migrations 0245 and 0246. 0246 will remove the pub_date column, so it's essential that the previous migrations ran correctly to copy data before running this.	2019-10-05 19:01:34 -07:00
Mateusz Mandera	4646c7550c	test_retention: Prepare for moving system bots to zulipinternal.	2019-07-20 15:08:08 -07:00
Mateusz Mandera	d1c2185c81	retention: Archive cross-realm personal messages. We can simply archive cross-realm personal messages according to the retention policy of the recipient's realm. It requires adding another message-archiving query for this case however. What remains is to figure out how to treat cross-realm huddle messages.	2019-07-08 20:03:20 -07:00
Mateusz Mandera	7950aaea1e	retention: Add code for deleting old archive data.	2019-06-26 12:24:47 -07:00
Mateusz Mandera	3ac11a3fc5	retention: Use ON CONFLICT DO UPDATE to handle re-archiving properly. When archiving Messages, we stop relying on LEFT JOIN ... IS NULL to avoid duplicates when INSERTing. Instead we use ON CONFLICT DO UPDATE (added in PostgreSQL 9.5) so that, when archiving a Message that already has a corresponding archived object (this happens if a Message gets archived, restored and then archived again), the existing ArchivedMessage is re-assigned to the new transaction. This also allows us to fix test_archiving_messages_second_time, which was temporarily disabled a few commits before.	2019-06-26 12:05:59 -07:00
Mateusz Mandera	6e46c6d752	retention: Add functions for restoring archived data. Functions for restoring archived data are added and existing tests are expanded to restore data they archived and check correctness.	2019-06-26 12:05:59 -07:00
Mateusz Mandera	9acd3b0f46	retention: Rewrite move_messages_to_archive to use existing functions. Instead of having a bunch of custom code in the function, we make it use run_message_batch_query and run_archiving_in_chunks to do the necessary operations in a consistent way, using the same codepaths as the rest of the archiving system. This breaks test_archiving_messages_second_time temporarily, but we will fix it and re-enable the test in the next commits, where we'll address various other issues with re-archiving of messages. We also remove the @transaction.atomic wrapper, because atomicity is handled by the logic inside run_archiving_in_chunks.	2019-06-26 12:05:59 -07:00
Mateusz Mandera	c869ea8e1e	test_retention: Factor out a class with shared helper functions.	2019-06-26 12:05:59 -07:00
Mateusz Mandera	7fc48f8b93	test_retention: Check if messages get deleted when archiving. We add additional checks in _verify_archive_data to make sure the archived Messages and UserMessages are deleted from their normal tables.	2019-06-26 12:05:59 -07:00
Mateusz Mandera	25810752fe	retention: Fully process each Message chunk in a transaction. To ensure the database retains a consistent state if archiving gets interrupted, we process each Messages chunk together with related objects in a single atomic transaction.	2019-06-13 11:17:54 -07:00
Mateusz Mandera	f06a4b4eab	retention: Batch Message archiving queries. We batch queries that archive Messages, to limit the maximum amount of Message objects archived in a single query. This leads to the archiving of other related objects being batched as well, because we loop over chunks of archived messages and archive their related objects per-chunk.	2019-06-11 09:25:25 -07:00
Mateusz Mandera	323be57151	retention: If stream has no retention policy set, use realm policy. We add the following behavior: If stream has message_retention_days set to -1, archiving for it is disabled. If stream has message_retention_days set to null, use the realm's policy. If the realm has no policy, we don't archive for this stream.	2019-06-06 11:17:42 -07:00
Mateusz Mandera	0e9fa4f028	retention: Support stream-based retention policies. We change the archiving scheme to allow having stream based retention policies. In the first step of the archiving process, we loop over streams and archive their expired messages and related objects. Then we separately archive all expired personal and huddle messages and related objects. As the last step, we scan for redundant attachments which can now be deleted. To achieve this, we have to rewrite a significant portion of the retention code and rework some of the database queries. For the sake of simplicity, we neither archive nor delete cross-realm messages, except cross-realm stream messages – in their case they can be processed in the same manner as ordinary stream messages. In the query for archiving personal and huddle messages we simply exclude those sent by cross-realm bots. We change the tests to adapt to these modifications.	2019-06-06 11:17:42 -07:00
Mateusz Mandera	6c3ba25474	retention: Use RETURNING to speed up database queries. We add RETURNING to fetch relevant message and usermessage ids in archiving queries and use them to make other queries faster. A side-effect of this implementation is that with cross-realm messages, the UserMessage of the recipient and the Message will not be deleted - but cross-realm messages are rare, will still get correctly put in the archive tables and so failing to delete should not be a problem for now. They will be fully handled later.	2019-06-02 14:55:14 -07:00
Mateusz Mandera	4facc93670	retention: Add archiving of SubMessages.	2019-05-30 11:40:20 -07:00
Mateusz Mandera	37c42a09e5	retention: Archiving of models tied to a Message, applied to Reactions. We add general code that will archive models that are tied to a specific Message (such as Reactions and SubMessages). Certain details of the model are grabbed from a list models_with_message_key, and then used to create queries that will archive these database tables. We put Reaction in that list in this commit, and add appropriate tests. To have archiving of other analogous models (for example SubMessage), one only needs to make an appropriate entry in the models_with_message_key list.	2019-05-30 11:40:20 -07:00
Mateusz Mandera	dfee559333	test_retention: Check that Reactions get correctly deleted.	2019-05-30 11:33:41 -07:00
Mateusz Mandera	29729b7748	test_retention: Check that SubMessages get correctly deleted.	2019-05-30 11:27:38 -07:00
Mateusz Mandera	6d69405f54	test_retention: Keep helper functions in a base class.	2019-05-30 11:27:38 -07:00
Mateusz Mandera	2370e6717c	test_retention: Factor out _make_expired_zulip_messages helper function.	2019-05-30 11:27:38 -07:00
Mateusz Mandera	0bf90be886	retention: Clean up and rewrite test_retention.py. test_retention.py had various issues - we opt for keeping its essence (what should the tests do and verify), but rewriting a lot of it in order to have more clarity in what's happening there.	2019-05-27 12:53:32 -07:00
Mateusz Mandera	c5ac66b9c8	retention: Split archive_messages code into two functions. We split archive_messages code into two functions: moving to archive and cleanup. This allows cleaning up the tests - they can call these functions directly instead of copying several lines of archive_messages here and there in multiple tests.	2019-05-27 12:53:32 -07:00
Mateusz Mandera	db86043195	test_retention: Quick fix for the remaining test failure. test_cross_realm_messages_archiving_two_realm_expired doesn't run the code path patched in commit 3d1aa98b2ea344fba7fbb2373a37d4cf30f53e08, so it can still fail. We apply the analogous change in the test as in the cited commit.	2019-05-22 14:15:18 -07:00
Tim Abbott	3d1aa98b2e	retention: Use a consistent ordering for processing realms. This is probably a good idea for the production use case, since then there's some consistency of behavior, and if we extend logging, one knows exactly which realms were or were not executed before a logged failure. This fixes the nondeterministic test failures we've been seeing in CI: if you use `-id` in that order_by, it happens consistently.	2019-05-22 10:48:53 -07:00
Tim Abbott	bde9b28589	test_retention: Update debugging code for CI failures. This should provide more helpful output for the next stage of debugging.	2019-05-21 14:10:15 -07:00
Tim Abbott	55b15ba117	test_retention: Improve and extend print-debugging. We needed flush=True to have output not be lost. Also print the original messages, so we can compare what's missing.	2019-05-21 09:28:03 -07:00
Tim Abbott	1353e94b29	test_retention: Add print-debugging. We've been seeing nondeterministic failures in this test suite in CI that we can't reproduce locally; these print statements should help track them down.	2019-05-20 19:43:28 -07:00
K.Kanakhin	e930851d16	retention-period: Add more core code for retention policy. This is a very old commit for #106, which has been on hiatus for a few years. It was significantly modified by tabbott to: * Improve coding style and variable names * Update mypy annotations style * Clean up the testing logic * Update for API changes elsewhere in our system But the actual runtime code is essentially unmodified from the original work by Kirill. It contains basic support for archiving Messages, UserMessages, and Attachments with a nice test suite. It's still not usable in production (e.g. it will probably break Reactions, SubMessages, etc.), but upcoming commits will address that.	2019-05-19 20:22:47 -07:00
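
Commits 323be57151 and 7a03e2a7fe above describe how a stream's retention setting falls back to the realm's policy. The following is only a minimal sketch of that rule as the commit messages state it, not the code in the repository; the function name `effective_retention_days` and its signature are hypothetical.

```python
from typing import Optional

# Sentinel meaning "never archive", as described in the commit messages.
RETAIN_FOREVER = -1


def effective_retention_days(
    stream_days: Optional[int], realm_days: Optional[int]
) -> Optional[int]:
    """Return the number of days after which messages expire,
    or None if messages in this stream should never be archived.

    Hypothetical helper, written only to illustrate the described rule.
    """
    if stream_days == RETAIN_FOREVER:
        # The stream explicitly disables archiving.
        return None
    if stream_days is not None:
        # The stream has its own policy.
        return stream_days
    # The stream has no policy of its own: fall back to the realm.
    # A realm without a policy was None before 7a03e2a7fe and -1 after it,
    # so both values are treated as "no policy" here.
    if realm_days is None or realm_days == RETAIN_FOREVER:
        return None
    return realm_days
```

For example, a stream set to -1 inside a realm with a 30-day policy is never archived, while a stream with no setting in the same realm inherits the 30 days.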


71 Commits
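
Commit 3ac11a3fc5 in the table above relies on ON CONFLICT DO UPDATE so that archiving a message a second time re-assigns the existing archive row to the new transaction. The sketch below illustrates the same idea using Python's bundled sqlite3 module (SQLite 3.24 or newer) rather than PostgreSQL; the table name, column names, and the `archive` helper are assumptions made for illustration and are not taken from the repository.

```python
import sqlite3

# In-memory database standing in for the real archive tables (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archived_message ("
    "id INTEGER PRIMARY KEY, content TEXT, archive_transaction_id INTEGER)"
)


def archive(conn: sqlite3.Connection, message_id: int, content: str, transaction_id: int) -> None:
    """Insert an archive row, or re-assign an existing row to the new transaction."""
    conn.execute(
        """
        INSERT INTO archived_message (id, content, archive_transaction_id)
        VALUES (?, ?, ?)
        ON CONFLICT (id) DO UPDATE
        SET archive_transaction_id = excluded.archive_transaction_id
        """,
        (message_id, content, transaction_id),
    )


# Archive the same message twice, as happens after archive -> restore -> archive.
archive(conn, 1, "hello", 100)
archive(conn, 1, "hello", 200)

rows = conn.execute(
    "SELECT id, archive_transaction_id FROM archived_message"
).fetchall()
```

After the second call there is still a single row for message 1, now pointing at transaction 200, so no duplicate is created and no LEFT JOIN ... IS NULL filter is needed.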