zulip

Commit Graph

Author	SHA1	Message	Date
Steve Howell	88a57ed4ac	bulk digest: Get stream subscriptions in bulk. If we have multiple users, this reduces the number of queries we need to do, because we get all subscriptions for all users in a single query to Subscription. For the single-user case, we are introducing an extra query hop, but the database is doing roughly the same work, because we are just breaking up this complex query into two hops: messages = select ... from message where recipient__type_id in ( select stream_id from subscription where ... ) Now it's more like: stream_ids = select stream_id from subscription where ... messages = select ... from message where recipient__type_id in stream_ids	2020-11-05 09:36:59 -08:00
Steve Howell	c83db37161	email digests: Introduce bulk methods for digest. Note that we are not changing anything semantically or algorithmically yet. The only overhead here for the single-user case is boxing and unboxing data into single-item dicts and lists. The interfaces for callers in the view and the queue processor remain the same for now.	2020-11-05 09:36:59 -08:00
Steve Howell	0e2d02b0a2	digest tests: Count cache tries.	2020-11-05 09:36:59 -08:00
Steve Howell	127f4e1291	digest tests: Add more users to bulk digest test.	2020-11-05 09:36:59 -08:00
Steve Howell	89cb3fa841	digest tests: Localize mocks. We didn't need the enough-traffic mock. We also continue to prep for testing multiple users. I also finally remove a comment that is about to be addressed (and which inaccurately refers to huddles).	2020-11-05 09:36:59 -08:00
Steve Howell	1ec16dd1da	digest tests: Prep to test bulk digests. All this does, essentially, is put the logic we used to test for othello inside of a loop. We'll add more users in the next commit.	2020-11-05 09:36:59 -08:00
Anders Kaseorg	13c11ec5f3	openapi: Fix escaping in curl command generation. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-11-05 09:36:31 -08:00
Steve Howell	c1f134a3a4	performance: Use ORM to fetch sender in render_markdown. In `709493cd75` (Feb 2017) I added code to render_markdown that re-fetched the sender of the message, to detect whether the message is a bot. It's better to just let the ORM fetch this. The message object should already have sender. The diff makes it look like we are saving round trips to the database, which is true in some cases. For the main message-send codepath, though, we are only saving a trip to memcached, since the middleware will have put our sender's user object into the cache. The test_message_send test calls internally to check_send_stream_message, so it was actually hitting the database in render_markdown (prior to my change).	2020-11-05 09:35:15 -08:00
Steve Howell	637f596751	tests: Fix queries_captured to clear cache up front. Before this change we were clearing the cache on every SQL usage. The code to do this was added in February 2017 in `6db4879f9c`. Now we clear the cache just one time, but before the action/request under test. Tests that want to count queries with a warm cache now specify keep_cache_warm=True. Those tests were particularly flawed before this change. In general, the old code both over-counted and under-counted queries. It under-counted SQL usage for requests that were able to pull some data out of a warm cache before they did any SQL. Typically this would have bypassed the initial query to get UserProfile, so you will see several off-by-one fixes. The old code over-counted SQL usage to the extent that it's a rather extreme assumption that during an action itself, the entries that you put into the cache will get thrown away. And that's essentially what the prior code simulated. Now, it's still bad if an action keeps hitting the cache for no reason, but it's not as bad as hitting the database. There doesn't appear to be any evidence of us doing something silly like fetching the same data from the cache in a loop, but there are opportunities to prevent second or third round trips to the cache for the same object, if we can re-structure the code so that the same caller doesn't have two callees get the same data. Note that for invites, we have some cache hits that are due to the nature of how we serialize data to our queue processor--we generally just serialize ids, and then re-fetch objects when we pop them off the queue.	2020-11-05 09:35:15 -08:00
YashRE42	967efc32d2	widgets: Remove tictactoe example widget. Steve asked me to remove this, since the tictactoe game was always intended as a proof of concept. Now that we have poll and todo widgets, the sample code for tictactoe has much less value. We replace the content and type in test_widgets.py to maintain coverage.	2020-11-03 14:46:39 -08:00
Aman Agrawal	87cdd8433d	home: Allow logged out user through home. We allow users to load the webapp without logging in. This is only enabled for development purposes for now. Production setups will see no changes.	2020-11-02 17:07:12 -08:00
akshatdalton	620e9cbf72	markdown: Fix merging of separate quotations. Initially, two or more quotes separated by a blank line were merged. This created confusion, especially in "quote and reply". This commit fixes such issues. Now two or more quotes separated by a blank line will not get merged. This change is correct both for usability and for improving our compatibility with CommonMark. Fixes #14379.	2020-10-30 15:21:15 -07:00
Anders Kaseorg	aaa7b766d8	python: Use universal_newlines to get str from subprocess. We can replace ‘universal_newlines’ with ‘text’ when we bump our minimum Python version to 3.7. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-30 11:36:38 -07:00
Anders Kaseorg	7c4f68d9cf	python: Skip unnecessary decode before BeautifulSoup parsing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-30 11:36:38 -07:00
Anders Kaseorg	86e8d81c7f	python: Skip unnecessary decode before JSON parsing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-30 11:36:38 -07:00
Anders Kaseorg	1802a50cc9	python: Use requests.Response.text instead of decoding content. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-30 11:36:38 -07:00
sahil839	b29d39195c	streams: Do not allow default streams to be private. We no longer allow a stream that is already a default stream to be made private.	2020-10-29 15:47:32 -07:00
sahil839	557ca0802c	streams: Do not allow private streams to be set as default. We no longer allow a private stream to be set as a default stream.	2020-10-29 15:43:37 -07:00
m-e-l-u-h-a-n	cbfd6464a5	logging: Replace mock.patch() for logging with assertLogs(). This commit replaces mock.patch with assertLogs(). * Adds a return value to do_rest_call() in outgoing_webhook.py, to support asserting log output in test_outgoing_webhook_system.py. * Logs are not asserted in test_realm.py because it would require users to be queried using users=User.objects.filter(realm=realm), and the order of the resulting queryset varies for each run. * In test_decorators.py, mock.patch is not replaced because I'm not sure it's worth the effort, as it's a return value of a function. Tweaked by tabbott to set proper mypy types.	2020-10-29 15:37:45 -07:00
Hemanth V. Alluri	99cf37dc51	drafts: Make the ID of the draft a part of the draft dict. Then because the ID is now part of the draft dict, we can (and do) change the structure of the "drafts" parameter returned from `GET /drafts` from an object (mapping ID to data) to an array. Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>	2020-10-29 11:06:04 -07:00
Hemanth V. Alluri	8d59fd2f45	tests/drafts: Simplify create_and_check_drafts_for_success. Sometimes we don't need to specify the expected_drafts field. So by removing it, we can reduce the clutter a bit. Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>	2020-10-29 11:06:04 -07:00
Hemanth V. Alluri	e60925b3e8	drafts: Change "timestamp" from float to integer. Now the timestamp returned in a draft dict will always be an int. The endpoints will still accept either an int or a float. Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>	2020-10-29 11:06:04 -07:00
m-e-l-u-h-a-n	be7a70e742	logging: Remove unnecessary mock.patch() for logging. Our test-backend validation confirms that we don't log anything to stdout in the tests, so the fact that CI passes with these removed shows there was nothing being logged.	2020-10-28 23:15:27 -07:00
Vishnu KS	fdea49742c	apps: Use GitHub API for generating the web app download link.	2020-10-28 23:04:14 -07:00
Alex Vandiver	f4eae83542	export: Only include real, active humans in the displayed count.	2020-10-28 18:31:06 -07:00
Anders Kaseorg	1352f2f233	python: Replace manual quote_plus usage with urlencode. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-27 13:47:02 -07:00
Anders Kaseorg	4e9d587535	python: Pass query parameters as a dict when making GET requests. This provides automatic URL-encoding. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-27 13:47:02 -07:00
Anders Kaseorg	41f509170b	users: Canonicalize the timezone identifier. While working on shifting toward native browser time zone APIs (#16451), it was found that all but very recent Chrome and Node versions reject certain legacy timezone aliases like US/Pacific (https://crbug.com/364374). For now, we only canonicalize the timezone property returned in user objects and not the timezone setting itself. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-27 13:42:54 -07:00
Anders Kaseorg	0b288f92c9	timezone: Remove get_timezone wrapper. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-27 13:42:54 -07:00
Tim Abbott	6d7cd351a3	events: Optimize creating streams for new users. During the new user creation code path, there can be no existing active clients for the user being created, so we can skip the code to send events to that user's clients. The tests here reflect that we need to send fewer events, and do fewer queries that would have been spent computing data for these. Fixes #16503, combined with the long series of recent changes by Steve Howell to fix super-linear behavior in this code path.	2020-10-26 12:47:15 -07:00
Steve Howell	88a7a1b002	events: Optimize peer_add/peer_remove for public streams. We now bulk up peer_add/peer_remove events by user if the same user has subscribed to multiple streams (and just that single user). This mostly optimizes the new-user codepath, but the algorithm is a bit more general in nature.	2020-10-26 12:33:28 -07:00
Alex Vandiver	7cf737988d	queue: Be more explicit about test/real queue division.	2020-10-26 12:32:47 -07:00
Anders Kaseorg	31d0141a30	python: Close opened files. Fixes various instances of ‘ResourceWarning: unclosed file’ with python -Wd. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-26 12:31:30 -07:00
Steve Howell	3ad1335a97	tests: Clear ContentType cache for user test. This keeps the number of queries predictable.	2020-10-26 07:18:08 -04:00
Steve Howell	5ef01b3ad8	tests: Fix test_create_user_with_multiple_streams. This test was flaky due to some date-related non-determinism. I make all the Message objects current to make add_new_user_history reliably try to bulk-update UserMessage rows to read.	2020-10-26 07:18:08 -04:00
Harsh Srivastava	9b31df009b	openapi: Fix excessively large test_events failure output. Because of the very large `oneOf` clause of the formats of events possible in Zulip's `GET /events` system, `test-backend` failures for missing documentation for a new event format produced roughly 1000 lines of output, which was unhelpful. Fix this by limiting the output to only the oneOf variants that are broadly similar to the actual payload received. Fixes #16023.	2020-10-23 17:00:17 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Anders Kaseorg	b9fd49a2c6	mypy: Correct mistaken *args type annotations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:29:13 -07:00
Anders Kaseorg	d295da676b	test_message_fetch: Clean up obsolete PGroonga bug workaround. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-22 23:27:23 -07:00
sahil839	571bb62e3d	events: Update subscriber list on peer_add for unsubscribed streams. We update the subscriber list on peer_add event for unsubscribed streams as well.	2020-10-22 15:12:32 -07:00
sahil839	733d26aef2	events: Update subscriber list on peer_remove for never subscribed stream. We now update the subscriber list on peer_remove event for never subscribed streams also.	2020-10-22 15:12:32 -07:00
sahil839	af9b153ee3	events: Update subscriber list on peer_remove for unsubscribed stream. We update the subscriber list on peer_remove event for unsubscribed streams also.	2020-10-22 15:12:32 -07:00
sahil839	709edd29d4	test_events: Fix comment in do_test_subscribe_events. The comment still pointed to 'vacate' event flow, but we have removed the vacate event in `a9356508ca`. This commit fixes the comment to depict the correct purpose of below lines, i.e. to test the remove event flow.	2020-10-22 15:12:32 -07:00
sahil839	e578742b02	test_events: Remove 'realm_user' from event_types in subscription test. We were including 'realm_user' in event_types along with 'subscription', but we don't send event of type 'realm_user' when subscribing to a new stream. This was added in `1c332f5d6a`. This commit removes 'realm_user' from event_types.	2020-10-22 15:12:32 -07:00
sahil839	d0f5537fb2	actions: Modify check_message for handling wildcard_mention_policy setting. This commit adds enforcement for sending messages containing wildcard mentions according to wildcard_mention_policy.	2020-10-22 14:46:32 -07:00
sahil839	25f32d461e	tests: Add tests for all the values of wildcard_mention_policy.	2020-10-22 12:08:22 -07:00
Mateusz Mandera	48f80fcb0a	auth: Expect name in request params in Apple auth. The name used to be included in the id_token, but this seems to have been changed by Apple and now it's sent in the `user` request param. https://github.com/python-social-auth/social-core/pull/483 is the upstream PR for this - but upstream is currently unmaintained, so we have to monkey patch. We also alter the tests to reflect this situation. Tests no longer put the name in the id_token, but rather in the `user` request param in the browser flow, just like it happens in reality. An adaptation has to be made in the native flow - since the name won't be included by Apple in the id_token anymore, the app, when POSTing to the /complete/apple/ endpoint, can (and should for better user experience) add the `user` param formatted as json of {"email": "hamlet@zulip.com", "name": {"firstName": "Full", "lastName": "Name"}} dict. This is also reflected by the change in the native flow tests.	2020-10-22 12:07:46 -07:00
Steve Howell	7ff3859136	subscriber events: Change schema for peer_add/peer_remove. We now can send an implied matrix of user/stream tuples for peer_add and peer_remove events. The client code basically does this: for stream_id in event['stream_ids']: for user_id in event['user_ids']: update_sub(stream_id, user_id) We used to send individual events, which gets really expensive when you are creating new streams. For the copy-to-stream case, we should see events go from U to 1, where U is the number of users added. Note that we don't yet fully optimize the potential of this schema. For adding a new user with lots of default streams, we still send S peer_add events. And if you subscribe a bunch of users to a bunch of private streams, we only go from U * S to S; we can't optimize it down to one event easily.	2020-10-22 11:19:53 -07:00
Steve Howell	85ed6f332a	performance: Avoid Recipient lookup for stream messages. All the fields of a stream's recipient object can be inferred from the Stream, so we just make a local object. Django will create a Message object without checking that the child Recipient object has been saved. If that behavior changes in some upgrade, we should see some pretty obvious symptom, including query counts changing. Tweaked by tabbott to add a longer explanatory comment, and delete a useless old comment.	2020-10-20 11:47:23 -07:00
Steve Howell	7bbcc2ac96	refactor: Compute peers for public streams later. This saves us a query for edge cases like when you try to unsubscribe from a public stream that you have already unsubscribed from. But this is mostly to prep for upcoming optimizations.	2020-10-20 11:31:22 -07:00
akshatdalton	287c4ed2bb	markdown: Fix Youtube and Vimeo preview overriding markdown link titles bug. Initially markdown titles were overridden by Youtube and Vimeo preview titles. But now it will check if any markdown title is present to replace Youtube or Vimeo preview titles, if preview of linked websites is enabled. Fixes #16100	2020-10-19 12:06:13 -07:00
Anders Kaseorg	d81a93cdf3	requirements: Upgrade markdown to 3.3.1. Upstream has slightly changed the whitespace around stashes. Take this opportunity to clean up the extra blank lines we were outputting. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-19 11:54:14 -07:00
Steve Howell	4dce34ab8b	refactor: Simplify call to bulk_get_subscriber_user_ids. The way we were computing the dictionary was very convoluted--all we need is a set of subscribed user ids.	2020-10-18 14:27:31 -07:00
Steve Howell	0ca07ffd3c	performance: Eliminate StreamRecipientMap. That class is an artifact of when Stream didn't have recipient_id. Now it's simpler to deal with stream subscriptions. We also save a query during page load (and other places where we get subscriber info).	2020-10-18 14:27:31 -07:00
Steve Howell	2f8ba383ef	tests: Test overhead for creating new users.	2020-10-18 14:27:31 -07:00
Mateusz Mandera	716df658fa	queue_processors: Don't run test queues with run-dev.py.	2020-10-18 14:07:31 -07:00
Steve Howell	e1bcf6124f	refactor: Remove recipient from access_stream_by_name.	2020-10-16 12:58:11 -07:00
Steve Howell	a51b483f1a	performance: Remove recipient from access_stream_by_id. The Recipient table is now kind of useless for stream-related operations, since we have recipient_id on Stream now.	2020-10-16 12:58:11 -07:00
Steve Howell	3685fcc701	refactor: Remove recipient arg for do_mute_topic.	2020-10-16 12:58:11 -07:00
Steve Howell	378062cc83	performance: Avoid call to access_stream_by_id. We already trust ids that are put on our queue for deferred work. For example, see the code for "mark_stream_messages_as_read_for_everyone" We now pass stream_recipient_id when we queue up work for do_mark_stream_messages_as_read. This generally saves about 3 queries per user when we unsubscribe them from a stream.	2020-10-16 12:58:11 -07:00
Steve Howell	2256d72015	minor: Add comment to subscriber test.	2020-10-16 12:58:11 -07:00
Steve Howell	31eb97ddde	performance: Fix do_mark_stream_messages_as_read. This function no longer asks for data that it doesn't need.	2020-10-16 12:58:11 -07:00
Steve Howell	6d1f9de7d3	performance: Use SubInfo when removing subscribers. We get two speedups: * The query to get existing subscribers only gets the two fields we need. We no longer need all the overhead of user_profile and recipient data being returned in the query. * We avoid Django making extra hops to the database to get user info.	2020-10-16 12:58:11 -07:00
Steve Howell	b4346d0276	performance: Extract subscribers/peers in bulk. We replace get_peer_user_ids_for_stream_change with two bulk functions to get peers and/or subscribers. Note that we have three codepaths that care about peers: (1) subscribing existing users: we need to tell peers about new subscribers, and we need to tell the subscribed user about old subscribers; (2) unsubscribing existing users: we only need to tell peers who unsubscribed; (3) subscribing a new user: we only need to tell peers about the new user (right now we generate send_event calls to tell the new user about existing subscribers, but this is a waste of effort that we will fix soon). The two bulk functions are bulk_get_subscriber_peer_info and bulk_get_peers. They have some overlap in the implementation, but there are some nuanced differences that are described in the comments. Looking up peers/subscribers in bulk leads to some nice optimizations. We will save some memcached traffic if you are subscribing to multiple public streams. We will save a query in the remove-subscriber case if you are only dealing with private streams.	2020-10-15 15:12:01 -07:00
Steve Howell	c73f84f275	tests: Improve tests for unsubscribing multiple users. Note that the tests now reflect that we have O(N) behavior for multiple users.	2020-10-15 15:12:01 -07:00
Steve Howell	f86823f82f	tests: Add cache_tries_captured helper.	2020-10-15 15:12:01 -07:00
Steve Howell	a9356508ca	events: Stop sending occupy/vacate events. We used to send occupy/vacate events when either the first person entered a stream or the last person exited. It appears that our two main apps have never looked at these events. Instead, it's generally the case that clients handle events related to stream creation/deactivation and subscribe/unsubscribe. Note that we removed the apply_events code related to these events. This doesn't affect the webapp, because the webapp doesn't care about the "streams" field in do_events_register. There is a theoretical situation where a third party client could be the victim of a race where the "streams" data includes a stream where the last subscriber has left. I suspect in most of those situations it will be harmless, or possibly even helpful to the extent that they'll learn about streams that are in a "quasi" state where they're activated but not occupied. We could try to patch apply_event to detect when subscriptions get added or removed. Or we could just make the "streams" piece of do_events_register not care about occupy/vacate semantics. I favor the latter, since it might actually be what users want, and it will also simplify the code and improve performance.	2020-10-14 10:53:10 -07:00
Steve Howell	1bcb8d8ee8	performance: Avoid computing page_params.streams in webapp. The query to get "occupied" streams has been expensive in the past. I'm not sure how much any recent attempts to optimize that query have mitigated the issue, but since we clearly aren't sending this data, there is no reason to compute it.	2020-10-14 10:53:10 -07:00
Steve Howell	193ca397f9	tests: Include deactivated users for subscribe test.	2020-10-14 10:53:10 -07:00
Aman Agrawal	fbf7cb82a7	web_public_guest: Rename to web_public_visitor for clarity. Using web_public_guest for anonymous users is confusing since 'guest' is actually a logged-in user compared to web_public_guest which is not logged-in and has only read access to messages. So, we rename it to web_public_visitor.	2020-10-13 16:59:52 -07:00
Steve Howell	e7a8c7ac48	test: Improve tests for bulk-adding subscribers. This is a more thorough test of adding multiple streams for multiple users, including streams that users have already subscribed to. The extra queries here are due to the fact that we call `principal_to_user_profile` in a loop in the view. So that's an example of O(N) overhead. We may be able to bulk-fetch these users eventually.	2020-10-13 18:54:55 -04:00
Steve Howell	c29ba75135	refactor: Extract send_messages_for_new_subscribers. This is a pure extraction, except that I remove a redundant check that `len(principals) > 0`. Whenever that value is false, then `new_subscriptions` will only have one possible entry, which is the current user, and we skip that in the loop.	2020-10-13 18:54:55 -04:00
Steve Howell	3b338ec32e	performance: Optimize filter_stream_authorization. We no longer do O(N) queries to get existing streams. This is a somewhat contrived use case--generally, we are not trying to re-subscribe a user to several streams. Still, we want to avoid this. This commit also makes `test_bulk_subscribe_many` do more work, and the change to the test helped me discover this bug.	2020-10-13 18:54:55 -04:00
Anders Kaseorg	6564540d15	docs: Fix some spelling errors. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-13 15:47:13 -07:00
Anders Kaseorg	dd48dbd912	docs: Add spaces to “check out”, “log in”, “set up”, “sign up” as verbs. “Checkout”, “login”, “setup”, and “signup” are nouns, not verbs. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-13 15:47:13 -07:00
Steve Howell	598601e8fc	stream events: Prevent spurious events. If a user asks to be subscribed to a stream that they are already subscribed to, then that stream won't be in new_stream_user_ids, and we won't need to send an event for it. This change makes that happen more automatically.	2020-10-13 11:28:17 -07:00
Steve Howell	766892d8aa	import: Reuse get_last_message_id() helper.	2020-10-13 11:28:17 -07:00
Steve Howell	188cc9bb3b	minor: Fix user/stream in test_subscriptions.	2020-10-13 11:28:17 -07:00
Steve Howell	9df9934ed6	refactor: Pass realm to bulk_add_subscriptions. I think it's important that the callers understand that bulk_add_subscriptions assumes all streams are being created within a single realm, so I make it an explicit parameter. This may be overkill--I would also be happy if we just included the assertions from this commit.	2020-10-13 11:28:17 -07:00
Anders Kaseorg	17ac17286c	python: Catch specific exceptions from subprocess. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-11 16:11:41 -07:00
Anders Kaseorg	1346c5397a	zephyr: Use correct shell quoting for ssh. ssh always runs its command through a shell (after naïvely joining multiple arguments with spaces), so it needs an extra level of shell quoting. This should have no effect because we already validated user with a regex, but it’s better for escaping to be locally correct in case the context changes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-11 16:11:35 -07:00
Alex Vandiver	c2132a4f9c	queue: Drop register_json_consumer / json_drain_queue interface. Now that all callsites use the same interface, drop the now-unused ones, and their tests.	2020-10-11 14:19:42 -07:00
Alex Vandiver	5477b9d9a1	queue: Switch tests to start_json_consumer interface.	2020-10-11 14:19:42 -07:00
Alex Vandiver	f9358d5330	queue: Switch batch interface to use the channel.consume iterator. This low-level interface allows consuming from a queue with timeouts. This can be used to either consume in batches (with an upper timeout), or one-at-a-time. This is notably more performant than calling `.get()` repeatedly (what json_drain_queue does under the hood), which is "highly discouraged as it is very inefficient"[1]. Before this change: ``` $ ./manage.py queue_rate --count 10000 --batch Purging queue... Enqueue rate: 11158 / sec Dequeue rate: 3075 / sec ``` After: ``` $ ./manage.py queue_rate --count 10000 --batch Purging queue... Enqueue rate: 11511 / sec Dequeue rate: 19938 / sec ``` [1] https://www.rabbitmq.com/consumers.html#fetching	2020-10-11 14:19:40 -07:00
Alex Vandiver	571f8b8664	queue: Use low-level queue_purge to empty at the end of tests. This is O(1) at the RabbitMQ API level, and doesn't rely on the code under test to function correctly during test cleanup.	2020-10-09 20:43:49 -07:00
Alex Vandiver	ac0ba21c2c	tests: Stop reusing a variable name. `loopworker_sleep_mock` is a file-level variable used to mock out the sleep() call in LoopQueueProcessingWorker; don't reuse the variable name for something else.	2020-10-09 20:42:20 -07:00
Alex Vandiver	754638f673	tests: Refactor test_queue_worker to separate queues.	2020-10-09 20:42:12 -07:00
Alex Vandiver	d5a6b0f99a	queue: Rename queue_size, and update for all local queues. Despite its name, the `queue_size` method does not return the number of items in the queue; it returns the number of items that the local consumer has delivered but unprocessed. These are often, but not always, the same. RabbitMQ's queues maintain the queue of unacknowledged messages; when a consumer connects, it sends to the consumer some number of messages to handle, known as the "prefetch." This is a performance optimization, to ensure the consumer code does not need to wait for a network round-trip before having new data to consume. The default prefetch is 0, which means that RabbitMQ immediately dumps all outstanding messages to the consumer, which slowly processes and acknowledges them. If a second consumer were to connect to the same queue, they would receive no messages to process, as the first consumer has already been allocated them. If the first consumer disconnects or crashes, all prior events sent to it are then made available for other consumers on the queue. The consumer does not know the total size of the queue -- merely how many messages it has been handed. No change is made to the prefetch here; however, future changes may wish to limit the prefetch, either for memory-saving, or to allow multiple consumers to work the same queue. Rename the method to make clear that it only contains information about the local queue in the consumer, not the full RabbitMQ queue. Also include the waiting message count, which is used by the `consume()` iterator for similar purpose to the pending events list.	2020-10-09 20:40:39 -07:00
Aman Agrawal	8b419c93e4	message_send: Fix old guests being treated as full members. For streams in which only full members are allowed to post, we block guest users from posting there. Guest users were blocked from posting to admin-only streams already. So now, guest users can only post to STREAM_POST_POLICY_EVERYONE streams. This is not a new feature but a bugfix which should have happened when implementing full member stream policy / guest users.	2020-10-08 11:30:11 -07:00
Alex Vandiver	d47637fa40	queue: Set a max consume timeout with SIGALRM. SIGALRM is the simplest way to set a specific maximum duration that queue workers can take to handle a specific message. This only works in non-threaded environments, however, as signal handlers are per-process, not per-thread. The MAX_CONSUME_SECONDS is set quite high, at 10s -- the longest average worker consume time is embed_links, which hovers near 1s. Since just knowing the recent mean does not give much information[1], it is difficult to know how much variance is expected. As such, we set the threshold to be such that only events which are significant outliers will be timed out. This can be tuned downwards as more statistics are gathered on the runtime of the workers. The exception to this is DeferredWorker, which deals with quite-long requests, and thus has no enforceable SLO. [1] https://www.autodesk.com/research/publications/same-stats-different-graphs	2020-10-06 17:26:14 -07:00
Alex Vandiver	baf882a133	queue: Only ACK drain_queue once it has completed work on the list. Currently, drain_queue and json_drain_queue ack every message as it is pulled off of the queue, until the queue is empty. This means that if the consumer crashes between pulling a batch of messages off the queue, and actually processing them, those messages will be permanently lost. Sending an ACK on every message also results in a significant amount of traffic to rabbitmq, with notable performance implications. Send a singular ACK after the processing has completed, by making `drain_queue` into a contextmanager. Additionally, use the `multiple` flag to ACK all of the messages at once -- or explicitly NACK the messages if processing failed. Sending a NACK will re-queue them at the front of the queue. Performance of a no-op dequeue before this change: ``` $ ./manage.py queue_rate --count 50000 --batch Purging queue... Enqueue rate: 10847 / sec Dequeue rate: 2479 / sec ``` Performance of a no-op dequeue after this change (a 25% increase): ``` $ ./manage.py queue_rate --count 50000 --batch Purging queue... Enqueue rate: 10752 / sec Dequeue rate: 3079 / sec ```	2020-10-06 17:26:14 -07:00
Alex Vandiver	8cf37a0d4b	queue: Add a tool to profile no-op enqueue and dequeue actions.	2020-10-06 17:26:14 -07:00
Mateusz Mandera	6e83bcc0d5	custom_profile_fields: Don't allow leading/trailing whitespaces. Allowing such whitespaces can lead to hard to debug issues e.g. with ldap sync.	2020-10-02 14:58:06 -07:00
Aman Agrawal	08fbde4e7c	test_move_msgs: Rename variable for clarity.	2020-10-01 17:45:11 -07:00
Tim Abbott	8c8f3ee13b	test_classes: Extract home view helpers for reuse.	2020-10-01 15:14:25 -07:00
Tim Abbott	6d041a3b34	home: Include is_web_public_guest in page_params.	2020-10-01 15:07:19 -07:00
Aman Agrawal	b0d92b3ff6	HomeTest: Extract page_params keys to be used in other functions.	2020-10-01 14:39:54 -07:00
sahil839	78b98d8067	realm: Add wildcard_mention_policy setting. We add a new wildcard_mention_policy setting to handle wildcard mentions in large streams, with a wide range of policies available to organizations. We set the default to the safe option for preventing accidental spam: only stream administrators being able to use wildcard mentions in large streams.	2020-10-01 12:18:03 -07:00
sahil839	6c473ed75f	message: Call build_message_send_dict from check_message. We call build_message_send_dict from check_message instead of do_send_messages. This is a prep commit for adding a new setting for handling wildcard mentions in large streams.	2020-09-29 17:18:04 -07:00
Steve Howell	c199571112	mypy: Add StreamDict. This requires us to rework the view code a little bit to explicitly assign fields.	2020-09-29 16:49:10 -07:00
Tim Abbott	0c2d1f068d	docs: Extend documentation of event system testing.	2020-09-28 12:37:54 -07:00
Tim Abbott	3242fc7388	soft_deactivation: Fix typo in logging output.	2020-09-28 12:12:04 -07:00
palash	7a7db69935	test_push_notifications: Refactor mock.patch to assertLogs. Replaced mock.patch with assertLogs for testing log outputs in file zerver/tests/test_push_notifications.py	2020-09-28 12:12:00 -07:00
palash	0c18113910	soft_deactivation: Change root logger to zulip.soft_deactivation. Update logger in the following files using this logger: test_soft_deactivation, test_home, test_push_notifications	2020-09-28 12:12:00 -07:00
Dinesh	acca870480	tests: Add a dummy request to self.client.login(). A later commit alters `authenticate` of EmailAuthBackend to store a `needs_to_change_password` variable in the session, which is useful for insisting that users change their weak password. The tests start failing with that change because client.login() runs `authenticate` without a `request` object. So, this commit sends a request object with `request.session=self.client.session` to self.client.login() in tests wherever needed.	2020-09-25 16:24:18 -07:00
Dinesh	232eb8b7cf	auth: Render config error page on configuration error. We previously used to redirect to the config error page with a different URL. This commit renders the config error at the same URL where the configuration error is encountered. This way, when the configuration error is fixed, the user can refresh to continue normally or go back to the login page from the link provided to choose any other auth backend. Also moved those URLs to dev_urls.py so that they can be easily accessed to work on styling etc. In tests, removed some of the asserts checking status code to be 200 as the function `assert_in_success_response` does that check.	2020-09-25 16:16:17 -07:00
Clara Dantas	8674287192	digest: Support digest of web public streams for guest users. This change requires some basic plumbing for test code creating web-public streams.	2020-09-25 16:11:04 -07:00
Tim Abbott	94a9fa1891	event_schema: Add documentation and rename a few functions. This should help make this revised subsystem readable for more new contributors. We still need to make updates to the high-level documentation.	2020-09-25 12:53:00 -07:00
Steve Howell	5b7c9c4714	test_events: Add check_realm_user_remove.	2020-09-25 11:43:20 -07:00
Steve Howell	7bb7f2943f	event_schema: Finish extraction with realm_emoji/update. We now no longer define any schemas in test_events--all of them are in event_schema, which helps our tooling cross-check schemas for openapi and node tests.	2020-09-25 11:43:20 -07:00
Steve Howell	ae4d083a5a	event_schema: Extract check_realm_domains_*.	2020-09-25 11:43:20 -07:00
Steve Howell	298bed9fa1	event_schema: Split check_update_message_flags.	2020-09-25 11:43:20 -07:00
Steve Howell	f6e0171d02	event_schema: Split check_reaction into add/remove. It happens that whether you add a reaction or remove a reaction, we send the exact same fields, just using a different op code. This sort of symmetry is actually kind of rare, as usually "add" events have more fields, and "remove" events might just send an id of something to remove. Our openapi schema treats these as two separate events, so we are more consistent with it, and it helps our schema-checking tooling for node fixtures, too. Note that we now have to exempt the two events from our openapi checks, due to the is_mirror_dummy field in the deprecated user block. We can decide how to handle this later--one possibility is to just add it as an optional field on the event_schema side.	2020-09-25 11:43:20 -07:00
Steve Howell	b7b2546f44	event_schema: Extract check_subscription_update. Note that we use value_type for value instead of bool, since properties can be non-bool things like color, which we just don't test now. We should test them. We more than compensate for this by checking the actual value of the value in check_subscription_update.	2020-09-25 11:43:20 -07:00
Steve Howell	b920ebce81	event_schema: Extract check_has_zoom_token.	2020-09-25 11:43:20 -07:00
Steve Howell	0c4286222f	event_schema: Extract check_realm_update_dict.	2020-09-25 11:43:20 -07:00
Steve Howell	6ec6525624	event_schema: Extract check_delete_message. There is a legacy format where we send singular "message_id" instead of plural "message_ids". Then there are different fields for "private" and "stream" message types.	2020-09-25 11:43:20 -07:00
Steve Howell	88165aee6b	event_schema: Extract check_user_group_update.	2020-09-25 11:43:20 -07:00
Steve Howell	aaaac11661	event_schema: Extract check_user_group_remove.	2020-09-25 11:43:20 -07:00
Steve Howell	1b7af13f37	event_schema: Extract check_user_group_remove_members.	2020-09-25 11:43:20 -07:00
Steve Howell	19b7739065	event_schema: Extract check_user_group_add_members.	2020-09-25 11:43:20 -07:00
Steve Howell	4084f0b949	event_schema: Extract check_realm_user_add. Note that we make the schema for profile_data slightly more realistic, but it doesn't actually get exercised by our current tests (apart from making sure it's a dict), since we don't have profile data for our test realm. We also don't have the optional fields for bots, since our tests don't exercise that, nor delivery_email. So we exempt realm_user_add_event from openapi checks for now. When we try to match the openapi specs better, we will probably want to add a few tests to test_events. Obviously getting good coverage for adding users would be nice for all these scenarios: * delivery_email matters * bots * realm has profile fields	2020-09-25 11:43:19 -07:00
Steve Howell	dc2176a965	event_schema: Extract check_presence.	2020-09-25 11:43:19 -07:00
Steve Howell	6c74a44697	data_types: Generalize StringDictType. This is a prep commit for supporting "presence" events, where the key of the dictionary is some arbitrary string like "website" but the value of the dictionary is another dictionary itself with keys that are more like variable names.	2020-09-25 11:43:19 -07:00
Steve Howell	4f3d5f2d87	event_schema: Extract check_realm_filters. We have some known issues with representing tuples in openapi, so we exempt realm_filters from the relevant check.	2020-09-25 11:43:19 -07:00
Steve Howell	e40a5400e5	event_schema: Extract check_muted_topics. This also forces us to create TupleType. We exempt this from the openapi check, since we haven't figured out how to model tuples in openapi with the same precision as event_schema (and it may be impossible). Long term we just want to stop dealing in tuples, of course.	2020-09-25 11:43:19 -07:00
orientor	91ca1afe98	data_type: Add StringDict data type. StringDict is a data type for representing dictionaries where all keys and values are strings. Add this data type to data_types.py and edit other files so that this data type is put to use and tested. (slightly tweaked by @showell to remove a comment and shorten a var name now that we have a proper data type)	2020-09-25 11:43:19 -07:00
Steve Howell	78a2059b8d	event schema: Extract attachment checkers.	2020-09-25 11:43:19 -07:00
Steve Howell	4a947c971d	event_schema: Extract check_realm_export. These are all trivial transformations. Note that we don't insist timestamps are floats; the NumberType class allows ints too.	2020-09-25 11:43:19 -07:00
Steve Howell	d28c01284c	event_schema: Extract check_hotspots. This forces us to introduce a NumberType.	2020-09-25 11:43:19 -07:00
Steve Howell	cf26151cea	event_schema: Use realm_user_person_types. For realm_user events, we now structure the person type as a union of dicts, which is more consistent with how we model this in our openapi spec.	2020-09-25 11:43:19 -07:00
Steve Howell	10952394b0	test_events: Use int value of message_retention_days. We also make our schema in event_schema reflect this, which in turn makes us match the already accurate openapi spec, so we no longer need to exempt four types of events from our sanity checks.	2020-09-25 11:43:19 -07:00
Steve Howell	73e7f7edec	check-node-fixtures: Compare python/openapi schemas. We might want to rename the tool to something more general now, since we are really reconciling three things: - node fixtures - event_schema checkers for test_events - openapi specs The way we compare python and openapi schemas is as follows: - first convert openapi schemas to be built from DictType, ListType, etc. with from_openapi - do a diff on the schemas Most of the new code is just having the FooType family of classes serialize themselves with schema().	2020-09-25 11:43:19 -07:00
Wes Galbraith	9645959ac4	populate_db: Add emoji reactions to development environment database. This change adds automatically generated emoji reactions to the data in the development environment's database. Fixes part of #14991.	2020-09-23 16:10:37 -07:00
sahil839	fe370debe5	tests: Rename stream messages tests in test_message_send.py. This commit renames 'test_message_to_self' and 'test_api_message_to_self' tests to 'test_message_to_stream_by_name' and 'test_api_message_to_stream_by_name' to depict the actual purpose of these tests.	2020-09-23 15:28:31 -07:00
Aman Agrawal	48492a0633	fetch_initial_state_data: Pass realm as independent parameter. This removes dependency of the function on user_profile to get the realm, which will be useful when user_profile is None in case of web public guests.	2020-09-23 12:06:54 -07:00
Alex Vandiver	7001004ec0	webhooks: Do not predicate on the "payload" key. If we are to log to the webhook logger, do so no matter which arguments are passed.	2020-09-22 15:11:48 -07:00
Alex Vandiver	d24869e484	webhooks: Rename is_webhook to allow_webhook_access. This argument does not define if an endpoint "is a webhook"; it is set for "/api/v1/messages", which is not really a webhook, but allows access from webhooks.	2020-09-22 15:11:48 -07:00
Aman Agrawal	1b5b82e712	RealmFilterPattern: Mark converted content as AtomicString. If multiple filters match the same string, we run into an infinite loop of converting string into urls. To fix it, we mark the matched string as atomic after first conversion.	2020-09-22 15:10:38 -07:00
Anders Kaseorg	e70f2ae58d	rest: Specify rest_dispatch handlers by function, not by string. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-22 10:46:28 -07:00
Anders Kaseorg	faf600e9f5	urls: Remove unused URL names and shorten others. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-22 10:46:28 -07:00
Alex Vandiver	db8daf4175	linkifiers: Allow tildes in target URLs.	2020-09-21 21:04:02 -07:00
Sumanth V Rao	c563cdba61	markdown: Add data-code-lang attribute for locally echoed messages. This mimics the backend logic for adding the data-attribute - to know what Pygments language was used to highlight the code block - in locally echoed messages. New test added checks our logic for canonicalizing pygments alias (for both frontend and backend). Other fixtures and tests amended.	2020-09-18 17:12:26 -07:00
Alex Vandiver	f638518722	tornado: Move default production port to 9800. In development and test, we keep the Tornado port at 9993 and 9983, respectively; this allows tests to run while a dev instance is running. In production, moving to port 9800 consistently removes an odd edge case, when just one worker is on an entirely different port than if two workers are used.	2020-09-18 15:13:40 -07:00
Alex Vandiver	4354386e69	tornado: Remove an unused port argument. This was added in `ec065e92ee` for the WebSocket codepath, which was subsequently removed in `ea6934c26d`.	2020-09-18 15:13:40 -07:00
Tim Abbott	ae58ed5a74	markdown: Tweak data-code-language testing and comments. This should make it clearer the precise decisions we've made about the intended semantics of this feature.	2020-09-15 12:30:57 -07:00
Sumanth V Rao	b0c9e0a295	markdown: Rename fenced code data-attribute to data-code-language.	2020-09-15 20:09:58 +05:30
Sumanth V Rao	033351609d	markdown: Add data-codehilite-language attr for fenced code. When converting fenced code markdown, we add the language (if specified) in a data-attribute by tweaking the HTML generated. Doing so allows the frontend to make use of this attr to display a view-in-playground option for code blocks. We use pygments to get the lexer subclass name and use that instead of directly using the language in the data-attribute. Doing so helps us map different language aliases (like `js` and `javascript`) into a common variable (like `JavaScript`) and saves the client from dealing with multiple tags corresponding to the same language. The html structure for a message like this: ``` js ..content.. ``` would now be: <div class="codehilite" data-codehilite-language="JavaScript"> <pre>..content..</pre> </div> Tests and fixtures amended.	2020-09-14 21:25:19 -07:00
Aman Agrawal	2bc3924672	move_topic_to_stream: Allow moving to/between/from private streams. Fixes #16284. Most of the work for this was done when we implemented correct behavior for guest users, since they treat public streams like private streams anyway. The general method involves moving the messages to the new stream with special care of UserMessage. We delete UserMessages for subs who are losing access to the message. For private streams with protected history, we also create UserMessage elements for users who are not present in the old stream, since that's important for those users to access the moved messages.	2020-09-14 15:00:55 -07:00
Anders Kaseorg	ddf8ec33df	upload: Strip leading slash from deleted S3 export paths. Previously, S3UploadBackend.delete_export_tarball failed to strip the leading ‘/’ from the export path. This mistake is now caught by Moto 1.3.15. I expect it caused deletion failures in the real S3, although I haven’t verified this. We store export_path in the audit log with a leading ‘/’, but the actual S3 keys do not have a leading ‘/’. Changing either system would require a migration. So the new convention is that the variables named ‘export_path’ have a leading ‘/’, while variables named ‘path_id’ or ‘key’ do not. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-13 20:59:09 -07:00
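
The ACK-after-processing pattern described in baf882a133 (turn `drain_queue` into a context manager, ACK the whole batch once on success, NACK it on failure so it is re-queued) can be illustrated with a minimal sketch. This is not the Zulip implementation: `FakeChannel` and its `get_all`, `ack_multiple`, and `nack_multiple` methods are hypothetical in-memory stand-ins invented for the example, not the RabbitMQ client API, and the `drain_queue` signature shown here is an assumption.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeChannel:
    """Hypothetical in-memory stand-in for a broker channel (illustration only)."""

    def __init__(self, messages: List[str]) -> None:
        self.pending = list(messages)  # messages still on the queue
        self.unacked: List[str] = []   # delivered but not yet acknowledged
        self.ack_calls = 0             # how many ACKs were sent

    def get_all(self) -> List[str]:
        # Deliver everything currently queued; it stays unacked for now.
        self.unacked, self.pending = self.pending, []
        return list(self.unacked)

    def ack_multiple(self) -> None:
        # One ACK covering every delivered message.
        self.ack_calls += 1
        self.unacked = []

    def nack_multiple(self) -> None:
        # Put the delivered messages back at the front of the queue.
        self.pending = self.unacked + self.pending
        self.unacked = []


@contextmanager
def drain_queue(channel: FakeChannel) -> Iterator[List[str]]:
    batch = channel.get_all()
    try:
        yield batch                # the caller processes the batch here
    except Exception:
        channel.nack_multiple()    # processing failed: re-queue the batch
        raise
    else:
        channel.ack_multiple()     # processing finished: a single ACK


channel = FakeChannel(["a", "b", "c"])
with drain_queue(channel) as batch:
    processed = [message.upper() for message in batch]
```

If the body of the `with` block raises, the batch is returned to the front of `pending` instead of being lost, which is the failure mode the commit message describes for per-message ACKs.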

5279 Commits