zulip

Commit Graph

Author	SHA1	Message	Date
Tim Abbott	1ea2f188ce	tornado: Rewrite Django integration to duplicate less code. Since essentially the first use of Tornado in Zulip, we've been maintaining our Tornado+Django system, AsyncDjangoHandler, with several hundred lines of Django code copied into it. The goal for that code was simple: We wanted a way to use our Django middleware (for code sharing reasons) inside a Tornado process (since we wanted to use Tornado for our async events system). As part of the Django 2.2.x upgrade, I looked at upgrading this implementation to be based off modern Django, and it's definitely possible to do that: * Continue forking load_middleware to save response middleware. * Continue manually running the Django response middleware. * Continue working out a hack involving copying all of _get_response to change a couple lines allowing our Tornado code to not actually return the Django HttpResponse so we can long-poll. The previous hack of returning None stopped being viable with the Django 2.2 MiddlewareMixin.__call__ implementation. But I decided to take this opportunity to look at trying to avoid copying material Django code, and there is a way to do it: * Replace RespondAsynchronously with a response.asynchronous attribute on the HttpResponse; this allows Django to run its normal plumbing happily in a way that should be stable over time, and then we proceed to discard the response inside the Tornado `get()` method to implement long-polling. (Better yet might be raising an exception?). This lets us eliminate maintaining a patched copy of _get_response. * Removing the @asynchronous decorator, which didn't add anything now that we only have one API endpoint backend (with two frontend call points) that could call into this. Combined with the last bullet, this lets us remove a significant hack from our never_cache_responses function. * Calling the normal Django `get_response` method from zulip_finish after creating a duplicate request to process, rather than writing totally custom code to do that. This lets us eliminate maintaining a patched copy of Django's load_middleware. * Adding detailed comments explaining how this is supposed to work, what problems we encounter, and how we solve various problems, which is critical to being able to modify this code in the future. A key advantage of these changes is that the exact same code should work on Django 1.11, Django 2.2, and Django 3.x, because we're no longer copying large blocks of core Django code and thus should be much less vulnerable to refactors. There may be a modest performance downside, in that we now run both request and response middleware twice when longpolling (once for the request we discard). We may be able to avoid the expensive part of it, Zulip's own request/response middleware, with a bit of additional custom code to save work for requests where we're planning to discard the response. Profiling will be important to understanding what's worth doing here.	2020-02-13 16:13:11 -08:00
Chris Heald	a91358e186	webhooks: Fix hellosign webhook. Hellosign now posts their callback as form/multipart, which Django only permits to be read once. Attempts to access request.body after the initial read throw "django.http.request.RawPostDataException: You cannot access body after reading from request's data stream". Fixes #13847.	2020-02-12 22:36:11 -08:00
Mateusz Mandera	27b15a9722	install: Don't create internal realm in the installation process.	2020-02-12 12:00:10 -08:00
Mateusz Mandera	bde495db87	registration: Add support for mobile and desktop flows. This makes it possible to create a Zulip account from the mobile or desktop apps and have the end result be that the user is logged in on their mobile device. We may need small changes in the desktop and/or mobile apps to support this. Closes #10859.	2020-02-12 11:22:16 -08:00
Mateusz Mandera	fe33966642	sessions: Implement the concept of expirable session variables. This can be useful in the future for various things, and right now it'll specifically be used in the signup mobile/desktop flows.	2020-02-12 11:09:55 -08:00
Hashir Sarwar	eb23c6fa6c	test_fixtures: Clean up interface for `template_database_status()`. 1) Created a new class `DatabaseType` and access its objects inside `template_database_status()` instead of sending five arguments with default values. 2) Made `check_files` and `setting_name` local variables instead of function parameters since they had same value(None) for every call. Fixes #13845.	2020-02-12 11:07:10 -08:00
Tim Abbott	96b0ec705d	email_notifications: Fix missing translation tags on sender.	2020-02-12 10:54:34 -08:00
Anders Kaseorg	e257253e64	emoji_codes: Replace JS module with JSON module. webpack optimizes JSON modules using JSON.parse("{…}"), which is faster than the normal JavaScript parser. Update the backend to use emoji_codes.json too instead of the three separate JSON files. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-12 10:09:12 -08:00
Tim Abbott	cb2c96f736	test_templates: Remove shallow template rendering code. This code was very useful when first implemented to help catch errors where our backend templates didn't render, but has been superseded by the success of our URL coverage testing (which ensures every URL supported by Zulip's urls.py is accessed by our tests, with a few exceptions) and other tests covering all of the emails Zulip sends. It has a significant maintenance cost because it's a bit hacky and involves generating fake context, so it makes sense to remove these. Any future coverage issues with templates should be addressed with a direct test that just accesses the relevant URL or sends the relevant email.	2020-02-11 18:00:15 -08:00
Mateusz Mandera	2475adbf8a	messages_for_topic: Use stream.recipient_id for more efficient query.	2020-02-11 17:39:43 -08:00
Chris Heald	bddb370750	tests: Reorder python version logic to be more clear.	2020-02-11 17:34:56 -08:00
Chris Heald	3236483d0e	tests: Fix type reflection for Python 3.7. In python 3.5-3.6, generic types had an __origin__ attribute which indicated which generic they originated from; the code was reflecting on that value to check types against the openapi spec. In python3.7, this changed, and there's no longer an immediately simple way to get this information in all cases. __origin__ appears to be the implementing class now, returning `list` or `collections.abc.Iterator` rather than `typing.List` and `typing.Iterator`. This adds a sloppy-but-effective mechanism for inferring if a type maps to the List/Dict/Iterator/Mapping types and gets the test suite passing again.	2020-02-11 17:34:56 -08:00
Dinesh	4304d5f8db	auth: Add support for GitLab authentication. With some tweaks by tabbott to the documentation and comments. Fixes #13694.	2020-02-11 13:54:17 -08:00
Steve Howell	900f98c0c5	presence: Use realm_id for UserPresence queries. We now use realm_id for querying UserPresence instead of building a big WHERE clause from the list of user_ids. This commit may be a bit hard to measure, since we still get the list of user_ids for the PushToken query in the same method.	2020-02-11 13:11:58 -08:00
Steve Howell	d68052b68d	presence: Add realm/timestamp index to UserPresence. It adds this index: "zerver_userpresence_realm_id_timestamp_25f410da_idx" btree (realm_id, "timestamp") We expect this index to provide a major performance improvement when fetching presence data for the whole realm from the database on servers like zulipchat.com hosting several realms.	2020-02-11 13:11:28 -08:00
Tim Abbott	fcac3a4342	recipients: Rename extract_recipients to extract_private_recipients. Recent changes mean this function is now only used for private messages.	2020-02-11 12:28:14 -08:00
Steve Howell	1b6578cafd	messages: Fix bug with commas in stream names. We now validate streams with a separate function from PM recipients. It's confusing enough all the ways you can encode a stream or encode the PM recipients, but trying to do it all in one function was hard to reason about and led to at least one bug. In particular, there was a bug where streams with commas in them would get split. Now we just don't ever split on commas inside of `extract_stream_indicator`. Fixes #13836	2020-02-11 12:20:54 -08:00
Steve Howell	96132fe0e9	extract_recipients: Enforce str as incoming type. After removing internal_send_message() in a recent commit, we now have only two callers for extract_recipients, and they are both related to our REQ mechanism that always passes strings to converters. (If there are default values, REQ does not call the converters.) We therefore make two changes: - use the more strict annotation of "str" for the `s` parameter - don't bother with the isinstance check	2020-02-11 12:20:54 -08:00
Steve Howell	8c3eaeb872	Remove obsolete internal_send_messages(). We have been phasing this out for a couple years, and I fixed the last stragglers over the last couple days.	2020-02-11 12:20:54 -08:00
Steve Howell	2e8dec233e	slow queries: Use internal_send_stream_message(). Note that while the test mocks the actual message send, we now have a `get_stream` call in the queue worker, so we have to set up a real stream for testing (or we could have mocked that as well, but it didn't seem necessary). The setup queries add to the amount of queries reported by the test, plus the `get_stream` call. I just made the query count a digits regex, which is a little bit lame, but I don't think it's worth risking test flakes for this.	2020-02-11 12:20:54 -08:00
Steve Howell	e37d660d19	error_notify: Use internal_send_stream_message().	2020-02-11 12:20:53 -08:00
Steve Howell	c4e3cfebb0	presence: Add realm_id to UserPresence. This index is intended to optimize the performance of the very frequently run query of "what is the presence status of all users in a realm?". Main changes: - add realm_id to UserPresence - add index for realm_id - backfill realm_id for old rows - change all writes to UserPresence to include realm_id The index is of this form: "zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id) We will create an index on (realm_id, timestamp) in a future commit, but I think it's a bit faster if you do the backfill before the index. There's also a minor tweak to the populate_db script.	2020-02-10 17:21:45 -08:00
Steve Howell	28a8ffbc4c	email_mirror: Use internal_send_stream_message(). This is just a refactoring to the more modern API for sending internal messages. To make this work we now plumb the email_gateway flag through `internal_send_stream_message` instead of `internal_send_message`. We also change `send_zulip` to have its callers pass in a full UserProfile object (which one of them already had).	2020-02-10 15:45:13 -08:00
Steve Howell	6922eef380	signups: Use internal_send_stream_message(). We prefer this to internal_send_message(). We are trying to deprecate `internal_send_message`, which has extra moving parts related to `extract_recipients` and `Addressee.legacy_build`. There are two chunks of code that I touch here that look pretty similar, but I'm not quite sure they're worth de-duplicating, since they use different topics and different message content.	2020-02-10 15:45:13 -08:00
Steve Howell	f1ac16973c	tests: Create signups stream in RealmCreationTests.	2020-02-10 15:45:13 -08:00
Steve Howell	b33552997e	cross realm bots: Simplify notify_new_user. Instead of having `notify_new_user` delegate all the heavy lifting to `send_signup_message`, we just rename `send_signup_message` to be `notify_new_user` and remove the one-line wrapper. We remove a lot of obsolete complexity: - `internal` was no longer ever set to True by real code, so we kill it off as well as killing off the internal_blurb code and the now-obsolete test - the `sender` parameter was actually an email, not a UserProfile, but I think that got past mypy due to the caller passing in something from settings.py - we were only passing in NOTIFICATION_BOT for the sender, so we just hard code that now - we eliminate the verbose `admin_realm_signup_notifications_stream` parameter and just hard code it to "signups" - we weren't using the optional realm parameter There's also a long ugly comment in `get_recipient_info` related to this code that I amended for now. We should try to take action in a subsequent commit.	2020-02-10 15:45:13 -08:00
Steve Howell	6e40db4b1f	minor: Fix misleading comments. These comments were naming the wrong function.	2020-02-10 15:45:13 -08:00
Hashir Sarwar	dcbd3e486f	stream_subscription: Remove unused TypedDict `SubInfo`.	2020-02-10 14:04:22 -08:00
Steve Howell	2ff41bf9e5	/json/users: Use field.realm for realm lookup. This avoids an unnecessary join to UserProfile. To verify this, you can do `print(queries)` in the `test_get_custom_profile_fields_from_api` test. It's kinda noisy, so I excerpted them below... Before: SELECT ... FROM "zerver_customprofilefieldvalue" INNER JOIN "zerver_userprofile" ON ("zerver_customprofilefieldvalue"."user_profile_id" = "zerver_userprofile"."id") INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id") WHERE "zerver_userprofile"."realm_id" = 2 After: SELECT ... FROM "zerver_customprofilefieldvalue" INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id") WHERE "zerver_customprofilefield"."realm_id" = 2 I don't have any way to measure the two queries with realistic data, but I would assume the second query is significantly faster on most of our instances, since CustomProfileField should be tiny.	2020-02-09 22:04:02 -08:00
Steve Howell	9303c386b8	tests: Count queries for /json/users. I am trying to optimize a query in this endpoint. I don't think I'll actually reduce the number of queries, but I wanted to capture the query and this was the easiest way to do it, so might as well check in the code! :)	2020-02-09 22:04:02 -08:00
Steve Howell	01f180d042	minor: Remove unused line of code in get_raw_user_data(). The line removed here is a noop, as both sides of the immediately following conditional reassign the same variable. This harmless cruft was the result of the recent commit `1ae5964ab8`, which added support for single-user GETs.	2020-02-09 22:04:02 -08:00
Tim Abbott	986706c7e5	tornado: Use common code for copying headers. This fixes a bug where our asynchronous requests were only copying the Content-Type header (i.e. the one case where we'd noticed) from the Django HttpResponse. I'm not sure what the impact of this would be; the rate-limiting headers rarely come up when breaking a long-polled request. But it seems clearly an improvement to do this in a consistent fashion. Only the headers piece is a change; in Tornado self.finish(x) is equivalent to: self.write(x) self.finish()	2020-02-07 16:14:19 -08:00
Tim Abbott	224a73a3ec	tornado: Extract a function for writing Tornado responses. This increases the readability of what's happening in our core Tornado handlers code, as well as making this logic reusable.	2020-02-07 16:13:49 -08:00
Tim Abbott	5305e8af85	tornado: Extract convert_tornado_request_to_django_request.	2020-02-07 16:03:58 -08:00
Tim Abbott	fc58ae117a	handlers: Rename confusingly named response to result_dict. This should somewhat increase the readability of zulip_finish.	2020-02-07 16:03:58 -08:00
Vishnu KS	4572be8c27	api: Rename subject_links to topic_links. Fixes #13588	2020-02-07 14:35:22 -08:00
Tim Abbott	84edb5c516	test_fixtures: Fix buggy reuse of status_dir between databases. Apparently, the arguments passed to template_database_status were incorrect for the manual testing development database, in that we didn't pass a status_dir when calling into that code from provision. The result was that provisioning before running `test-backend` would ignore changes to the list of check_files (etc.) made after rebasing, and vice versa. The cleanest fix is to compute status_dir from other values passed in; I'm also going to open a follow-up issue for creating a better overall interface here.	2020-02-07 13:33:08 -08:00
akashaviator	1ae5964ab8	api: Add an api endpoint for GET /users/{id}. This adds a new API endpoint for querying basic data on a single other user in the organization, reusing the existing infrastructure (and view function!) for getting data on all users in an organization. Fixes #12277.	2020-02-07 10:36:31 -08:00
Tim Abbott	e39840c705	users: Add read-only mode for access_user_by_id. We'll be using this in the upcoming GET /users/{id} method.	2020-02-07 10:36:31 -08:00
Tim Abbott	aa9286a1f9	users: Move query into caller of get_custom_profile_field_values. This will be useful for supporting a smaller query for a single user.	2020-02-07 10:36:31 -08:00
Tim Abbott	79e5dd1374	users: Rename get_raw_user_data user parameter to acting_user. This is for improved clarity as we extend this function to take multiple user objects.	2020-02-07 10:36:31 -08:00
Steve Howell	7e99e7feb2	presence: Extract get_legacy_user_info. This code is a bit flatter and just preps the data for a single user. There is never any interaction between the data for user A and user B, so we can mostly avoid complicated nested data structures and do most of the data-crunching on a per-user basis. We also do an explicit sort of the data before running it through groupby. The explicit sort simplifies how we calculate `most_recent_info` and also avoids needing to add `dt` to an intermediate data structure. Finally, when it comes to the individual client data, the code has relied on the assumption that there is only one row per client, which I believe to be true, but now the code is more explicit about that.	2020-02-06 17:16:22 -08:00
Steve Howell	bf3baa14ac	presence: Rename get_status_dict_by_user().	2020-02-06 17:16:22 -08:00
Steve Howell	675f8514e8	presence: Rename get_status_dict(). We renamed this to get_presences_for_realm(), and we have the caller pass in realm, not user_profile.	2020-02-06 17:16:22 -08:00
Steve Howell	363e6bf239	presence: Move get_status_dicts_for_rows().	2020-02-06 17:16:22 -08:00
Steve Howell	36fba1076f	presence: Move get_status_dict_by_user.	2020-02-06 17:16:22 -08:00
Steve Howell	6f027d84a9	presence: Move get_status_dict_by_realm.	2020-02-06 17:16:22 -08:00
Steve Howell	703338dfa3	presence: Extract lib/presence.py. This will make more sense when we pull some code out of the model.	2020-02-06 17:16:22 -08:00
Steve Howell	a5093be867	presence: Rename get_status_list. The word "status" is vague, and this isn't actually returning a list, so we now name it get_presence_response. I originally was gonna rename this to get_presence_dict, but there's a function called get_status_dict that returns a subset of the response, so I think it's a bit more clear that this is the bigger dict that actually gets sent back.	2020-02-06 17:16:22 -08:00
Steve Howell	8a1fb2dcd6	presence: Calculate server_timestamp slightly earlier. We want to err on the side of server_timestamp being old, since we may eventually use this to make responses just include incremental changes, and we don't want a time window (however small) when we miss presence rows. The clients will be able to deal with duplicate data to the extent that the time windows are overlapping. Also, extracting the other local var here (for `presences`) will set up a subsequent commit where we re-format the data for clients with slim_presence=True.	2020-02-06 17:16:22 -08:00
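
Commit 3236483d0e above notes that on Python 3.7 `__origin__` returns the implementing class (`list`, `collections.abc.Iterator`) rather than `typing.List` or `typing.Iterator`. A minimal sketch of one way to infer the List/Dict/Iterator/Mapping category from that attribute follows. `generic_kind` and `_ORIGIN_TO_NAME` are hypothetical names chosen for illustration, not the code from that commit, and the sketch assumes Python 3.7 or later.

```python
import collections.abc
import typing

# Hypothetical lookup table (not taken from the Zulip commit): maps the
# runtime class reported by __origin__ on Python 3.7+ to a generic name.
_ORIGIN_TO_NAME = {
    list: "List",
    dict: "Dict",
    collections.abc.Iterator: "Iterator",
    collections.abc.Mapping: "Mapping",
}


def generic_kind(tp):
    """Return "List", "Dict", "Iterator", or "Mapping" for a matching
    typing generic, or None for anything else."""
    # Plain classes such as int have no __origin__ attribute at all.
    origin = getattr(tp, "__origin__", None)
    return _ORIGIN_TO_NAME.get(origin)


if __name__ == "__main__":
    for tp in (typing.List[int], typing.Dict[str, int], int):
        print(tp, "->", generic_kind(tp))
```

A table lookup keeps the mapping explicit and easy to extend, at the cost of ignoring generics that are not listed; a real checker would likely need to handle more cases than this.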


10857 Commits