zulip

Commit Graph

Author	SHA1	Message	Date
Steve Howell	6f62c993a6	refactor: Extract get_existing_user_errors. This is a prep commit that will allow us to more efficiently validate a bunch of emails in the invite UI. This commit does not yet change any behavior or performance. A secondary goal of this commit is to prepare us to eliminate some hackiness related to how we construct `ValidationError` exceptions. It preserves some quirks of the prior implementation: - the strings we decided to translate here appear haphazard (and often get ignored anyway) - we use `msg` in most codepaths, but use `code` for invites Right now we never actually call this with more than one email, but that will change soon. Note that part of the rationale for the inner method here is to avoid a test coverage bug with `continue` in loops.	2020-03-06 11:53:22 -08:00
Steve Howell	689aca9140	refactor: Extract validate_email_is_valid(). This has two goals: - sets up a future commit to bulk-validate emails - the extracted function is more simple, since it just has errors, and no codes or deactivated flags This commit leaves us in a somewhat funny intermediate state where we have `action.validate_email` being a glorified two-line function with strange parameters, but subsequent commits will clean this up: - we will eliminate validate_email - we will move most of the guts of its other callee to lib/email_validation.py To be clear, the code is correct here, just kinda in an ugly, temporarily-disorganized intermediate state.	2020-03-06 11:53:22 -08:00
Steve Howell	4f5b07a7e6	refactor: Extract zerver/lib/email_validation.py.	2020-03-06 11:53:22 -08:00
Steve Howell	30b43605c3	invite performance: Reduce RealmDomain queries. We now use the `get_realm_email_validator()` helper to build an email validator outside the loop of emails in our invite list. This allows us to perform RealmDomain queries only once per request, instead of once per email.	2020-03-06 11:53:22 -08:00
Steve Howell	57f1aa722c	refactor: Rename validate_email_for_realm. Now called: validate_email_not_already_in_realm We have a separate validation function that makes sure that the email fits into a realm's domain scheme, and we want to avoid naming confusion here.	2020-03-06 11:53:22 -08:00
Steve Howell	c43a29ff54	invites: Fix bug with inviting cross realm bots. Without the fix here, you will get an exception similar to below if you try to invite one of the cross realm bots. (The actual exception is a bit different due to some rebasing on my branch.) File "/home/zulipdev/zulip/zerver/lib/request.py", line 368, in _wrapped_view_func return view_func(request, *args, **kwargs) File "/home/zulipdev/zulip/zerver/views/invite.py", line 49, in invite_users_backend do_invite_users(user_profile, invitee_emails, streams, invite_as) File "/home/zulipdev/zulip/zerver/lib/actions.py", line 5153, in do_invite_users email_error, email_skipped, deactivated = validate_email(user_profile, email) File "/home/zulipdev/zulip/zerver/lib/actions.py", line 5069, in validate_email return None, (error.code), (error.params['deactivated']) TypeError: 'NoneType' object is not subscriptable Obviously, you shouldn't try to invite a cross realm bot to your realm, but we want a reasonable error message. RESOLUTION: Populate the `code` parameter for `ValidationError`. BACKGROUND: Most callers to `validate_email_for_realm` simply catch the `ValidationError` and then report a more generic error. That's also what `do_invite_users` does, but it has the somewhat convoluted codepath through `validate_email` that triggers this code: try: validate_email_for_realm(user_profile.realm, email) except ValidationError as error: return None, (error.code), (error.params['deactivated']) The way that we're using the `code` parameter for `ValidationError` feels hacky to me. The intention behind `code` is to provide a descriptive error to calling code, and it's not intended for humans, and it feels strange that we actually translate this in other places. Here are the Django docs: https://docs.djangoproject.com/en/3.0/ref/forms/validation/ And then here's an example of us actually translating a code (not part of this commit, just providing context): raise ValidationError(_('%s already has an account') % (email,), code = _("Already has an account."), params={'deactivated': False}) Those codes eventually get put into InvitationError, which inherits from JsonableError, and we do actually display these errors in the webapp: if skipped and len(skipped) == len(invitee_emails): # All e-mails were skipped, so we didn't actually invite anyone. raise InvitationError(_("We weren't able to invite anyone."), skipped, sent_invitations=False) I will try to untangle this somewhat in upcoming commits.	2020-03-06 11:53:22 -08:00
Rohitt Vashishtha	2fab45e530	bugdown: Use AtomicString in UserMentionPattern. This fixes the user-mention counterpart of #14080.	2020-03-06 11:35:56 -08:00
Rohitt Vashishtha	7f9d8e1907	bugdown: Use AtomicString in UserGroupMentionPattern. This fixes the user-group counterpart of #14080.	2020-03-06 11:35:56 -08:00
Mateusz Mandera	3922fb3a92	events: Clean up delete_message event processing code.	2020-03-03 15:52:42 -08:00
Rohitt Vashishtha	ff5e2b6eb7	bugdown: Avoid hanging list paragraphs being processed as codeblocks. Previously, the input: ==================== - One - Two Two continued ==================== Would produce the same output as: ==================== - One - Two ``` Two continued ``` ==================== This was because our CodeBlockProcessor had a higher priority than the ListIndentProcessor. This issue was discussed here: https://chat.zulip.org/#narrow/stream/9-issues/topic/continuation.20paragraphs.20in.20list.20items.	2020-03-03 12:08:19 -08:00
Rohitt Vashishtha	cd7396e732	bugdown: Update outdated comment about Zulip's heading support.	2020-03-03 11:54:18 -08:00
Rohitt Vashishtha	62a7e464fb	bugdown: Use AtomicString in StreamPattern. This fixes the stream counterpart of #14080.	2020-03-02 00:03:33 -08:00
Rohitt Vashishtha	245de9e1e2	bugdown: Use AtomicString in StreamTopicPattern. Fixes #14080.	2020-03-02 00:03:33 -08:00
Mateusz Mandera	05e7214690	do_delete_messages: Handle empty set of messages passed as input. /delete_topic endpoint could be used to request the deletion of a topic, that would cause do_delete_messages to be called with an empty set in these cases: 1. Requesting deletion of an empty stream. 2. Requesting deletion of a topic in a private stream with history not public to subscribers, if the requesting admin doesn't have access to any of the messages in that topic.	2020-03-02 00:01:35 -08:00
Steve Howell	94192395fb	perf: Extract Stream.get_client_data. This function slims down the data that we get from the database in order to create the streams part of our client payload. We also fix a typo. We also clearly distinguish between queries and lists here.	2020-03-01 22:38:03 -08:00
Steve Howell	49b8218463	perf: Extract get_subscribed_stream_ids_for_user. This new method prevents us from getting fat objects from the database. Instead, now we just get ids from the database to build our subqueries. Note that we could also technically eliminate the `set(...)` wrappers in this code to have Django make a subquery and save a round trip. I am postponing that for another commit (since it's still somewhat coupled to some other complexity in `do_get_streams` that I am trying to cut through, plus it's not the main point of this commit.) BEFORE: # old, still in use for other codepaths def get_stream_subscriptions_for_user(user_profile: UserProfile) -> QuerySet: # TODO: Change return type to QuerySet[Subscription] return Subscription.objects.filter( user_profile=user_profile, recipient__type=Recipient.STREAM, ) user_subs = get_stream_subscriptions_for_user(user_profile).filter( active=True, ).select_related('recipient') recipient_check = Q(id__in=[sub.recipient.type_id for sub in user_subs]) AFTER: # newly added def get_subscribed_stream_ids_for_user(user_profile: UserProfile) -> QuerySet: return Subscription.objects.filter( user_profile_id=user_profile, recipient__type=Recipient.STREAM, active=True, ).values_list('recipient__type_id', flat=True) subscribed_stream_ids = get_subscribed_stream_ids_for_user(user_profile) recipient_check = Q(id__in=set(subscribed_stream_ids))	2020-03-01 22:38:03 -08:00
Steve Howell	eb368c9c92	performance: Optimize max_message_id calculation. We calculate `max_message_id` for the mobile client. Our query now no longer joins to the Message table and just grabs one value instead of fat objects.	2020-03-01 22:38:03 -08:00
Chris Bobbe	23ba2b63c5	push_notifications: In dev, make APNs or GCM config suffice.	2020-02-28 16:49:35 -08:00
Steve Howell	504ec9d489	typing: Remove recipient-related complexity. For historical reasons we were creating Recipient objects at some point in the typing-notifications codepath. Now we just work with UserProfiles. This removes some queries, as indicated by the change to `len(queries)` in a couple of the tests. The one subtle thing that changes here is huddles. If user 10 sends a typing notification that they are talking to users 20 and 30, there might not actually be a huddle for users 10/20/30, but we were actually creating huddles on the fly! There is no need to create huddles just for typing notifications, since we don't even share huddle ids with our clients. The clients just infer the huddles. Some of the code that gets killed off here as somewhat "collateral damage" is some defensive code related to formerly supporting streams in typing indicators. The support for streams was killed off almost as soon as we released the feature, and the codepath is pretty clearly user-centric at this point.	2020-02-28 12:46:20 -08:00
Steve Howell	f224f215c1	refactor: Simplify handling of emails for typing endpoint. Instead of duplicating code for the email case, just convert emails to user_ids and then run the same code.	2020-02-28 12:39:36 -08:00
Steve Howell	bed6d5a789	typing: Inline check_typing_notification. I actually like this pattern: def check_send_typing_notification(...): typing_notification = check_typing_notification(...) do_send_typing_notification(...) It can help divide responsibilities nicely and make it easy to write detailed unit tests against each of the two helpers. Unfortunately, the good things didn't really happen here, and instead we got the worst aspects of the pattern: - The responsibilities for validation leaked into the second function. - Both functions were doing sane things individually that became not-so-sane in the big picture (namely, we ended up making Recipient objects for no reason, but if you read each of the helpers, it was just one step that seemed reasonable). - Passing around dictionaries for results can be annoying. Also, the pattern made a lot more sense when the validation for typing was a lot more complicated. My prior commit makes it so that we only ever deal with a list of user_ids. Anyway, now I'm inlining it. :) Subsequent commits will clean up the more substantive issue here, which is that we are building Recipients for no reason.	2020-02-28 12:39:36 -08:00
Mateusz Mandera	7db3d4560f	do_delete_messages: Archive the messages in bulk. The test added in this commit shows 37 queries - compared to 181 without the change to the function. That seems very much worth it.	2020-02-27 23:12:32 -08:00
Mateusz Mandera	b4186fb680	do_delete_messages: Remove unused message_ids list.	2020-02-27 23:12:32 -08:00
Wyatt Hoodes	6ed944c761	test_runner: Update database ids to be human readable. Before the Django 2.x upgrade, the DatabaseCreation argument took an integer value. To deal with running multiple test instances, we created a random start range that could count up 100 workers until the next random id, arbitrarily limiting the number of workers to 100. Post upgrade, we can now use string values, which makes the database + worker numbers more readable and removes the cap on the worker count.	2020-02-27 23:01:29 -08:00
Tim Abbott	2fb967b735	do_update_message: Remove sender field from update_message events. This field wasn't accessed by any clients and was a less robust version of the user_id field. Any client hoping to be interested in who did message edits should be able to handle working with user IDs rather than email addresses.	2020-02-26 16:16:01 -08:00
Tim Abbott	588bcb37cf	do_update_message: Avoid using a direct query to fetch a Stream. We have a helper designed for the purpose, and it fixes potential misbehavior where the previous code did not do `.select_related()`.	2020-02-26 16:14:34 -08:00
Tim Abbott	49ca7cf717	topic: Add recipient_id to fields for message edit saves. This is preparation for supporting moving messages between streams in some cases. It doesn't actually have any functional effect, since flush_message clears the message unconditionally anyway.	2020-02-26 16:12:07 -08:00
Steve Howell	995353fb28	message validation: Clean up extract_private_recipients. This is mostly refactoring, but we also prevent a new type of value error (list of non-int-or-string). The new test code helps enforce that. Cleanup includes: - Use early-exit for email case. - Rename helpers to get_validate_*. - Avoid clumsy rebuilding of lists in helpers. - Avoid the confusing `recipient` name (which can be confused with the model by the same name). - Just delegate duplicate-id/email-removal to the helpers. The cleaner structure allows us to eliminate a couple of mypy workarounds.	2020-02-25 16:17:47 -08:00
Vishnu KS	303cd9bb9e	actions: Make do_change_plan_type support changing plan to SELF_HOSTED. Credits to @xpac1985 for reporting, debugging and proposing fix to the issue. The proposed fix was modified slightly by @hackerkid to set the correct value for max_invites and upload_quota_gb. Tests added by @hackerkid. Fixes #13974	2020-02-25 16:14:45 -08:00
Tim Abbott	27edc18330	test_classes: Use realistic web and mobile User-Agent strings. This fixes a confusing aspect of how our automated tests worked previously, where we'd run almost all HTTP requests in the unlikely configuration with no User-Agent string specified. We need to adjust query counts in a few tests that now are a bit cheaper because they now can take advantage of a Client object created in server_initialization.py in `process_client`.	2020-02-24 23:19:43 -08:00
Tim Abbott	27b267026e	test_classes: Rename set_http_host to set_http_headers. This supports the goal of setting other headers like User-Agent in the future.	2020-02-24 23:19:43 -08:00
Tim Abbott	d80175d29e	server_initialization: Create Client objects for mobile/desktop. This replaces the "API" client, which isn't used by any real clients, with the "ZulipMobile" and "ZulipElectron" client strings, which are.	2020-02-24 23:19:43 -08:00
harshavardhanpb	cac4feb263	openapi: Move openapi.py into zerver/openapi.py. Fixes #14006	2020-02-24 12:21:26 -05:00
Steve Howell	ed859617e4	minor: Add test for extract_stream_indicator.	2020-02-24 07:40:31 -05:00
Mateusz Mandera	a9794ec001	cache: Delete unused function cache().	2020-02-21 09:05:46 -08:00
Mateusz Mandera	c78d0712f7	tests: For ldap tests, give each ldap user a unique password. To avoid some hidden bugs in tests caused by every ldap user having the same password, we give each user a different password, generated based on their uids (to avoid some ugly hard-coding in a bunch of places).	2020-02-19 14:46:29 -08:00
Vishnu KS	51f5701879	export: Canonicalize the email of cross realm bot to default value. Fixes #13496	2020-02-19 14:44:50 -08:00
Vishnu KS	e1a7716578	emails: Translate from_name of account security emails.	2020-02-18 17:45:33 -08:00
Tim Abbott	0075c6cd56	do_update_message: Clean up timestamp code. By moving this logic to the top of the function, we make the code a lot more readable.	2020-02-18 16:38:34 -08:00
Mateusz Mandera	6a0b68bc7f	models: Delete get_stream_recipient function and its uses. With recipient being now a Stream field, there's no more use for this helper function.	2020-02-18 10:49:14 -08:00
Mateusz Mandera	0d6f78b381	models: Delete get_personal_recipient function and its uses. With recipient being now a UserProfile field, there's no more use for this helper function.	2020-02-18 10:49:14 -08:00
Mateusz Mandera	920d22524b	import: Use re_map_foreign_keys on the realm column of UserPresence. We forgot to make this adjustment in the recent denormalization of realm into UserPresence. It's needed for imports to work correctly.	2020-02-18 10:45:38 -08:00
Anders Kaseorg	b2ec8e157b	has_request_variables: Remove query_params dict. ‘req_var in request.GET’ was previously believed to be slow from profiling results. However, the real explanation for those profiling results is that WSGIRequest.GET is a lazy cached property, so there’s no reason to avoid it if we’re accessing request.GET anyway. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-15 11:37:18 -08:00
Chris Heald	18e3982acd	integrations: Add AlertManager webhook.	2020-02-14 17:43:15 -08:00
Mateusz Mandera	cbdfef28a8	retention: Update to account for the zulipinternal realm. In https://github.com/zulip/zulip/pull/12823 some changes to the realms structure have been made, so now both in production and development cross-realm bots live in the realm with string_id "zulipinternal". There was a TODO in retention code to eliminate a conditional in a query that became redundant with this change, and also the zulipinternal realm should be omitted from the archiving process in archive_messages().	2020-02-14 17:15:26 -08:00
Tim Abbott	10e7e15088	user_agent: Compile the regular expression. We use this single regular expression for processing essentially every request, so it's definitely worth hinting to Python that we're going to do so by compiling it. Saves about 40us per request.	2020-02-14 10:26:37 -08:00
Tim Abbott	800312c976	has_request_variables: Fix slow extraction of parameters. A sloppy implementation of the main has_request_variables wrapper function meant that it did two very inefficient things: * To combine together the GET and POST parameters, it would make a copy of the request.GET QueryDict object, which combined with the fact that these objects are slow to access, consumed about 90us per argument. * Doing this in a loop (one time per argument), rather than once, which resulted in us doing this 11 times for a `GET /events` query. Fixing this to just make a dictionary and combine things with some small loops saved about 1 millisecond from the total runtime of GET /events (for comparison, the total actual work of that view function is about 700ms). We need to fix at least one test that used a bad mock HttpRequest object that didn't have a .GET property.	2020-02-14 09:45:26 -08:00
Tim Abbott	4fbcbeeea7	settings: Disable django.request logging at WARNING log level. The comment explains this issue, but effectively, the upgrade to Django 2.x means that Django's built-in django.request logger was writing to our errors logs WARNING-level data for every 404 and 400 error. We don't consider user errors to be a problem worth highlighting in that log file.	2020-02-13 23:50:53 -08:00
rht	41e3db81be	dependencies: Upgrade to Django 2.2.10. Django 2.2.x is the next LTS release after Django 1.11.x; I expect we'll be on it for a while, as Django 3.x won't have an LTS release series out for a while. Because of upstream API changes in Django, this commit includes several changes beyond requirements and: * urls: django.urls.resolvers.RegexURLPattern has been replaced by django.urls.resolvers.URLPattern; affects OpenAPI code and related features which re-parse Django's internals. https://code.djangoproject.com/ticket/28593 * test_runner: Change number to suffix. Django changed the name in this ticket: https://code.djangoproject.com/ticket/28578 * Delete now-unnecessary SameSite cookie code (it's now the default). * forms: urlsafe_base64_encode returns string in Django 2.2. https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode * upload: Django's File.size property replaces _get_size(). https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/ * process_queue: Migrate to new autoreload API. * test_messages: Add an extra query caused by .refresh_from_db() losing the .select_related() on the Realm object. * session: Sync SessionHostDomainMiddleware with Django 2.2. There's a lot more we can do to take advantage of the new release; this is tracked in #11341. Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera are squashed into this commit. Fixes #10835.	2020-02-13 16:27:26 -08:00
Tim Abbott	1ea2f188ce	tornado: Rewrite Django integration to duplicate less code. Since essentially the first use of Tornado in Zulip, we've been maintaining our Tornado+Django system, AsyncDjangoHandler, with several hundred lines of Django code copied into it. The goal for that code was simple: We wanted a way to use our Django middleware (for code sharing reasons) inside a Tornado process (since we wanted to use Tornado for our async events system). As part of the Django 2.2.x upgrade, I looked at upgrading this implementation to be based off modern Django, and it's definitely possible to do that: * Continue forking load_middleware to save response middleware. * Continue manually running the Django response middleware. * Continue working out a hack involving copying all of _get_response to change a couple lines allowing our Tornado code to not actually return the Django HttpResponse so we can long-poll. The previous hack of returning None stopped being viable with the Django 2.2 MiddlewareMixin.__call__ implementation. But I decided to take this opportunity to look at trying to avoid copying material Django code, and there is a way to do it: * Replace RespondAsynchronously with a response.asynchronous attribute on the HttpResponse; this allows Django to run its normal plumbing happily in a way that should be stable over time, and then we proceed to discard the response inside the Tornado `get()` method to implement long-polling. (Better yet might be raising an exception?). This lets us eliminate maintaining a patched copy of _get_response. * Removing the @asynchronous decorator, which didn't add anything now that we only have one API endpoint backend (with two frontend call points) that could call into this. Combined with the last bullet, this lets us remove a significant hack from our never_cache_responses function. * Calling the normal Django `get_response` method from zulip_finish after creating a duplicate request to process, rather than writing totally custom code to do that. This lets us eliminate maintaining a patched copy of Django's load_middleware. * Adding detailed comments explaining how this is supposed to work, what problems we encounter, and how we solve various problems, which is critical to being able to modify this code in the future. A key advantage of these changes is that the exact same code should work on Django 1.11, Django 2.2, and Django 3.x, because we're no longer copying large blocks of core Django code and thus should be much less vulnerable to refactors. There may be a modest performance downside, in that we now run both request and response middleware twice when longpolling (once for the request we discard). We may be able to avoid the expensive part of it, Zulip's own request/response middleware, with a bit of additional custom code to save work for requests where we're planning to discard the response. Profiling will be important to understanding what's worth doing here.	2020-02-13 16:13:11 -08:00
Mateusz Mandera	27b15a9722	install: Don't create internal realm in the installation process.	2020-02-12 12:00:10 -08:00
Mateusz Mandera	fe33966642	sessions: Implement the concept of expirable session variables. This can be useful in the future for various things, and right now it'll specifically be used in the signup mobile/desktop flows.	2020-02-12 11:09:55 -08:00
Hashir Sarwar	eb23c6fa6c	test_fixtures: Clean up interface for `template_database_status()`. 1) Created a new class `DatabaseType` and access its objects inside `template_database_status()` instead of sending five arguments with default values. 2) Made `check_files` and `setting_name` local variables instead of function parameters since they had the same value (None) for every call. Fixes #13845.	2020-02-12 11:07:10 -08:00
Tim Abbott	96b0ec705d	email_notifications: Fix missing translation tags on sender.	2020-02-12 10:54:34 -08:00
Anders Kaseorg	e257253e64	emoji_codes: Replace JS module with JSON module. webpack optimizes JSON modules using JSON.parse("{…}"), which is faster than the normal JavaScript parser. Update the backend to use emoji_codes.json too instead of the three separate JSON files. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-12 10:09:12 -08:00
Tim Abbott	cb2c96f736	test_templates: Remove shallow template rendering code. This code was very useful when first implemented to help catch errors where our backend templates didn't render, but has been superseded by the success of our URL coverage testing (which ensures every URL supported by Zulip's urls.py is accessed by our tests, with a few exceptions) and other tests covering all of the emails Zulip sends. It has a significant maintenance cost because it's a bit hacky and involves generating fake context, so it makes sense to remove these. Any future coverage issues with templates should be addressed with a direct test that just accesses the relevant URL or sends the relevant email.	2020-02-11 18:00:15 -08:00
Mateusz Mandera	2475adbf8a	messages_for_topic: Use stream.recipient_id for more efficient query.	2020-02-11 17:39:43 -08:00
Steve Howell	900f98c0c5	presence: Use realm_id for UserPresence queries. We now use realm_id for querying UserPresence instead of building a big WHERE clause from the list of user_ids. This commit may be a bit hard to measure, since we still get the list of user_ids for the PushToken query in the same method.	2020-02-11 13:11:58 -08:00
Tim Abbott	fcac3a4342	recipients: Rename extract_recipients to extract_private_recipients. Recent changes mean this function is now only used for private messages.	2020-02-11 12:28:14 -08:00
Steve Howell	1b6578cafd	messages: Fix bug with commas in stream names. We now validate streams with a separate function from PM recipients. It's confusing enough all the ways you can encode a stream or encode the PM recipients, but trying to do it all in one function was hard to reason about and led to at least one bug. In particular, there was a bug where streams with commas in them would get split. Now we just don't ever split on commas inside of `extract_stream_indicator`. Fixes #13836	2020-02-11 12:20:54 -08:00
Steve Howell	96132fe0e9	extract_recipients: Enforce str as incoming type. After removing internal_send_message() in a recent commit, we now have only two callers for extract_recipients, and they are both related to our REQ mechanism that always passes strings to converters. (If there are default values, REQ does not call the converters.) We therefore make two changes: - use the more strict annotation of "str" for the `s` parameter - don't bother with the isinstance check	2020-02-11 12:20:54 -08:00
Steve Howell	8c3eaeb872	Remove obsolete internal_send_messages(). We have been phasing this out for a couple years, and I fixed the last stragglers over the last couple days.	2020-02-11 12:20:54 -08:00
Steve Howell	e37d660d19	error_notify: Use internal_send_stream_message().	2020-02-11 12:20:53 -08:00
Steve Howell	c4e3cfebb0	presence: Add realm_id to UserPresence. This index is intended to optimize the performance of the very frequently run query of "what is the presence status of all users in a realm?". Main changes: - add realm_id to UserPresence - add index for realm_id - backfill realm_id for old rows - change all writes to UserPresence to include realm_id The index is of this form: "zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id) We will create an index on (realm_id, timestamp) in a future commit, but I think it's a bit faster if you do the backfill before the index. There's also a minor tweak to the populate_db script.	2020-02-10 17:21:45 -08:00
Steve Howell	28a8ffbc4c	email_mirror: Use internal_send_stream_message(). This is just a refactoring to the more modern API for sending internal messages. To make this work we now plumb the email_gateway flag through `internal_send_stream_message` instead of `internal_send_message`. We also change `send_zulip` to have its callers pass in a full UserProfile object (which one of them already had).	2020-02-10 15:45:13 -08:00
Steve Howell	6922eef380	signups: Use internal_send_stream_message(). We prefer this to internal_send_message(). We are trying to deprecate `internal_send_message`, which has extra moving parts related to `extract_recipients` and `Addressee.legacy_build`. There are two chunks of code that I touch here that look pretty similar, but I'm not quite sure they're worth de-duplicating, since they use different topics and different message content.	2020-02-10 15:45:13 -08:00
Steve Howell	b33552997e	cross realm bots: Simplify notify_new_user. Instead of having `notify_new_user` delegate all the heavy lifting to `send_signup_message`, we just rename `send_signup_message` to be `notify_new_user` and remove the one-line wrapper. We remove a lot of obsolete complexity: - `internal` was no longer ever set to True by real code, so we kill it off, as well as killing off the internal_blurb code and the now-obsolete test - the `sender` parameter was actually an email, not a UserProfile, but I think that got past mypy due to the caller passing in something from settings.py - we were only passing in NOTIFICATION_BOT for the sender, so we just hard code that now - we eliminate the verbose `admin_realm_signup_notifications_stream` parameter and just hard code it to "signups" - we weren't using the optional realm parameter There's also a long ugly comment in `get_recipient_info` related to this code that I amended for now. We should try to take action in a subsequent commit.	2020-02-10 15:45:13 -08:00
Hashir Sarwar	dcbd3e486f	stream_subscription: Remove unused TypedDict `SubInfo`.	2020-02-10 14:04:22 -08:00
Steve Howell	2ff41bf9e5	/json/users: Use field.realm for realm lookup. This avoids an unnecessary join to UserProfile. To verify this, you can do `print(queries)` in the `test_get_custom_profile_fields_from_api` test. It's kinda noisy, so I excerpted them below... Before: SELECT ... FROM "zerver_customprofilefieldvalue" INNER JOIN "zerver_userprofile" ON ("zerver_customprofilefieldvalue"."user_profile_id" = "zerver_userprofile"."id") INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id") WHERE "zerver_userprofile"."realm_id" = 2 After: SELECT ... FROM "zerver_customprofilefieldvalue" INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id") WHERE "zerver_customprofilefield"."realm_id" = 2 I don't have any way to measure the two queries with realistic data, but I would assume the second query is significantly faster on most of our instances, since CustomProfileField should be tiny.	2020-02-09 22:04:02 -08:00
Steve Howell	01f180d042	minor: Remove unused line of code in get_raw_user_data(). The line removed here is a noop, as both sides of the immediately following conditional reassign the same variable. This harmless cruft was the result of the recent commit `1ae5964ab8`, which added support for single-user GETs.	2020-02-09 22:04:02 -08:00
Vishnu KS	4572be8c27	api: Rename subject_links to topic_links. Fixes #13588	2020-02-07 14:35:22 -08:00
Tim Abbott	84edb5c516	test_fixtures: Fix buggy reuse of status_dir between databases. Apparently, the arguments passed to template_database_status were incorrect for the manual testing development database, in that we didn't pass a status_dir when calling into that code from provision. The result was that provisioning before running `test-backend` would ignore changes to the list of check_files (etc.) made after rebasing, and vice versa. The cleanest fix is to compute status_dir from other values passed in; I'm also going to open a follow-up issue for creating a better overall interface here.	2020-02-07 13:33:08 -08:00
akashaviator	1ae5964ab8	api: Add an api endpoint for GET /users/{id} This adds a new API endpoint for querying basic data on a single other user in the organization, reusing the existing infrastructure (and view function!) for getting data on all users in an organization. Fixes #12277.	2020-02-07 10:36:31 -08:00
Tim Abbott	e39840c705	users: Add read-only mode for access_user_by_id. We'll be using this in the upcoming GET /users/{id} method.	2020-02-07 10:36:31 -08:00
Tim Abbott	aa9286a1f9	users: Move query into caller of get_custom_profile_field_values. This will be useful for supporting a smaller query for a single user.	2020-02-07 10:36:31 -08:00
Tim Abbott	79e5dd1374	users: Rename get_raw_user_data user parameter to acting_user. This is for improved clarity as we extend this function to take multiple user objects.	2020-02-07 10:36:31 -08:00
Steve Howell	7e99e7feb2	presence: Extract get_legacy_user_info. This code is a bit flatter and just preps the data for a single user. There is never any interaction between the data for user A and user B, so we can mostly avoid complicated nested data structures and do most of the data-crunching on a per-user basis. We also do an explicit sort of the data before running it through groupby. The explicit sort simplifies how we calculate `most_recent_info` and also avoids needing to add `dt` to an intermediate data structure. Finally, when it comes to the individual client data, the code has relied on the assumption that there is only one row per client, which I believe to be true, but now the code is more explicit about that.	2020-02-06 17:16:22 -08:00
Steve Howell	bf3baa14ac	presence: Rename get_status_dict_by_user().	2020-02-06 17:16:22 -08:00
Steve Howell	675f8514e8	presence: Rename get_status_dict(). We renamed this to get_presences_for_realm(), and we have the caller pass in realm, not user_profile.	2020-02-06 17:16:22 -08:00
Steve Howell	363e6bf239	presence: Move get_status_dicts_for_rows().	2020-02-06 17:16:22 -08:00
Steve Howell	36fba1076f	presence: Move get_status_dict_by_user.	2020-02-06 17:16:22 -08:00
Steve Howell	6f027d84a9	presence: Move get_status_dict_by_realm.	2020-02-06 17:16:22 -08:00
Steve Howell	703338dfa3	presence: Extract lib/presence.py. This will make more sense when we pull some code out of the model.	2020-02-06 17:16:22 -08:00
Tim Abbott	7c0a98754a	home: Refactor logic for show_invites and show_add_streams.	2020-02-05 16:05:02 -08:00
Tim Abbott	7032f49f8e	exceptions: Move default json_unauthorized string to response.py. This small refactor should make it easier to reuse this exception for other situations as well.	2020-02-05 15:40:10 -08:00
Anders Kaseorg	8e5a45267d	test_classes: Use a valid (but reserved as fictional) phone number. django-phonenumber-field 2.4.0 adds tighter phone number validation that rejects +12223334444 for having an invalid area code. This was reverted in 4.0.0, but django-two-factor-auth still requires <3.99. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-02-05 12:38:10 -08:00
Ryan Rehman	174b2abcfd	settings: Migrate to stream_post_policy structure. This commit includes a new `stream_post_policy` setting, by replacing the `is_announcement_only` field from the Stream model, which is done by mirroring the structure of the existing `create_stream_policy`. It includes the necessary schema and database migrations to migrate the is_announcement_only boolean field to stream_post_policy, a smallPositiveInteger field similar to many other settings. This change is done to allow organization administrators to restrict new members from creating and posting to a stream. However, this does not affect admins who are new members. With many tweaks by tabbott to documentation under /help, etc. Fixes #13616.	2020-02-04 17:08:08 -08:00
Mateusz Mandera	30d02c2e2c	test_fixtures: app_label should be a positional arg in call_command. We were incorrectly passing it as a kwarg, which would cause an exception on Django 2.	2020-02-04 12:46:53 -08:00
Mateusz Mandera	0e7c97378e	is_safe_url: Use allowed_hosts instead of deprecated host argument. Judging by comparing django 1.11 with django 2.2 code of this function, this shouldn't change any behavior.	2020-02-04 12:46:53 -08:00
Steve Howell	e3ad9baf1d	presence: Add process_presence_event. This lets us conditionally remove the email field from a presence event if the client has registered with the slim_presence flag.	2020-02-04 12:30:36 -08:00
Steve Howell	9847d4d9a3	refactor: Use user_id in get_status_dict_by_user. This avoids a needless user lookup in apply_event.	2020-02-04 12:30:36 -08:00
Steve Howell	a672a00677	presence: Add user_id to presence event. In a later commit, we will eliminate email for clients who have set slim_presence as their preference.	2020-02-04 12:30:36 -08:00
Steve Howell	bf9144ff69	presence: Add slim_presence flag. This flag affects page_params and the payload you get back from POSTs to this url: users/me/presence The flag does not yet affect the presence events that get sent to a client.	2020-02-04 12:30:34 -08:00
Vishnu Ks	5dfd4ea38d	export: Remove unused parameter from _get_exported_s3_record.	2020-02-03 14:09:05 -08:00
Vishnu Ks	5a59bf329e	import: Skip setting user_profile_id metadata only if unavailable.	2020-02-03 14:09:05 -08:00
Vishnu Ks	2ea53a347a	import: Support importing realm icon and logo. Fixes #11216	2020-02-03 14:09:05 -08:00
Vishnu Ks	af3a37b58b	upload: Refactor out realm_avatar_and_logo_path function.	2020-02-03 14:09:05 -08:00
Tim Abbott	df110e8ff9	test_fixtures: Note populate_db depends on server_initialization.py. This should ensure that folks rebasing past this commit from an older database model get their database rebuilt in the way that will match the test_subs.py query count of 40.	2020-02-03 10:38:04 -08:00
Ryan Rehman	3dc7d60ffe	muting: Record DateTime when a Topic is muted. This includes the necessary migration to add the date_muted field to the MutedTopic class and populates it with a hard coded value.	2020-02-02 20:49:53 -08:00
Mateusz Mandera	bf89cf2b4b	rate_limiter: Use ABC for defining the abstract class RateLimitedObject.	2020-02-02 19:15:13 -08:00
Mateusz Mandera	cb71a6571e	rate_limiter: Rename 'all' domain to 'api_by_user'.	2020-02-02 19:15:13 -08:00
Mateusz Mandera	06198af5b9	auth: Handle rate limiting in OurAuthenticationForm and user_settings. These parts of the code should catch the RateLimited exception and generate their own, appropriate user-facing error message.	2020-02-02 19:15:13 -08:00
Mateusz Mandera	335b804510	exceptions: RateLimited shouldn't inherit from PermissionDenied. We will want to raise RateLimited in authenticate() in rate limiting code - Django's authenticate() mechanism catches PermissionDenied, which we don't want for RateLimited. We want RateLimited to propagate to our code that called the authenticate() function.	2020-02-02 19:15:00 -08:00
Mateusz Mandera	a6a2d70320	rate_limiter: Handle multiple types of rate limiting in middleware. As more types of rate limiting of requests are added, one request may end up having various limits applied to it - and the middleware needs to be able to handle that. We implement that through a set_response_headers function, which sets the X-RateLimit-* headers in a sensible way based on all the limits that were applied to the request.	2020-02-02 19:15:00 -08:00
Mateusz Mandera	4cc5d2464c	rate_limiter: Expand support for different domains.	2020-02-02 19:15:00 -08:00
Tim Abbott	51706bdc3a	stream: Deduplicate lists of stream/subscriptions fields. While the result of this change doesn't completely do what we need, it does remove a huge amount of duplicated lists of fields. With a bit more similar work, we should be able to eliminate a broad category of potential bugs involving Stream and Subscription objects being represented inconsistently in the API. Work towards #13787.	2020-02-02 18:34:45 -08:00
Tim Abbott	238bc386cb	actions: Deduplicate parts of get_web_public_subs. This has the side effect of making new fields we add to Stream be automatically included, which will help maintain this code as we upgrade it. This commit adds is_web_public, history_public_to_subscribers, and email_notifications fields to the dictionary.	2020-02-02 17:42:12 -08:00
Tim Abbott	eac07698dd	users: Add nocoverage tag for settings.SYSTEM_BOT_REALM conditional. This is code for safety that should never happen and is likely annoying to setup an automated test to verify.	2020-01-31 14:51:12 -08:00
Tim Abbott	5825a155cc	users: Use format_user_row in events system as well. This completes the deduplication of our logic for turning users into dictionaries in the Zulip API.	2020-01-31 14:47:16 -08:00
akashaviator	20b8b29d11	users: Rewrite get_cross_realm_dicts to call format_user_row. This modifies get_cross_realm_dicts in zerver.lib.users to call format_user_row. This is done to remove current and prevent future inconsistencies between the dictionary formats for get_raw_user_data and get_cross_realm_dicts. Implementation substantially rewritten by tabbott. Fixes #13638.	2020-01-31 14:28:46 -08:00
akashaviator	7d06293ac0	refactor: Cleanup actions.py and events.py in zerver/lib. This moves get_cross_realm_dicts (from zerver.lib.actions), get_raw_user_data and get_custom_profile_field_values (from zerver.lib.events) to zerver.lib.users.	2020-01-31 13:53:47 -08:00
Vishnu KS	bd460af099	emails: Remove unnecessary call to message_content_allowed_in_missedmessage_emails.	2020-01-31 12:29:58 -08:00
Vishnu KS	47e442e4a4	emails: Show proper message when email content is not shown.	2020-01-31 12:29:58 -08:00
akashaviator	bd58e3397f	events: Extract user_data function from get_raw_user_data. This extracts the user_data inner function from get_raw_user_data as a reusable function. We intend to reuse it for cross-realm user dicts. A few changes were made while extracting it: * Renaming the UserProfile argument to acting_user, so we can do loops over a local user_profile variable. * Moved it to zerver.lib.users, as that's a more appropriate home for this function formatting data on users. * Simplified the calling convention for passing custom profile fields to reflect the fact that this function processes a single user (and is expected to be called in a loop).	2020-01-30 13:32:35 -08:00
Mateusz Mandera	8bd3752d13	email_mirror: Handle encoded attachment filenames.	2020-01-30 13:03:47 -08:00
Mateusz Mandera	49b76318c6	email_mirror: Extract handle_header_content function.	2020-01-30 13:03:47 -08:00
Tim Abbott	dd969b5339	install: Remove references to "Zulip Voyager". "Zulip Voyager" was a name invented during the Hack Week to open source Zulip for what a single-system Zulip server might be called, as a Star Trek pun on the code it was based on, "Zulip Enterprise". At the time, we just needed a name quickly, but it was never a good name, just a placeholder. This removes that placeholder name from much of the codebase. A bit more work will be required to transition the `zulip::voyager` Puppet class, as that has some migration work involved.	2020-01-30 12:40:41 -08:00
Mateusz Mandera	d68cf21952	server_initialization: Add server_initialized function.	2020-01-30 12:21:31 -08:00
Mateusz Mandera	682dea1b34	test_classes: Fix bug where UserProfile could be passed to client_post. It would cause JSON overflow error while producing URL coverage report.	2020-01-30 12:13:54 -08:00
Mateusz Mandera	68abddb534	server_initialization: Rename some variables. This makes the code of create_internal_realm identical to the corresponding block in initialize_voyager_db.py.	2020-01-29 17:26:45 -08:00
Mateusz Mandera	39b012a276	server_initialization: Set internal bots owners to themselves.	2020-01-29 17:26:45 -08:00
Mateusz Mandera	9c20611a65	server_initialization: Remove unnecessary type annotation.	2020-01-29 17:26:45 -08:00
Mateusz Mandera	d24936cbe3	server_initialization: Use tos_version argument in create_users.	2020-01-29 17:26:45 -08:00
Mateusz Mandera	261da5999d	populate_db: Extract default client creation to server_initialization.	2020-01-29 17:26:45 -08:00
Mateusz Mandera	a25f00a69c	populate_db: Extract some functions to server_initialization.py.	2020-01-29 17:26:45 -08:00
Mateusz Mandera	9dcf677bf9	email_mirror: Parse encoded From headers with show_sender=True.	2020-01-29 12:27:35 -08:00
Tim Abbott	05108760f6	narrow: Add support for passing oldest/newest for anchor. A wart that has long been present in Zulip's get_messages API is how to request "the latest messages" in the API. Previously, the recommendation was basically to pass anchor=10000000000000000 (for an appropriately huge number). An accident of the server's implementation meant that specific number of 0s was actually important to avoid a buggy (or at least wasteful) value of found_newest=False if the query had specified num_after=0 (since we didn't check). This was the cause of the mobile issue https://github.com/zulip/zulip-mobile/issues/3654. The solution is to allow passing a special value of anchor='newest', basically a special string-type value that the server can interpret as meaning the user precisely just wants the most recent messages. We also add an analogous anchor='oldest' or similar to avoid folks needing to write a somewhat ugly anchor=0 for fetching the very first messages. We may want to also replace the use_first_unread_anchor argument to be a "first_unread" value for the anchor parameter. While it's not always ideal to make a value have a variable type like this, in this case it seems like a really clean way to express the idea of what the user is asking for in the API.	2020-01-29 12:14:06 -08:00
Tim Abbott	c0712431df	openapi: Add hacky support for oneOf parameter types. This is required for the upcoming type behavior of the "anchor" parameter. This change is the minimal work required to have our OpenAPI code not fail when checking a union-type value of this form. We'll likely want to, in the future, do something nicer, but it'd require more extensive infrastructure for parsing of OpenAPI data than it's worth with our current approach (we may want to switch to using a library).	2020-01-29 11:24:58 -08:00
Tim Abbott	91f1825474	test_helpers: Fix POSTRequestMock typing. The proximal issue here is that in upcoming commits, we're going to change the type of the `anchor` field in `get_messages_backend` to support passing either an integer or a string. Many of our tests using POSTRequestMock currently define a query object that uses integer values for the integer fields we're going to pass into it, e.g. {'num_after': 0}. That is the correct type for that field in the Zulip API, before HTTP encoding turns it into a string. However, because POSTRequestMock didn't use HTTP encoding at all (which will convert the 0 into a '0'), it ended up passing an integer to a function that can't possibly receive one as an argument. Ideally, we'd just get rid of POSTRequestMock, since it's a hack, and just do real HTTP requests instead. But since it's used in a lot of places making doing so somewhat impractical, we can get past this issue by just making POSTRequestMock convert integers to strings.	2020-01-29 11:24:58 -08:00
Tim Abbott	8f50062e49	soft_deactivation: Fix incorrect logging function. Using logging.info() rather than logger.info() meant that our zulip.soft_deactivation logger configuration (which, in particular, included not logging to the console) was not active on this log line, resulting in the `manage.py soft_deactivate_users` cron job sending emails every time it ran. Fixes #13750.	2020-01-28 17:17:43 -08:00
Rohitt Vashishtha	630c564fc7	bugdown: Rewrite List Preprocessor logic to properly parse fences. Previously, we didn't track opening and closing fences separately, which led to bugs like not parsing a list that was immediately after a quoted fence; we treated each ``` as a new fence. This commit rewrites the function to maintain a stack of currently open fences. If any of the parent fences is a code fence, we do not insert a new line before a list. We also add some test cases specifically to test this behavior with complexly nested lists. Fixes #13745.	2020-01-27 17:14:27 -08:00
Mateusz Mandera	92c16996fc	redis_utils: Require key_format argument in get_dict_from_redis.	2020-01-26 21:40:15 -08:00
Mateusz Mandera	ad460e6ccb	redis_utils: Validate requested key length in helper functions.	2020-01-26 21:40:15 -08:00
Mateusz Mandera	8d987ba5ae	auth: Use tokens, with data stored in redis, for log_into_subdomain. The desktop otp flow (to be added in next commits) will want to generate one-time tokens for the app that will allow it to obtain an authenticated session. log_into_subdomain will be the endpoint to pass the one-time token to. Currently it uses signed data as its input "tokens", which is not compatible with the otp flow, which requires simpler (and fixed-length) token. Thus the correct scheme to use is to store the authenticated data in redis and return a token tied to the data, which should be passed to the log_into_subdomain endpoint. In this commit, we replace the "pass signed data around" scheme with the redis scheme, because there's no point having both.	2020-01-26 21:32:44 -08:00
Abhishek-Balaji	434e8d3104	home: Extract compute_show_invites_and_add_streams. This extracts a function for computing show_invites and show_add_streams, for better readability and testability. This commit was substantially cleaned up by tabbott.	2020-01-25 23:41:08 -08:00
Tim Abbott	d70e799466	bots: Remove FEEDBACK_BOT implementation. This legacy cross-realm bot hasn't been used in several years, as far as I know. If we wanted to re-introduce it, I'd want to implement it as an embedded bot using those common APIs, rather than the totally custom hacky code used for it that involves unnecessary queue workers and similar details. Fixes #13533.	2020-01-25 22:41:39 -08:00
Mateusz Mandera	af2c4a9735	redis: Extract put_dict_in_redis and get_dict_from_redis helpers.	2020-01-23 16:24:07 -08:00
Jonathan Cobb	c7433c83ff	integrations: Add errbit integration. Fixes #13685.	2020-01-16 15:33:51 -08:00
Mateusz Mandera	d37e6ef921	email_mirror: Use plaintext if html body empty with prefer-html option. If an email is sent with the .prefer-html option, but it has no html body, it's better to fall back to plaintext content instead of treating it as a user error.	2020-01-16 15:25:27 -08:00
Mateusz Mandera	0c9c218e91	email_mirror: Add prefer-html and prefer-text address options. Closes #13484. These options tell zulip whether to prefer the plaintext or html version of the email message. prefer-text is the default behavior, so including the option doesn't change anything as of now, but we're adding it to prepare to potentially change the default behavior in the future.	2020-01-16 15:25:19 -08:00
Mateusz Mandera	170e0ac2dd	email_mirror: More abstract option system. As we add more address options, which will have different behavior than simply setting option_name=True, we need to migrate this subsystem to something that better supports more complex logic and will allow encapsulating it, instead of needing to be put all over the decode_email_address function.	2020-01-16 15:16:04 -08:00
Tim Abbott	eb8b3539ad	test_classes: Remove DEFAULT_REALM variable. This essentially unused legacy variable was causing Zulip to query the database at import time, which is generally not something we aim to do. Combined with the issue fixed in the previous commit, this variable resulted in test-backend providing an unhelpful crash when provision hadn't updated the unit testing database.	2020-01-16 13:13:46 -08:00
Tim Abbott	8ff5d8ca89	test_classes: Clean up API_KEYS cache. Since the intent of our testing code was clearly to clear this cache for every test, there's no reason for it to be a module-level global. This allows us to remove an unnecessary import from test_runner.py, which in combination with DEFAULT_REALM's definition was causing us to run models code before running migrations inside test-backend. (That bug, in turn, caused test-backend's check for whether migrations need to be run to happen sadly after trying to access a Realm, triggering a test-backend crash if the Realm model had changed since the last provision).	2020-01-16 13:07:26 -08:00
Anders Kaseorg	319e2231b8	thumbnail: Tighten fix for CVE-2019-19775 open redirect. Due to a known but unfixed bug in the Python standard library’s urllib.parse module (CVE-2015-2104), a crafted URL could bypass the validation in the previous patch and still achieve an open redirect. https://bugs.python.org/issue23505 Switch to using django.utils.http.is_safe_url, which already contains a workaround for this bug. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-16 12:36:24 -08:00
Anders Kaseorg	ea6934c26d	dependencies: Remove WebSockets system for sending messages. Zulip has had a small use of WebSockets (specifically, for the code path of sending messages, via the webapp only) since ~2013. We originally added this use of WebSockets in the hope that the latency benefits of doing so would allow us to avoid implementing a markdown local echo; they were not. Further, HTTP/2 may have eliminated the latency difference we hoped to exploit by using WebSockets in any case. While we’d originally imagined using WebSockets for other endpoints, there was never a good justification for moving more components to the WebSockets system. This WebSockets code path had a lot of downsides/complexity, including: * The messy hack involving constructing an emulated request object to hook into doing Django requests. * The `message_senders` queue processor system, which increases RAM needs and must be provisioned independently from the rest of the server). * A duplicate check_send_receive_time Nagios test specific to WebSockets. * The requirement for users to have their firewalls/NATs allow WebSocket connections, and a setting to disable them for networks where WebSockets don’t work. * Dependencies on the SockJS family of libraries, which has at times been poorly maintained, and periodically throws random JavaScript exceptions in our production environments without a deep enough traceback to effectively investigate. * A total of about 1600 lines of our code related to the feature. * Increased load on the Tornado system, especially around a Zulip server restart, and especially for large installations like zulipchat.com, resulting in extra delay before messages can be sent again. 
As detailed in https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it appears that removing WebSockets moderately increases the time it takes for the `send_message` API query to return from the server, but does not significantly change the time between when a message is sent and when it is received by clients. We don’t understand the reason for that change (suggesting the possibility of a measurement error), and even if it is a real change, we consider that potential small latency regression to be acceptable. If we later want WebSockets, we’ll likely want to just use Django Channels. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-01-14 22:34:00 -08:00
Mateusz Mandera	0beae44081	email_mirror: Use .walk() to search all MIME parts for attachments. Fixes #13416 We used to search only one level in depth through the MIME structure, and thus would miss attachments that were nested deeper (which can happen with some email clients). We can take advantage of message.walk() to iterate through each MIME part.	2020-01-14 15:37:39 -08:00
Mateusz Mandera	1561d144e0	email_mirror: Insert a new line before attachment links.	2020-01-14 15:37:39 -08:00
Tlazypanda	30ee0c2a49	invitations: Improve experience around reactivating users. Previously, if you tried to invite a user whose account had been deactivated, we didn't provide a clear path forward for reactivating the users, which was confusing. We fix this by plumbing through to the frontend the information that there is an existing user account with that email address in this organization, but that it's deactivated. For administrators, we provide a link for how to reactivate the user. Fixes #8144.	2020-01-13 18:30:51 -08:00
Tim Abbott	79f18138f5	realm: Add private_message_policy setting. This experimental setting disables sending private messages in Zulip in a crude way (i.e. users get an error when they try to send one). It makes no effort to adjust the UI to avoid advertising the idea of sending private messages. Fixes #6617.	2020-01-13 12:20:42 -08:00
Mateusz Mandera	d5ac1afce8	email_mirror: Check address usability in get_missed_message_address.	2020-01-12 20:43:51 -08:00
Mateusz Mandera	89046ea1a9	email_mirror: Give extract_and_validate a more descriptive name.	2020-01-12 11:30:18 -08:00
Mateusz Mandera	90a69ab24f	email_mirror: Reuse exception messages in mirror_email_message.	2020-01-12 11:30:18 -08:00
Mateusz Mandera	9f2b0c769f	stream_recipient: Eliminate unnecessary queries. We should take advantage of the recipient field being denormalized into the Stream model. We don't need to make queries to figure out a stream's recipient id, so we take advantage of that to eliminate some of those redundant queries and simplify StreamRecipientMap.	2020-01-08 14:34:43 -08:00
Mateusz Mandera	786c235023	stream_recipient: Optimize query in populate_for_recipient_ids. There's no reason to join with the Stream table, as Recipient.type_id is the stream id.	2020-01-08 14:34:43 -08:00
Hashir Sarwar	0cabacb8ab	export: Fix data export parallelization. This improves the approach of creating multiple parallel processes by using subprocess.Popen() instead of run_parallel() and subprocess.call() while exporting an organization's message history. This prevents forking twice for individual subprocess. While this has some performance benefit, the main reason to fix this is that it fixes an issue with the data export web UI introduced in run_parallel forks exited). Fixes #12904.	2020-01-07 13:23:18 -08:00
Mateusz Mandera	b87cf22b33	email_mirror: Move send_to_mm_address code to process_missed_message. process_missed_message did nothing other than calling send_to_missed_message_address with the same arguments, so there's no reason to have these as separate functions.	2020-01-07 13:03:32 -08:00
Mateusz Mandera	c011d2c6d3	email_mirror: Migrate missed message addresses from redis to database. Addresses point 1 of #13533. MissedMessageEmailAddress objects get tied to the specific message that was missed by the user. A useful benefit of that is that an email message sent to that address will handle topic changes - if the message that was missed gets its topic changed, the email response will get posted under the new topic, while in the old model it would get posted under the old topic, which could potentially be confusing. Migrating redis data to this new model is a bit tricky, so the migration code has comments explaining some of the compromises made there, and test_migrations.py tests handling of the various possible cases that could arise.	2020-01-07 13:03:22 -08:00
Mateusz Mandera	9077bbfefd	models: Add MissedMessageEmailAddress class. Preparatory commit for making the email mirror use the database instead of redis for missed message addresses. This model will represent missed message email addresses, which currently have their data stored in redis. The redis data will be converted and migrated into these models and the email mirror will start using them in the main commit.	2020-01-07 12:46:55 -08:00
Steve Howell	630aadb7e0	bot_owner_id: Explicitly set bot_owner_id to None. For cross realm bots, explicitly set bot_owner_id to None. This makes it clear that the cross realm bots have no owner, whereas before it could be misdiagnosed as the server forgetting to set the field.	2020-01-07 12:33:14 -08:00
Mateusz Mandera	510bc60663	test_helpers: Set Recipient class attrs in use_db_models. Model classes fetched through apps.get_model don't get methods or class attributes. It's not feasible to add them to all these objects in use_db_models, but Recipient.PERSONAL etc. are worth setting, since doing that increases the range of functions that can successfully be imported and called in test_migrations.py.	2020-01-03 16:56:58 -08:00
Mateusz Mandera	d691c249db	api: Return a JsonableError if API key of invalid format is given.	2020-01-03 16:56:42 -08:00
Mateusz Mandera	72401b229f	utils: Add a function to check if string can be an API key.	2020-01-03 16:56:42 -08:00
Mateusz Mandera	4f2897fafc	cache: Validate keys before passing them to memcached. Fixes #13504. This commit is purely an improvement in error handling. We used to not do any validation on keys before passing them to memcached, which meant for invalid keys, memcached's own key validation would throw an exception. Unfortunately, the resulting error messages are super hard to read; the traceback structure doesn't even show where the call into memcached happened. In this commit we add validation to all the basic cache_* functions, and appropriate handling in their callers. We also add a lot of tests for the new behavior, which has the nice effect of giving us decent coverage of all these core caching functions which previously had been primarily tested manually.	2020-01-03 16:56:42 -08:00
Steve Howell	405a529340	server: Sort user_ids in recent PM conversations. This change should prevent test flakes, plus it's more deterministic behavior for clients, who will generally comma-join the ids into a key for their internal data structures. I was able to verify test coverage on this by making the sort reversed, which would cause test_huddle_send_message_events to fail.	2020-01-02 11:59:58 -08:00
Anders Kaseorg	8f281c4fc9	apply_event: Replace list comprehension with list.remove. This should be about 4 times faster, saving something like half a millisecond on each stream of 10000 subscribers. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-12-31 10:06:09 -08:00
Tim Abbott	851eb1a6ee	generate_test_data: Remove some useless type annotations. One of these caused a parser error trying to run pyre on Zulip; the other is just useless as the type can be inferred.	2019-12-13 11:52:23 -08:00
Tim Abbott	7ccc8373e2	bugdown: Fix logic for extracting attachment path_id. In `3892a8afd8`, we restructured the system for managing uploaded files to a much cleaner model where we just do parsing inside bugdown. That new model had potentially buggy handling of cases around both relative URLs and URLs starting with `realm.host`. We address this by further rewriting the handling of attachments to avoid regular expressions entirely, instead relying on urllib for parsing, and having bugdown output `path_id` values, so that there's no need for any conversions between formats outside bugdown. The check_attachment_reference_change function for processing message updates is significantly simplified in the process. The new check on the hostname has the side effect of requiring us to fix some previously weird/buggy test data. Co-Author-By: Anders Kaseorg <anders@zulipchat.com> Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>	2019-12-12 20:30:26 -08:00
Anders Kaseorg	8e37862b69	CVE-2019-19775: Close open redirect in thumbnail view. This closes an open redirect vulnerability, one case of which was found by Graham Bleaney and Ibrahim Mohamed using Pysa. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-12-12 17:29:20 -08:00
Tim Abbott	4901dc3795	url_preview: Fix parsing of open graph tags. Our open graph parser logic sloppily mixed data obtained by parsing open graph properties with trusted data set by our oembed parser. We fix this by consistently using our explicit whitelist of generic properties (image, title, and description) in both places where we interact with open graph properties. The fixes are redundant with each other, but doing both helps in making the intent of the code clearer. The issue fixed here was originally reported as an XSS vulnerability in the upcoming Inline URL Previews feature found by Graham Bleaney and Ibrahim Mohamed using Pysa. The recent Oembed changes close that vulnerability, but this change is still worth doing to make the implementation do what it looks like it does.	2019-12-12 15:24:38 -08:00
Anders Kaseorg	faa3ea0b8e	oembed: Remove unsound HTML filtering. The frontend now takes care of confining the HTML. Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2019-12-12 15:24:38 -08:00
Tim Abbott	9f223bb7c2	url_preview: Simplify path to oembed code.	2019-12-12 13:34:49 -08:00
Tim Abbott	e7cf1112c8	notifications: Enable online push notifications by default. For new user onboarding, it's important for it to be easy to verify that Zulip's mobile push notifications work without jumping through hoops or potentially making mistakes. For that reason, it makes sense to toggle the notification defaults for new users to the more aggressive mode (ignoring whether the user is currently actively online); they can set the more subtle mode if they find that the notifications are annoying.	2019-12-12 13:04:10 -08:00
Tim Abbott	f3c224058f	models: Use unlimited .select_related() for Stream and DefaultStream. Previously, these accesses used e.g. .select_related("realm"), which was the only foreign key on the Stream model. Since the intent in these code paths is to attach the related models for efficient access, we should just do that for all related models, including Recipient.	2019-12-12 12:13:07 -08:00
Mateusz Mandera	9a42a83e15	streams: Remove get_stream_recipients function and its uses. With the recipient field being denormalized into the UserProfile and Streams models, all current uses of get_stream_recipients can be done more efficiently, by simply checking the .recipient_id attribute on the appropriate objects.	2019-12-12 12:05:42 -08:00
Mateusz Mandera	01288ede9e	recipients: Remove bulk_get_recipients function and its uses. With the recipient field being denormalized into the UserProfile and Streams models, all current uses of bulk_get_recipients can be done more efficiently, by simply checking the .recipient_id attribute on the appropriate objects.	2019-12-12 12:00:13 -08:00
Tim Abbott	63fd7bdf57	actions: Simplify logic of get_recipient_from_user_profiles. This just uses the early return pattern and a local variable to produce somewhat more readable code.	2019-12-12 11:59:27 -08:00
Mateusz Mandera	9995dab095	messages: Save a database query in check_message code path. The flow in recipient_for_user_profiles previously worked by doing validation on UserProfile objects (returning a list of IDs), and then using that data to look up the appropriate Recipient objects. For the case of sending a private message to another user, the new UserProfile.recipient column lets us avoid the query to the Recipient table if we move the step of reducing down to user IDs to only occur in the Huddle code path.	2019-12-12 11:49:01 -08:00
Mateusz Mandera	690dc7313d	actions: Restore a misplaced comment to its correct position.	2019-12-11 18:46:33 -08:00
Tim Abbott	299896b6ce	notifications: Ignore mobile presence when sending notifications. Previously, if the user had interacted with the Zulip mobile app in the last ~140 seconds, it's likely the mobile app had sent presence data to the Zulip server, which in turn means that the Zulip server might not send that user mobile push notifications (or email notifications) about new messages for the next few minutes. The email notifications behavior is potentially desirable, but the push notifications behavior is definitely not -- a private message reply to something you sent 2 minutes ago is definitely something you want a push notification for. This commit partially addresses that issue, by ignoring presence data from the ZulipMobile client when determining whether the user is currently engaging with a Zulip client (essentially, we're only considering desktop activity as something that predicts the user is likely to see a desktop notification or is otherwise "online").	2019-12-11 16:05:35 -08:00
Tim Abbott	958f39a551	message_edit: Call check_attachment_reference_change unconditionally. This removes the last of the messy use of regular expressions outside bugdown to make decisions on whether a message contains an attachment or not. Centralizing questions about links to be decided entirely within bugdown (rather than doing ad-hoc secondary parsing elsewhere) makes the system cleaner and more robust.	2019-12-11 11:10:46 -08:00
Rohitt Vashishtha	3fbb050216	messages: Remove dependence on regex for claiming attachments. This commit wraps up the work to remove basic regex based parsing of messages to handle attachment claiming/unclaiming. We now use the more dependable Bugdown processor to find potential links and only operate upon those links instead of parsing the full message content again.	2019-12-11 11:03:49 -08:00
Rohitt Vashishtha	fe24f4ee65	messages: Remove update_calculated_fields method. This infrastructure is no longer needed following reworking of how has_link and friends work.	2019-12-11 11:03:49 -08:00
Rohitt Vashishtha	3892a8afd8	messages: Set has_attachment correctly using Bugdown. Previously, we would naively set has_attachment just by searching the whole messages for strings like `/user_uploads/...`. We now prevent running do_claim_attachments for messages that obviously do not have an attachment in them that we previously ran. For example: attachments in codeblocks or attachments that otherwise do not match our link syntax. The new implementation runs that check on only the urls that bugdown determines should be rendered. We also refactor some Attachment tests in test_messages to test this change. The new method is: 1. Create a list of potential_attachment_urls in Bugdown while rendering. 2. Loop over this list in do_claim_attachments for the actual claiming. For saving: 3. If we claimed an attachment, set message.has_attachment to True. For updating: 3. If claimed_attachment != message.has_attachment: update has_attachment. We do not modify the logic for 'unclaiming' attachments when editing.	2019-12-11 11:03:44 -08:00
Rohitt Vashishtha	4674cc5098	bugdown: Set message.has_image while rendering message.	2019-12-11 17:01:41 +05:30
dustinheestand	157c98de99	bugdown: Correctly set has_link attribute on messages. Now autolinks and message edits affect the has_link attribute on messages.	2019-12-11 17:01:41 +05:30
Rohitt Vashishtha	182503e5c0	bugdown: Move helper methods to InlineInterestingLinksProcessor. add_a, add_oembed_data and add_embed are only called by InlineInterestingLinksProcessor and this commit allows these methods to access self.markdown object.	2019-12-10 15:35:00 -08:00
Rohitt Vashishtha	1229e69e9b	bugdown: Reenable -,+ to begin a markdown list. This commit has a side-effect that we also now allow mixed lists, but they have different syntax from the commonmark implementation and our marked output. For example, without the closing li tags: Input Bugdown Marked ------------------------------------- <ul> - Hello <li>Hello <ul><li>Hello</ul> + World <li>World <ul><li>World + Again <li>Again <li>Again</ul> * And <li>And <ul><li>And * Again <li>Again <li>Again</ul> </ul> The bugdown render is in line with what a user in #13447 requests. Fixes #13477.	2019-12-09 16:13:02 -08:00
Nat1405	d5f005fd61	wildcard_mentions_notify: Add per-stream override of global setting. Adds required API and front-end changes to modify and read the wildcard_mentions_notify field in the Subscription model. It includes front-end code to add the setting to the user's "manage streams" page. This setting will be greyed out when a stream is muted. The PR also includes back-end code to add the setting to the initial state of a subscription. New automated tests were added for the API, events system and front-end. In manual testing, we checked that modifying the setting in the front end persisted the change in the Subscription model. We noticed the notifications were not behaving exactly as expected in manual testing; see https://github.com/zulip/zulip/issues/13073#issuecomment-560263081 . Tweaked by tabbott to fix real-time synchronization issues. Fixes: #13429.	2019-12-09 16:09:38 -08:00
Mateusz Mandera	792fbeea24	messages: Optimize check_message using recent denormalization.	2019-12-09 15:24:51 -08:00
Mateusz Mandera	1c5461663f	users: Eliminate some unnecessary get_personal_recipient calls.	2019-12-09 15:24:35 -08:00
Mateusz Mandera	467833a974	streams: Eliminate some unnecessary get_stream_recipient calls.	2019-12-09 15:24:35 -08:00
Mateusz Mandera	dda3ff41e1	messages: Optimize get_recent_private_conversations. Previously, get_recent_private_messages could take 100ms-1s to run, contributing a substantial portion of the total runtime of `/`. We fix this by taking advantage of the recent denormalization of personal_recipient into the UserProfile model, allowing us to avoid the complex join with Recipient that was previously required. The change that requires additional commentary is the change to the main, big SQL query: 1. We eliminate UserMessage table from the query, because the condition m.recipient_id=%(my_recipient_id)d implies m is a personal message to the user being processed - so joining with usermessage to check for user_profile_id and flags&2048 (which checks the message is private) is redundant. 2. We only need to join the Message table with UserProfile (on sender_id) and get the sender's personal_recipient_id from their UserProfile row. Fixes #13437.	2019-12-09 15:23:10 -08:00
Mateusz Mandera	8acfa17fe6	models: Add recipient foreign key in UserProfile and Stream. This adds foreign keys to the corresponding Recipient object in the UserProfile and Stream tables, a denormalization intended to improve performance as this is a common query. In the migration for setting the field correctly for existing users, we do a direct SQL query (because Django 1.11 doesn't provide any good method for doing it properly in bulk using the ORM). A consequence of this change to the model is that a bit of code needs to be added to the functions responsible for creating new users (to set the field after the Recipient object gets created). Fortunately, there are only a few code paths for doing that. Also an adjustment is needed in the import system - this introduces a circular relation between Recipient and UserProfile. The field cannot be set until the Recipient objects have been created, but UserProfiles need to be created before their corresponding Recipients. We deal with this by first importing UserProfiles the same way as before, but we leave the personal_recipient field uninitialized. After creating the Recipient objects, we call a function to set the field for all the imported users in bulk. A similar change is made for managing Stream objects.	2019-12-09 15:14:41 -08:00
Vishnu KS	c8ede33fc3	openapi: Specify securityScheme for the API in root level. We used to specify the securityScheme for each REST operation separately. This is unnecessary, as the securityScheme can be specified at the root level and is then automatically applied to all operations. This also prevents us from accidentally omitting the securityScheme for some operations, as was the case for the /users/me/subscriptions PATCH endpoint. The root level securityScheme can also be overridden at the operation level when necessary. swagger.io/docs/specification/authentication/#security	2019-12-06 11:19:08 -08:00
Vishnu KS	e08d029dde	docs: Use term operation instead of openapi in generate_curl_example. The term operation makes more sense than openapi. The OpenAPI spec defines a unique operation as a combination of a path and an HTTP method.	2019-12-06 11:19:08 -08:00
Mateusz Mandera	2b6cfbcf7b	push_notifs: Handle more requests Exceptions in send_to_push_bouncer. Closes #13294.	2019-12-04 09:58:22 -08:00
Mateusz Mandera	7d0444f903	push_notifs: Improve handling of errors when talking to the bouncer. We use the plumbing introduced in a previous commit, to now raise PushNotificationBouncerRetryLaterError in send_to_push_bouncer in case of issues with talking to the bouncer server. That's a better way of dealing with the errors than the previous approach of returning a "failed" boolean, which generally wasn't checked in the code anyway and did nothing. The PushNotificationBouncerRetryLaterError exception will be nicely handled by queue processors to retry sending again, and due to being a JsonableError, it will also communicate the error to API users.	2019-12-04 09:58:22 -08:00
Mateusz Mandera	20b30e1503	push_notifs: Set up plumbing for retrying in case of bouncer error. We add PushNotificationBouncerRetryLaterError as an exception to signal an error occurred when trying to communicate with the bouncer and it should be retried. We use JsonableError as the base class, because this signal will need to work in two roles: 1. When the push notification was being issued by the queue worker PushNotificationsWorker, it will signal to the worker to requeue the event and try again later. 2. The exception will also possibly be raised (this will be added in the next commit) on codepaths coming from a request to an API endpoint (for example to add a token, to users/me/apns_device_token). In that case, it'll be needed to provide a good error to the API user - and basing this exception on JsonableError will allow that.	2019-12-04 09:58:22 -08:00
Rohitt Vashishtha	68e93d2435	update-message: Use MentionData in the update_message_backend code. This is a performance optimization, since we can avoid doing work related to wildcard mentions in the common case that the message can't have any. We also add a unit test for adding wildcard mentions in a message edit.	2019-12-02 12:12:35 -08:00
Rohitt Vashishtha	bb42539b3f	do_send_messages: Populate possible_wildcard_mentions from MentionData. Fixes #13430.	2019-12-02 12:12:35 -08:00

5008 Commits