Commit Graph

12969 Commits

Author SHA1 Message Date
Steve Howell 1040fb7219 email digests: Remove handle_digest_email shim.
The previous commit made it so we only call the
shim in tests, so now we completely remove it.
2021-01-17 11:28:30 -08:00
Steve Howell bfa0bdf3d6 email digests: Process users in chunks of 30.
This should make the queue empty more quickly,
because we do bulk queries to prevent database
hops.
2021-01-17 11:28:30 -08:00
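A minimal sketch of the chunking idea in bfa0bdf3d6, assuming a plain list of user ids. The names `chunked` and `CHUNK_SIZE` are illustrative and are not taken from the Zulip codebase; only the chunk size of 30 comes from the commit message.

```python
from typing import Iterator, List, Sequence

CHUNK_SIZE = 30  # chunk size named in the commit message


def chunked(items: Sequence[int], size: int = CHUNK_SIZE) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start : start + size])


# 75 hypothetical user ids, split into batches that could each be
# handled with one round of bulk queries.
user_ids = list(range(1, 76))
batches = list(chunked(user_ids))
```

Each batch can then be passed to whatever bulk helper does the database work, so the per-query cost is shared by up to 30 users.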
Steve Howell e0b451730a email digests: Extract get_new_streams.
This makes us more efficient when handling
multiple users.  We don't have to keep
sending the same two queries to the database.

Note that as part of this we eliminated
a failure mode for the obscure population
of users for whom both `user.is_guest` and
`user.can_access_public_streams()` return
False.  We know this would have only affected
Zephyr users (by looking at the code), and
we know we don't actually process Zephyr
users for email digests (or else we would
have raised exceptions in the old code).
2021-01-17 11:28:30 -08:00
Steve Howell 23de94504f email digests: Query streams for messages up front.
This should save us many hops to the database when
we process users in bulk.
2021-01-17 11:28:30 -08:00
Steve Howell 3662bf2dcb minor: Rename stream_map -> user_stream_map. 2021-01-17 11:28:30 -08:00
Steve Howell 11c93aced5 minor: Rename user_profile -> user and avoid shadowing. 2021-01-17 11:28:30 -08:00
Steve Howell f8bbb7fea9 email digests: Use select_related("realm").
We mostly need realm_id, but when we go to build
message lists, we need realm.uri.

We could probably be more aggressive about using
`only` here, but for now I am just trying to
reduce hops to the database.
2021-01-17 11:28:29 -08:00
Steve Howell bb56f0ec0e minor: Move get_stream_map to module level.
This is a pure code move.
2021-01-17 11:28:29 -08:00
Steve Howell 52e2d5a733 email digests: Avoid long_term_idle check.
We want to exclude users with recent subscription
activity from emails, regardless of whether
the long_term_idle flag is set.
2021-01-17 11:28:29 -08:00
Steve Howell 162b372b93 email digests: Do one query for recent streams.
This is another way to limit hops to the database
when we process users in bulk.
2021-01-17 11:28:29 -08:00
Alex Vandiver c2526844e9 worker: Remove SignupWorker and friends.
ZULIP_FRIENDS_LIST_ID and MAILCHIMP_API_KEY are not currently used in
production.

This removes the unused 'signups' queue and worker.
2021-01-17 11:16:35 -08:00
Alex Vandiver 01658e39a9 sentry: Verify version is supported, first.
Raven SDK does not send a `title` field.
2021-01-17 11:15:40 -08:00
Alex Vandiver d688e18de2 errors: Remove references to "deployment", use "host".
The `deployment` key was only set in `do_report_error`, which is now
only used in one codepath (the queue worker).  The logging handlers on
staging call notify_server_error directly, which omits the
`deployment` key.

Remove the odd one-off key, and instead simply do dispatch in
`do_report_error`.
2021-01-17 11:08:12 -08:00
Mateusz Mandera 3623681d30 message_edit: Don't rely on .recipient_id change not affecting recipient.
The codepath for moving a topic changes the message.recipient_id to the
id of the new recipient, but later, in update_messages_for_topic_edit,
it uses message.recipient when querying for messages with the matching
topic in the *old* stream (because those are the other messages that
need to be moved). This is a bug which happens to work fine, because in
Django 2, if message.recipient gets fetched first and then
message.recipient_id is mutated, message.recipient will not be altered
and thus will retain the outdated, previously fetched value.

In Django 3 changing .recipient_id causes .recipient to be updated to
the new Recipient objects, which is the Recipient of the *new* stream.
That will cause the bug to manifest.

This is a bugfix preparing for the upgrade to Django 3.
2021-01-17 10:39:46 -08:00
Mateusz Mandera f76202dd59 django3: Save language preference in a cookie rather than the session.
Support for saving it in the session is dropped in django3, the cookie
is the mechanism that needs to be used. The relevant i18n code doesn't
have access to the response objects and thus needs to delegate setting
the cookie to LocaleMiddleware.

Fixes the LocaleMiddleware point of #16030.
2021-01-17 10:38:58 -08:00
Steve Howell 04b6108e71 minor: Require keywords for verify_action. 2021-01-17 12:31:04 -05:00
Steve Howell 3df507be73 refactor: Clean up args for fetch_initial_state_data.
We now require explicit keywords for all arguments
to fetch_initial_state_data except user_profile.

We provide reasonable defaults to keep the test
code concise.
2021-01-17 12:31:04 -05:00
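The keyword-only style described in 3df507be73 can be sketched as follows. This is only an illustration of the Python feature: `fetch_state_sketch` and its parameters are hypothetical and do not mirror the real signature of fetch_initial_state_data.

```python
from typing import Any, Dict


def fetch_state_sketch(
    user_profile: str,
    *,  # everything after the bare star must be passed by keyword
    include_streams: bool = True,  # hypothetical option with a default
    slim: bool = False,  # hypothetical option with a default
) -> Dict[str, Any]:
    """Return a toy state dict; stands in for a much larger function."""
    return {
        "user": user_profile,
        "include_streams": include_streams,
        "slim": slim,
    }


# Callers must spell out which option they are setting.
state = fetch_state_sketch("hamlet", slim=True)
```

Calling `fetch_state_sketch("hamlet", True)` raises a TypeError, which is the property that keeps call sites readable while the defaults keep test code short.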
Alex Vandiver 08d716c741 registration: Re-use the redirect_to_email_login_url helper.
In the case of reusing a registration link, reuse the
redirect_to_email_login_url helper.  This does have the side effect of
now showing a "you've already registered" note, which did not happen
previously, but that seems probably for the best, since the user did
just click a "register" link.
2021-01-13 11:28:32 -08:00
Alex Vandiver ad3d25103b registration: Pre-fill the email when redirecting to login.
ecfafc05c0 shifted to using a different parameter name to hint that
the user had previously signed up -- and in so doing also stopped
pre-filling the "email" box.  Also send along the email box, to save
users time.
2021-01-13 11:28:32 -08:00
Tushar912 c60f48c889 registration: Move "already in realm" check outside of validation.
Checking for `validate_email_not_already_in_realm` again (after the
form already did so), but only in the case that the form fails to
validate, means that we may be spending time pushing totally invalid
emails to the DB to check.  In the case of emails containing nulls,
this can even trigger a 500 error from PostgreSQL.

Stop calling `validate_email_not_already_in_realm` in the form
validation. The form is currently only used in two places -- in
`accounts_home` and in `maybe_send_to_registration`.  The latter is
only called if the address is known to not currently have an account,
so checking in there is unnecessary; and in the former case, we wish
different behaviour (the redirect) than just validation failure, which
is all the validator can do.

Fixes #17015.

Co-authored-by: Alex Vandiver <alexmv@zulip.com>
2021-01-13 11:28:32 -08:00
Tushar912 410bb8ad89 imports: Add better checking for subdomains.
Add a `--allow-reserved-subdomain` flag which allows creation of
reserved keyword domains.  This also always enforces that the domain
is not in use, which was removed in 0258d7d.

Fixes #16924.
2021-01-12 17:54:01 -08:00
sushant52 6f0e8a9888 auth: Handle the case of invalid subdomain at various points.
Fixes #16770.
2021-01-11 22:29:50 -08:00
Siddharth Asthana 6c888977a6 change_subdomain: Create a deactivated realm on updating subdomain.
When changing the subdomain of a realm, create a deactivated realm with
the old subdomain of the realm, and set its deactivated_redirect to the
new subdomain.
Doing this will help us to do the following:
- When a user visits the old subdomain of a realm, we can tell the user
that the realm has been moved.
- During the registration process, we can assure that the old subdomain
of the realm is not used to create a new realm.

If the subdomain is changed multiple times, the deactivated_redirect
fields of all the deactivated realms are updated to point to the new
uri.
2021-01-07 14:15:22 -08:00
Aman Agrawal e566e985e4 topic_edit: Store edit history in all the messages affected.
Instead of just storing the edit history in the message which
triggered the topic edit, we store the edit history in all
the messages that changed. This helps users track the edit history
of a message more reliably.
2021-01-04 18:18:05 -08:00
cozyrohan 16d1ab3d5f webhooks/github: Fix repeating description for edits and updates.
This change updates the GitHub Integration webhook
get_opened_or_update_pull_request_body method so that
the description is only printed if it actually changes.
If the update event is a result of some other
attribute update, such as an assignee change, then the
description is not included in the message sent to
the Zulip stream.

Fixes #16345
2021-01-04 14:34:17 -08:00
Aman Agrawal c685d36821 hipchat_import: Remove tool from codebase.
Remove functions and scripts used by HipChat import tool and
those which will no longer be required in future.
2020-12-23 08:28:49 -08:00
Aman Agrawal 62d721e859 docs: Remove HipChat migration guide.
As of Feb 15th 2019, HipChat Cloud and Stride
have reached End Of Life and are no longer
supported by Atlassian. Since it has been almost 2 years,
we can now remove the migration guides.
2020-12-23 15:43:13 +05:30
Vishnu KS 9fe39646fa analytics: Specify exact end_time in realm summary query.
Fetching rows with end_time within the last 25 hours would result
in the realmcount queries returning two rows for each realm
if the analytics page was opened within an hour since the
count stats were updated.
2020-12-22 16:44:31 -08:00
Mateusz Mandera 160cc5120a api: Require can_create_users permission to create users via API.
Allowing any admins to create arbitrary users is not ideal because it
can lead to abuse issues.  We should require something stronger that
requires the server operator's approval and thus we add a new
can_create_users permission.
2020-12-21 13:20:21 -08:00
Mateusz Mandera c9b6d8ddad models: Remove redundant Meta.permissions on Realm model.
This is dead code leftover from the old way of handling admin
permissions.
2020-12-21 13:15:40 -08:00
Mateusz Mandera d0dc04a093 models: Rename is_api_super_user to can_forge_sender. 2020-12-21 13:15:39 -08:00
sahil839 2fa33be683 actions: Refactor check_message to return a dataclass instead of Dict.
We change the return type of check_message to be a dataclass instead of
Dict[str, Any]. This refactoring helps us to understand the context of the
data structure returned by check_message clearly which was not possible
when using Dict.

SendMessageRequest class is added in zerver/lib/message.py in spite of it
not being used in that file itself just to maintain consistency as other
TypedDicts and dataclasses are defined in that file and to avoid circular
dependency as SendMessageRequest is being used in lib/widget.py as well.

We also rename local variable to 'send_request' for accessing
SendMessageRequest objects.
2020-12-21 12:55:30 -08:00
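A rough sketch of the Dict-to-dataclass change described in 2fa33be683. The class and function below are hypothetical stand-ins: the field names are invented for illustration and are not the real fields of SendMessageRequest.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SendRequestSketch:
    """Typed result; a type checker can now verify every attribute access."""

    content: str
    sender_id: int
    widget_content: Optional[Dict[str, Any]] = None


def check_message_sketch(sender_id: int, content: str) -> SendRequestSketch:
    # With a Dict[str, Any] return type, a typo such as result["contnet"]
    # would only fail at runtime; with a dataclass it is a static error.
    return SendRequestSketch(content=content, sender_id=sender_id)


send_request = check_message_sketch(7, "hello")
```

Attribute access such as `send_request.content` is checked by mypy as `str`, which is also why reusing one variable for both a str and a dict starts to produce 'Incompatible types in assignment' errors.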
Tim Abbott 908025bdad runtornado: Avoid providing a URL for Tornado on startup.
The {addr} part isn't directly useful, since connections to Tornado
are done on localhost anyway, and made the development environment
output a bit more confusing.

Also, use the same phrasing for restarts we use for Django.
2020-12-20 12:27:51 -08:00
Tim Abbott 1f036f9bde tornado: Reduce logging of event queue load/dump.
This logging is really only potentially interesting in a development
environment when the numbers are nonzero.

In production, it seems worth logging for consistency reasons.

Probably we'll eventually redo this block by changing the log level, but
this is good enough to despam the development environment startup
output.
2020-12-20 12:14:39 -08:00
Anders Kaseorg a054f57af6 message: Bundle message stripping, validation, and truncation.
We always want to do these at the same time.  Previously, message
editing did too much stripping (fixes #16837) and failed to check for
NUL bytes.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-18 17:44:13 -08:00
sahil839 37c8505435 message: Raise exception when trying to mirror an already sent message.
Previously we were just returning a dict containing a message id when
trying to mirror an already sent message in 'zephyr_mirror' cases.

This commit changes this behaviour to raise an exception when trying
to mirror an already sent message by adding a new exception class
ZephyrMessageAlreadySentException and then the caller returns the
message_id directly, instead of calling do_send_messages which also
returns a list of size one containing the message_id only.

This is a prep commit for changing the return type of check_message to
be a dataclass instead of a Dict as now we have only single output for
check_message.
2020-12-18 16:40:11 -08:00
sahil839 4e99ec34a9 widget: Use different variable names for message and submessage content.
This commit renames the content variable in do_widget_post_save_actions
to message_content and is a prep commit for changing the return type of
check_message from Dict to dataclass.

This change is required because content variable is used two times in
this function - one for message content and other for submessage
content, so when we change the return type of check_message to
dataclass, the type of content variable is considered as str and then
when dict is assigned to content in the submessage case, mypy raises
'Incompatible types in assignment' error.

This issue is not faced before the dataclass migration because there is
no type checking for the values of dict returned by check_message as the
return type of check_message is 'Dict[str, Any]'.
2020-12-18 16:19:35 -08:00
sahil839 db85b8a236 actions: Change type of wildcard_mention_user_ids in message_dict to set.
The message_dict['wildcard_mention_user_ids'] should be empty set instead
of empty list when there are no wildcard mentions similar to the case
when there are wildcard mentions, where it is equal to set of user ids and
not list of user ids.
2020-12-18 16:17:26 -08:00
Anders Kaseorg 6b8f4782c4 test_mattermost_importer: Fix test for admins-to-owners change.
Commit ed498e2f8e forgot to update this
test.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-17 18:59:08 -08:00
Tim Abbott ed498e2f8e import: Import mattermost admins as Zulip owners.
Otherwise, we violate the invariant that all organizations have an owner.
2020-12-17 18:45:45 -08:00
Anders Kaseorg 2ab0b3d4fc validator: Reject ISO 8601 dates missing leading zeros.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-15 16:36:50 -08:00
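One way to reject dates without leading zeros, as 2ab0b3d4fc describes, is to check the shape before parsing. This is a sketch under that assumption, not the validator's actual code; `is_strict_iso_date` is a hypothetical name. It is needed because `datetime.strptime` alone accepts single-digit months and days for `%m` and `%d`.

```python
import re
from datetime import datetime

# Exactly four, two, and two ASCII digits separated by hyphens.
STRICT_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_strict_iso_date(value: str) -> bool:
    """Accept only zero-padded YYYY-MM-DD strings naming a real date."""
    if STRICT_DATE.fullmatch(value) is None:
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Right shape, but not a real calendar date (e.g. month 13).
        return False
    return True
```

With this check, "2020-12-15" is accepted while "2020-1-5" and "2020-13-01" are rejected.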
Max Zawisa 0e40cc72af newrelic: Added owner field and cleaned up code.
I reformatted the tests and view to include information about who
acknowledged and closed the alert. Only includes the information about
the owner if there was an owner.

Made a few small changes to the refactored bit as requested in review.
2020-12-15 12:04:46 -08:00
Max Zawisa 57e847ab89 newrelic: refactor of time input handling.
Moved time formatting check and conversion to
zerver/lib/webhooks/common.py. Updated tests slightly to match new
output. Removed duration from the calculation because the difference
is less than the precision of output and it complicated the error
handling.
2020-12-15 12:04:46 -08:00
Max Zawisa ec00557962 docs: Updated New Relic documentation.
The docs are updated to work with the new webhook and new process on
https://one.newrelic.com.
2020-12-15 12:04:46 -08:00
Mateusz Mandera b652cc786c django3: Remove remaining postgresql_psycopg2 use.
Removed in Django 3.0.
2020-12-15 11:52:32 -08:00
angela s 64becb20b5 logging: Set decorator tests to use assertLogs.
Fixes part of #15331.
2020-12-15 11:46:25 -08:00
Alex Vandiver 438d2aa632 digests: Ensure that the teaser_data can be JSON-serialized.
Leaving this as a set means that it fails in zerver.lib.send_email
when serializing into a ScheduledEmail object.
2020-12-15 11:44:50 -08:00
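The failure mode in 438d2aa632 can be reproduced with the standard json module alone. The payload below is hypothetical; only the fact that a set cannot be JSON-serialized comes from the commit message.

```python
import json

# Hypothetical teaser payload that still holds a set.
teaser_data = {"new_streams": {"design", "errors"}}

try:
    json.dumps(teaser_data)
    set_is_serializable = True
except TypeError:
    # json.dumps has no encoding for Python sets.
    set_is_serializable = False

# Converting to a sorted list gives a stable, serializable value.
serializable = {key: sorted(value) for key, value in teaser_data.items()}
encoded = json.dumps(serializable)
```

Sorting is optional, but it makes the output deterministic, which helps when comparing serialized values in tests.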
Alex Vandiver 7c849fa940 slack: Check token access scopes before importing.
The Slack API always (even for failed requests) puts the access scopes
of the token passed in, into "X-OAuth-Scopes"[1], which can be used to
determine if any are missing -- and if so, which.

[1] https://api.slack.com/legacy/oauth-scopes#working-with-scopes
2020-12-15 11:33:15 -08:00
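The scope check in 7c849fa940 might look roughly like this, assuming the header value is a comma-separated list. `REQUIRED_SCOPES` and `missing_scopes` are hypothetical names, and the scopes listed are placeholders rather than the set the importer actually needs.

```python
from typing import Mapping, Set

# Placeholder scopes for illustration only.
REQUIRED_SCOPES = {"users:read", "channels:read"}


def missing_scopes(headers: Mapping[str, str], required: Set[str]) -> Set[str]:
    """Return the required scopes absent from the X-OAuth-Scopes header."""
    # Note: a plain dict lookup is case-sensitive; real HTTP client
    # libraries usually expose case-insensitive header mappings.
    raw = headers.get("X-OAuth-Scopes", "")
    granted = {scope.strip() for scope in raw.split(",") if scope.strip()}
    return required - granted


response_headers = {"X-OAuth-Scopes": "users:read, team:read"}
missing = missing_scopes(response_headers, REQUIRED_SCOPES)
```

Because the header is present even on failed requests, a check like this can run before any import work starts and report exactly which scopes to add.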
Anders Kaseorg 415897f491 api docs: Use normal async/await code in JavaScript examples.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-15 11:32:18 -08:00
Anders Kaseorg bf45f921a7 url_preview: Allow Beautiful Soup to get the charset from <meta>.
An HTML document sent without a charset in the Content-Type header
needs to be scanned for a charset in <meta> tags.  We need to pass
bytes instead of str to Beautiful Soup to allow it to do this.

Fixes #16843.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-15 11:30:57 -08:00
Siddharth Asthana daac7536f3 accounts/deactivated: Show deactivated_redirect url if present
If a user visits a realm which has been deactivated and its
deactivated_redirect field is set, we should have a message telling the
user that the realm has moved to the deactivated_redirect url.
2020-12-14 21:04:52 -08:00
Siddharth Asthana 82f5759299 Realm: Add a deactivated_redirect URLField to Realm object.
We export a realm's data, and disable the realm, because the user
is moving from Zulip Cloud (e.g. https://example.zulipchat.com/) to
self-hosting or another platform (e.g. https://zulip.example.com/)
which we do not control. This commit adds a field in the realm object
called deactivated_redirect to store the url to which the realm has
moved.
2020-12-14 21:04:52 -08:00
Anders Kaseorg 2c5e9f65f8 eslint: Fix new-cap errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-12-10 19:52:22 -08:00
Puneeth Chaganti 5dc3489166 webhooks/sentry: Fix URL generated in transform_webhook_payload.
The URL incorrectly had `event` in the URL path, instead of `events`.

Closes #16783
2020-12-02 12:28:45 -08:00
Puneeth Chaganti b7a08323aa webhooks/sentry: Use received key when timestamp key is absent. 2020-12-02 12:28:45 -08:00
Sundar Guntnur cbb7fb8ac0 anchor_value: Fix parsing of large anchor values.
This handles the conditions when anchor values are larger than
LARGER_THAN_MAX_MESSAGE_ID by clamping them down to it.  Also added
tests for the function parse_anchor_value.

Fixes #16768.
2020-12-02 11:00:22 -08:00
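The clamping described in cbb7fb8ac0 amounts to taking a minimum. In this sketch the constant's value is assumed for illustration, and `clamp_anchor` is a hypothetical helper rather than the real parse_anchor_value.

```python
# Assumed value for illustration; the real constant lives in the codebase.
LARGER_THAN_MAX_MESSAGE_ID = 10**16


def clamp_anchor(raw: str) -> int:
    """Parse an integer anchor and cap it at the sentinel maximum."""
    return min(int(raw), LARGER_THAN_MAX_MESSAGE_ID)


small = clamp_anchor("42")
huge = clamp_anchor("9" * 30)  # far beyond any real message id
```

Python integers do not overflow, so the clamp mainly protects whatever consumes the value downstream, such as a database column with a fixed width.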
Mateusz Mandera 43a0c60e96 exceptions: Make RateLimited into a subclass of JsonableError.
This simplifies the code, as it allows using the mechanism of converting
JsonableErrors into a response instead of having separate, but
ultimately similar, logic in RateLimitMiddleware.
We don't touch tests here because "rate limited" error responses are
already verified in test_external.py.
2020-12-01 13:40:56 -08:00
Steve Howell 92ce2d0e31 events: Fix apply_event for streams.
In 1bcb8d8ee8 I made
it so the webapp doesn't include "streams" in its
state from `fetch_initial_state_data`, but I didn't
address all the places in apply_event.
2020-12-01 13:01:38 -08:00
Steve Howell c566ecfb30 minor: Remove dead code in events test. 2020-12-01 13:01:38 -08:00
Vishnu KS dabbc3445a webhooks: Properly format the currency amount for refunds.
By default all Stripe API amounts are in the currency's smallest unit.
It's up to us to convert it to a bigger unit and show it to the end user.
The refund event used to show the amount in the smallest unit, which made
the output wrong for most currencies (USD, EUR, INR, etc.), which use
a bigger unit (e.g. dollars instead of cents) as the standard.
2020-11-29 18:11:24 -08:00
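A sketch of the unit conversion described in dabbc3445a, assuming two-decimal currencies by default. `format_amount` and `ZERO_DECIMAL` are hypothetical names, the zero-decimal set is a small illustrative subset, and currencies with other minor-unit sizes are ignored here.

```python
from decimal import Decimal

# Illustrative subset of currencies whose smallest unit is the main unit.
ZERO_DECIMAL = {"jpy", "krw"}


def format_amount(amount: int, currency: str) -> str:
    """Render an amount given in the currency's smallest unit."""
    code = currency.lower()
    if code in ZERO_DECIMAL:
        return f"{amount} {code.upper()}"
    # Decimal avoids float rounding surprises for money values.
    return f"{Decimal(amount) / 100:.2f} {code.upper()}"


refund_usd = format_amount(1250, "usd")
refund_jpy = format_amount(500, "jpy")
```

Here 1250 in USD renders as "12.50 USD", while 500 in JPY is left as "500 JPY".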
Max Zawisa f05a04e000 webhooks: Update NewRelic webhook for new format.
Update the New Relic webhook and tests to match the format specified
in the New Relic documentation. The new format sends a JSON body
instead of using URL parameters. The old format is no longer supported
by New Relic according to their support staff; as a result, the fixtures for
the old test cases were removed. Added fixtures for new test cases.

Fixes: #16393.
2020-11-18 16:19:08 -08:00
Anders Kaseorg 13e35bfa94 mypy: Use sqlalchemy-stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-16 18:17:41 -08:00
Anders Kaseorg 8e0240300a message_fetch: Skip intermediate mutation in limit_query_to_range.
This avoids extra mypy annotations.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-16 18:17:41 -08:00
Anders Kaseorg d0d8c358b3 lint: Migrate typing.Text check to semgrep.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-16 18:17:41 -08:00
Steve Howell 99e725cbde populate_db: Simplify how we create reactions.
For 3000 messages and 400 users, this saved
about 30 seconds.

We only do two queries per batch of messages
now, and the algorithm is easier to analyze,
as it's just three nested loops.
2020-11-16 17:19:23 -08:00
Vishnu KS 5eb63ddb7a webhooks: Handle dispute events with object IDs prefixed with du.
Sometimes the dispute object IDs are prefixed with `du` instead of `dp`.

https://freenode.logbot.info/stripe/20200605#c4059469

The correct long-term fix here would be to stop using object IDs to
detect the object type of these events and instead maybe make use of
"object" key instead.

https://stripe.com/docs/api/disputes/object#dispute_object-object
2020-11-16 17:05:54 -08:00
Steve Howell e2e0f06b2a email digests: Call get_recent_topics once per batch.
Once we start processing digests in batch, this will
let us amortize the expense of the message query
over multiple users.
2020-11-16 08:59:29 -08:00
Steve Howell 428f0564a0 minor: Move context code down in the function.
This will make a subsequent diff a bit less noisy.
2020-11-16 08:59:29 -08:00
Steve Howell 1d1e45e9ec digests: Use UserActivityInterval for user activity.
Note that we are much more efficient about finding
active users here:

    - we do one query per realm (instead of per-user)
    - we pass the cutoff date to the database
    - we get back just a list of distinct ids
2020-11-16 08:59:29 -08:00
Steve Howell b52f56080e performance: Just get user_ids to queue digest emails. 2020-11-16 08:59:29 -08:00
Steve Howell e13e5d104d refactor: Only require user_id for inactive_since().
This function is going away completely soon.  It is
querying everybody's entire UserActivity history instead
of passing the cutoff date to the database!
2020-11-16 08:59:29 -08:00
Steve Howell d0260392f7 digests: Get user objects from the database.
The query counts increase here for somewhat
contrived reasons.  The tests before this
commit reflected a successful trip to the
UserProfile cache, but that's not actually
realistic in practice.
2020-11-16 08:59:29 -08:00
Steve Howell 7737413cec digest tests: Improve gather_new_streams test.
We don't need to mock the dates here.  We also
explicitly clear out all streams first, and then
we explicitly test with both the stream being
current and the stream being old.
2020-11-16 08:59:28 -08:00
Steve Howell 9538edde06 digest tests: Simplify bots test.
We can use the _enqueue_emails_for_realm helper
to avoid all the Tuesday-related logic here.

We also don't bother to create UserActivity
records, since the bot gets excluded by virtue
of its being a bot.  (Also, the date ranges
here were sketchy due to the time mocking.)
2020-11-16 08:59:28 -08:00
Steve Howell 0624833af6 digest tests: Improve Tuesday tests.
If we're mocking time, we should do it consistently.
2020-11-16 08:59:28 -08:00
Steve Howell 2f4d7a6171 tests: Fix test_inactive_users_queued_for_digest.
We can avoid all the date mocking now for all
but a couple tests that exercise the is-it-Tuesday
logic.

And this test now correctly tests that we exclude
recently active users.

And this allows us to remove the other test.
2020-11-16 08:59:28 -08:00
Steve Howell e49a482baf email digests: Make transactions atomic. 2020-11-16 08:59:28 -08:00
Steve Howell cf6bcfb84a digest emails: Exclude users who had recent digests.
This code protects us in case we ever need to re-run
email digests twice in the same day.
2020-11-16 08:59:28 -08:00
Steve Howell fb3d4c1618 digest tests: Avoid warnings about naive time. 2020-11-16 08:59:28 -08:00
Steve Howell 4271442fba email digests: Write RealmAuditLog rows. 2020-11-16 08:59:28 -08:00
Mateusz Mandera 4f47f35cb4 auth: Handle the case of invalid subdomain at /fetch_api_key endpoint. 2020-11-13 16:43:17 -08:00
Anders Kaseorg 1275613812 requirements: Upgrade mypy to 0.790.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-12 15:44:30 -08:00
Anders Kaseorg 8ba95063d5 test_markdown: Construct FencedBlockPreprocessor with a real Markdown.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:28 -08:00
Anders Kaseorg e7e1fde6ec fenced_code: Use immutable type for codehilite_conf.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:28 -08:00
Anders Kaseorg fbf8ce0305 markdown: Add types for extra Markdown members.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg b48bdc65b9 markdown: Fix AlertWordNotificationProcessor.run type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 9573f6dc00 markdown: Fix build_block_parser type.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 4398eecd2b markdown: Use immutable type for extension config.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 060036dfd5 markdown: Merge build_engine into Markdown constructor.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 08c64f5cfa markdown: Fix imports for compatibility with typeshed stubs.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:54:27 -08:00
Anders Kaseorg 2a8a59f548 test_queue_worker: Simplify worker_queue_names computation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:46:04 -08:00
Anders Kaseorg dc84e9696c mypy: Fix types for redis.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-10 15:46:04 -08:00
Anders Kaseorg 3a8cf869db python: Convert os.open(…, O_EXCL) to open(…, "x").
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-09 14:31:01 -08:00
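The conversion named in 3a8cf869db can be illustrated with a temporary directory. This is a generic sketch of the two styles, not code from the commit; the file name is a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lockfile")  # placeholder file name

    # Older style: low-level descriptor with exclusive-create flags.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    os.close(fd)
    os.remove(path)

    # Same intent with the built-in: mode "x" creates the file and
    # fails if it already exists.
    with open(path, "x") as f:
        f.write("first\n")

    try:
        with open(path, "x"):
            pass
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True
```

The built-in form returns a regular file object, so it works with `with` blocks directly and needs no manual `os.close`.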
Puneeth Chaganti 358f1f9ba7 webhooks/sentry: Support integration configured as webhook.
Sentry allows adding simple webhooks without going through the process
of creating an Internal Integration in Sentry's Integration
Platform[1] (which our docs recommend).

The payload sent from such a (simple) webhook integration is
slightly different from the payload sent by an Internal Integration
webhook. This commit tries to wrangle this payload into a form that is
usable by our webhook handler to send a notification message.

[1]: https://sentry.io/integration-platform/
2020-11-09 12:02:49 -08:00
Mateusz Mandera 47228f3a95 actions: Implement do_delete_user.
To have a reasonable way of creating the dummy user without duplicating
code, we need to change create_user to have the optional force_id argument.
2020-11-09 11:58:02 -08:00
akshatdalton 806c1a0b8b markdown: Fix flickering of embedded link inside Italic.
This commit fixes a bug in marked.js which caused it to double-escape
HTML when rendering messages of the form: *[text](url)*.

This fixes a bug introduced in
3bdc8bbaa5, where an unnecessary
escape() call was added for the <em> code path, likely just because it
was adjacent to the others that needed it in the file.

Fix this, and add tests to verify that things are still being escaped
once after removing this extra escape.

Fixes #14845.
2020-11-06 10:09:15 -08:00
Steve Howell 5da4332620 minor: Add order-by-id to digest message query.
The order-by-id is now explicit, and I add
comments to explain the select_related tables.
2020-11-06 10:05:46 -08:00
Steve Howell 936171d258 refactor: Extract DigestTopic class.
This gets us away from a lot of dictionary soup.
2020-11-06 10:05:46 -08:00
Steve Howell e8b6c56322 refactor: Simplify get_hot_topics().
The code we deleted here was no longer
doing anything.

Maybe the code was always dead, or maybe it
was written during a time when topics_by_diversity
and topics_by_length actually had different keys.

But now it's clearly cruft.

If we have 4 or more topics, then the code above
it would already have populated the list with 4
elements, and the `if num_convos < 4` condition
would evaluate to False.

And if we had 3 or fewer topics, then we would
have already put all possible topics into our
result, and the `topics_by_diversity[num_convos:4]`
slice would be empty.

It's possible that we should just have a simple
heuristic for topic hotness like `10*num_senders
+ messages`, so we don't have to maintain this
fiddly function, and we can just do something like
`topics_by_score[:4]`.
2020-11-06 10:05:46 -08:00
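The simpler heuristic floated at the end of e8b6c56322 could be sketched like this. It is only the idea from the commit message, not the shipped get_hot_topics; `TopicStats`, `hotness`, and `hot_topics` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TopicStats:
    name: str
    num_senders: int
    num_messages: int


def hotness(topic: TopicStats) -> int:
    """Score from the commit message: 10*num_senders + messages."""
    return 10 * topic.num_senders + topic.num_messages


def hot_topics(topics: List[TopicStats], limit: int = 4) -> List[TopicStats]:
    """Return up to `limit` topics, highest score first."""
    return sorted(topics, key=hotness, reverse=True)[:limit]


topics = [
    TopicStats("a", 1, 50),
    TopicStats("b", 5, 5),
    TopicStats("c", 2, 10),
    TopicStats("d", 3, 1),
    TopicStats("e", 1, 1),
]
top_names = [topic.name for topic in hot_topics(topics)]

# The dead-code argument in the message relies on this slicing fact:
# with 3 or fewer items, the [num_convos:4] slice is empty.
leftover = topics[:3][3:4]
```

A single score keeps the selection to one sort and one slice, which is the maintenance win the commit message hints at.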
Steve Howell c5dc9d386f refactor: Use sets of stream_ids for email digests.
I now use sets for stream_ids in more of the digest
code.

As part of this I replaced exclude_subscription_modified_streams
with streams_recently_modified_for_user.

It's easier for the caller to just ask for ids
to delete from its callee than it is to pass
in a set/list to mutate.

The simpler boundary between the functions makes
the tests easier to write--you can see the
`filtered_streams` logic goes away in this diff.

I also make the tests a bit more thorough by using
combinations of Cordelia/Othello and Verona/Denmark
to try to find multiple possible flaws.

And I make the time intervals longer than 1s to
avoid false negatives from slow CI boxes.
2020-11-05 17:42:43 -08:00
Steve Howell 88a57ed4ac bulk digest: Get stream subscriptions in bulk.
If we have multiple users, this reduces the number
of queries we need to do, because we get all
subscriptions for all users in a single query
to Subscription.

For the single-user case, we are introducing an
extra query hop, but the database is doing
roughly the same work, because we are just breaking
up this complex query into two hops:

    messages =
        select ...  from message
        where recipient__type_id in (
            select stream_id from subscription
            where ...
        )

Now it's more like:

    stream_ids =
        select stream_id from subscription
        where ...

    messages =
        select ... from message
        where recipient__type_id in stream_ids
2020-11-05 09:36:59 -08:00
Steve Howell c83db37161 email digests: Introduce bulk methods for digest.
Note that we are not changing anything semantically
or algorithmically yet.  The only overhead here
for the single-user case is boxing and unboxing
data into single-item dicts and lists.

The interfaces for callers in the view and the
queue processor remain the same for now.
2020-11-05 09:36:59 -08:00
Steve Howell 7c89e46731 minor: Clean up some code formatting. 2020-11-05 09:36:59 -08:00
Steve Howell 4bd02eea19 minor: Use user, not user_profile, in some digest code. 2020-11-05 09:36:59 -08:00
Steve Howell 0e2d02b0a2 digest tests: Count cache tries. 2020-11-05 09:36:59 -08:00
Steve Howell 127f4e1291 digest tests: Add more users to bulk digest test. 2020-11-05 09:36:59 -08:00
Steve Howell 89cb3fa841 digest tests: Localize mocks.
We didn't need the enough-traffic mock.

We also continue to prep for testing multiple users.

I also finally remove a comment that is about to
be addressed (and which inaccurately refers to huddles).
2020-11-05 09:36:59 -08:00
Steve Howell 1ec16dd1da digest tests: Prep to test bulk digests.
All this does, essentially, is put the logic
we used to test for othello inside of a loop.

We'll add more users in the next commit.
2020-11-05 09:36:59 -08:00
Steve Howell e31326c823 refactor: Extract get_digest_context.
This eliminates the union type and boolean parameter,
and it makes it a bit easier to migrate to a
bulk-get approach.
2020-11-05 09:36:59 -08:00
Steve Howell 217967f743 refactor: Extract get_hot_topics.
This extraction will make a bit more sense when
we start doing bulk operations on a realm to
get digests, but even now, it encapsulates the
slightly complex way we cherry-pick the top 4
topics for a user.
2020-11-05 09:36:59 -08:00
Steve Howell 5a6d6f81ff refactor: Extract get_recent_topic_activity. 2020-11-05 09:36:59 -08:00
Steve Howell f987b014b3 refactor: Rename conversation to topic.
Not only is topic shorter, but the name makes
it clear that we're not dealing with abstract
conversations here--we are truly bucketing by
topic.
2020-11-05 09:36:59 -08:00
Steve Howell 6ac3cd3534 refactor: Use list of topics, not tuples. 2020-11-05 09:36:59 -08:00
Steve Howell 878e938a89 minor: Rename conversation_diversity to conversation_senders. 2020-11-05 09:36:59 -08:00
Steve Howell 6dc8250e9a mypy: Add TopicKey type for digests. 2020-11-05 09:36:59 -08:00
Steve Howell 96f6064b18 refactor: Move Messages query down the digest stack.
This prep step is mostly for diff hygiene; the next
commit will make the code a bit nicer.

The original code here had the nice property that
most (but not all) of the DB work happened up
front in `handle_digest_email`, and none of the
DB work was delegated to the callers.  But I
prefer the tradeoff of making the helpers a bit
more cohesive--let them get the data they need.
And we have query-count coverage in our tests,
so there's no real danger of having helpers
down in the stack insidiously doing a bunch of
extra DB hops.
2020-11-05 09:36:59 -08:00
Anders Kaseorg 13c11ec5f3 openapi: Fix escaping in curl command generation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-05 09:36:31 -08:00
Steve Howell c1f134a3a4 performance: Use ORM to fetch sender in render_markdown.
In 709493cd75 (Feb 2017)
I added code to render_markdown that re-fetched the
sender of the message, to detect whether the sender is
a bot.

It's better to just let the ORM fetch this.  The
message object should already have sender.

The diff makes it look like we are saving round trips
to the database, which is true in some cases.  For
the main message-send codepath, though, we are only
saving a trip to memcached, since the middleware
will have put our sender's user object into the
cache.  The test_message_send test calls internally
to check_send_stream_message, so it was actually
hitting the database in render_markdown (prior to
my change).
2020-11-05 09:35:15 -08:00
Steve Howell 637f596751 tests: Fix queries_captured to clear cache up front.
Before this change we were clearing the cache on
every SQL usage.

The code to do this was added in February 2017
in 6db4879f9c.

Now we clear the cache just one time, but before
the action/request under test.

Tests that want to count queries with a warm
cache now specify keep_cache_warm=True.  Those
tests were particularly flawed before this change.

In general, the old code both over-counted and
under-counted queries.

It under-counted SQL usage for requests that were
able to pull some data out of a warm cache before
they did any SQL.  Typically this would have bypassed
the initial query to get UserProfile, so you
will see several off-by-one fixes.

The old code over-counted SQL usage to the extent
that it's a rather extreme assumption that during
an action itself, the entries that you put into
the cache will get thrown away.  And that's essentially
what the prior code simulated.

Now, it's still bad if an action keeps hitting the
cache for no reason, but it's not as bad as hitting
the database.  There doesn't appear to be any evidence
of us doing something silly like fetching the same
data from the cache in a loop, but there are
opportunities to prevent second or third round
trips to the cache for the same object, if we
can re-structure the code so that the same caller
doesn't have two callees get the same data.

Note that for invites, we have some cache hits
that are due to the nature of how we serialize
data to our queue processor--we generally just
serialize ids, and then re-fetch objects when
we pop them off the queue.
2020-11-05 09:35:15 -08:00
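The difference between clearing the cache once up front and keeping it warm can be sketched with a toy model. This is not Zulip's actual `queries_captured` implementation: `fake_cache`, `query_log`, and `fetch_user` are hypothetical stand-ins for memcached, the SQL log, and a cached lookup; only the names `queries_captured` and `keep_cache_warm` come from the commit message.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Hypothetical stand-ins for the real cache and the real SQL log.
fake_cache: Dict[str, str] = {}
query_log: List[str] = []


def fetch_user(user_id: int) -> str:
    """Return a user, hitting the 'database' only on a cache miss."""
    key = f"user:{user_id}"
    if key not in fake_cache:
        query_log.append(f"SELECT ... WHERE id = {user_id}")
        fake_cache[key] = f"user-{user_id}"
    return fake_cache[key]


@contextmanager
def queries_captured(keep_cache_warm: bool = False) -> Iterator[List[str]]:
    """Collect queries issued inside the block.

    The cache is cleared once, before the code under test runs,
    unless the caller asks for a warm cache.
    """
    if not keep_cache_warm:
        fake_cache.clear()
    start = len(query_log)
    captured: List[str] = []
    try:
        yield captured
    finally:
        captured.extend(query_log[start:])


fetch_user(1)  # warm the cache before any measurement

with queries_captured() as cold:
    fetch_user(1)  # cache was cleared up front, so this queries
    fetch_user(1)  # entries written during the action are kept

with queries_captured(keep_cache_warm=True) as warm:
    fetch_user(1)  # served entirely from the cache
```

In this toy model the cold block records one query rather than two, because entries cached during the action survive, and the warm block records none.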
Tim Abbott eae14baa05 api: URL-quote password when testing authentication API.
The passwords generated for our development environment / test suite
include the `+` character, which needs to be quoted when encoded as an
HTTP POST parameter.

This is hopefully sufficient to fix the CI failures we've seen with
the tests for POST /api/v1/fetch_api_key; I haven't reproduced the
failure so am not completely sure.
2020-11-03 15:55:30 -08:00
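A minimal sketch of why the `+` needs quoting, using only the standard library. The password and email below are made-up values, and this is not the test code the commit touched; it only shows how a form-encoded body is decoded with and without quoting.

```python
from urllib.parse import parse_qs, quote

# Hypothetical credentials, for illustration only.
password = "abc+123"

# Unquoted: the server decodes "+" as a space.
raw_body = f"username=user@example.com&password={password}"
raw_decoded = parse_qs(raw_body)["password"][0]

# Quoted: "+" travels as %2B and survives decoding.
quoted_body = f"username=user@example.com&password={quote(password, safe='')}"
quoted_decoded = parse_qs(quoted_body)["password"][0]
```

Here `raw_decoded` comes back with a space where the `+` was, while `quoted_decoded` matches the original password.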
YashRE42 967efc32d2 widgets: Remove tictactoe example widget.
Steve asked me to remove this, since the tictactoe game was always
intended as a proof of concept. Now that we have poll and todo
widgets, the sample code for tictactoe has much less value.

We replace the content and type in test_widgets.py to maintain
coverage.
2020-11-03 14:46:39 -08:00
Aman Agrawal 87cdd8433d home: Allow logged out user through home.
We allow users to load the webapp without logging in. This is only
enabled for development purposes for now. Production setups will
see no changes.
2020-11-02 17:07:12 -08:00
shanukun be39672026 api_docs: Document the /fetch-api-key endpoint.
With tweaks by tabbott to document additional details.

Fixes: #16408.
2020-11-02 16:45:42 -08:00
shanukun da9d586254 openapi: Add parameter examples for fetch api key endpoints. 2020-11-02 16:45:42 -08:00
Anders Kaseorg ac5cbf7693 Revert "markdown: Escape lang when echoing back custom non-pygments languages."
This reverts commit 564b199fe6, which
was part of #16308.

Escaping is either required or incorrect; it is never “defensive”.
This escaping is incorrect.  lxml already escapes attributes during
serialization (any other behavior would be a serious bug), and
additional escaping just results in double escaping.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-11-02 16:23:48 -08:00
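The double-escaping problem can be reproduced with the standard library. This sketch uses `xml.etree.ElementTree` as a stand-in for lxml (an assumption made so the example has no third-party dependency), and the attribute name and `lang` value are illustrative only.

```python
import html
import xml.etree.ElementTree as ET

lang = "a&b"  # hypothetical language name containing a special character


def serialize(value: str) -> str:
    """Build an element with one attribute and serialize it."""
    el = ET.Element("div")
    el.set("data-code-language", value)
    return ET.tostring(el, encoding="unicode")


# The serializer escapes the attribute exactly once.
once = serialize(lang)

# Escaping by hand first means the serializer escapes it again.
twice = serialize(html.escape(lang))

# Parsing the output back shows what a consumer would actually see.
seen_once = ET.fromstring(once).get("data-code-language")
seen_twice = ET.fromstring(twice).get("data-code-language")
```

`seen_once` round-trips to the original value, whereas `seen_twice` comes back as the literal text `a&amp;b`.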
akshatdalton 620e9cbf72 markdown: Fix merging of separate quotations.
Initially, two or more quotes separated by a blank line
were merged into a single quote. This created confusion,
especially in "quote and reply".

This commit fixes that. Two or more quotes separated by a
blank line are no longer merged.

This change is correct both for usability and for improving our
compatibility with CommonMark.

Fixes #14379.
2020-10-30 15:21:15 -07:00
Mateusz Mandera cbeeadab16 delete_realm: Register a post_delete Realm handler.
By registering a post_delete handler to clear appropriate caches in a
nicer way, we can get rid of the ugly flush-memcached call in the
delete_realm command.
2020-10-30 11:43:03 -07:00
Alex Vandiver bff503feb4 delete_realm: Add command to completely remove realms.
This will need some tweaking in upcoming commits.
2020-10-30 11:42:40 -07:00
Anders Kaseorg 3c663e48db url_encoding: Skip unnecessary encode before quote.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
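A short sketch of why the encode step is redundant: `urllib.parse.quote` accepts `str` and encodes it as UTF-8 by default. The sample string is made up, and this is not the code from the commit.

```python
from urllib.parse import quote

topic = "café & tea"  # hypothetical input

# Encoding to bytes first is accepted, but unnecessary.
with_encode = quote(topic.encode("utf-8"))

# quote() encodes str input as UTF-8 on its own.
without_encode = quote(topic)
```

Both calls produce the same percent-encoded string.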
Anders Kaseorg df10b306a6 python: Remove force_bytes.
We are generally good enough at types to know whether a value is str
or bytes.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg cc55393671 python: Open text files as text to skip decode operations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg 18d0e4664c python: Replace binascii with bytes.hex to skip some decode operations.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
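As an illustration of the kind of replacement described (the `digest` value is arbitrary and not taken from the codebase):

```python
import binascii

digest = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# binascii.hexlify returns bytes, so a decode step is needed to get str.
old_style = binascii.hexlify(digest).decode()

# bytes.hex returns str directly.
new_style = digest.hex()
```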
Anders Kaseorg aaa7b766d8 python: Use universal_newlines to get str from subprocess.
We can replace ‘universal_newlines’ with ‘text’ when we bump our
minimum Python version to 3.7.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
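A minimal sketch of the pattern, assuming a trivial child process that prints one word; the command is illustrative and not from the codebase.

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('hello')"]

# Default: output is bytes and must be decoded by the caller.
raw = subprocess.check_output(cmd)

# universal_newlines=True makes subprocess return str directly.
# On Python 3.7+, text=True is an alias for the same behavior.
text = subprocess.check_output(cmd, universal_newlines=True)
```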
Anders Kaseorg 9281dccae4 python: Serialize lxml elements directly to str.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg 7c4f68d9cf python: Skip unnecessary decode before BeautifulSoup parsing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg 86e8d81c7f python: Skip unnecessary decode before JSON parsing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Anders Kaseorg 1802a50cc9 python: Use requests.Response.text instead of decoding content.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-30 11:36:38 -07:00
Tim Abbott 067cd3a97a docs: Remove incorrect references to chat.zulip.org.
Most of these are Help Center links that should be pointing to the
production Help Center.
2020-10-29 16:46:40 -07:00
Tim Abbott 3b9c726fc6 outgoing_webhook: Avoid logging a bytes string.
This fixes the new assertLogs() tests failing in CI; we fixed the
weird use of bytes in the test, but not in the runtime code.
2020-10-29 15:55:11 -07:00
sahil839 7106069d4d migration: Add migration to remove default status of private streams.
This commit adds a migration that removes the default status of
existing private default streams, i.e. the private streams still exist
but are no longer default.
2020-10-29 15:47:34 -07:00
sahil839 b29d39195c streams: Do not allow default streams to be private.
We no longer allow a stream that is already a default stream
to be made private.
2020-10-29 15:47:32 -07:00
sahil839 557ca0802c streams: Do not allow private streams to be set as default.
We no longer allow a private stream to be set as default.
2020-10-29 15:43:37 -07:00
m-e-l-u-h-a-n cbfd6464a5 logging: replace mock.patch() for logging with assertLogs()
This commit replaces mock.patch with assertLogs().

* Adds return value to do_rest_call() in outgoing_webhook.py, to
  support asserting log output in test_outgoing_webhook_system.py.

* Logs are not asserted in test_realm.py because it would require users
  to be queried using users=User.objects.filter(realm=realm), and the
  order of the resulting queryset varies between runs.

* In test_decorators.py, replacement of mock.patch is not done because
  I'm not sure if it's worth the effort to replace it as it's a return
  value of a function.

Tweaked by tabbott to set proper mypy types.
2020-10-29 15:37:45 -07:00
Hemanth V. Alluri 99cf37dc51 drafts: Make the ID of the draft a part of the draft dict.
Then because the ID is now part of the draft dict, we can
(and do) change the structure of the "drafts" parameter
returned from `GET /drafts` from an object (mapping ID to
data) to an array.

Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
2020-10-29 11:06:04 -07:00
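The shape change can be sketched as follows. The field names and values inside each draft are hypothetical; only the move from an ID-keyed object to an array of dicts carrying their own `id` comes from the commit message.

```python
# Old shape: an object mapping draft ID to draft data.
old_drafts = {
    "1": {"content": "first draft", "timestamp": 1603990000},
    "2": {"content": "second draft", "timestamp": 1603990060},
}

# New shape: an array, with the ID carried inside each draft dict.
new_drafts = [
    {"id": int(draft_id), **draft}
    for draft_id, draft in old_drafts.items()
]
```

With the ID inside each dict, the array form loses no information relative to the mapping.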
Hemanth V. Alluri 8d59fd2f45 tests/drafts: Simplify create_and_check_drafts_for_success.
Sometimes we don't need to specify the expected_drafts field.
So by removing it, we can reduce the clutter a bit.

Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
2020-10-29 11:06:04 -07:00
Hemanth V. Alluri e60925b3e8 drafts: Change "timestamp" from float to integer.
Now the timestamp returned in a draft dict will always be an int.
The endpoints will still accept either an int or a float.

Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
2020-10-29 11:06:04 -07:00
Abhijeet Prasad Bodas e98a8856c7 logging: Add logging in deferred_work queue processor.
Adds logging statements in deferred_work queue consume.
2020-10-29 10:34:53 -07:00
m-e-l-u-h-a-n be7a70e742 logging: Remove unnecessary mock.patch() for logging.
Our test-backend validation confirms that we don't log anything to
stdout in the tests, so the fact that CI passes with this removed
shows there was nothing being logged.
2020-10-28 23:15:27 -07:00
Vishnu KS fdea49742c apps: Use GitHub API for generating the web app download link. 2020-10-28 23:04:14 -07:00
ryanreh99 dfa7ce5637 uploads: Support non-AWS S3-compatible server.
Boto3 does not allow setting the endpoint url from
the config file. Thus we create a Django setting
variable (`S3_ENDPOINT_URL`) which is passed to
service clients and resources of `boto3.Session`.

We also update the uploads-backend documentation
and remove the config environment variable as now
AWS supports the SIGv4 signature format by default.
And the region name is passed as a parameter instead
of creating a config file for just this value.

Fixes #16246.
2020-10-28 21:59:07 -07:00