Commit Graph

5150 Commits

Author SHA1 Message Date
Vishnu KS a2781e6364 emails: Set correct language for email in send_email_to_admins.
Previously the emails were translated to the default_language of
admin[0] in build_email function. Now we use realm.default_language
instead.
2020-04-16 19:31:08 -07:00
Puneeth Chaganti 4d2ce607c9 tools: Add script to trigger webhook notification using fixtures.
When creating a webhook integration or creating a new one, it is a pain to
create or update the screenshots in the documentation. This commit adds a
tool that can trigger a sample notification for the webhook using a fixture,
that is likely already written for the tests.

Currently, the developer needs to take a screenshot manually, but this could
be automated using puppeteer or something like that.

Also, the tool does not support webhooks with basic auth, and only supports
webhooks that use json fixtures. These can be fixed in subsequent commits.
2020-04-16 19:25:13 -07:00
Ryan Rehman 9340cd1a0b muting: Send muted_topic's date_muted field to frontend. 2020-04-15 15:48:25 -07:00
Mateusz Mandera fbc8325d0e test-backend: Remove rate_limiter from not_yet_fully_covered.
rate_limiter.py now has sufficient test coverage to remove from the
list of exclusions.

Tweaked by tabbott to handle @abstractmethod in a better way.
2020-04-15 11:20:37 -07:00
Mateusz Mandera 5f9da3053d rate_limiter: Handle edge case where rules list may be empty. 2020-04-15 11:20:37 -07:00
Hashir Sarwar b577366a05 rate_limiter: Add an in-process implementation for Tornado.
The Redis-based rate limiting approach takes a lot of time talking to
Redis with 3-4 network requests to Redis on each request.  It had a
negative impact on the performance of `get_events()` since this is our
single highest-traffic endpoint.

This commit introduces an in-process rate limiting alternate for
`/json/events` endpoint. The implementation uses Leaky Bucket
algorithm and Python dictionaries instead of Redis. This drops the
rate limiting time for `get_events()` from about 3000us to less than
100us (on my system).

Fixes #13913.

Co-Author-by: Mateusz Mandera <mateusz.mandera@protonmail.com>
Co-Author-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-15 11:20:37 -07:00
Mateusz Mandera 95fa8b2a26 rate_limiter: Fix too early return if no rules are passed in.
In the redis implementation, if rules was an empty list,
this would return too early - before checking if the key isn't
manually blocked.
2020-04-15 11:20:37 -07:00
Tim Abbott 0dd0227c8d send_email: Move custom email code to the bottom.
It's of interest to a relatively small subset of developers, in
comparison to Zulip's generic code for sending outgoing emails.
2020-04-14 10:57:20 -07:00
wowol 74b757c43c emails: Add support for email headers in send custom email function.
This makes it a bit more convenient to encode most of the email
configuration inside a single template file.
2020-04-14 10:50:29 -07:00
Wowol 0bf5ad3265 emails: Move send custom email function to library. 2020-04-12 16:11:44 -07:00
Abhishek-Balaji 68257e28ce emoji_name: Raise correct exception if emoji_name is missing.
Right now, the message is "Invalid characters in emoji name" when
the emoji_name is empty. Changing check_valid_emoji_name() in
zerver/lib/emoji.py which validates the name to accomodate the case
of missing name. The new message is "Emoji name is missing".
2020-04-12 11:52:45 -07:00
arpit551 d60efa1478 thumbor: Fix __file__ typo.
Replaced '__file__' typo with __file__ which used to add
wrong path to sys.path.
2020-04-12 11:23:03 -07:00
Mateusz Mandera 770086f983 url_preview: Discard url in oembed if server returns invalid json.
This fixes the scenario where we'd get errors in the
FetchLinksEmbedData queue processor if oembed got invalid json from the
URL.
2020-04-11 11:54:54 -07:00
Anders Kaseorg c734bbd95d python: Modernize legacy Python 2 syntax with pyupgrade.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:43:22 -07:00
Anders Kaseorg fff2d3958a timeout: Use Python 3 raise syntax.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-09 16:42:13 -07:00
Rohitt Vashishtha f9caf522f0 markdown: Allow setting a default language for code blocks.
This adds a new realm setting: default_code_block_language.

This PR also adds a new widget to specify a language, which
behaves somewhat differently from other widgets of the same
kind; instead of exposing methods to the whole module, we
just create a single IIFE that handles all the interactions
with the DOM for the widget.

We also move the code for remapping languages to format_code
function since we want to preserve the original language to
decide if we override it using default_code_clock_language.

Fixes #14404.
2020-04-09 16:02:02 -07:00
Rohitt Vashishtha 3f6541b306 bugdown: Add 'none' as alias for no syntax highlighting in codeblocks.
This is a precursor to #14404.
2020-04-09 16:02:02 -07:00
Abhinav 41fc7b2ae1 webhooks/semaphore: Add support for Semaphore 2.0 notifications.
Semaphore has currently has two different versions of their product -
Classic and 2.0. This commit adds support for Semaphore 2.0, along side
Semaphore Classic, using the same webhook. This would let the integration
work seamlessly for users who have already configured a Zulip integration in
their Semaphore 2.0 projects.

Semaphore 2.0 currently only supports GitHub and their payloads do not
contain URLs for common entities like commits, pull requests and tags. We
construct URLs for them using templates, but also try to support other
services by providing notifications without URLs.

Closes #14171

Co-authored-by: Puneeth Chaganti <punchagan@muse-amuse.in>
2020-04-09 12:41:40 -07:00
Mateusz Mandera 218be002f1 rate_limiter: Add more detailed automated tests.
Extracted by tabbott from the original commit to support testing
without the Tornado version merged yet.
2020-04-08 10:40:26 -07:00
Mateusz Mandera 46a02e70b0 rate_limiter: Fix inconsistency in an edge case in redis limiter.
If we had a rule like "max 3 requests in 2 seconds", there was an
inconsistency between is_ratelimited() and get_api_calls_left().
If you had:
request #1 at time 0
request #2 and #3 at some times < 2

Next request, if exactly at time 2, would not get ratelimited, but if
get_api_calls_left was called, it would return 0. This was due to
inconsistency on the boundary - the check in is_ratelimited was
exclusive, while get_api_calls_left uses zcount, which is inclusive.
2020-04-08 10:29:18 -07:00
Mateusz Mandera 4b567d8edd rate_limiter: Fix secs_to_freedom being set to a timestamp.
time_reset returned from api_calls_left() was a timestamp, but
mistakenly treated as delta seconds. We change the return value of
api_calls_left() to be delta seconds, to be consistent with the return
value of rate_limit().
2020-04-08 10:29:18 -07:00
Mateusz Mandera fc2b6c9c06 rate_limiter: Remove incorrect comment in RedisRateLimiterBackend. 2020-04-08 10:29:18 -07:00
Mateusz Mandera 0155193140 rate_limiter: Change type of the RateLimitResult.remaining to int.
This is cleaner than it being Optional[int], as the value of None for
this object has been synonymous to 0.
2020-04-08 10:29:18 -07:00
Mateusz Mandera e86cfbdbd7 rate_limiter: Store data in request._ratelimits_applied list.
The information used to be stored in a request._ratelimit dict, but
there's no need for that, and a list is a simpler structure, so this
allows us to simplify the plumbing somewhat.
2020-04-08 10:29:18 -07:00
Mateusz Mandera 9911c6a0f0 rate_limiter: Put secs_to_freedom as message when raising RateLimited.
That's the value that matters to the code that catches the exception,
and this change allows simplifying the plumbing somewhat, and gets rid
of the get_rate_limit_result_from_request function.
2020-04-08 10:29:18 -07:00
Wyatt Hoodes 13f86f35d9 zcommand: Add `/fluid-width` and `/fixed-width` slash commands. 2020-04-07 20:54:34 -07:00
Wyatt Hoodes 4d6755a807 zcommand.py: Clean up backend logic.
This commit contains a few clean ups:

* In order to scale better for adding multiple commands,
the message formatting and setting switch logic was
extracted to its own function.

* The command lists were removed, as the frontend parses
the slash command from the compose box, and only sends
a single command to the backend for any given command
alias typed.

* The `switch_command` logic was removed because, given
the aforementioned fact, the index of the command will
always be the same. Thus the switch command will always
be the same.

* Switched to using early returns as opposed to nested
conditionals.  Along with removing single use variable
declarations.
2020-04-07 20:54:34 -07:00
Tim Abbott 843345dfee message_edit: Add backend for moving a topic to another stream.
This commit reuses the existing infrastructure for moving a topic
within a stream to add support for moving topics from one stream to
another.

Split from the original full-feature commit so that we can merge just
the backend, which is finished, at this time.

This is a large part of #6427.

The feature is incomplete, in that we don't have real-time update of
the frontend to handle the event, documentation, etc., but this commit
is a good mergable checkpoint that we can do further work on top of.
We also still ideally would have a test_events test for the backend,
but I'm willing to leave that for follow-up work.

This appears to have switched to tabbott as the author during commit
squashing sometime ago, but this commit is certainly:

Co-Authored-By: Wbert Adrián Castro Vera <wbertc@gmail.com>
2020-04-07 14:19:19 -07:00
Tim Abbott 5aa6aeb303 typing: Stop using Collection[str].
Apparently, Collection is unavailable in Python 3.5, so we can't use
it.  Reverting to a Union for now.
2020-04-03 15:53:37 -07:00
Tim Abbott a745e533fe settings: Use cleaner validators for display settings.
This simplifies the update_display_settings endpoint to use REQ for
validation, rather than custom if/else statements.

The test changes just take advantage of the now more consistent
syntax.
2020-04-03 15:09:14 -07:00
Steve Howell 1ae07b93d8 presence: Simplify payload for webapp.
This changes the payload that is used
to populate `page_params` for the webapp,
as well as responses to the once-every-50-seconds
presence pings.

Now our dictionary of users only has these
two fields in the value:

    - activity_timestamp
    - idle_timestamp

Example data:

    {
        6: Object { idle_timestamp: 1585746028 },
        7: Object { active_timestamp: 1585745774 },
        8: Object { active_timestamp: 1585745578,
                    idle_timestamp: 1585745400}
    }

We only send the slimmer type of payload
to clients that have set `slim_presence`
to True.

Note that this commit does not change the format
of the event data, which still looks like this:

    {
        website: {
            client: 'website',
            pushable: false,
            status: 'active',
            timestamp: 1585745225
        }
    }
2020-04-03 11:44:56 -07:00
shubhamgupta2956 793c3f25e7 api_docs: Migrate POST /zulip-outgoing-webhook.
This commit migrates zulip outging webhook payload to
/zulip-outgoing-webhook:post in OpenAPI.

Since this migrates the last payloads from api/fixtures.json to
OpenAPI, this commit removes api/fixtures.json file and the functions
accessing the file.

Tweaked by tabbott to further remove an unnecessary conditional.
2020-04-02 14:55:32 -07:00
Tim Abbott 5eb5b6a5ad import: Make sure the internal realm is created before import.
This is critical for importing the very first realm into an empty
server, since in 27b15a9722, we changed
the model to create the internal realm when the first real realm would
be created, but neglected the data import code path.
2020-04-02 14:34:32 -07:00
Mateusz Mandera 5252b081bd queue_processors: Gather statistics on queue worker operations. 2020-04-01 16:44:06 -07:00
Steve Howell f6503a4061 validation: Use JsonableError for extractors.
The distinction between ValueError and TypeError
is not useful in these functions:

    - extract_stream_indicator
    - extract_private_recipients (or its callees)

These are always invoked in views to validate
user input.

When we use REQ to wrap the validators, any
Exception gets turned into a JsonableError, so
the distinction is not important.

And if we don't use REQ to wrap the validators,
the errors aren't caught.

Now we just let these functions directly produce
the desired end result for both codepaths.

Also, we now flag the error strings for translation.
2020-04-01 15:01:19 -07:00
Anders Kaseorg 2d45308546 CVE-2020-10935: Fix XSS vulnerability in local link rewriting.
Make sure rewrite_local_links_to_relative does not accidentally change
the meaning of links.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-01 14:01:45 -07:00
Anders Kaseorg 4f748fb627 markdown: Stop setting target="_blank".
This setting is being overridden by the frontend since the last
commit, and the security model is clearer and more robust if we don't
make it appear as though the markdown processor is handling this
issue.

Co-authored-by: Tim Abbott <tabbott@zulipchat.com>
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-04-01 14:01:45 -07:00
Tim Abbott e3a4aeeffa CVE-2020-9445: Remove unused and insecure modal_link feature.
Zulip's modal_link markdown feature has not been used since 2017; it
was a hack used for a 2013-era tutorial feature and was never used
outside that use case.

Unfortunately, it's sloppy implementation was exposed in the markdown
processor for all users, not just the tutorial use case.

More importantly, it was buggy, in that it did not validate the link
using the standard validation approach used by our other code
interacting with links.

The right solution is simply to remove it.
2020-04-01 14:01:45 -07:00
Udit107710 ef741bf317 messages: Return shallow copy of message object.
When more than one outgoing webhook is configured,
the message which is send to the webhook bot passes
through finalize_payload function multiple times,
which mutated the message dict in a way that many keys
were lost from the dict obj.

This commit fixes that problem by having
`finalize_payload` return a shallow copy of the
incoming dict, instead of mutating it.  We still
mutate dicts inside of `post_process_dicts`, though,
for performance reasons.

This was slightly modified by @showell to fix the
`test_both_codepaths` test that was added concurrently
to this work.  (I used a slightly verbose style in the
tests to emphasize the transformation from `wide_dict`
to `narrow_dict`.)

I also removed a deepcopy call inside
`get_client_payload`, since we now no longer mutate
in `finalize_payload`.

Finally, I added some comments here and there.

For testing, I mostly protect against the root
cause of the bug happening again, by adding a line
to make sure that `sender_realm_id` does not get
wiped out from the "wide" dictionary.

A better test would exercise the actual code that
exposed the bug here by sending a message to a bot
with two or more services attached to it.  I will
do that in a future commit.

Fixes #14384
2020-03-29 15:12:27 -07:00
Steve Howell e29ddd0ce0 outgoing_webhook: Remove `event` from process_success.
The `event` parameter is never used by `process_success`,
and eliminating it allows us to greatly simplify tests
that are just confusingly passing in events that are
totally ignored.
2020-03-29 15:12:27 -07:00
Steve Howell bacfadbc61 minor: Use explicit params in build_bot_request.
I also tweaked the block comment to mention
gravatars.
2020-03-29 15:12:27 -07:00
Stefan Weil d2fa058cc1
text: Fix some typos (most of them found and fixed by codespell).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2020-03-27 17:25:56 -07:00
arpit551 8f7733cb20 emails: Added placeholders strings in FormAddress.
We've had a bug for a while that if any ScheduledEmail objects get
created with the wrong email sender address, even after the sysadmin
corrects the problem, they'll still get errors because of the objects
stored with the wrong format.

We solve this by using FromAddress placeholders strings in
send_future_email function, so that ScheduledEmail objects end up
setting the final `from_address` value when mail is actually sent
using the setting in effect at that time.

Fixes #11008.
2020-03-27 16:41:02 -07:00
Steve Howell c2b3269420 message perf: Streamline stream name lookups.
When we are fetching messages, we need to hydrate
stream names into the messages for legacy reasons.

(Ideally, we could skip this step for the webapp
and modern mobile clients, since they really only
need stream_ids, but we're not there yet.)

We keep a recipient cache that maps recipient ids
to stream names.

When we populate that cache, we now use `values(...)`
to avoid fat objects and extra DB work.

Note that we are already using a similar technique
for hydrating PM/huddle recipients.
2020-03-27 17:20:34 +00:00
Tim Abbott 06c97b5be2 api docs: Render example responses as with JSON codehilite.
This makes the example responses a lot prettier visually.
2020-03-27 00:03:36 -07:00
Tim Abbott 820f0e275e api docs: Redesign visuals for documenting arguments.
The previous system for documenting arguments was very ugly if any of
the examples or descriptions were wrong.  After thinking about this
for a while, I concluded the core problem was that a table was the
wrong design element to use for API parameters, and we'd be much
better off with individual card-type widgets instead.

This rewrites the API arguments documentation implementation to use a
basic sort of card-like system with some basic styling; I think the
result is a lot more readable, and it's a lot more clear how we would
add additional OpenAPI details (like parameter types) to the
documentation.
2020-03-27 00:03:36 -07:00
Anders Kaseorg 7ff9b22500 docs: Convert many http URLs to https.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-26 21:35:32 -07:00
Graham Bleaney fd5ee9a831 bots: Decouple user input from imported module.
This commit modifies 'zerver/lib/bot_lib.py' to decouple the
user-controllable 'service_name' parameter from the value that is
passed in to 'import_module'. This is done as a precautionary
hardening.
2020-03-25 16:39:17 -07:00
Graham Bleaney 2fe9d85a5f redirects: Refactor redirect code to use central helper function.
This commit introduces two new functions in 'url_encoding.py' which
centralize two common patterns for constructing redirect URLs. It
also migrates the files using those patterns to use the new
functions.
2020-03-25 16:39:17 -07:00
Graham Bleaney 5dca599481 export: Harden s3 export against directory traversal.
This commit modifies 'zerver/lib/export.py' to raise an exception
in the presence of a suspected attempt at directory traversal.
2020-03-25 16:39:17 -07:00
Emilio López d3c841d587 email_mirror: also check for Envelope-To
After subscribing a stream email address to a Mailman email list
and receiving a message from it (using the polling configuration
with an Exim + Dovecot mailserver), the following error message
is emitted by Zulip:

    Logger zerver.lib.email_mirror, from module zerver.lib.email_mirror line 77:
    Error generated by Anonymous user (not logged in) on zulip deployment

    Sender: "Foo Bar" <foo@example.com>
    To: No recipient found
    Missing recipient in mirror email

This is because the To: header on the received email corresponds
to the email list, and there are no other headers to indicate the
final recipient, apart from the "Envelope-To" header added by
Exim. To resolve this problem, the commit adds "Envelope-To" to
the list of headers to check for a match.
2020-03-25 16:28:46 -07:00
Vishnu KS f8ddab58ba billing: Downgrade plan to Limited during realm deactivation.
The realm would be instantly downgraded to Limited plan when
deactivated. Any extra users that were added in the final month
would not be charged.
2020-03-25 10:54:10 -07:00
Tim Abbott 85c9ffd91c message: Validate propagate_mode parameters.
This improves the error handling for invalid values of the
propagate_mode parameter to our message editing endpoints.
Previously, invalid values would just work like change_one rather than
doing nothing.
2020-03-24 12:36:45 -07:00
Anders Kaseorg 39f9abeb3f python: Convert json.loads(f.read()) to json.load(f).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-03-24 10:46:32 -07:00
Tim Abbott 180d8abed6 messages: Fix unlikely exception when trying to delete a message. 2020-03-22 21:35:27 -07:00
Tim Abbott 481d351cee events: Fix buggy apply_events handling of starred_messages.
The previous starred_messages race handling did not correctly consider
the possibility that an event queue might have been registered without
starred_messages.
2020-03-22 21:30:23 -07:00
Mateusz Mandera 27c19b081b rate_limit: Remove inaccurate docstring on clear_history methods. 2020-03-22 18:42:35 -07:00
Mateusz Mandera b9e5103d0c rate_limit: Refactor RateLimiterBackend to operate on keys and rules.
Instead of operating on RateLimitedObjects, and making the classes
depend on each too strongly. This also allows getting rid of get_keys()
function from RateLimitedObject, which was a redis rate limiter
implementation detail. RateLimitedObject should only define their own
key() function and the logic forming various necessary redis keys from
them should be in RedisRateLimiterBackend.
2020-03-22 18:42:35 -07:00
Mateusz Mandera 8069133f88 rate_limit: Remove __str__ methods of RateLimitedObjects.
These were clunky from the start and are no longer used, as keys are now
used directly for logging purposes.
2020-03-22 18:42:35 -07:00
Mateusz Mandera 4e9f77a6c4 rate_limit: Adjust keys() of some RateLimitedObjects.
type().__name__ is sufficient, and much readable than type(), so it's
better to use the former for keys.
We also make the classes consistent in forming the keys in the format
type(self).__name__:identifier and adjust logger.warning and statsd to
take advantage of that and simply log the key().
2020-03-22 18:42:35 -07:00
Mateusz Mandera 2c6b1fd575 rate_limit: Rename key_fragment() method to key(). 2020-03-22 18:42:35 -07:00
Mateusz Mandera 9c9f8100e7 rate_limit: Add the concept of RateLimiterBackend.
This will allow easily swapping and using various implementations of
rate-limiting, and separate the implementation logic from
RateLimitedObjects.
2020-03-22 18:42:35 -07:00
Mateusz Mandera 85df6201f6 rate_limit: Move functions called by external code to RateLimitedObject. 2020-03-22 18:42:35 -07:00
Mateusz Mandera 3b5b19fde8 tornado: Log shard id in all logs coming from tornado processes.
This will make it easier to investigate using logs which requests are
being processed by which Tornado process.
2020-03-22 18:26:35 -07:00
Steve Howell 8c1244d0b4 tests: Kill off find_one() helper.
This was only recently added.  Using tuple
assignment raises the same errors, so the
indirection probably isn't worth it.
2020-03-20 13:40:20 -07:00
Steve Howell ef772ee12f bot events: Prevent duplicate add-bot notifications.
We don't need `do_create_user` to send a partial
event here for bots.  The only caller to `do_create_user`
that actually creates bots (apart from some tests that
just need data setup) is `add_bot_backend`, which
sends the more complete event including bot "extras"
like service info.

The modified event tests show the simplification
here (2 events instead of 3).

Also, the bot tests now use tuple unpacking, which
will force a ValueError if we duplicate events
again.
2020-03-20 13:40:19 -07:00
Steve Howell f647587675 bulk_create: Handle realms that hide delivery emails. 2020-03-19 16:04:05 -07:00
Steve Howell ecbbc3e365 performance: Simplify bulk_create_users().
We were going back to the database to get all
the users in the realm, when we had them right
there already.  I believe this is a legacy
of us running on a very old version of Django
(back in early days), where `bulk_create`
didn't give you back ids in a nice way.

In the interim we added the `RealmAuditLog`
code, which does take advantage of the
existing profiles (and proves we can rely
on them).

But meanwhile we were still
doing a query to get all N users in the
realm.  With `selected_related`!

To be fair, bulk_create_users() is by
its very nature a pretty infrequent
operation.  This change is more motivated
by code cleanup.

Now we just loop through user_ids for
the Recipient/Subscriber foreign key rows.

I also removed some fairly convoluted code mapping
emails to user_ids and just work in user_id
space.
2020-03-19 16:04:05 -07:00
Steve Howell 1306239c16 tests: Use email/delivery_email more explicitly.
We try to use the correct variation of `email`
or `delivery_email`, even though in some
databases they are the same.

(To find the differences, I temporarily hacked
populate_db to use different values for email
and delivery_email, and reduced email visibility
in the zulip realm to admins only.)

In places where we want the "normal" realm
behavior of showing emails (and having `email`
be the same as `delivery_email`), we use
the new `reset_emails_in_zulip_realm` helper.

A couple random things:

    - I fixed any error messages that were leaking
      the wrong email

    - a test that claimed to rely on the order
      of emails no longer does (we sort user_ids
      instead)

    - we now use user_ids in some place where we used
      to use emails

    - for IRC mirrors I just punted and used
      `reset_emails_in_zulip_realm` in most places

    - for MIT-related tests, I didn't fix email
      vs. delivery_email unless it was obvious

I also explicitly reset the realm to a "normal"
realm for a couple tests that I frankly just didn't
have the energy to debug.  (Also, we do want some
coverage on the normal case, even though it is
"easier" for tests to pass if you mix up `email`
and `delivery_email`.)

In particular, I just reset data for the analytics
and corporate tests.
2020-03-19 16:04:03 -07:00
Steve Howell 42ee2f5e86 tests: Fix test coverage on recent commit.
I guess `test_classes` has 100% line coverage
enforcement, which is a bit tricky for error
handling.

This fixes that, as well as making the name
snake_case and improving the format of the
errors.
2020-03-19 11:37:31 -04:00
Steve Howell 80acbb9fdf Clean up `test_get_all_profiles_avatar_urls`.
This test was using the anti-pattern of doing an
assertion inside a conditional.

I added the `findOne` helper to make it easier
to write robust tests for scenarios like this.
2020-03-19 10:34:35 -04:00
Steve Howell ca74cd6e37 bug fix: Fix unread counts for certain API messages.
If I send a message from a normal Zulip client, it is
considered to be "read" by me.  But if I send it via
an API program (using my human account), the message
is not immediately "read" by me.

Now we handle this correctly in `get_raw_unread_data`.

The symptom of this was that these messages would get
"stuck" in "Private Messages" narrows until the next
time you reloaded your app.
2020-03-17 16:26:42 -07:00
Mateusz Mandera 5e47f2975e actions: Optimize query in get_occupied_streams.
Using an Exists subquery to avoid scanning the entire Subscription
table seems to speed things up greatly.
Set up with:
 ./manage.py populate_db --extra_users 2000 --extra-streams 1000

Tested on my computer, the original function was taking ~1.2seconds,
the optimized version only ~0.05-0.06.

Likely fixes #13874; we can re-open if after production testing we
feel more work is warranted.
2020-03-17 05:44:05 -07:00
Mateusz Mandera 884ff425da cache: Remove dead code for caching recipients.
With recipient column denormalized into all three of Stream, UserProfile
and Huddle, there is no more use for this caching.
2020-03-17 05:41:11 -07:00
Mateusz Mandera b4ce167a88 models: Add recipient foreign key to Huddle.
This follows the already tested approach from
8acfa17fe6.
2020-03-17 05:41:11 -07:00
Steve Howell fcc5ae5247 invites: Fix regression w/email vs. delivery_email.
In 220c2a5ff3 I
introduced a query to find invites by delivery_email
but was still using email as the key.

For most realms `email` and `delivery_email` are
synonymous, so this temporary bug would not affect
them.  For realms that restrict emails, the invite
would have probably failed for other reasons, but
the symptom would have been less clear.
2020-03-12 10:13:08 -04:00
Steve Howell 1b16693526 tests: Limit email-based logins.
We now have this API...

If you really just need to log in
and not do anything with the actual
user:

    self.login('hamlet')

If you're gonna use the user in the
rest of the test:

    hamlet = self.example_user('hamlet')
    self.login_user(hamlet)

If you are specifically testing
email/password logins (used only in 4 places):

    self.login_by_email(email, password)

And for failures uses this (used twice):

    self.assert_login_failure(email)
2020-03-11 17:10:22 -07:00
Steve Howell c235333041 test performance: Pass in users to api_* helpers.
This reduces query counts in some cases, since
we no longer need to look up the user again. In
particular, it reduces some noise when we
count queries for O(N)-related tests.

The query count is usually reduced by 2 per
API call.  We no longer need to look up Realm
and UserProfile.  In most cases we are saving
these lookups for the whole tests, since we
usually already have the `user` objects for
other reasons.  In a few places we are simply
moving where that query happens within the
test.

In some places I shorten names like `test_user`
or `user_profile` to just be `user`.
2020-03-11 14:18:29 -07:00
Steve Howell 626ad0078d tests: Add uuid_get and uuid_post.
We want a clean codepath for the vast majority
of cases of using api_get/api_post, which now
uses email and which we'll soon convert to
accepting `user` as a parameter.

These apis that take two different types of
values for the same parameter make sweeps
like this kinda painful, and they're pretty
easy to avoid by extracting helpers to do
the actual common tasks.  So, for example,
here I still keep a common method to
actually encode the credentials (since
the whole encode/decode business is an
annoying detail that you don't want to fix
in two places):

    def encode_credentials(self, identifier: str, api_key: str) -> str:
        """
        identifier: Can be an email or a remote server uuid.
        """
        credentials = "%s:%s" % (identifier, api_key)
        return 'Basic ' + base64.b64encode(credentials.encode('utf-8')).decode('utf-8')

But then the rest of the code has two separate
codepaths.

And for the uuid functions, we no longer have
crufty references to realm.  (In fairness, realm
will also go away when we introduce users.)

For the `is_remote_server` helper, I just inlined
it, since it's now only needed in one place, and the
name didn't make total sense anyway, plus it wasn't
a super robust check.  In context, it's easier
just to use a comment now to say what we're doing:

    # If `role` doesn't look like an email, it might be a uuid.
    if settings.ZILENCER_ENABLED and role is not None and '@' not in role:
        # do stuff
2020-03-11 14:18:29 -07:00
Steve Howell 00dc976379 tests: Use users for common_subscribe_to_streams.
We also use users for get_streams().
2020-03-11 14:18:29 -07:00
Mateusz Mandera 89394fc1eb middleware: Use request.user for logging when possible.
Instead of trying to set the _requestor_for_logs attribute in all the
relevant places, we try to use request.user when possible (that will be
when it's a UserProfile or RemoteZulipServer as of now). In other
places, we set _requestor_for_logs to avoid manually editing the
request.user attribute, as it should mostly be left for Django to manage
it.
In places where we remove the "request._requestor_for_logs = ..." line,
it is clearly implied by the previous code (or the current surrounding
code) that request.user is of the correct type.
2020-03-09 13:54:58 -07:00
Mateusz Mandera 0255ca9b6a middleware: Log user.id/realm.string_id instead of _email. 2020-03-09 13:54:58 -07:00
Tim Abbott 5835023021 tests: Use user IDs internally in send message helpers.
This uses the better, modern, user ID based API for sending messages
internally in the test suite, something that's convenient to do as a
follow-up to the migration to pass UserProfile objects to these
functions.
2020-03-07 18:31:13 -08:00
Steve Howell 5e2a32c936 tests: Use users in send_*_message.
This commit mostly makes our tests less
noisy, since emails are no longer an important
detail of sending messages (they're not even
really used in the API).

It also sets us up to have more scrutiny
on delivery_email/email in the future
for things that actually matter.  (This is
a prep commit for something along those
lines, kind of hard to explain the full
plan.)
2020-03-07 18:30:13 -08:00
Vishnu KS 1c6435d4cc validator: Optionally record a type_structure attribute.
We plan to use these records to check and record the schema of Zulip's
events for the purposes of API documentation.

Based on an original messier commit by tabbott.

In theory, a nicer version of this would be able to work directly off
the mypy type system, but this will be good enough for our use case.
2020-03-06 17:07:14 -08:00
Tim Abbott 9230213bde settings: Add EMAIL_ADDRESS_VISIBILITY_NOBODY.
This extends our email address visibility settings to deny access to
user email addresses even to organization administrators.

At the moment, they can of course change the setting (which leaves an
audit trail), but in the future only organization owners will be able
to change that setting.

While we're at this, we rewrite the settings_data.js test to cover all
the cases in a more consistent way.

Fixes #14111.
2020-03-06 16:34:08 -08:00
Tim Abbott 914cda9e2d test_classes: Fix api credentials with email_address_visibility setting.
This isn't the only bug in our testing libraries with
EMAIL_ADDRESS_VISIBILITY; but we don't have a lot of tests that need
to deal with that set of settings.
2020-03-06 16:33:16 -08:00
Steve Howell f2b8eef21a refactor: Avoid hacky use of ValidationError.code.
We were using `code` to pass around messages.

The `code` field is designed to be a code, not
a human-readable message.

It's possible that we don't actually need two
flavors of messages for these type of validations,
but I didn't want to change that yet.

We **definitely** don't need to put two types of
message in the exception, so I fix that.  Instead,
I just have the caller ask what level of detail
it needs.

I added a non-verbose message for the case of
system bots.

I removed the non-translated version of the message
for deactivated accounts, which didn't have test
coverage and is slightly more prone to leaking
email info that we don't want to leak.
2020-03-06 11:53:22 -08:00
Steve Howell 62fb3ad801 refactor: Move validate_email_not_already_in_realm.
We move this to email_validation.py.
2020-03-06 11:53:22 -08:00
Steve Howell 7e55cab429 invite performance: Reduce queries to find existing users.
In the prep commits leading up to this, we split
out two new helpers:

    validate_email_is_valid
    get_errors_for_new_emails

Now when we validate invites we use two separate
loops to filter our emails.

Note that the two extracted functions map to two
of the data structures that used to be handled
in a single loop, and now we break them out:

    errors = validate_email_is_valid
    skipped = get_errors_for_new_emails

The first loop checks that emails are even valid
to begin with.

The second loop finds out whether emails are already
in use.

The second loop takes advantage of this helper:

    get_errors_for_new_emails

The second helper can query all potential new emails
with a single round trip to the database.

This reduces our query count.
2020-03-06 11:53:22 -08:00
Steve Howell 220c2a5ff3 performance: Add get_users_by_delivery_email().
The main purpose of this new function is to allow
us to validate emails in bulk, which we don't do
yet (still setting the stage for that).

This is still a speedup, though, since in our
caller we grab only three fields now.

And other than that, we're essentially doing
the same query for the single-email case, just
outside the loop.
2020-03-06 11:53:22 -08:00
Steve Howell b35ffde5fb tests: Avoid calling actions.validate_email().
We are trying to kill off `validate_email`, so
we no longer call it from these tests.

These tests are already kind of low-level in
nature, so testing the more specific helpers
here should be fine.

Note that we also make the third parameter
to `validate_email` non-optional in this commit,
to preserve 100% coverage.  This is really just
refactoring noise--we will soon eliminate the
entire function, but I didn't want to do everything
in a huge commit.
2020-03-06 11:53:22 -08:00
Steve Howell 6f62c993a6 refactor: Extract get_existing_user_errors.
This is a prep commit that will allow us
to more efficiently validate a bunch of
emails in the invite UI.

This commit does not yet change any
behavior or performance.

A secondary goal of this commit is to
prepare us to eliminate some hackiness
related to how we construct
`ValidationError` exceptions.

It preserves some quirks of the prior
implementation:

   - the strings we decided to translate
     here appear haphazard (and often
     get ignored anyway)

   - we use `msg` in most codepaths,
     but use `code` for invites

Right now we never actually call this with
more than one email, but that will change
soon.

Note that part of the rationale for the inner
method here is to avoid a test coverage bug
with `continue` in loops.
2020-03-06 11:53:22 -08:00
Steve Howell 689aca9140 refactor: Extract validate_email_is_valid().
This has two goals:

    - sets up a future commit to bulk-validate
      emails

    - the extracted function is more simple,
      since it just has errors, and no codes
      or deactivated flags

This commit leaves us in a somewhat funny
intermediate state where we have
`action.validate_email` being a glorified
two-line function with strange parameters,
but subsequent commits will clean this up:

    - we will eliminate validate_email
    - we will move most of the guts of its
      other callee to lib/email_validation.py

To be clear, the code is correct here, just
kinda in an ugly, temporarily-disorganized
intermediate state.
2020-03-06 11:53:22 -08:00
Steve Howell 4f5b07a7e6 refactor: Extract zerver/lib/email_validation.py. 2020-03-06 11:53:22 -08:00
Steve Howell 30b43605c3 invite performance: Reduce RealmDomain queries.
We now use the `get_realm_email_validator()`
helper to build an email validator outside
the loop of emails in our invite list.

This allows us to perform RealmDomain queries
only once per request, instead of once per
email.
2020-03-06 11:53:22 -08:00
Steve Howell 57f1aa722c refactor: Rename validate_email_for_realm.
Now called:

    validate_email_not_already_in_realm

We have a separate validation function that
makes sure that the email fits into a realm's
domain scheme, and we want to avoid naming
confusion here.
2020-03-06 11:53:22 -08:00
Steve Howell c43a29ff54 invites: Fix bug with inviting cross realm bots.
Without the fix here, you will get an exception
similar to below if you try to invite one of the
cross realm bots.  (The actual exception is
a bit different due to some rebasing on my branch.)

	  File "/home/zulipdev/zulip/zerver/lib/request.py", line 368, in _wrapped_view_func
		return view_func(request, *args, **kwargs)
	  File "/home/zulipdev/zulip/zerver/views/invite.py", line 49, in invite_users_backend
		do_invite_users(user_profile, invitee_emails, streams, invite_as)
	  File "/home/zulipdev/zulip/zerver/lib/actions.py", line 5153, in do_invite_users
		email_error, email_skipped, deactivated = validate_email(user_profile, email)
	  File "/home/zulipdev/zulip/zerver/lib/actions.py", line 5069, in validate_email
		return None, (error.code), (error.params['deactivated'])
	TypeError: 'NoneType' object is not subscriptable

Obviously, you shouldn't try to invite a cross
realm bot to your realm, but we want a reasonable
error message.

RESOLUTION:

Populate the `code` parameter for `ValidationError`.

BACKGROUND:

Most callers to `validate_email_for_realm` simply catch
the `ValidationError` and then report a more generic error.

That's also what `do_invite_users` does, but it has the
somewhat convoluted codepath through `validate_email`
that triggers this code:

    try:
        validate_email_for_realm(user_profile.realm, email)
    except ValidationError as error:
        return None, (error.code), (error.params['deactivated'])

The way that we're using the `code` parameter for
`ValidationError` feels hacky to me.  The intention
behind `code` is to provide a descriptive error to
calling code, and it's not intended for humans, and
it feels strange that we actually translate this in
other places.  Here are the Django docs:

    https://docs.djangoproject.com/en/3.0/ref/forms/validation/

And then here's an example of us actually translating
a code (not part of this commit, just providing context):

    raise ValidationError(_('%s already has an account') %
                          (email,), code = _("Already has an account."),
                          params={'deactivated': False})

Those codes eventually get put into InvitationError, which
inherits from JsonableError, and we do actually display
these errors in the webapp:

    if skipped and len(skipped) == len(invitee_emails):
        # All e-mails were skipped, so we didn't actually invite anyone.
        raise InvitationError(_("We weren't able to invite anyone."),
                              skipped, sent_invitations=False)

I will try to untangle this somewhat in upcoming commits.
2020-03-06 11:53:22 -08:00
Rohitt Vashishtha 2fab45e530 bugdown: Use AtomicString in UserMentionPattern.
This fixes the user-mention counterpart of #14080.
2020-03-06 11:35:56 -08:00
Rohitt Vashishtha 7f9d8e1907 bugdown: Use AtomicString in UserGroupMentionPattern.
This fixes the user-group counterpart of #14080.
2020-03-06 11:35:56 -08:00
Mateusz Mandera 3922fb3a92 events: Clean up delete_message even processing code. 2020-03-03 15:52:42 -08:00
Rohitt Vashishtha ff5e2b6eb7 bugdown: Avoid hanging list paragraphs being processed as codeblocks.
Previously, the input:

====================
- One
  - Two

    Two continued
====================

Would produce the same output as:

====================
- One
  - Two

```
Two continued
```
====================

This was because our CodeBlockProcessor had a higher priority than
the ListIndentProcessor. This issue was discussed here:
https://chat.zulip.org/#narrow/stream/9-issues/topic/continuation.20paragraphs.20in.20list.20items.
2020-03-03 12:08:19 -08:00
Rohitt Vashishtha cd7396e732 bugdown: Update outdated comment about Zulip's heading support. 2020-03-03 11:54:18 -08:00
Rohitt Vashishtha 62a7e464fb bugdown: Use AtomicString in StreamPattern.
This fixes the stream counterpart of #14080.
2020-03-02 00:03:33 -08:00
Rohitt Vashishtha 245de9e1e2 bugdown: Use AtomicString in StreamTopicPattern.
Fixes #14080.
2020-03-02 00:03:33 -08:00
Mateusz Mandera 05e7214690 do_delete_messages: Handle empty set of messages passed as input.
/delete_topic endpoint could be used to request the deletion of a topic,
that would cause do_delete_messages to be called with an empty set in
these cases:
1. Requesting deletion of an empty stream.
2. Requesting deletion of a topic in a private stream with history not
   public to subscribers, if the requesting admin doesn't have access to
   any of the messages in that topic.
2020-03-02 00:01:35 -08:00
Steve Howell 94192395fb perf: Extract Stream.get_client_data.
This function slims down the data that we get
from the database in order to create the
streams part of our client payload.

We also fix a typo.

We also clearly distinguish between queries
and lists here.
2020-03-01 22:38:03 -08:00
Steve Howell 49b8218463 perf: Extract get_subscribed_stream_ids_for_user.
This new method prevents us from getting fat
objects from the database.

Instead, now we just get ids from the database
to build our subqueries.

Note that we could also technically eliminate
the `set(...)` wrappers in this code to have
Django make a subquery and save a round trip.
I am postponing that for another commit (since
it's still somewhat coupled to some other
complexity in `do_get_streams` that I am trying
to cut through, plus it's not the main point
of this commit.)

BEFORE:

    # old, still in use for other codepaths
    def get_stream_subscriptions_for_user(user_profile: UserProfile) -> QuerySet:
        # TODO: Change return type to QuerySet[Subscription]
        return Subscription.objects.filter(
            user_profile=user_profile,
            recipient__type=Recipient.STREAM,
        )

    user_subs = get_stream_subscriptions_for_user(user_profile).filter(
        active=True,
    ).select_related('recipient')
    recipient_check = Q(id__in=[sub.recipient.type_id for sub in user_subs])

AFTER:

    # newly added
    def get_subscribed_stream_ids_for_user(user_profile: UserProfile) -> QuerySet:
        return Subscription.objects.filter(
            user_profile_id=user_profile,
            recipient__type=Recipient.STREAM,
            active=True,
        ).values_list('recipient__type_id', flat=True)

    subscribed_stream_ids = get_subscribed_stream_ids_for_user(user_profile)
    recipient_check = Q(id__in=set(subscribed_stream_ids))
2020-03-01 22:38:03 -08:00
Steve Howell eb368c9c92 performance: Optimize max_message_id calculation.
We calculate `max_message_id` for the mobile client.

Our query now no longer joins to the Message table
and just grabs one value instead of fat objects.
2020-03-01 22:38:03 -08:00
Chris Bobbe 23ba2b63c5 push_notifications: In dev, make APNs or GCM config suffice. 2020-02-28 16:49:35 -08:00
Steve Howell 504ec9d489 typing: Remove recipient-related complexity.
For historical reasons we were creating Recipient
objects at some point in the typing-notifications
codepath.  Now we just work with UserProfiles.
This removes some queries, as indicated by
the change to `len(queries)` in a couple of the
tests.

The one subtle thing that changes here is huddles.
If user 10 sends a typing notification that they
are talking to users 20 and 30, there might not
actually be a huddle for users 10/20/30, but
we were actually creating huddles on the fly!
There is no need to create huddles just for
typing notifications, since we don't even
share huddle ids with our clients.  The clients
just infer the huddles.

Some of the code that gets killed off here as
somewhat "collateral damage" is some
defensive code related to formerly supporting streams
in typing indicators.  The support for streams
was killed off almost as soon as we released
the feature, and the codepath is pretty clearly
user-centric at this point.
2020-02-28 12:46:20 -08:00
Steve Howell f224f215c1 refactor: Simplify handling of emails for typing endpoint.
Instead of duplicating code for the email case, just
convert emails to user_ids and then run the same code.
2020-02-28 12:39:36 -08:00
Steve Howell bed6d5a789 typing: Inline check_typing_notification.
I actually like this pattern:

    def check_send_typing_notification(...):
        typing_notification = check_typing_notification(...)
        do_send_typing_notification(...)

It can help divide responsibilities nicely and make it easy
to write detailed unit tests against each of the two helpers.

Unfortunately, the good things didn't really happen here, and
instead we got the worst aspects of the pattern:

    - The responsibilities for validation leaked into
      the second function.

    - Both functions were doing sane things individually
      that became not-so-sane in the big picture (namely,
      we ended up making Recipient objects for no reason,
      but if you read each of the helpers, it was just one
      step that seemed reasonable).

    - Passing around dictionaries for results can be annoying.

Also, the pattern made a lot more sense when the validation
for typing was a lot more complicated.  My prior commit makes
it so that we only ever deal with a list of user_ids.

Anyway, now I'm inlining it. :)

Subsequent commits will clean up the more substantive issue
here, which is that we are building Recipients for no reason.
2020-02-28 12:39:36 -08:00
Mateusz Mandera 7db3d4560f do_delete_messages: Archive the messages in bulk.
The test added in this commit shows 37 queries - compared to 181 without
the change to the function. That seems very much worth it.
2020-02-27 23:12:32 -08:00
Mateusz Mandera b4186fb680 do_delete_messages: Remove unused message_ids list. 2020-02-27 23:12:32 -08:00
Wyatt Hoodes 6ed944c761 test_runner: Update database ids to be human readable.
Before the Django 2.x upgrade, the DatabaseCreation
argument took an integer value.  To deal with running
mulitple test instances, we created a random start
range that could count up 100 workers until the next
random id.  Arbitrarily limiting the number of workers
to 100.

Post upgrade, we can now use string values. Enabling
the database + worker numbers to be more readable, as
well as removing the cap on the worker count.
2020-02-27 23:01:29 -08:00
Tim Abbott 2fb967b735 do_update_message: Remove sender field from update_message events.
This field wasn't accessed by any clients and was a less robust
version of the user_id field.  Any client hoping to be interested in
who did message edits should be able to handle working with user IDs
rather than email addresses.
2020-02-26 16:16:01 -08:00
Tim Abbott 588bcb37cf do_update_message: Avoid using a direct query to fetch a Stream.
We have a helper designed for the purpose, and it fixes potentially
misbehavior where the previous code did not do `.select_related()`.
2020-02-26 16:14:34 -08:00
Tim Abbott 49ca7cf717 topic: Add recipient_id to fields for message edit saves.
This is preparation for supporting moving messages between streams in
some cases.

It doesn't actually have any functional effect, since flush_message
clears the message unconditionally anyway.
2020-02-26 16:12:07 -08:00
Steve Howell 995353fb28 message validation: Clean up extract_private_recipients.
This is mostly refactoring, but we also prevent a new
type of value error (list of non-int-or-string).  The
new test code helps enforce that.

Cleanup includes:

    - Use early-exit for email case.
    - Rename helpers to get_validate_*.
    - Avoid clumsy rebuilding of lists in helpers.
    - Avoid the confusing `recipient` name (which
      can be confused with the model by the same
      name).
    - Just delegate duplicate-id/email-removal to
      the helpers.

The cleaner structure allows us to elminate a couple
mypy workarounds.
2020-02-25 16:17:47 -08:00
Vishnu KS 303cd9bb9e actions: Make do_change_plan_type support changing plan to SELF_HOSTED.
Credits to @xpac1985 for reporting, debugging and proposing fix to the
issue. The proposed fix was modified slightly by @hackerkid to set the
correct value for max_invites and upload_quota_gb. Tests added by
@hackerkid.

Fixes #13974
2020-02-25 16:14:45 -08:00
Tim Abbott 27edc18330 test_classes: Use realistic web and mobile User-Agent strings.
This fixes a confusing aspect of how our automated tests worked
previously, where we'd almost all HTTP requests in the unlikely
configuration with no User-Agent string specified.

We need to adjust query counts in a few tests that now are a bit
cheaper because they now can take advantage of a Client object created
in server_initialization.py in `process_client`.
2020-02-24 23:19:43 -08:00
Tim Abbott 27b267026e test_classes: Rename set_http_host to set_http_headers.
This supports the goal of setting other headers like User-Agent in the
future.
2020-02-24 23:19:43 -08:00
Tim Abbott d80175d29e server_initialization: Create Client objects for mobile/desktop.
This replaces the "API" client, which isn't used by any real clients,
with the "ZulipMobile" and "ZulipElectron" client strings, which are.
2020-02-24 23:19:43 -08:00
harshavardhanpb cac4feb263 openapi: Move openapi.py into zerver/openapi.py.
Fixes #14006
2020-02-24 12:21:26 -05:00
Steve Howell ed859617e4 minor: Add test for extract_stream_indicator. 2020-02-24 07:40:31 -05:00
Mateusz Mandera a9794ec001 cache: Delete unused function cache(). 2020-02-21 09:05:46 -08:00
Mateusz Mandera c78d0712f7 tests: For ldap tests, give each ldap user a unique password.
To avoid some hidden bugs in tests caused by every ldap user having the
same password, we give each user a different password, generated based
on their uids (to avoid some ugly hard-coding in a bunch of places).
2020-02-19 14:46:29 -08:00
Vishnu KS 51f5701879 export: Canonicalize the email of cross realm bot to default value.
Fixes #13496
2020-02-19 14:44:50 -08:00
Vishnu KS e1a7716578 emails: Translate from_name of account security emails. 2020-02-18 17:45:33 -08:00
Tim Abbott 0075c6cd56 do_update_message: Clean up timestamp code.
By moving this logic to the topic of the functon, we make the code a
lot more readable.
2020-02-18 16:38:34 -08:00
Mateusz Mandera 6a0b68bc7f models: Delete get_stream_recipient function and its uses.
With recipient being now a Stream field, there's no more use for
this helper function.
2020-02-18 10:49:14 -08:00
Mateusz Mandera 0d6f78b381 models: Delete get_personal_recipient function and its uses.
With recipient being now a UserProfile field, there's no more use for
this helper function.
2020-02-18 10:49:14 -08:00
Mateusz Mandera 920d22524b import: Use re_map_foreign_keys on the realm column of UserPresence.
We forgot to make this adjustment in the recent denormalization of realm
into UserPresence. It's needed for imports to work correctly.
2020-02-18 10:45:38 -08:00
Anders Kaseorg b2ec8e157b has_request_variables: Remove query_params dict.
‘req_var in request.GET’ was previously believed to be slow from
profiling results.  However, the real explanation for those profiling
results is that WSGIRequest.GET is a lazy cached property, so there’s
no reason to avoid it if we’re accessing request.GET anyway.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-15 11:37:18 -08:00
Chris Heald 18e3982acd integrations: Add AlertManager webhook. 2020-02-14 17:43:15 -08:00
Mateusz Mandera cbdfef28a8 retention: Update to account for the zulipinternal realm.
In https://github.com/zulip/zulip/pull/12823 some changes to the realms
structure have been made, so now both in production and development
cross-realm bots live in the realm with string_id "zulipinternal".
There was a TODO in retention code to eliminate a conditional in a query
that became redundant with this change, and also the zulipinternal realm
should be omitted from the archiving process in archive_messages().
2020-02-14 17:15:26 -08:00
Tim Abbott 10e7e15088 user_agent: Compile the regular expression.
We use this single regular expression for processing essentially every
request, so it's definitely worth hinting to Python that we're going
to do so by compiling it.  Saves about 40us per request.
2020-02-14 10:26:37 -08:00
Tim Abbott 800312c976 has_request_variables: Fix slow extraction of parameters.
A sloppy implementation of the main has_request_variables wrapper
function meant that it did two very inefficient things:

* To combine together the GET and POST parameters, it would make a
  copy of the request.GET QueryDict object, which combined with the
  fact that these objects are slow to access, consumed about 90us per
  argument.
* Doing this in a loop (one time per argument), rather than once,
  which resulted in us doing this 11 times for a `GET /events` query.

Fixing this to just make a dictionary and combine things with some
small loops saved about 1 millisecond from the total runtime of GET
/events (for comparison, the total actual work of that view function
is about 700ms).

We need to fix at least one test that used a bad mock HttpRequest
object that didn't have a .GET property.
2020-02-14 09:45:26 -08:00
Tim Abbott 4fbcbeeea7 settings: Disable django.request logging at WARNING log level.
The comment explains this issue, but effectively, the upgrade to
Django 2.x means that Django's built-in django.request logger was
writing to our errors logs WARNING-level data for every 404 and 400
error.  We don't consider user errors to be a problem worth
highlighting in that log file.
2020-02-13 23:50:53 -08:00
rht 41e3db81be dependencies: Upgrade to Django 2.2.10.
Django 2.2.x is the next LTS release after Django 1.11.x; I expect
we'll be on it for a while, as Django 3.x won't have an LTS release
series out for a while.

Because of upstream API changes in Django, this commit includes
several changes beyond requirements and:

* urls: django.urls.resolvers.RegexURLPattern has been replaced by
  django.urls.resolvers.URLPattern; affects OpenAPI code and related
  features which re-parse Django's internals.
  https://code.djangoproject.com/ticket/28593
* test_runner: Change number to suffix. Django changed the name in this
  ticket: https://code.djangoproject.com/ticket/28578
* Delete now-unnecessary SameSite cookie code (it's now the default).
* forms: urlsafe_base64_encode returns string in Django 2.2.
  https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode
* upload: Django's File.size property replaces _get_size().
  https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/
* process_queue: Migrate to new autoreload API.
* test_messages: Add an extra query caused by .refresh_from_db() losing
  the .select_related() on the Realm object.
* session: Sync SessionHostDomainMiddleware with Django 2.2.

There's a lot more we can do to take advantage of the new release;
this is tracked in #11341.

Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera squashed
are squashed into this commit.

Fixes #10835.
2020-02-13 16:27:26 -08:00
Tim Abbott 1ea2f188ce tornado: Rewrite Django integration to duplicate less code.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.

The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).

As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
  to change a couple lines allowing us our Tornado code to not
  actually return the Django HttpResponse so we can long-poll.  The
  previous hack of returning None stopped being viable with the Django 2.2
  MiddlewareMixin.__call__ implementation.

But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:

* Replace RespondAsynchronously with a response.asynchronous attribute
  on the HttpResponse; this allows Django to run its normal plumbing
  happily in a way that should be stable over time, and then we
  proceed to discard the response inside the Tornado `get()` method to
  implement long-polling.  (Better yet might be raising an
  exception?).  This lets us eliminate maintaining a patched copy of
  _get_response.

* Removing the @asynchronous decorator, which didn't add anything now
  that we only have one API endpoint backend (with two frontend call
  points) that could call into this.  Combined with the last bullet,
  this lets us remove a significant hack from our
  never_cache_responses function.

* Calling the normal Django `get_response` method from zulip_finish
  after creating a duplicate request to process, rather than writing
  totally custom code to do that.  This lets us eliminate maintaining
  a patched copy of Django's load_middleware.

* Adding detailed comments explaining how this is supposed to work,
  what problems we encounter, and how we solve various problems, which
  is critical to being able to modify this code in the future.

A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.

There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard).  We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response.  Profiling will be important to understanding what's
worth doing here.
2020-02-13 16:13:11 -08:00
Mateusz Mandera 27b15a9722 install: Don't create internal realm in the installation process. 2020-02-12 12:00:10 -08:00
Mateusz Mandera fe33966642 sessions: Implement the concept of expirable session variables.
This can be useful in the future for various things, and right now it'll
specifically be used in the signup mobile/desktop flows.
2020-02-12 11:09:55 -08:00
Hashir Sarwar eb23c6fa6c test_fixtures: Clean up interface for `template_database_status()`.
1) Created a new class `DatabaseType` and access its objects inside
`template_database_status()` instead of sending five arguments with
default values.

2) Made `check_files` and `setting_name` local variables instead of
function parameters since they had same value(None) for every call.

Fixes #13845.
2020-02-12 11:07:10 -08:00
Tim Abbott 96b0ec705d email_notifications: Fix missing translation tags on sender. 2020-02-12 10:54:34 -08:00
Anders Kaseorg e257253e64 emoji_codes: Replace JS module with JSON module.
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.

Update the backend to use emoji_codes.json too instead of the three
separate JSON files.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-12 10:09:12 -08:00
Tim Abbott cb2c96f736 test_templates: Remove shallow template rendering code.
This code was very useful when first implemented to help catch errors
where our backend templates didn't render, but has been superceded by
the success of our URL coverage testing (which ensures every URL
supported by Zulip's urls.py is accessed by our tests, with a few
exceptions) and other tests covering all of the emails Zulip sends.

It has a significant maintenance cost because it's a bit hacky and
involves generating fake context, so it makes sense to remove these.
Any future coverage issues with templates should be addressed with a
direct test that just accessing the relevant URL or sends the relevant
email.
2020-02-11 18:00:15 -08:00
Mateusz Mandera 2475adbf8a messages_for_topic: Use stream.recipient_id for more efficient query. 2020-02-11 17:39:43 -08:00
Steve Howell 900f98c0c5 presence: Use realm_id for UserPresence queries.
We now use realm_id for querying UserPresence
instead of building a big WHERE clause from the
list of user_ids.

This commit may be a bit hard to measure, since
we still get the list of user_ids for the PushToken
query in the same method.
2020-02-11 13:11:58 -08:00
Tim Abbott fcac3a4342 recipients: Rename extract_recipients to extract_private_recipients.
Recent changes mean this function is now only used for private
messages.
2020-02-11 12:28:14 -08:00
Steve Howell 1b6578cafd messages: Fix bug with commas in stream names.
We now validate streams with a separate
function from PM recipients.

It's confusing enough all the ways you can
encode a stream or encode the PM recipients,
but trying to do it all in one function was
hard to reason about and led to at least one
bug.

In particular, there was a bug where streams
with commas in them would get split.  Now
we just don't ever split on commas inside
of `extract_stream_indicator`.

Fixes #13836
2020-02-11 12:20:54 -08:00
Steve Howell 96132fe0e9 extract_recipients: Enforce str as incoming type.
After removing internal_send_message() in a recent
commit, we now have only two callers for
extract_recipients, and they are both related
to our REQ mechanism that always passes strings
to converters.  (If there are default values,
REQ does not call the converters.)

We therefore make two changes:

    - use the more strict annotation of "str"
      for the `s` parameter

    - don't bother with the isinstance check
2020-02-11 12:20:54 -08:00
Steve Howell 8c3eaeb872 Remove obsolete internal_send_messages().
We have been phasing this out for a couple years,
and I fixed the last stragglers over the last
couple days.
2020-02-11 12:20:54 -08:00
Steve Howell e37d660d19 error_notify: Use internal_send_stream_message(). 2020-02-11 12:20:53 -08:00
Steve Howell c4e3cfebb0 presence: Add realm_id to UserPresence.
This index is intended to optimize the performance of the very
frequently run query of "what is the presence status of all users in a
realm?".

Main changes:
    - add realm_id to UserPresence
    - add index for realm_id
    - backfill realm_id for old rows
    - change all writes to UserPresence to include
      realm_id

The index is of this form:

    "zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id)

We will create an index on (realm_id, timestamp) in a
future commit, but I think it's a bit faster if you do
the backfill before the index.

There's also a minor tweak to the populate_db script.
2020-02-10 17:21:45 -08:00
Steve Howell 28a8ffbc4c email_mirror: Use internal_send_stream_message().
This is just a refactoring to the more modern API
for sending internal messages.

To make this work we now plumb the email_gateway
flag through `internal_send_stream_message` instead
of `internal_send_message`.

We also change `send_zulip` to have its callers
pass in a full UserProfile object (which one of
them already had).
2020-02-10 15:45:13 -08:00
Steve Howell 6922eef380 signups: Use internal_send_stream_message().
We prefer this to internal_send_message().

We are trying to deprecate `internal_send_message`,
which has extra moving parts related to
`extract_recipients` and `Addressee.legacy_build`.

There are two chunks of code that I touch here
that look pretty similar, but I'm not quite
sure they're worth de-duplicating, since they
use different topics and different message
content.
2020-02-10 15:45:13 -08:00
Steve Howell b33552997e cross realm bots: Simplify notify_new_user.
Instead of having `notify_new_user` delegate
all the heavy lifting to `send_signup_message`,
we just rename `send_signup_message` to be
`notify_new_user` and remove the one-line
wrapper.

We remove a lot of obsolete complexity:

    - `internal` was no longer ever set to True
      by real code, so we kill it off as well
      as well as killing off the internal_blurb code
      and the now-obsolete test

    - the `sender` parameter was actually an
      email, not a UserProfile, but I think
      that got past mypy due to the caller
      passing in something from settings.py

    - we were only passing in NOTIFICATION_BOT
      for the sender, so we just hard code
      that now

    - we eliminate the verbose
      `admin_realm_signup_notifications_stream`
      parameter and just hard code it to
      "signups"

    - we weren't using the optional realm
      parameter

There's also a long ugly comment in
`get_recipient_info` related to this code
that I amended for now.
We should try to take action in a subsequent
commit.
2020-02-10 15:45:13 -08:00
Hashir Sarwar dcbd3e486f stream_subscription: Remove unused TypedDict `SubInfo`. 2020-02-10 14:04:22 -08:00
Steve Howell 2ff41bf9e5 /json/users: Use field.realm for realm lookup.
This avoids an unnecessary join to UserProfile.

To verify this, you can do `print(queries)` in the
`test_get_custom_profile_fields_from_api` test.  It's
kinda noisy, so I excerpted them below...

Before:

    SELECT ...
    FROM "zerver_customprofilefieldvalue"
    INNER JOIN "zerver_userprofile" ON ("zerver_customprofilefieldvalue"."user_profile_id" = "zerver_userprofile"."id")
    INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
    WHERE "zerver_userprofile"."realm_id" = 2

After:

    SELECT ...
    FROM "zerver_customprofilefieldvalue"
    INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
    WHERE "zerver_customprofilefield"."realm_id" = 2'

I don't have any way to measure the two queries with
realistic data, but I would assume the second
query is significantly faster on most of our instances,
since CustomProfileField should be tiny.
2020-02-09 22:04:02 -08:00
Steve Howell 01f180d042 minor: Remove unused line of code in get_raw_user_data().
The line removed here is a noop, as both sides of the
immediately following conditional reassign the
same variable.

This harmless cruft was the result of the recent commit
1ae5964ab8, which added
support for single-user GETs.
2020-02-09 22:04:02 -08:00
Vishnu KS 4572be8c27 api: Rename subject_links to topic_links.
Fixes #13588
2020-02-07 14:35:22 -08:00
Tim Abbott 84edb5c516 test_fixtures: Fix buggy reuse of status_dir between databases.
Apparently, the arguments passed to template_database_status were
incorrect for the manual testing development database, in that we
didn't pass a status_dir when calling into that code from provision.

The result was that provisioning before running `test-backend` would
ignore changes to the list of check_files (etc.) made after rebasing,
and vice versa.

The cleanest fix is to compute status_dir from other values passed in;
I'm also going to open a follow-up issue for creating a better overall
interface here.
2020-02-07 13:33:08 -08:00
akashaviator 1ae5964ab8 api: Add an api endpoint for GET /users/{id}
This adds a new API endpoint for querying basic data on a single other
user in the organization, reusing the existing infrastructure (and
view function!) for getting data on all users in an organization.

Fixes #12277.
2020-02-07 10:36:31 -08:00
Tim Abbott e39840c705 users: Add read-only mode for access_user_by_id.
We've be using this in the upcoming GET /users/{id} method.
2020-02-07 10:36:31 -08:00
Tim Abbott aa9286a1f9 users: Move query into caller of get_custom_profile_field_values.
This will be useful for supporting a smaller query for a single user.
2020-02-07 10:36:31 -08:00
Tim Abbott 79e5dd1374 users: Rename get_raw_user_data user parameter to acting_user.
This is for improved clarity as we extend this function to take
multiple user objects.
2020-02-07 10:36:31 -08:00
Steve Howell 7e99e7feb2 presence: Extract get_legacy_user_info.
This code is a bit flatter and just preps the data
for a single user.  There is never any interaction
between the data for user A and user B, so we can
mostly avoid complicated nested data structures
and do most of the data-crunching on a per-user basis.

We also do an explicit sort of the data before
running it through groupby.  The explicit sort
simplifies how we calculate `most_recent_info`
and also avoids needing to add `dt` to an intermediate
data structure.

Finally, when it comes to the individual client data,
the code has relied on the assumption that there is
only one row per client, which I believe to be true,
but now the code is more explicit about that.
2020-02-06 17:16:22 -08:00
Steve Howell bf3baa14ac presence: Rename get_status_dict_by_user(). 2020-02-06 17:16:22 -08:00
Steve Howell 675f8514e8 presence: Rename get_status_dict().
We renamed this to get_presences_for_realm(),
and we have the caller pass in realm, not
user_profile.
2020-02-06 17:16:22 -08:00
Steve Howell 363e6bf239 presence: Move get_status_dicts_for_rows(). 2020-02-06 17:16:22 -08:00
Steve Howell 36fba1076f presence: Move get_status_dict_by_user. 2020-02-06 17:16:22 -08:00
Steve Howell 6f027d84a9 presence: Move get_status_dict_by_realm. 2020-02-06 17:16:22 -08:00
Steve Howell 703338dfa3 presence: Extract lib/presence.py.
This will make more sense when we pull some
code out of the model.
2020-02-06 17:16:22 -08:00
Tim Abbott 7c0a98754a home: Refactor logic for show_invites and show_add_streams. 2020-02-05 16:05:02 -08:00
Tim Abbott 7032f49f8e exceptions: Move default json_unauthorized string to response.py.
This small refactor should make it easier to reuse this exception for
other situations as well.
2020-02-05 15:40:10 -08:00
Anders Kaseorg 8e5a45267d test_classes: Use a valid (but reserved as fictional) phone number.
django-phonenumber-field 2.4.0 adds tighter phone number validation
that rejects +12223334444 for having an invalid area code.  This was
reverted in 4.0.0, but django-two-factor-auth still requires <3.99.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-02-05 12:38:10 -08:00
Ryan Rehman 174b2abcfd settings: Migrate to stream_post_policy structure.
This commit includes a new `stream_post_policy` setting,
by replacing the `is_announcement_only` field from the Stream model,
which is done by mirroring the structure of the existing
`create_stream_policy`.

It includes the necessary schema and database migrations to migrate
the is_announcement_only boolean field to stream_post_policy,
a smallPositiveInteger field similar to many other settings.

This change is done to allow organization administrators to restrict
new members from creating and posting to a stream. However, this does
not affect admins who are new members.

With many tweaks by tabbott to documentation under /help, etc.

Fixes #13616.
2020-02-04 17:08:08 -08:00
Mateusz Mandera 30d02c2e2c test_fixtures: app_label should be a positional arg in call_command.
We were incorrectly passing it as a kwarg, which would cause an
exception on Django 2.
2020-02-04 12:46:53 -08:00
Mateusz Mandera 0e7c97378e is_safe_url: Use allowed_hosts instead of depreciated host argument.
Judging by comparing django 1.11 with django 2.2 code of this function,
this shouldn't change any behavior.
2020-02-04 12:46:53 -08:00
Steve Howell e3ad9baf1d presence: Add process_presence_event.
This lets us conditionally remove the email
field from a presence event if the client
has registered with the slim_presence flag.
2020-02-04 12:30:36 -08:00
Steve Howell 9847d4d9a3 refactor: Use user_id in get_status_dict_by_user.
This avoids a needless user lookup in apply_event.
2020-02-04 12:30:36 -08:00
Steve Howell a672a00677 presence: Add user_id to presence event.
In a later commit, we will eliminate email for
clients who have set slim_presence as their
preference.
2020-02-04 12:30:36 -08:00
Steve Howell bf9144ff69 presence: Add slim_presence flag.
This flag affects page_params and the
payload you get back from POSTs to this
url:

    users/me/presence

The flag does not yet affect the
presence events that get sent to a
client.
2020-02-04 12:30:34 -08:00
Vishnu Ks 5dfd4ea38d export: Remove unused parameter from _get_exported_s3_record. 2020-02-03 14:09:05 -08:00
Vishnu Ks 5a59bf329e import: Skip setting user_profile_id metadata only if unavailable. 2020-02-03 14:09:05 -08:00
Vishnu Ks 2ea53a347a import: Support importing realm icon and logo.
Fixes #11216
2020-02-03 14:09:05 -08:00
Vishnu Ks af3a37b58b upload: Refactor out realm_avatar_and_logo_path function. 2020-02-03 14:09:05 -08:00
Tim Abbott df110e8ff9 test_fixtures: Note populate_db depends on server_initialization.py.
This should ensure that folks rebasing past this commit from an older
database model get their database rebuilt in the way that will
match the test_subs.py query count of 40.
2020-02-03 10:38:04 -08:00
Ryan Rehman 3dc7d60ffe muting: Record DateTime when a Topic is muted.
This includes the necessary migration to add
the date_muted field to the MutedTopic class
and populates it with a hard coded value.
2020-02-02 20:49:53 -08:00
Mateusz Mandera bf89cf2b4b rate_limiter: Use ABC for defining the abstract class RateLimitedObject. 2020-02-02 19:15:13 -08:00
Mateusz Mandera cb71a6571e rate_limiter: Rename 'all' domain to 'api_by_user'. 2020-02-02 19:15:13 -08:00
Mateusz Mandera 06198af5b9 auth: Handle rate limiting in OurAuthenticationForm and user_settings.
These parts of the code should catch the RateLimited exception and
generate their own, apprioprate user-facing error message.
2020-02-02 19:15:13 -08:00
Mateusz Mandera 335b804510 exceptions: RateLimited shouldn't inherit from PermissionDenied.
We will want to raise RateLimited in authenticate() in rate limiting
code - Django's authenticate() mechanism catches PermissionDenied, which
we don't want for RateLimited. We want RateLimited to propagate to our
code that called the authenticate() function.
2020-02-02 19:15:00 -08:00
Mateusz Mandera a6a2d70320 rate_limiter: Handle multiple types of rate limiting in middleware.
As more types of rate limiting of requests are added, one request may
end up having various limits applied to it - and the middleware needs to
be able to handle that. We implement that through a set_response_headers
function, which sets the X-RateLimit-* headers in a sensible way based
on all the limits that were applied to the request.
2020-02-02 19:15:00 -08:00
Mateusz Mandera 4cc5d2464c rate_limiter: Expand support for different domains. 2020-02-02 19:15:00 -08:00
Tim Abbott 51706bdc3a stream: Deduplicate lists of stream/subscriptions fields.
While the result of this change doesn't completely do what we need, it
does remove a huge amount of duplicated lists of fields.  With a bit
more similar work, we should be able to eliminate a broad category of
potential bugs involving Stream and Subscription objects being
represented inconsistently in the API.

Work towards #13787.
2020-02-02 18:34:45 -08:00
Tim Abbott 238bc386cb actions: Deduplicate parts of get_web_public_subs.
This has the side of effect of making new fields we add to Stream be
automatically included, which will help maintain this code as we
upgrade it.

This commit adds is_web_public, history_public_to_subscribers, and
email_notifications fields to the dictionary.
2020-02-02 17:42:12 -08:00
Tim Abbott eac07698dd users: Add nocoverage tag for settings.SYSTEM_BOT_REALM conditional.
This is code for safety that should never happen and is likely
annoying to setup an automated test to verify.
2020-01-31 14:51:12 -08:00
Tim Abbott 5825a155cc users: Use format_user_row in events system as well.
This completes the deduplication of our logic for turning users into
dictionaries in the Zulip API.
2020-01-31 14:47:16 -08:00
akashaviator 20b8b29d11 users: Rewrite get_cross_realm_dicts to call format_user_row.
This modifies get_cross_realm_dicts in zerver.lib.users to call
format_user_row.  This is done to remove current and prevent future
inconsistencies between in the dictionary formats for get_raw_user_data
and get_cross_realm_dicts.

Implementation substantially rewritten by tabbott.

Fixes #13638.
2020-01-31 14:28:46 -08:00
akashaviator 7d06293ac0 refactor: Cleanup actions.py and events.py in zerver/lib.
This moves get_cross_realm_dicts (from zerver.lib.actions),
get_raw_user_data and get_custom_profile_field_values (from
zerver.lib.events) to zerver.lib.users.
2020-01-31 13:53:47 -08:00
Vishnu KS bd460af099 emails: Remove unecessary call to message_content_allowed_in_missedmessage_emails. 2020-01-31 12:29:58 -08:00
Vishnu KS 47e442e4a4 emails: Show proper message when email content is not shown. 2020-01-31 12:29:58 -08:00
akashaviator bd58e3397f events: Extract user_data function from get_raw_user_data.
This extracts the user_data inner function from get_raw_user_data as a
reusable function.  We intend to reuse it for cross-realm user dicts.
A few changes were made while extracting it:

* Renaming the UserProfile argument to acting_user, so we can do loops
  over a local user_profile variable.
* Moved it to zerver.lib.users, as that's a more appropriate home for
  this function formatting data on users.

* Simplified the calling convention for passing custom profile fields
  to reflect the fact that this function processes a single user (and
  is expected to be called in a loop).
2020-01-30 13:32:35 -08:00
Mateusz Mandera 8bd3752d13 email_mirror: Handle encoded attachment filenames. 2020-01-30 13:03:47 -08:00
Mateusz Mandera 49b76318c6 email_mirror: Extract handle_header_content function. 2020-01-30 13:03:47 -08:00
Tim Abbott dd969b5339 install: Remove references to "Zulip Voyager".
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".

At the time, we just needed a name quickly, but it was never a good
name, just a placeholder.  This removes that placeholder name from
much of the codebase.  A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
2020-01-30 12:40:41 -08:00
Mateusz Mandera d68cf21952 server_initialization: Add server_initialized function. 2020-01-30 12:21:31 -08:00
Mateusz Mandera 682dea1b34 test_classes: Fix bug where UserProfile could be passed to client_post.
It would cause JSON overflow error while producing URL coverage report.
2020-01-30 12:13:54 -08:00
Mateusz Mandera 68abddb534 server_initialization: Rename some variables.
This makes the code of create_internal_realm identical to the
corresponding block in initialize_voyager_db.py.
2020-01-29 17:26:45 -08:00
Mateusz Mandera 39b012a276 server_initialization: Set internal bots owners to themselves. 2020-01-29 17:26:45 -08:00
Mateusz Mandera 9c20611a65 server_initialization: Remove unnecessary type annotation. 2020-01-29 17:26:45 -08:00
Mateusz Mandera d24936cbe3 server_initialization: Use tos_version argument in create_users. 2020-01-29 17:26:45 -08:00
Mateusz Mandera 261da5999d populate_db: Extract default client creation to server_initialization. 2020-01-29 17:26:45 -08:00
Mateusz Mandera a25f00a69c populate_db: Extract some functions to server_initialization.py. 2020-01-29 17:26:45 -08:00
Mateusz Mandera 9dcf677bf9 email_mirror: Parse encoded From headers with show_sender=True. 2020-01-29 12:27:35 -08:00
Tim Abbott 05108760f6 narrow: Add support for passing oldest/newest for anchor.
A wart that has long been present inin Zulip's get_messages API is how
to request "the latest messages" in the API.  Previously, the
recommendation was basically to pass anchor=10000000000000000 (for an
appropriately huge number). An accident of the server's implementation
meant that specific number of 0s was actually important to avoid a
buggy (or at least wasteful) value of found_newest=False if the query
had specified num_after=0 (since we didn't check).

This was the cause of the mobile issue
https://github.com/zulip/zulip-mobile/issues/3654.

The solution is to allow passing a special value of anchor='newest',
basically a special string-type value that the server can interpret as
meaning the user precisely just wants the most recent messages.  We
also add an analogous anchor='oldest' or similar to avoid folks
needing to write a somewhat ugly anchor=0 for fetching the very first
messages.

We may want to also replace the use_first_unread_anchor argument to be
a "first_unread" value for the anchor parameter.

While it's not always ideal to make a value have a variable type like
this, in this case it seems like a really clean way to express the
idea of what the user is asking for in the API.
2020-01-29 12:14:06 -08:00
Tim Abbott c0712431df openapi: Add hacky support for oneOf parameter types.
This is required for the upcoming type behavior of the "anchor"
parameter.

This change is the minimal work required to have our OpenAPI code not
fail when checking a union-type value of this form.  We'll likely want
to, in the future, do something nicer, but it'd require more extensive
infrastructure for parsing of OpenAPI data that it's worth with our
current approach (we may want to switch to using a library).
2020-01-29 11:24:58 -08:00
Tim Abbott 91f1825474 test_helpers: Fix POSTRequestMock typing.
The proximal issue here is that in upcoming commits, we're going to
change the type of the `anchor` field in `get_messages_backend` to
support passing either an integer or a string.

Many of our tests using POSTRequestMock currently define a query
object that uses integer values for the integer fields we're going to
pass into it, e.g. {'num_after': 0}.  That is the correct type for
that field in the Zulip API, before HTTP encoding turns it into a
string.  However, because POSTRequestMock didn't use HTTP encoding at
all (which will convert the 0 into a '0'), it ended up passing an
integer to a function that can't possible receive one as an argument.

Ideally, we'd just get rid of POSTRequestMock, since it's a hack, and
just do real HTTP requests instead.

But since it's used in a lot of places making doing so somewhat
impractical, we can get past this issue by just making POSTRequestMock
convert integers to strings.
2020-01-29 11:24:58 -08:00
Tim Abbott 8f50062e49 soft_deactivation: Fix incorrect logging function.
Using logging.info() rather than logger.info() meant that our
zulip.soft_deactivation logger configuration (which, in particular,
included not logging to the console) was not active on this log line,
resulting in the `manage.py soft_deactivate_users` cron job sending
emails every time it ran.

Fixes #13750.
2020-01-28 17:17:43 -08:00
Rohitt Vashishtha 630c564fc7 bugdown: Rewrite List Preprocessor logic to properly parse fences.
Previously, we didn't track opening and closing fences separately,
with led to bugs like not parsing a list that was immediately after
a quoted fence; we treated each ``` as a new fence.

This commit rewrites the function to maintain a stack of currently
open fences. If any of the parent fences is a code fence, we do not
insert a new line before a list.

We also add some test cases specifically to test this behavior with
complexly nested lists.

Fixes #13745.
2020-01-27 17:14:27 -08:00
Mateusz Mandera 92c16996fc redis_utils: Require key_format argument in get_dict_from_redis. 2020-01-26 21:40:15 -08:00
Mateusz Mandera ad460e6ccb redis_utils: Validate requested key length in helper functions. 2020-01-26 21:40:15 -08:00
Mateusz Mandera 8d987ba5ae auth: Use tokens, with data stored in redis, for log_into_subdomain.
The desktop otp flow (to be added in next commits) will want to generate
one-time tokens for the app that will allow it to obtain an
authenticated session. log_into_subdomain will be the endpoint to pass
the one-time token to. Currently it uses signed data as its input
"tokens", which is not compatible with the otp flow, which requires
simpler (and fixed-length) token. Thus the correct scheme to use is to
store the authenticated data in redis and return a token tied to the
data, which should be passed to the log_into_subdomain endpoint.

In this commit, we replace the "pass signed data around" scheme with the
redis scheme, because there's no point having both.
2020-01-26 21:32:44 -08:00
Abhishek-Balaji 434e8d3104 home: Extract compute_show_invites_and_add_streams.
This extracts a function for computing show_invites and
show_add_streams, for better readability and testability.

This commit was substantially cleaned up by tabbott.
2020-01-25 23:41:08 -08:00
Tim Abbott d70e799466 bots: Remove FEEDBACK_BOT implementation.
This legacy cross-realm bot hasn't been used in several years, as far
as I know.  If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.

Fixes #13533.
2020-01-25 22:41:39 -08:00
Mateusz Mandera af2c4a9735 redis: Extract put_dict_in_redis and get_dict_from_redis helpers. 2020-01-23 16:24:07 -08:00
Jonathan Cobb c7433c83ff integrations: Add errbit integration.
Fixes #13685.
2020-01-16 15:33:51 -08:00
Mateusz Mandera d37e6ef921 email_mirror: Use plaintext if html body empty with prefer-html option.
If an email is sent with the .prefer-html option, but it has no html
body, it's better to fall back to plaintext content instead of treating
it as a user error.
2020-01-16 15:25:27 -08:00
Mateusz Mandera 0c9c218e91 email_mirror: Add prefer-html and prefer-text address options.
Closes #13484.

These options tell zulip whether to prefer the plaintext or html version
of the email message. prefer-text is the default behavior, so including
the option doesn't change anything as of now, but we're adding it to
prepare to potentially change the default behavior in the future.
2020-01-16 15:25:19 -08:00
Mateusz Mandera 170e0ac2dd email_mirror: More abstract option system.
As we add more address options, which will have different behavior than
simply setting option_name=True, we need to migrate this subsystem to
something that better supports more complex logic and will allow
encapsulating it, instead of needing to be put all over the
decode_email_address function.
2020-01-16 15:16:04 -08:00
Tim Abbott eb8b3539ad test_classes: Remove DEFAULT_REALM variable.
This essentially unused legacy variable was causing Zulip to query the
database at import time, which is generally not something we aim to
do.

Combined with the issue fixed in the previous commit, this variable
resulted in test-backend providing an unhelpful crash when provision
hadn't updated the unit testing database.
2020-01-16 13:13:46 -08:00
Tim Abbott 8ff5d8ca89 test_classes: Clean up API_KEYS cache.
Since the intent of our testing code was clearly to clear this cache
for every test, there's no reason for it to be a module-level global.

This allows us to remove an unnecessary import from test_runner.py,
which in combination with DEFAULT_REALM's definition was causing us to
run models code before running migrations inside test-backend.

(That bug, in turn, caused test-backend's check for whether migrations
needs to be run to happen sadly after trying to access a Realm,
trigger a test-backend crash if the Realm model had changed since the
last provision).
2020-01-16 13:07:26 -08:00
Anders Kaseorg 319e2231b8 thumbnail: Tighten fix for CVE-2019-19775 open redirect.
Due to a known but unfixed bug in the Python standard library’s
urllib.parse module (CVE-2015-2104), a crafted URL could bypass the
validation in the previous patch and still achieve an open redirect.

https://bugs.python.org/issue23505

Switch to using django.utils.http.is_safe_url, which already contains
a workaround for this bug.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-16 12:36:24 -08:00
Anders Kaseorg ea6934c26d dependencies: Remove WebSockets system for sending messages.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013.  We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not.  Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.

While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.

This WebSockets code path had a lot of downsides/complexity,
including:

* The messy hack involving constructing an emulated request object to
  hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
  needs and must be provisioned independently from the rest of the
  server).
* A duplicate check_send_receive_time Nagios test specific to
  WebSockets.
* The requirement for users to have their firewalls/NATs allow
  WebSocket connections, and a setting to disable them for networks
  where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
  been poorly maintained, and periodically throws random JavaScript
  exceptions in our production environments without a deep enough
  traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
  server restart, and especially for large installations like
  zulipchat.com, resulting in extra delay before messages can be sent
  again.

As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients.  We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.

If we later want WebSockets, we’ll likely want to just use Django
Channels.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-14 22:34:00 -08:00
Mateusz Mandera 0beae44081 email_mirror: Use .walk() to search all MIME parts for attachments.
Fixes #13416

We used to search only one level in depth through the MIME structure,
and thus would miss attachments that were nested deeper (which can
happen with some email clients). We can take advantage of message.walk()
to iterate through each MIME part.
2020-01-14 15:37:39 -08:00
Mateusz Mandera 1561d144e0 email_mirror: Insert a new line before attachment links. 2020-01-14 15:37:39 -08:00
Tlazypanda 30ee0c2a49 invitations: Improve experience around reactivating users.
Previously, if you tried to invite a user whose account had been
deactivated, we didn't provide a clear path forward for reactivating
the users, which was confusing.

We fix this by plumbing through to the frontend the information that
there is an existing user account with that email address in this
organization, but that it's deactivated.  For administrators, we
provide a link for how to reactivate the user.

Fixes #8144.
2020-01-13 18:30:51 -08:00
Tim Abbott 79f18138f5 realm: Add private_message_policy setting.
This experimental setting disables sending private messages in Zulip
in a crude way (i.e. users get an error when they try to send one).
It makes no effort to adjust the UI to avoid advertising the idea of
sending private messages.

Fixes #6617.
2020-01-13 12:20:42 -08:00
Mateusz Mandera d5ac1afce8 email_mirror: Check address usability in get_missed_message_address. 2020-01-12 20:43:51 -08:00
Mateusz Mandera 89046ea1a9 email_mirror: Give extract_and_validate a more descriptive name. 2020-01-12 11:30:18 -08:00
Mateusz Mandera 90a69ab24f email_mirror: Reuse exception messages in mirror_email_message. 2020-01-12 11:30:18 -08:00
Mateusz Mandera 9f2b0c769f stream_recipient: Eliminate unnecessary queries.
We should take adventage of the recipient field being denormalized into
the Stream model. We don't need to make queries to figure out a stream's
recipient id, so we take advantage of that to eliminate some of
those redundant queries and simplify StreamRecipientMap.
2020-01-08 14:34:43 -08:00
Mateusz Mandera 786c235023 stream_recipient: Optimize query in populate_for_recipient_ids.
There's no reason to join with the Stream table, as Recipient.type_id is
the stream id.
2020-01-08 14:34:43 -08:00
Hashir Sarwar 0cabacb8ab export: Fix data export parallelization.
This improves the approach of creating multiple parallel processes by
using subprocess.Popen() instead of run_parallel() and
subprocess.call() while exporting an organization's message
history.  This prevents forking twice for individual subprocess.

While this has some performance benefit, the main reason to fix this
is that it fixes an issue with the data export web UI introduced in
run_parallel forks exited).

Fixes #12904.
2020-01-07 13:23:18 -08:00
Mateusz Mandera b87cf22b33 email_mirror: Move send_to_mm_address code to process_missed_message.
process_missed_message did nothing other than calling
send_to_missed_message_address with the same arguments, so there's no
reason to have these as separate functions.
2020-01-07 13:03:32 -08:00
Mateusz Mandera c011d2c6d3 email_mirror: Migrate missed message addresses from redis to database.
Addresses point 1 of #13533.

MissedMessageEmailAddress objects get tied to the specific that was
missed by the user. A useful benefit of that is that email message sent
to that address will handle topic changes - if the message that was
missed gets its topic changed, the email response will get posted under
the new topic, while in the old model it would get posted under the
old topic, which could potentially be confusing.

Migrating redis data to this new model is a bit tricky, so the migration
code has comments explaining some of the compromises made there, and
test_migrations.py tests handling of the various possible cases that
could arise.
2020-01-07 13:03:22 -08:00
Mateusz Mandera 9077bbfefd models: Add MissedMessageEmailAddress class.
Preparatory commit for making the email mirror use the database instead
of redis for missed message addresses.

This model will represent missed message email addresses, which
currently have their data stored in redis.
The redis data will be converted and migrated into these models and
the email mirror will start using them in the main commit.
2020-01-07 12:46:55 -08:00