Commit Graph

4190 Commits

Author SHA1 Message Date
Mateusz Mandera ad460e6ccb redis_utils: Validate requested key length in helper functions. 2020-01-26 21:40:15 -08:00
Mateusz Mandera 859bde482d auth: Implement server side of desktop_flow_otp. 2020-01-26 21:40:15 -08:00
Mateusz Mandera 8d987ba5ae auth: Use tokens, with data stored in redis, for log_into_subdomain.
The desktop otp flow (to be added in next commits) will want to generate
one-time tokens for the app that will allow it to obtain an
authenticated session. log_into_subdomain will be the endpoint to pass
the one-time token to. Currently it uses signed data as its input
"tokens", which is not compatible with the otp flow, which requires
simpler (and fixed-length) token. Thus the correct scheme to use is to
store the authenticated data in redis and return a token tied to the
data, which should be passed to the log_into_subdomain endpoint.

In this commit, we replace the "pass signed data around" scheme with the
redis scheme, because there's no point having both.
2020-01-26 21:32:44 -08:00
Abhishek-Balaji 434e8d3104 home: Extract compute_show_invites_and_add_streams.
This extracts a function for computing show_invites and
show_add_streams, for better readability and testability.

This commit was substantially cleaned up by tabbott.
2020-01-25 23:41:08 -08:00
Vishnu KS 97a25657a5 tests: Set class name of video call test to TestVideoCall.
The previous class name TestFeedbackBot was probably
leftover from copy paste.
2020-01-25 22:54:59 -08:00
Vishnu KS 05b4610381 bots: Remove feedback cross realm bot.
This completes the remaining pieces of removing this missed in
d70e799466 (mostly in tests).
2020-01-25 22:54:44 -08:00
Tim Abbott d70e799466 bots: Remove FEEDBACK_BOT implementation.
This legacy cross-realm bot hasn't been used in several years, as far
as I know.  If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.

Fixes #13533.
2020-01-25 22:41:39 -08:00
Mateusz Mandera bce50ee652 auth: Use authenticate_remote_user in remote_user_jwt.
authenticate_remote_user already takes care of calling the authenticate
with the dummy backend. Also, return_data is not used and catching
DoesNotExist exception is not needed, as the dummy backend just returns
None if user isn't found.
2020-01-23 16:24:07 -08:00
Tim Abbott e052ec58db slack import: Improve error messages around invalid tokens.
This updates our error handling of invalid Slack API tokens (and other
networking error handling) to mostly make sense:
* A token that doesn't start with `xoxp-` gives an extended error early.
* An AssertionError for the codebase is correctly declared as such.
* We check for token shape errors before querying the Slack API.

We could still do useful work to raise custom exception classes here.

Thanks to @stavrospat for raising this issue.
2020-01-22 14:48:32 -08:00
Mateusz Mandera 8dd95bd057 tests: Replace httpretty with responses.
responses is an module analogous to httpretty for mocking external
URLs, with a very similar interface (potentially cleaner in that it
makes use of context managers).

The most important (in the moment) problem with httpretty is that it
breaks the ability to use redis in parts of code where httpretty is
enabled.  From more research, the module in general has tendency to
have various troublesome bugs with breaking URLs that it shouldn't be
affecting, caused by it working at the socket interface layer.  While
those issues could be fixed, responses seems to be less buggy (based
on both third-party reports like ckan/ckan#4755 and our own experience
in removing workarounds for bugs in httpretty) and is more actively
maintained.
2020-01-22 11:56:15 -08:00
Mateusz Mandera d37e6ef921 email_mirror: Use plaintext if html body empty with prefer-html option.
If an email is sent with the .prefer-html option, but it has no html
body, it's better to fall back to plaintext content instead of treating
it as a user error.
2020-01-16 15:25:27 -08:00
Mateusz Mandera 0c9c218e91 email_mirror: Add prefer-html and prefer-text address options.
Closes #13484.

These options tell zulip whether to prefer the plaintext or html version
of the email message. prefer-text is the default behavior, so including
the option doesn't change anything as of now, but we're adding it to
prepare to potentially change the default behavior in the future.
2020-01-16 15:25:19 -08:00
Tim Abbott 8ff5d8ca89 test_classes: Clean up API_KEYS cache.
Since the intent of our testing code was clearly to clear this cache
for every test, there's no reason for it to be a module-level global.

This allows us to remove an unnecessary import from test_runner.py,
which in combination with DEFAULT_REALM's definition was causing us to
run models code before running migrations inside test-backend.

(That bug, in turn, caused test-backend's check for whether migrations
needs to be run to happen sadly after trying to access a Realm,
trigger a test-backend crash if the Realm model had changed since the
last provision).
2020-01-16 13:07:26 -08:00
Anders Kaseorg ea6934c26d dependencies: Remove WebSockets system for sending messages.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013.  We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not.  Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.

While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.

This WebSockets code path had a lot of downsides/complexity,
including:

* The messy hack involving constructing an emulated request object to
  hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
  needs and must be provisioned independently from the rest of the
  server).
* A duplicate check_send_receive_time Nagios test specific to
  WebSockets.
* The requirement for users to have their firewalls/NATs allow
  WebSocket connections, and a setting to disable them for networks
  where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
  been poorly maintained, and periodically throws random JavaScript
  exceptions in our production environments without a deep enough
  traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
  server restart, and especially for large installations like
  zulipchat.com, resulting in extra delay before messages can be sent
  again.

As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients.  We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.

If we later want WebSockets, we’ll likely want to just use Django
Channels.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2020-01-14 22:34:00 -08:00
Mateusz Mandera 0beae44081 email_mirror: Use .walk() to search all MIME parts for attachments.
Fixes #13416

We used to search only one level in depth through the MIME structure,
and thus would miss attachments that were nested deeper (which can
happen with some email clients). We can take advantage of message.walk()
to iterate through each MIME part.
2020-01-14 15:37:39 -08:00
Mateusz Mandera 1561d144e0 email_mirror: Insert a new line before attachment links. 2020-01-14 15:37:39 -08:00
Tlazypanda 30ee0c2a49 invitations: Improve experience around reactivating users.
Previously, if you tried to invite a user whose account had been
deactivated, we didn't provide a clear path forward for reactivating
the users, which was confusing.

We fix this by plumbing through to the frontend the information that
there is an existing user account with that email address in this
organization, but that it's deactivated.  For administrators, we
provide a link for how to reactivate the user.

Fixes #8144.
2020-01-13 18:30:51 -08:00
Tim Abbott 79f18138f5 realm: Add private_message_policy setting.
This experimental setting disables sending private messages in Zulip
in a crude way (i.e. users get an error when they try to send one).
It makes no effort to adjust the UI to avoid advertising the idea of
sending private messages.

Fixes #6617.
2020-01-13 12:20:42 -08:00
Mateusz Mandera 90a69ab24f email_mirror: Reuse exception messages in mirror_email_message. 2020-01-12 11:30:18 -08:00
Mateusz Mandera 9f2b0c769f stream_recipient: Eliminate unnecessary queries.
We should take adventage of the recipient field being denormalized into
the Stream model. We don't need to make queries to figure out a stream's
recipient id, so we take advantage of that to eliminate some of
those redundant queries and simplify StreamRecipientMap.
2020-01-08 14:34:43 -08:00
Mateusz Mandera c011d2c6d3 email_mirror: Migrate missed message addresses from redis to database.
Addresses point 1 of #13533.

MissedMessageEmailAddress objects get tied to the specific that was
missed by the user. A useful benefit of that is that email message sent
to that address will handle topic changes - if the message that was
missed gets its topic changed, the email response will get posted under
the new topic, while in the old model it would get posted under the
old topic, which could potentially be confusing.

Migrating redis data to this new model is a bit tricky, so the migration
code has comments explaining some of the compromises made there, and
test_migrations.py tests handling of the various possible cases that
could arise.
2020-01-07 13:03:22 -08:00
Steve Howell 630aadb7e0 bot_owner_id: Explicitly set bot_owner_id to None.
For cross realm bots, explicitly set bot_owner_id
to None.  This makes it clear that the cross realm
bots have no owner, whereas before it could be
misdiagnosed as the server forgetting to set the
field.
2020-01-07 12:33:14 -08:00
Mateusz Mandera a993604fae test_email_notifs: Clean up mocking.
These tests had a lot of very repetetive, identical mocking, in some
tests without even doing anything with the mocks. It's cleaner to put
the mock in the one relevant, common place for all the tests that need
it, and remove it from tests who had no use for the mocking.
2020-01-03 16:56:58 -08:00
Mateusz Mandera d691c249db api: Return a JsonableError if API key of invalid format is given. 2020-01-03 16:56:42 -08:00
Mateusz Mandera 72401b229f utils: Add a function to check if string can be an API key. 2020-01-03 16:56:42 -08:00
Mateusz Mandera 4f2897fafc cache: Validate keys before passing them to memcached.
Fixes #13504.

This commit is purely an improvement in error handling.

We used to not do any validation on keys before passing them to
memcached, which meant for invalid keys, memcached's own key
validation would throw an exception.  Unfortunately, the resulting
error messages are super hard to read; the traceback structure doesn't
even show where the call into memcached happened.

In this commit we add validation to all the basic cache_* functions, and
appropriate handling in their callers.

We also add a lot of tests for the new behavior, which has the nice
effect of giving us decent coverage of all these core caching
functions which previously had been primarily tested manually.
2020-01-03 16:56:42 -08:00
Mateusz Mandera e90866876c queue: Take advantage of ABC for defining abstract worker base classes.
QueueProcessingWorker and LoopQueueProcessingWorker are abstract classes
meant to be subclassed by a class that will define its own consume()
or consume_batch() method. ABCs are suited for that and we can tag
consume/consume_batch with the @abstractmethod wrapper which will
prevent subclasses that don't define these methods properly to be
impossible to even instantiate (as opposed to only crashing once
consume() is called). It's also nicely detected by mypy, which will
throw errors such as this on invalid use:

error: Only concrete class can be given where "Type[TestWorker]" is
expected
error: Cannot instantiate abstract class 'TestWorker' with abstract
attribute 'consume'

Due to it being detected by mypy, we can remove the test
test_worker_noconsume which just tested the old version of this -
raising an exception when the unimplemented consume() gets called. Now
it can be handled already on the linter level.
2019-12-28 10:52:17 -08:00
Mateusz Mandera ec209a9bc9 test_queue_worker: Extract a repetitive mock. 2019-12-28 10:52:13 -08:00
Mateusz Mandera a54640fc68 queue: Share exception handling code between loop and normal workers.
LoopQueueProcessingWorker can handle exceptions inside consume_batch in
a similar manner to how QueueProcessingWorker handles exceptions inside
consume.
2019-12-28 10:47:36 -08:00
Mateusz Mandera e559447f83 ldap: Improve logging.
Our ldap integration is quite sensitive to misconfigurations, so more
logging is better than less to help debug those issues.
Despite the following docstring on ZulipLDAPException:

"Since this inherits from _LDAPUser.AuthenticationFailed, these will
be caught and logged at debug level inside django-auth-ldap's
authenticate()"

We weren't actually logging anything, because debug level messages were
ignored due to our general logging settings. It is however desirable to
log these errors, as they can prove useful in debugging configuration
problems. The django_auth_ldap logger can get fairly spammy on debug
level, so we delegate ldap logging to a separate file
/var/log/zulip/ldap.log to avoid spamming server.log too much.
2019-12-28 10:47:08 -08:00
Tim Abbott 02169c48cf ldap: Fix bad interaction between EMAIL_ADDRESS_VISIBILITY and LDAP sync.
A block of LDAP integration code related to data synchronization did
not correctly handle EMAIL_ADDRESS_VISIBILITY_ADMINS, as it was
accessing .email, not .delivery_email, both for logging and doing the
mapping between email addresses and LDAP users.

Fixes #13539.
2019-12-15 22:59:02 -08:00
Tim Abbott 7ccc8373e2 bugdown: Fix logic for extracting attachment path_id.
In 3892a8afd8, we restructured the
system for managing uploaded files to a much cleaner model where we
just do parsing inside bugdown.

That new model had potentially buggy handling of cases around both
relative URLs and URLS starting with `realm.host`.

We address this by further rewriting the handling of attachments to
avoid regular expressions entirely, instead relying on urllib for
parsing, and having bugdown output `path_id` values, so that there's
no need for any conversions between formats outside bugdowm.

The check_attachment_reference_change function for processing message
updates is significantly simplified in the process.

The new check on the hostname has the side effect of requiring us to
fix some previously weird/buggy test data.

Co-Author-By: Anders Kaseorg <anders@zulipchat.com>
Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>
2019-12-12 20:30:26 -08:00
Anders Kaseorg 8e37862b69 CVE-2019-19775: Close open redirect in thumbnail view.
This closes an open redirect vulnerability, one case of which was
found by Graham Bleaney and Ibrahim Mohamed using Pysa.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 17:29:20 -08:00
Tim Abbott 4901dc3795 url_preview: Fix parsing of open graph tags.
Our open graph parser logic sloppily mixed data obtained by parsing
open graph properties with trusted data set by our oembed parser.

We fix this by consistenly using our explicit whitelist of generic
properties (image, title, and description) in both places where we
interact with open graph properties.  The fixes are redundant with
each other, but doing both helps in making the intent of the code
clearer.

This issue fixed here was originally reported as an XSS vulnerability
in the upcoming Inline URL Previews feature found by Graham Bleaney
and Ibrahim Mohamed using Pysa.  The recent Oembed changes close that
vulnerability, but this change is still worth doing to make the
implementation do what it looks like it does.
2019-12-12 15:24:38 -08:00
Anders Kaseorg faa3ea0b8e oembed: Remove unsound HTML filtering.
The frontend now takes care of confining the HTML.

Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
2019-12-12 15:24:38 -08:00
Tim Abbott e7cf1112c8 notifications: Enable online push notifications by default.
For new user onboarding, it's important for it to be easy to verify
that Zulip's mobile push notifications work without jumping through
hoops or potentially making mistakes.  For that reason, it makes sense
to toggle the notification defaults for new users to the more
aggressive mode (ignoring whether the user is currently actively
online); they can set the more subtle mode if they find that the
notifications are annoying.
2019-12-12 13:04:10 -08:00
Mateusz Mandera 9a42a83e15 streams: Remove get_stream_recipients function and its uses.
With the recipient field being denormalized into the UserProfile and
Streams models, all current uses of get_stream_recipients can be done
more efficiently, by simply checking the .recipient_id attribute on the
appropriate objects.
2019-12-12 12:05:42 -08:00
Mateusz Mandera 01288ede9e recipients: Remove bulk_get_recipients function and its uses.
With the recipient field being denormalized into the UserProfile and
Streams models, all current uses of bulk_get_recipients can be done more
efficient, by simply checking the .recipient_id attribute on the
appropriate objects.
2019-12-12 12:00:13 -08:00
Mateusz Mandera 9995dab095 messages: Save a database query in check_message code path.
The flow in recipient_for_user_profiles previously worked by doing
validation on UserProfile objects (returning a list of IDs), and then
using that data to look up the appropriate Recipient objects.

For the case of sending a private message to another user, the new
UserProfile.recipient column lets us avoid the query to the Recipient
table if we move the step of reducing down to user IDs to only occur
in the Huddle code path.
2019-12-12 11:49:01 -08:00
Mateusz Mandera 4eb629e276 auth: Use config_error instead of JsonableError in remote_user_sso. 2019-12-11 16:40:20 -08:00
Mateusz Mandera e955bfde83 auth: Check that the backend is enabled at the start of remote_user_sso. 2019-12-11 16:35:18 -08:00
Tim Abbott 299896b6ce notifications: Ignore mobile presence when sending notifications.
Previously, if the user had interacted with the Zulip mobile app in
the last ~140 seconds, it's likely the mobile app had sent presence
data to the Zulip server, which in turns means that the Zulip server
might not send that user mobile push notifications (or email
notifications) about new messages for the next few minutes.

The email notifications behavior is potentially desirable, but the
push notifications behavior is definitely not -- a private message
reply to something you sent 2 minutes ago is definitely something you
want a push notification for.

This commit partially addresses that issue, by ignoring presence data
from the ZulipMobile client when determining whether the user is
currently engaging with a Zulip client (essentially, we're only
considering desktop activity as something that predicts the user is
likely to see a desktop notification or is otherwise "online").
2019-12-11 16:05:35 -08:00
Tim Abbott 958f39a551 message_edit: Call check_attachment_reference_change unconditionally.
This removes the last of the messy use of regular expressions outside
bugdown to make decisions on whether a message contains an attachment
or not.  Centralizing questions about links to be decided entirely
within bugdown (rather than doing ad-hoc secondary parsing elsewhere)
makes the system cleaner and more robust.
2019-12-11 11:10:46 -08:00
Rohitt Vashishtha 3fbb050216 messages: Remove dependence on regex for claiming attachments.
This commit wraps up the work to remove basic regex based parsing
of messages to handle attachment claiming/unclaiming. We now use
the more dependable Bugdown processor to find potential links and
only operate upon those links instead of parsing the full message
content again.
2019-12-11 11:03:49 -08:00
Rohitt Vashishtha 3892a8afd8 messages: Set has_attachment correctly using Bugdown.
Previously, we would naively set has_attachment just by searching
the whole messages for strings like `/user_uploads/...`. We now
prevent running do_claim_attachments for messages that obviously
do not have an attachment in them that we previously ran.

For example: attachments in codeblocks or
             attachments that otherwise do not match our link syntax.

The new implementation runs that check on only the urls that
bugdown determines should be rendered. We also refactor some
Attachment tests in test_messages to test this change.

The new method is:

1. Create a list of potential_attachment_urls in Bugdown while rendering.
2. Loop over this list in do_claim_attachments for the actual claiming.
   For saving:
3. If we claimed an attachment, set message.has_attachment to True.
   For updating:
3. If claimed_attachment != message.has_attachment: update has_attachment.

We do not modify the logic for 'unclaiming' attachments when editing.
2019-12-11 11:03:44 -08:00
Rohitt Vashishtha 4674cc5098 bugdown: Set message.has_image while rendering message. 2019-12-11 17:01:41 +05:30
dustinheestand 157c98de99 bugdown: Correctly set has_link attribute on messages.
Now autolinks and message edits affect the has_link attribute on messages.
2019-12-11 17:01:41 +05:30
Ryan Rehman 2589065405 tests: Rename invitor to inviter in test_signup and test_queue_worker.
"Inviter" seems to be preferred for the person who invites an invitee.
2019-12-10 17:22:32 -08:00
Ryan Rehman 6110bf96ca tests: Rename prereg_users to prereg_user in test_events.
This is a typo fix.
2019-12-10 17:21:04 -08:00
Mateusz Mandera 7ee54810a1 auth: Eliminate if/else block for PreregUser handling with/without SSO.
Both branches did very similar things, and the code is better having
common handling in all cases.
2019-12-10 20:16:21 +01:00