This is a preparatory commit for using isort for sorting all of our
imports, merging changes to files where we can easily review the
changes as something we're happy with.
These are also files with relatively little active development, which
means we don't expect much merge conflict risk from these changes.
Previously, if you tried to invite a user whose account had been
deactivated, we didn't provide a clear path forward for reactivating
the users, which was confusing.
We fix this by plumbing through to the frontend the information that
there is an existing user account with that email address in this
organization, but that it's deactivated. For administrators, we
provide a link for how to reactivate the user.
Fixes#8144.
Our recent fixes to using the system's configured memcached settings
broke populate_db, because its hacky clear_database helper is called
with a hacked-up settings module.
We fix this by first moving this out-of-place code from models.py into
populate_db, and then saving the settings required to access memcached
so that we can use them in clear_database.
We also fix a mypy erorr in flush-memcached that matches the same
issue fixed in clear_database.
This experimental setting disables sending private messages in Zulip
in a crude way (i.e. users get an error when they try to send one).
It makes no effort to adjust the UI to avoid advertising the idea of
sending private messages.
Fixes#6617.
We should take adventage of the recipient field being denormalized into
the Stream model. We don't need to make queries to figure out a stream's
recipient id, so we take advantage of that to eliminate some of
those redundant queries and simplify StreamRecipientMap.
This removes zerver/webhooks/trello/view/exceptions.py, which
contained legacy Trello webhook exception related classes. We replace
them with UnexpectedWebhookEventType, which results in our standard
exception handling for unknown event types running (avoiding too-high
priority error logging).
Fixes#13467.
This improves the approach of creating multiple parallel processes by
using subprocess.Popen() instead of run_parallel() and
subprocess.call() while exporting an organization's message
history. This prevents forking twice for individual subprocess.
While this has some performance benefit, the main reason to fix this
is that it fixes an issue with the data export web UI introduced in
run_parallel forks exited).
Fixes#12904.
process_missed_message did nothing other than calling
send_to_missed_message_address with the same arguments, so there's no
reason to have these as separate functions.
Addresses point 1 of #13533.
MissedMessageEmailAddress objects get tied to the specific that was
missed by the user. A useful benefit of that is that email message sent
to that address will handle topic changes - if the message that was
missed gets its topic changed, the email response will get posted under
the new topic, while in the old model it would get posted under the
old topic, which could potentially be confusing.
Migrating redis data to this new model is a bit tricky, so the migration
code has comments explaining some of the compromises made there, and
test_migrations.py tests handling of the various possible cases that
could arise.
Preparatory commit for making the email mirror use the database instead
of redis for missed message addresses.
This model will represent missed message email addresses, which
currently have their data stored in redis.
The redis data will be converted and migrated into these models and
the email mirror will start using them in the main commit.
For cross realm bots, explicitly set bot_owner_id
to None. This makes it clear that the cross realm
bots have no owner, whereas before it could be
misdiagnosed as the server forgetting to set the
field.
Model classes fetched through apps.get_model don't get methods or class
attributes. It's not feasible to add them to all these objects in
use_db_models, but Recipient.PERSONAL etc. are worth setting, since
doing that increases the range of functions that can successfully be
imported and called in test_migrations.py.
These tests had a lot of very repetetive, identical mocking, in some
tests without even doing anything with the mocks. It's cleaner to put
the mock in the one relevant, common place for all the tests that need
it, and remove it from tests who had no use for the mocking.
Fixes#13504.
This commit is purely an improvement in error handling.
We used to not do any validation on keys before passing them to
memcached, which meant for invalid keys, memcached's own key
validation would throw an exception. Unfortunately, the resulting
error messages are super hard to read; the traceback structure doesn't
even show where the call into memcached happened.
In this commit we add validation to all the basic cache_* functions, and
appropriate handling in their callers.
We also add a lot of tests for the new behavior, which has the nice
effect of giving us decent coverage of all these core caching
functions which previously had been primarily tested manually.
If ldap sync is run while ldap is misconfigured, it can end up causing
troublesome deactivations due to not finding users in ldap -
deactivating all users, or deactivating all administrators of a realm,
which then will require manual intervention to reactivate at least one
admin in django shell.
This change prevents such potential troublesome situations which are
overwhelmingly likely to be unintentional. If intentional, --force
option can be used to remove the protection.
This change should prevent test flakes, plus
it's more deterministic behavior for clients,
who will generally comma-join the ids into
a key for their internal data structures.
I was able to verify test coverage on this
by making the sort reversed, which would
cause test_huddle_send_message_events to
fail.
This should be about 4 times faster, saving something like half a
millisecond on each stream of 10000 subscribers.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
QueueProcessingWorker and LoopQueueProcessingWorker are abstract classes
meant to be subclassed by a class that will define its own consume()
or consume_batch() method. ABCs are suited for that and we can tag
consume/consume_batch with the @abstractmethod wrapper which will
prevent subclasses that don't define these methods properly to be
impossible to even instantiate (as opposed to only crashing once
consume() is called). It's also nicely detected by mypy, which will
throw errors such as this on invalid use:
error: Only concrete class can be given where "Type[TestWorker]" is
expected
error: Cannot instantiate abstract class 'TestWorker' with abstract
attribute 'consume'
Due to it being detected by mypy, we can remove the test
test_worker_noconsume which just tested the old version of this -
raising an exception when the unimplemented consume() gets called. Now
it can be handled already on the linter level.
LoopQueueProcessingWorker can handle exceptions inside consume_batch in
a similar manner to how QueueProcessingWorker handles exceptions inside
consume.
Our ldap integration is quite sensitive to misconfigurations, so more
logging is better than less to help debug those issues.
Despite the following docstring on ZulipLDAPException:
"Since this inherits from _LDAPUser.AuthenticationFailed, these will
be caught and logged at debug level inside django-auth-ldap's
authenticate()"
We weren't actually logging anything, because debug level messages were
ignored due to our general logging settings. It is however desirable to
log these errors, as they can prove useful in debugging configuration
problems. The django_auth_ldap logger can get fairly spammy on debug
level, so we delegate ldap logging to a separate file
/var/log/zulip/ldap.log to avoid spamming server.log too much.
A block of LDAP integration code related to data synchronization did
not correctly handle EMAIL_ADDRESS_VISIBILITY_ADMINS, as it was
accessing .email, not .delivery_email, both for logging and doing the
mapping between email addresses and LDAP users.
Fixes#13539.
This simplifies the RDS installation process to avoid awkwardly
requiring running the installer twice, and also is significantly more
robust in handling issues around rerunning the installer.
Finally, the answer for whether dictionaries are missing is available
to Django for future use in warnings/etc. around full-text search not
being great with this configuration, should they be required.
Fixes#13528.
The email_auth_enabled check caused all enabled backends to get
initialized, and thus if LDAP was enabled the check_ldap_config()
check would cause an error if LDAP was misconfigured
(for example missing the new settings).
In 3892a8afd8, we restructured the
system for managing uploaded files to a much cleaner model where we
just do parsing inside bugdown.
That new model had potentially buggy handling of cases around both
relative URLs and URLS starting with `realm.host`.
We address this by further rewriting the handling of attachments to
avoid regular expressions entirely, instead relying on urllib for
parsing, and having bugdown output `path_id` values, so that there's
no need for any conversions between formats outside bugdowm.
The check_attachment_reference_change function for processing message
updates is significantly simplified in the process.
The new check on the hostname has the side effect of requiring us to
fix some previously weird/buggy test data.
Co-Author-By: Anders Kaseorg <anders@zulipchat.com>
Co-Author-By: Rohitt Vashishtha <aero31aero@gmail.com>
This closes an open redirect vulnerability, one case of which was
found by Graham Bleaney and Ibrahim Mohamed using Pysa.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This avoids risk of OOM issues on servers with relatively limited RAM
and millions of messages of history; apparently, fetching all messages
ordered by ID could be quite memory-intensive even with an iterator
usage model.
Fortunately, we have other migrations that already follow this pattern
of iterating over messages, so it's easy to borrow existing code to
make this migration run reasonably.
Our open graph parser logic sloppily mixed data obtained by parsing
open graph properties with trusted data set by our oembed parser.
We fix this by consistenly using our explicit whitelist of generic
properties (image, title, and description) in both places where we
interact with open graph properties. The fixes are redundant with
each other, but doing both helps in making the intent of the code
clearer.
This issue fixed here was originally reported as an XSS vulnerability
in the upcoming Inline URL Previews feature found by Graham Bleaney
and Ibrahim Mohamed using Pysa. The recent Oembed changes close that
vulnerability, but this change is still worth doing to make the
implementation do what it looks like it does.
This fixes a few minor issues with the migration:
* Skips messages with empty rendered_content, fixing an exception that
affected 4 messages on chat.zulip.org.
* Accesses messages in order.
* Provides some basic output on the progress made.
This should make life substantially better for any organizations that
run into trouble with this migration, either due to it taking a long
time to run or due to any new exceptions.
For new user onboarding, it's important for it to be easy to verify
that Zulip's mobile push notifications work without jumping through
hoops or potentially making mistakes. For that reason, it makes sense
to toggle the notification defaults for new users to the more
aggressive mode (ignoring whether the user is currently actively
online); they can set the more subtle mode if they find that the
notifications are annoying.
Previously, these accesses used e.g. .select_related("realm"), which
was the only foreign key on the Stream model. Since the intent in
these code paths is to attach the related models for efficient access,
we should just do that for all related models, including Recipient.
With the recipient field being denormalized into the UserProfile and
Streams models, all current uses of get_stream_recipients can be done
more efficiently, by simply checking the .recipient_id attribute on the
appropriate objects.
With the recipient field being denormalized into the UserProfile and
Streams models, all current uses of bulk_get_recipients can be done more
efficient, by simply checking the .recipient_id attribute on the
appropriate objects.
The flow in recipient_for_user_profiles previously worked by doing
validation on UserProfile objects (returning a list of IDs), and then
using that data to look up the appropriate Recipient objects.
For the case of sending a private message to another user, the new
UserProfile.recipient column lets us avoid the query to the Recipient
table if we move the step of reducing down to user IDs to only occur
in the Huddle code path.
Previously, if the user had interacted with the Zulip mobile app in
the last ~140 seconds, it's likely the mobile app had sent presence
data to the Zulip server, which in turns means that the Zulip server
might not send that user mobile push notifications (or email
notifications) about new messages for the next few minutes.
The email notifications behavior is potentially desirable, but the
push notifications behavior is definitely not -- a private message
reply to something you sent 2 minutes ago is definitely something you
want a push notification for.
This commit partially addresses that issue, by ignoring presence data
from the ZulipMobile client when determining whether the user is
currently engaging with a Zulip client (essentially, we're only
considering desktop activity as something that predicts the user is
likely to see a desktop notification or is otherwise "online").
This removes the last of the messy use of regular expressions outside
bugdown to make decisions on whether a message contains an attachment
or not. Centralizing questions about links to be decided entirely
within bugdown (rather than doing ad-hoc secondary parsing elsewhere)
makes the system cleaner and more robust.
This commit wraps up the work to remove basic regex based parsing
of messages to handle attachment claiming/unclaiming. We now use
the more dependable Bugdown processor to find potential links and
only operate upon those links instead of parsing the full message
content again.
Previously, we would naively set has_attachment just by searching
the whole messages for strings like `/user_uploads/...`. We now
prevent running do_claim_attachments for messages that obviously
do not have an attachment in them that we previously ran.
For example: attachments in codeblocks or
attachments that otherwise do not match our link syntax.
The new implementation runs that check on only the urls that
bugdown determines should be rendered. We also refactor some
Attachment tests in test_messages to test this change.
The new method is:
1. Create a list of potential_attachment_urls in Bugdown while rendering.
2. Loop over this list in do_claim_attachments for the actual claiming.
For saving:
3. If we claimed an attachment, set message.has_attachment to True.
For updating:
3. If claimed_attachment != message.has_attachment: update has_attachment.
We do not modify the logic for 'unclaiming' attachments when editing.
add_a, add_oembed_data and add_embed are only called by
InlineInterestingLinksProcessor and this commit allows
these methods to access self.markdown object.
We register ZulipRemoteUserBackend as an external_authentication_method
to make it show up in the corresponding field in the /server_settings
endpoint.
This also allows rendering its login button together with
Google/Github/etc. leading to us being able to get rid of some of the
code that was handling it as a special case - the js code for plumbing
the "next" value and the special {% if only_sso %} block in login.html.
An additional consequence of the login.html change is that now the
backend will have it button rendered even if it isn't the only backend
enabled on the server.
This commit builds a more complete concept of an "external
authentication method". Our social backends become a special case of an
external authentication method - but these changes don't change the
actual behavior of social backends, they allow having other backends
(that come from python-social-auth and don't use the social backend
pipeline) share useful code that so far only serviced social backends.
Most importantly, this allows having other backends show up in the
external_authentication_methods field of the /server_settings endpoint,
as well as rendering buttons through the same mechanism as we already
did for social backends.
This moves the creation of dictonaries describing the backend for the
API and button rendering code away into a method, that each backend in
this category is responsible for defining.
To register a backend as an external_authentication_method, it should
subclass ExternalAuthMethod and define its dict_representation
classmethod, and finally use the external_auth_method class decorator to
get added to the EXTERNAL_AUTH_METHODS list.
This commit has a side-effect that we also now allow mixed lists,
but they have different syntax from the commonmark implementation
and our marked output. For example, without the closing li tags:
Input Bugdown Marked
-------------------------------------
<ul>
- Hello <li>Hello <ul><li>Hello</ul>
+ World <li>World <ul><li>World
+ Again <li>Again <li>Again</ul>
* And <li>And <ul><li>And
* Again <li>Again <li>Again</ul>
</ul>
The bugdown render is in line with what a user in #13447 requests.
Fixes#13477.
Adds required API and front-end changes to modify and read the
wildcard_mentions_notify field in the Subscription model.
It includes front-end code to add the setting to the user's "manage
streams" page. This setting will be greyed out when a stream is muted.
The PR also includes back-end code to add the setting the initial state of
a subscription.
New automated tests were added for the API, events system and front-end.
In manual testing, we checked that modifying the setting in the front end
persisted the change in the Subscription model. We noticed the notifications
were not behaving exactly as expected in manual testing; see
https://github.com/zulip/zulip/issues/13073#issuecomment-560263081 .
Tweaked by tabbott to fix real-time synchronization issues.
Fixes: #13429.
Previously, get_recent_private_messages could take 100ms-1s to run,
contributing a substantial portion of the total runtime of `/`.
We fix this by taking advantage of the recent denormalization of
personal_recipient into the UserProfile model, allowing us to avoid
the complex join with Recipient that was previously required.
The change that requires additional commentary is the change to the
main, big SQL query:
1. We eliminate UserMessage table from the query, because the condition
m.recipient_id=%(my_recipient_id)d
implies m is a personal message to the user being processed - so joining
with usermessage to check for user_profile_id and flags&2048 (which
checks the message is private) is redundant.
2. We only need to join the Message table with UserProfile
(on sender_id) and get the sender's personal_recipient_id from their
UserProfile row.
Fixes#13437.
This is adds foreign keys to the corresponding Recipient object in the
UserProfile on Stream tables, a denormalization intended to improve
performance as this is a common query.
In the migration for setting the field correctly for existing users,
we do a direct SQL query (because Django 1.11 doesn't provide any good
method for doing it properly in bulk using the ORM.).
A consequence of this change to the model is that a bit of code needs
to be added to the functions responsible for creating new users (to
set the field after the Recipient object gets created). Fortunately,
there's only a few code paths for doing that.
Also an adjustment is needed in the import system - this introduces a
circular relation between Recipient and UserProfile. The field cannot be
set until the Recipient objects have been created, but UserProfiles need
to be created before their corresponding Recipients. We deal with this
by first importing UserProfiles same way as before, but we leave the
personal_recipient field uninitialized. After creating the Recipient
objects, we call a function to set the field for all the imported users
in bulk.
A similar change is made for managing Stream objects.
Fixes#13452.
The migration from UserProfile.is_realm_admin/UserProfile.is_guest in
e10361a832 broke our LDAP-based support
for setting a user's role via LDAP properties, which relied on setting
those fields. Because the django-auth-ldap feature powering that only
supports booleans (and in any case, we don't want to expose constants
like `ROLE_REALM_ADMINISTRATOR` to the LDAP configuration interface),
it makes sense to provide setters for these legacy fields for
backwards-compatibility.
We lint against using these setters directly in Zulip's codebase
directly. The issue with using these is that when changing user's
.role we want to create appropriate RealmAuditLog entries and send
events. This isn't possible when using these setters - the log entries
and events should be created if the role change in the UserProfile is
actually save()-ed to the database - and on the level of the setter
function, it's not known whether the change will indeed be saved.
It would have to be somehow figured out on the level of post_save
signal handlers, but it doesn't seem like a good design to have such
complexity there, for the sake of setters that generally shouldn't be
used anyway - because we prefer the do_change_is_* functions.
The purpose of this change is narrowly to handle use cases like the
setattr on these boolean properties.
We used to specify the securityScheme for each REST operation seperately.
This is unecessary as the securityScheme can be specified in root level
and would be automatically applied to all operations. This also prevents
us accidentally not specifying the securityScheme for some operations and
was the case for /users/me/subscriptions PATCH endpoint. The root level
securityScheme can be also overriden in the operational level when
necessary.
swagger.io/docs/specification/authentication/#security
We use the plumbing introduced in a previous commit, to now raise
PushNotificationBouncerRetryLaterError in send_to_push_bouncer in case
of issues with talking to the bouncer server. That's a better way of
dealing with the errors than the previous approach of returning a
"failed" boolean, which generally wasn't checked in the code anyway and
did nothing.
The PushNotificationBouncerRetryLaterError exception will be nicely
handled by queue processors to retry sending again, and due to being a
JsonableError, it will also communicate the error to API users.
We add PushNotificationBouncerRetryLaterError as an exception to signal
an error occurred when trying to communicate with the bouncer and it
should be retried. We use JsonableError as the base class, because this
signal will need to work in two roles:
1. When the push notification was being issued by the queue worker
PushNotificationsWorker, it will signal to the worker to requeue the
event and try again later.
2. The exception will also possibly be raised (this will be added in the
next commit) on codepaths coming from a request to an API endpoint (for
example to add a token, to users/me/apns_device_token). In that case,
it'll be needed to provide a good error to the API user - and basing
this exception on JsonableError will allow that.
If a message begins with /me, we do not have any cases where the
rendered content would not begin with `<p>/me`. Thus, we can safely
remove the redundant checks both on the backend and frontend.
It appears we forgot to make identical changes to the backend
in #11089 while adding support for multiline /me messages,
resulting in any messages that didn't end in a paragraph getting
rendered as a regular message instead.
Fixes#13454.
In configurations with LDAP_APPEND_DOMAIN, we don't want people creating
non-ldap accounts with emails matching the ldap domain.
So in the registration flow, if the email isn't found in LDAP, but
matches LDAP_APPEND_DOMAIN, we stop, rather than proceeding with account
creation. In case of emails not matching LDAP_APPEND_DOMAIN, we will
still continue to make a normal, non-ldap account.
The problem was that, for example, given a configuration of social
backend + LDAPPopulator, if a user that's not in ldap was being
registered, the Full Name field in the registration form would be
empty instead of getting prefilled with the name provided by the
social backend.
This fixes it - first we try to get the name from ldap. If that
succeeds, a form is created pre-filled with that name. Otherwise, we
proceed to attempt to pre-fill with other means.
This also has a nice side effect of reorganizing most of the logic to
be more parallel between LDAP and other sources of name data.
This is a performance optimization, since we can avoid doing work
related to wildcard mentions in the common case that the message can't
have any. We also add a unit test for adding wildcard mentions in a
message edit.
We also switch the underlying exctact_mention_text method to use
a regular for loop, as well as make the related methods return
tuples of (names, is_wildcard). This abstraction is hidden from the
MentionData callers behind mention_data.message_has_wildcards().
Concerns #13430.
This simple change switches us to take advantage of the
server-maintained data for the pm_conversations system we implemented
originally for mobile use.
This should make it a lot more convenient to find historical private
message conversations, since one can effectively scroll infinitely
into the history.
We'll need to do some profiling of the backend after this is deployed
in production; it's possible we'll need to add some database indexes,
denormalization, or other optimizations to avoid making loading the
Zulip app significantly slower.
Fixes#12502.
Previously, the LDAP code for syncing user data was not
multiple-realm-aware, resulting in errors trying to sync data for an
LDAP user present in multiple realms.
Tweaked by tabbott to add some extended comments.
Fixes#11520.
For a long time, we've been only doing the zxcvbn password strength
checks on the browser, which is helpful, but means users could through
hackery (or a bug in the frontend validation code) manage to set a
too-weak password. We fix this by running our password strength
validation on the backend as well, using python-zxcvbn.
In theory, a bug in python-zxcvbn could result in it producing a
different opinion than the frontend version; if so, it'd be a pretty
bad bug in the library, and hopefully we'd hear about it from users,
report upstream, and get it fixed that way. Alternatively, we can
switch to shelling out to node like we do for KaTeX.
Fixes#6880.
A bug in Zulip's new user signup process meant that users who
registered their account using social authentication (e.g. GitHub or
Google SSO) in an organization that also allows password
authentication could have their personal API key stolen by an
unprivileged attacker, allowing nearly full access to the user's
account.
Zulip versions between 1.7.0 and 2.0.6 were affected.
This commit fixes the original bug and also contains a database
migration to fix any users with corrupt `password` fields in the
database as a result of the bug.
Out of an abundance of caution (and to protect the users of any
installations that delay applying this commit), the migration also
resets the API keys of any users where Zulip's logs cannot prove the
user's API key was not previously stolen via this bug. Resetting
those API keys will be inconvenient for users:
* Users of the Zulip mobile and terminal apps whose API keys are reset
will be logged out and need to login again.
* Users using their personal API keys for any other reason will need
to re-fetch their personal API key.
We discovered this bug internally and don't believe it was disclosed
prior to our publishing it through this commit. Because the algorithm
for determining which users might have been affected is very
conservative, many users who were never at risk will have their API
keys reset by this migration.
To avoid this on self-hosted installations that have always used
e.g. LDAP authentication, we skip resetting API keys on installations
that don't have password authentication enabled. System
administrators on installations that used to have email authentication
enabled, but no longer do, should temporarily enable EmailAuthBackend
before applying this migration.
The migration also records which users had their passwords or API keys
reset in the usual RealmAuditLog table.
The expected signatures for these callbacks seem to have changed
somewhere in https://github.com/pika/pika/pull/1002.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This change makes it possible for users to control the notification
settings for wildcard mentions as a separate control from PMs and
direct @-mentions.
This includes adding a new endpoint to the push notification bouncer
interface, and code to call it appropriately after resetting a user's
personal API key.
When we add support for a user having multiple API keys, we may need
to add an additional key here to support removing keys associated with
just one client.
This allows us to email sets of users on a server with a nicely
formatted email similar to our onboarding emails, built off of a
Markdown template.
The code was based on send_password_reset_email, but it doesn't
replace that use case, since one cannot include special values like
password reset tokens in these emails.
We move the check that the user is a member or admin inot this
decorator.
This name better communicates that this may do other checks beyond
just verifying the policy.
Curl examples generated in test_generated_curl_examples_for_success
used to be html escaped. This commit removes the escaping in the test
since curl examples are not html escaped when run from terminal.
We'll be soon documenting a production workflow that involves using
it, and that means it needs to live under scripts/ (since tools/ isn't
present in release tarballs).
This is essentially an assertion failure code path, so it doesn't
really matter, but it seems best to use the value that's the cause of
the problem here.
Eventually, we'll want to replace emails with user IDs here entirely,
but until we make that happen, we should at least use the same email
address present in our other logging.
I think we won't miss updating these in a future migration thanks to
mypy types.
Since years ago, this field hasn't been used for anything other than
some logging that would be better off logging the user ID anyway.
It existed in the first place simply because we weren't passing the
user_profile_id to Tornado at all.
For organizations with EMAIL_ADDRESS_VISIBILITY_ADMINS, we were using
the wrong email address in the notice telling the user how to login in
the future.
Since we implemented EMAIL_ADDRESS_VISIBILITY_ADMINS, the intent is
that `delivery_only` should be used for accessing a user's actual
email address; with `email` used only in the Zulip API where we
haven't migrated to interacting with other users by ID.
This fixes a place we neglected to migrate.
Then, find and fix a predictable number of previous misuses.
With a small change by tabbott to preserve backwards compatibility for
sending `yes` for the `forged` field.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
The original/legacy emoji reactions endpoints made use of HTTP PUT and
didn't have an API that could correctly handle situations where the
emoji names change over time. We stopped using the legacy endpoints
some time ago, so we can remove them now.
This requires straightforward updates to older tests that were still
written against the legacy API.
Fixes#12940.
The function only used the user's realm anyway, so this is a cleaner
API.
This should also make it more convenient to permanently delete
messages manually, since one doesn't have to fetch a random user in
the realm in order to delete a message using the management shell.
No functional change.
This fixes two regressions in 1946692f9a.
The first bug was actually introduced much earlier, namely that we
were not sending a `bot_owner_id` field at all for bot users without
an owner. The correct behavior would have been send `None` for the
owner field.
The second bug was simply that we needed to update the webapp to look
for the `bot_owner_id` field, rather than an old email-address format
`bot_owner` field.
Thanks to Vinit Singh for reporting this bug.
When creating realm with the ldap backend, the registration flow didn't
properly handle some things - the user wouldn't be set as realm admin,
initial subscriptions and messages weren't created, and the redirect
wasn't happening properly in the case of subdomains.
The state of the FAKELDAP setup for the dev env has fallen behind the
backend changes and updates to fakeldap (which implemented
SCOPE_ONELEVEL searches), as well as having some other minor issues.
This commit restore it to a working state and now all three config modes
work properly.
This makes it possible to simlulate messages sent by specific clients,
rather than just "test suite". Relevant for sending messages where
`message.sent_by_human()` is True.
Rather than subtracting sets in multiple places, it's simpler/cleaner
to just check which users are in the set when processing them.
This refactoring be helpful when we extend the get_recipient_info
logic to handle wildcard mentions as well.
django_to_ldap_username is now able to find the correct ldap username in
every supported type of configuration, so we can remove these
conditionals and use django_to_ldap_username in a straight-forward
manner.
Having to account everywhere for both cases of having and not
having email search configured makes things needlessly complicated.
It's better to make the setting obligatory in configurations other than
LDAP_APPEND_DOMAIN.
Previously, we were using user_profile.email rather than
user_profile.delivery_email in all calculations involving Gravatar
URLs, which meant that all organizations with the new
EMAIL_ADDRESS_VISIBILITY_ADMINS setting enabled had useless gravatars
not based on the `user15@host.domain` type fake email addresses we
generate for the API to refer to users.
The fix is to convert these calculations to use the user's
delivery_email. Some refactoring is required to ensure the data is
passed through to the parts of the codebase that do the check;
fortunately, our automated tests of schemas are effective in verifying
that the new `sender_delivery_email` field isn't visible to the API.
Fixes#13369.
Previously, we weren't properly passing through the value of the
client_gravatar flag from the caller, resulting in buggy results if
the caller passed client_gravatar=False to do_test().
We happened to not have any uses of this before, but we're about to
add one.
Apparently, the refactor months ago that introduced finalize_payload
wasn't applied to the outgoing webhook code path, resulting in message
dicts with an unexpected format with no avatar_url and some extra
values that were intended to be internal details not relevant to
external clients.
Because this API is not widely used, we expect there to be little to
no impact of converting this back to matching the `get_messages`
interface, as it once was and has always been intended to be.
The one somewhat tricky detail is that we include both the `content`
and `rendered_content` fields, rather than asking the client to pick
which they want via the `apply_markdown` flag, because there is no
place for the client to configure that setting.
Previously, we skipped setting the list of subscribers to the channel,
which could result in problems if any messages had been posted there
in the past (e.g. because the channel used to have members, but now
doesn't). It could be correct to skip importing dead channels
altogether, but probably simpler is to just set an empty subscriber list.
Previously, our logic to handle Mattermost's "replies" feature didn't
copy the right fields for private messages, where `channel_members` is
included on the message body rather than a `channel` name.
As discussed in the comment, ideally these checks should be added
completely automatically, rather than needing to be manually added
every time we add a new setting. But hopefully the example code for
all of the similar enums that this provides will at least provide some
help.
By adding some additional plumbing (through PreregistrationUser) of the
full_name and an additional full_name_validated option, we
pre-populate the Full Name field in the registration form when coming
through a social backend (google/github/saml/etc.) and potentially skip
the registration form (if the user would have nothing to do there other
than clicking the Confirm button) and just create the account and log
the user in.
The main purpose of this is to make that name change happen in
/server_settings. external_authentication_methods is a much better, more
descriptive name than social_backends from API perspective.
These are returned through the API, at the /server_settings
endpoint. It's better to just return the list of dicts with a guarantee
of being sorted in the correct order, than to clutter things with the
sort_order field.
This small block of code was over-indented. It should be run in this
part of the function unconditionally, not inside an "else" block.
We obviously want it to run regardless of whether
request.POST.get('from_confirmation')
is True or not.
Needed so that the google entry in social_backends in /server_settings
shows the new url rather than the legacy accounts/login/google/ url as
the login url.
This legacy endpoint was designed for the original native Zulip mobile
apps, which were deprecated years ago in favor of the React Native
app.
It was replaced by /server_settings for active use years ago, so it's
safe to remove it now.
The code comment explains this issue in some detail, but essentially
in Kubernetes and Docker Swarm systems, the container overlayer
network has a relatively short TCP idle lifetime (about 15 minutes),
which can lead to it killing the connection between Tornado and
RabbitMQ.
We fix this by setting a TCP keepalive on that connection shorter than
15 minutes.
Fixes#10776.
This is following the change to the /users endpoint where we allow
an optional parameter "include_custom_profile_fields" which would
allow the client to request for users' custom profile fields along
with their other standard data.
The previous example no longer gives a good enough idea of what the user
can expect when the `include_custom_profile_fields` boolean parameter is
set to true.
The url scheme is now /accounts/login/social/saml/{idp_name} to initiate
login using the IdP configured under "idp_name" name.
display_name and display_logo (the name and icon to show on the "Log in
with" button) can be customized by adding the apprioprate settings in
the configured IdP dictionaries.
login_context now gets the social_backends list through
get_social_backend_dicts and we move display_logo customization
to backend class definition.
This prepares for easily adding multiple IdP support in SAML
authentication - there will be a social_backend dict for each configured
IdP, also allowing display_name and icon customization per IdP.
This changes the way django_to_ldap_username works to make sure the ldap
username it returns actually has a corresponding ldap entry and raise an
exception if that's not possible. It seems to be a more sound approach
than just having it return its best guess - which was the case so far.
Now there is a guarantee that what it returns is the username of an
actual ldap user.
This allows communicating to the registration flow when the email being
registered doesn't belong to ldap, which then will proceed to register
it via the normal email backend flow - finally fixing the bug where you
couldn't register a non-ldap email even with the email backend enabled.
These changes to the behavior of django_to_ldap_username require small
refactorings in a couple of other functions that call it, as well as
adapting some tests to these changes. Finally, additional tests are
added for the above-mentioned registration flow behavior and some
related corner-cases.
Instead of mocking the _LDAPUser class, these tests can now take
advantage of the test directory that other ldap are using. After these
changes, test_query_email_attr also verifies that query_ldap can
successfully be used to query by user email, if email search is
configured.
Fixes#11878
Instead of a confusing mix of django_auth_backed applying
ldap_to_django_username in its internals for one part of the
translation, and then custom logic for grabbing it from the email
attribute of the ldapuser in ZulipLDAPAuthBackend.get_or_build_user
for the second part of the translation,
we put all the logic in a single function user_email_from_ldapuser
which will be used by get_or_build of both ZulipLDAPUserPopulator and
ZulipLDAPAuthBackend.
This, building on the previous commits with the email search feature,
fixes the ldap sync bug from issue #11878.
If we can get upstream django-auth-ldap to merge
https://github.com/django-auth-ldap/django-auth-ldap/pull/154, we'll
be able to go back to using the version of ldap_to_django_username
that accepts a _LDAPUser object.
With this, django_to_ldap_username can take an email and find the ldap
username of the ldap user who has this email - if email search is
configured.
This allows successful authenticate() with ldap email and ldap password,
instead of ldap username. This is especially useful because when
a user wants to fetch their api key, the server attempts authenticate
with user_profile.email - and this used to fail if the user was an ldap
user (because the ldap username was required to authenticate
succesfully). See issue #9277.
This fixes a collection of bugs surrounding LDAP configurations A and
C (i.e. LDAP_APPEND_DOMAIN=None) with EmailAuthBackend also enabled.
The core problem was that our desired security model in that setting
of requiring LDAP authentication for accounts managed by LDAP was not
implementable without a way to
Now admins can configure an LDAPSearch query that will find if there
are users in LDAP that have the email address and
email_belongs_to_ldap() will take advantage of that - no longer
returning True in response to all requests and thus blocking email
backend authentication.
In the documentation, we describe this as mandatory configuration for
users (and likely will make it so soon in the code) because the
failure modes for this not being configured are confusing.
But making that change is pending work to improve the relevant error
messages.
Fixes#11715.
The value of realm attribute in confirmation object used to be empty
before. We are not currently using the realm attribute of reactivation
links anywhere. The value of realm stored in content_object is currently
used.
We currently have code to calculate the value of realm_icon_url,
admin_emails and default_discount in two diffrent places. With
the addition of showing confirmation links it would become three.
The easiest way to deduplicate the code and make the view cleaner
is by doing the calculations in template. Alternatively one can
write a function that takes users, realms and confirmations as
arguments and sets the value of realm_icon_url, admin_emails and
default_discount appropriately in realm object according to the
type of the confirmation. But that seems more messy than passing
the functions directly to template approach.
Most of the failures were due to parameters that are not intended to
be used by third-party code, so the correct fix for those was the set
intentionally_undocumented=True.
Fixes#12969.
MigrationsTestCase is intentionally omitted from this, since migrations
tests are different in their nature and so whatever setUp()
ZulipTestCase may do in the future, MigrationsTestCase may not
necessarily want to replicate.
new_name and description params should be valid JSON
strings. The format of these params are marked as
json so that the curl example genenrator can convert
them into json strings.
Previously, the logic for determining whether to provide an LDAP
password prompt on the registration page was incorrectly including it
if any LDAP authentication was backend enabled, even if LDAP was
configured with the populate-only backend that is not responsible for
authentication (just for filling in name and custom profile fields).
We fix this by correcting the conditional, and add a test.
There's still follow-up work to do here: We may still end up
presenting a registration form in situations where it's useless
because we got all the data from SAML + LDAP. But that's for a future
issue.
This fixes a bug reported in #13275.
This is a follow-up to b69213808a.
We now actually send messages from the notification_bot, which
is the real usecase for this code.
Also, this cleans up the code and removes needless asserts like
`assertNotEqual(zulip_realm, lear_realm)` making the test easier
to read.
A confirmation object is already created when
do_send_confirmation_email is called just above.
Tweaked by tabbott to remove an unnecessary somewhat hacky database
query.
There are a few outstanding issues that we expect to resolve beforce
including this in a release, but this is good checkpoint to merge.
This PR is a collaboration with Tim Abbott.
Fixes#716.
Priviously, we rendered the topic links using the msg.sender.realm.
This resulted in issues with Zulip's internal bots not having access
to the realm_filters of the destination stream's realm. For example,
sending a message via the email gateway or notification would not
linkify any realm filters that a user would expect them to.
Even though required attribute of stream and stream_id params is marked
false in openapi specification, the API expects atleast one of the
params to be set. There is no way to specify relationships like this
openapi and they dont seem to have any plan to implement this in future.
https://github.com/OAI/OpenAPI-Specification/issues/256
This limit was introduced in c588c79 as a part of the
feature and not due to performance crisis. So we are
increasing this limit to 7 days. Since topics tends to
naturally fizzle after day or two so 7 days limit
would be good enough.
One small change in behavior is that this creates an array with all the
row_objects at once, rather than creating them 1000 at a time.
That should be fine, given that the client batches these in units of
10000 anyway, and so we're just creating 10K rows of a relatively
small data structure in Python code here.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
1. Apply migration 0243 to add date_sent column.
2. Apply migration 0244 to copy pub_date over to date_sent. Can be done
with the server running.
3. With the server down (for consistency between memory and
database state of Django objects), verify consistency with
Message.objects.exclude(date_sent=F("pub_date")).count() == 0
Apparently, our change in b8a1050fc4 to
stop caching responses on API endpoints accidentally ended up
affecting uploaded files as well.
Fix this by explicitly setting a Cache-Control header in our Sendfile
responses, as well as changing our outer API caching code to only set
the never cache headers if the view function didn't explicitly specify
them itself.
This is not directly related to #13088, as that is a similar issue
with the S3 backend.
Thanks to Gert Burger for the report.
The previous code for ensuring the sort order of emoji choices was
correct relied on an OrderedDict structure, which isn't guaranteed to
be preserved when passed to the frontend via JSON (in fact, it isn't,
since we converted the way page_params is passed to use
sort_keys=True). Switch it to a list of dictionaries to correct this.
Fixes#13220.
Previously, we were hardcoding the domain s3.amazonaws.com. Given
that we already have an interface for configuring the host in
/etc/zulip/boto.cfg (which in turn, automatically configures boto), we
just need to actually use the value configured in boto for what S3
hostname to use.
We don't have tests for this new use case, in part because they're
likely annoying to write with `moto` and there hasn't been a huge
amount of demand for it. Since this doesn't regress existing S3
backend support, it seems worth merging.
This patches an issue in f37535044 where we mistakenly tried to send
the function as part of the page_params. Instead, we should just try
to send the list of configuration options (in their user displayable
form).
This change adds the OpenAPI data needed to document the POST and
DELETE methods associated with this endpoint.
Descriptions edited slightly by tabbott.
Apparently, the Zulip notifications (and resulting emails) were
correct, but the download links inside the Zulip UI were incorrectly
not including S3 prefix on the URL, making them not work.
While we're at this, we rewrite the somewhat convoluted previous
system for formatting the data export output.
When using our EMAIL_ADDRESS_VISIBILITY_ADMINS feature, we were
apparently creating bot users with different email and delivery_email
properties, due to effectively an oversight in how the code was
written (the initial migration handled bots correctly, but not bots
created after the transition).
Following the refactor in the last commit, the fix for this is just
adding the missing conditional, a test, and a database migration to
fix any incorrectly created bots leaked previously.
This is also a useful preparatory refactor for having a user setting
controlling whether one's own email address is publicly available
within the organization.
This should dramatically improve the queue processor's performance in
cases where there's a very high volume of requests on a given endpoint
by a given user, as described in the new docstring.
Until we test this more broadly in production, we won't know if this
is a full solution to the problem, but I think it's likely. We've
never seen the UserActivityInterval worker end up backlogged without a
total queue processor outage, and it should have a similar workload.
Fixes#13180.
We don't actually need to go to the memcached (falling back to the
database) to fetch either user or client objects on every event. For
user objects, we actually can just pass through the user ID
transparently; for client objects, we can use an in-process cache,
since the mapping of string to ID never changes.
With the way these tests are, it's unnecessary to have 3 separate
classes, and it makes it confusing to decide where to add a potential
additional mm email test.
This simple backwards-compatible change saves approximately 12% in the
compressed size of the chat.zulip.org page_params. We can do much,
much better by changing the format, but this seems like a good
intermediate step.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
In a gigantic realm where we send several MB of `page_params`, it’s
slightly better to have the rest of the `<body>` available to the
browser earlier, so it can show the “Loading…” spinner and start
fetching subresources.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
any_oauth_backend_enabled is all about whether we will have extra
buttons on the login/register pages for logging in with some non-native
backends (like Github, Google etc.). And this isn't about specifically
oauth backends, but generally "social" backends - that may not rely
specifically rely on Oauth. This will have more concrete relevance when
SAML authentication is added - which will be a "social" backend,
requiring an additional button, but not Oauth-based.
This sidesteps tricky escaping issues, and will make it easier to
build a strict Content-Security-Policy.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This sidesteps tricky escaping issues, and will make it easier to
build a strict Content-Security-Policy.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Previously, incorrectly passing an existing directory to the
`manage.py export --output` option would remove its contents without
warning. Abort instead.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
* Whitelist a small number of image/ types to be served as
non-attachments.
* Serve the file using the type that we validated rather than relying
on an independent guess to match.
This issue can lead to a stored XSS security vulnerability for older
browsers that don't support Content-Security-Policy.
It primarily affects servers using Zulip's local file uploads backend
for servers running Ubuntu 16.04 Xenial or newer; the legacy local
file upload backend for (now EOL) Ubuntu 14.04 Trusty was not affected
and it has limited impact for the S3 upload backend (which uses an
unprivileged S3 bucket domain to serve files).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This brings us in line, and also allows us to style these more like
unordered lists, which is visually more appealing.
On the backend, we now use the default list blockprocessor + sane list
extension of python-markdown to get proper list markup; on the
frontend, we mostly return to upstream's code as they have followed
CommonMark on this issue.
Using <ol> here necessarily removes the behaviour of not renumbering
on lists written like 3, 4, 7; hopefully users will be OK with the
change.
Fixes#12822.
Also cleans up the interface between the management command and the
LDAP backends code to not guess/recompute under what circumstances
what should be logged.
Co-authored-by: mateuszmandera <mateusz.mandera@protonmail.com>
The order of operations for our LDAP synchronization code wasn't
correct: We would run the code to sync avatars (etc.) even for
deactivated users.
Thanks to niels for the report.
Co-authored-by: mateuszmandera <mateusz.mandera@protonmail.com>
Fixes#13130.
django_auth_ldap doesn't give any other way of detecting that LDAPError
happened other than catching the signal it emits - so we have to
register a receiver. In the receiver we just raise our own Exception
which will properly propagate without being silenced by
django_auth_ldap. This will stop execution before the user gets
deactivated.
So the reason 38f8cf612c seems
to be flaking is because the value of harry id switches between
1 and 2 in Xenial while in Bionic it would be fixed at 2. The
reason behind this is that Bionic ships with Python3.6 which
preserves dict insert order while Python3.5 that ships with Xenial
dont preserve the order. In initialize_stream_membership_dicts
we iterate user_data_map dict and the order in which the iteration
happens affects the ID of the users.
Papertrail sends requests with the content type
`application/x-www-form-urlencoded`, with the payload parameter holding the
JSON body. This commit fixes the papertrail integration to use the payload
parameter in the request's POST data instead of trying to parse the
request's entire body as JSON.
Papertrail documentation here:
https://help.papertrailapp.com/kb/how-it-works/web-hooks#encoding
We have a very useful piece of code, _RateLimitFilter, which is
designed to avoid sending us a billion error emails in the event that
a Zulip production server is down in a way that throws the same
exception a lot. The code uses memcached to ensure we send each
traceback roughly once per Zulip server per 10 minutes (or if
memcached is unavailable, at most 1/process/10 minutes, since we use
memcached to coordinate between processes)
However, if memcached is down, there is a logging.error call internal
to the Django/memcached setup that happens inside the cache.set() call,
and those aren't caught by the `except Exception` block around it.
This ends up resulting in infinite recursion, eventually leading to
Fatal Python error: Cannot recover from stack overflow., since this
handler is configured to run for logging.error in addition to
logging.exception.
We fix this using a thread-local variable to detect whether we are
being called recursively.
This change should prevent some nasty failure modes we've had in the
past where memcached being down resulted in infinite recursion
(resulting in extra resources being consumed by our error
notifications code, and most importantly, the error notifications not
being sent).
Fixes#12595.
There's no reason for this to be a category of error that emails the
server administrator, since there's a good chance that fixing it will
need to be done in the Zulip codebase, not administrator action.
Fixes#9401.
This adds a FAKE_EMAIL_DOMAIN setting, which should be used if
EXTERNAL_HOST is not a valid domain, and something else is needed to
form bot and dummy user emails (if email visibility is turned off).
It defaults to EXTERNAL_HOST.
get_fake_email_domain() should be used to get this value. It validates
that it's correctly set - that it can be used to form valid emails.
If it's not set correctly, an exception is raised. This is the right
approach, because it's undesirable to have the server seemingly
peacefully operating with that setting misconfigured, as that could
mask some hidden sneaky bugs due to UserProfiles with invalid emails,
which would blow up the moment some code that does validate the emails
is called.
Apparently, due to poor naming of the outer capture group we use to
separate the actual match from the surrounding whitespace (etc.) we
use to determine if the syntax is a possible linkifier start/end, if
you created a linkifier using "name" as the capture group, we'd try to
compile a pattern with two capture groups called "name", which would
500, preventing anyone from accessing the organization.
We’re about to start using PostgreSQL-specific syntax that can’t be
stringified without a specified dialect.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This requires part 1 (which can take hours to run but generally
doesn't require downtime) to be completed first.
This portion of the migration will require the server to be completely
down for a brief period; for chat.zulip.org with 250M UserMessage
rows, it took about 60s to run; that time will vary depending on
hardware details like whether the server has an SSD, but fundamentally
shouldn't be long.
Our upgrade-zulip and upgrade-zulip-from-git tools can apply this
migration correctly; nothing special needs to be done.
Fixes#13040.
As part of adding support for more than 2B UserMessage rows in a Zulip
server, we need to change UserMessage.id (a field we don't access but
is needed by Django) from an int to a bigint. This commit is a series
of migrations which create a `bigint_id` column and populates it correctly.
This migration will take a long time to run; on chat.zulip.org (a
server with a lot of history), it took about 4 hours to complete.
How to migrate with minimal downtime:
1. Run `upgrade-zulip-from-git` through this commit. It will install
migration 0238 and then more or less hang while applying migration
0239. Once migration 0238 is completed, however, your server should
be able to be started back up safely while migration 0239 is running.
2. Run `/home/zulip/deployments/next/scripts/restart-server` in a
separate terminal to get Zulip running again.
3. When the `upgrade-zulip-from-git` command finishes, it will
automatically re-restart the Zulip server, leaving you in a consistent
state and ready to do part 2 of the migration.
A useful `manage.py shell` query for checking the state after this
commit is consistent is this:
assert UserMessage.objects.exclude(bigint_id=F("id")).count() == 0
Part of #13040.
Previously, several of our URL patterns accidentally did not end with
`$`, and thus ended up controlling just the stated URL, but actually a
much broader set of URLs starting with it.
I did an audit and fixed what I believe are all instances of this URL
pattern behavior. In the process, I fixed a few tests that were
unintentionally relying on the behavior.
Fixes#13082.
Historically, Zulip's implementation of wildcard mentions never
triggered either email or push notifications, instead being limited to
desktop notifications and the "mentions" counter.
We fix this just by plumbing the "wildcard_mentioned" flag through our
system.
Implements much of
https://github.com/zulip/zulip/issues/6040#issuecomment-510157264.
We're also now ready to seriously work on #3750.
We added default ToS for the development environment a few months
back; as a side effect, we now need to accept ToS when going through
the development environment registration flow, including for our
one-click account creation buttons.
After a new user joins an active organization, it isn't obvious what
to do next; this change causes there to be recent unread messages in
the stream sidebar for the user to click on to get a feel for what's
happening in the organization and experiment with Zulip.
Fixes#6512.
This commit wraps up the major work that we held back when upgrading
py-markdown 2.6.11 to 3.0.1. Since we were making our custom changes
to the link syntax, at the time we stuck to using the old method of
parsing links. This lays the groundwork for further changes to our
link and image link handling, and brings us on par with upstream.
Also, we now better document the ways in which our link handling is
different from upstream.
Previously, the unread_msgs data structure accounting (used for both
the web and mobile apps to determine the "Unread mentions" count
displayed in the UI) did not include wildcard mentions at all.
We fix this by adding the logic required to include properly that
data, with tests. As discussed in #6040, it makes sense to include
muted streams and topics for the purpose of this calculation.
Fixes part of #6040.
Apparently, get_active_presence_idle_user_ids, which is carefully
optimized to only fetch data for users who might actually need
notification processing, was only considering PMs and direct mentions,
not wildcard mentions or alert words.
This caused some pretty weird failure modes when working on adding
support for broader mention notifications, because users who had one
of these types of notifications would be treated as never
presence-idle, which was just confusing.
This is part of adding support for notifications for wildcard mentions
and alert words; it's worth merging this as an early commit because
the consequence of not doing this are very difficult to debug.
Rather than continually resetting the contents of an existing event
queue, we allocate a new one for each subtest.
We also fix a rather confusing bundle of comments.