In development env, we use `get_secret` to get
`SOCIAL_AUTH_GITLAB_KEY` from `dev-secrets.conf`. But in
production env, we don't need this as we ask the user
to set that value in `prod_settings_template.py`.
This restricts the code from looking `zulip-secrets.conf`
for `social_auth_gitlab_key` in production env.
type().__name__ is sufficient, and much readable than type(), so it's
better to use the former for keys.
We also make the classes consistent in forming the keys in the format
type(self).__name__:identifier and adjust logger.warning and statsd to
take advantage of that and simply log the key().
The previous model for GitHub authentication was as follows:
* If the user has only one verified email address, we'll generally just log them in to that account
* If the user has multiple verified email addresses, we will always
prompt them to pick which one to use, with the one registered as
"primary" in GitHub listed at the top.
This change fixes the situation for users going through a "login" flow
(not registration) where exactly one of the emails has an account in
the Zulip oragnization -- they should just be logged in.
Fixes part of #12638.
URLs for config errors were configured seperately for each error
which is better handled by having error name as argument in URL.
A new view `config_error_view` is added containing context for
each error that returns `config_error` page with the relevant
context.
Also fixed tests and some views in `auth.py` to be consistent with
changes.
We had a bunch of ugly hacks to monkey patch things due to upstream
being temporarily unmaintained and not merging PRs. Now the project is
active again and the fixes have been merged and included in the latest
version - so we clean up all that code.
We plan to use these records to check and record the schema of Zulip's
events for the purposes of API documentation.
Based on an original messier commit by tabbott.
In theory, a nicer version of this would be able to work directly off
the mypy type system, but this will be good enough for our use case.
Now called:
validate_email_not_already_in_realm
We have a separate validation function that
makes sure that the email fits into a realm's
domain scheme, and we want to avoid naming
confusion here.
We need to request access to read:org scope to be able to check org/team
membership. Without it SOCIAL_AUTH_GITHUB_ORG_NAME and
SOCIAL_AUTH_GITHUB_TEAM_ID settings don't work and simply lead to all
auth attempts failing.
Tested manually.
isort 5 knows not to reorder imports across function calls, so this
will stop isort from breaking our code.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
finish_desktop_flow is called with the assumption that the request
successfully proved control over the user_profile and generates a
special link to log into the user_profile account. There's no reason to
pass the realm param, as user_profile.realm can be assumed.
Extend the context dictionary with variables `social_backend_name`
and `backend_error` flag which determines if the error should be
shown. Not extended this for ldap, smtp and saml as they have a
different format of block.
SOCIAL_AUTH_SAML_SECURITY_CONFIG["authnRequestsSigned"] override in
settings.py in a previous commit wouldn't work on servers old enough to
not have the SAML settings in their settings.py - due to
SOCIAL_AUTH_SAML_SECURITY_CONFIG being undefined.
This commit fixes that.
Original idea was that KeyError was only going to happen there in case
of user passing bad input params to the endpoint, so logging a generic
message seemed sufficient. But this can also happen in case of
misconfiguration, so it's worth logging more info as it may help in
debugging the configuration.
Profiling suggests this saves about 600us in the runtime of every GET
/events request attempting to resolve URLs to determine whether we
need to do the APPEND_SLASH behavior.
It's possible that we end up doing the same URL resolution work later
and we're just moving around some runtime, but I think even if we do,
Django probably doesn't do any fancy caching that would mean doing
this query twice doesn't just do twice the work.
In any case, we probably want to extend this behavior to our whole API
because the APPEND_SLASH redirect behavior is essentially a bug there.
That is a more involved refactor, however.
The comment explains this issue, but effectively, the upgrade to
Django 2.x means that Django's built-in django.request logger was
writing to our errors logs WARNING-level data for every 404 and 400
error. We don't consider user errors to be a problem worth
highlighting in that log file.
Django 2.2.x is the next LTS release after Django 1.11.x; I expect
we'll be on it for a while, as Django 3.x won't have an LTS release
series out for a while.
Because of upstream API changes in Django, this commit includes
several changes beyond requirements and:
* urls: django.urls.resolvers.RegexURLPattern has been replaced by
django.urls.resolvers.URLPattern; affects OpenAPI code and related
features which re-parse Django's internals.
https://code.djangoproject.com/ticket/28593
* test_runner: Change number to suffix. Django changed the name in this
ticket: https://code.djangoproject.com/ticket/28578
* Delete now-unnecessary SameSite cookie code (it's now the default).
* forms: urlsafe_base64_encode returns string in Django 2.2.
https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode
* upload: Django's File.size property replaces _get_size().
https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/
* process_queue: Migrate to new autoreload API.
* test_messages: Add an extra query caused by .refresh_from_db() losing
the .select_related() on the Realm object.
* session: Sync SessionHostDomainMiddleware with Django 2.2.
There's a lot more we can do to take advantage of the new release;
this is tracked in #11341.
Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera squashed
are squashed into this commit.
Fixes#10835.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
This makes it possible to create a Zulip account from the mobile or
desktop apps and have the end result be that the user is logged in on
their mobile device.
We may need small changes in the desktop and/or mobile apps to support
this.
Closes#10859.
This adds a new API endpoint for querying basic data on a single other
user in the organization, reusing the existing infrastructure (and
view function!) for getting data on all users in an organization.
Fixes#12277.
https://github.com/Bouke/django-two-factor-auth/issues/297
This setting was added in 1.9 version of the app and can be used
harmleslly in our current Django 1.11-based code and will prevent an
error on Django 2.1+ when we move there.
This is required for our migration to Django 2.2. authenticate()
definitions need to have that starting with Django 2.1.
rate_limit_auth needs to be adjusted to expect the request in the first
positional argument instead of a kwarg.
This makes the state cleaner for the tests. Tests that want to have rate
limiting set up their own desired rules anyway, and having some
pre-existing ones from the default settings can conflict with the
desired ones.
This applies rate limiting (through a decorator) of authenticate()
functions in the Email and LDAP backends - because those are the ones
where we check user's password.
The limiting is based on the username that the authentication is
attempted for - more than X attempts in Y minutes to a username is not
permitted.
If the limit is exceeded, RateLimited exception will be raised - this
can be either handled in a custom way by the code that calls
authenticate(), or it will be handled by RateLimitMiddleware and return
a json_error as the response.
validate_otp_params needs to be moved to backends.py, because as of this
commit it'll be used both there and in views.auth - and import from
views.auth to backends.py causes circular import issue.
Tests require adjusting, because the class-based view has an additional
redirect - through /uid/set-password/ and the token is read from the
session. See Django code of PasswordResetConfirmView.
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".
At the time, we just needed a name quickly, but it was never a good
name, just a placeholder. This removes that placeholder name from
much of the codebase. A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
This legacy cross-realm bot hasn't been used in several years, as far
as I know. If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.
Fixes#13533.
Because of how login_or_register_remote_user code is structured, this
doesn't change how the flow will go, but it's not a clean use of
login_or_register_remote_user to call it with is_signup=True if sign up
shouldn't actually happen - and may be fragile when refactoring
login_or_register_remote_user.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
With LDAP authentication, we don't currently have a good way to
support the default stream groups feature.
The old behavior was just to assume a user select every default stream
group, which seems wrong; since we didn't prompt the user about these,
we should just ignore the feature.
These are leftovers from where we had default settings in the
settings.py file. Now that the files are separate those references to
"below" are not correct.
Our ldap integration is quite sensitive to misconfigurations, so more
logging is better than less to help debug those issues.
Despite the following docstring on ZulipLDAPException:
"Since this inherits from _LDAPUser.AuthenticationFailed, these will
be caught and logged at debug level inside django-auth-ldap's
authenticate()"
We weren't actually logging anything, because debug level messages were
ignored due to our general logging settings. It is however desirable to
log these errors, as they can prove useful in debugging configuration
problems. The django_auth_ldap logger can get fairly spammy on debug
level, so we delegate ldap logging to a separate file
/var/log/zulip/ldap.log to avoid spamming server.log too much.
A block of LDAP integration code related to data synchronization did
not correctly handle EMAIL_ADDRESS_VISIBILITY_ADMINS, as it was
accessing .email, not .delivery_email, both for logging and doing the
mapping between email addresses and LDAP users.
Fixes#13539.
This simplifies the RDS installation process to avoid awkwardly
requiring running the installer twice, and also is significantly more
robust in handling issues around rerunning the installer.
Finally, the answer for whether dictionaries are missing is available
to Django for future use in warnings/etc. around full-text search not
being great with this configuration, should they be required.
We register ZulipRemoteUserBackend as an external_authentication_method
to make it show up in the corresponding field in the /server_settings
endpoint.
This also allows rendering its login button together with
Google/Github/etc. leading to us being able to get rid of some of the
code that was handling it as a special case - the js code for plumbing
the "next" value and the special {% if only_sso %} block in login.html.
An additional consequence of the login.html change is that now the
backend will have it button rendered even if it isn't the only backend
enabled on the server.
This commit builds a more complete concept of an "external
authentication method". Our social backends become a special case of an
external authentication method - but these changes don't change the
actual behavior of social backends, they allow having other backends
(that come from python-social-auth and don't use the social backend
pipeline) share useful code that so far only serviced social backends.
Most importantly, this allows having other backends show up in the
external_authentication_methods field of the /server_settings endpoint,
as well as rendering buttons through the same mechanism as we already
did for social backends.
This moves the creation of dictonaries describing the backend for the
API and button rendering code away into a method, that each backend in
this category is responsible for defining.
To register a backend as an external_authentication_method, it should
subclass ExternalAuthMethod and define its dict_representation
classmethod, and finally use the external_auth_method class decorator to
get added to the EXTERNAL_AUTH_METHODS list.
Previously, the LDAP code for syncing user data was not
multiple-realm-aware, resulting in errors trying to sync data for an
LDAP user present in multiple realms.
Tweaked by tabbott to add some extended comments.
Fixes#11520.
For a long time, we've been only doing the zxcvbn password strength
checks on the browser, which is helpful, but means users could through
hackery (or a bug in the frontend validation code) manage to set a
too-weak password. We fix this by running our password strength
validation on the backend as well, using python-zxcvbn.
In theory, a bug in python-zxcvbn could result in it producing a
different opinion than the frontend version; if so, it'd be a pretty
bad bug in the library, and hopefully we'd hear about it from users,
report upstream, and get it fixed that way. Alternatively, we can
switch to shelling out to node like we do for KaTeX.
Fixes#6880.
A bug in Zulip's new user signup process meant that users who
registered their account using social authentication (e.g. GitHub or
Google SSO) in an organization that also allows password
authentication could have their personal API key stolen by an
unprivileged attacker, allowing nearly full access to the user's
account.
Zulip versions between 1.7.0 and 2.0.6 were affected.
This commit fixes the original bug and also contains a database
migration to fix any users with corrupt `password` fields in the
database as a result of the bug.
Out of an abundance of caution (and to protect the users of any
installations that delay applying this commit), the migration also
resets the API keys of any users where Zulip's logs cannot prove the
user's API key was not previously stolen via this bug. Resetting
those API keys will be inconvenient for users:
* Users of the Zulip mobile and terminal apps whose API keys are reset
will be logged out and need to login again.
* Users using their personal API keys for any other reason will need
to re-fetch their personal API key.
We discovered this bug internally and don't believe it was disclosed
prior to our publishing it through this commit. Because the algorithm
for determining which users might have been affected is very
conservative, many users who were never at risk will have their API
keys reset by this migration.
To avoid this on self-hosted installations that have always used
e.g. LDAP authentication, we skip resetting API keys on installations
that don't have password authentication enabled. System
administrators on installations that used to have email authentication
enabled, but no longer do, should temporarily enable EmailAuthBackend
before applying this migration.
The migration also records which users had their passwords or API keys
reset in the usual RealmAuditLog table.
The original/legacy emoji reactions endpoints made use of HTTP PUT and
didn't have an API that could correctly handle situations where the
emoji names change over time. We stopped using the legacy endpoints
some time ago, so we can remove them now.
This requires straightforward updates to older tests that were still
written against the legacy API.
Fixes#12940.
When creating realm with the ldap backend, the registration flow didn't
properly handle some things - the user wouldn't be set as realm admin,
initial subscriptions and messages weren't created, and the redirect
wasn't happening properly in the case of subdomains.
The state of the FAKELDAP setup for the dev env has fallen behind the
backend changes and updates to fakeldap (which implemented
SCOPE_ONELEVEL searches), as well as having some other minor issues.
This commit restore it to a working state and now all three config modes
work properly.
- Moves "Authentication in the development environment" from subsystems
to "development/authentication.md".
- Moves "Renumbering migrations" to a section within "Schema migrations".
Having to account everywhere for both cases of having and not
having email search configured makes things needlessly complicated.
It's better to make the setting obligatory in configurations other than
LDAP_APPEND_DOMAIN.
This function is inherited by ZulipLDAPUserPopulator and overriden by
ZulipLDAPAuthBackend, so it's more clear to have it simply defined in
ZulipLDAPUserPopulator directly.
By adding some additional plumbing (through PreregistrationUser) of the
full_name and an additional full_name_validated option, we
pre-populate the Full Name field in the registration form when coming
through a social backend (google/github/saml/etc.) and potentially skip
the registration form (if the user would have nothing to do there other
than clicking the Confirm button) and just create the account and log
the user in.
These are returned through the API, at the /server_settings
endpoint. It's better to just return the list of dicts with a guarantee
of being sorted in the correct order, than to clutter things with the
sort_order field.
Needed so that the google entry in social_backends in /server_settings
shows the new url rather than the legacy accounts/login/google/ url as
the login url.
This legacy endpoint was designed for the original native Zulip mobile
apps, which were deprecated years ago in favor of the React Native
app.
It was replaced by /server_settings for active use years ago, so it's
safe to remove it now.
Send the `csrftoken` and `sessionid` cookies with `SameSite=Lax`.
This adds a layer of defense against CSRF attacks and matches the new
default in Django 2.1:
https://docs.djangoproject.com/en/2.1/releases/2.1/#samesite-cookies
This can be reverted when we upgrade to Django ≥ 2.1.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>