So far the conversion was in a very random place -
register_remote_user(). All other codepaths that use
login_or_register_remote_user() call it with the user's email address.
Making remote_user_sso convert remote_username to the email address
before calling login_or_register_remote_user makes this usage consistent
across the board.
finish_desktop_flow is called with the assumption that the request
successfully proved control over the user_profile and generates a
special link to log into the user_profile account. There's no reason to
pass the realm param, as user_profile.realm can be assumed.
In `auth.py` there are three `if` blocks for different backends
to redirect to config error page with similar code. It is better
handled with common code using `get_attr()` function on
constructed setting names.
There was some duplicated code to test config error pages for
different auths which could be handled with less duplicated code
by adding those functions to `SocialAuthBase`.
Also moving the other tests makes it easier to access tests related
to a backend auth when they are in the same file.
Original idea was that KeyError was only going to happen there in case
of user passing bad input params to the endpoint, so logging a generic
message seemed sufficient. But this can also happen in case of
misconfiguration, so it's worth logging more info as it may help in
debugging the configuration.
Create a new page for desktop auth flow, in which
users can select one from going to the app or
continue the flow in the browser.
Co-authored-by: Mateusz Mandera <mateusz.mandera@protonmail.com>
To avoid some hidden bugs in tests caused by every ldap user having the
same password, we give each user a different password, generated based
on their uids (to avoid some ugly hard-coding in a bunch of places).
This avoids using `.save()` directly for editing stream properties,
and also uses the API in _send_and_verify_message to avoid confusing
logic around which user is doing what request.
Fixes part of #13823
This adds update_user to python_examples.py in zerver/openapi.
This also adds update-user.md to templates/zerver/api and adds
update-user to rest-endpoints.md (templates/zerver/help/include).
This adds deactivate_user to python_examples.py in zerver/openapi.
This also adds delete-user.md to templates/zerver/api and adds
delete-user to rest-endpoints.md (templates/zerver/help/include).
This adds get_single_user to python_examples.py in zerver/openapi.
This also adds get-single-user.md to templates/zerver/api and adds
get-single-user to rest-endpoints.md (templates/zerver/help/include).
This adds the OpenAPI format data for /users/{user_id} endpoint
and also removes 'users/{user_id}' from 'pending_endpoints' in
zerver/tests/test_openapi.py .
‘req_var in request.GET’ was previously believed to be slow from
profiling results. However, the real explanation for those profiling
results is that WSGIRequest.GET is a lazy cached property, so there’s
no reason to avoid it if we’re accessing request.GET anyway.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
In https://github.com/zulip/zulip/pull/12823 some changes to the realms
structure have been made, so now both in production and development
cross-realm bots live in the realm with string_id "zulipinternal".
There was a TODO in retention code to eliminate a conditional in a query
that became redundant with this change, and also the zulipinternal realm
should be omitted from the archiving process in archive_messages().
Profiling suggests this saves about 600us in the runtime of every GET
/events request attempting to resolve URLs to determine whether we
need to do the APPEND_SLASH behavior.
It's possible that we end up doing the same URL resolution work later
and we're just moving around some runtime, but I think even if we do,
Django probably doesn't do any fancy caching that would mean doing
this query twice doesn't just do twice the work.
In any case, we probably want to extend this behavior to our whole API
because the APPEND_SLASH redirect behavior is essentially a bug there.
That is a more involved refactor, however.
We use this single regular expression for processing essentially every
request, so it's definitely worth hinting to Python that we're going
to do so by compiling it. Saves about 40us per request.
A sloppy implementation of the main has_request_variables wrapper
function meant that it did two very inefficient things:
* To combine together the GET and POST parameters, it would make a
copy of the request.GET QueryDict object, which combined with the
fact that these objects are slow to access, consumed about 90us per
argument.
* Doing this in a loop (one time per argument), rather than once,
which resulted in us doing this 11 times for a `GET /events` query.
Fixing this to just make a dictionary and combine things with some
small loops saved about 1 millisecond from the total runtime of GET
/events (for comparison, the total actual work of that view function
is about 700ms).
We need to fix at least one test that used a bad mock HttpRequest
object that didn't have a .GET property.
The comment explains this issue, but effectively, the upgrade to
Django 2.x means that Django's built-in django.request logger was
writing to our errors logs WARNING-level data for every 404 and 400
error. We don't consider user errors to be a problem worth
highlighting in that log file.
Django 2.2.x is the next LTS release after Django 1.11.x; I expect
we'll be on it for a while, as Django 3.x won't have an LTS release
series out for a while.
Because of upstream API changes in Django, this commit includes
several changes beyond requirements and:
* urls: django.urls.resolvers.RegexURLPattern has been replaced by
django.urls.resolvers.URLPattern; affects OpenAPI code and related
features which re-parse Django's internals.
https://code.djangoproject.com/ticket/28593
* test_runner: Change number to suffix. Django changed the name in this
ticket: https://code.djangoproject.com/ticket/28578
* Delete now-unnecessary SameSite cookie code (it's now the default).
* forms: urlsafe_base64_encode returns string in Django 2.2.
https://docs.djangoproject.com/en/2.2/ref/utils/#django.utils.http.urlsafe_base64_encode
* upload: Django's File.size property replaces _get_size().
https://docs.djangoproject.com/en/2.2/_modules/django/core/files/base/
* process_queue: Migrate to new autoreload API.
* test_messages: Add an extra query caused by .refresh_from_db() losing
the .select_related() on the Realm object.
* session: Sync SessionHostDomainMiddleware with Django 2.2.
There's a lot more we can do to take advantage of the new release;
this is tracked in #11341.
Many changes by Tim Abbott, Umair Waheed, and Mateusz Mandera squashed
are squashed into this commit.
Fixes#10835.
Since essentially the first use of Tornado in Zulip, we've been
maintaining our Tornado+Django system, AsyncDjangoHandler, with
several hundred lines of Django code copied into it.
The goal for that code was simple: We wanted a way to use our Django
middleware (for code sharing reasons) inside a Tornado process (since
we wanted to use Tornado for our async events system).
As part of the Django 2.2.x upgrade, I looked at upgrading this
implementation to be based off modern Django, and it's definitely
possible to do that:
* Continue forking load_middleware to save response middleware.
* Continue manually running the Django response middleware.
* Continue working out a hack involving copying all of _get_response
to change a couple lines allowing us our Tornado code to not
actually return the Django HttpResponse so we can long-poll. The
previous hack of returning None stopped being viable with the Django 2.2
MiddlewareMixin.__call__ implementation.
But I decided to take this opportunity to look at trying to avoid
copying material Django code, and there is a way to do it:
* Replace RespondAsynchronously with a response.asynchronous attribute
on the HttpResponse; this allows Django to run its normal plumbing
happily in a way that should be stable over time, and then we
proceed to discard the response inside the Tornado `get()` method to
implement long-polling. (Better yet might be raising an
exception?). This lets us eliminate maintaining a patched copy of
_get_response.
* Removing the @asynchronous decorator, which didn't add anything now
that we only have one API endpoint backend (with two frontend call
points) that could call into this. Combined with the last bullet,
this lets us remove a significant hack from our
never_cache_responses function.
* Calling the normal Django `get_response` method from zulip_finish
after creating a duplicate request to process, rather than writing
totally custom code to do that. This lets us eliminate maintaining
a patched copy of Django's load_middleware.
* Adding detailed comments explaining how this is supposed to work,
what problems we encounter, and how we solve various problems, which
is critical to being able to modify this code in the future.
A key advantage of these changes is that the exact same code should
work on Django 1.11, Django 2.2, and Django 3.x, because we're no
longer copying large blocks of core Django code and thus should be
much less vulnerable to refactors.
There may be a modest performance downside, in that we now run both
request and response middleware twice when longpolling (once for the
request we discard). We may be able to avoid the expensive part of
it, Zulip's own request/response middleware, with a bit of additional
custom code to save work for requests where we're planning to discard
the response. Profiling will be important to understanding what's
worth doing here.
Hellosign now posts their callback as form/multipart, which Django only
permits to be read once. Attempts to access request.body after the
initial read throw "django.http.request.RawPostDataException: You
cannot access body after reading from request's data stream".
Fixes#13847.
This makes it possible to create a Zulip account from the mobile or
desktop apps and have the end result be that the user is logged in on
their mobile device.
We may need small changes in the desktop and/or mobile apps to support
this.
Closes#10859.
1) Created a new class `DatabaseType` and access its objects inside
`template_database_status()` instead of sending five arguments with
default values.
2) Made `check_files` and `setting_name` local variables instead of
function parameters since they had same value(None) for every call.
Fixes#13845.
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.
Update the backend to use emoji_codes.json too instead of the three
separate JSON files.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This code was very useful when first implemented to help catch errors
where our backend templates didn't render, but has been superceded by
the success of our URL coverage testing (which ensures every URL
supported by Zulip's urls.py is accessed by our tests, with a few
exceptions) and other tests covering all of the emails Zulip sends.
It has a significant maintenance cost because it's a bit hacky and
involves generating fake context, so it makes sense to remove these.
Any future coverage issues with templates should be addressed with a
direct test that just accessing the relevant URL or sends the relevant
email.
In python 3.5-3.6, generic types had an __origin__ attribute which
indicated which generic they originated from; the code was reflecting on
that value to check types against the openapi spec. In python3.7, this
changed, and there's no longer an immediately simple way to get this
information in all cases. __origin__ appears to be the implementing
class now, returning `list` or `collections.abc.Iterator` rather than
`typing.List` and `typing.Iterator`. This adds a sloppy-but-effective
mechanism for inferring if a type maps to the List/Dict/Iterator/Mapping
types and gets the test suite passing again.
We now use realm_id for querying UserPresence
instead of building a big WHERE clause from the
list of user_ids.
This commit may be a bit hard to measure, since
we still get the list of user_ids for the PushToken
query in the same method.
It adds this index:
"zerver_userpresence_realm_id_timestamp_25f410da_idx" btree (realm_id, "timestamp")
We expect this index to provide a major performance improvement when
fetching presence data for the whole realm from the database on
servers like zulipchat.com hosting several realms.
We now validate streams with a separate
function from PM recipients.
It's confusing enough all the ways you can
encode a stream or encode the PM recipients,
but trying to do it all in one function was
hard to reason about and led to at least one
bug.
In particular, there was a bug where streams
with commas in them would get split. Now
we just don't ever split on commas inside
of `extract_stream_indicator`.
Fixes#13836
After removing internal_send_message() in a recent
commit, we now have only two callers for
extract_recipients, and they are both related
to our REQ mechanism that always passes strings
to converters. (If there are default values,
REQ does not call the converters.)
We therefore make two changes:
- use the more strict annotation of "str"
for the `s` parameter
- don't bother with the isinstance check
Note that while the test mocks the actual message
send, we now have a `get_stream` call in the queue
worker, so we have to set up a real stream for
testing (or we could have mocked that as well, but
it didn't seem necessary). The setup queries add
to the amount of queries reported by the test,
plus the `get_stream` call. I just made the
query count a digits regex, which is a little bit
lame, but I don't think it's worth risking test
flakes for this.
This index is intended to optimize the performance of the very
frequently run query of "what is the presence status of all users in a
realm?".
Main changes:
- add realm_id to UserPresence
- add index for realm_id
- backfill realm_id for old rows
- change all writes to UserPresence to include
realm_id
The index is of this form:
"zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id)
We will create an index on (realm_id, timestamp) in a
future commit, but I think it's a bit faster if you do
the backfill before the index.
There's also a minor tweak to the populate_db script.
This is just a refactoring to the more modern API
for sending internal messages.
To make this work we now plumb the email_gateway
flag through `internal_send_stream_message` instead
of `internal_send_message`.
We also change `send_zulip` to have its callers
pass in a full UserProfile object (which one of
them already had).
We prefer this to internal_send_message().
We are trying to deprecate `internal_send_message`,
which has extra moving parts related to
`extract_recipients` and `Addressee.legacy_build`.
There are two chunks of code that I touch here
that look pretty similar, but I'm not quite
sure they're worth de-duplicating, since they
use different topics and different message
content.
Instead of having `notify_new_user` delegate
all the heavy lifting to `send_signup_message`,
we just rename `send_signup_message` to be
`notify_new_user` and remove the one-line
wrapper.
We remove a lot of obsolete complexity:
- `internal` was no longer ever set to True
by real code, so we kill it off as well
as well as killing off the internal_blurb code
and the now-obsolete test
- the `sender` parameter was actually an
email, not a UserProfile, but I think
that got past mypy due to the caller
passing in something from settings.py
- we were only passing in NOTIFICATION_BOT
for the sender, so we just hard code
that now
- we eliminate the verbose
`admin_realm_signup_notifications_stream`
parameter and just hard code it to
"signups"
- we weren't using the optional realm
parameter
There's also a long ugly comment in
`get_recipient_info` related to this code
that I amended for now.
We should try to take action in a subsequent
commit.
This avoids an unnecessary join to UserProfile.
To verify this, you can do `print(queries)` in the
`test_get_custom_profile_fields_from_api` test. It's
kinda noisy, so I excerpted them below...
Before:
SELECT ...
FROM "zerver_customprofilefieldvalue"
INNER JOIN "zerver_userprofile" ON ("zerver_customprofilefieldvalue"."user_profile_id" = "zerver_userprofile"."id")
INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
WHERE "zerver_userprofile"."realm_id" = 2
After:
SELECT ...
FROM "zerver_customprofilefieldvalue"
INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
WHERE "zerver_customprofilefield"."realm_id" = 2'
I don't have any way to measure the two queries with
realistic data, but I would assume the second
query is significantly faster on most of our instances,
since CustomProfileField should be tiny.
I am trying to optimize a query in this endpoint.
I don't think I'll actually reduce the number of
queries, but I wanted to capture the query and
this was the easiest way to do it, so might as
well check in the code! :)
The line removed here is a noop, as both sides of the
immediately following conditional reassign the
same variable.
This harmless cruft was the result of the recent commit
1ae5964ab8, which added
support for single-user GETs.
This fixes a bug where our asynchronous requests were only copying the
Content-Type header (i.e. the one case where we're noticed) from the
Django HttpResponse. I'm not sure what the impact of this would be;
the rate-limiting headers rarely come up when breaking a long-polled
request. But it seems clearly an improvement to do this in a
consistent fashion.
Only the headers piece is a change; in Tornado
self.finish(x)
is equivalent to:
self.write(x)
self.finish()
Apparently, the arguments passed to template_database_status were
incorrect for the manual testing development database, in that we
didn't pass a status_dir when calling into that code from provision.
The result was that provisioning before running `test-backend` would
ignore changes to the list of check_files (etc.) made after rebasing,
and vice versa.
The cleanest fix is to compute status_dir from other values passed in;
I'm also going to open a follow-up issue for creating a better overall
interface here.
This adds a new API endpoint for querying basic data on a single other
user in the organization, reusing the existing infrastructure (and
view function!) for getting data on all users in an organization.
Fixes#12277.
This code is a bit flatter and just preps the data
for a single user. There is never any interaction
between the data for user A and user B, so we can
mostly avoid complicated nested data structures
and do most of the data-crunching on a per-user basis.
We also do an explicit sort of the data before
running it through groupby. The explicit sort
simplifies how we calculate `most_recent_info`
and also avoids needing to add `dt` to an intermediate
data structure.
Finally, when it comes to the individual client data,
the code has relied on the assumption that there is
only one row per client, which I believe to be true,
but now the code is more explicit about that.
The word "status" is vague, and this isn't
actually returning a list, so we now name it
get_presence_response.
I originally was gonna rename this to
get_presence_dict, but there's a function
called get_status_dict that returns a subset
of the response, so I think it's a bit more
clear that this is the bigger dict that
actually gets sent back.
We want to err on the side of server_timestamp being
old, since we may eventually use this to make responses
just include incremental changes, and we don't want a
time window (however small) when we miss presence rows.
The clients will be able to deal with duplicate data
to the extent that the time windows are overlapping.
Also, extracting the other local var here
(for `presences`) will set up a subsequent commit
where we re-format the data for clients with
slim_presence=True.
In e3ad9baf1d, we introduced yet another
bug where we incorrectly shared event dictionaries between multiple
queues.
Fortunately, the logging that reports on "event was not in the queue"
issues worked and detected this on chat.zulip.org, but this is a clear
indication that the comments we have around this system were not
sufficient to produce correct behavior.
We fix this by changing event_queue.push, the code that mutates the
event dictionaries, to do the shallow copies itself. The only
downside here is process_message_event, a relatively low-traffic code
path, does an extra per-queue dictionary copy. Given that presence,
heartbeat, and message reading events are likely more traffic and
dealing with HTTP is likely much more expensive than a dictionary
copy, this probably doesn't matter performance-wise.
(And if profiling later finds it is, there are potential workarounds
like passing a skip_copy argument we can do).
django-phonenumber-field 2.4.0 adds tighter phone number validation
that rejects +12223334444 for having an invalid area code. This was
reverted in 4.0.0, but django-two-factor-auth still requires <3.99.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This commit includes a new `stream_post_policy` setting,
by replacing the `is_announcement_only` field from the Stream model,
which is done by mirroring the structure of the existing
`create_stream_policy`.
It includes the necessary schema and database migrations to migrate
the is_announcement_only boolean field to stream_post_policy,
a smallPositiveInteger field similar to many other settings.
This change is done to allow organization administrators to restrict
new members from creating and posting to a stream. However, this does
not affect admins who are new members.
With many tweaks by tabbott to documentation under /help, etc.
Fixes#13616.
This flag affects page_params and the
payload you get back from POSTs to this
url:
users/me/presence
The flag does not yet affect the
presence events that get sent to a
client.
This should ensure that folks rebasing past this commit from an older
database model get their database rebuilt in the way that will
match the test_subs.py query count of 40.
Add a simple compatibility function for AWX 9.x.x. Before AWX 9.x.x
a "friendly_name" key was sent by default. Afterwards it was removed
from being a default key but we can still more or less determine if
the triggering event was a job from the REST-style URL.
Note: It is also technically possible to add the key back by defining
a custom notification template in AWX/Tower.
Resolves#13295.
This applies rate limiting (through a decorator) of authenticate()
functions in the Email and LDAP backends - because those are the ones
where we check user's password.
The limiting is based on the username that the authentication is
attempted for - more than X attempts in Y minutes to a username is not
permitted.
If the limit is exceeded, RateLimited exception will be raised - this
can be either handled in a custom way by the code that calls
authenticate(), or it will be handled by RateLimitMiddleware and return
a json_error as the response.
We will want to raise RateLimited in authenticate() in rate limiting
code - Django's authenticate() mechanism catches PermissionDenied, which
we don't want for RateLimited. We want RateLimited to propagate to our
code that called the authenticate() function.
As more types of rate limiting of requests are added, one request may
end up having various limits applied to it - and the middleware needs to
be able to handle that. We implement that through a set_response_headers
function, which sets the X-RateLimit-* headers in a sensible way based
on all the limits that were applied to the request.
validate_otp_params needs to be moved to backends.py, because as of this
commit it'll be used both there and in views.auth - and import from
views.auth to backends.py causes circular import issue.
This makes get_raw_user_data, which was being imported indirectly
from zerver.lib.events inside zerver/views/users.py, get imported
from zerver.lib.users where it actually is.
While the result of this change doesn't completely do what we need, it
does remove a huge amount of duplicated lists of fields. With a bit
more similar work, we should be able to eliminate a broad category of
potential bugs involving Stream and Subscription objects being
represented inconsistently in the API.
Work towards #13787.
This has the side of effect of making new fields we add to Stream be
automatically included, which will help maintain this code as we
upgrade it.
This commit adds is_web_public, history_public_to_subscribers, and
email_notifications fields to the dictionary.
Tests require adjusting, because the class-based view has an additional
redirect - through /uid/set-password/ and the token is read from the
session. See Django code of PasswordResetConfirmView.
The `notification_settings_null` field of the `client_capabilities`
parameter is, apparently unintentionally, required.
This is mostly harmless. However, if any _future_ fields are made
required, all existing clients using this parameter will break, and it
will be needlessly difficult for new clients to specify new
capabilities in a backwards-compatible way.
Attempt to stave that possibility off with warnings.
(No functional changes.)
This modifies get_cross_realm_dicts in zerver.lib.users to call
format_user_row. This is done to remove current and prevent future
inconsistencies between in the dictionary formats for get_raw_user_data
and get_cross_realm_dicts.
Implementation substantially rewritten by tabbott.
Fixes#13638.
This moves get_cross_realm_dicts (from zerver.lib.actions),
get_raw_user_data and get_custom_profile_field_values (from
zerver.lib.events) to zerver.lib.users.
Only the getter of the is_new_member property is added,
to the UserProfile class. This is done to deduplicate
action of checking whether a user is a new member or not.