This allows us to block use of the desktop app with insecure versions
(we simply fail to load the Zulip webapp at all, instead rendering an
error page).
For now we block only versions that are known to be both insecure and
not auto-updating, but we can easily adjust these parameters in the
future.
This improves the error handling for invalid values of the
propagate_mode parameter to our message editing endpoints.
Previously, invalid values would just work like change_one rather than
doing nothing.
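Roughly, the new handling amounts to rejecting unknown values up front; a
minimal sketch (the allowed values are the real ones, but the helper name
and error type here are just illustrative, not the actual implementation):

    VALID_PROPAGATE_MODES = ("change_one", "change_later", "change_all")

    def check_propagate_mode(propagate_mode: str) -> None:
        # Reject unknown values instead of silently treating them
        # like change_one.
        if propagate_mode not in VALID_PROPAGATE_MODES:
            raise ValueError("Invalid propagate_mode: %s" % (propagate_mode,))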
setup_event_queue() generates some logs about loaded event queues, and
it's good for the logging system to have access to the port at that
point already.
I'm not sure what causes some Jira webhook events to not include the
metadata that other events do, but it's definitely a format sent by
real installations of Jira (likely a very old version, since this has
fields missing from what modern Jira does) and we've seen it in
production.
The best we can do is encourage users to upgrade Jira for better data.
The previous starred_messages race handling did not correctly consider
the possibility that an event queue might have been registered without
starred_messages.
This avoids operating on RateLimitedObjects in the backends, which made
the classes depend on each other too strongly. It also allows getting
rid of the get_keys() function from RateLimitedObject, which was a Redis
rate limiter implementation detail. A RateLimitedObject should only
define its own key() function; the logic forming the various necessary
Redis keys from it should live in RedisRateLimiterBackend.
type(self).__name__ is sufficient, and much more readable than
type(self), so it's better to use the former for keys.
We also make the classes consistent in forming keys in the format
type(self).__name__:identifier, and adjust logger.warning and statsd to
take advantage of that by simply logging the key().
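A rough sketch of the split described above (class and method names are
simplified for illustration, not the exact implementation):

    class RateLimitedObject:
        def key(self) -> str:
            # Subclasses only describe their own identity.
            return "%s:%s" % (type(self).__name__, self.key_fragment())

        def key_fragment(self) -> str:
            raise NotImplementedError()

    class RedisRateLimiterBackend:
        @staticmethod
        def list_key(entity: RateLimitedObject) -> str:
            # Redis-specific key layout lives in the backend,
            # not in the entity classes.
            return "ratelimit:%s:list" % (entity.key(),)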
This returns us to a consistent logging format regardless of whether
the request is authenticated.
We also update some log examples in docs to be consistent with the new
style.
When a user in the login flow using GitHub auth chooses an email that
is not associated with an existing account, it leads to a "continue to
registration" choice. This could not be tested with the earlier version
of `stage_two_of_registration`.
The test is also added.
Thanks to Mateusz Mandera for the solution.
Co-authored-by: Mateusz Mandera <mateusz.mandera@protonmail.com>
The previous model for GitHub authentication was as follows:
* If the user has only one verified email address, we'll generally just log them in to that account
* If the user has multiple verified email addresses, we will always
prompt them to pick which one to use, with the one registered as
"primary" in GitHub listed at the top.
This change fixes the situation for users going through a "login" flow
(not registration) where exactly one of the emails has an account in
the Zulip organization -- they should just be logged in.
Fixes part of #12638.
URLs for config errors were configured separately for each error; this
is better handled by having the error name as an argument in the URL.
A new view, `config_error_view`, is added; it contains the context for
each error and returns the `config_error` page with the relevant
context.
Tests and some views in `auth.py` are also fixed to be consistent with
these changes.
Calling `foo.lstrip('# ')` does more than just remove a '# ' prefix; it
removes any leading combination of '#' and spaces.
We now make the intention slightly more clear.
We would strip these as you'd expect:
# foo
## foo
### foo
but for this we now only strip the first "#":
# # # # # foo
Thanks to @minusworld for catching this--see #14264, which
points out that lstrip() doesn't do what your intuition
might tell you it does.
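For example (the replacement logic shown here is a sketch, not the exact
code):

    >>> "### foo".lstrip("# ")
    'foo'
    >>> "# # # # # foo".lstrip("# ")  # strips every leading '#' and space
    'foo'
    >>> line = "# # # # # foo"
    >>> line[2:] if line.startswith("# ") else line  # strip only the first "#"
    '# # # # foo'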
Now we properly remove the "HTTP_" prefix.
It's not clear to me why we need these prefixes for Django
purposes in the fixtures, but I didn't want to go down
the rabbit hole of fixing those.
To test:
- go to http://YOUR-DEV_SERVER/devtools/integrations/
- select "bitbucket3" for the integration
- select "diagnostics_ping.json" for the fixture
- see "X_EVENT_KEY" in "Custom HTTP Headers"
Fixes #14264
This is a bit more rigorous than just
dereferencing the first element of
a list comprehension, as it will give a
ValueError if more matches are found than
the test was expecting.
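A small illustration of the pattern (the event data here is made up):

    all_events = [
        {"type": "realm_user", "op": "add"},
        {"type": "realm_bot", "op": "add"},
    ]

    # Dereferencing the first element silently ignores extra matches:
    event = [e for e in all_events if e["type"] == "realm_user"][0]

    # Unpacking raises ValueError unless there is exactly one match:
    [event] = [e for e in all_events if e["type"] == "realm_user"]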
We don't need `do_create_user` to send a partial
event here for bots. The only caller to `do_create_user`
that actually creates bots (apart from some tests that
just need data setup) is `add_bot_backend`, which
sends the more complete event including bot "extras"
like service info.
The modified event tests show the simplification
here (2 events instead of 3).
Also, the bot tests now use tuple unpacking, which
will force a ValueError if we duplicate events
again.
We now restrict emails on the zulip realm, so `email` and
`delivery_email` will be different for users.
This change should make it more likely to catch
errors where we leak delivery emails or use the
wrong field for lookups.
We were going back to the database to get all
the users in the realm, when we had them right
there already. I believe this is a legacy
of us running on a very old version of Django
(back in the early days), where `bulk_create`
didn't give you back ids in a nice way.
In the interim we added the `RealmAuditLog`
code, which does take advantage of the
existing profiles (and proves we can rely
on them).
But meanwhile we were still doing a query to get all N users in the
realm. With `select_related`!
To be fair, bulk_create_users() is by
its very nature a pretty infrequent
operation. This change is more motivated
by code cleanup.
Now we just loop through user_ids for the Recipient/Subscription
foreign key rows.
I also removed some fairly convoluted code that mapped emails to
user_ids; we now just work in user_id space.
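Roughly, the new shape of the code (a sketch with simplified fields, not
the literal implementation):

    profiles = UserProfile.objects.bulk_create(profiles_to_create)
    user_ids = [profile.id for profile in profiles]

    recipients_to_create = [
        Recipient(type_id=user_id, type=Recipient.PERSONAL)
        for user_id in user_ids
    ]
    Recipient.objects.bulk_create(recipients_to_create)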
We try to use the correct variation of `email`
or `delivery_email`, even though in some
databases they are the same.
(To find the differences, I temporarily hacked
populate_db to use different values for email
and delivery_email, and reduced email visibility
in the zulip realm to admins only.)
In places where we want the "normal" realm
behavior of showing emails (and having `email`
be the same as `delivery_email`), we use
the new `reset_emails_in_zulip_realm` helper.
A couple of random things:
- I fixed any error messages that were leaking
the wrong email
- a test that claimed to rely on the order
of emails no longer does (we sort user_ids
instead)
- we now use user_ids in some places where we used
to use emails
- for IRC mirrors I just punted and used
`reset_emails_in_zulip_realm` in most places
- for MIT-related tests, I didn't fix email
vs. delivery_email unless it was obvious
I also explicitly reset the realm to a "normal"
realm for a couple tests that I frankly just didn't
have the energy to debug. (Also, we do want some
coverage on the normal case, even though it is
"easier" for tests to pass if you mix up `email`
and `delivery_email`.)
In particular, I just reset data for the analytics
and corporate tests.
We specifically give the existing user different
delivery_email and email addresses, to prevent false
positives during the test that checks that users
signing up with an already-existing email get
an error message.
(We also rename the test.)
I guess `test_classes` has 100% line coverage
enforcement, which is a bit tricky for error
handling.
This fixes that, as well as making the name
snake_case and improving the format of the
errors.
This test was using the anti-pattern of doing an
assertion inside a conditional.
I added the `findOne` helper to make it easier
to write robust tests for scenarios like this.
We had a bunch of ugly hacks to monkey patch things due to upstream
being temporarily unmaintained and not merging PRs. Now the project is
active again and the fixes have been merged and included in the latest
version - so we clean up all that code.
If I send a message from a normal Zulip client, it is
considered to be "read" by me. But if I send it via
an API program (using my human account), the message
is not immediately "read" by me.
Now we handle this correctly in `get_raw_unread_data`.
The symptom of this was that these messages would get
"stuck" in "Private Messages" narrows until the next
time you reloaded your app.
I've always thought of distributed teams as the place where Zulip
really shines over other tools, because chat is much more important in
that context.
And I've always been kinda unhappy with "most productive team chat" as
a line.
There's a lot more we should do here, but this is a start.
Using an Exists subquery to avoid scanning the entire Subscription
table seems to speed things up greatly.
Set up with:
    ./manage.py populate_db --extra-users 2000 --extra-streams 1000
Tested on my computer: the original function was taking ~1.2 seconds,
the optimized version only ~0.05-0.06 seconds.
Likely fixes #13874; we can re-open it if, after production testing, we
feel more work is warranted.
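A hedged sketch of the kind of query this describes (the exact models
and filters in the real function differ):

    from django.db.models import Exists, OuterRef

    occupied_streams = Stream.objects.filter(
        realm=realm, deactivated=False,
    ).annotate(
        occupied=Exists(
            Subscription.objects.filter(
                active=True,
                recipient__type_id=OuterRef("id"),
            )
        )
    ).filter(occupied=True)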
This ensures that even if it were possible to create an MIT Kerberos
account with a malicious username and/or hack webathena to pretend
that's the case, one couldn't do anything malicious.
This security improvement only impacts a single installation of Zulip
(the one where Zephyr mirroring is in use), which has already had the
fix applied, so there's no reason to do a security notice for it.
Found by Graham Bleaney using pysa.
In 220c2a5ff3 I
introduced a query to find invites by delivery_email
but was still using email as the key.
For most realms `email` and `delivery_email` are
synonymous, so this temporary bug would not affect
them. For realms that restrict emails, the invite
would have probably failed for other reasons, but
the symptom would have been less clear.
We now have this API...
If you really just need to log in
and not do anything with the actual
user:
self.login('hamlet')
If you're gonna use the user in the
rest of the test:
hamlet = self.example_user('hamlet')
self.login_user(hamlet)
If you are specifically testing
email/password logins (used only in 4 places):
self.login_by_email(email, password)
And for failures, use this (used twice):
self.assert_login_failure(email)
This reduces query counts in some cases, since
we no longer need to look up the user again. In
particular, it reduces some noise when we
count queries for O(N)-related tests.
The query count is usually reduced by 2 per
API call. We no longer need to look up Realm
and UserProfile. In most cases we are saving
these lookups for the whole test, since we
usually already have the `user` objects for
other reasons. In a few places we are simply
moving where that query happens within the
test.
In some places I shorten names like `test_user`
or `user_profile` to just be `user`.
We want a clean codepath for the vast majority
of cases of using api_get/api_post, which now
uses email and which we'll soon convert to
accepting `user` as a parameter.
These APIs that take two different types of
values for the same parameter make sweeps
like this kinda painful, and they're pretty
easy to avoid by extracting helpers to do
the actual common tasks. So, for example,
here I still keep a common method to
actually encode the credentials (since
the whole encode/decode business is an
annoying detail that you don't want to fix
in two places):
def encode_credentials(self, identifier: str, api_key: str) -> str:
"""
identifier: Can be an email or a remote server uuid.
"""
credentials = "%s:%s" % (identifier, api_key)
return 'Basic ' + base64.b64encode(credentials.encode('utf-8')).decode('utf-8')
But then the rest of the code has two separate
codepaths.
And for the uuid functions, we no longer have
crufty references to realm. (In fairness, realm
will also go away when we introduce users.)
For the `is_remote_server` helper, I just inlined
it, since it's now only needed in one place, and the
name didn't make total sense anyway, plus it wasn't
a super robust check. In context, it's easier
just to use a comment now to say what we're doing:
# If `role` doesn't look like an email, it might be a uuid.
if settings.ZILENCER_ENABLED and role is not None and '@' not in role:
# do stuff
Instead of trying to set the _requestor_for_logs attribute in all the
relevant places, we try to use request.user when possible (that will be
when it's a UserProfile or RemoteZulipServer as of now). In other
places, we set _requestor_for_logs to avoid manually editing the
request.user attribute, as that should mostly be left for Django to
manage.
In places where we remove the "request._requestor_for_logs = ..." line,
it is clearly implied by the previous code (or the current surrounding
code) that request.user is of the correct type.
This refactors remove_reaction in python_examples.py to validate the
result with validate_against_openapi_schema. Minor changes and some
additions have been made to the OpenAPI format data for
the /messages/{message_id}/reactions endpoint.
This refactors add_reaction in python_examples.py to use the
openapi_test_function decorator and validate result with
validate_against_openapi_schema. Minor changes have been made to the
OpenAPI format data for the /messages/{message_id}/reactions endpoint.
This also adds add-emoji.md to templates/zerver/api and adds
add-emoji to rest-endpoints.md (templates/zerver/help/include).
This refactors get_members_backend to return the user data of a single
user as a dictionary (earlier it was a list containing a single
dictionary).
This also refactors it to return the data with an appropriate key
(inside a dictionary), "user" or "members", according to the type of
data being returned.
Tweaked by tabbott to use somewhat less opaque code and simple OpenAPI
descriptions.
Previously, get_client_name was responsible both for parsing the
User-Agent data and for handling the override behavior of using
"website" rather than "Mozilla" as the key for the Client object.
Now, it's just responsible for User-Agent, and the override behavior
is entirely within process_client (the function concerned with Client
objects).
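In rough terms, the new division of labor looks like this (a simplified
pseudocode sketch; the real signatures and parsing logic differ):

    def get_client_name(request) -> str:
        # Only parse the User-Agent data.
        user_agent = request.META.get("HTTP_USER_AGENT", "Unspecified")
        return user_agent.split("/")[0]  # crude stand-in for the real parser

    def process_client(request, user_profile, client_name=None) -> None:
        # (the real function also records user activity via user_profile)
        if client_name is None:
            client_name = get_client_name(request)
        # The "website"-instead-of-"Mozilla" override now lives here,
        # next to the Client-object handling.
        if client_name.startswith("Mozilla"):
            client_name = "website"
        request.client = get_client(client_name)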
This has the side effect of changing what `Client` object we'll use
for HTTP requests to /json/ endpoints that set the `client` attribute.
I think that's in line with our intent -- we only have a use case for
API clients overriding the User-Agent parsing (that feature is a
workaround for situations where the third party may not control HTTP
headers but does control the HTTP request payload).
This loses test coverage on the `request.GET['client']` code path; I
disable that for now since we don't have a real use for that behavior.
(We may want to change that logic to have Client recognize individual
browsers; doing so requires first using a better User-Agent parsing
library).
Part of #14067.
The "sender" property in `send_message_backend` is meant to only do
something when doing Zephyr mirroring (or similar). We should help
clients behave correctly by banning this property in requests that are
not specifically requesting mirroring behavior.
This commit requires changes to a number of tests that incorrectly
passed this parameter or didn't use the right setup for mirroring.
The special Zephyr mirroring logic is only intended to be used via the
API, so this sets up a more effective test. It also allows us to
remove certain Client parsing logic for the /json/ views using session
authentication.
The email domain restriction to @zulip.com is annoying in development
environment when trying to test sign up. For consistency, it's best to
have tests use the same default, and the tests that require domain
restriction can be adjusted to set that configuration up for themselves
explicitly.
This uses the better, modern, user ID based API for sending messages
internally in the test suite, something that's convenient to do as a
follow-up to the migration to pass UserProfile objects to these
functions.
This commit mostly makes our tests less
noisy, since emails are no longer an important
detail of sending messages (they're not even
really used in the API).
It also sets us up to have more scrutiny
on delivery_email/email in the future
for things that actually matter. (This is
a prep commit for something along those
lines, kind of hard to explain the full
plan.)
We plan to use these records to check and record the schema of Zulip's
events for the purposes of API documentation.
Based on an original messier commit by tabbott.
In theory, a nicer version of this would be able to work directly off
the mypy type system, but this will be good enough for our use case.
This extends our email address visibility settings to deny access to
user email addresses even to organization administrators.
At the moment, they can of course change the setting (which leaves an
audit trail), but in the future only organization owners will be able
to change that setting.
While we're at this, we rewrite the settings_data.js test to cover all
the cases in a more consistent way.
Fixes #14111.
This isn't the only bug in our testing libraries with
EMAIL_ADDRESS_VISIBILITY; but we don't have a lot of tests that need
to deal with that set of settings.
We will cache failed lookups with None. The
use case here is that broken API clients may
continually ask for the same wrong API key, and
we want to handle that as quickly as possible.
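A minimal sketch of the negative-caching idea (the cache key format and
the lookup helper here are illustrative, not the actual code):

    from django.core.cache import cache

    _MISS = object()

    def get_user_by_api_key(api_key: str):
        key = "api_key:" + api_key
        cached = cache.get(key, _MISS)
        if cached is not _MISS:
            # A cached None means "we already know this key is invalid",
            # so broken clients don't hit the database on every request.
            return cached
        user = lookup_user_in_database(api_key)  # may return None
        cache.set(key, user)
        return user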
We were using `code` to pass around messages.
The `code` field is designed to be a code, not
a human-readable message.
It's possible that we don't actually need two
flavors of messages for these types of validations,
but I didn't want to change that yet.
We **definitely** don't need to put two types of
message in the exception, so I fix that. Instead,
I just have the caller ask what level of detail
it needs.
I added a non-verbose message for the case of
system bots.
I removed the non-translated version of the message
for deactivated accounts, which didn't have test
coverage and is slightly more prone to leaking
email info that we don't want to leak.
In the prep commits leading up to this, we split
out two new helpers:
validate_email_is_valid
get_errors_for_new_emails
Now when we validate invites we use two separate
loops to filter our emails.
Note that the two extracted functions map to two
of the data structures that used to be handled
in a single loop, and now we break them out:
errors = validate_email_is_valid
skipped = get_errors_for_new_emails
The first loop checks that emails are even valid
to begin with.
The second loop finds out whether emails are already
in use.
The second loop takes advantage of this helper:
get_errors_for_new_emails
The second helper can query all potential new emails
with a single round trip to the database.
This reduces our query count.
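Schematically, the validation now looks something like this (a sketch;
the real signatures and return types differ):

    errors = {}
    skipped = {}

    # First loop: is each email even syntactically valid for this realm?
    for email in invitee_emails:
        error = validate_email_is_valid(email)
        if error is not None:
            errors[email] = error

    # Second loop: one query finds which emails are already in use.
    for email, error in get_errors_for_new_emails(realm, invitee_emails).items():
        skipped[email] = error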
The main purpose of this new function is to allow
us to validate emails in bulk, which we don't do
yet (still setting the stage for that).
This is still a speedup, though, since in our
caller we grab only three fields now.
And other than that, we're essentially doing
the same query for the single-email case, just
outside the loop.
We are trying to kill off `validate_email`, so
we no longer call it from these tests.
These tests are already kind of low-level in
nature, so testing the more specific helpers
here should be fine.
Note that we also make the third parameter
to `validate_email` non-optional in this commit,
to preserve 100% coverage. This is really just
refactoring noise--we will soon eliminate the
entire function, but I didn't want to do everything
in a huge commit.
This is a prep commit that will allow us
to more efficiently validate a bunch of
emails in the invite UI.
This commit does not yet change any
behavior or performance.
A secondary goal of this commit is to
prepare us to eliminate some hackiness
related to how we construct
`ValidationError` exceptions.
It preserves some quirks of the prior
implementation:
- the strings we decided to translate
here appear haphazard (and often
get ignored anyway)
- we use `msg` in most codepaths,
but use `code` for invites
Right now we never actually call this with
more than one email, but that will change
soon.
Note that part of the rationale for the inner
method here is to avoid a test coverage bug
with `continue` in loops.
We are trying to eliminate the version of `validate_email` that lives
in `actions.py`.
Inlining it barely increases the code size, and it removes some noise
related to the three-item tuple that `check_incoming_email` returns.
This has two goals:
- sets up a future commit to bulk-validate
emails
- the extracted function is simpler, since it just has errors,
and no codes or deactivated flags
This commit leaves us in a somewhat funny
intermediate state where we have
`action.validate_email` being a glorified
two-line function with strange parameters,
but subsequent commits will clean this up:
- we will eliminate validate_email
- we will move most of the guts of its
other callee to lib/email_validation.py
To be clear, the code is correct here, just
kinda in an ugly, temporarily-disorganized
intermediate state.
We now use the `get_realm_email_validator()`
helper to build an email validator outside
the loop of emails in our invite list.
This allows us to perform RealmDomain queries
only once per request, instead of once per
email.
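The shape of the change, roughly (the calling convention shown is
illustrative):

    # One RealmDomain lookup, done once per request:
    email_allowed_for_realm = get_realm_email_validator(user_profile.realm)

    for email in invitee_emails:
        # No per-email RealmDomain queries inside the loop.
        email_allowed_for_realm(email)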
We now query RealmDomain objects up front. This
change is minor in most circumstances--it sometimes
saves a round trip to the database; other times,
it actually brings back slightly more data
(optimistically).
The big win will come in a subsequent commit,
where we avoid running these queries in a loop
for every callback.
Note that I'm not sure if we intentionally
omitted checks for emails with "+" in them
for some circumstances, but I just preserved
the behavior.
Now called:
validate_email_not_already_in_realm
We have a separate validation function that
makes sure that the email fits into a realm's
domain scheme, and we want to avoid naming
confusion here.
Without the fix here, you will get an exception similar to the one
below if you try to invite one of the cross-realm bots. (The actual
exception is a bit different due to some rebasing on my branch.)
  File "/home/zulipdev/zulip/zerver/lib/request.py", line 368, in _wrapped_view_func
    return view_func(request, *args, **kwargs)
  File "/home/zulipdev/zulip/zerver/views/invite.py", line 49, in invite_users_backend
    do_invite_users(user_profile, invitee_emails, streams, invite_as)
  File "/home/zulipdev/zulip/zerver/lib/actions.py", line 5153, in do_invite_users
    email_error, email_skipped, deactivated = validate_email(user_profile, email)
  File "/home/zulipdev/zulip/zerver/lib/actions.py", line 5069, in validate_email
    return None, (error.code), (error.params['deactivated'])
TypeError: 'NoneType' object is not subscriptable
Obviously, you shouldn't try to invite a cross
realm bot to your realm, but we want a reasonable
error message.
RESOLUTION:
Populate the `code` parameter for `ValidationError`.
BACKGROUND:
Most callers to `validate_email_for_realm` simply catch
the `ValidationError` and then report a more generic error.
That's also what `do_invite_users` does, but it has the
somewhat convoluted codepath through `validate_email`
that triggers this code:
    try:
        validate_email_for_realm(user_profile.realm, email)
    except ValidationError as error:
        return None, (error.code), (error.params['deactivated'])
The way that we're using the `code` parameter for
`ValidationError` feels hacky to me. The intention
behind `code` is to provide a descriptive error to
calling code, and it's not intended for humans, and
it feels strange that we actually translate this in
other places. Here are the Django docs:
https://docs.djangoproject.com/en/3.0/ref/forms/validation/
And then here's an example of us actually translating
a code (not part of this commit, just providing context):
    raise ValidationError(_('%s already has an account') %
                          (email,), code = _("Already has an account."),
                          params={'deactivated': False})
Those codes eventually get put into InvitationError, which
inherits from JsonableError, and we do actually display
these errors in the webapp:
    if skipped and len(skipped) == len(invitee_emails):
        # All e-mails were skipped, so we didn't actually invite anyone.
        raise InvitationError(_("We weren't able to invite anyone."),
                              skipped, sent_invitations=False)
I will try to untangle this somewhat in upcoming commits.
We allow folks to invite emails that are
associated with a mirror_dummy account.
We had a similar test already for registration,
but not invites.
This logic typically affects MIT realms in the
real world, but the logic should apply to any
realm, so I use accounts from the zulip realm
for convenient testing. (For example, we might
run an IRC mirror for a non-MIT account.)
I use a range here because there's some leak
from another test that causes the count to
vary. Once we get this a bit more under control,
we should be able to analyze the leak better.
The substantive improvement here is to use
a strange casing for Hamlet's email, which
will prevent future casing bugs.
I also log in as Cordelia to prevent confusion
that the test has something to do with
inviting yourself. It's more typical for
somebody to invite another person to a realm
(not realizing they're already there).
I also made two readability tweaks.
Several of our queues are capable of doing work that includes
rendering markdown (outgoing_webhook, embedded_bots, embed_links, and
email_mirror). As a result, it's essential that these don't cache
per-request data (specifically, realm filters) longer than they should;
otherwise, editing or deleting linkifiers could leave old settings in
effect until the relevant process was restarted.
Flushing these caches is extremely cheap (just clearing two
dictionaries) and thus is reasonable to do after every queue event,
rather than trying to do it in only the ~1/3 of queues that specifically
do markdown processing. We do the same in our middleware for
reset_queries.
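A hedged sketch of what the flush amounts to in a worker's consume loop
(the method and helper names are approximate):

    def consume_wrapper(self, data) -> None:
        try:
            self.consume(data)
        finally:
            # Cheap: just clears two in-memory dictionaries, so we do it
            # after every event rather than only in markdown-related queues.
            flush_per_request_caches()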
It's not worth writing a test for this because it's very difficult to
create the test setup situation for this bug with a single test worker
process; one needs to edit the linkifier configuration in a different
process than the one sending the message in order to see the bug.
This was a much larger visible bug on Zulip 2.1.x, where the presence
of the message_sender queue meant that this would apply to messages
sent via a browser.
Fixes #14095.
Previously, the input:
====================
- One
- Two
    Two continued
====================
would produce the same output as:
====================
- One
- Two
```
Two continued
```
====================
This was because our CodeBlockProcessor had a higher priority than
the ListIndentProcessor. This issue was discussed here:
https://chat.zulip.org/#narrow/stream/9-issues/topic/continuation.20paragraphs.20in.20list.20items.
The /delete_topic endpoint could be used to request the deletion of a
topic in a way that would cause do_delete_messages to be called with an
empty set, in these cases:
1. Requesting deletion of an empty stream.
2. Requesting deletion of a topic in a private stream with history not
public to subscribers, if the requesting admin doesn't have access to
any of the messages in that topic.
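A guard along these lines avoids doing any work for the empty case (a
hypothetical sketch, not necessarily the literal fix):

    def do_delete_messages(realm, messages) -> None:
        if not messages:
            # Empty stream/topic, or no messages accessible to the
            # requesting admin: nothing to delete, no events to send.
            return
        ...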
This function slims down the data that we get
from the database in order to create the
streams part of our client payload.
We also fix a typo and clearly distinguish between queries and lists
here.