setup_event_queue() generates some logs about loaded event queues, and
it's good for the logging system to have access to the port at that
point already.
I'm not sure what causes some Jira webhook events to not include the
metadata that other events do, but it's definitely a format sent by
real installations of Jira (likely a very old version, since this has
fields missing from what modern Jira does) and we've seen it in
production.
The best we can do is encourage users to upgrade Jira for better data.
The previous starred_messages race handling did not correctly consider
the possibility that an event queue might have been registered without
starred_messages.
Instead of operating on RateLimitedObjects, and making the classes
depend on each too strongly. This also allows getting rid of get_keys()
function from RateLimitedObject, which was a redis rate limiter
implementation detail. RateLimitedObject should only define their own
key() function and the logic forming various necessary redis keys from
them should be in RedisRateLimiterBackend.
type().__name__ is sufficient, and much readable than type(), so it's
better to use the former for keys.
We also make the classes consistent in forming the keys in the format
type(self).__name__:identifier and adjust logger.warning and statsd to
take advantage of that and simply log the key().
This returns us to a consistent logging format regardless of whether
the request is authenticated.
We also update some log examples in docs to be consistent with the new
style.
When a user in login flow using github auth chooses a email that is
not associated with an existing account, it leads to a "continue to
registration" choice. This cannot be tested with the earlier version
of `stage_two_of_registration`.
Also added the test.
Thanks to Mateusz Mandera for the solution.
Co-authored-by: Mateusz Mandera <mateusz.mandera@protonmail.com>
The previous model for GitHub authentication was as follows:
* If the user has only one verified email address, we'll generally just log them in to that account
* If the user has multiple verified email addresses, we will always
prompt them to pick which one to use, with the one registered as
"primary" in GitHub listed at the top.
This change fixes the situation for users going through a "login" flow
(not registration) where exactly one of the emails has an account in
the Zulip oragnization -- they should just be logged in.
Fixes part of #12638.
URLs for config errors were configured seperately for each error
which is better handled by having error name as argument in URL.
A new view `config_error_view` is added containing context for
each error that returns `config_error` page with the relevant
context.
Also fixed tests and some views in `auth.py` to be consistent with
changes.
Saying `foo.lstrip('# ')` does more than just remove
a '# ' prefix. It removes any combination of '#' and
spaces.
We now make the intention slightly more clear.
We would strip these as you'd expect:
# foo
## foo
### foo
but for this we now only strip the first "#":
# # # # # foo
Thanks to @minusworld for catching this--see #14264, which
points out that lstrip() doesn't do what your intuition
might tell you it does.
Now we properly remove the "HTTP_" prefix.
It's not clear to me why we need these prefixes for Django
purposes in the fixtures, but I didn't want to go down
the rabbit hole of fixing those.
To test:
got to http://YOUR-DEV_SERVER/devtools/integrations/
select "bitbucket3" for the integration.
select "diagnostics_ping.json" for the fixture.
see "X_EVENT_KEY" in "Custom HTTP Headers"
Fixes#14264
This is a bit more rigorous than just
dereferencing the first element of
a list comprehension, as it will give a
ValueError if more matches are found than
the test was expecting.
We don't need `do_create_user` to send a partial
event here for bots. The only caller to `do_create_user`
that actually creates bots (apart from some tests that
just need data setup) is `add_bot_backend`, which
sends the more complete event including bot "extras"
like service info.
The modified event tests show the simplification
here (2 events instead of 3).
Also, the bot tests now use tuple unpacking, which
will force a ValueError if we duplicate events
again.
We now restrict emails on the zulip realm, and now
`email` and `delivery_email` will be different for
users.
This change should make it more likely to catch
errors where we leak delivery emails or use the
wrong field for lookups.
We were going back to the database to get all
the users in the realm, when we had them right
there already. I believe this is a legacy
of us running on a very old version of Django
(back in early days), where `bulk_create`
didn't give you back ids in a nice way.
In the interim we added the `RealmAuditLog`
code, which does take advantage of the
existing profiles (and proves we can rely
on them).
But meanwhile we were still
doing a query to get all N users in the
realm. With `selected_related`!
To be fair, bulk_create_users() is by
its very nature a pretty infrequent
operation. This change is more motivated
by code cleanup.
Now we just loop through user_ids for
the Recipient/Subscriber foreign key rows.
I also removed some fairly convoluted code mapping
emails to user_ids and just work in user_id
space.
We try to use the correct variation of `email`
or `delivery_email`, even though in some
databases they are the same.
(To find the differences, I temporarily hacked
populate_db to use different values for email
and delivery_email, and reduced email visibility
in the zulip realm to admins only.)
In places where we want the "normal" realm
behavior of showing emails (and having `email`
be the same as `delivery_email`), we use
the new `reset_emails_in_zulip_realm` helper.
A couple random things:
- I fixed any error messages that were leaking
the wrong email
- a test that claimed to rely on the order
of emails no longer does (we sort user_ids
instead)
- we now use user_ids in some place where we used
to use emails
- for IRC mirrors I just punted and used
`reset_emails_in_zulip_realm` in most places
- for MIT-related tests, I didn't fix email
vs. delivery_email unless it was obvious
I also explicitly reset the realm to a "normal"
realm for a couple tests that I frankly just didn't
have the energy to debug. (Also, we do want some
coverage on the normal case, even though it is
"easier" for tests to pass if you mix up `email`
and `delivery_email`.)
In particular, I just reset data for the analytics
and corporate tests.
We specifically give the existing user different
delivery_email and email addresses, to prevent false
positives during the test that checks that users
signing up with an already-existing email get
an error message.
(We also rename the test.)
I guess `test_classes` has 100% line coverage
enforcement, which is a bit tricky for error
handling.
This fixes that, as well as making the name
snake_case and improving the format of the
errors.
This test was using the anti-pattern of doing an
assertion inside a conditional.
I added the `findOne` helper to make it easier
to write robust tests for scenarios like this.
We had a bunch of ugly hacks to monkey patch things due to upstream
being temporarily unmaintained and not merging PRs. Now the project is
active again and the fixes have been merged and included in the latest
version - so we clean up all that code.
If I send a message from a normal Zulip client, it is
considered to be "read" by me. But if I send it via
an API program (using my human account), the message
is not immediately "read" by me.
Now we handle this correctly in `get_raw_unread_data`.
The symptom of this was that these messages would get
"stuck" in "Private Messages" narrows until the next
time you reloaded your app.
I've always thought of distributed teams as the place where Zulip
really shines over other tools, because chat is much more important in
that context.
And I've always been kinda unhappy with "most productive team chat" as
a line.
There's a lot more we should do here, but this is a start.
Using an Exists subquery to avoid scanning the entire Subscription
table seems to speed things up greatly.
Set up with:
./manage.py populate_db --extra_users 2000 --extra-streams 1000
Tested on my computer, the original function was taking ~1.2seconds,
the optimized version only ~0.05-0.06.
Likely fixes#13874; we can re-open if after production testing we
feel more work is warranted.
This ensures that even if it were possible to create an MIT Kerberos
account with a malicious username and/or hack webathena to pretend
that's the case, one couldn't do anything malicious.
This security improvement only impacts a single installation of Zulip
where Zephyr mirroring is in use that has already had the fix applied,
so there's no reason to do a security notice for it.
Found by Graham Bleaney using pysa.
In 220c2a5ff3 I
introduced a query to find invites by delivery_email
but was still using email as the key.
For most realms `email` and `delivery_email` are
synonymous, so this temporary bug would not affect
them. For realms that restrict emails, the invite
would have probably failed for other reasons, but
the symptom would have been less clear.
We now have this API...
If you really just need to log in
and not do anything with the actual
user:
self.login('hamlet')
If you're gonna use the user in the
rest of the test:
hamlet = self.example_user('hamlet')
self.login_user(hamlet)
If you are specifically testing
email/password logins (used only in 4 places):
self.login_by_email(email, password)
And for failures uses this (used twice):
self.assert_login_failure(email)
This reduces query counts in some cases, since
we no longer need to look up the user again. In
particular, it reduces some noise when we
count queries for O(N)-related tests.
The query count is usually reduced by 2 per
API call. We no longer need to look up Realm
and UserProfile. In most cases we are saving
these lookups for the whole tests, since we
usually already have the `user` objects for
other reasons. In a few places we are simply
moving where that query happens within the
test.
In some places I shorten names like `test_user`
or `user_profile` to just be `user`.
We want a clean codepath for the vast majority
of cases of using api_get/api_post, which now
uses email and which we'll soon convert to
accepting `user` as a parameter.
These apis that take two different types of
values for the same parameter make sweeps
like this kinda painful, and they're pretty
easy to avoid by extracting helpers to do
the actual common tasks. So, for example,
here I still keep a common method to
actually encode the credentials (since
the whole encode/decode business is an
annoying detail that you don't want to fix
in two places):
def encode_credentials(self, identifier: str, api_key: str) -> str:
"""
identifier: Can be an email or a remote server uuid.
"""
credentials = "%s:%s" % (identifier, api_key)
return 'Basic ' + base64.b64encode(credentials.encode('utf-8')).decode('utf-8')
But then the rest of the code has two separate
codepaths.
And for the uuid functions, we no longer have
crufty references to realm. (In fairness, realm
will also go away when we introduce users.)
For the `is_remote_server` helper, I just inlined
it, since it's now only needed in one place, and the
name didn't make total sense anyway, plus it wasn't
a super robust check. In context, it's easier
just to use a comment now to say what we're doing:
# If `role` doesn't look like an email, it might be a uuid.
if settings.ZILENCER_ENABLED and role is not None and '@' not in role:
# do stuff
Instead of trying to set the _requestor_for_logs attribute in all the
relevant places, we try to use request.user when possible (that will be
when it's a UserProfile or RemoteZulipServer as of now). In other
places, we set _requestor_for_logs to avoid manually editing the
request.user attribute, as it should mostly be left for Django to manage
it.
In places where we remove the "request._requestor_for_logs = ..." line,
it is clearly implied by the previous code (or the current surrounding
code) that request.user is of the correct type.