Previously, the bot domain was calculated correctly in most
circumstances, but if you were using the root domain, it would be
e.g. ".chat.zulip.org", not "chat.zulip.org". We fix this, with
perhaps more use of setting REALMS_HAVE_SUBDOMAINS than would be ideal
if we weren't about to set that True unconditionally.
This leads to more than a 2x speedup when tested with
20k+ total subscribers. (For large realms with lots of default
streams, this function deals with LOTS of data, so it is important
to optimize.)
This class encapsulates the mapping of stream ids to
recipient ids, and it is optimized for bulk use and
repeated use (i.e. it remembers values it already fetched).
This particular commit barely improves the performance
of gather_subscriptions_helper, but it sets us up for
further optimizations.
Long term, we may try to denormalize stream_id on to the
Subscriber table or otherwise modify the database so we
don't have to jump through hoops to do this kind of mapping.
This commit will help enable those changes, because we
isolate the mapping to this one new class.
This commit enables user to authenticate with any attribute set in
AUTH_LDAP_USER_SEARCH given that LDAP_EMAIL_ATTR is set to an email
attributes in the ldap server. Thus email and username can be
completely unrelated.
With some tweaks by tabbott to squash in the documentation and make it
work on older servers.
Moves SEND_ALL to inside get_next_hotspots, since it is not something other
files should call.
Also changes the delay to 0s, and gates the code behind an
`if settings.DEVELOPMENT`.
We were mostly excluding inactive users before this fix, but
now we completely ignore them.
This potentially changes some of the data we return from
get_recipient_info(), but the extra user ids before this fix
were effectively ignored by the caller.
The prior code would queue up feedback messages even if the
feedback bot was deactivated, which was just due to oversight
most likely. (People probably rarely disable the feedback bot,
but they should have that option.)
We now triage message content for possible mentions before
going to the cache/DB to get name info. This will create an
extra data hop for messages with mentions, but it will save
a fairly expensive cache lookup for most messages. (This will
be especially helpful for large realms.)
[Note that we need a subsequent commit to actually make the speedup
happen here, since avatars also cause us to look up all users in
the realm.]
I feel like getting notifications about a board's background being
changed isn't very useful information and could interrupt the flow
of other important information such as Card changes or movement,
so I think we should not support this event and
should simply ignore such payloads in the future.
This is a nonfunctional refactor, designed primarily to make it
simpler to extend this code path when we later add support for
controlling whether email notifications go out on stream messages.
Previously, due to a logic bug, this feature would also send email
notifications for all messages on the stream, which is definitely not
the intent. The recent refactoring we just did makes the logic more
obvious.
This creates a lot of logging noise, and also causes confusion
for new contributors when something isn't working as they expect
and they aren't sure if this message is normal or an error.
We should have done this a long time ago, but better late than never.
Basically, this migration would crash in the event that there were any
attachments with particularly long names. The fix is the next
migration, 0042; we just inline it here to avoid that crash.
This sets us up a subsequent commit where we need more data
from the Subscription table to build recipient info, so the
function boundary doesn't work any more for get_recipient_info,
which is part of the heavily optimized send-message
path.
We used to share code here with typing notifications, but
typing notifications need a lot less data than the
send-message path, so it's useful to decouple these two
things. The idioms that are duplicated here are pretty simple
one-liners.
This change optimizes get_status_dict_by_realm() by
introducing query_for_ids(), which quickly computes
an "IN" clause for user ids.
This change also inlines the `two_weeks_ago` check, but
that is just for clarity, not performance.
The prior version of this function was passed in a QuerySet, which
made it difficult to effectively profile the callers, and there
is really no compelling reason to pass in a query any more.
compilemessages command now does all the heavy lifting by creating a
language_name_map.json file under locale directory. This file is used
by get_language_list to retrieve the require information.
Fixes: #6486
This commit makes get_recipient_info() faster by never creating
Django ORM objects. We use the ORM to create a values query
instead, and then we iterate over the rows to create various
collections of ids.
In order to avoid lots of code duplication, this commit unifies
how we query UserProfile for PMs and streams. Prior to this
commit we were getting "wide" UserProfile objects out of
our memcached cache. Now we just go to the database with our
list of userids. The new approach at worst adds one hop to the
database for PMs, which aren't really a performance bottleneck
(compared to streams). And the new approach actually saves a
hop when both partners aren't in cache (plus we don't pay the
penalty of hitting the cache itself).
The performance improvement here is easy to measure for messages
to streams with many users, even with all the other activity
that goes on inside do_send_messages(). I took test_performance()
in test_messages.py, set num_extra_users to 3000, and consistently
measured a ~20% speedup in do_send_messages().
This commit also eliminates fetching of emails. We probably
could have done that in a prior commit, but in this commit it
is very explicit that we don't need it. While removing email
from the query is a no-brainer, it actually had a negigible
impact on performance. Almost all the savings here comes from
not create UserProfile objects.
This function returns a summary of recipient data for a message
that's being sent. It's mostly just moving code into the
old function called get_recipient_user_profiles().
This commit is necessary to prevent bringing back emails from the
DB for all N recipients of a message just to see if the feedback
bot is being invoked.
We calculate `service_bot_tuples` earlier in the function, so that
we don't need "full" UserProfile objects later in the function.
This is part of consolidating code that basically just needs to
triage user_ids.
This starts to phase out the need for UserProfile objects in
do_send_messages(). UserProfile objects are expensive to create
for large streams with lots of users. The objects in the code
before this commit aren't even full UserProfile objects.
This change mostly sets up future performance improvements, but
we also get a minor speedup here when we run a test with 3000
stream subscribers.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.
By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
This function is essentially a copy of get_recipient_user_profiles,
which is about to go away. The new function enforces the contract of
typing indicators, which is that they don't apply to streams, which
allows us to use a relatively simple approach for getting user
profile objects.
We are diverging this code, because the send-message path needs
more optimizations.
This change introduces an extra hop to the database, but it is
generally faster due to nuances of the DB and the ORM. It
also sets us up to optimize get_recipient_user_profiles() by
avoiding creating ORM objects.
I measured the impact of this using a stream with 3000
subscribers, half of whom were idle, and it speeds things up
by 10%.
Avoid a join to UserProfile here speeds up the query from
86ms -> 28ms when you analyze it with about 2000 mobile users
in a 5000-user realm.
We also avoid some code duplication here, since we filter
UserPresence for the same group of users as we filter
PushDeviceToken.
This avoids an O(N-squared) hit during presence queries. The speedup
here is probably negligible compared to everything else going on, but
sets are more semantically correct, anyway.
Before this commit, postgres would choose a non-optimal query
plan to find all presence rows belonging to a realm. We now
do an extra query to get the list of relevant user_ids, which allows
the next query to take advantage of UserPresence's index on
user_profile_id.
Here is the query plan for the offending query (this particular query isn't
verbatim from the code, but it's representative of the problem):
explain analyze
select client_id
from zerver_userpresence
INNER JOIN zerver_userprofile ON
zerver_userprofile.id = zerver_userpresence.user_profile_id
WHERE
zerver_userprofile.is_active and
zerver_userprofile.realm_id = 3;
Hash Join (cost=149.66..506.82 rows=5007 width=4) (actual time=48.834..121.215 rows=5007 loops=1)
Hash Cond: (zerver_userprofile.id = zerver_userpresence.user_profile_id)
-> Seq Scan on zerver_userprofile (cost=0.00..260.11 rows=5369 width=4) (actual time=0.009..24.322 rows=5021 loops=1)
Filter: (is_active AND (realm_id = 3))
Rows Removed by Filter: 3
-> Hash (cost=87.07..87.07 rows=5007 width=8) (actual time=48.789..48.789 rows=5010 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 196kB
-> Seq Scan on zerver_userpresence (cost=0.00..87.07 rows=5007 width=8) (actual time=0.007..24.355 rows=5010 loops=1)
Total runtime: 145.063 ms
You can see above that we're filtering on realm_id instead of using an index.
When you decompose the query into two queries, the total time is about 100ms, for a
savings of 33%. I imagine the savings would be even greater on an instance with lots
of realms. This was tested on dev with one really large realm and one tiny realm.
We were using `.order_by('user_profile_id', '-timestamp') in our
UserPresence query in get_status_dicts_for_query.
We don't need a full sort to produce the dictionary of statuses.
In fact the whole operation in Python is still O(N):
- divvy rows up to be per-user in an O(N) pass
- find max row for the 'aggregated' entry in an O(n) pass
per user
The one minor annoyance of this fix is that datetime_to_timestamp
is lossy, so if you naively call to_presence_dict before finding
the "max" row, you get test flakes if rows are created during the
same second. I decided to avoid calling to_presence_dict so there
are fewer moving parts, but there's still the ugly step of having
to remove the "dt" field from the final results.
The commit() call in fix() breaks migrations and tests (unless you
mock) due to outer transactions.
We now explicitly call commit() from the management command.
Usually a small minority of users are eligible to receive missed
message emails or mobile notifications.
We now filter users first before hitting UserPresence to find idle
users. We also simply check for the existence of recent activity
rather than borrowing the more complicated data structures that we
use for the buddy list.
This commit completely switches us over to using a
dedicated model called MutedTopic to track which topics
a user has muted.
This includes the necessary migrations to create the
table and populate it from legacy data in UserProfile.
A subsequent commit will actually remove the old field
in UserProfile.
Instead of peeking directly at the DB to verify our mutes are
set correctly, we now use the library function. This prepares
us to modify the DB internals while preserving the tests.
The double forward slash (//) after the protocol in URLs was being
mistakenly considered the beginning of an inline JS comment, causing
internationalization strings being cut unexpectedly.
Now the check for inline JS comments is only run in .js files.
Use this new variable to determine if the user already exists while
doing registration. While doing login through GitHub if we press
*Go back to login*, we pass email using email variable. As a result,
the login page starts showing the "User already exists error" if we
don't change the variable.
Admins need to know about private streams to delete them, even
if they are not subscribed. We send the minimal info possible
to the client to allow them to have a UI for that.
The refactor in b46af40bd3 didn't
correctly translate the code for managing request.user and
request._email, resulting in requests for the push notification
bouncer being rejected with this exception:
AttributeError: 'AnonymousUser' object has no attribute 'rate_limits'
This never made sense to be a flag on the UserMessage table, since
it's not per-user state. And in fact it doesn't need to be in a
database at all, since it's easily computed from content anyway.
Fixes#1099.
And it works!
A couple of things still to do:
* When a device token is no longer active, we'll get HTTP status 410.
We should then remove the token from the database so we don't keep
trying to push to it. This is fairly urgent.
* The library we're using has a nice asynchronous API, but this
version doesn't use it. This is OK now, but async will be
essential at scale.
This code empirically doesn't work. It's not entirely clear why, even
having done quite a bit of debugging; partly because the code is quite
convoluted, and because it shows the symptoms of people making changes
over time without really understanding how it was supposed to work.
Moreover, this code targets an old version of the APNs provider API.
Apple deprecated that in 2015, in favor of a shiny new one which uses
HTTP/2 to meet the same needs for concurrency and scale that the old
one had to do a bunch of ad-hoc protocol design for.
So, rip this code out. We'll build a pathway to the new API from
scratch; it's not that complicated.
We'd been getting errors from APNs that appeared to say that the
device tokens we were trying to send to were invalid. It turned out
that the device tokens didn't match the "topic" (i.e. app ID) we were
sending, which was because the topic was wrong, which was because we
were using the wrong SSL cert. But for a while we thought it might be
that we were somehow messing up the device tokens we put into the
database. This logging helped us work out that wasn't the issue, and
would have helped our debugging sooner.
This brings type-checking to the last place we fetch
data from Redis, with the exception of our APNs code
which is being replaced (with a Redis-free version,
thanks to improvements in Apple's APNs API) shortly.
This gives us type-checking, to help prevent bugs like the
last couple of commits fixed in our Tornado code and our
missed-message email handling. Fortunately no behavior
changes are needed here.
Because the Redis client returns exclusively bytes -- even for
hash keys -- even on Python 3, the test `'response' in status`
was always returning false, and the line that tries to decode
as JSON was never running, so we were passing `response`
through as a `bytes` object encoding some JSON.
I'm not sure what the impact of this bug was, and in particular
whether something downstream would have fudged it to make up for
this error.
Redis and the Redis client know nothing but bytes. When we take a
`bytes` object it returns and pass it down as `subject` here, it
causes an exception deep inside message processing if the realm has
any filters, when `bugdown.subject_links` attempts to search the
subject for the filters, which are of course `str` patterns.
For symmetry, make the conversion to bytes on the storing side
explicit too.
Previously, we didn't pass customized HTTP_HOST headers when making
network requests. As we move towards a world where everything is on a
subdomain, we'll want to start doing that.
The vast majority of our test code is written to interact with the
default "zulip" realm, which has a subdomain of "zulip". While
probably longer-term, we'll wish this was the root domain, for now, we
need to make our HTTP requests match what is expected by the test
code.
This commit almost certainly introduces some weird bugs where code was
expecting a different subdomain but the tests doesn't fail yet. It's
not clear how to find all of these, but I've done some grepping.
get_realm_by_email_domain was intended to be registration flow code
not used in other code, but it was leaked to a few places. This
removes one of the main remaining references to it outside the
registration code path.
The older fixture for this event assumed that the "assignee" key
had a value of '{}' if no one was assigned to a PR anymore.
However, that is no longer true because testing with requestbin
showed that in the latest JSON payload for this event, the key
"assignee" now has the value of 'null' (None when converted to
Python) when a user is unassigned from a PR. The current code
didn't handle this correctly. This commit makes sure it does!
Its unclear as to whether the old fixture was simply wrong or
whether GitHub changed its payloads in any way.
Previously, Zulip's server logs would not show which user or client
was involved in login or user registration actions, which made
debugging more annoying than it needed to be.
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where
were updating muted_topics spuriously before calling
do_update_muted_topic.
Previously, realm.uri and realm.host didn't support using a subdomain
of the empty string (""), aka using the root domain.
Also, since we're already accessing self.subdomain, we don't need to
check REALMS_HAVE_SUBDOMAINS again.
Unlike creating a stream, there's really no reason one would want to
call the function to create a realm while uncertain whether that realm
already existed.
For filters like has:link, where the web app doesn't necessarily
want to guess whether incoming messages meet the criteria of the
filter, the server is asked to query rows that match the query.
Usually these queries are search queries, which have fields for
content_matches and subject_matches. Our logic was handling those
correctly.
Non-search queries were throwing an exception related to tuple
unpacking. Now we recognize when those fields are absent and
do the proper thing.
There are probably situations where the web app should stop hitting
this endpoint and just use its own filters. We are making the most
defensive fix first.
Fixes#6118
Most of the code in show_unreads is for diagnosising unread
counts issues, and we may not use that often.
We're creating a dedicated fix_unreads management command with
less clutter.
This change is mostly based on a similar commit from hackerkid
in a feature branch. It borrows both code and ideas. Some of
it's my own stuff, as I was working on a newer branch.
We now call get_user_including_cross_realm_email() inside of
user_profiles_from_unvalidated_emails(), instead of using
get_user_profile_by_email.
This requires a few of our callers to pass down sender into us.
One consequence of this change is that we change the symptoms
for trying to send to emails outside of your realm. In some
cases, we simply raise an error that an email is invalid to us
instead of getting into the deeper validate_recipient_user_profiles
check.
We are trying to convert emails to user_profiles earlier in
the codepath. This may cause subtle changes in which errors
appear, but it's probably generally good to report on bad
addressees sooner than later.
This class simplifies the calling sequence to methods like
check_message and _internal_prep_message, and it's also more
type safe.
Checking for message types is encapsulated with calls to is_stream()
and is_private(). There are also shortcut constructors when you
know that the type of the address (stream vs. private), which is often.