This test was flaky due to some date-related
non-determinism. I make all the Message objects
current to make add_new_user_history reliably
try to bulk-update UserMessage rows to read.
Because of the very large `oneOf` clause of the formats of events
possible in Zulip's `GET /events` system, we had issues with
`test-backend` failures for missing documentation for a new event
format being like 1000 lines of output, which was very much unhelpful.
Fix this by limiting the output use only the oneOf variants that are
broadly similar to the actual payload received.
Fixes#16023.
The comment still pointed to 'vacate' event flow, but
we have removed the vacate event in a9356508ca.
This commit fixes the comment to depict the correct
purpose of below lines, i.e. to test the remove
event flow.
We were including 'realm_user' in event_types along with 'subscription',
but we don't send event of type 'realm_user' when subscribing to a new
stream. This was added in 1c332f5d6a.
This commit removes 'realm_user' from event_types.
The name used to be included in the id_token, but this seems to have
been changed by Apple and now it's sent in the `user` request param.
https://github.com/python-social-auth/social-core/pull/483 is the
upstream PR for this - but upstream is currently unmaintained, so we
have to monkey patch.
We also alter the tests to reflect this situation. Tests no longer put
the name in the id_token, but rather in the `user` request param in the
browser flow, just like it happens in reality.
An adaptation has to be made in the native flow - since the name won't
be included by Apple in the id_token anymore, the app, when POSTing
to the /complete/apple/ endpoint,
can (and should for better user experience)
add the `user` param formatted as json of
{"email": "hamlet@zulip.com", "name": {"firstName": "Full", "lastName": "Name"}}
dict. This is also reflected by the change in the
native flow tests.
We now can send an implied matrix of user/stream tuples
for peer_add and peer_remove events.
The client code basically does this:
for stream_id in event['stream_ids']:
for user_id in event['user_ids']:
update_sub(stream_id, user_id)
We used to send individual events, which gets real
expensive when you are creating new streams. For
the case of copy-to-stream case, we should see
events go from U to 1, where U is the number of users
added.
Note that we don't yet fully optimize the potential
of this schema. For adding a new user with lots
of default streams, we still send S peer_add events.
And if you subscribe a bunch of users to a bunch of
private streams, we only go from U * S to S; we can't
optimize it down to one event easily.
All the fields of a stream's recipient object can
be inferred from the Stream, so we just make a local
object. Django will create a Message object without
checking that the child Recipient object has been
saved. If that behavior changes in some upgrade,
we should see some pretty obvious symptom, including
query counts changing.
Tweaked by tabbott to add a longer explanatory comment, and delete a
useless old comment.
This saves us a query for edge cases like when
you try to unsubscribe from a public stream
that you have already unsubscribed from.
But this is mostly to prep for upcoming
optimizations.
Initially markdown titles were overridden by Youtube and Vimeo preview titles.
But now it will check if any markdown title is present to replace Youtube or
Vimeo preview titles, if preview of linked websites is enabled.
Fixes#16100
Upstream has slightly changed the whitespace around stashes. Take
this opportunity to clean up the extra blank lines we were outputting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
That class is an artifact of when Stream
didn't have recipient_id. Now it's simpler
to deal with stream subscriptions.
We also save a query during page load (and
other places where we get subscriber
info).
We already trust ids that are put on our queue
for deferred work. For example, see the code for
"mark_stream_messages_as_read_for_everyone"
We now pass stream_recipient_id when we queue
up work for do_mark_stream_messages_as_read.
This generally saves about 3 queries per
user when we unsubscribe them from a stream.
We get two speedups:
* The query to get existing subscribers only
gets the two fields we need. We no longer
need all the overhead of user_profile
and recipient data being returned in the
query.
* We avoid Django making extra hops to the
database to get user info.
We replace get_peer_user_ids_for_stream_change
with two bulk functions to get peers and/or
subscribers.
Note that we have three codepaths that care about
peers:
subscribing existing users:
we need to tell peers about new subscribers
we need to tell subscribed user about old subscribers
unsubscribing existing users:
we only need to tell peers who unsubscribed
subscribing new user:
we only need to tell peers about the new user
(right now we generate send_event
calls to tell the new user about existing
subscribers, but this is a waste
of effort that we will fix soon)
The two bulk functions are this:
bulk_get_subscriber_peer_info
bulk_get_peers
They have some overlap in the implementation,
but there are some nuanced differences that are
described in the comments.
Looking up peers/subscribers in bulk leads to some
nice optimizations.
We will save some memchached traffic if you are
subscribing to multiple public streams.
We will save a query in the remove-subscriber
case if you are only dealing with private streams.
We used to send occupy/vacate events when
either the first person entered a stream
or the last person exited.
It appears that our two main apps have never
looked at these events. Instead, it's
generally the case that clients handle
events related to stream creation/deactivation
and subscribe/unsubscribe.
Note that we removed the apply_events code
related to these events. This doesn't affect
the webapp, because the webapp doesn't care
about the "streams" field in do_events_register.
There is a theoretical situation where a
third party client could be the victim of
a race where the "streams" data includes
a stream where the last subscriber has left.
I suspect in most of those situations it
will be harmless, or possibly even helpful
to the extent that they'll learn about
streams that are in a "quasi" state where
they're activated but not occupied.
We could try to patch apply_event to
detect when subscriptions get added
or removed. Or we could just make the
"streams" piece of do_events_register
not care about occupy/vacate semantics.
I favor the latter, since it might
actually be what users what, and it will
also simplify the code and improve
performance.
The query to get "occupied" streams has been expensive
in the past. I'm not sure how much any recent attempts
to optimize that query have mitigated the issue, but
since we clearly aren't sending this data, there is no
reason to compute it.
Using web_public_guest for anonymous users is confusing since
'guest' is actually a logged-in user compared to
web_public_guest which is not logged-in and has only
read access to messages. So, we rename it to
web_public_visitor.
This is a more thorough test of adding multiple
streams for multiple users, including streams
that users have already subscribed to.
The extra queries here are due to the fact
that we call `principal_to_user_profile` in
a loop in the view. So that's an example
of O(N) overhead. We may be able to bulk-fetch
these users eventually.
This is a pure extraction, except that I remove a
redundant check that `len(principals) > 0`. Whenever
that value is false, then `new_subscriptions` will
only have one possible entry, which is the current
user, and we skip that in the loop.
We no longer do O(N) queries to get existing streams.
This is a somewhat contrived use case--generally, we
are not trying to re-subscribe a user to several
streams. Still, we want to avoid this.
This commit also makes `test_bulk_subscribe_many`
do more work, and the change to the test helped
me discover this bug.
If a user asks to be subscribed to a stream
that they are already subscribed to, then
that stream won't be in new_stream_user_ids,
and we won't need to send an event for it.
This change makes that happen more automatically.
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.
This may be overkill--I would also be happy if we
just included the assertions from this commit.
ssh always runs its command through a shell (after naïvely joining
multiple arguments with spaces), so it needs an extra level of shell
quoting. This should have no effect because we already validated user
with a regex, but it’s better for escaping to be locally correct in
case the context changes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>