We now can send an implied matrix of user/stream tuples
for peer_add and peer_remove events.
The client code basically does this:
for stream_id in event['stream_ids']:
for user_id in event['user_ids']:
update_sub(stream_id, user_id)
We used to send individual events, which gets real
expensive when you are creating new streams. For
the case of copy-to-stream case, we should see
events go from U to 1, where U is the number of users
added.
Note that we don't yet fully optimize the potential
of this schema. For adding a new user with lots
of default streams, we still send S peer_add events.
And if you subscribe a bunch of users to a bunch of
private streams, we only go from U * S to S; we can't
optimize it down to one event easily.
All the fields of a stream's recipient object can
be inferred from the Stream, so we just make a local
object. Django will create a Message object without
checking that the child Recipient object has been
saved. If that behavior changes in some upgrade,
we should see some pretty obvious symptom, including
query counts changing.
Tweaked by tabbott to add a longer explanatory comment, and delete a
useless old comment.
This saves us a query for edge cases like when
you try to unsubscribe from a public stream
that you have already unsubscribed from.
But this is mostly to prep for upcoming
optimizations.
This doesn't change anything yet, but the goal is
to eventually optimize events for the case where
one user (typically a new user) gets subscribed
to multiple public streams.
Initially markdown titles were overridden by Youtube and Vimeo preview titles.
But now it will check if any markdown title is present to replace Youtube or
Vimeo preview titles, if preview of linked websites is enabled.
Fixes#16100
Upstream has slightly changed the whitespace around stashes. Take
this opportunity to clean up the extra blank lines we were outputting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
There was no need to put "stream_id" on the sub
dictionary here. It's kinda annoying to introduce
the little helper here, but I feel
that's better than crufting up the sub data
structure.
The is_web_public flag is already in Stream.API_FIELDS,
so there is no reason for all this complicated logic.
There's no reason to hack it on to the subscription
object.
We replace all_streams_id with a map.
We also use it to populate never_subscribed_streams.
And all_streams_map is a superset of stream_hash,
which we will soon kill off as well.
Apparently I put these parens in the code as
part of 73c30774cb
during 2017.
It looks like I extracted is_public during
the middle of my change and forgot to remove
the unnecessary parens. (The code was correct,
but it makes it look like a tuple if you're
skimming it too quickly.)
That class is an artifact of when Stream
didn't have recipient_id. Now it's simpler
to deal with stream subscriptions.
We also save a query during page load (and
other places where we get subscriber
info).
Let the callers access stream.recipient as needed.
It costs the same, and some of the callers can
actually stop caring about the actual Recipient
object.
We already trust ids that are put on our queue
for deferred work. For example, see the code for
"mark_stream_messages_as_read_for_everyone"
We now pass stream_recipient_id when we queue
up work for do_mark_stream_messages_as_read.
This generally saves about 3 queries per
user when we unsubscribe them from a stream.
We get two speedups:
* The query to get existing subscribers only
gets the two fields we need. We no longer
need all the overhead of user_profile
and recipient data being returned in the
query.
* We avoid Django making extra hops to the
database to get user info.
Previously, the transaction.atomic() was not properly scoped to ensure
that RealmAuditLog entries were created in the same transaction,
making it possible for state changes to not be properly recorded in
RealmAuditLog.
When apps like mobile register for "streams", we
will now just use active streams as our baseline,
rather than "occupied" streams.
This means we will send a stream that is active,
even if it happens to have zero occupants. It's
actually pretty rare that a stream has zero occupants,
and it's not exactly clear that we want to exclude
a non-occupied but otherwise active stream from
our list of streams.
It also happens to be fairly expensive to compute
whether a stream is occupied.
This change only affects API clients (including
possibly our mobile app). The main webapp never
used the data from this codepath.
We replace get_peer_user_ids_for_stream_change
with two bulk functions to get peers and/or
subscribers.
Note that we have three codepaths that care about
peers:
subscribing existing users:
we need to tell peers about new subscribers
we need to tell subscribed user about old subscribers
unsubscribing existing users:
we only need to tell peers who unsubscribed
subscribing new user:
we only need to tell peers about the new user
(right now we generate send_event
calls to tell the new user about existing
subscribers, but this is a waste
of effort that we will fix soon)
The two bulk functions are this:
bulk_get_subscriber_peer_info
bulk_get_peers
They have some overlap in the implementation,
but there are some nuanced differences that are
described in the comments.
Looking up peers/subscribers in bulk leads to some
nice optimizations.
We will save some memchached traffic if you are
subscribing to multiple public streams.
We will save a query in the remove-subscriber
case if you are only dealing with private streams.
This will ensure that we always fully execute the database part of
modifying subscription objects. In particular, this should prevent
invariant failures like #16347 where Subscription objects were created
without corresponding RealmAuditLog entries.
Fixes#16347.
We don't need the select_related('user_profile')
optimization any more, because we just keep
track of user info in our own data structures.
In this codepath we are never actually modifying
users; we just occasionally need their ids or
emails.
This can be a pretty substantive improvement if
you are adding a bunch of users to a stream
who each have a bunch of their own subscriptions.
We could also limit the number of full rows in this
query by adding an extra hop to the DB just to
get colors (using values_list), and then only get
full sub info for the streams that we're adding, rather
than getting every single subscription, in full, for each user.
Apart from finding what colors the user has already
used, the only other reason we need all the columns
in Subscription here is to handle streams that
need to be reactivated. Otherwise we could do
only("id", "active", "recipient_id", "user_profile_id")
or similar. Fortunately, Subscription isn't
an overly wide table; it's mostly bool fields.
But by far the biggest thing to avoid is bringing
in all the extra user_profile data.
We have pretty good coverage on query counts here,
so I think this fix is pretty low risk.
This class removes a lot of the annoying tuples
we were passing around.
Also, by including the user everywhere, which
is easily available to us when we make instances
of SubInfo, it sets the stage to remove
select_related('user_profile').
We used to send occupy/vacate events when
either the first person entered a stream
or the last person exited.
It appears that our two main apps have never
looked at these events. Instead, it's
generally the case that clients handle
events related to stream creation/deactivation
and subscribe/unsubscribe.
Note that we removed the apply_events code
related to these events. This doesn't affect
the webapp, because the webapp doesn't care
about the "streams" field in do_events_register.
There is a theoretical situation where a
third party client could be the victim of
a race where the "streams" data includes
a stream where the last subscriber has left.
I suspect in most of those situations it
will be harmless, or possibly even helpful
to the extent that they'll learn about
streams that are in a "quasi" state where
they're activated but not occupied.
We could try to patch apply_event to
detect when subscriptions get added
or removed. Or we could just make the
"streams" piece of do_events_register
not care about occupy/vacate semantics.
I favor the latter, since it might
actually be what users what, and it will
also simplify the code and improve
performance.
The query to get "occupied" streams has been expensive
in the past. I'm not sure how much any recent attempts
to optimize that query have mitigated the issue, but
since we clearly aren't sending this data, there is no
reason to compute it.
Using web_public_guest for anonymous users is confusing since
'guest' is actually a logged-in user compared to
web_public_guest which is not logged-in and has only
read access to messages. So, we rename it to
web_public_visitor.
We no longer do O(N) queries to get existing streams.
This is a somewhat contrived use case--generally, we
are not trying to re-subscribe a user to several
streams. Still, we want to avoid this.
This commit also makes `test_bulk_subscribe_many`
do more work, and the change to the test helped
me discover this bug.
If a user asks to be subscribed to a stream
that they are already subscribed to, then
that stream won't be in new_stream_user_ids,
and we won't need to send an event for it.
This change makes that happen more automatically.
Let
U = number of users to subscribe
S = number of streams to subscribe
We were technically doing N^3 amount of work
when we sent certain events, or to be more
precise, U * S * S amount of work. For each
stream, we were looping through a list of tuples
of size U * S to find the users for the stream.
In practice either U or S is usually 1, so the
performance gains here are probably negligible,
especially since the constant factors here
were just slinging around Python data.
But the code is actually more readable now, so
it's a double win.
We rename needs_new_sub (which sounds like
a boolean!) to new_recipient_ids, and we
calculate it explicitly within the loop, so
that we don't need to worry as much about
subsequent passes through the loop mutating it.
This allows us to also remove recipient_ids,
which in turn lets us remove recipients_map,
albeit with a small tweak for stream_map.
I also introduce the my_subs local, which
I use to more directly populate used_colors,
as well as using it as the loop var.
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.
This may be overkill--I would also be happy if we
just included the assertions from this commit.
This function now does all the work that we used
to do with notify_subscriptions_added happening
inside a loop.
There's a small fine-tuning here, where we only
get recent traffic on streams that we're actually
sending events for.
We now just pass in all_subscribers_by_stream, rather
than a callback.
We also move sub_tuples_by_user closer to the
loop where we call notify_subscriptions_added.
This preserves the alpha layer on GIF images that need to be resized
before being uploaded. Two important changes occur here:
1. The new frame is a *copy* of the original image, which preserves the
GIF info.
2. The disposal method of the original GIF is preserved. This
essentially determines what state each frame of the GIF starts from
when it is drawn; see PIL's docs:
https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving
for more info.
This resolves some but not all of the test cases in #16370.
This low-level interface allows consuming from a queue with timeouts.
This can be used to either consume in batches (with an upper timeout),
or one-at-a-time. This is notably more performant than calling
`.get()` repeatedly (what json_drain_queue does under the hood), which
is "*highly discouraged* as it is *very inefficient*"[1].
Before this change:
```
$ ./manage.py queue_rate --count 10000 --batch
Purging queue...
Enqueue rate: 11158 / sec
Dequeue rate: 3075 / sec
```
After:
```
$ ./manage.py queue_rate --count 10000 --batch
Purging queue...
Enqueue rate: 11511 / sec
Dequeue rate: 19938 / sec
```
[1] https://www.rabbitmq.com/consumers.html#fetching
Despite its name, the `queue_size` method does not return the number
of items in the queue; it returns the number of items that the local
consumer has delivered but unprocessed. These are often, but not
always, the same.
RabbitMQ's queues maintain the queue of unacknowledged messages; when
a consumer connects, it sends to the consumer some number of messages
to handle, known as the "prefetch." This is a performance
optimization, to ensure the consumer code does not need to wait for a
network round-trip before having new data to consume.
The default prefetch is 0, which means that RabbitMQ immediately dumps
all outstanding messages to the consumer, which slowly processes and
acknowledges them. If a second consumer were to connect to the same
queue, they would receive no messages to process, as the first
consumer has already been allocated them. If the first consumer
disconnects or crashes, all prior events sent to it are then made
available for other consumers on the queue.
The consumer does not know the total size of the queue -- merely how
many messages it has been handed.
No change is made to the prefetch here; however, future changes may
wish to limit the prefetch, either for memory-saving, or to allow
multiple consumers to work the same queue.
Rename the method to make clear that it only contains information
about the local queue in the consumer, not the full RabbitMQ queue.
Also include the waiting message count, which is used by the
`consume()` iterator for similar purpose to the pending events list.
We modify access_stream_for_delete_or_update function to return
Subscription object also along with stream. This change will be
helpful in avoiding an extra query to get subscription object in
code for updating subscription role.
For streams in which only full members are allowed to post,
we block guest users from posting there.
Guests users were blocked from posting to admin only streams
already. So now, guest users can only post to
STREAM_POST_POLICY_EVERYONE streams.
This is not a new feature but a bugfix which should have
happened when implementing full member stream policy / guest users.
Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which
returns a sized and padded version of the image, expanded to fill the
requested aspect ratio and size.
Fixes part of #16370.
Currently, drain_queue and json_drain_queue ack every message as it is
pulled off of the queue, until the queue is empty. This means that if
the consumer crashes between pulling a batch of messages off the
queue, and actually processing them, those messages will be
permanently lost. Sending an ACK on every message also results in a
significant amount lot of traffic to rabbitmq, with notable
performance implications.
Send a singular ACK after the processing has completed, by making
`drain_queue` into a contextmanager. Additionally, use the `multiple`
flag to ACK all of the messages at once -- or explicitly NACK the
messages if processing failed. Sending a NACK will re-queue them at
the front of the queue.
Performance of a no-op dequeue before this change:
```
$ ./manage.py queue_rate --count 50000 --batch
Purging queue...
Enqueue rate: 10847 / sec
Dequeue rate: 2479 / sec
```
Performance of a no-op dequeue after this change (a 25% increase):
```
$ ./manage.py queue_rate --count 50000 --batch
Purging queue...
Enqueue rate: 10752 / sec
Dequeue rate: 3079 / sec
```
Part of #16094.
Moved the language selection preference logic from home.py to a new
function in i18n.py to avoid repetition in analytics views and home
views.
This prevents the memcached connection from being shared across
multiple processes, and hopefully addresses unexpected behavior from
cached functions like get_user_profile_by_id invoked inside the worker
processes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We call build_message_send_dict from check_message instead of
do_send_messages.
This is a prep commit for adding a new setting for handling
wildcard mentions in large streams.
We extract the loop for building message dict in
do_send_messages in a separate function named
build_message_send_dict.
This is a prep commit for moving the code for building
of message dict in check_message.
There is a bug where we send event for even
those messages which do not have embedded links
as we are using single set 'links_for_embed' to
check whether we have to send event for
embedded links or not.
This commit fixes the bug by adding 'links_for_embed'
in message dict itself and send the event only
if that message has embedded links.
This commit removes the unnecessary comment which was added in
9454683108, when we were using message.get() for keys which
were also passed as args in do_send_messages, but there are no
such keys in the current code.
This commit removes the unnecessary line of code to get
rendered_content from message dict sent by check_message
when it actually does not inlcude 'rendered_content' key.
This line was added in 9454683108, but now we do not send
rendered_content in the message dict as we render the message
in do_send_messages itself.
A later commit alters `authenticate` of EmailAuthBackend to
add a store `needs_to_change_password` variable to session
which is useful to insist users on changing their weak password.
The tests start failing with that change because client.login()
runs `authenticate` without a `request` object. So, this commit
sends a request object with `request.session=self.client.session`
to self.client.login() in tests wherever needed.
We now no longer define any schemas in test_events--all
of them are in event_schema, which helps our tooling
cross-check schemas for openapi and node tests.
It happens that whether you add a reaction or remove
a reaction, we send the exact same fields, just using
a different op code.
This sort of symmetry is actually kind of rare, as
usually "add" events have more fields, and "remove" events
might just send an id of something to remove.
Our openapi schema treats these as two seperate events,
so we are more consistent with it, and it helps our
schema-checking tooling for node fixtures, too.
Note that we now have to exempt the two events from
our openapi checks, due to the is_mirror_dummy field
in the deprecated user block. We can decide how to
handle this later--one possibility is to just add it
as an optional field on the event_schema side.
Note that we use value_type for value instead of
bool, since properties can be non-bool things
like color, which we just don't test now. We
should test them.
We more than compensate for this by checking
the actual value of the value in
check_subscription_update.
There is a legacy format where we send
singular "message_id" instead of plural
"message_ids".
Then there are different fields for "private"
and "stream" message types.
Note that we make the schema for profile_data
slightly more realistic, but it doesn't actually get
exercised by our current tests (apart from
making sure it's a dict), since we don't have
profile data for our test realm.
We also don't have the optional fields for bots,
since our tests don't exercise that, nor
delivery_email.
So we exempt realm_user_add_event from openapi
checks for now.
When we try to match the openapi specs better, we
will probably want to add a few tests to test_events.
Obviously getting good coverage for adding users
would be nice for all these scenarios:
* delivery_email matters
* bots
* realm has profile fields
This is a prep commit for supporting "presence"
events, where the key of the dictionary is some
arbitrary string like "website" but the value
of the dictionary is another dictionary itself
with keys that are more like variable names.
This also forces us to create TupleType.
We exempt this from the openapi check,
since we haven't figured out how to model
tuples in openapi with the same precision
as event_schema (and it may be impossible).
Long term we just want to stop dealing in
tuples, of course.
StringDict is a data type for representing dictionaries where
all keys and values are strings. Add this data type to data_types.py
and edit other files so that this data type is put to use and tested.
(slightly tweaked by @showell to remove a comment and shorten
a var name now that we have a proper data type)
We also make our schema in event_schema reflect this,
which in turn makes us match the already accurate
openapi spec, so we no longer need to exempt four
types of events from our sanity checks.
We might want to rename the tool to something more
general now, since we are really reconciling three
things:
- node fixtures
- event_schema checkers for test_events
- openapi specs
The way we compare python and openapi schemas is
as follows:
- first convert openapi schemas to be build
from DictType, ListType, etc. with from_opeapi
- do a diff on the schemas
Most of the new code is just having the FooType
family of classes serialize themselves with schema().
Defining types with an object hierarchy
of type classes will allow us to build
functionality that was impossible (or
really janky) with the validators.py
approach of composing functions.
Most of the changes to event_schema.py
were automated search/replaces.
This patch doesn't really yet take
advantage of the new FooType classes,
but we will use it soon to audit our
openapi specs.
user_profile will be None for web_public_guests here. Hence, for
settings (of which most be inaccessible by web public guest),
which require a user_profile, we either set an empty value for
them or set them to a default value. This will help render
the frontend or extend support to our clients without breaking
a lot of code.
Tweaked by tabbott to add many comments.
This argument does not define if an endpoint "is a webhook"; it is set
for "/api/v1/messages", which is not really a webhook, but allows
access from webhooks.
If multiple filters match the same string, we run into an infinite
loop of converting string into urls. To fix it, we mark the matched
string as atomic after first conversion.
In ae58ed5a7 we decided to echo back the text, when no Pygments lexer
matching that language was found. When we do so, we must take care to
HTML escape the lang before wrapping it in a data-code-language attribute.
Tweaked by tabbott to make clear the escaping is defensive.
Calling `render()` in a middleware before LocaleMiddleware has run
will pick up the most-recently-set locale. This may be from the
_previous_ request, since the current language is thread-local. This
results in the "Organization does not exist" page occasionally being
in not-English, depending on the preferences of the request which that
thread just finished serving.
Move HostDomainMiddleware below LocaleMiddleware; none of the earlier
middlewares call `render()`, so are safe. This will also allow the
"Organization does not exist" page to be localized based on the user's
browser preferences.
Unfortunately, it also means that the default LocaleMiddleware catches
the 404 from the HostDomainMiddlware and helpfully tries to check if
the failure is because the URL lacks a language component (e.g.
`/en/`) by turning it into a 304 to that new URL. We must subclass
the default LocaleMiddleware to remove this unwanted functionality.
Doing so exposes a two places in tests that relied (directly or
indirectly) upon the redirection: '/confirmation_key'
was redirected to '/en/confirmation_key', since the non-i18n version
did not exist; and requests to `/stats/realm/not_existing_realm/`
incorrectly were expecting a 302, not a 404.
This regression likely came in during f00ff1ef62, since prior to
that, the HostDomainMiddleware ran _after_ the rest of the request had
completed.
When converting fenced code markdown, we add the language (if specified)
in a data-attribute by tweaking the HTML generated. Doing so, allows the
frontend to make use of this attr to display view-in-playground option
for codeblocks.
We use pygments to get the lexer subclass name and use that instead of
directly using the language in the data-attribute. Doing so, helps us
map different language aliases (like `js` and `javascript`) into a common
variable (like `JavaScript`) - and avoids the client from dealing with
multiple tags corresponding to the same language.
The html structure for a message like this:
``` js
..content..
```
would now be:
<div class="codehilite" data-codehilite-language="JavaScript">
<pre>..content..</pre>
</div>
Tests and fixtures amended.
This was a broken abstraction that returned to its caller within
multiple forked processes on exceptions, and encouraged ignoring the
error code (as all of its callers did).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Fixes#16284.
Most of the work for this was done when we implemented correct
behavior for guest users, since they treat public streams like private
streams anyway.
The general method involves moving the messages to the new stream with
special care of UserMessage.
We delete UserMessages for subs who are losing access to the message.
For private streams with protected history, we also create UserMessage
elements for users who are not present in the old stream, since that's
important for those users to access the moved messages.
Previously, S3UploadBackend.delete_export_tarball failed to strip the
leading ‘/’ from the export path. This mistake is now caught by Moto
1.3.15. I expect it caused deletion failures in the real S3, although
I haven’t verified this.
We store export_path in the audit log with a leading ‘/’, but the
actual S3 keys do not have a leading ‘/’. Changing either system
would require a migration. So the new convention is that the
variables named ‘export_path’ have a leading ‘/’, while variables
named ‘path_id’ or ‘key’ do not.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The previous code only worked by accident and hyperlink 20.0.0 breaks
it.
>>> hyperlink.parse("example.com").replace(scheme="https")
DecodedURL(url=URL.from_text('https:example.com'))
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Django treats path("<name>") like re_path(r"(?P<name>[^/]+)") and
path("<path:name>") like re_path(r"(?P<name>.+)").
This is more readable and consistent than the mix of slightly
different regexes we had before, and fixes various bugs:
• The r'apps/(.*)$' regex was missing a start anchor ^, so it
incorrectly matched all URLs that included apps/ as a substring
anywhere.
• The r'accounts/login/(google)/$' regex was missing a start anchor ^,
so it incorrectly matched all URLs that ended with
accounts/login/google/.
• The type annotation of zerver.views.realm_export.delete_realm_export
takes export_id as an int, but it was previously passed as a string.
• The type annotation of zerver.views.users.avatar takes medium as a
bool, but it was previously passed as a string.
• The [0-9A-Za-z]+ pattern for uidb64 was missing the - and _
characters that can validly be part of a base64url encoded
string (although I think the id is actually a decimal integer here,
in which case only 012345ADEIMNOQTUYcgjkwxyz are present in its
base64url encoding).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This clears it out of the data sent to Sentry, where it is duplicative
with the indexed metadata -- and potentially exposes PHI if Sentry's
"make this issue public" feature is used.
Any exception is an "unexpected event", which means talking about
having an "unexpected event logger" or "unexpected event exception" is
confusing. As the error message in `exceptions.py` already explains,
this is about an _unsupported_ event type.
This also switches the path that these exceptions are written to,
accordingly.
8e10ab282a moved UnexpectedWebhookEventType into
`zerver.lib.exceptions`, but left the import into
`zserver.lib.webhooks.common` so that webhooks could continue to
import the exception from there.
This clutters things and adds complexity; there is no compelling
reason that the exception's source of truth should not move alongside
all other exceptions.
The main race conditions, which actually happened in production was with
concurrent execution of deliver_email and clear_scheduled_emails.
clear_scheduled_emails could delete all email.users in the middle of
deliver_email execution, causing it to pass empty to_user_ids list to
send_email. We mitigate this by getting the list of user ids in a single
query and moving forward with that snapshot, not having to worry about
database data being mutated anymore.
clear_scheduled_emails had potential race conditions with concurrent
execution of itself due to not locking the appropriate rows upon
selecting them for the purpose of potentially deleting them. FOR UPDATE
locks need to be acquired to prevent simultaneous mutation.
Tested manually with some print+sleep debugging to make some races
happen.
fixes #zulip-2k (sentry)
There are three functional side effects:
• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.
• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.
• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.
(The base32 alphabet is A-Z 2-7. We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
For web-public streams, clients can access full topic history
without being authenticated. They only need to additionally
send "streams:web-public" narrow with their request like all
the other web-public queries.
When user requests for a realm that doesn't exists, we raise
a InvalidSubdomainError.
This reduces our effort at repeatedly ensuring realm is valid
in request in web-public queries.
Our isort configuration was almost Black-compatible, but we were
missing ensure_newline_before_comments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Some users setup zulip with trailing / at end, like 'https://meet.jit.si/
leading to extra / on clients while generating video chat link.
This commit removes trailing '/' if it exists to make it consistent. Manual
testing was done by generating jitsi url.
Fixes#16225
The query to finds and marks all unread UserMessages in the stream as read
can be quite expensive, so we'll move that work to the deferred_work
queue and split it into batches.
Fixes#15770.
Since ALL_HOTSPOTS is a global object, it is initialized
at the time the backend server is started. Hence, the
title and description is translated only once. Using
ugettext_lazy makes sure that the strings are translated
in each and every request according to the language
of the user.
Fixes#16224
The `typing: stop` event did not have any tests in test_events
hence its documentation wasn't added. So add tests and relevant
documentation for the typing stop event. Also edit the documentation
of `typing: start` to include the fact that servers should use
their own timeout incase `stop` event event isn't received.
Fixes#16122.
We raise two types of json_unauthorized when
MissingAuthenticationError is raised. Raising the one
with www_authenticate let's the client know that user needs
to be logged in to access the requested content.
Sending `www_authenticate='session'` header with the response
also stops modern web-browsers from showing a login form to the
user and let's the client handle it completely.
Structurally, this moves the handling of common authentication errors
to a single shared middleware exception handler.
This lets the backend tests pass if zilencer has been (manually)
removed from EXTRA_INSTALLED_APPS, by skipping the tests that require
it. test-backend complains that some URLs are untested in this case:
ERROR: Some URLs are untested! Here's the list of untested URLs:
api/v1/users/me/android_gcm_reg_id
api/v1/users/me/apns_device_token
team/
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit adds automatic detection of extra output (other than
printed by testing library or tools) in stderr and stdout by code under
test test-backend when it is run with flag --ban-console-output.
It also prints the test that produced the extra console output.
Fixes: #1587.
We raise two types of json_unauthorized when
MissingAuthenticationError is raised. Raising the one
with www_authenticate let's the client know that user needs
to be logged in to access the requested content.
Sending `www_authenticate='session'` header with the response
also stops modern web-browsers from showing a login form to the
user and let's the client handle it completely.
Structurally, this moves the handling of common authentication errors
to a single shared middleware exception handler.
`zproject/settings.py` itself is mostly-empty now. Adjust the
references which should now point to `zproject/computed_settings.py`
or `zproject/default_settings.py`.
`update_message_flags` events used `operation` instead of `op`, the
latter being the standard field used in other events. So add `op`
field to `update_message_flags` and mark `operation` as deprecated,
so that it can be removed later.
Commit c4254497b2
curiously had get_body() round tripping its data
through json load and dump.
I have seen this done for pretty-printing reasons,
but it doesn't apply here.
And if you're doing it for validation reasons,
you only need to do half the work, as my commit
here demonstrates.
We arguably don't even need the fail-fast code
here, since our fixtures are linted to be proper
json, I believe, plus downstream code probably
gives reasonably easy-to-diagnose symptoms.
We introduce get_payload for the relatively
exceptional cases where webhooks return payloads
as dicts.
Having a simple "str" type for get_body will
allow us to extract test helpers that use
payloads from get_body() without the ugly
`Union[str, Dict[str, str]]` annotations.
I also tightened up annotations in a few places
where we now call get_payload (using Dict[str, str]
instead of Dict[str, Any]).
In the zendesk test I explicitly stringify
one of the parameters to satisfy mypy.
We tighten up the mypy types here. And then
once we know that expected_message and expected_topic
are never None, we don't have call the do_test_message
and do_test_topic helpers any more, so we eliminate
them, too.
Finally, we don't return a message, since no tests
use the message currently.
This forces us to be a bit more explicit about testing
the three key values in any stream message, and it
also de-clutters the code a bit. I eventually want
to phase out do_test_topic and friends, since they
have the pitfall that you can call them and have them
do nothing, because they don't actually require
values to be be passed in.
I also clean up the code a bit for the tests that
have two new messages arriving.
Having an optional stream_name parameter makes
it confusing to read the code if you know your
webhook is sending private messages.
And then the other two callers are already
checking topics, so they might as well check
stream names, too.
We also have the two stream-oriented callers
make their own call to "subscribe". And we
future-proof this by making sure the exception
for no-message-being-sent calls out that gotcha.
Somewhat in passing, we now assert that
self.STREAM_NAME is not None in the main
helper. This is partly to satisfy mypy, but
it's also a good sanity check.
This also sets the stage for the next commit,
where I'll add an assert_stream_message helper.
Not all webhook payloads are json, so send_json_payload was a
bit misleading.
In passing I also remove "bytes" from the Union type for
"payload" parameter.
Almost all webhook tests use this helper, except a few
webhooks that write to private streams.
Being concise is important here, and the name
`self.send_and_test_stream_message` always confused
me, since it sounds you're sending a stream message,
and it leaves out the webhook piece.
We should consider renaming `send_and_test_private_message`
to something like `check_webhook_private`, but I couldn't
decide on a great name, and it's very rarely used. So
for now I just made sure the docstrings of the two
sibling functions reference each other.
This function is a bad idea, as it leads to a possible situation
where you aren't actually testing anything:
def do_test_message(self, msg: Message, expected_message: Optional[str]) -> None:
if expected_message is not None:
self.assertEqual(msg.content, expected_message)
Unfortunately, it's called deep in the stack in some places, but
we can safely replace it with assertEqual here.
The test helper here was taking an "expected_topic"
parameter that it just ignored, and then the
dialogflow tests were passing in expected messages
in that slot, so the actual "expected_message" var
was "None" and was ignored. So the tests weren't
testing anything.
Now we eliminate the crufty expected_topic parameter
and require an actual value for "expected_message".
I also clean up the mypy type for content_type,
and I remove the `content_type is None` check,
since all callers either pass in a str content
type or default to "application/json".
Some `<img>` tags do not have an SRC, if they are rewritten using JS
to have one later. Attempting to access `first_image['src']` on these
will raise an exception, as they have no such attribute.
Only look for images which have a defined `src` attribute on them. We
could instead check if `first_image.has_attr('src')`, but this seems
only likely to produce fewer valid images.
03ca3afbc2 added more codes that are equivalent to 404's; this adds to
the list of cache-as-None codes a couple which are equivalent to
403's. It does not comprise _all_ possible 403-like codes -- many of
them are "the client is not OK," which is relevant to log as an error
still.
This commit adds "role" field to the Subscription objects passed to
clients. This is important preparation for being able to work on the
frontend for this feature.
While exporting analytics data we were using wrong table name
'zerver_analytics' in analytics config. Renamed it with
correct table name 'zerver_realm'.
Django 3.0 removed private Python 2 compatibility APIs
so used lru_cache() directly from functools.
We cast lru_cache to Any to avoid attr-defined error in mypy since we
are adding extra field, 'key_prefix', to this object later.
This comment stopped being true in 5686821150, and very much stopped
being relevant in dd40649e04 when the middleware entirely stopped
publishing to a queue.
This commit adds the is_web_public field in the AbstractAttachment
class. This is useful when validating user access to the attachment,
as otherwise we would have to make a query in the db to check if
that attachment was sent in a message in a web-public stream or not.
The new Stream administrator role is allowed to manage a stream they
administer, including:
* Setting properties like name, description, privacy and post-policy.
* Removing subscribers
* Deactivating the stream
The access_stream_for_delete_or_update is modified and is used only
to get objects from database and further checks for administrative
rights is done by check_stream_access_for_delete_or_update.
We have also added a new exception class StreamAdministratorRequired.
Via API, users can now access messages which are in web-public
streams without any authentication.
If the user is not authenticated, we assume it is a web-public
query and add `streams:web-public` narrow if not already present
to the narrow. web-public streams are also directly accessible.
Any malformed narrow which is not allowed in a web-public query
results in a 400 or 401. See test_message_fetch for the allowed
queries.
These weren’t wrong since orjson.JSONDecodeError subclasses
json.JSONDecodeError which subclasses ValueError, but the more
specific ones express the intention more clearly.
(ujson raised ValueError directly, as did json in Python 2.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
datetime objects are not ordinarily JSON serializable. While both
ujson and orjson have special cases to serialize datetime objects,
they do it in different ways. So we want to fix the post-processing
code to do its job.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The exception trace only goes from where the exception was thrown up
to where the `logging.exception` call is; any context as to where
_that_ was called from is lost, unless `stack_info` is passed as well.
Having the stack is particularly useful for Sentry exceptions, which
gain the full stack trace.
Add `stack_info=True` on all `logging.exception` calls with a
non-trivial stack; we omit `wsgi.py`. Adjusts tests to match.
In f8bcf39014, we fixed buggy
marshalling of Streams and similar data structures where we were
including the Stream object rather than its ID in dictionaries passed
to ujson, and ujson happily wrote that large object dump into the
RealmAuditLog.extra_data field.
This commit includes a migration to fix those corrupted RealmAuditLog
entries, and because the migration loop is the same, also fixes the
format of similar RealmAuditLog entries to be in a more natural format
that doesn't weirdly nest and duplicate the "property" field.
Fixes#16066.
This requires a few redundant runtime isinstance
checks, but the extra assertions arguably make
the code more readable, and isinstance checks
are extremely negligible.
JSON keys must be strings, and orjson enforces this. Mypy didn’t
catch the mismatched type of profiles_by_user_id because it doesn’t
understand CustomProfileFieldValue.field_id.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
It doesn’t end well. Or sometimes it doesn’t end (OverflowError:
Maximum recursion level reached).
Introduced by commits ccdf52fef6 and
94d2de8b4a (#15601).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The variant `update_message` events have this extra sender field not
present in normal update_message events; this field has no purpose, so
we remove it.
Apparently, `update_message` events unexpectedly contained what were
intended to be internal data structures about which users were
mentioned in a given message.
The bug has been present and accumulating new data structures for
years.
Fixing this should improve the performance of handling update_message
events as well as cleaning up this API's interface.
This was discovered by our automated API documentation schema checking
tooling detecting these unexpected elements in these event
definitions; that same logic should prevent future bugs like this from
being introduced in the future.
To make it easier to check if there is user information to be used
in the error report emails, we create a user object inside report.
Now, to check if we have the user's full name, email, etc, we just
need to do report['user']['user_full_name'] rather than check
each information one by one, because if the value of one key in
the report is different than None, all the others will be as well.
Per the API documentation[1], the following codes all correspond to
HTTP 404:
- `34`: **Sorry, that page does not exist.** The specified resource
was not found.
- `144`: **No status found with that ID.** The requested Tweet ID is
not found (if it existed, it was probably deleted)
- `421`: **This Tweet is no longer available.** The Tweet cannot be
retrieved. This may be for a number of reasons.
- `422`: **This Tweet is no longer available because it violated the
Twitter Rules.** The Tweet is not available in the API.
Treat all of these identically.
[1] https://developer.twitter.com/en/docs/basics/response-codes
The S3 data export tool's upload code path uses this nice boto
callback feature for showing a progress bar, which is nice for the
management command. It's spammy/broken in production and the backend
tests, so we change percent_callback to be a parameter passed in so
that it can only be used in the contexts where it makes sense.
This reverts commit c3779338c6 (part
of #14638), which incorrectly depended on commits from the future,
with the effect of either halting the flow of entropic time in an
irresolvable temporal paradox, summoning extradimensional beings to
rain destruction on the galaxy, or failing CI.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit modifies the /streams endpoint so that the web-public
streams are included in the default list of streams that users
have access to.
This is part of PR #14638 that aims to allow guest users to
browse and subscribe themselves to web public streams.
Modifies filter_stream_authorization so that web-public streams are
added in the list of authorized streams that a guest user can
subscribe.
This commit is part of PR #14638 that aims to allow guest users
to browse and subscribe to web-public streams.
In this commit, we grant guest users access to stream history,
send message and common stream data of web-public streams.
This is part of PR #14638 that aims to allow guest users to
browse and subscribe to web-public streams.
This modification allows guest users to have access to web-public
streams subscribers, even if they aren't subscribed or never
subscribed to that stream.
This commit is part of PR #14638 that aims to allow guest users to
browser and subscribe to web-public streams.
Now, gather_subscriptions include web-public streams in the 3 sets
of streams that it returns, subscribed, unsubscribed and never
subscribed.
This is part of PR #14638 that aims to allow guest users to browse and
subscribe to web-public streams.
This change makes the flow more coherent by instead of checking,
in the last condition, if the user isn't authorized to access that
stream, check if they are, as it is done in the other checks. Only
if all the conditions are false, which means that the user doesn't
have access to that stream, the stream is added to the
unauthorized_streams list.
Also add a Draft object-to-dictionary conversion method.
The following commits will provide an API around this
model using which our clients can sync drafts across each
other (if they so wish too). As of making this commit, we
haven't finalized exactly how our clients will use this.
See https://chat.zulip.org/#narrow/stream/2-general/topic/drafts
For some of the discussion around this model and in general,
around this feature.
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
Unlike the other Python datetime to Unix timestamp conversion
function (`datetime_to_timestamp`), `datetime_to_precise_timestamp`
won't drop the microseconds.
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
Edit the function `validate_against_openapi_schema` and add some
helper functions to allow for validation of documented events.
Also add OpenAPI response validation in `verify_action` as it is
called in a large number of `/events` tests.
The parameter Stream.date_created is now sent down to the clients
for both:
- client.get_streams()
- client.list_subscriptions()
API docs updated for stream and subscriptions.
Fixes#15410
We also have the caller pass in the property name for an
additional sanity check.
Note that we don't yet handle the possibility of extra_data;
that will be a subsequent commit.
Also, the stream_id fields aren't in Realm.property_types,
so we specify their types in the checker.
This a pretty big commit, but I really wanted it
to be atomic.
All realm_user/update events look the same from
the top:
_check_realm_user_update = check_events_dict(
required_keys=[
("type", equals("realm_user")),
("op", equals("update")),
("person", _check_realm_user_person),
]
)
And then we have a bunch of fields for person that
are optional, and we usually only send user_id plus
one other field, with the exception of avatar-related
events:
_check_realm_user_person = check_dict_only(
required_keys=[
# vertical formatting
("user_id", check_int),
],
optional_keys=[
("avatar_source", check_string),
("avatar_url", check_none_or(check_string)),
("avatar_url_medium", check_none_or(check_string)),
("avatar_version", check_int),
("bot_owner_id", check_int),
("custom_profile_field", _check_custom_profile_field),
("delivery_email", check_string),
("full_name", check_string),
("role", check_int_in(UserProfile.ROLE_TYPES)),
("email", check_string),
("user_id", check_int),
("timezone", check_string),
],
)
I would start the code review by just skimming the changes
to event_schema.py, to get the big picture of the complexity
here. Basically the schema is just the combined superset of
all the individual schemas that we remove from test_events.
Then I would read test_events.py.
The simplest diffs are basically of this form:
- schema_checker = check_events_dict([
- ('type', equals('realm_user')),
- ('op', equals('update')),
- ('person', check_dict_only([
- ('role', check_int_in(UserProfile.ROLE_TYPES)),
- ('user_id', check_int),
- ])),
- ])
# ...
- schema_checker('events[0]', events[0])
+ check_realm_user_update('events[0]', events[0], {'role'})
Instead of a custom schema checker, we use the "superset"
schema checker, but then we pass in the set of fields that we
expect to be there. Note that 'user_id' is always there.
So most of the heavy lifting happens in this new function
in event_schema.py:
def check_realm_user_update(
var_name: str, event: Dict[str, Any], optional_fields: Set[str],
) -> None:
_check_realm_user_update(var_name, event)
keys = set(event["person"].keys()) - {"user_id"}
assert optional_fields == keys
But we still do some more custom checks in test_events.py.
custom profile fields: check keys of custom_profile_field
def test_custom_profile_field_data_events(self) -> None:
+ self.assertEqual(
+ events[0]['person']['custom_profile_field'].keys(),
+ {"id", "value", "rendered_value"}
+ )
+ check_realm_user_update('events[0]', events[0], {"custom_profile_field"})
+ self.assertEqual(
+ events[0]['person']['custom_profile_field'].keys(),
+ {"id", "value"}
+ )
avatar fields: check more specific types, since the superset
schema has check_none_or(check_string)
def test_change_avatar_fields(self) -> None:
+ check_realm_user_update('events[0]', events[0], avatar_fields)
+ assert isinstance(events[0]['person']['avatar_url'], str)
+ assert isinstance(events[0]['person']['avatar_url_medium'], str)
+ check_realm_user_update('events[0]', events[0], avatar_fields)
+ self.assertEqual(events[0]['person']['avatar_url'], None)
+ self.assertEqual(events[0]['person']['avatar_url_medium'], None)
Also note that avatar_fields is a set of four fields that
are set in event_schema.
full name: no extra work!
def test_change_full_name(self) -> None:
- schema_checker('events[0]', events[0])
+ check_realm_user_update('events[0]', events[0], {'full_name'})
test_change_user_delivery_email_email_address_visibilty_admins:
no extra work for delivery_email
check avatar fields more directly
roles (several examples) -- actually check the specific role
def test_change_realm_authentication_methods(self) -> None:
- schema_checker('events[0]', events[0])
+ check_realm_user_update('events[0]', events[0], {'role'})
+ self.assertEqual(events[0]['person']['role'], role)
bot_owner_id: no extra work!
- change_bot_owner_checker_user('events[1]', events[1])
+ check_realm_user_update('events[1]', events[1], {"bot_owner_id"})
- change_bot_owner_checker_user('events[1]', events[1])
+ check_realm_user_update('events[1]', events[1], {"bot_owner_id"})
- change_bot_owner_checker_user('events[1]', events[1])
+ check_realm_user_update('events[1]', events[1], {"bot_owner_id"})
timezone: no extra work!
- timezone_schema_checker('events[1]', events[1])
+ check_realm_user_update('events[1]', events[1], {"email", "timezone"})
Obviously, this file will soon grow--this
was the easiest way to start without introducing
noise into other commits.
It will soon be structurally similar
to frontend_tests/node_tests/lib/events.js--I
have some ideas there. But this should also
help for things like API docs.
This change makes our handling of youtube-url previews consistent
with how we handle our inline images. This allows the previews to
render next to the paragraph that links to the youtube video.
Follow-up to PR #15773.
This commit rewrites the way addresses are collected. If
the header with the address is not an AddressHeader (for instance,
Delivered-To and Envelope-To), we take its string representation.
Fixes: #15864 ("Error in email_mirror - _UnstructuredHeader has no attribute addresses").
This commit re-adds the integration for canarytokens.org, now separate
from the primary Thinkst integration.
Signed-off-by: David Wood <david@davidtw.co>
This commit fixes the Thinkst Canary integration which - based on the
schema in upstream documentation - incorrectly assumed that some fields
would always be sent, which meant that the integration would fail. In
addition, this commit adjusts support for canarytokens to only support
the canarytoken schema with Thinkst Canaries (not Thinkst's
canarytokens.org).
Signed-off-by: David Wood <david@davidtw.co>
A few major themes here:
- We remove short_name from UserProfile
and add the appropriate migration.
- We remove short_name from various
cache-related lists of fields.
- We allow import tools to continue to
write short_name to their export files,
and then we simply ignore the field
at import time.
- We change functions like do_create_user,
create_user_profile, etc.
- We keep short_name in the /json/bots
API. (It actually gets turned into
an email.)
- We don't modify our LDAP code much
here.
We generally want to avoid having two sibling test
suites depend on each other, unless there's a real
compelling reason to share code. (And if there is
code to share, we can usually promote it to either
test_helpers or ZulipTestCase, as I did here.)
This commit is also prep for the next commit, where
I try to simplify all of the helpers in EmojiReactionBase.
Especially now that we have f-strings, it is usually
better to just call api_post explicitly than to
obscure the mechanism with thin wrappers around
api_post. Our url schemes are pretty stable, so it's
unlikely that the helpers are actually gonna prevent
future busywork.
This issue isn't something a system administrator needs to take action
on -- it's a likely minor logic bug around organization
administrators moving topics between streams.
As a result, it shouldn't send error emails to administrators.
This is a hacky fix to avoid spoiler content leaking in emails. The
general idea here is to tell people to open Zulip to view the actual
message in full.
We create a mini-markdown parser here that strips away the fence content
that has the 'spoiler' tag for the text emails.
Our handling of html emails is much better in comparison where we can
use lxml to parse the spoiler blocks.
We include tests for the new implementation to avoid churning the
codebase too much so this can be easily reverted when we are able to
re-enable the feature.
We now do something sensible for spoilers in notifications. A message
like:
```spoiler Luke's father is
Vader. Don't tell anyone else.
```
would be rendered as:
Luke's father is (...)
If the push_notification for the UserMessage is already active,
we don't send any push notification to the user. This may
happen due to race conditions.
Added and fixed test cases affected by this.
This is similar to our behavior with image previews, and helps
reduce clutter in the final rendered html.
We add the string 'Tweet: ' to our existing tests so those tests
remain the same.
This commit makes our handling of twitter previews consistent with
how we handle our inline images so that tweets render next to the
paragraph that links to the tweet.
We decouple the logic of insertion rules for inline links from
image preview logic. Now, we can use this same logic for other
kinds of link previews as well.
We also remove the post_process_data option.
The sanity check is just overkill at this point,
since the mechanism to find streams is very
direct due to a recent commit.
Before this change we would only export streams
that had actual subscribers, which is usually
harmless, but it was mostly a relic of a one
time migration that we did when we were cleaning
up some dirty data in some of our very early
databases (circa 2016).
Now we work down the table hierarchy in a
more natural way:
- get Streams in Realm
- get Recipients matching above Streams
- get Subscriptions matching above Recipients
Note that for per-user exports, I kept the
same logic (users -> subscriptions -> recipients ->
streams) we had before.
One subtle detail here is that we make our
final Config blocks--which build the final
version of Recipient/Subscription--now hang
off of realm_config.
Fixes#15146.
This particular commit has been a long time coming. For reference,
!avatar(email) was an undocumented syntax that simply rendered an
inline 50px avatar for a user in a message, essentially allowing
you to create a user pill like:
`!avatar(alice@example.com) Alice: hey!`
---
Reimplementation
If we decide to reimplement this or a similar feature in the future,
we could use something like `<avatar:userid>` syntax which is more
in line with creating links in markdown. Even then, it would not be
a good idea to add this instead of supporting inline images directly.
Since any usecases of such a syntax are in automation, we do not need
to make it userfriendly and something like the following is a better
implementation that doesn't need a custom syntax:
`![avatar for Alice](/avatar/1234?s=50) Alice: hey!`
---
History
We initially added this syntax back in 2012 and it was 'deprecated'
from the get go. Here's what the original commit had to say about
the new syntax:
> We'll use this internally for the commit bot. We might eventually
> disable it for external users.
We eventually did start using this for our github integrations in 2013
but since then, those integrations have been neglected in favor of
our GitHub webhooks which do not use this syntax.
When we copied `!gravatar` to add the `!avatar` syntax, we also noted
that we want to deprecate the `!gravatar` syntax entirely - in 2013!
Since then, we haven't advertised either of these syntaxes anywhere
in our docs, and the only two places where this syntax remains is
our game bots that could easily do without these, and the git commit
integration that we have deprecated anyway.
We do not have any evidence of someone asking about this syntax on
chat.zulip.org when developing an integration and rightfully so- only
the people who work on Zulip (and specifically, markdown) are likely
to stumble upon it and try it out.
This is also the only peice of code due to which we had to look up
emails -> userid mapping in our backend markdown. By removing this,
we entirely remove the backend markdown's dependency on user emails
to render messages.
---
Relevant commits:
- Oct 2012, Initial commit c31462c278
- Nov 2013, Update commit bot 968c393826
- Nov 2013, Add avatar syntax 761c0a0266
- Sep 2017, Avoid email use c3032a7fe8
- Apr 2019, Remove from webhook 674fcfcce1
Log RealmAuditLog in do_set_realm_property and do_remove_realm_domain.
Tests for the changes are written in test_events because it will save
duplicate code for test_change_realm_property.
Added new Event Type in AbstractRealmAuditLog STREAM_CREATED.
Since we finally create streams in create_stream_if_needed function
in zerver/lib/streams.py so logged realm_audit there.
Passed acting_user when create_stream_if_needed or ensure_stream
function is called.
Added tests in test_audit_log.
We had been using !time() syntax for timestamps so far. Since its
an unreleased feature, we can make changes without affecting many
people.
Fixes#15442.
Prior to this commit whenever convert was imported from zerver.lib.markdown
it was aliased as markdown_convert for readability.
This commit rename convert function to markdown_convert so that it can be
directly import it without aliasing and without compromising readability.
This fixes our triggering a RabbitMQ event to send a push notification
to remove the empty set of push notifications, resulting from not
using the correct data structure to determine which message IDs to look at.
This was causing a lot of visible exceptions when running
the `test_messages.py` test suite.
For users who are unsubscribed from the new stream but are in
the old stream, we delete the UserMessage.
We send the delete_message event only to guest users,
who have completely lost asses to the moved messages, for other
users we send the normal update_message event which moves
the messages to the new unsubed stream which
otherwise would look broken to the
user without reloading to the webpage.
Our previous OpenAPI schema validator that we implemented ourselves
was useful training wheels for our understanding OpenAPI properly, and
was mostly correct. But given that we've finally reached the point
where our OpenAPI file accurately describes the API, it makes sense to
switch to use an official OpenAPI validator. We lose some ability to
do exclude rules for particular elements, but those were primarily
important for us when we had a lot of them.
As part of this change, we need to add `additionalProperties: false`
for all of our dictonaries/objects where we've documented every
parameter; otherwise the OpenAPI schema checker won't know that we
expect every parameter to be documented.
This was hiding an actual type error in test_cache: a mismatch between
the object ID type, which is str, and the default id_fetcher, which
returns int.
Mypy’s insufficient support for default generic arguments basically
means we can’t use them without a lot of overloading, and there are
not enough callers here to justify that.
https://github.com/python/mypy/issues/3737
We avoid this being super messy where the code calls this by adding
some less generic wrappers for generic_bulk_cached_fetch.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
According to @showell:
> All the slow decorators can die. That was a failed experiment of
> mine from 2014 days. I have meaning to kill them for a couple years
> now. I wrote this with the best of intentions, but I believe it's
> now just cruft. We never made a "fast" mode, for one. And we kept
> writing more and more slow tests, haha.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We remove support for the old clients which required an event for
each message to clear notification.
This is justified since it has been around 1.5 years since we started
supporting the bulk operation (and so essentially nobody is using a
mobile app version so old that it doesn't support the batched
approach) and the unbatched approach has a maintenance and reliability
cost.
This commit removes bugdown alias and do proper imports from markdown
module. Also remove bugdown word and replace it with markdown in
comments.
This commit is part of series of commits aimed at renaming bugdown to
markdown.
There is still some miscellaneous cleanup that
has to happen for things like analytics queries
and dead code in node tests, but this should
remove the main use of pointers in the backend.
(We will also still need to drop the DB field.)
rename ZEPHYR_MIRROR_BUGDOWN_KEY to ZEPHYR_MIRROR_MARKDOWN_KEY and
DEFAULT_BUGDOWN_KEY tp DEFAULT_MARKDOWN_KEY.
This commit is part of series of commits aimed at renaming bugdown to
markdown.
This commits changes class name of MarkdownListPreprocessor to
MarkdownListPreprocessor. It also changes corresponding references
in tests.
This is part of series of commits which aims for renaming bugdown to
markdown.
This commit is first of few commita which aim to change all the
bugdown references to markdown. This commits rename the files,
file path mentions and change the imports.
Variables and other references to bugdown will be renamed in susequent
commits.
We send user_id of the referrer instead of email in the invites dict.
Sending user_ids is more robust, as those are an immutable reference
to a user, rather than something that can change with time.
Updates to the webapp UI to display the inviters for more convenient
inspection will come in a future commit.
In zulip.yaml, add `deprecated` tags to all parameters/keys with
`Deprecated` in the description. Then add tests to ensure that deprecated
parameters/keys will always have the `deprecated` key. Also, in
the API docs, sort the parameters according to presence of `deprecated`
key, presenting the `deprecated` keys at the end and add a `deprecated`
tag next to them.
We send a remove mobile push notification to the users who were
no longer mentioned after the content of the message was edited.
This also corrects the notification count for the mobile apps
where a user was prior mentioned in a muted stream / topic and the
message was edited and the user is no longer mentioned now.
Hence, fixing the case where user has read all his unreads
but the notification badge on the app is still positive.
Fixes#15428.
do_clear_mobile_push_notifications_for_ids can now be used to
clear push_notification for multiple users at once. This method
loops over users, so no performance optimization is gained.
We've been seeing an exception in server_event_dispatch.js in
production where in large organizations, sometimes when a new user
joined, every other browser in the organization would throw an
exception processing some sort of realm_user/update event.
It turns out the cause was that when a user copies their profile from
an existing user account with a user-uploaded avatar, the code path we
reused to set the avatar properly send a realm_user/update event about
the avatar change -- for a user that hadn't been fully created and
certainly hadn't have the realm_user/add event sent for.
We fix this and add tests and comments to prevent it recurring.
(Removed an incorrect docstring while working on this).
This fixes an issues that causes HTML entities inside of inline code
blocks to be converted rather than being displayed literally.
The upstream python-markdown now handles this correctly, so we just use
their implementation with our changes for removing .strip(). As a result
of this migration, we switch backtick pattern to an inline processor
too.
Fixes#12056.
For the codeblock counterpart of this issue, we should follow the
upstream PR https://github.com/Python-Markdown/markdown/pull/990.
Co-authored-by: Rohitt Vashishtha <aero31aero@gmail.com>
Note that I don't actually convert the
checker from check_dict to check_dict_only,
because that would be a user-facing change,
but I think we can sweep a lot of things
like this after the next release.