The purpose of this commit is to pass information
to the frontend whether the message response recieved
has been limited due to plan restrictions or not.
To implement this, the backend for limiting the message
history had to be rewritten as we used to fetch
only the message rows whose id was greater than
first_visible_message_id. The filtered rows gives us
no information on whether the message history was
limited or not. So the backend was rewritten to not
do any restriction of limiting the message rows while
making the query. The limiting of rows is now done in
post_process_limited_query which will also return back
the value of history_limited flag.
Tweaked by tabbott to note a few cases where the results are
incorrect. I'm merging this despite those, because those cases don't
impact the correctness of the feature, and it may have tricky
performance implications to fix correctly.
If cordelia searches on pm-with:iago@zulip.com,cordelia@zulip.com,
we now properly treat that the same way as pm-with:iago@zulip.com.
Before this fix, the query would initially go through the
huddle code path. The symptom wasn't completely obvious, as
eventually a deeper function would return a recipient id
corresponding to a single PM with @iago@zulip.com, but we would
only get messages where iago was the recipient, and not any
messages where he was the sender to cordelia.
I put the helper function for this in zerver/lib/addressee, which
is somewhat speculative. Eventually, we'll want pm-with queries
to allow for user ids, and I imagine there will be some shared
logic with other Addressee code in terms of how we handle these
strings. The way we deal with lists of emails/users for various
endpoints is kind of haphazard in the current code, although
granted it's mostly just repeating the same simple patterns. It
would be nice for some of this code to converge a bit. This
affects new messages, typing indicators, search filters, etc.,
and some endpoints have strange legacy stuff like supporting
JSON-encoded lists, so it's not trivial to clean this up.
Tweaked by tabbott to add some additional tests.
Tweaked by tabbott to use a declared constant rather than just use
5000 in multiple places; this also means we can change the count
without updating translations.
Fixes#10446.
The `match_subject` field is supposed to contain HTML; that's how
the highlighting is done. But the `subject` field is plain text --
it must be encoded if we want corresponding HTML.
Of the three places the `match_subject` field is populated -- two
here in messages_in_narrow_backend, one in get_messages_backend --
two of them already do this correctly, via get_search_fields.
Fix the remaining one, where in a `/messages/matches_narrow` query
we populate `matches_subject` even if the query didn't involve a
full-text search.
This doesn't affect the webapp, which ignores `match_subject` unless
it knows it did a full-text search; nor the mobile app, which
doesn't use `/messages/matches_narrow` at all.
Right now it only has one function, but the function
we removed never really belonged in actions.py, and
now we have better test coverage on actions.py, which
is an important module to get to 100%.
This implements a significant performance optimization for users
clicking the `Private messages` narrow in the Zulip UI, especially for
those users who do not have 50 recent private messages in an
organization with a lot of stream message traffic (because then
previously, postgres needed to scan through a huge amount of history
to find enough private messages).
The database index powering it can also support many other queries we
might want to do in the future to support "recent conversations" type
features.
Fixes#6896.
The use_first_unread_anchor parameter allows automatically setting the
anchor to the first message that hasn't been read in this narrow.
Therefore it isn't necessary to specify an anchor when this parameter is
enabled.
Note from Tim: Arguably, we should think about making
`use_first_unread_anchor` the default behavior when anchor is
unspecified, but that's for later consideration.
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
When GETting an unedited message's edit history, the server wasn't able
to reply properly and produced a 500 error.
Now when that happens, we return a message history that only contains
the original message.
Move the zcommands from '/views/messages.py' to
'/lib/zcommand'.
Also, move the zcommand tests from '/tests/test_messages.py'
to '/tests/test_zcommand'.
These two slash commands now use zcommand to talk to
the server, so we have no Message overhead, and if you're
on a stream, you no longer spam people by accident.
The commands now also give reasonable messages
if you are already in the mode you ask for.
It should be noted that by moving these commands out of
widget.py, they are no longer behind the ALLOW_SUB_MESSAGES
setting guard.
This adds a /ping command that will be useful for users
to see what the round trip to the Zulip server is (including
only a tiny bit of actual server time to basically give a
200).
It also introduce the "/zcommand" endpoint and zcommand.js
module.
API users, particularly bots, can now send a field
called "widget_content" that will be turned into
a submessage for the web app to look at. (Other
clients can still rely on "content" to be there,
although it's up to the bot author to make the
experience good for those clients as well.)
Right now widget_content will be a JSON string that
encodes a "zform" widget with "choices." Our first
example will be a trivia bot, where users will see
something like this:
Which fruit is orange in color?
[A] orange
[B] blackberry
[C] strawberry
The letters will be turned into buttons on the webapp
and have canned replies.
This commit has a few parts:
- receive widget_content in the request (simply
validating that it's a string)
- parse the JSON in check_message and deeply
validate its structure
- turn it into a submessage in widget.py
Add realm setting to set time limit for message deleitng.
Set default value of message_content_delete_limit_seconds
to 600 seconds(10 min).
Thanks to Shubham Dhama for rebasing and reworking this. Some final
edits also done by Tim Abbott.
Fixes#7344.
For certain queries where both include_history and
use_first_unread_anchor are set to True, we were excluding
historical rows. Now we only use the use_first_unread_anchor
flag to filter rows that we use to find the anchor, without
having it filter the actual search results.
The bug went unreported for a long time, because it only
affected mobile users who had newly subscribed to streams.
Note that we make a small change to the test called
test_use_first_unread_anchor_with_muted_topics, which has
a very scary comment about being "arcane" and "be
absolutely sure you know what you're doing." I think it's
fine.
Also, the new test code would fail before this fix, so it
should help prevent future regressions.
Fixes#8958
This is a bit more than a pure refactor, because we duplicate a
chunk of code to calculate a query inside of
find_first_unread_anchor(), so we're doing a bit more work
than before.
We need this refactoring to start decoupling find_first_unread_anchor
from get_messages_backend for the case where include_history is
True. This will happen in a subsequent commit.
The only test that changes here is a direct test on
find_first_unread_anchor(). All other tests pass without
modification, and we have decent coverage on get_messages_backend.
We use an array now to build up the list of search operands and
then consolidate the special search handling after the loop (which
means setting the flag, putting two more columns in the query, and
using ' '.join to build the string).
We have a debugging statement for some obscure errors we get
when narrows have search terms. We now show all the narrow
operators. This isn't really to improve debugging; it's more
to make it easier in the next commit to extract a function
that would make search_term have to be passed back in a tuple.
But it shouldn't hurt debugging either.
This is a pure refactoring and just pulls the function out
to the top level of the module. (The prior commit extracted
it inside a larger function to make a nicer diff.)
The new name can_access_stream_history_by_name gets to the point of
what this function actually does. And passing in a user object lets
us define what this does based on the user subscribed.
Applies the logic to allow community members to edit topics
of others' messages if this setting is True. Otherwise,
only administrators can update the topic of others' messages.
This logic includes a 24-hour time limit for community topic editing.
We now consistently set our query limits so that we get at
least `num_after` rows such that id > anchor. (Obviously, the
caveat is that if there aren't enough rows that fulfill the
query, we'll return the full set of rows, but that may be less
than `num_after`.) Likewise for `num_before`.
Before this change, we would sometimes return one too few rows
for narrow queries.
Now, we're still a bit broken, but in a more consistent way. If
we have a query that does not match the anchor row (which could
be true even for a non-narrow query), but which does match lots
of rows after the anchor, we'll return `num_after + 1` rows
on the right hand side, whether or not the query has narrow
parameters.
The off-by-one semantics here have probably been moot all along,
since our windows are approximate to begin with. If we set
num_after to 100, its just a rough performance optimization to
begin with, so it doesn't matter whether we return 99 or 101 rows,
as long as we set the anchor correctly on the subsequent query.
We will make the results more rigorous in a follow up commit.
This generic function isolates the before/after logic that really
is independent of Message and doesn't need to clutter up
`get_messages_backend`. Also, introducing a new namespace
reduces some shadowing/mutation with variables like `query`.
It's a pure code move, with some very minor renaming (e.g.
inner_msg_id_col -> id_col).
If anchor is 0, there is no sense doing a before_query.
Likewise, if anchor is `LARGER_THAN_MAX_MESSAGE_ID`, there is
no sense doing an after_query.
We introduce variables called `need_before_query` and
`need_after_query` to enforce those conditions.
This also adds some comments explaining the fallthrough case
where neither query makes sense.
If use_first_unread_anchor is set and we don't have any unread
messages, then our anchor is effectively "positive infinity" and
we can streamline queries.
In the past we'd have clauses like `message_id <= 999999999999999`
in the query that were harmless but crufty.
We want to say `if num_after > 0` when we expect num_after to be
a positive integer. We don't want any confusion that we will
execute the blocks for values of -7 or None.
This may be helpful for some API clients, since it avoids them needed
to do somewhat messy post-processing on the results (the data was
always available via scanning for the first unread message in the result).
Fixes#6244.
This is responsible for:
1.) Handling all the incoming requests at the
messages endpoint which have defer param set. This is similar to
send_message_backend apart from the fact that instead of really
sending a message it schedules one to be sent later on.
2.) Does some preliminary checks such as validating timestamp for
scheduling a message, prevent scheduling a message in past, ensure
correct format of message to be scheduled.
3.) Extracts time of scheduled delivery from message.
4.) Add tests for the newly introduced function.
5.) timezone: Add get_timezone() to obtain tz object from string.
This helps in obtaining a timezone (tz) object from a timezone
specified as a string. This string needs to be a pytz lib defined
timezone string which we use to specify local timezones of the
users.
This commit puts the guts of parse_usermessage_flags into
UserMessage.flags_list_for_flags, since it was slightly faster
than the old implementation and produced the same results.
(Both algorithms were super fast, actually.)
And then all callers use the model method now.
The logic to set search_fields was essentially the same for both
sides of the include_history conditional.
Now we have just one code block that sets search_fields, and we
can quickly short-circuit the loop when is_search is False.
tsearch_extras returns search offsets in bytes but our highlight
function treated them as character offsets. Added a check to subtract
extra bytes if the tsearch search backend is being used.
Fixes#4084.
Fixes#7021.
Do you call get_recipient(Recipient.STREAM, stream_id) or
get_recipient(stream_id, Recipient.STREAM)? I could never
remember, and it was not very type safe, since both parameters
are integers.
Before this change, we populated two cache entries for each
message that we sent. The entries were largely redundant,
with the only difference being whether we sent the content
as raw markdown or as the rendered HTML.
This commit makes it so we only have one cache entry per
message, and it includes both content and rendered_content.
One legacy source on confusion here is that `content`
changes meaning when you're on the front end. Here is the
situation going forward:
database:
content = raw
rendered_contented = rendered
cache entry:
content = raw
rendered_contented = rendered
payload for the frontend:
content = raw (for apply_markdown=False)
content = rendered (for apply_markdown=True)
Clients fetching messages can now specify that they are able
to compute their avatar, and if they set client_gratavar to
True in the request (w/our normal encoding scheme), then the
backend will not compute it, and the payload will be smaller.
The fix starts with get_messages_backend. The flag gets
passed down through these functions:
* MessageDict.post_process_dicts.
* MessageDict.set_sender_avatar.
We also fix up the callers for post_process_dicts to explicitly
pass in the client_gravatar path, but for now they all just hard
code the value to False.
Message.get_raw_db_rows is moved to MessageDict, since its
implementation details are highly coupled to other methods
in MessageDict.
And then sew_messages_and_reactions comes along for the
ride.
We eventually want to move Reaction.get_raw_db_rows to there
as well.
Introduce MessageDict.post_process_dicts() will allow us
the ability to do the following:
* use less memory in the cache for repeated data
* prevent cache invalidation
* format data according to different client needs
The first use of this function is pretty inconsequential, but
it sets us up for more consequential changes.
In this commit we defer the MessageDict.hydrate_recipient_info
step until after we pull data out of the cache. This impacts
cache size as follows:
* streams - negligibly bigger
* PMs/huddles - slimmer due to not needing to repeat
sender data like email/full_name
Again, the main point of this change is to start setting up
the infrastructure to do post-processing.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.
This effectively boils down to sending notifications to newly
mentioned users. The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person. If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.
We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.
The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.
We can expect a couple changes in the future:
* Requirements may change here, and it might make sense
to send offline notifications on the update side even
in circumstances where the original message had a
notification.
* We may track more notifications in a DB model, which
may simplify our short-circuit logic.
In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
We want to convert stream names to stream ids as close
to the "edges" of our system as possible, so we let our
caller do the work of finding the stream id for a stream
narrow.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.
By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where
were updating muted_topics spuriously before calling
do_update_muted_topic.
For filters like has:link, where the web app doesn't necessarily
want to guess whether incoming messages meet the criteria of the
filter, the server is asked to query rows that match the query.
Usually these queries are search queries, which have fields for
content_matches and subject_matches. Our logic was handling those
correctly.
Non-search queries were throwing an exception related to tuple
unpacking. Now we recognize when those fields are absent and
do the proper thing.
There are probably situations where the web app should stop hitting
this endpoint and just use its own filters. We are making the most
defensive fix first.
Fixes#6118
Before this change, server searches for both
`is:mentioned` and `is:alerted` would return all messages
where the user is specifically mentioned (but not
at-all mentions).
Now we follow the JS semantics:
is:mentioned -- all mentions, including wildcards
is:alerted -- has an alert word
Here is one relevant JS snippet:
} else if (operand === 'mentioned') {
return message.mentioned;
} else if (operand === 'alerted') {
return message.alerted;
And here you see that `mentioned` is OR'ed over both mention flags:
message.mentioned = convert_flag('mentioned') || convert_flag('wildcard_mentioned');
The `alerted` flag on the JS side is a simple mapping:
message.alerted = convert_flag('has_alert_word');
Fixes#5020
The new endpoints are:
/json/mark_stream_as_read: takes stream name
/json/mark_topic_as_read: takes stream name, topic name
The /json/flags endpoint no longer allows streams or topics
to be passed in as parameters.
This function optimizes marking streams and topics as read,
by using UserMessage.where_unread(), which uses a partial
index on the "read" flag.
This also simplifies the code path for ordinary message
flag updates.
In order to keep 100% line coverage, I simplified the
logging in update_message_flags, so now all requests
will show the "actually" format.
This is an interim step toward creating dedicated endpoints
for marking streams/topics as reads, so we do error checking
with asserts for flag/operation, so we don't introduce a
temporary translation string.