We’re about to start using PostgreSQL-specific syntax that can’t be
stringified without a specified dialect.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Add ability to search entire message history of all public streams at
once. It includes all subscibed, non subscribed public streams messages
and even historical public stream messages sent before user had joined
an organization or stream.
Fixes#8859.
Instead of having the rather unclear type Union[str,
List[UserDisplayRecipient]] where display_recipient of message dicts was
involved, we use DisplayRecipientT (renamed from DisplayRecipientCacheT
- since there wasn't much reason to have the word Cache in there), which
makes it clearer what is the actual nature of the objects and gets rid
of this pretty big type declaration.
Since the display_recipients dictionaries corresponding to users are
always dictionaries with keys email, full_name, short_name, id,
is_mirror_dummy - instead of using the overly general Dict[str, Any]
type, we can define a UserDisplayRecipient type,
using an appropriate TypedDict.
The type definitions are moved from display_recipient.py to types.py, so
that they can be imported in models.py.
Appropriate type adjustments are made in various places in the code
where we operate on display_recipients.
We also document support for user IDs in the pm-with narrow operator.
Edited by tabbott to document on /api rather than in the /help page.
Fixes part of #9474.
Clients won't have access to user email addresses, and thus won't be
able to compute gravatars.
The tests for this are a bit messy, in large part because our tests
for get_events call subsections of it, rather than the main function.
This renames Subscription.in_home_view field to is_muted, for greater
clarity as to what it does just from seeing the setting name, without
having to look it up.
Also disabled an obsolete test_migrations test.
Fixes#10042.
The purpose of this commit is to pass information
to the frontend whether the message response recieved
has been limited due to plan restrictions or not.
To implement this, the backend for limiting the message
history had to be rewritten as we used to fetch
only the message rows whose id was greater than
first_visible_message_id. The filtered rows gives us
no information on whether the message history was
limited or not. So the backend was rewritten to not
do any restriction of limiting the message rows while
making the query. The limiting of rows is now done in
post_process_limited_query which will also return back
the value of history_limited flag.
Tweaked by tabbott to note a few cases where the results are
incorrect. I'm merging this despite those, because those cases don't
impact the correctness of the feature, and it may have tricky
performance implications to fix correctly.
Apparently, we weren't actually checking that found_oldest had the
correct value; fortunately, this didn't actually result in a problem,
because the values were always correct. But this will be important as
we start extending this test.
If cordelia searches on pm-with:iago@zulip.com,cordelia@zulip.com,
we now properly treat that the same way as pm-with:iago@zulip.com.
Before this fix, the query would initially go through the
huddle code path. The symptom wasn't completely obvious, as
eventually a deeper function would return a recipient id
corresponding to a single PM with @iago@zulip.com, but we would
only get messages where iago was the recipient, and not any
messages where he was the sender to cordelia.
I put the helper function for this in zerver/lib/addressee, which
is somewhat speculative. Eventually, we'll want pm-with queries
to allow for user ids, and I imagine there will be some shared
logic with other Addressee code in terms of how we handle these
strings. The way we deal with lists of emails/users for various
endpoints is kind of haphazard in the current code, although
granted it's mostly just repeating the same simple patterns. It
would be nice for some of this code to converge a bit. This
affects new messages, typing indicators, search filters, etc.,
and some endpoints have strange legacy stuff like supporting
JSON-encoded lists, so it's not trivial to clean this up.
Tweaked by tabbott to add some additional tests.
Tweaked by tabbott to use a declared constant rather than just use
5000 in multiple places; this also means we can change the count
without updating translations.
Fixes#10446.
This implements a significant performance optimization for users
clicking the `Private messages` narrow in the Zulip UI, especially for
those users who do not have 50 recent private messages in an
organization with a lot of stream message traffic (because then
previously, postgres needed to scan through a huge amount of history
to find enough private messages).
The database index powering it can also support many other queries we
might want to do in the future to support "recent conversations" type
features.
Fixes#6896.
The use_first_unread_anchor parameter allows automatically setting the
anchor to the first message that hasn't been read in this narrow.
Therefore it isn't necessary to specify an anchor when this parameter is
enabled.
Note from Tim: Arguably, we should think about making
`use_first_unread_anchor` the default behavior when anchor is
unspecified, but that's for later consideration.
The only changes visible at the AST level, checked using
https://github.com/asottile/astpretty, are
zerver/lib/test_fixtures.py:
'\x1b\\[(1|0)m' ↦ '\\x1b\\[(1|0)m'
'\\[[X| ]\\] (\\d+_.+)\n' ↦ '\\[[X| ]\\] (\\d+_.+)\\n'
which is fine because re treats '\\x1b' and '\\n' the same way as
'\x1b' and '\n'.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
This commit adds a new field history_public_to_subscribers to the
Stream model, which serves a similar function to the old
settings.PRIVATE_STREAM_HISTORY_FOR_SUBSCRIBERS; we still use that
setting as the default value for new streams to avoid breaking
backwards-compatibility for those users before we are ready with an
actual UI for users to choose directly.
This also comes with a migration to set the value of the new field for
existing streams with an algorithm matching that used at runtime.
With significant changes by Tim Abbott.
This is an initial part of our efforts on #9232.
For certain queries where both include_history and
use_first_unread_anchor are set to True, we were excluding
historical rows. Now we only use the use_first_unread_anchor
flag to filter rows that we use to find the anchor, without
having it filter the actual search results.
The bug went unreported for a long time, because it only
affected mobile users who had newly subscribed to streams.
Note that we make a small change to the test called
test_use_first_unread_anchor_with_muted_topics, which has
a very scary comment about being "arcane" and "be
absolutely sure you know what you're doing." I think it's
fine.
Also, the new test code would fail before this fix, so it
should help prevent future regressions.
Fixes#8958
This is a bit more than a pure refactor, because we duplicate a
chunk of code to calculate a query inside of
find_first_unread_anchor(), so we're doing a bit more work
than before.
We need this refactoring to start decoupling find_first_unread_anchor
from get_messages_backend for the case where include_history is
True. This will happen in a subsequent commit.
The only test that changes here is a direct test on
find_first_unread_anchor(). All other tests pass without
modification, and we have decent coverage on get_messages_backend.
We don't indend for this server-level setting to exist in the long
term; the purpose of this is just to make it easy to test this code
path for development purposes.
This implements much of the Message side part of #2745.
We now consistently set our query limits so that we get at
least `num_after` rows such that id > anchor. (Obviously, the
caveat is that if there aren't enough rows that fulfill the
query, we'll return the full set of rows, but that may be less
than `num_after`.) Likewise for `num_before`.
Before this change, we would sometimes return one too few rows
for narrow queries.
Now, we're still a bit broken, but in a more consistent way. If
we have a query that does not match the anchor row (which could
be true even for a non-narrow query), but which does match lots
of rows after the anchor, we'll return `num_after + 1` rows
on the right hand side, whether or not the query has narrow
parameters.
The off-by-one semantics here have probably been moot all along,
since our windows are approximate to begin with. If we set
num_after to 100, its just a rough performance optimization to
begin with, so it doesn't matter whether we return 99 or 101 rows,
as long as we set the anchor correctly on the subsequent query.
We will make the results more rigorous in a follow up commit.
If anchor is 0, there is no sense doing a before_query.
Likewise, if anchor is `LARGER_THAN_MAX_MESSAGE_ID`, there is
no sense doing an after_query.
We introduce variables called `need_before_query` and
`need_after_query` to enforce those conditions.
This also adds some comments explaining the fallthrough case
where neither query makes sense.
If use_first_unread_anchor is set and we don't have any unread
messages, then our anchor is effectively "positive infinity" and
we can streamline queries.
In the past we'd have clauses like `message_id <= 999999999999999`
in the query that were harmless but crufty.
This field has been unused by clients for some time, and isn't great
for our public archive feature plans (where we'll not want to be
including email addresses in messages).
This requires updating one of the tests for the group_pm_with feature
in test_narrow to use the new style of tautology generated by SQLAlchemy.
Thanks to Sinwar for investigating this.
Fixes#8381.