This commit adds a new field history_public_to_subscribers to the
Stream model, which serves a similar function to the old
settings.PRIVATE_STREAM_HISTORY_FOR_SUBSCRIBERS; we still use that
setting as the default value for new streams to avoid breaking
backwards-compatibility for those users before we are ready with an
actual UI for users to choose directly.
This also comes with a migration to set the value of the new field for
existing streams with an algorithm matching that used at runtime.
With significant changes by Tim Abbott.
This is an initial part of our efforts on #9232.
For certain queries where both include_history and
use_first_unread_anchor are set to True, we were excluding
historical rows. Now we only use the use_first_unread_anchor
flag to filter rows that we use to find the anchor, without
having it filter the actual search results.
The bug went unreported for a long time, because it only
affected mobile users who had newly subscribed to streams.
Note that we make a small change to the test called
test_use_first_unread_anchor_with_muted_topics, which has
a very scary comment about being "arcane" and "be
absolutely sure you know what you're doing." I think it's
fine.
Also, the new test code would fail before this fix, so it
should help prevent future regressions.
Fixes#8958
This is a bit more than a pure refactor, because we duplicate a
chunk of code to calculate a query inside of
find_first_unread_anchor(), so we're doing a bit more work
than before.
We need this refactoring to start decoupling find_first_unread_anchor
from get_messages_backend for the case where include_history is
True. This will happen in a subsequent commit.
The only test that changes here is a direct test on
find_first_unread_anchor(). All other tests pass without
modification, and we have decent coverage on get_messages_backend.
We don't indend for this server-level setting to exist in the long
term; the purpose of this is just to make it easy to test this code
path for development purposes.
This implements much of the Message side part of #2745.
We now consistently set our query limits so that we get at
least `num_after` rows such that id > anchor. (Obviously, the
caveat is that if there aren't enough rows that fulfill the
query, we'll return the full set of rows, but that may be less
than `num_after`.) Likewise for `num_before`.
Before this change, we would sometimes return one too few rows
for narrow queries.
Now, we're still a bit broken, but in a more consistent way. If
we have a query that does not match the anchor row (which could
be true even for a non-narrow query), but which does match lots
of rows after the anchor, we'll return `num_after + 1` rows
on the right hand side, whether or not the query has narrow
parameters.
The off-by-one semantics here have probably been moot all along,
since our windows are approximate to begin with. If we set
num_after to 100, its just a rough performance optimization to
begin with, so it doesn't matter whether we return 99 or 101 rows,
as long as we set the anchor correctly on the subsequent query.
We will make the results more rigorous in a follow up commit.
If anchor is 0, there is no sense doing a before_query.
Likewise, if anchor is `LARGER_THAN_MAX_MESSAGE_ID`, there is
no sense doing an after_query.
We introduce variables called `need_before_query` and
`need_after_query` to enforce those conditions.
This also adds some comments explaining the fallthrough case
where neither query makes sense.
If use_first_unread_anchor is set and we don't have any unread
messages, then our anchor is effectively "positive infinity" and
we can streamline queries.
In the past we'd have clauses like `message_id <= 999999999999999`
in the query that were harmless but crufty.
This field has been unused by clients for some time, and isn't great
for our public archive feature plans (where we'll not want to be
including email addresses in messages).
This requires updating one of the tests for the group_pm_with feature
in test_narrow to use the new style of tautology generated by SQLAlchemy.
Thanks to Sinwar for investigating this.
Fixes#8381.
We don't have our linter checking test files due to ultra-long strings
that are often present in test output that we verify. But it's worth
at least cleaning out all the ultra-long def lines.
This refactoring doesn't change behavior, but it sets us up
to more easily handle a register setting for `client_gravatar`,
which will allow clients to tell us they're going to compute
their own gravatar URLs.
The `client_gravatar` flag already exists in our code, but it
is only used for Django views (users/messages) but not for
Zulip events.
The main change is to move the call to `set_sender_avatar` into
`finalize_payload`, which adds the boolean `client_gravatar`
parameter to that function. And then we update various callers
to supply that flag.
One small performance benefit of this change is that we now
lazily compute the client message payloads in
`event_queue.process_message_event` now, so this will improve
performance if all interested clients have the same value of
`apply_markdown`. But the change here is really preparing us
for the additional boolean parameter, which will cause us to
have four variations of the payload.
tsearch_extras returns search offsets in bytes but our highlight
function treated them as character offsets. Added a check to subtract
extra bytes if the tsearch search backend is being used.
Fixes#4084.
Fixes#7021.
Do you call get_recipient(Recipient.STREAM, stream_id) or
get_recipient(stream_id, Recipient.STREAM)? I could never
remember, and it was not very type safe, since both parameters
are integers.
In do_send_messages, we only produce one dictionary for
the event queues, instead of different flavors for text
vs. html. This prevents two unnecessary queries to the
database.
It also means we only put one dictionary on the "message"
event queue instead of two, albeit a wider one that has
some values that won't be sent to the actual clients.
This wider dictionary from MessageDict.wide_dict is also
used for the `feedback_messages` queue and service bot
queues. Since the extra fields are possibly useful down
the road, and they'll just be ignored for now, we don't
bother to remove them. Also, those queue processors won't
have access to `content_type`, which they shouldn't need.
Fixes#6947
Before this change, we populated two cache entries for each
message that we sent. The entries were largely redundant,
with the only difference being whether we sent the content
as raw markdown or as the rendered HTML.
This commit makes it so we only have one cache entry per
message, and it includes both content and rendered_content.
One legacy source on confusion here is that `content`
changes meaning when you're on the front end. Here is the
situation going forward:
database:
content = raw
rendered_contented = rendered
cache entry:
content = raw
rendered_contented = rendered
payload for the frontend:
content = raw (for apply_markdown=False)
content = rendered (for apply_markdown=True)
Clients fetching messages can now specify that they are able
to compute their avatar, and if they set client_gratavar to
True in the request (w/our normal encoding scheme), then the
backend will not compute it, and the payload will be smaller.
The fix starts with get_messages_backend. The flag gets
passed down through these functions:
* MessageDict.post_process_dicts.
* MessageDict.set_sender_avatar.
We also fix up the callers for post_process_dicts to explicitly
pass in the client_gravatar path, but for now they all just hard
code the value to False.