We now consistently set our query limits so that we get at
least `num_after` rows such that id > anchor. (Obviously, the
caveat is that if there aren't enough rows matching the query,
we'll return the full set of matching rows, which may be fewer
than `num_after`.) Likewise for `num_before`.
Before this change, we would sometimes return one too few rows
for narrow queries.
Now, we're still a bit broken, but in a more consistent way. If
we have a query that does not match the anchor row (which could
be true even for a non-narrow query), but which does match lots
of rows after the anchor, we'll return `num_after + 1` rows
on the right hand side, whether or not the query has narrow
parameters.
The off-by-one semantics here have probably been moot all along,
since our windows are approximate to begin with. If we set
num_after to 100, that's just a rough performance hint, so it
doesn't matter whether we return 99 or 101 rows, as long as we
set the anchor correctly on the subsequent query.
We will make the results more rigorous in a follow-up commit.
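A minimal sketch of the intended limit semantics, in SQLAlchemy-style
Python (`base_query` and the surrounding code are assumptions, not the
actual Zulip query builder; `anchor`, `num_after`, and `num_before` come
from the text above):

```python
from sqlalchemy import column, desc

# One slot is reserved for the anchor row itself, so we still get
# num_after rows with id > anchor when the anchor row matches.  When
# the anchor row does NOT match the query, this same limit is what
# produces the num_after + 1 rows described above.
after_query = (base_query
               .where(column("id") >= anchor)
               .order_by(column("id"))
               .limit(num_after + 1))

before_query = (base_query
                .where(column("id") <= anchor)
                .order_by(desc(column("id")))
                .limit(num_before + 1))
```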
If anchor is 0, there is no sense doing a before_query.
Likewise, if anchor is `LARGER_THAN_MAX_MESSAGE_ID`, there is
no sense doing an after_query.
We introduce variables called `need_before_query` and
`need_after_query` to enforce those conditions.
This also adds some comments explaining the fallthrough case
where neither query makes sense.
If use_first_unread_anchor is set and we don't have any unread
messages, then our anchor is effectively "positive infinity" and
we can streamline queries.
In the past we'd have clauses like `message_id <= 999999999999999`
in the query that were harmless but crufty.
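A hedged sketch tying these conditions together (the flag names and
`LARGER_THAN_MAX_MESSAGE_ID` come from the commits above; the variable
`first_unread_message_id` and the surrounding code are assumptions):

```python
if use_first_unread_anchor and first_unread_message_id is None:
    # No unread messages: the anchor is effectively "positive infinity".
    anchor = LARGER_THAN_MAX_MESSAGE_ID

need_before_query = anchor > 0
need_after_query = anchor < LARGER_THAN_MAX_MESSAGE_ID
# If both flags are False, neither query makes sense, and we fall
# through to returning an empty list of rows.
```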
This field has been unused by clients for some time, and isn't
great for our public archive feature plans (where we won't want
to include email addresses in messages).
This requires updating one of the tests for the group_pm_with feature
in test_narrow to use the new style of tautology generated by SQLAlchemy.
Thanks to Sinwar for investigating this.
Fixes #8381.
We don't have our linter checking test files due to ultra-long strings
that are often present in test output that we verify. But it's worth
at least cleaning out all the ultra-long def lines.
This refactoring doesn't change behavior, but it sets us up
to more easily handle a register setting for `client_gravatar`,
which will allow clients to tell us they're going to compute
their own gravatar URLs.
The `client_gravatar` flag already exists in our code, but it
is only used for Django views (users/messages), not for Zulip
events.
The main change is to move the call to `set_sender_avatar` into
`finalize_payload`, which adds the boolean `client_gravatar`
parameter to that function. And then we update various callers
to supply that flag.
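A rough sketch of the new shape (the signature and body are inferred
from this description, not copied from the actual code):

```python
from typing import Any, Dict

class MessageDict:
    @staticmethod
    def finalize_payload(obj: Dict[str, Any], apply_markdown: bool,
                         client_gravatar: bool) -> None:
        MessageDict.set_sender_avatar(obj, client_gravatar)
        if apply_markdown:
            # Clients that want HTML get the rendered content.
            obj["content"] = obj["rendered_content"]
        # ... then strip any server-only fields before the payload
        # goes out to clients.
```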
One small performance benefit of this change is that we now
lazily compute the client message payloads in
`event_queue.process_message_event`, so this will improve
performance if all interested clients have the same value of
`apply_markdown`. But the change here is really preparing us
for the additional boolean parameter, which will cause us to
have four variations of the payload.
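A hedged sketch of what that lazy computation could look like,
building on the `finalize_payload` sketch above (the function and
parameter names here are hypothetical):

```python
import copy
from typing import Any, Dict, Tuple

def get_client_payload(message_dict: Dict[str, Any],
                       cache: Dict[Tuple[bool, bool], Dict[str, Any]],
                       apply_markdown: bool,
                       client_gravatar: bool) -> Dict[str, Any]:
    # Build each (apply_markdown, client_gravatar) flavor at most once;
    # all clients sharing the same flags share the computed payload.
    key = (apply_markdown, client_gravatar)
    if key not in cache:
        obj = copy.deepcopy(message_dict)
        MessageDict.finalize_payload(obj, apply_markdown, client_gravatar)
        cache[key] = obj
    return cache[key]
```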
tsearch_extras returns search offsets in bytes, but our highlight
function treated them as character offsets. This adds a check to
subtract the extra bytes when the tsearch search backend is being
used.
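A minimal sketch of that adjustment (a hypothetical helper, not the
exact code): the character offset is the length of the decoded byte
prefix.

```python
def bytes_to_char_offset(text: str, byte_offset: int) -> int:
    # tsearch_extras reports offsets into the UTF-8 encoding of the
    # text; decode the byte prefix to recover the character offset.
    prefix = text.encode("utf-8")[:byte_offset]
    return len(prefix.decode("utf-8", errors="ignore"))
```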
Fixes #4084.
Fixes #7021.
Do you call get_recipient(Recipient.STREAM, stream_id) or
get_recipient(stream_id, Recipient.STREAM)? I could never
remember, and it was not very type safe, since both parameters
are integers.
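One way to make call sites unambiguous is a purpose-named wrapper,
sketched here (the helper name is an assumption; `get_recipient` and
`Recipient.STREAM` come from the text above):

```python
def get_stream_recipient(stream_id: int) -> Recipient:
    # Callers can no longer swap the two integer arguments.
    return get_recipient(Recipient.STREAM, stream_id)
```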
In do_send_messages, we only produce one dictionary for
the event queues, instead of different flavors for text
vs. html. This prevents two unnecessary queries to the
database.
It also means we only put one dictionary on the "message"
event queue instead of two, albeit a wider one that has
some values that won't be sent to the actual clients.
This wider dictionary from MessageDict.wide_dict is also
used for the `feedback_messages` queue and service bot
queues. Since the extra fields are possibly useful down
the road, and they'll just be ignored for now, we don't
bother to remove them. Also, those queue processors won't
have access to `content_type`, which they shouldn't need.
Fixes #6947.
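A hedged sketch of the flow (only `MessageDict.wide_dict` and the queue
names come from this commit; `publish_event` is a schematic stand-in):

```python
wide_dict = MessageDict.wide_dict(message)

# The same dictionary feeds every consumer: the "message" event
# queue, feedback_messages, and the service bot queues; the extra
# fields are simply ignored where they aren't needed.
publish_event("message", dict(message=wide_dict, users=user_ids))
```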
Before this change, we populated two cache entries for each
message that we sent. The entries were largely redundant,
with the only difference being whether we sent the content
as raw markdown or as the rendered HTML.
This commit makes it so we only have one cache entry per
message, and it includes both content and rendered_content.
One legacy source of confusion here is that `content`
changes meaning when you're on the front end. Here is the
situation going forward:
database:
    content = raw
    rendered_content = rendered

cache entry:
    content = raw
    rendered_content = rendered

payload for the frontend:
    content = raw (for apply_markdown=False)
    content = rendered (for apply_markdown=True)
Clients fetching messages can now specify that they are able
to compute their own avatars; if they set `client_gravatar` to
True in the request (with our normal encoding scheme), then the
backend will not compute avatar URLs, and the payload will be
smaller.
The fix starts with get_messages_backend. The flag gets
passed down through these functions:
* MessageDict.post_process_dicts.
* MessageDict.set_sender_avatar.
We also fix up the callers of post_process_dicts to explicitly
pass in the client_gravatar flag, but for now they all just hard
code the value to False.
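A hedged sketch of the avatar step at the bottom of that chain,
written as a free function for brevity (the gravatar check and the
`compute_avatar_url` helper are assumptions; only `set_sender_avatar`
and `client_gravatar` come from the commit):

```python
from typing import Any, Dict

def set_sender_avatar(obj: Dict[str, Any], client_gravatar: bool) -> None:
    if client_gravatar and sender_uses_gravatar(obj):  # hypothetical check
        # The client told us it can compute gravatar URLs itself.
        obj["avatar_url"] = None
    else:
        obj["avatar_url"] = compute_avatar_url(obj)  # hypothetical helper
```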
Adding this field to the Stream model will prevent us from having
to look at realm data for several types of stream operations, which
can be prone to either doing extra database lookups or bloating
our cached data.
Going forward, we'll set stream.is_zephyr to True whenever the
realm's string id is "zephyr".
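A minimal sketch of the denormalization (the field name comes from
this commit; the model context is assumed):

```python
from django.db import models

class Stream(models.Model):
    # ... existing fields ...
    is_zephyr = models.BooleanField(default=False)

# Kept in sync wherever streams are created:
#     stream.is_zephyr = (stream.realm.string_id == "zephyr")
```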
Instead of using the `unified_reactions` mapping, start using the
`name_to_codepoint` mapping for converting emoji names to
codepoints. We had been using `unified_reactions` because, prior
to the emoji web PR, the `name_to_codepoint` mapping was generated
from emoji_map.json, which contained old codepoints, while
reactions required the new codepoints to display them using
sprite sheets.
This commit completely switches us over to using a
dedicated model called MutedTopic to track which topics
a user has muted.
This includes the necessary migrations to create the
table and populate it from legacy data in UserProfile.
A subsequent commit will actually remove the old field
in UserProfile.
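A hedged sketch of what the dedicated model could look like (the
model name comes from the commit; the fields and the max length are
assumptions):

```python
from django.db import models

class MutedTopic(models.Model):
    user_profile = models.ForeignKey(UserProfile, on_delete=models.CASCADE)
    stream = models.ForeignKey(Stream, on_delete=models.CASCADE)
    topic_name = models.CharField(max_length=60)  # assumed topic length cap

    class Meta:
        # A user mutes a given (stream, topic) pair at most once.
        unique_together = ("user_profile", "stream", "topic_name")
```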
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where we
were updating muted_topics spuriously before calling
do_update_muted_topic.