Introduce MessageDict.post_process_dicts() will allow us
the ability to do the following:
* use less memory in the cache for repeated data
* prevent cache invalidation
* format data according to different client needs
The first use of this function is pretty inconsequential, but
it sets us up for more consequential changes.
In this commit we defer the MessageDict.hydrate_recipient_info
step until after we pull data out of the cache. This impacts
cache size as follows:
* streams - negligibly bigger
* PMs/huddles - slimmer due to not needing to repeat
sender data like email/full_name
Again, the main point of this change is to start setting up
the infrastructure to do post-processing.
We now do push notifications and missed message emails
for offline users who are subscribed to the stream for
a message that has been edited, but we short circuit
the offline-notification logic for any user who presumably
would have already received a notification on the original
message.
This effectively boils down to sending notifications to newly
mentioned users. The motivating use case here is that you
forget to mention somebody in a message, and then you edit
the message to mention the person. If they are offline, they
will now get pushed notifications and missed message emails,
with some minor caveats.
We try to mostly use the same techniques here as the
send-message code path, and we share common code with the
send-message path once we get to the Tornado layer and call
maybe_enqueue_notifications.
The major places where we differ are in a function called
maybe_enqueue_notifications_for_message_update, and the top
of that function short circuits a bunch of cases where we
can mostly assume that the original message had an offline
notification.
We can expect a couple changes in the future:
* Requirements may change here, and it might make sense
to send offline notifications on the update side even
in circumstances where the original message had a
notification.
* We may track more notifications in a DB model, which
may simplify our short-circuit logic.
In the view/action layer, we already had two separate codepaths
for send-message and update-message, but this mostly echoes
what the send-message path does in terms of collecting data
about recipients.
We want to convert stream names to stream ids as close
to the "edges" of our system as possible, so we let our
caller do the work of finding the stream id for a stream
narrow.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.
By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where
were updating muted_topics spuriously before calling
do_update_muted_topic.
For filters like has:link, where the web app doesn't necessarily
want to guess whether incoming messages meet the criteria of the
filter, the server is asked to query rows that match the query.
Usually these queries are search queries, which have fields for
content_matches and subject_matches. Our logic was handling those
correctly.
Non-search queries were throwing an exception related to tuple
unpacking. Now we recognize when those fields are absent and
do the proper thing.
There are probably situations where the web app should stop hitting
this endpoint and just use its own filters. We are making the most
defensive fix first.
Fixes#6118
Before this change, server searches for both
`is:mentioned` and `is:alerted` would return all messages
where the user is specifically mentioned (but not
at-all mentions).
Now we follow the JS semantics:
is:mentioned -- all mentions, including wildcards
is:alerted -- has an alert word
Here is one relevant JS snippet:
} else if (operand === 'mentioned') {
return message.mentioned;
} else if (operand === 'alerted') {
return message.alerted;
And here you see that `mentioned` is OR'ed over both mention flags:
message.mentioned = convert_flag('mentioned') || convert_flag('wildcard_mentioned');
The `alerted` flag on the JS side is a simple mapping:
message.alerted = convert_flag('has_alert_word');
Fixes#5020
The new endpoints are:
/json/mark_stream_as_read: takes stream name
/json/mark_topic_as_read: takes stream name, topic name
The /json/flags endpoint no longer allows streams or topics
to be passed in as parameters.
This function optimizes marking streams and topics as read,
by using UserMessage.where_unread(), which uses a partial
index on the "read" flag.
This also simplifies the code path for ordinary message
flag updates.
In order to keep 100% line coverage, I simplified the
logging in update_message_flags, so now all requests
will show the "actually" format.
This is an interim step toward creating dedicated endpoints
for marking streams/topics as reads, so we do error checking
with asserts for flag/operation, so we don't introduce a
temporary translation string.
This is mostly a pure code extraction, except that we now
disregard the `messages` option for stream/topic updates,
since the web app always passes in an empty list (and this
commit is really just an incremental step toward creating
new endpoints.)
The "all" option for 'message/flags' was dangerous, as it could
apply to any of our flags. The only flag it made sense for, the
"read" flag, now has a dedicated endpoint.
This change simplifies how we mark all messages as read. It also
speeds up the backend by taking advantage of our partial index
for unread messages. We also use a new statsd indicator.
This completes the major endpoint migrations to eliminate legacy API
endpoints from Zulip.
There's a few other things that will happen naturally, so I believe
this fixes#611.
This provides the main infrastructure for fixing #5598. From here,
it's a matter of on the one hand upgrading exception handlers -- the
many except-blocks in the codebase that look for JsonableError -- to
look beyond the string `msg` and pass on the machine-readable full
error information to their various downstream recipients, and on the
other hand adjusting places where we raise errors to take advantage
of this mechanism to give the errors structured details.
In an ideal future, I think all exception handlers that look (or
should look) for a JsonableError would use its contents in structured
form, never mentioning `msg`; but the majority of error sites might
continue to just instantiate JsonableError with a string message. The
latter is the simplest thing to do, and probably most error types will
never have code looking for them specifically.
Because the new API refactors the `to_json_error_msg` method which was
designed for subclasses to override, update the 4 subclasses that did
so to take full advantage of the new API instead.
With #5598 there will soon be an application-level error code
optionally associated with a `JsonableError`, so rename this
field to make clear that it specifically refers to an
HTTP status code.
Also take this opportunity to eliminate most of the places
that refer to it, which only do so to repeat the default value.
This new setting controls whether or not users are allowed to see the
edit history in a Zulip organization. It controls access through 2
key mechanisms:
* For long-ago edited messages, get_messages removes the edit history
content from messages it sends to clients.
* For newly edited messages, clients are responsible for checking the
setting and not saving the edit history data. Since the webapp was
the only client displaying it before this change, this just required
some changes in message_events.js.
Significantly modified by tabbott to fix some logic bugs and add a
test.
I pushed a bunch of commits that attempted to introduce
the concept of `client_message_id` into our server, as
part of cleaning up our codepaths related to messages you
sent (both for the locally echoed case and for the host
case).
When we deployed this, we had some strange failures involving
double-echoed messages and issues advancing the pointer that appeared
related to #5779. We didn't get to the bottom of exactly why the PR
caused havoc, but I decided there was a cleaner approach, anyway.
We are deprecating local_id/local_message_id on the Python server.
Instead of the server knowing about the client's implementation of
local id, with the message id = 9999.01 scheme, we just send the
server an opaque id to send back to us.
This commit changes the name from local_id -> client_message_id,
but it doesn't change the actual values passed yet.
The goal for client_key in future commits will be to:
* Have it for all messages, not just locally rendered messages
* Not have it overlap with server-side message ids.
The history behind local_id having numbers like 9999.01 is that
they are actually interim message ids and the numerical value is
used for rendering the message list when we do client-side rendering.
Two wrinkles here:
* It's actually a little subtle why `ok_to_include_history` is
correct; in particular, it's not true that a term `stream:foo` will
always limit the query to the stream `foo`. For this, add an
explanatory comment backed up by an assert.
* The TODO comment in `messages_in_narrow_backend` about assuming this
is a search, I'm pretty sure doesn't matter; it seems to only be
saying that we return the set of fields we would for a search.
They're harmless to send, and in any case it doesn't appear to be
true anymore that the client only calls this for a search: the
`can_apply_locally` function also causes narrows with `has:` to go
to the server. So just take that comment out.
(After writing the term "invariant" a few times in these comments and
now this commit message, my inner mathematician reminds me that this
property is properly termed a "monovariant" -- something does change,
but it changes only in one direction. Pretty sure saying "invariant"
communicates better here, though.)
This makes it possible for Zulip administrators to delete messages.
This is primarily intended for use in deleting early test messages,
but it can solve other problems as well.
Later we'll want to play with the permissions model for this, but for
now, the goal is just to integrate the feature.
Note that it saves the deleted messages for some time using the same
approach as Zulip's message retention policy feature.
Fixes#135.
This is a better solution to the problem of how _pg_re_escape should
handle the null character. There's really no good reason to have a
null character in a stream name.
In cases where old unread messages in the home view might have been
leaked (either due to bugs or unusual muting interactions), it's
theoretically possible for the first unread message in the home view
to be far older than the pointer.
Since the Zulip mobile app is loading messages following the
use_first_unread logic, we need to plug this gap.
Probably a longer-term solution will involve changing how
update_message_flags works to automatically advance the pointer, but
this change should make it possible for the mobile apps to
consistently use the `use_first_unread` mechanism for fetching the
latest home view messages.
With tweaks to the tests by tabbott.
Fixeszulip/zulip-mobile#422.
textsearch based full text search doesn't match text in link tag but
PGroonga based full text search can match text in link tag.
Without this change, highlighting text in link tag generates broken
HTML.
This makes get_stream match get_realm, get_user_profile_by_email,
etc., in interface, and is more convenient for mypy annotations
because `get_stream` now doesn't return an Optional[Stream].
An empty narrow (ie, the home view) can be represented in code as either
`None` or `[]` but we had incorrect handling that failed to fully
properly deal with either case.
(1) In `get_stream_name_from_narrow`, we failed to deal with `None` by
trying to always iterate over `narrow`.
(2) In several other places, we failed to deal with `[]` by explicitly
checking `if narrow is None` or `if narrow is not None`. Changing these
to truthiness checks should work for both the `None` and `[]` cases.
Like many rare-case code with new tests, it turns out that the logic
for handling null characters in our Zephyr postgres query escaping
never worked, in multiple ways. First, it always changed the second
character in s, not the current one being inspected, and second, the
value it replaced it with was no the correct postgres escape of the
null byte. We fix this and add tests.
This completes the effort to get zerver/views/messages.py to 100%
test coverage.
Fixes#1006.
When you edit a message to contain links, and URL previews are
enabled, previously we'd throw an exception, because the realm ID
wasn't included in the event.
Also adds a test so that we can have effective test coverage on this
codepath, though this history is actually that I found the bug through
writing this test :).
Change `from django.utils.timezone import now` to
`from django.utils import timezone`.
This is both because now() is ambiguous (could be datetime.datetime.now),
and more importantly to make it easier to write a lint rule against
datetime.datetime.now().
The comments explain why this change is correct. This change is
useful because it's better to not have dead code paths, both because
it makes our life easier for coverage analysis, and because the else
statement provided the illusion that it could actually happen.
If the analysis in that comment is wrong, we'd rather have a 500 error
so we fix the bug than things silently sorta working.
This arguably regresses the Zephyr experience, in that we no longer
consider 'foo.d.d.d.d.d' to be something that gets narrowed in with
the rest, but that's a pretty rare use case anyway.
In practice, using that many '.d's anyway only happens a few times a
year.
This makes it super easy for frontend code using this view code to
produce a nice display of the history.
This also fixes an off-by-one error with the timestamps.
Based on work by Kartik Maji in #1204.
This has a few significant changes from the original version:
* We correctly handle filling in data for topic edits
* Has a complete test suite verifying correctness of the logic
* Currently, it doesn't include a special "start" entry
Things we may want to further change include:
* Adding a special "start" entry.
* Reversing the order of the history data returned for clarity.
This fixes a bug where update_message_backend would do one memcached
query per user receiving a given message. Right now we just do a
single bulk database query, but in principle we could use
generic_bulk_cached_fetch to use the cache as well.
This changes bugdown to use the realm passed in by the caller (if any)
for rendering, fixing a problem where bots such as the notification
bot would have their messages rendering using the admin realm's
settings, not the settings of the realm their messages are being sent
into.
Also adds a test for the notification bot case.
Fixes#3215.
Finishes the refactoring started in c1bbd8d. The goal of the refactoring is
to change the argument to get_realm from a Realm.domain to a
Realm.string_id. The steps were
* Add a new function, get_realm_by_string_id.
* Change all calls to get_realm to use get_realm_by_string_id instead.
* Remove get_realm.
* (This commit) Rename get_realm_by_string_id to get_realm.
Part of a larger migration to remove the Realm.domain field entirely.
This change adds support for displaying inline open graph previews for
links posted into Zulip.
It is designed to interact correctly with message editing.
This adds the new settings.INLINE_URL_EMBED_PREVIEW setting to control
whether this feature is enabled.
By default, this setting is currently disabled, so that we can burn it
in for a bit before it impacts users more broadly.
Eventually, we may want to make this manageable via a (set of?)
per-realm settings. E.g. I can imagine a realm wanting to be able to
enable/disable it for certain URLs.
Previously, the way that render_messages was calling bugdown meant
that the preview feature didn't have access to realm data like the
list of users or streams, resulting in previews for those elements
being wrong.
Now render_message_backend uses zerver.lib.render_markdown to render
messages correctly.
[Commit message tweaked and test added by tabbott]
With reactions and other upcoming features, we'll be adding several
places where we need to check whether a particular user can access a
particular message. It's best to just have a single helper function
for this purpose that we can use everywhere.
This pulls message-related code from models.py into a new
module called message.py, and it starts to break some bugdown
dependencies. All the methods here are basically related to
serializing Message objects as dictionaries for caches and
events.
extract_message_dict
stringify_message_dict
message_to_dict
message_to_dict_json
MessageDict.to_dict_uncached
MessageDict.to_dict_uncached_helper
MessageDict.build_dict_from_raw_db_row
MessageDict.build_message_dict
This fix also removes a circular dependency related
to get_avatar_url.
Also, there was kind of a latent bug in Message.need_to_render_content
where it was depending on other calls to Message to import bugdown
and set it globally in the namespace. We really need to just
eliminate the function, since it's so small and used by code that
may be doing very sketchy things, but for now I just fix it. (The
bug would possibly be exposed by moving build_message_dict out to the
library.)
I move these three functions to lib/cache.py:
to_dict_cache_key_id
to_dict_cache_key
flush_message
This will prepare us for a more significant refactoring that
eventually breaks down some circular dependencies with
Message and bugdown.
This was the original way to send messages via the Zulip API in the
very early days of Zulip, but was replaced by the REST API back in
2013.
Fixes: #730.
We no longer use all the alert words for all the users in the
entire realm when we look for alert words in a newly sent/edited
message. Now we limit the search to only all the alert words
for all the users who will get UserMessage records. This will
hopefully make a big difference for big realms where most messages
are only sent to a small subset of users.
We now use render_incoming_message() to render all incoming
new messages (sends/edits), so that they will get the same treatment.
This change also establishes do_send_messages() as the code
path to get new messages rendered. It removes some
logic from check_message() that only happened on certain code paths
for sending messages, and which would only detect failures by
expensively rendering messages, so it wasn't much of a guard.
This change also helps to phase out maybe_render_content(), which
deepens the call stack without providing much clarity to the reader,
since it's behavior is so variable.
Finally, this sets up to fix a flaw in the way we compute which
users have alert words in their messages (in a subsequent commit).
These annotations aren't perfect because the sqlalchemy stubs in
typeshed are broken (e.g. a `Select` doesn't have the ability to do
`.where()`, but we've at least used some typevars to make it easy to
address that when the sqlalchemy stubs are less broken).
This adds support for using PGroonga to back the Zulip full-text
search feature. Because built-in PostgreSQL full text search doesn't
support languages that don't put space between terms such as Japanese,
Chinese and so on. PGroonga supports all languages including Japanese
and Chinese.
Developers will need to re-provision when rebasing past this patch for
the tests to pass, since provision is what installs the PGroonga
package and extension.
PGroonga is enabled by default in development but not in production;
the hope is that after the PGroonga support is tested further, we can
enable it by default.
Fixes#615.
[docs and tests tweaked by tabbott]
Apparently, we had incorrectly concluded that our highlight_string
search result highlighting offsets coming from tsearch_extras were
measured in bytes, whereas in fact it is measured in characters.
There were a bunch of authorization and well-formedness checks in
zerver.lib.actions.do_update_message that I moved to
zerver.views.messages.update_message_backend.
Reason: by convention, functions in actions.py complete their actions;
error checking should be done outside the file when possible.
Fixes: #1150.
This is controlled through the admin tab and a new field in the Realms
table. This mirrors the behavior of the old hardcoded setting
feature_flags.disable_message_editing. Partially resolves#903.
For a long time, rest_dispatch has had this hack where we have to
create a copy of it in each views file using it, in order to directly
access the globals list in that file. This removes that hack, instead
making rest_dispatch just use Django's import_string to access the
target method to use.
[tweaked and reorganized from acrefoot's original branch in various
ways by tabbott]
This changes the type annotations for the cache keys in Zulip to be
consistently text_type, and updates the annotations for values that
are used as cache keys across the codebase.
As documented in https://github.com/zulip/zulip/issues/441, Guardian
has quite poor performance, and in fact almost 50% of the time spent
running the Zulip backend test suite on my laptop was inside Guardian.
As part of this migration, we also clean up the old API_SUPER_USERS
variable used to mark EMAIL_GATEWAY_BOT as an API super user; now that
permission is managed entirely via the database.
When rebasing past this commit, developers will need to do a
`manage.py migrate` in order to apply the migration changes before the
server will run again.
We can't yet remove Guardian from INSTALLED_APPS, requirements.txt,
etc. in this release, because otherwise the reverse migration won't
work.
Fixes#441.
We can't just check that the realms are the same because ist.mit.edu is an open
realm and uses @mit.edu email addresses.
(imported from commit 7dbaa81cea6e4f82563dfc0cfe67a61fe9378911)
This allows bot owners to configure which streams messages are delivered
to without needing to change webhook URLs or configuration files.
(imported from commit 32a0c26657c145b001cd8cb3ce0a0364d48902ce)
If we call exclude_muting_conditions() with a non-stream
narrow, it will now include a condition to exclude streams
that are not in your home view. As of now, this code only
executes during testing, but it sets the stage for doing
better in:home queries on the back end.
(imported from commit bbd764bd0e9588a50e4a82c915e82a2c1b99d73e)
If we are already narrowing to a stream, then we can disregard
muted topics in all the other streams and create a simpler query
for the DB to execute.
(imported from commit 35a074a76eec99922034a381741355da3fdd5b39)
Due to the way we store muted topics, it is possible that a
muted topic stream name may no longer exist, and we need to
handle that case gracefully.
(imported from commit 4d18ec55e45213657a67e160848229678f212765)
Previously, we assumed that num_before or num_after would be always be non-zero
after adjustment for the anchor. However, we don't adjust num_before or
num_after when a narrow is specified.
(imported from commit 9239fef140e109b11bdfbeef42e9fbed78660ad1)
Behind a feature flag you can now do searches like this:
-pm-with:othello@example.com is:private
The "-" in front of "pm-with" tells us to exclude messages
with Othello from our search. We support "-" in front of
all operators, although the behavior for "-search:" and
and "-near:" doesn't really change in this commit.
Note that the filtering out of "negated" predicates only
happens on the client side in this commit. On the server
side we ignore negated predicates and send back a superset
of the results.
(imported from commit 6cdeaf32f2d493fbbb838630f0da3da880b1ca18)
This commit doesn't change any functionality, and it is
designed to make diffs for upcoming changes related to
negated conditions a bit easier to read. This diff
looks a bit noiser than it really is due to some
reindentation of continuation lines.
(imported from commit 64c1cba98faa4bad4eaad122dd3de119caa880c0)
The narrow_parameter convert now converts tuples of
(operator, operand) into dictionaries so that downstream
functions/classes deal in dictionaries. This
affects get_old_messages_backend, messages_in_narrow_backend,
and NarrowBuilder.
(imported from commit 7e8cb887f7872ec687acd8c4857d1d5222ab0d5f)
This implementation is somewhat hackish in large part because I think
we're going to be wanting to redo the get_old_messages API somewhat
soon, and this may naturally become a lot cleaner as a result, but
this isn't a lot of code and fixes#2235 part (A) and substantially
mitigates #1510.
(imported from commit 47a2160a44befa9d83190c5cc95b90e92cc5b4cc)
Previously we only disabled it for 'is:starred'. This caused a backtrace when
someone searched for, say 'stream:test is:mentioned', because we weren't joining
on zerver_usermessage, which meant the flag wasn't available to filter on.
(imported from commit ba19f8a74b21d60b89dfc8dbe9c8458ed86b423b)
The normal code will now generate SQL that is basically identically to the raw
SQL we were using before.
(imported from commit 84a3971d6137d05ef3f71252278afdd59041e86a)
This commit also includes a few changes in the way we do some queries that
should speed searches up:
* Messages before and after the anchor are fetched in a single query by doing a
UNION ALL on the server. This incurs the cost of an extra sort because UNION
ALL does not guarantee the order that the results are appended, but it saves a
server round-trip.
* Searches involving flags now use a straight froward WHERE clause, which is
much faster than the one that django-bitfield generates (due to limitations of
the Django ORM)
(imported from commit a0db811a9073363cfabcf4b035d02d20dc8fc8a4)
This requires the tsearch_extras Postgres extension. To install the extension,
first install postgresql-9.1-tsearch-extras on both postgres-primary and
postgres-secondary (this would normally be done in a puppet apply, but there are
currently some changes that can't be applied on Postgres machines). Then run
the following as the postgres user on postgres-primary:
$ psql -d zulip -c 'CREATE EXTENSION tsearch_extras SCHEMA zulip;'
In dev environments, you must also run:
$ psql -d zulip_test_template -c 'CREATE EXTENSION tsearch_extras SCHEMA zulip;'
(imported from commit ad0a57c455b3b86002191ac5fb705d8f716f3296)
NarrowBuilder.by_stream and NarrowBuilder.by_topic for mit users uses a
regex to search by stream and topic. Python's re.escape escapes unicode
in a format that postgres can not parse. We escape unicode as '\uXXXX'
for postgres.
(imported from commit d2c27d4514c31fdc6ef1fea898fe721a6f0ab069)