Backend logic for handling user mention was cluttered
because it was handled at two stages first in
get_possible_mentions_info while fetching mention data
based on the messsage and then later in UserMentionPattern
which handles processing of text for mention.
Ideally UserMentionPattern should depend on
get_possible_mentions_info only for data but there was a
shared logic between these two that made it hard to debug
any possible bugs.
Updates in this commit make both of these functions
coherent in terms of logic and also add appropiate
comments to improve readability of these functions.
There was also a hidden bug that if a user A is
mentioned in with @**name|id** then @**invalid|id**
again mentioned A because of the way we handled mentions
earlier. It is solved as a result of this refactor and
appropiate test has been added for this.
This has been tested manually as well as by adding new
test to address missing case.
I noticed this because the test_events.py tests had the extremely
weird pattern of calling the actual change function, and then testing
the `notify` function's state changes (which should always be noops),
rather than actually testing the state change function.
Fixing the test made it clear that the actual logic in events.py
simply did not handle deleting custom_profile_field_value elements
from user objects when a custom_profile_field object was deleted.
So we fix that bit of logic as well.
It appears this bug was unique -- at least we don't have any other
notify_* functions being used directly in test_events.py, and the
handful of state_change_expected=False entries are all events for data
not present in page_params.
* `op` (operation) field, added in f6fb88549f, was never intended for
`custom_profile_fields` event. This commit removes the `op` as it doesn't
have any use in the code.
* As a part of cleanup, this also eliminates the schema check warnings
for `custom_profile_fields` event, mentioned in #17568.
When running some tests multiple times in the same call,
were failing because of the data duplication.
This commit resolves that issue by resetting the test
environment (i.e: Re-cloning test database and clearing
cache) after each run.
Fixes#17607.
This is no longer used in any important place,
get_user_profile_by_email is meant to be used only in manage.py shell
now and thus there's no point in this function being cached.
Emails are not unique, so we can only sensibly cache using keys formed
with both email and realm.
This requires adding a new cache key function for caching by delivery
email - user_profile_delivery_email_cache_key.
Extend our markdown system to support mentioning of users
by id also. Following these changes, it would be possible
to mention users with @**|user_id** and silently mention
using @_**|user_id**.
Main intention for extending the mention syntax is to make
it convenient for bots to mention a users using their ids. It
is to be noted that previous syntax are also supported.
Documentation tweaked by tabbott for better readability.
The changes were tested manually in development server, and also
by adding some new backend and frontend tests.
Fixes: #17487.
We add support to shorten links and test their shortening in
well-organized, clean manner that makes it trivial to extend the
GitHub approach for GitLab and perhaps other services.
We only shorten basic types of GitHub links (issue, PR, commit) that
fit a set of simple common patterns; the default behaviour of Autolink
is kept for everything else.
Logic added in frontend and backend Markdown Processor is identical.
This makes easy to extend the logic for other services like GitLab.
Fixes#11895.
This commit changes the list_to_streams function to raise error
according to create_stream_policy value when a user cannot create
streams instead of same error for all cases.
This add the schema checker, openapi schema, and also a test for
realm/deactivated event.
With several block comments by tabbott explaining the logic behind our
behavior here.
Part of #17568.
We discovered recently that some ops for events were just not
implemented in events.py (specifically, realm/deactivated).
Since our goal is for events.py to be complete, we add this bit of
hardening to ensure that it stays that way.
Modifies `StreamPattern` and `StreamTopicPattern` to inherit
from InlineProcessor instead of Pattern. This change is done
because Pattern stopped checking for matching patterns as soon
as it found a match which was not a valid stream. Due to this
all the subsequent mention failed, even if they were valid.
This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.
Due to above changes verbose_compile is no longer used for
precompiling STREAM_LINK_REGEX, STREAM_TOPIC_LINK_REGEX as
adds ^(.*?) and (.*?)$ which cause extra overhead of matching
pattern which is not required. With new InlineProcessor these
extra patterns at beggining and end are not required.
So, StreamPattern and StreamTopicPattern now define their own
__init__ method for precompiling the regex.
Fixes#17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Modifies `UserGroupMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user group. Due to this all
the subsequent user group mention failed, even if they were
valid. This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.
This was reported as issue #17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Modifies `UserMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user. Due to this all the
subsequent user mention failed. This bug was only present in
backend renderring due to markdown.inlinepatterns.Pattern.
This was reported as issue #17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
This is preparatory work for investigating reports of missing unread
messages.
It's a little surprising that not test failed after adding the code
without API documentation.
Co-Author-By: Tushar Upadhyay (tushar912).
This is a prep commit which modifies the
`send_message_moved_breadcrumbs` function to take
message strings as input.
This is done to reuse the function in other places
like the /digress command.
Structurally, exception, failure_message, and status_code are mutually
exclusive in how this function is called, and it's best for the
function's flow to represent that.
The message from the bot which triggered the 407 error message notifies
the bot owner about the exceptions as well in the error message. This
commit handles it more gracefully and shows a generic message.
The messages from the bot which were triggered by the outgoing_webhooks
didn't have the bot name in them. This commit adds the bot name to it
and makes the corresponding changes in the tests.
On replying to an email notifcation from a stream where the user
does not come under the stream_post_policy will subsequently result
in a failure. In such a case, the user does not receive feedback
regarding the failure.
Notify the user via notification bot if their email
message failed to send.
Fixes#16642.
If the client has an old version of the code which is not present on
the server, don't throw a 500; instead, default to the same `unable to
look up in source map` message is used when the line numbers don't
line up.
Minimized code duplication by integrating POSTRequestMock into
HostRequestMock and then updating the required files with
HostRequestMock.
Fixes part of #1211.
This commit renames the is_new_member property in models.py
to is_provisional_member which will return true for any user
who is not a full member. We will add a condition in further
commit such that this returns 'False' for a moderator as we
will initially give all the rights to moderator that a full
member has.
Its likely that we would implement new hotspots that aren't
a part of the tutorial hotspots, in the future. For instance,
a hotspot to advertise new features. Hence, grouping them into
categories like INTRO_HOTSPOTS would be a good start. We also
have an aggregate of all types of hotspots we may add in the
future, under ALL_HOTSPOTS.
Fixes#17238.
In process_new_human user, the queries were wrong, revoking all invites
sent to the email address, even in other realms than the one where the
new account just got created.
Currently, the ID and Type fields didn't have a description,
and weren't being displayed. Added a schema component to add
descriptions, and display on the api page. Fixes part of #15967.
This commmit includes ROLE_MODERATOR in realm_user_count_by_role.
We also update test_change_role in test_audit_log.py to include
changes for moderator role as well.
Note that at this point, it's not possible to create moderator users;
this just will make it easier to write tests for logic involving them
as we develop the feature.
We currently not allow new bots to send message in stream with post
policy as 'STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS', but we should
allow them to send messages if their owner is a full member.
This will make it consistent with behavior in stream with post
policy as 'STREAM_POST_POLICY_ADMINS_ONLY' where we allow non admin
bots with owner as admin to send messages.
According to tests we should not allow bot without owners to
post in streams with STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS.
But the code does not handle this and the related test passes
and raises error for case of bots without owner because the bot
is itself a new member.
This commit fixes this by adding a condition to check if there
is no bot owner and then raise error if there is no owner.
This is a minor refactor which renames the
notify_topic_moved_streams function to
send_message_moved_breadcrumbs.
This is done because this function will be also used
for other things in the future, when moving streams
or when using the /digress command, for example.
Added assertion to check that if a deprecated flag is in a field's
schema, then it should have deprecated mentioned in description
as well, and moved these checks to a separate function.
Fixes part of #15967.
user_profile.id was confused for user_profile.recipient_id. These bugs
are particularly sneaky as they can go undetected by tests due to ids of
objects accidentally coinciding. We add a mitigation for this class of
mistakes by shifting the Recipient.id sequence in test db.
This was introduced in dda3ff41e1.
On the rare occasion where user_profile.id would coincide with
recipient_id passed to the function, we would return the wrong value.
That is, instead of correctly returning recipient_id, we would return
sender.recipient_id - recipient id of the sender of the message, thus
possibly returning user_profile.recipient_id (if user_profile is the
sender) - exactly the situation the function wanted to avoid
with the `if recipient_id == my_recipient_id:` if. Ultimately resulting
in incorrect/malformed data in
state['raw_recent_private_conversations'].
nlargest is the natural fit for selecting n biggest items
from an unsorted list. It's more readable as well as more
efficent (even though we don't care much about the efficeny
in this particular case).
The current logic doesn't display data types when the additionalProperties
variables are not object, but are array of strings, etc. Changed the if
condition to allow rendering in such cases.
zerver/lib/users.py has a function named access_user_by_id, which is
used in /users views to fetch a user by it's id. Along with fetching
the user this function also does important validations regarding
checking of required permissions for fetching the target user.
In an attempt to solve the above problem this commit introduces
following changes:
1. Make all the parameters except user_profile, target_user_id
to be keyword only.
2. Use for_admin parameter instead of read_only.
3. Adds a documentary note to the function describing the reason for
changes along with recommended way to call this function in future.
4. Changes in views and tests to call this function in this changed
format.
Changes were tested using ./tools/test-backend.
Fixes#17111.
Previously, the data type of responses wasn't displayed in the API
Documentation, even though that OpenAPI data is carefully validated
against the implementation. Here we add a recursive function to
render the data types visibly in API Documentation.
Fixes part of #15967.
Commit 434094e599 (#11321) changed this
from an Extension to a subclass of Markdown, so it no longer has any
reason to use a config dict structured like that of an Extension.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This isn't quite the right model, because we're not actually going
through the upload code path, but it does at least provide some inline
image previews in the data.
Fixes part of #14991.
The changes are as follows:
• Fix one day offset in all western zones.
• Correct CST from -64800 to -21600 and CDT from -68400 to -18000.
• Disambiguate PST in favor of -28000 over +28000.
• Add GMT, UTC, WET, previously excluded for being at offset 0.
• Add ACDT, AEDT, AKST, MET, MSK, NST, NZDT, PKT, which the previous
code did not find.
• Remove numbered abbreviations -12, …, +14, which are unnecessary.
• Remove MSD and PKST, which are no longer used.
Hardcode the dict and verify it with a test, so that future
discrepancies won’t go silently unnoticed.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
https://docs.djangoproject.com/en/3.1/releases/3.1/
- django.contrib.postgres.fields.JSONField is deprecated and should be
replaced with models.JSONField
- The internals of the implementation in the postgresql backend have
changed a bit in
f48f671223
and thus we need to make an ugly tweak in test_runner.
- app_directories.Loader.get_dirs() now returns a list of PosixPath so
we need to make a small tweak in TwoFactorLoader for that (PosixPath
is not iterable)
Fixes#16010.
This commit updates the Zulip User-Agent to
'Mozilla/5.0 (compatible; ZulipURLPreview/{version}; +{external_host})'
as the older User-Agent was rendering Markdown YouTube titles as
'YouTube - YouTube'.
Fixes#16970.
c2526844e9 removed the `signups` queue
worker, and the command-line tool that enqueues to it -- but not the
automated process that enqueues during signups itself.
Remove the signup, since it is no longer in use.
Previously, the data type of parameters wasn't displayed in the API
Documentation, even though that OpenAPI data is carefully validated
against the implementation. Here we add a recursive function to
render the data types visibly in the API documentation.
This only covers the request parameters; we'll want to do something
similar for response parameters in a follow-up PR.
Fixes part of #15967.
When we were getting an apply_event call for
a subscription/add event, we were trying not to
mutate the event itself, but this clumsy code
was still mutating the actual event:
# Avoid letting 'subscribers' entries end up in the list
for i, sub in enumerate(event['subscriptions']):
event['subscriptions'][i] = \
copy.deepcopy(event['subscriptions'][i])
del event['subscriptions'][i]['subscribers']
This is only a theoretical bug.
The only person who receives a subscription/add
event is the current user.
And it wouldn't have affected the current user,
since the apply_event was correctly updating the
state, and we wouldn't actually deliver the event
to the client (because the whole point of apply_event
is to prevent us from having to piggyback the
super-recent events on to our payload or put
them into the event queue and possibly race).
The new code just cleanly makes a copy of each
sub, if necessary, as we add them to state["subscriptions"].
And I updated the event schemas to reflect that
subscribers is always present in subscription/add
event.
Long term we should probably avoid sending subscribers
on this event when the clients don't set something
like include_subscribers. That's a fairly complicated
fix that involves passing in flags to ClientDescriptor.
Alternatively, we could just say that our policy is
that we never send subscribers there, but we instead
use peer_add events. See issue #17089 for more
details.
It's always cleaner to work in id space. It probably
would have required a perfect storm to have broken
the existing code, but using ids is obviously more
robust in theory, and just as simple.
We now require keywords, so that there is no
pitfall for mixing up boolean parameters.
Positional parameters are basically evil
when you have a bunch of bools.
I also make user_profile the first argument.
Finally, the code is more diff-friendly.
I eliminate the defaults, since the existing code
was already specificying values for most things.
I move all the booleans to the bottom for both
parameters and arguments.
I require explicit keywords for everything but
user_profile (which is now first).
And, finally, I format the code in a more
diff-friendly manner.
We eliminate some redundant checks.
We also consistently provide a `subscribers` field
in our stream data with `[]`, even if our users
can't access subscribers. We therefore bump
the API version and tweak the docs. (See further
down for a detailed justification of the change.)
Even though it is sometimes fine to have redundant code
that is defensive in nature, some upcoming changes are gonna
move subscriber-related logic out of build_stream_dict_for_sub
for certain codepaths as part of our effort to streamline
the payload for subscribers within page_params.
So we can't rely on the code that I removed here
inside of build_stream_dict_for_sub.
Anyway, it makes more sense to do these checks explicitly
in the validate function.
The code in build_stream_dict_for_sub was almost effectively
a noop, since the validation function was already preventing
us from getting subscriber info. The only difference it
made was sometimes converting `[]` to `None`, and then
subsequently omitting the subscribers field.
Neither ZT nor the webapp make any distinction between
`[]` or <missing key> for the `subscribers` data in
`page_params`.
The webapp has had this code for a long time (and now
equivalent code elsewhere in this PR):
if (!Object.prototype.hasOwnProperty.call(sub, "subscribers")) {
sub.subscribers = new LazySet([]);
}
The webapp calculates access based on booleans, anyway:
sub.can_access_subscribers =
page_params.is_admin || sub.subscribed ||
(!page_params.is_guest && !sub.invite_only);
And ZT would choke if `subscribers` were missing, except that
it never gets to the relevant code due to other checks:
def get_other_subscribers_in_stream(<snip>):
assert stream_id is not None or stream_name is not None
if stream_id:
assert self.is_user_subscribed_to_stream(stream_id)
return [sub
for sub in self.stream_dict[stream_id]['subscribers']
if sub != self.user_id]
else:
return [sub
for _, stream in self.stream_dict.items()
for sub in stream['subscribers']
if stream['name'] == stream_name
if sub != self.user_id]
You could make a semantic argument that we should prefer
<missing key> to `[]` when subscribers aren't even available, but
we have precedent from the way that `bulk_get_subscriber_user_ids`
has traditionally populated its result:
result: Dict[int, List[int]] =
{stream["id"]: [] for stream in stream_dicts}
If we changed `stream_dicts` to `target_stream_dicts` we
would faciliate a move toward `None`, but it would just cause
headaches for other server code as well as the frontends
(which, to reiterate, already prefer the empty array
for convenience).
As my comment indicates, I would prefer to handle
this explicitly by raising JsonableError in an
else statement here, but it's not a big deal.
This function can probably be simplified with a
bit of work, mostly on the testing side to make
sure we are covering all edge cases, but that
is out of the scope of my current PR.
By moving the relevant logic from realm.get_bot_domain to
get_fake_email_domain we will make realm.host be used (if possible) for
dummy user addresses. That is, instead of user11@zulipchat.com, the
address will become user11@subdomain.zulipchat.com.
We often send only one field (away or status_text)
to be updated.
So we have to make our schema support optional
keys.
As a result of the more flexible schema, we no
longer need to exempt the node fixtures from
our schema checks.
Since recipient_id (id of the PERSONAL Recipient of the user) was
denormalized into the UserProfile model, this query can be simplified by
getting rid of the zerver_recipient JOIN.
This makes us more efficient when handling
multiple users. We don't have to keep
sending the same two queries to the database.
Note that as part of this we eliminated
a failure mode for the obscure population
of users from whom both `user.is_guest` and
`user.can_access_public_streams()` returns
False. We know this would have only affected
Zephyr users (by looking at the code), and
we know we don't actually process Zephyr
users for email digests (or else we would
have raised exceptions in the old code).
We mostly need realm_id, but when we go to build
message lists, we need realm.uri.
We could probably be more aggresive about using
`only` here, but for now I am just trying to
reduce hops to the database.
The `deployment` key was only set in `do_report_error`, which is now
only used in one codepath (the queue worker). The logging handlers on
staging call notify_server_error directly, which omits the
`deployment` key.
Remove the odd one-of key, and instead simply do dispatch in
`do_report_error`.
The codepath for moving a topic changes the message.recipient_id to the
id of the new recipient, but later, in update_messages_for_topic_edit,
it uses message.recipient when querying for messages with the matching
topic in the *old* stream (because those are the other messages that
need to be moved). This is a bug which happens to work fine, because in
Django 2, if message.recipient gets fetched first and then
message.recipient_id is mutated, message.recipient will not be altered
and thus will retain the outdated, previously fetched value.
In Django 3 changing .recipient_id causes .recipient to be updated to
the new Recipient objects, which is the Recipient of the *new* stream.
That will cause the bug to manifest.
This is a bugfix preparing for the upgrade to Django 3.
Support for saving it in the session is dropped in django3, the cookie
is the mechanism that needs to be used. The relevant i18n code doesn't
have access to the response objects and thus needs to delegate setting
the cookie to LocaleMiddleware.
Fixes the LocaleMiddleware point of #16030.
We now require explicit keywords for all arguments
to fetch_initial_state_data except user_profile.
We provide reasonable defaults to keep the test
code concise.
When changing the subdomain of a realm, create a deactivated realm with
the old subdomain of the realm, and set its deactivated_redirect to the
new subdomain.
Doing this will help us to do the following:
- When a user visits the old subdomain of a realm, we can tell the user
that the realm has been moved.
- During the registration process, we can assure that the old subdomain
of the realm is not used to create a new realm.
If the subdomain is changed multiple times, the deactivated_redirect
fields of all the deactivated realms are updated to point to the new
uri.
Instead of just storing the edit history in the message which
triggered the topic edit, we store the edit history in all
the messages that changed. This helps users track the edit history
of a message more reliably.
Allowing any admins to create arbitrary users is not ideal because it
can lead to abuse issues. We should require something stronger that
requires the server operator's approval and thus we add a new
can_create_users permission.
We change the return type of check_message to be dataclass instead of
Dict[str, Any]. This refactoring helps us to understand the context of the
data structure returned by check_message clearly which was not possible
when using Dict.
SendMessageRequest class is added in zerver/lib/message.py inspite of it
not being used in that file itself just to maintain consistency as other
TypedDicts and dataclasses are defined in that file and to avoid circular
dependency as SendMessageRequest is being used in lib/widget.py as well.
We also rename local variable to 'send_request' for accessing
SendMessageRequest objects.
We always want to do these at the same time. Previously, message
editing did too much stripping (fixes#16837) and failed to check for
NUL bytes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Previously we were just returning a dict containing a message id when
trying to mirror a already sent message in 'zephyr_mirror' cases.
This commit changes this behaviour to raise an exception when trying
to mirror an already sent message by adding a new exception class
ZephyrMessageAlreadySentException and then the caller returns the
message_id directly, instead of calling do_send_messages which also
returns a list of size one containing the message_id only.
This is a prep commit for changing the return type of check_message to
be a dataclass instead of a Dict as now we have only single output for
check_message.
This commit renames the content variable in do_widget_post_save_actions
to message_content and is a prep commit for changing the return type of
check_message from Dict to dataclass.
This change is required because content variable is used two times in
this function - one for message content and other for submessage
content, so when we change the return type of check_message to
dataclass, the type of content variable is considered as str and then
when dict is assigned to content in the submessage case, mypy raises
'Incompatible types in assignment' error.
This issue is not faced before the dataclass migration because there is
no type checking for the values of dict returned by check_message as the
return type of check_message is 'Dict[str, Any]'.
The message_dict['wildcard_mention_user_ids'] should be empty set instead
of empty list when there are no wildcard mentions similar to the case
when there are wildcard mentions, where it is equal to set of user ids and
not list of user ids.
I reformatted the tests and view to include information about who
acknowledged and closed the alert. Only includes the information about
the owner if there was an owner.
Made a few small changes to the refactored bit as requested in review.
Moved time formatting check and conversion to
zerver/lib/webhooks/common.py. Updated tests slightly to match new
output. Removed duration from the calculation because the difference
is less than the precision of output and it complicated the error
handling.
An HTML document sent without a charset in the Content-Type header
needs to be scanned for a charset in <meta> tags. We need to pass
bytes instead of str to Beautiful Soup to allow it to do this.
Fixes#16843.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We export a realm's data, and disable the realm, because the user
is moving from Zulip Cloud (e.g. https://example.zulipchat.com/) to
self-hosting or another platform (e.g. https://zulip.example.com/)
which we do not control. This commit adds a field in the realm object
called deactivated_redirect to store the url to which the realm has
moved.
This simplifies the code, as it allows using the mechanism of converting
JsonableErrors into a response instead of having separate, but
ultimately similar, logic in RateLimitMiddleware.
We don't touch tests here because "rate limited" error responses are
already verified in test_external.py.
In 1bcb8d8ee8 I made
it so the webapp doesn't include "streams" in its
state from `fetch_initial_state_data`, but I didn't
address all the places in apply_event.
For 3000 messages and 400 users, this saved
about 30 seconds.
We only do two queries per batch of messages
now, and the algorithm is easier to analyze,
as it's just three nested loops.
Note that we are much more efficient about finding
active users here:
- we do one query per realm (instead of per-user)
- we pass the cutoff date to the database
- we get back just a list of distinct ids