Commit 4a3ad0d introduced some extra stream-level parameters
to the `realm` object. This commit extends that to add a
max_message_length paramter too in the same server_level.
Previously, you had to request the `stream` event type in order to get
the stream-level parameters; this was a bad design in part because the
`subscription` event type has similar data and is preferred by most
clients.
So we move these to the `realm` object. We also add the maximum topic
length, as an adjacent parameter.
While changing this, we also fix these to better match the names of
similar API parameters.
* Don't require strings to be unnecessarily JSON-encoded.
* Use check_capped_string rather than custom code for length checks.
* Update frontend to pass the right parameters.
With a much simplified populate_data_for_request design suggested by
Anders; we only support a handful of data types, all of which are
correctly encoded automatically by jQuery.
Fixes part of #18035.
Previously, when unmuting a user, we used to make
two database fetches - one to verify that the user
is has been muted before, and one while actually
unmuting the user.
This reduces that to one, by passing around the
`MutedUser` object fetched in the first round.
Since the new function returns `Optional[MutedUser]`,
we need to use a hack for events tests, because
mypy does not yet use the type inferred from
`assert foo is not None` in nested functions like lambdas.
See python/mypy@8780d45507.
Instead of using internal functions for data setup,
we use the API so that these tests are more
end-to-end.
This commit also removes a now unnecessary
`if date_muted is None` check.
This cleans up some code added in 3bfcaa3968.
Also fixes some indentation to be more readable:
- `mock.patch` is in a single line.
- Dictionaries are one field per line.
This was used by the old native Zulip Android app
(zulip/zulip-android). That app has been undeveloped for enough years
that we believe it no longer functions; as a result, there's no reason
to keep a prototype API endpoint for it (that we believe never worked).
This endpoint was needed by the ancient pre-electron desktop app
written in QT; we removed support for that in practice a long time
ago, and even the custom error messages for it in
5a22e73cc6.
So we can delete this endpoint as well.
We keep the error message same for all cases when a user is not
allowed to subscribe others for all values of invite_to_stream_policy.
We raise error with different message for guest cases because it
is handled by decorators. We aim to change this behavior in future.
Explaining the details in error message isn't much important as
we do not show errors probably in API only, as we do not the show
the options itself in the frontend.
We keep the error message same for all cases when a user is not
allowed to create streams for all values of create_stream_policy.
We raise error with different message for guest cases because it
is handled by decorators. We aim to change this behavior in future.
Explaining the details in error message isn't much important as
we do not show errors probably in API only, as we do not the show
the options itself in the frontend.
We keep the error message same for all cases when a user is not
allowed to invite others for all values of invite_to_realm_policy.
We raise error with different message for guest cases because it
is handled by decorators. We aim to change this behavior in future.
Explaining the details in error message isn't much important as
we do not show errors probably in API only, as we do not the show
the options itself in the frontend.
This makes it much more clear that this feature does JSON encoding,
which previously was only indicated in the documentation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This fixes the API documentation tests following recent changes, but
isn't the right solution -- we probably want to change the API itself
to not require this strange JSON-encoding-of-a-string.
But this is necessary to have CI pass.
The comments explain in some detail, but basically we were displaying
the types for booleans incorrectly, and the types for strings in a
somewhat confusing fashion. Fix this with comments explaining the logic.
Using JSON dumping also results in our showing strings inside
quotation marks in our examples, which seems net helpful.
Thanks to ArunSankarKs for finding where we needed to change the
codebase.
Fixes#18021.
The moderator role was not included in the tests for create_stream_policy
and invite_to_stream_policy. The tests are do_set_realm_property_test
in test_events.py and do_test_realm_update_api in test_realm.py.
This should have been added for create_stream_policy in 5b32dcd and
in 5b32dcd for invite_to_stream_policy, but was missed by mistake.
This commit adds backend code for passing can_invite_others_to_realm
field to clients using the fetch_initial_state_data in the page_params
object.
Though this field is not used by webapp as of now, but will be used
to fix a bug of incorreclty showing the invite users option in
settings overlay in the next commit.
We add moderators and full members option to invite_to_realm_policy
by using COMMON_POLICY_TYPES and use can_invite_others_to_realm helper
added in previous commit. This commit only does the backend work,
frontend work will be done in separate commit.
This commit adds can_invite_others_to_realm helper which will be used in
further in next commit when invite_to_realm_policy will be modified to
support all values of COMMON_POLICY_TYPES.
It is important for this commit's correctness that
INVITE_TO_REALM_POLICY_TYPES was initialized to use the same values.
This commit replaces invite_by_admins_policy, which was a bool field,
with a new enum field invite_by_realm_policy.
Though the final goal is to add moderators and full members option
using COMMON_POLICY_TYPES, but this will be done in a separate
commit to make this easy for review.
This commit implements a subtle optimization (described in more detail
in the comment) that can save a few hundred milliseconds in when the
sender sees that their message has sent when sending to very large
streams.
Fixes#17898.
The tests for can_create_streams and can_subscribe_other_users shares a
lot of code and we deduplicate the code by extracting most of the code
as check_has_permission_policies which will now be called by the two
tests test_can_create_streams and test_can_subscribe_other_users.
This will also help in avoiding the duplication of code when we will
convert more policies to use COMMON_POLICY_TYPES.
We send the whole data set as a part of the event rather than
doing an add/remove operation for couple of reasons:
* This would make the client logic simpler.
* The playground data is small enough for us to not worry
about performance.
Tweaked both `fetch_initial_state_data` and `apply_events` to
handle the new playground event.
Tests added to validate the event matches the expected schema.
Documented realm_playgrounds sections inside /events and
/register to support our openapi validation system in test_events.
Tweaked other tests like test_event_system.py and test_home.py
to account for the new event being generated.
Lastly, documented the changes to the API endpoints in
api/changelog.md and bumped API_FEATURE_LEVEL.
Tweaked by tabbott to add an `id` field in RealmPlayground objects
sent to clients, which is essential to sending the API request to
remove one.
Similar to the previous commit, we have added a `do_*` function
which does the deletion from the DB. The next commit handles sending
the events when both adding and deleting a playground entry.
Added the openAPI format data to zulip.yaml for DELETE
/realm/playgrounds/{playground_id}. Also added python and curl
examples to remove-playground.md.
Tests added.
This endpoint will allow clients to create a playground entry
containing the name, pygments language and url_prefix for the
playground of their choice.
Introduced the `do_*` function in-charge of creating the entry in
the model. Handling the process of sending events which will be
done in a follow up commit.
Added the openAPI format data to zulip.yaml for POST
/realm/playgrounds. Also added python and curl examples for using
the endpoint in its markdown documented (add-playground.md).
Tests added.
Tweaked exports.py to add the config object there so that our export
tool can include the table when exporting. Also includes all the
changes required to import the new table from the exported data.
Helper function `get_realm_playgrounds` added to fetch all
playgrounds in a realm.
Tests amended.
Realm administrators already get creation and deletion events for all
streams, including private streams. So these should be reflected in
the initial state data.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Adds backend code for the mute users feature.
This is just infrastructure work (database
interactions, helpers, tests, events, API docs
etc) and does not involve any behavioral/semantic
aspects of muted users.
Adds POST and DELETE endpoints, to keep the
URL scheme mostly consistent in terms of `users/me`.
TODOs:
1. Add tests for exporting `zulip_muteduser` database table.
2. Add dedicated methods to python-zulip-api to be used
in place of the current `client.call_endpoint` implementation.
It does not seem like an official version supporting Webpack 4 (to say
nothing of 5) will be released any time soon, and we can reimplement
it in very little code.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
`call_on_each_event` now supports additional params other than
`event_type` and `narrow`; Ex: `all_public_streams` to fetch events
of all public streams.
Also add a bit of explanation of how this parameter works.
Fixeszulip/python-zulip-api#647
This is a prep change to eventually completely
replace the term "filter" with "linkifier" in
the codebase.
This only renames files. Code changes will be
done in further commits.
This renames the test file for muting to have
the term `topic` in it, along with an ambiguously
named helper.
This is a prep change for implementing the mute
users feature.
We use GIPHY web SDK to create popover containing GIFs in a
grid format. Simply clicking on the GIFs will insert the GIF in the compose
box.
We add GIPHY logo to compose box action icons which opens the GIPHY
picker popover containing GIFs with "Powered by GIPHY"
attribution.
Previously, if a user subscribed to a stream with
history_public_to_subscribers, and then was looking at old messages in
the stream, they would not get live-updates for that stream, because
of the structure in how notify_reaction_update only looked at
UserMessage rows (we had a previous workaround involving the
`historical` field in `UserMessage` which had already made it work if
the user themselves added the reaction).
We fix this by including all subscribers with history access in the
set of recipients for update events.
Fixes a bug that was confused with #16942.
Amazon SES has a limit on the size of address fields, and rejects
emails with too-long "From" combinations of name and address. This
limit is set to 320 bytes and comes from an RFC limitation on the
size of addresses. This RFC standard states that an email address
should not be composed of a local part (before the '@') longer than
64 bytes and a domain part (after the '@') longer than 255 bytes.
It is possible that Amazon SES misinterprets this limitation as it
checks the length of the combination of the name and the email
address of the sender.
To ensure that this problem is not encountered in the send_email
module of Zulip the length of this combination is now checked
against this limit and the from_name field is removed to only
keep the from_address field when it is necessary in order to
stay below 320 bytes.
If the from_address field alone is longer than 320 bytes the
sending process will raise an SMTPDataError exception.
Tests for this new check are added to the backend test suite in
order to test if build_email correctly outputs an email with filled
from_name and from_address fields when the total length is lower
than 320 bytes and that it correctly throws the from_name field
away when necessary.
Fixes: #17558.
We're renaming "stream deletion" language to "stream archiving"
and these pages were moved in the process, so we should keep redirects
for them for a while.
This reverts commit 9c6d8d9d81 (#16916).
This feature has known bugs, and also wants some design changes to
make it customizable like linkifiers, so we’re retargeting this to
post-4.x.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The Session middleware only adds `Vary: cookie` if it sees an access
to the from inside of it. Because we are effectively, from the Django
session middleware's point of view, returning the static content of
`request.saved_response` and never accessing the session, it does not
set `Vary: cookie` on longpoll requests.
Explicitly mark Tornado requests as varying by cookie.
Adding an additional `!` to the stream name each time a stream is
deactivated, to a maximum of 21 times, effectively limits number of
times a stream with a given name can be deactivated. This is unlikely
to come up in common usage, but may be confusing when testing.
Change what we prepend to deactivated stream names to something with
more entropy than just `!`, by instead prepending a substring of hash
of the stream's ID. `!`s. Using 128 bits of the hash means that it
will require more than 10^18th renames to have a 1% chance of collision.
Because too-long stream names are also truncated at 60 characters,
having this entropy in the beginning of the name also helps address
potential issues from stream names that differed only in, e.g. the
60th character.
Fixes#17016.
Instead of validating `op` value later, this commit does that
in `REQ`.
Also helps avoiding duplication of this validation when
stream typing notifications feature is added.
These flags were put in place in the first commit that introduced
Tornado (9afd63692f) with unclear
utility.
Remove them, since they have never been documented, and do not have a
clear need.
The `X-Forwarded-For` header is a list of proxies' IP addresses; each
proxy appends the remote address of the host it received its request
from to the list, as it passes the request down. A naïve parsing, as
SetRemoteAddrFromForwardedFor did, would thus interpret the first
address in the list as the client's IP.
However, clients can pass in arbitrary `X-Forwarded-For` headers,
which would allow them to spoof their IP address. `nginx`'s behavior
is to treat the addresses as untrusted unless they match an allowlist
of known proxies. By setting `real_ip_recursive on`, it also allows
this behavior to be applied repeatedly, moving from right to left down
the `X-Forwarded-For` list, stopping at the right-most that is
untrusted.
Rather than re-implement this logic in Django, pass the first
untrusted value that `nginx` computer down into Django via `X-Real-Ip`
header. This allows consistent IP addresses in logs between `nginx`
and Django.
Proxied calls into Tornado (which don't use UWSGI) already passed this
header, as Tornado logging respects it.
The `widget_content` key is expected to contain a string which parses
as JSON; in the event that it does not, log the error and notify the
bot owner, instead of failing silently.
Fixes#16850.
In `validate_account_and_subdomain` we check
if user's realm is not deactivated. In case
of failure of this check, we raise our standard
JsonableError. While this works well in most
cases but it creates difficulties in handling
of users with deactivated realms for non-browser
clients.
So we register a new REALM_DEACTIVATED error
code so that clients can distinguish if error
is because of deactivated account. Following
these changes `validate_account_and_subdomain`
raises RealmDeactivatedError if user's realm
is deactivated.
This error is also documented in
`/api/rest-error-handling`.
Testing: I have mostly relied on automated
backend tests to test this.
Fixes#17763.
In validate_account_and_subdomain we check if
user's account is not deactivated. In case of
failure of this check we raise our standard
JsonableError. While this works well in most
cases but it creates difficulties in handling
of deactivated accounts for non-browser clients.
So we register a new USER_DEACTIVATED error
code so that clients can distinguish if error
is because of deactivated account. Following
these changes `validate_account_and_subdomain`
raises UserDeactivatedError if user's account
is deactivated.
This error is also documented in
`/api/rest-error-handling`.
Testing: I have mostly relied on automated
backend tests to test this.
Partially addresses issue #17763.
We add a TUTORIAL_ENABLED setting for self-hosters who want to
disable the tutorial entirely on their system. For this, the
default value (True) is placed in default_settings.py, which
can be overwritten by adding an entry in /etc/zulip/settings.py.
Updated database query to filter out deactivated streams from the
return of the get_topic_mutes method. Added optional
include_deactivated parameter to the method to make the behavior
default but overrideable. Added test case in test_muting for these
changes. Fixes blueslip warnings thrown by muting.js set_muted_topics
when passed deactivated streams via page_params.
With the previous two commits deployed, we're ready to use the
denormalization to optimize the query.
With dev environment db prepared using
./manage.py populate_db --extra-users=2000 --extra-streams=400
this takes the execution time of the query in
bulk_get_subscriber_user_ids from 1.5-1.6s to 0.4-0.5s on my machine.
The new comment explains the issue in some detail, but basically if we
deactivate the bots first, then an error partway through is corrected
by a retry; if we deactivate the user first, then we may leak
undeactivated bots if a failure occurs.
This adds the is_user_active with the appropriate code for setting the
value correctly in the future. In the following commit a migration to
backfill the value for existing Subscriptions will be added.
To ensure correct user_profile.is_active handling also in tests, we
replace all direct .is_active mutation with calls to appropriate
functions.
These procedures should be done atomically overall, with the exception
of the code that sends events to avoid block if there's a delay
communicating with Tornado.
We add the savepoint=False on underlying function that already
executes inside an atomic context - to avoid the overhead of creating
savepoints where they aren't needed.
This commit adds a new option of STREAM_POST_POLICY_MODERATORS
in stream_post_policy which will allow only realm admins and
moderators to post in that stream.
We extract a helper which checks whether to allow the sender to send the
message to a stream according to the stream_post_policy. The purpose
of extracting it out is to avoid additional code for checking the access
for bot owners in case of bot sending the messages and instead calling
the handler two times - one time for sender and one time for bot owner if
sender is a bot.
The moderators-only option was actually added in the previous
commit for create_stream_policy as we use the same function
'has_permission' for both the policies. But we add the error
handling code and tests for moderators-only option in this
commit.
This commit modifies the has_permission function to include
realm moderator role. Thus this adds a new option of moderators
only for create_stream_policy.
Though this automatically adds this option for invite_to_stream_policy
also, but we will keep other code for showing error and for tests
in a separate commit.
This commit adds an assert statement in the last block of
has_permission which checks whether the policy_value is
POLICY_FULL_MEMBERS_ONLY. This assert statement is added
for readability.
The current API documentation uses call_endpoint to upload a file;
since we've added a custom helper in python-zulip-api, we should
document that cleaner approach.
The session object provides a common place to set headers on all
requests, no matter which implementation.
Because the `headers` attribute of Session is not a true static
attribute, but rather exposed via overriding `__getstate__`, `mock`'s
autospec cannot know about it, and thus throws an error; in tests that
mock the Session, we thus must explicitly set the `session.headers`.
The existing organization, of returning an opaque blob from
`build_bot_request`, which was later consumed by
`send_data_to_server`, is not particularly sensible; the steps become
oddly split between the OutgoingWebhookWorker, `do_rest_call`, and the
`OutgoingWebhookServiceInterface`.
Make the `OutgoingWebhookServiceInterface` in charge of building,
making, and returning the request in one method; another method
handles extracting content from a successful response. `do_rest_call`
is responsible for calling both halves of this, and doing common error
handling.
The comments in stream-policy tests in test_message_send.py specifies
the restriction of creating streams based on stream_post_policy. But
this restriction was removed in 9aaa61963 and we now allow everyone to
create all type of streams. So this commit fixes the stream creation
parts in comments.
The STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS option of stream_post_policy
was explained as "Only new members can post" in the api docs. It should
instead be "Only full members can post" and this commit fixes it.
The /events and /register endpoints were excluded from schema validations,
because they were earlier not completely documented. However, they can
now be added for proper checking. Removed them from excluded endpoints list
and fixed the documentation for /register and /events after the checking.
Fixes#17796.
Backend logic for handling user mention was cluttered
because it was handled at two stages first in
get_possible_mentions_info while fetching mention data
based on the messsage and then later in UserMentionPattern
which handles processing of text for mention.
Ideally UserMentionPattern should depend on
get_possible_mentions_info only for data but there was a
shared logic between these two that made it hard to debug
any possible bugs.
Updates in this commit make both of these functions
coherent in terms of logic and also add appropiate
comments to improve readability of these functions.
There was also a hidden bug that if a user A is
mentioned in with @**name|id** then @**invalid|id**
again mentioned A because of the way we handled mentions
earlier. It is solved as a result of this refactor and
appropiate test has been added for this.
This has been tested manually as well as by adding new
test to address missing case.
I noticed this because the test_events.py tests had the extremely
weird pattern of calling the actual change function, and then testing
the `notify` function's state changes (which should always be noops),
rather than actually testing the state change function.
Fixing the test made it clear that the actual logic in events.py
simply did not handle deleting custom_profile_field_value elements
from user objects when a custom_profile_field object was deleted.
So we fix that bit of logic as well.
It appears this bug was unique -- at least we don't have any other
notify_* functions being used directly in test_events.py, and the
handful of state_change_expected=False entries are all events for data
not present in page_params.
* `op` (operation) field, added in f6fb88549f, was never intended for
`custom_profile_fields` event. This commit removes the `op` as it doesn't
have any use in the code.
* As a part of cleanup, this also eliminates the schema check warnings
for `custom_profile_fields` event, mentioned in #17568.
In 1a12e112d9, this page was converted
to use portico styling, but we intentionally left this page not using
the portico_content class since we didn't want the header/footer.
We still don't want the header/footer clutter, so instead, we achieve
that same goal using the isolated_page flag.
When running some tests multiple times in the same call,
were failing because of the data duplication.
This commit resolves that issue by resetting the test
environment (i.e: Re-cloning test database and clearing
cache) after each run.
Fixes#17607.
Since the list of streams returned by a query which is not sorted
can vary, the tests which use it become flaky.
NormalActionsTest.test_default_stream_groups_events became
flaky due to this and hopefully sorting the streams should
fix it.
I have the updated the documentation page for the hello world
integration to include numbers to bring it up to standard and make it
more readable.
Fixes part of #17633.
This commit migrates some of the backend tests in test_import_export
to use assertLogs(), instead of mock.patch() as planned in #15331.
Logs for tests in this file are suppressed and are not asserted as
that made changes to import/export codebase more fragile. As we
already have checks for the actual functionalities, it made less
sense to assert those logs.
Currently, there was no markdown page for deactivate-own-user API
endpoint. Created deactivate-own-user.md for the API page and
created a new owner client in test-api to reactivate the client
deactivated during testing.
Also changed endpoint name from deactivate-my-account to
deactivate-own-user, for better consistency with other endpoints.
Fixes#16163.
This is no longer used in any important place,
get_user_profile_by_email is meant to be used only in manage.py shell
now and thus there's no point in this function being cached.
Emails are not unique, so we can only sensibly cache using keys formed
with both email and realm.
This requires adding a new cache key function for caching by delivery
email - user_profile_delivery_email_cache_key.
Extend our markdown system to support mentioning of users
by id also. Following these changes, it would be possible
to mention users with @**|user_id** and silently mention
using @_**|user_id**.
Main intention for extending the mention syntax is to make
it convenient for bots to mention a users using their ids. It
is to be noted that previous syntax are also supported.
Documentation tweaked by tabbott for better readability.
The changes were tested manually in development server, and also
by adding some new backend and frontend tests.
Fixes: #17487.
We add support to shorten links and test their shortening in
well-organized, clean manner that makes it trivial to extend the
GitHub approach for GitLab and perhaps other services.
We only shorten basic types of GitHub links (issue, PR, commit) that
fit a set of simple common patterns; the default behaviour of Autolink
is kept for everything else.
Logic added in frontend and backend Markdown Processor is identical.
This makes easy to extend the logic for other services like GitLab.
Fixes#11895.
The schema for unread_msgs was missing additionalProperties: false
that was causing tests to pass even with undocumented parameters
which were validated as an additional property. Set
additionalProperties to false and added documentation for missing
count variable.
Fixes#17728.
This commit changes the list_to_streams function to raise error
according to create_stream_policy value when a user cannot create
streams instead of same error for all cases.
This commit modifies test_user_settings_for_subscribing_other_users
to check all the possible cases including the cases when a user
can successfully subscribe other users along with the already
tested failure cases. This commit also adds checks for guest users
which was not present before.
This commit replaces the code which directly changes user.role,
realm.create_stream_policy and realm.waiting_period_threshold
with do_change_user_role and do_set_realm_property functions
in test_can_create_streams. This makes the code similar to the
other tests.
We refactor test_can_create_streams and test_can_subscribe_other_users
in test_subs.py. We want to follow a specific order in such tests
which is just set the policy value one by one and then checking
that the role in policy returns true and role just below that returns
false. This approach is explained in detail below.
Following hierarchy of roles is considered for these tests -
1. Realm admin
2. Full members
3. Members
4. Guests.
Then if the policy is set to admins only, we check that the having
role as admin returns true and the role just below that, i.e. full
member returns false. Similarly, if the policy is set to members
only, we check that a member should return true and role below it
which is guest should return false. We basically follow these as
we can assume that if a user with particular role cannot do the
required task, then user with role below in the hierarchy would
be not allowed to do the task too.
This commit refactors the above mentioned two tests to have above
explained workflow.
This commit removes the unnecessary do_change_user_role function
in test_can_subcribe_other_users. This was added in 1aebf3cab
which replaced the multiple functions like do_change_is_admin
and do_change_is_guest with do_change_user_role.
Previously two functions do_change_is_admin and do_change_is_guest
were used because there were two flags is_realm_admin and is_guest
which were used to determine the role of a user. But then we added
a single field role to UserProfile and removed the multiple flags
and thus also replaced the different functions with a single
do_change_user_role. With addition of a new field role, two
different do_change_* functions were not needed as we only have
a role field instead of different flags, but this was missed in
1aebf3cab and this commit fixes it.
This add the schema checker, openapi schema, and also a test for
realm/deactivated event.
With several block comments by tabbott explaining the logic behind our
behavior here.
Part of #17568.
We discovered recently that some ops for events were just not
implemented in events.py (specifically, realm/deactivated).
Since our goal is for events.py to be complete, we add this bit of
hardening to ensure that it stays that way.
Modifies `StreamPattern` and `StreamTopicPattern` to inherit
from InlineProcessor instead of Pattern. This change is done
because Pattern stopped checking for matching patterns as soon
as it found a match which was not a valid stream. Due to this
all the subsequent mention failed, even if they were valid.
This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.
Due to above changes verbose_compile is no longer used for
precompiling STREAM_LINK_REGEX, STREAM_TOPIC_LINK_REGEX as
adds ^(.*?) and (.*?)$ which cause extra overhead of matching
pattern which is not required. With new InlineProcessor these
extra patterns at beggining and end are not required.
So, StreamPattern and StreamTopicPattern now define their own
__init__ method for precompiling the regex.
Fixes#17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Modifies `UserGroupMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user group. Due to this all
the subsequent user group mention failed, even if they were
valid. This bug was only present in backend renderring due to
markdown.inlinepatterns.Pattern.
This was reported as issue #17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
Modifies `UserMentionPattern` to inherit from InlineProcessor
instead of Pattern. This change is done because Pattern
stopped checking for matching patterns as soon as it found
a match which was not a valid user. Due to this all the
subsequent user mention failed. This bug was only present in
backend renderring due to markdown.inlinepatterns.Pattern.
This was reported as issue #17535.
These changes were tested locally in dev server and by adding
some new markdown tests to test these.
This removes the `add` from op list of stream event, as we do not
actually generate the stream/add event in the API, and when a stream
is created we identify it using the `create` operation.
(This was likely just a mistake introduced as a result of the fact
that `create` does not fit the normal naming scheme; probably
long-term we should actually migrate this to "add", but more important
for now is to document what's accurate).
Part of #17568.
This is preparatory work for investigating reports of missing unread
messages.
It's a little surprising that not test failed after adding the code
without API documentation.
Co-Author-By: Tushar Upadhyay (tushar912).
This is a prep commit which modifies the
`send_message_moved_breadcrumbs` function to take
message strings as input.
This is done to reuse the function in other places
like the /digress command.
Structurally, exception, failure_message, and status_code are mutually
exclusive in how this function is called, and it's best for the
function's flow to represent that.
The message from the bot which triggered the 407 error message notifies
the bot owner about the exceptions as well in the error message. This
commit handles it more gracefully and shows a generic message.
The messages from the bot which were triggered by the outgoing_webhooks
didn't have the bot name in them. This commit adds the bot name to it
and makes the corresponding changes in the tests.
This adds an option for restricting a ldap user
to only be allowed to login into certain realms.
This is done by configuring an attribute mapping of "org_membership"
to an ldap attribute that will contain the list of subdomains the ldap
user is allowed to access. This is analogous to how it's done in SAML.
Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
On replying to an email notifcation from a stream where the user
does not come under the stream_post_policy will subsequently result
in a failure. In such a case, the user does not receive feedback
regarding the failure.
Notify the user via notification bot if their email
message failed to send.
Fixes#16642.
If the client has an old version of the code which is not present on
the server, don't throw a 500; instead, default to the same `unable to
look up in source map` message is used when the line numbers don't
line up.
I have updated the documentation for the Zabbix integration to give the
correct instructions for the latest version of Zabbix (5.2). The old
instructions are now obsolete.
I have also updated the message that is PMd to a user if the webhook
doesn't receive a complete payload to also align with the new
instructions.
Using get_user_profile_by_email is invalid, as it omits the realm, and
also fetches via .delivery_email - our convention is that .email is
supposed to be used for user-facing purposes like this.
self.example_user("hamlet") uses get_user_by_delivery_email, so it
doesn't actually cache anything. This should use a cached function, like
the test below: test_do_change_realm_subdomain_clears_user_realm_cache.
This is part of our general process of replacing emails, which are not
static with time, with user_ids when referring to users in the API.
We still keep the `email` reference option, since it can be useful for
linking third-party applications to Zulip on an intranet that might
have a user's corporate email handy and not want to do the extra round
trip to lookup the user.
The name of the parameter, user_id_or_email, was chosen to to make it
clear that the default/preferred option is user_id.
Fixes#14304.
TextField is used to allow users to set long stream + topic narrow
names in the urls.
We currently restrict users to only set "all_messages" and
"recent_topics" as narrows.
This commit achieves 3 things:
* Removes recent topics as the default view which loads when
hash is empty.
* Loads default_view when hash is empty.
* Loads default_view on pressing escape key when it is unhandled by
other present UI elements.
NOTE: After this commit loading zulip with an empty hash will
automatically set hash to default_view. Ideally, we'd just display
the default view without a hash, but that involves extra complexity.
One exception is when user is trying to load an overlay directly,
i.e. zulip is loaded with an overlay hash. In this case,
we render recent topics is background irrespective of default_view.
We consider this last detail to be a bug not important enough to block
adding this setting.
The query string parameter authentication method is now deprecated for
newly created Slack applications since the 24th of February[1]. This
causes Slack imports to fail, claiming that the token has none of the
required scopes.
Two methods can be used to solve this problem: either include the
authentication token in the header of an HTTP GET request, or include
it in the body of an HTTP POST request. The former is preferred, as
the code was already written to use HTTP GET requests.
Change the way the parameters are passed to the "requests.get" method
calls, to pass the token via the `Authorization` header.
[1] https://api.slack.com/changelog/2020-11-no-more-tokens-in-querystrings-for-newly-created-appsFixes: #17408.
Minimized code duplication by integrating POSTRequestMock into
HostRequestMock and then updating the required files with
HostRequestMock.
Fixes part of #1211.
A deprecated import shouldn’t be used even in a migration, since the
migration will need to remain runnable in the future. We never needed
a migration for this switch anyway; we just needed to edit the old
migration, since no actual state changes are involved.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit updates the stream creation, subscribing others to
stream, wildcard mention settings and stream post policy to allow
realm moderators even if they are new and the respective setting
is set to allow full members only.
This commit renames the is_new_member property in models.py
to is_provisional_member which will return true for any user
who is not a full member. We will add a condition in further
commit such that this returns 'False' for a moderator as we
will initially give all the rights to moderator that a full
member has.
This was inaccurate after testing the implementation, and there's an
argument that we shouldn't move it as it will simplify migrating to a
world where (some) private message threads can have topics.
These are all referring to email_gateway_bot, when they're supposed to
refer to the notification and welcome bots, respectively. The values are
the same though, so the tests were passing anyway.
Its likely that we would implement new hotspots that aren't
a part of the tutorial hotspots, in the future. For instance,
a hotspot to advertise new features. Hence, grouping them into
categories like INTRO_HOTSPOTS would be a good start. We also
have an aggregate of all types of hotspots we may add in the
future, under ALL_HOTSPOTS.
Fixes#17238.
In process_new_human user, the queries were wrong, revoking all invites
sent to the email address, even in other realms than the one where the
new account just got created.
The description of request parameter of update-subscription-settings was
wrongly pasted in yaml and wasn't completely removed from the md file.
Made appropriate fixes in yaml and md file.
test_signup: This test was wrong, because the inviter UserProfile was
from a different realm. Such a PreregistrationUser shouldn't be
considered valid.
test_tutorial: The direct call to internal_send_private_message was
using sender's realm as the realm argument which is not valid. It
doesn't lead to any error because the codepath seems to mostly not care
about the realm arg if the sender is a cross-realm bot. From my reading
of the code I think that wrong realm arg here would break user mentions,
because it makes its way to check_message() and then to
build_message_send_dict - but overall the message gets sent without
errors. Either way, this was a bug in the test and should be fixed.
Currently, the ID and Type fields didn't have a description,
and weren't being displayed. Added a schema component to add
descriptions, and display on the api page. Fixes part of #15967.
This commmit includes ROLE_MODERATOR in realm_user_count_by_role.
We also update test_change_role in test_audit_log.py to include
changes for moderator role as well.
Note that at this point, it's not possible to create moderator users;
this just will make it easier to write tests for logic involving them
as we develop the feature.
Have not included "ROLE_MODERATOR" in UserProfile.ROLE_TYPES
in this commit because did not want to update the openapi
docs at this stage as it will be a user-facing change and
not updating the openapi docs with moderator role included in
UserProfile.ROLE_TYPES gives error in ./tools/check-schemas.
The `message_id` was made an `str` object because
the request expected `Dict[str, str]`. The request is now
casted to `Dict[str, Any]` to fix the issue and removed
typecast of `message_id` to str.
python-zulip-api reference:
https://github.com/zulip/python-zulip-api/pull/653
Currently there are only tests for verifying the error case and there
are no tests to check the case where messages are sent successfully
in 'STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS' stream.
This commit adds tests for checking that full members and bots owned
by them can send message successfully in streams with post policy as
'STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS'.
We currently not allow new bots to send message in stream with post
policy as 'STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS', but we should
allow them to send messages if their owner is a full member.
This will make it consistent with behavior in stream with post
policy as 'STREAM_POST_POLICY_ADMINS_ONLY' where we allow non admin
bots with owner as admin to send messages.
According to tests we should not allow bot without owners to
post in streams with STREAM_POST_POLICY_RESTRICT_NEW_MEMBERS.
But the code does not handle this and the related test passes
and raises error for case of bots without owner because the bot
is itself a new member.
This commit fixes this by adding a condition to check if there
is no bot owner and then raise error if there is no owner.
This is a minor refactor which renames the
notify_topic_moved_streams function to
send_message_moved_breadcrumbs.
This is done because this function will be also used
for other things in the future, when moving streams
or when using the /digress command, for example.
Added assertion to check that if a deprecated flag is in a field's
schema, then it should have deprecated mentioned in description
as well, and moved these checks to a separate function.
Fixes part of #15967.
Add new rest api endpoint GET users/{email} for looking up a user by
email, which is useful especially for corporate API applications that
might already have a user's email address.
Fixes#14302.
It looks like this ritual was born when a type comment wasn’t working
because it was mistyped without the colon.
Signed-off-by: Anders Kaseorg <anders@zulip.com>'
A few internal fields used for tracking which types of notifications
have already been sent for a given message, like `hander_id` and the
`push_notified` bundle of fields were being incorrectly included in
message events delivered to clients clients.
One could argue these fields might be useful hints to clients, but
because notifications can be triggered later on via
`missedmessage_hook`, they have no useful purpose in the API.
This commit move these extended event field on a `internal_data`
object within the event object, and delete this field in `contents()`
for call points that would serve data to clients.
Tweaked by tabbott to provide a cleaner interface.
We're not bumping API_FEATURE_LEVEL because these fields have always
been documented as being present only due to a bug, so no clients
should be expecting or relying on them.
Fixes: #15947.
user_profile.id was confused for user_profile.recipient_id. These bugs
are particularly sneaky as they can go undetected by tests due to ids of
objects accidentally coinciding. We add a mitigation for this class of
mistakes by shifting the Recipient.id sequence in test db.
This was introduced in dda3ff41e1.
On the rare occasion where user_profile.id would coincide with
recipient_id passed to the function, we would return the wrong value.
That is, instead of correctly returning recipient_id, we would return
sender.recipient_id - recipient id of the sender of the message, thus
possibly returning user_profile.recipient_id (if user_profile is the
sender) - exactly the situation the function wanted to avoid
with the `if recipient_id == my_recipient_id:` if. Ultimately resulting
in incorrect/malformed data in
state['raw_recent_private_conversations'].
nlargest is the natural fit for selecting n biggest items
from an unsorted list. It's more readable as well as more
efficent (even though we don't care much about the efficeny
in this particular case).
The current logic doesn't display data types when the additionalProperties
variables are not object, but are array of strings, etc. Changed the if
condition to allow rendering in such cases.
zerver/lib/users.py has a function named access_user_by_id, which is
used in /users views to fetch a user by it's id. Along with fetching
the user this function also does important validations regarding
checking of required permissions for fetching the target user.
In an attempt to solve the above problem this commit introduces
following changes:
1. Make all the parameters except user_profile, target_user_id
to be keyword only.
2. Use for_admin parameter instead of read_only.
3. Adds a documentary note to the function describing the reason for
changes along with recommended way to call this function in future.
4. Changes in views and tests to call this function in this changed
format.
Changes were tested using ./tools/test-backend.
Fixes#17111.
Previously, the data type of responses wasn't displayed in the API
Documentation, even though that OpenAPI data is carefully validated
against the implementation. Here we add a recursive function to
render the data types visibly in API Documentation.
Fixes part of #15967.
The responses for the API weren't being rendered from yaml, and were
incorrectly formatted in yaml. The parameters also weren't completely
included in yaml and needed to be moved. Made appropriate fixes in
yaml and markdown file.
Commit 434094e599 (#11321) changed this
from an Extension to a subclass of Markdown, so it no longer has any
reason to use a config dict structured like that of an Extension.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit migrates some of the backend tests to use assertLogs(),
instead of mock.patch() as planned in #15331.
Tweaked by tabbott to avoid tautological assertions.
There were some tests that had mock patches for logging, although no
logging was actually happening there. This commit removes such patches
in `corporate/tests/test_stripe.py`, `zerver/tests/test_cache.py`,
`zerver/tests/test_queue_worker.py`,
and `zerver/tests/test_signup.py`.
EmailLogBackend used to create a new EmailMessage and copy
only certain values from the original EmailMultiAlternatives
object. This resulted in the loss of information and made
it harder to test PRs like
https://github.com/zulip/zulip/pull/17121.
So instead of creating a new EmailMessage, tweak and send the existing
EmailMultiAlternatives object.
Depending on PostgreSQL’s query plan, it was possible for the value
condition to be evaluated before the field_type condition was checked,
leading to errors like
psycopg2.errors.InvalidDatetimeFormat: invalid value "stri" for "YYYY"
DETAIL: Value must be an integer.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This isn't quite the right model, because we're not actually going
through the upload code path, but it does at least provide some inline
image previews in the data.
Fixes part of #14991.
The changes are as follows:
• Fix one day offset in all western zones.
• Correct CST from -64800 to -21600 and CDT from -68400 to -18000.
• Disambiguate PST in favor of -28000 over +28000.
• Add GMT, UTC, WET, previously excluded for being at offset 0.
• Add ACDT, AEDT, AKST, MET, MSK, NST, NZDT, PKT, which the previous
code did not find.
• Remove numbered abbreviations -12, …, +14, which are unnecessary.
• Remove MSD and PKST, which are no longer used.
Hardcode the dict and verify it with a test, so that future
discrepancies won’t go silently unnoticed.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
https://docs.djangoproject.com/en/3.1/releases/3.1/
- django.contrib.postgres.fields.JSONField is deprecated and should be
replaced with models.JSONField
- The internals of the implementation in the postgresql backend have
changed a bit in
f48f671223
and thus we need to make an ugly tweak in test_runner.
- app_directories.Loader.get_dirs() now returns a list of PosixPath so
we need to make a small tweak in TwoFactorLoader for that (PosixPath
is not iterable)
Fixes#16010.
Adjustments made due to changes in Django 3.0:
(https://docs.djangoproject.com/en/3.0/releases/3.0/)
- test_signup: INTERNAL_RESET_URL_TOKEN was moved to
PasswordResetConfirmView.reset_url_token
- test_message_fetch:
"add_never_cache_headers() and never_cache() now add the private
directive to Cache-Control headers."
- "django.utils.html.escape() now uses html.escape() to escape HTML.
This converts ' to ' instead of the previous equivalent decimal
code '." - this requires adjusting the expected decimal code
in some of the string fixtures in tests.
This commit updates the Zulip User-Agent to
'Mozilla/5.0 (compatible; ZulipURLPreview/{version}; +{external_host})'
as the older User-Agent was rendering Markdown YouTube titles as
'YouTube - YouTube'.
Fixes#16970.
c2526844e9 removed the `signups` queue
worker, and the command-line tool that enqueues to it -- but not the
automated process that enqueues during signups itself.
Remove the signup, since it is no longer in use.
Previously, the data type of parameters wasn't displayed in the API
Documentation, even though that OpenAPI data is carefully validated
against the implementation. Here we add a recursive function to
render the data types visibly in the API documentation.
This only covers the request parameters; we'll want to do something
similar for response parameters in a follow-up PR.
Fixes part of #15967.
When we were getting an apply_event call for
a subscription/add event, we were trying not to
mutate the event itself, but this clumsy code
was still mutating the actual event:
# Avoid letting 'subscribers' entries end up in the list
for i, sub in enumerate(event['subscriptions']):
event['subscriptions'][i] = \
copy.deepcopy(event['subscriptions'][i])
del event['subscriptions'][i]['subscribers']
This is only a theoretical bug.
The only person who receives a subscription/add
event is the current user.
And it wouldn't have affected the current user,
since the apply_event was correctly updating the
state, and we wouldn't actually deliver the event
to the client (because the whole point of apply_event
is to prevent us from having to piggyback the
super-recent events on to our payload or put
them into the event queue and possibly race).
The new code just cleanly makes a copy of each
sub, if necessary, as we add them to state["subscriptions"].
And I updated the event schemas to reflect that
subscribers is always present in subscription/add
event.
Long term we should probably avoid sending subscribers
on this event when the clients don't set something
like include_subscribers. That's a fairly complicated
fix that involves passing in flags to ClientDescriptor.
Alternatively, we could just say that our policy is
that we never send subscribers there, but we instead
use peer_add events. See issue #17089 for more
details.
It's always cleaner to work in id space. It probably
would have required a perfect storm to have broken
the existing code, but using ids is obviously more
robust in theory, and just as simple.
We now require keywords, so that there is no
pitfall for mixing up boolean parameters.
Positional parameters are basically evil
when you have a bunch of bools.
I also make user_profile the first argument.
Finally, the code is more diff-friendly.
I eliminate the defaults, since the existing code
was already specificying values for most things.
I move all the booleans to the bottom for both
parameters and arguments.
I require explicit keywords for everything but
user_profile (which is now first).
And, finally, I format the code in a more
diff-friendly manner.
We eliminate some redundant checks.
We also consistently provide a `subscribers` field
in our stream data with `[]`, even if our users
can't access subscribers. We therefore bump
the API version and tweak the docs. (See further
down for a detailed justification of the change.)
Even though it is sometimes fine to have redundant code
that is defensive in nature, some upcoming changes are gonna
move subscriber-related logic out of build_stream_dict_for_sub
for certain codepaths as part of our effort to streamline
the payload for subscribers within page_params.
So we can't rely on the code that I removed here
inside of build_stream_dict_for_sub.
Anyway, it makes more sense to do these checks explicitly
in the validate function.
The code in build_stream_dict_for_sub was almost effectively
a noop, since the validation function was already preventing
us from getting subscriber info. The only difference it
made was sometimes converting `[]` to `None`, and then
subsequently omitting the subscribers field.
Neither ZT nor the webapp make any distinction between
`[]` or <missing key> for the `subscribers` data in
`page_params`.
The webapp has had this code for a long time (and now
equivalent code elsewhere in this PR):
if (!Object.prototype.hasOwnProperty.call(sub, "subscribers")) {
sub.subscribers = new LazySet([]);
}
The webapp calculates access based on booleans, anyway:
sub.can_access_subscribers =
page_params.is_admin || sub.subscribed ||
(!page_params.is_guest && !sub.invite_only);
And ZT would choke if `subscribers` were missing, except that
it never gets to the relevant code due to other checks:
def get_other_subscribers_in_stream(<snip>):
assert stream_id is not None or stream_name is not None
if stream_id:
assert self.is_user_subscribed_to_stream(stream_id)
return [sub
for sub in self.stream_dict[stream_id]['subscribers']
if sub != self.user_id]
else:
return [sub
for _, stream in self.stream_dict.items()
for sub in stream['subscribers']
if stream['name'] == stream_name
if sub != self.user_id]
You could make a semantic argument that we should prefer
<missing key> to `[]` when subscribers aren't even available, but
we have precedent from the way that `bulk_get_subscriber_user_ids`
has traditionally populated its result:
result: Dict[int, List[int]] =
{stream["id"]: [] for stream in stream_dicts}
If we changed `stream_dicts` to `target_stream_dicts` we
would faciliate a move toward `None`, but it would just cause
headaches for other server code as well as the frontends
(which, to reiterate, already prefer the empty array
for convenience).
As my comment indicates, I would prefer to handle
this explicitly by raising JsonableError in an
else statement here, but it's not a big deal.
This function can probably be simplified with a
bit of work, mostly on the testing side to make
sure we are covering all edge cases, but that
is out of the scope of my current PR.
By moving the relevant logic from realm.get_bot_domain to
get_fake_email_domain we will make realm.host be used (if possible) for
dummy user addresses. That is, instead of user11@zulipchat.com, the
address will become user11@subdomain.zulipchat.com.
With the change in d70e1bcdb7,
bots get email like bot@zulip.com with EXTERNAL_HOST="zulip.com",
rather than bot@subdomain.zulip.com, which was the old format. That's
not desirable, so with this commit, realm.host will be used when
possible and only falling back to FAKE_EMAIL_DOMAIN if needed.
We often send only one field (away or status_text)
to be updated.
So we have to make our schema support optional
keys.
As a result of the more flexible schema, we no
longer need to exempt the node fixtures from
our schema checks.
Since recipient_id (id of the PERSONAL Recipient of the user) was
denormalized into the UserProfile model, this query can be simplified by
getting rid of the zerver_recipient JOIN.
This makes us more efficient when handling
multiple users. We don't have to keep
sending the same two queries to the database.
Note that as part of this we eliminated
a failure mode for the obscure population
of users from whom both `user.is_guest` and
`user.can_access_public_streams()` returns
False. We know this would have only affected
Zephyr users (by looking at the code), and
we know we don't actually process Zephyr
users for email digests (or else we would
have raised exceptions in the old code).
We mostly need realm_id, but when we go to build
message lists, we need realm.uri.
We could probably be more aggresive about using
`only` here, but for now I am just trying to
reduce hops to the database.
The `deployment` key was only set in `do_report_error`, which is now
only used in one codepath (the queue worker). The logging handlers on
staging call notify_server_error directly, which omits the
`deployment` key.
Remove the odd one-of key, and instead simply do dispatch in
`do_report_error`.
The codepath for moving a topic changes the message.recipient_id to the
id of the new recipient, but later, in update_messages_for_topic_edit,
it uses message.recipient when querying for messages with the matching
topic in the *old* stream (because those are the other messages that
need to be moved). This is a bug which happens to work fine, because in
Django 2, if message.recipient gets fetched first and then
message.recipient_id is mutated, message.recipient will not be altered
and thus will retain the outdated, previously fetched value.
In Django 3 changing .recipient_id causes .recipient to be updated to
the new Recipient objects, which is the Recipient of the *new* stream.
That will cause the bug to manifest.
This is a bugfix preparing for the upgrade to Django 3.
Support for saving it in the session is dropped in django3, the cookie
is the mechanism that needs to be used. The relevant i18n code doesn't
have access to the response objects and thus needs to delegate setting
the cookie to LocaleMiddleware.
Fixes the LocaleMiddleware point of #16030.
We now require explicit keywords for all arguments
to fetch_initial_state_data except user_profile.
We provide reasonable defaults to keep the test
code concise.
In the case of reusing a registration link, reuse the
redirect_to_email_login_url helper. This does have the side effect of
now showing a "you've already registered" note, which did not happen
previously, but that seems probably for the best, since the user did
just click a "register" link.
ecfafc05c0 shifted to using a different paramter name to hint that
the user had previously signed up -- and in so doing also stopped
pre-filling the "email" box. Also send along the email box, to save
users time.
Checking for `validate_email_not_already_in_realm` again (after the
form already did so), but only in the case that the form fails to
validate, means that we may be spending time pushing totally invalid
emails to the DB to check. In the case of emails containing nulls,
this can even trigger a 500 error from PostgreSQL.
Stop calling `validate_email_not_already_in_realm` in the form
validation. The form is currently only used in two places -- in
`accounts_home` and in `maybe_send_to_registration`. The latter is
only called if the address is known to not currently have an account,
so checking in there is unnecessary; and in the former case, we wish
different behaviour (the redirect) than just validation failure, which
is all the validator can do.
Fixes#17015.
Co-authored-by: Alex Vandiver <alexmv@zulip.com>
Add a `--allow-reserved-subdomain` flag which allows creation of
reserved keyword domains. This also always enforces that the domain
is not in use, which was removed in 0258d7d.
Fixes#16924.
When changing the subdomain of a realm, create a deactivated realm with
the old subdomain of the realm, and set its deactivated_redirect to the
new subdomain.
Doing this will help us to do the following:
- When a user visits the old subdomain of a realm, we can tell the user
that the realm has been moved.
- During the registration process, we can assure that the old subdomain
of the realm is not used to create a new realm.
If the subdomain is changed multiple times, the deactivated_redirect
fields of all the deactivated realms are updated to point to the new
uri.
Instead of just storing the edit history in the message which
triggered the topic edit, we store the edit history in all
the messages that changed. This helps users track the edit history
of a message more reliably.
This change updates the GitHub Integration webhook
get_opened_or_update_pull_request_body method so that
the description is only printed if it actually changes.
If the update event is a result of some other
attribute update, such as an asignee change, then the
description is not included in the message sent to
the zulip stream.
Fixes#16345
As of Feb 15th 2019, Hipchat Cloud and Stride
have reached End Of Life and are no longer
supported by Atlassian. Since it is almost 2 years
now we can remove the migration guides.
Fetchings rows with end_time within the last 25 hours would result
in the realmcount queries returning two rows for each realm
if the analytics page was opened within an hour since the
count stats were updated.
Allowing any admins to create arbitrary users is not ideal because it
can lead to abuse issues. We should require something stronger that
requires the server operator's approval and thus we add a new
can_create_users permission.
We change the return type of check_message to be dataclass instead of
Dict[str, Any]. This refactoring helps us to understand the context of the
data structure returned by check_message clearly which was not possible
when using Dict.
SendMessageRequest class is added in zerver/lib/message.py inspite of it
not being used in that file itself just to maintain consistency as other
TypedDicts and dataclasses are defined in that file and to avoid circular
dependency as SendMessageRequest is being used in lib/widget.py as well.
We also rename local variable to 'send_request' for accessing
SendMessageRequest objects.
The {addr} part isn't directly useful, since connections to Tornado
are done on localhost anyway, and made the development environment
output a bit more confusing.
Also, use the same phrasing for restarts we use for Django.
This logging is really only potentially interesting in a development
environment when the numbers are nonzero.
In production, it seems worth logging for consistency reasons.
Probably we'll eventually redo this block by change the log level, but
this is good enough to despam the development environment startup
output.
We always want to do these at the same time. Previously, message
editing did too much stripping (fixes#16837) and failed to check for
NUL bytes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Previously we were just returning a dict containing a message id when
trying to mirror a already sent message in 'zephyr_mirror' cases.
This commit changes this behaviour to raise an exception when trying
to mirror an already sent message by adding a new exception class
ZephyrMessageAlreadySentException and then the caller returns the
message_id directly, instead of calling do_send_messages which also
returns a list of size one containing the message_id only.
This is a prep commit for changing the return type of check_message to
be a dataclass instead of a Dict as now we have only single output for
check_message.
This commit renames the content variable in do_widget_post_save_actions
to message_content and is a prep commit for changing the return type of
check_message from Dict to dataclass.
This change is required because content variable is used two times in
this function - one for message content and other for submessage
content, so when we change the return type of check_message to
dataclass, the type of content variable is considered as str and then
when dict is assigned to content in the submessage case, mypy raises
'Incompatible types in assignment' error.
This issue is not faced before the dataclass migration because there is
no type checking for the values of dict returned by check_message as the
return type of check_message is 'Dict[str, Any]'.
The message_dict['wildcard_mention_user_ids'] should be empty set instead
of empty list when there are no wildcard mentions similar to the case
when there are wildcard mentions, where it is equal to set of user ids and
not list of user ids.
I reformatted the tests and view to include information about who
acknowledged and closed the alert. Only includes the information about
the owner if there was an owner.
Made a few small changes to the refactored bit as requested in review.
Moved time formatting check and conversion to
zerver/lib/webhooks/common.py. Updated tests slightly to match new
output. Removed duration from the calculation because the difference
is less than the precision of output and it complicated the error
handling.
The Slack API always (even for failed requests) puts the access scopes
of the token passed in, into "X-OAuth-Scopes"[1], which can be used to
determine if any are missing -- and if so, which.
[1] https://api.slack.com/legacy/oauth-scopes#working-with-scopes
An HTML document sent without a charset in the Content-Type header
needs to be scanned for a charset in <meta> tags. We need to pass
bytes instead of str to Beautiful Soup to allow it to do this.
Fixes#16843.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
If a user visits a realm which has been deactivated and it's
deactivated_redirect field is set, we should have a message telling the
user that the realm has moved to the deactivated_redirect url.
We export a realm's data, and disable the realm, because the user
is moving from Zulip Cloud (e.g. https://example.zulipchat.com/) to
self-hosting or another platform (e.g. https://zulip.example.com/)
which we do not control. This commit adds a field in the realm object
called deactivated_redirect to store the url to which the realm has
moved.
This handles the conditions when anchor values are larger than
LARGER_THAN_MAX_MESSAGE_ID by clamping them down to it. Also added
tests for the function parse_anchor_value.
Fixes#16768.
This simplifies the code, as it allows using the mechanism of converting
JsonableErrors into a response instead of having separate, but
ultimately similar, logic in RateLimitMiddleware.
We don't touch tests here because "rate limited" error responses are
already verified in test_external.py.
In 1bcb8d8ee8 I made
it so the webapp doesn't include "streams" in its
state from `fetch_initial_state_data`, but I didn't
address all the places in apply_event.
By default all Stripe API amounts are in the currency's smallest unit.
It's upto us to convert it to a bigger unit and show it to the end user.
And refund event used to show the currency in the smallest unit which makes
the output wrong when it comes to most currencies like USD, Europ, INR etc
which uses a bigger unit(eg Dollar instead of Cents) as the standard.
Update the New Relic webhook and tests to match the format specified
in the New Relic documentation. The new format sends a json body
instead of using url parameters. The old format is no longer supported
by New Relic according to their support staff; as a result, the fixtures for
the old test cases were removed. Added fixtures for new test cases.
Fixes: #16393.
For 3000 messages and 400 users, this saved
about 30 seconds.
We only do two queries per batch of messages
now, and the algorithm is easier to analyze,
as it's just three nested loops.
Note that we are much more efficient about finding
active users here:
- we do one query per realm (instead of per-user)
- we pass the cutoff date to the database
- we get back just a list of distinct ids
This function is going away completely soon. It is
querying everybody's entire UserActivity history instead
of passing the cutoff date to the database!
The query counts increase here for somewhat
contrived reasons. The tests before this
commit reflected a successful trip to the
UserProfile cache, but that's not actually
realistic in practice.
We don't need to mock the dates here. We also
explicitly clear out all streams first, and then
we explicitly test with both the stream being
current and the stream being old.
We can use the _enqueue_emails_for_realm helper
to avoid all the Tuesday-related logic here.
We also don't bother to create UserActivity
records, since the bot gets excluded by virtue
of its being a bot. (Also, the date ranges
here were sketchy due to the time mocking.)
We can avoid all the date mocking now for all
but a couple tests that exercise the is-it-Tuesday
logic.
And this test now correctly tests that we exclude
recently active users.
And this allows us to remove the other test.
Sentry allows adding simple webhooks without going through the process
of creating an Internal Integration in Sentry's Integration
Platform[1] (which our docs recommend).
The payload from sent from such a (simple) webhook integration is
slightly different from the payload sent by an Internal Integration
webhook. This commit tries to wrangle this payload into a form that is
usable by our webhook handler to send a notification message.
[1]: https://sentry.io/integration-platform/
This commit fixes a bug in marked.js which caused it to double-escape
HTML when rendering messages of the form: *[text](url)*.
This fixes a bug introduced in
3bdc8bbaa5, where an unnecessary
escape() call was added for the <em> code path, likely just because it
was adjacent to the others that needed it in the file.
Fix this, and add tests to verify that things are still being escaped
once after removing this extra escape.
Fixes#14845.
The code we deleted here was no longer
doing anything.
Maybe the code was always dead, or maybe it
was written during a time when topics_by_diversity
and topics_by_length actually had different keys.
But now it's clearly cruft.
If we have 4 or more topics, then the code above
it would already have populated the list with 4
elements, and the `if num_convos < 4` condition
would evaluate to False.
And if we had 3 or fewer topics, then we would
have already put all possible topics into our
result, and the `topics_by_diversity[num_convos:4]`
slice would be empty.
It's possible that we should just have a simple
heuristic for topic hotness like `10*num_senders
+ messages`, so we don't have to maintain this
fiddly function, and we can just do something like
`topics_by_score[:4]`.
I now use sets for stream_ids in more of the digest
code.
As part of this I replaced exclude_subscription_modified_streams
with streams_recently_modified_for_user.
It's easier for the caller to just ask for ids
to delete from its callee than it is to pass
in a set/list to mutate.
The simpler boundary between the functions makes
the tests easier to write--you can see the
`filtered_streams` logic goes away in this diff.
I also make the tests a bit more thorough by using
combinations of Cordelia/Othello and Verona/Denmark
to try to find multiple possible flaws.
And I make the time intervals longer than 1s to
avoid false negatives from slow CI boxes.
If we have multiple users, this reduces the amount
of queries we need to do, because we get all
subscriptions for all users in a single query
to Subscription.
For the single-user case, we are introducing an
extra query hop, but the database is doing
roughly the same work, because we are just breaking
up this complex query into two hops:
messages =
select ... from message
where recipient__type_id in (
select stream_id from subscription
where ...
)
Now it's more like:
stream_ids =
select stream_id from subscription
where ...
messages =
select ... from message
where recipient__type_id in stream_ids
Note that we are not changing anything semantically
or algorithmically yet. The only overhead here
for the single-user case is boxing and unboxing
data into single-item dicts and lists.
The interfaces for callers in the view and the
queue processor remain the same for now.
We didn't need the enough-traffic mock.
We also continue to prep for testing multiple users.
I also finally remove a comment that is about to
be addressed (and which inaccurately refers to huddles).
This extraction will make a bit more sense when
we start doing bulk operations on a realm to
get digests, but even now, it encapsulates the
slightly complex way we cherry-pick the top 4
topics for a user.
This prep step is mostly for diff hygiene; the next
commit will make the code a bit nicer.
The original code here had the nice property that
most (but not all) of the DB work happened up
front in `handle_digest_email`, and none of the
DB work was delegated to the callers. But I
prefer the tradeoff of making the helpers a bit
more cohesive--let them get the data they need.
And we have query-count coverage in our tests,
so there's no real danger of having helpers
down in the stack insidiously doing a bunch of
extra DB hops.
In 709493cd75 (Feb 2017)
I added code to render_markdown that re-fetched the
sender of the message, to detect whether the message is
a bot.
It's better to just let the ORM fetch this. The
message object should already have sender.
The diff makes it look like we are saving round trips
to the database, which is true in some cases. For
the main message-send codepath, though, we are only
saving a trip to memcached, since the middleware
will have put our sender's user object into the
cache. The test_message_send test calls internally
to check_send_stream_message, so it was actually
hitting the database in render_markdown (prior to
my change).
Before this change we were clearing the cache on
every SQL usage.
The code to do this was added in February 2017
in 6db4879f9c.
Now we clear the cache just one time, but before
the action/request under test.
Tests that want to count queries with a warm
cache now specify keep_cache_warm=True. Those
tests were particularly flawed before this change.
In general, the old code both over-counted and
under-counted queries.
It under-counted SQL usage for requests that were
able to pull some data out of a warm cache before
they did any SQL. Typically this would have bypassed
the initial query to get UserProfile, so you
will see several off-by-one fixes.
The old code over-counted SQL usage to the extent
that it's a rather extreme assumption that during
an action itself, the entries that you put into
the cache will get thrown away. And that's essentially
what the prior code simulated.
Now, it's still bad if an action keeps hitting the
cache for no reason, but it's not as bad as hitting
the database. There doesn't appear to be any evidence
of us doing something silly like fetching the same
data from the cache in a loop, but there are
opportunities to prevent second or third round
trips to the cache for the same object, if we
can re-structure the code so that the same caller
doesn't have two callees get the same data.
Note that for invites, we have some cache hits
that are due to the nature of how we serialize
data to our queue processor--we generally just
serialize ids, and then re-fetch objects when
we pop them off the queue.
The passwords generated for our development environment / test suite
include the `+` character, which needs to be quoted when encoded as an
HTTP POST parameter.
This is hopefully sufficient to fix the CI failures we've seen with
the tests for POST /api/v1/fetch_api_key; I haven't reproduced the
failure so am not completely sure.
Steve asked me to remove this, since the tictactoe game was always
intended as a proof of concept. Now that we have poll and todo
widgets, the sample code for tictactoe has much less value.
We replace the content and type in test_widgets.py to maintain
coverage.
This reverts commit 564b199fe6, which
was part of #16308.
Escaping is either required or incorrect; it is never “defensive”.
This escaping is incorrect. lxml already escapes attributes during
serialization (any other behavior would be a serious bug), and
additional escaping just results in double escaping.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Initally, when writing two or more quotes, having
a blank line in between them, merges those quotes.
This created confusion especially in "quote and reply".
This commit fixes such issues. Now two or more quotes
having a blank line in between them, will not get merged.
This change is correct both for usability and for improving our
compatibility with CommonMark.
Fixes#14379.
By registering a post_delete handler to clear appropriate caches in a
nicer way, we can get rid of the ugly flush-memcached call in the
delete_realm command.
This commit adds migration which removes default status of exisitng
default private streams, i.e. private stream exists but they are no
longer default.
This commit removes mock.patch with assertLogs().
* Adds return value to do_rest_call() in outgoing_webhook.py, to
support asserting log output in test_outgoing_webhook_system.py.
* Logs are not asserted in test_realm.py because it would require to users
to be queried using users=User.objects.filter(realm=realm) and the order
of resulting queryset varies for each run.
* In test_decorators.py, replacement of mock.patch is not done because
I'm not sure if it's worth the effort to replace it as it's a return
value of a function.
Tweaked by tabbott to set proper mypy types.
Then because the ID is now part of the draft dict, we can
(and do) change the structure of the "drafts" parameter
returned from `GET /drafts` from an object (mapping ID to
data) to an array.
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
Sometimes we don't need to specify the expected_drafts field.
So by removing it, we can reduce the clutter a bit.
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
Now the timestamp returned in a draft dict will always be an int.
The endpoints will still accept either an int or a float.
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
Our test-backend validation confirms that we don't log anything to
stdout in the tests, so the fact that CI passes with this removes
shows there was nothing being logged.
Boto3 does not allow setting the endpoint url from
the config file. Thus we create a django setting
variable (`S3_ENDPOINT_URL`) which is passed to
service clients and resources of `boto3.Session`.
We also update the uploads-backend documentation
and remove the config environment variable as now
AWS supports the SIGv4 signature format by default.
And the region name is passed as a parameter instead
of creating a config file for just this value.
Fixes#16246.
The `no_proxy` parameter does not work to remove proxying[1]; in this
case, since all requests with this adapter are to the internal Tornado
process, explicitly pass in an empty set of proxies to disable
proxying.
[1] https://github.com/psf/requests/issues/4600
Not all of the workers are known to be safe to interrupt; they might
leave inconsistent state. As such, terminating them with timeouts
should currently only be a last-resort against stalled queues, not a
regular occurrence.
Since the exception can be triggered at arbitrary places in the stack
based on whenever the alarm happens to fire, they do not often group
together.
Explicitly group them together, grouped only by which queue the work
is in.
While working on shifting toward native browser time zone APIs
(#16451), it was found that all but very recent Chrome and Node
versions reject certain legacy timezone aliases like US/Pacific
(https://crbug.com/364374).
For now, we only canonicalize the timezone property returned in user
objects and not the timezone setting itself.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
During the new user creation code path, there can be no existing
active clients for the user being created, so we can skip the code to
send events to that user's clients.
The tests here reflect that we need to send fewer events, and do fewer
queries that would have been spent computing data for these..
Fixes#16503, combined with the long series of recent changes by Steve
Howell to fix super-linear behavior in this code path.
We no bulk up peer_add/peer_remove events by user if the
same user has subscribed to multiple streams (and just
that single user).
This mostly optimizes the new-user codepath, but the
algorithm is a bit more general in nature.
This test was flaky due to some date-related
non-determinism. I make all the Message objects
current to make add_new_user_history reliably
try to bulk-update UserMessage rows to read.
We replace knight command with change_user_role command which
allows us to change role of a user to owner, admins, member and
guest. We can also give/revoke api_super_user permission using
this command.
Tweaked by tabbott to improve the logging output and update documentation.
Fixes#16586.
Because of the very large `oneOf` clause of the formats of events
possible in Zulip's `GET /events` system, we had issues with
`test-backend` failures for missing documentation for a new event
format being like 1000 lines of output, which was very much unhelpful.
Fix this by limiting the output use only the oneOf variants that are
broadly similar to the actual payload received.
Fixes#16023.
See commit 8b002040e0 and #86. The
development environment bug that necessitated this handler has long
been irrelevant.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The comment still pointed to 'vacate' event flow, but
we have removed the vacate event in a9356508ca.
This commit fixes the comment to depict the correct
purpose of below lines, i.e. to test the remove
event flow.
We were including 'realm_user' in event_types along with 'subscription',
but we don't send event of type 'realm_user' when subscribing to a new
stream. This was added in 1c332f5d6a.
This commit removes 'realm_user' from event_types.
The name used to be included in the id_token, but this seems to have
been changed by Apple and now it's sent in the `user` request param.
https://github.com/python-social-auth/social-core/pull/483 is the
upstream PR for this - but upstream is currently unmaintained, so we
have to monkey patch.
We also alter the tests to reflect this situation. Tests no longer put
the name in the id_token, but rather in the `user` request param in the
browser flow, just like it happens in reality.
An adaptation has to be made in the native flow - since the name won't
be included by Apple in the id_token anymore, the app, when POSTing
to the /complete/apple/ endpoint,
can (and should for better user experience)
add the `user` param formatted as json of
{"email": "hamlet@zulip.com", "name": {"firstName": "Full", "lastName": "Name"}}
dict. This is also reflected by the change in the
native flow tests.
We now can send an implied matrix of user/stream tuples
for peer_add and peer_remove events.
The client code basically does this:
for stream_id in event['stream_ids']:
for user_id in event['user_ids']:
update_sub(stream_id, user_id)
We used to send individual events, which gets real
expensive when you are creating new streams. For
the case of copy-to-stream case, we should see
events go from U to 1, where U is the number of users
added.
Note that we don't yet fully optimize the potential
of this schema. For adding a new user with lots
of default streams, we still send S peer_add events.
And if you subscribe a bunch of users to a bunch of
private streams, we only go from U * S to S; we can't
optimize it down to one event easily.
Right now the list of languages in Display settings → Default language
is sorted in an unintuitive order due to the varying case conventions:
British English
Chinese (Taiwan)
Deutsch
English
Hindi
Indonesian (Indonesia)
Lietuviškai
Magyar
Malayalam
Nederlands
Português
Română
Tiếng Việt
Türkçe
català
español
français
galego
italiano
polski
suomi
svenska
česky
Русский
Українська
български
српски
فارسی
தமிழ்
日本語
简体中文
繁體中文
한국어
Fix the sort to use the locale-independent Unicode Collation
Algorithm:
British English
català
česky
Chinese (Taiwan)
Deutsch
English
español
français
galego
Hindi
Indonesian (Indonesia)
italiano
Lietuviškai
Magyar
Malayalam
Nederlands
polski
Português
Română
suomi
svenska
Tiếng Việt
Türkçe
български
Русский
српски
Українська
فارسی
தமிழ்
한국어
日本語
简体中文
繁體中文
Signed-off-by: Anders Kaseorg <anders@zulip.com>
All the fields of a stream's recipient object can
be inferred from the Stream, so we just make a local
object. Django will create a Message object without
checking that the child Recipient object has been
saved. If that behavior changes in some upgrade,
we should see some pretty obvious symptom, including
query counts changing.
Tweaked by tabbott to add a longer explanatory comment, and delete a
useless old comment.
This saves us a query for edge cases like when
you try to unsubscribe from a public stream
that you have already unsubscribed from.
But this is mostly to prep for upcoming
optimizations.
This doesn't change anything yet, but the goal is
to eventually optimize events for the case where
one user (typically a new user) gets subscribed
to multiple public streams.
Initially markdown titles were overridden by Youtube and Vimeo preview titles.
But now it will check if any markdown title is present to replace Youtube or
Vimeo preview titles, if preview of linked websites is enabled.
Fixes#16100
Upstream has slightly changed the whitespace around stashes. Take
this opportunity to clean up the extra blank lines we were outputting.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
There was no need to put "stream_id" on the sub
dictionary here. It's kinda annoying to introduce
the little helper here, but I feel
that's better than crufting up the sub data
structure.
The is_web_public flag is already in Stream.API_FIELDS,
so there is no reason for all this complicated logic.
There's no reason to hack it on to the subscription
object.
We replace all_streams_id with a map.
We also use it to populate never_subscribed_streams.
And all_streams_map is a superset of stream_hash,
which we will soon kill off as well.
Apparently I put these parens in the code as
part of 73c30774cb
during 2017.
It looks like I extracted is_public during
the middle of my change and forgot to remove
the unnecessary parens. (The code was correct,
but it makes it look like a tuple if you're
skimming it too quickly.)
That class is an artifact of when Stream
didn't have recipient_id. Now it's simpler
to deal with stream subscriptions.
We also save a query during page load (and
other places where we get subscriber
info).
Let the callers access stream.recipient as needed.
It costs the same, and some of the callers can
actually stop caring about the actual Recipient
object.
We already trust ids that are put on our queue
for deferred work. For example, see the code for
"mark_stream_messages_as_read_for_everyone"
We now pass stream_recipient_id when we queue
up work for do_mark_stream_messages_as_read.
This generally saves about 3 queries per
user when we unsubscribe them from a stream.
We get two speedups:
* The query to get existing subscribers only
gets the two fields we need. We no longer
need all the overhead of user_profile
and recipient data being returned in the
query.
* We avoid Django making extra hops to the
database to get user info.
Previously, the transaction.atomic() was not properly scoped to ensure
that RealmAuditLog entries were created in the same transaction,
making it possible for state changes to not be properly recorded in
RealmAuditLog.
When apps like mobile register for "streams", we
will now just use active streams as our baseline,
rather than "occupied" streams.
This means we will send a stream that is active,
even if it happens to have zero occupants. It's
actually pretty rare that a stream has zero occupants,
and it's not exactly clear that we want to exclude
a non-occupied but otherwise active stream from
our list of streams.
It also happens to be fairly expensive to compute
whether a stream is occupied.
This change only affects API clients (including
possibly our mobile app). The main webapp never
used the data from this codepath.
We replace get_peer_user_ids_for_stream_change
with two bulk functions to get peers and/or
subscribers.
Note that we have three codepaths that care about
peers:
subscribing existing users:
we need to tell peers about new subscribers
we need to tell subscribed user about old subscribers
unsubscribing existing users:
we only need to tell peers who unsubscribed
subscribing new user:
we only need to tell peers about the new user
(right now we generate send_event
calls to tell the new user about existing
subscribers, but this is a waste
of effort that we will fix soon)
The two bulk functions are this:
bulk_get_subscriber_peer_info
bulk_get_peers
They have some overlap in the implementation,
but there are some nuanced differences that are
described in the comments.
Looking up peers/subscribers in bulk leads to some
nice optimizations.
We will save some memchached traffic if you are
subscribing to multiple public streams.
We will save a query in the remove-subscriber
case if you are only dealing with private streams.
This will ensure that we always fully execute the database part of
modifying subscription objects. In particular, this should prevent
invariant failures like #16347 where Subscription objects were created
without corresponding RealmAuditLog entries.
Fixes#16347.
We don't need the select_related('user_profile')
optimization any more, because we just keep
track of user info in our own data structures.
In this codepath we are never actually modifying
users; we just occasionally need their ids or
emails.
This can be a pretty substantive improvement if
you are adding a bunch of users to a stream
who each have a bunch of their own subscriptions.
We could also limit the number of full rows in this
query by adding an extra hop to the DB just to
get colors (using values_list), and then only get
full sub info for the streams that we're adding, rather
than getting every single subscription, in full, for each user.
Apart from finding what colors the user has already
used, the only other reason we need all the columns
in Subscription here is to handle streams that
need to be reactivated. Otherwise we could do
only("id", "active", "recipient_id", "user_profile_id")
or similar. Fortunately, Subscription isn't
an overly wide table; it's mostly bool fields.
But by far the biggest thing to avoid is bringing
in all the extra user_profile data.
We have pretty good coverage on query counts here,
so I think this fix is pretty low risk.
This class removes a lot of the annoying tuples
we were passing around.
Also, by including the user everywhere, which
is easily available to us when we make instances
of SubInfo, it sets the stage to remove
select_related('user_profile').
We used to send occupy/vacate events when
either the first person entered a stream
or the last person exited.
It appears that our two main apps have never
looked at these events. Instead, it's
generally the case that clients handle
events related to stream creation/deactivation
and subscribe/unsubscribe.
Note that we removed the apply_events code
related to these events. This doesn't affect
the webapp, because the webapp doesn't care
about the "streams" field in do_events_register.
There is a theoretical situation where a
third party client could be the victim of
a race where the "streams" data includes
a stream where the last subscriber has left.
I suspect in most of those situations it
will be harmless, or possibly even helpful
to the extent that they'll learn about
streams that are in a "quasi" state where
they're activated but not occupied.
We could try to patch apply_event to
detect when subscriptions get added
or removed. Or we could just make the
"streams" piece of do_events_register
not care about occupy/vacate semantics.
I favor the latter, since it might
actually be what users what, and it will
also simplify the code and improve
performance.
The query to get "occupied" streams has been expensive
in the past. I'm not sure how much any recent attempts
to optimize that query have mitigated the issue, but
since we clearly aren't sending this data, there is no
reason to compute it.
Using web_public_guest for anonymous users is confusing since
'guest' is actually a logged-in user compared to
web_public_guest which is not logged-in and has only
read access to messages. So, we rename it to
web_public_visitor.
This is a more thorough test of adding multiple
streams for multiple users, including streams
that users have already subscribed to.
The extra queries here are due to the fact
that we call `principal_to_user_profile` in
a loop in the view. So that's an example
of O(N) overhead. We may be able to bulk-fetch
these users eventually.
This is a pure extraction, except that I remove a
redundant check that `len(principals) > 0`. Whenever
that value is false, then `new_subscriptions` will
only have one possible entry, which is the current
user, and we skip that in the loop.
We no longer do O(N) queries to get existing streams.
This is a somewhat contrived use case--generally, we
are not trying to re-subscribe a user to several
streams. Still, we want to avoid this.
This commit also makes `test_bulk_subscribe_many`
do more work, and the change to the test helped
me discover this bug.
If a user asks to be subscribed to a stream
that they are already subscribed to, then
that stream won't be in new_stream_user_ids,
and we won't need to send an event for it.
This change makes that happen more automatically.
Let
U = number of users to subscribe
S = number of streams to subscribe
We were technically doing N^3 amount of work
when we sent certain events, or to be more
precise, U * S * S amount of work. For each
stream, we were looping through a list of tuples
of size U * S to find the users for the stream.
In practice either U or S is usually 1, so the
performance gains here are probably negligible,
especially since the constant factors here
were just slinging around Python data.
But the code is actually more readable now, so
it's a double win.
We rename needs_new_sub (which sounds like
a boolean!) to new_recipient_ids, and we
calculate it explicitly within the loop, so
that we don't need to worry as much about
subsequent passes through the loop mutating it.
This allows us to also remove recipient_ids,
which in turn lets us remove recipients_map,
albeit with a small tweak for stream_map.
I also introduce the my_subs local, which
I use to more directly populate used_colors,
as well as using it as the loop var.
I think it's important that the callers understand
that bulk_add_subscriptions assumes all streams
are being created within a single realm, so I make
it an explicit parameter.
This may be overkill--I would also be happy if we
just included the assertions from this commit.
This function now does all the work that we used
to do with notify_subscriptions_added happening
inside a loop.
There's a small fine-tuning here, where we only
get recent traffic on streams that we're actually
sending events for.
We now just pass in all_subscribers_by_stream, rather
than a callback.
We also move sub_tuples_by_user closer to the
loop where we call notify_subscriptions_added.
This preserves the alpha layer on GIF images that need to be resized
before being uploaded. Two important changes occur here:
1. The new frame is a *copy* of the original image, which preserves the
GIF info.
2. The disposal method of the original GIF is preserved. This
essentially determines what state each frame of the GIF starts from
when it is drawn; see PIL's docs:
https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#saving
for more info.
This resolves some but not all of the test cases in #16370.
ssh always runs its command through a shell (after naïvely joining
multiple arguments with spaces), so it needs an extra level of shell
quoting. This should have no effect because we already validated user
with a regex, but it’s better for escaping to be locally correct in
case the context changes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
do_send_messages has side effects outside the database and may not
work reliably if its database effects are reordered by being inside a
transaction.
This also fixes a bug where we were doing the update incorrectly on
the Message table.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Since this was using repead individual get() calls previously, it
could not be monitored for having a consumer. Add it in, by marking
it of queue type "consumer" (the default), and adding Nagios lines for
it.
Also adjust missedmessage_emails to be monitored; it stopped using
LoopQueueProcessingWorker in 5cec566cb9, but was never added back
into the set of monitored consumers.
This low-level interface allows consuming from a queue with timeouts.
This can be used to either consume in batches (with an upper timeout),
or one-at-a-time. This is notably more performant than calling
`.get()` repeatedly (what json_drain_queue does under the hood), which
is "*highly discouraged* as it is *very inefficient*"[1].
Before this change:
```
$ ./manage.py queue_rate --count 10000 --batch
Purging queue...
Enqueue rate: 11158 / sec
Dequeue rate: 3075 / sec
```
After:
```
$ ./manage.py queue_rate --count 10000 --batch
Purging queue...
Enqueue rate: 11511 / sec
Dequeue rate: 19938 / sec
```
[1] https://www.rabbitmq.com/consumers.html#fetching
`loopworker_sleep_mock` is a file-level variable used to mock out the
sleep() call in LoopQueueProcessingWorker; don't reuse the variable
name for something else.
Despite its name, the `queue_size` method does not return the number
of items in the queue; it returns the number of items that the local
consumer has delivered but unprocessed. These are often, but not
always, the same.
RabbitMQ's queues maintain the queue of unacknowledged messages; when
a consumer connects, it sends to the consumer some number of messages
to handle, known as the "prefetch." This is a performance
optimization, to ensure the consumer code does not need to wait for a
network round-trip before having new data to consume.
The default prefetch is 0, which means that RabbitMQ immediately dumps
all outstanding messages to the consumer, which slowly processes and
acknowledges them. If a second consumer were to connect to the same
queue, they would receive no messages to process, as the first
consumer has already been allocated them. If the first consumer
disconnects or crashes, all prior events sent to it are then made
available for other consumers on the queue.
The consumer does not know the total size of the queue -- merely how
many messages it has been handed.
No change is made to the prefetch here; however, future changes may
wish to limit the prefetch, either for memory-saving, or to allow
multiple consumers to work the same queue.
Rename the method to make clear that it only contains information
about the local queue in the consumer, not the full RabbitMQ queue.
Also include the waiting message count, which is used by the
`consume()` iterator for similar purpose to the pending events list.
We modify access_stream_for_delete_or_update function to return
Subscription object also along with stream. This change will be
helpful in avoiding an extra query to get subscription object in
code for updating subscription role.
For streams in which only full members are allowed to post,
we block guest users from posting there.
Guests users were blocked from posting to admin only streams
already. So now, guest users can only post to
STREAM_POST_POLICY_EVERYONE streams.
This is not a new feature but a bugfix which should have
happened when implementing full member stream policy / guest users.
Otherwise, if consume_func raised an exception for any reason *other*
than the alarm being fired, the still-pending alarm would have fired
later at some arbitrary point in the calling code.
We need two try…finally blocks in case the signal arrives just before
signal.alarm(0).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Replaced ImageOps.fit by ImageOps.pad, in zerver/lib/upload.py, which
returns a sized and padded version of the image, expanded to fill the
requested aspect ratio and size.
Fixes part of #16370.
SIGALRM is the simplest way to set a specific maximum duration that
queue workers can take to handle a specific message. This only works
in non-threaded environments, however, as signal handlers are
per-process, not per-thread.
The MAX_CONSUME_SECONDS is set quite high, at 10s -- the longest
average worker consume time is embed_links, which hovers near 1s.
Since just knowing the recent mean does not give much information[1],
it is difficult to know how much variance is expected. As such, we
set the threshold to be such that only events which are significant
outliers will be timed out. This can be tuned downwards as more
statistics are gathered on the runtime of the workers.
The exception to this is DeferredWorker, which deals with quite-long
requests, and thus has no enforceable SLO.
[1] https://www.autodesk.com/research/publications/same-stats-different-graphs
Currently, drain_queue and json_drain_queue ack every message as it is
pulled off of the queue, until the queue is empty. This means that if
the consumer crashes between pulling a batch of messages off the
queue, and actually processing them, those messages will be
permanently lost. Sending an ACK on every message also results in a
significant amount lot of traffic to rabbitmq, with notable
performance implications.
Send a singular ACK after the processing has completed, by making
`drain_queue` into a contextmanager. Additionally, use the `multiple`
flag to ACK all of the messages at once -- or explicitly NACK the
messages if processing failed. Sending a NACK will re-queue them at
the front of the queue.
Performance of a no-op dequeue before this change:
```
$ ./manage.py queue_rate --count 50000 --batch
Purging queue...
Enqueue rate: 10847 / sec
Dequeue rate: 2479 / sec
```
Performance of a no-op dequeue after this change (a 25% increase):
```
$ ./manage.py queue_rate --count 50000 --batch
Purging queue...
Enqueue rate: 10752 / sec
Dequeue rate: 3079 / sec
```
Part of #16094.
Moved the language selection preference logic from home.py to a new
function in i18n.py to avoid repetition in analytics views and home
views.
For users who are not authenticated, we don't need to 2fa them,
we only need it once they are trying to login.
Tweaked by tabbott to be much more readable; the new style might
require new test coverage.
We add a new wildcard_mention_policy setting to handle wildcard
mentions in large streams, with a wide range of policies available to
organizations.
We set the default to the safe option for preventing accidental spam:
only stream administrators being able to use wildcard mentions in
large streams.
This prevents the memcached connection from being shared across
multiple processes, and hopefully addresses unexpected behavior from
cached functions like get_user_profile_by_id invoked inside the worker
processes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We call build_message_send_dict from check_message instead of
do_send_messages.
This is a prep commit for adding a new setting for handling
wildcard mentions in large streams.
We extract the loop for building message dict in
do_send_messages in a separate function named
build_message_send_dict.
This is a prep commit for moving the code for building
of message dict in check_message.
There is a bug where we send event for even
those messages which do not have embedded links
as we are using single set 'links_for_embed' to
check whether we have to send event for
embedded links or not.
This commit fixes the bug by adding 'links_for_embed'
in message dict itself and send the event only
if that message has embedded links.
As explained in the previous commit, yamole preprocessed allOf with an
algorithm that is not standards compliant. We replicate that
algorithm, but importantly, we only use it for our own code and not
for building the openapi_core RequestValidator.
This improves the time taken by OpenAPISpec().check_reload() from
1.69s to 0.53s, nearly all of which is inside
openapi_core.create_spec.
Closes#10484. Significantly improves #16068.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
yamole preprocesses our schema by naïvely merging all the objects in
an allOf array together, but this fails to capture the meaning of
allOf according to the OpenAPI specification. allOf is supposed to be
a strict logical intersection of each subschema interpreted
independently. It does not combine their properties maps before
interpreting additionalProperties. So according to the old definition
of JsonSuccess, every response is invalid:
allOf:
- additionalProperties: false
properties:
result:
type: string
- required:
- result
- msg
properties:
msg:
type: string
because the first subschema disallowed msg and the second subschema
required msg.
To fix this, whenever we use allOf for schema “inheritence”, the base
schema must not specify additionalProperties, and the child schema
must explicitly list all properties recursively inherited from the
base schema in any subschema that uses additionalProperties.
Fixes#16109.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit removes the unnecessary comment which was added in
9454683108, when we were using message.get() for keys which
were also passed as args in do_send_messages, but there are no
such keys in the current code.
This commit removes the unnecessary line of code to get
rendered_content from message dict sent by check_message
when it actually does not inlcude 'rendered_content' key.
This line was added in 9454683108, but now we do not send
rendered_content in the message dict as we render the message
in do_send_messages itself.
A later commit alters `authenticate` of EmailAuthBackend to
add a store `needs_to_change_password` variable to session
which is useful to insist users on changing their weak password.
The tests start failing with that change because client.login()
runs `authenticate` without a `request` object. So, this commit
sends a request object with `request.session=self.client.session`
to self.client.login() in tests wherever needed.
We previously used to to redirect to config error page with
a different URL. This commit renders config error in the same
URL where configuration error is encountered. This way when
conifguration error is fixed the user can refresh to continue
normally or go back to login page from the link provided to
choose any other backend auth.
Also moved those URLs to dev_urls.py so that they can be easily
accessed to work on styling etc.
In tests, removed some of the asserts checking status code to be 200
as the function `assert_in_success_response` does that check.
We now no longer define any schemas in test_events--all
of them are in event_schema, which helps our tooling
cross-check schemas for openapi and node tests.
It happens that whether you add a reaction or remove
a reaction, we send the exact same fields, just using
a different op code.
This sort of symmetry is actually kind of rare, as
usually "add" events have more fields, and "remove" events
might just send an id of something to remove.
Our openapi schema treats these as two seperate events,
so we are more consistent with it, and it helps our
schema-checking tooling for node fixtures, too.
Note that we now have to exempt the two events from
our openapi checks, due to the is_mirror_dummy field
in the deprecated user block. We can decide how to
handle this later--one possibility is to just add it
as an optional field on the event_schema side.
Note that we use value_type for value instead of
bool, since properties can be non-bool things
like color, which we just don't test now. We
should test them.
We more than compensate for this by checking
the actual value of the value in
check_subscription_update.
There is a legacy format where we send
singular "message_id" instead of plural
"message_ids".
Then there are different fields for "private"
and "stream" message types.
Note that we make the schema for profile_data
slightly more realistic, but it doesn't actually get
exercised by our current tests (apart from
making sure it's a dict), since we don't have
profile data for our test realm.
We also don't have the optional fields for bots,
since our tests don't exercise that, nor
delivery_email.
So we exempt realm_user_add_event from openapi
checks for now.
When we try to match the openapi specs better, we
will probably want to add a few tests to test_events.
Obviously getting good coverage for adding users
would be nice for all these scenarios:
* delivery_email matters
* bots
* realm has profile fields
This is a prep commit for supporting "presence"
events, where the key of the dictionary is some
arbitrary string like "website" but the value
of the dictionary is another dictionary itself
with keys that are more like variable names.
This also forces us to create TupleType.
We exempt this from the openapi check,
since we haven't figured out how to model
tuples in openapi with the same precision
as event_schema (and it may be impossible).
Long term we just want to stop dealing in
tuples, of course.
StringDict is a data type for representing dictionaries where
all keys and values are strings. Add this data type to data_types.py
and edit other files so that this data type is put to use and tested.
(slightly tweaked by @showell to remove a comment and shorten
a var name now that we have a proper data type)
We also make our schema in event_schema reflect this,
which in turn makes us match the already accurate
openapi spec, so we no longer need to exempt four
types of events from our sanity checks.
We might want to rename the tool to something more
general now, since we are really reconciling three
things:
- node fixtures
- event_schema checkers for test_events
- openapi specs
The way we compare python and openapi schemas is
as follows:
- first convert openapi schemas to be build
from DictType, ListType, etc. with from_opeapi
- do a diff on the schemas
Most of the new code is just having the FooType
family of classes serialize themselves with schema().
Defining types with an object hierarchy
of type classes will allow us to build
functionality that was impossible (or
really janky) with the validators.py
approach of composing functions.
Most of the changes to event_schema.py
were automated search/replaces.
This patch doesn't really yet take
advantage of the new FooType classes,
but we will use it soon to audit our
openapi specs.
Even before GDPR changes, it was strange that we displayed
users differently for fork events vs. all other events.
After GDPR, we don't even get the `username` field any
more.
So now we simply use `display_name` if available, and then
we try `nickname`.
See https://developer.atlassian.com/cloud/bitbucket/bitbucket-api-changes-gdpr/
for more context.