When pulling batches out of the ScheduledEmail list in a single
transaction, an unexpected failure to send an email will result in the
whole batch getting retried. This will result in infinite email
sending loops.
Pull a single row off at a time and send it. We continue without
retries to the next email on EmailNotDeliveredException, but will
retry infinitely on other exceptions.
Fixes: #20943.
Putting all of the logic in a `finally` block is equivalent to a bare
`except` block, which silently consumes all exceptions.
Move only the most-necessary parts into the except; this lets
`BadImageError` exceptions from `zerver/lib/upload.py` to escape,
allowing better the generic "Image file upload failed" to be replaced
with a more specific message.
It also allows unexpected exceptions, as the previous commit resolved,
to escape and 500. This lets them be detected and resolved, rather
than give users a silently bad experience.
5dab6e9d31 began honoring the list of disposals for every frame.
Unfortunately, passing a list of disposals for a non-animated image
raises an exception:
```
File "zerver/lib/upload.py", line 212, in resize_emoji
image_data = resize_gif(im, size)
File "zerver/lib/upload.py", line 165, in resize_gif
frames[0].save(
File "[...]/PIL/Image.py", line 2212, in save
save_handler(self, fp, filename)
File "[...]/PIL/GifImagePlugin.py", line 605, in _save
_write_single_frame(im, fp, palette)
File "[...]/PIL/GifImagePlugin.py", line 506, in _write_single_frame
_write_local_header(fp, im, (0, 0), flags)
File "[...]/PIL/GifImagePlugin.py", line 647, in _write_local_header
disposal = int(im.encoderinfo.get("disposal", 0))
TypeError: int() argument must be a string, a bytes-like object or a
number, not 'list'
```
`check_add_realm_emoji` calls this as:
```
try:
is_animated = upload_emoji_image(image_file, emoji_file_name, a
uthor)
emoji_uploaded_successfully = True
finally:
if not emoji_uploaded_successfully:
realm_emoji.delete()
return None
# ...
```
This is equivalent to dropping _all_ exceptions silently. As such,
Zulip has silently rejected all non-animated images larger than 64x64
since 5dab6e9d31.
Adjust to only pass a single disposal if there are no additional
frames. Add a test for non-animated images, which requires also
fixing the incidental bug that all GIF images were being recorded as
animated, regardless of if they had more than 1 frame or not.
The deferred_work events can't be processed in the test suite
environment, since it tries to make requests to the host <realm.uri>
which is "zulip.testserver" and obviously not going to work. Since the
test suite base data set has no animated emoji, there's nothing to do,
and we can skip this step.
This code only runs when applying the migration to an already
provisioned test db - because during an initial db set up there are no
realms yet, so no events get pushed by the migration.
The previous random strategy for picking 5 emails would result in
collisions in 1 out of 10000.35 tests, leading to
psycopg2.errors.UniqueViolation.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
For aliases that will no longer be listed, see the third column of
grep '^L ' zulip-py3-venv/lib/python3.*/site-packages/pytz/zoneinfo/tzdata.zi
Time zones previously set to an alias will be canonicalized on demand.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
A recent Postgres upstream release appears to have broken PGroonga.
While we wait for https://github.com/pgroonga/pgroonga/issues/203 to
be resolved, disable PGroonga in our automated tests so that Zulip
CI passes.
Sometimes we may get data to import, due to export bugs, malformed data
etc., which doesn't have the invariant of RealmEmoji.author always being
set. The import code should fix that, by choosing a reasonable default
and setting it.
There doesn’t seem to be a reason to override this, and the upstream
method it was based on has diverged since this was written.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Although our NonClosingPool prevents the SQLAlchemy connection from
closing the underlying Django connection, we still want to properly
dispose of the associated SQLAlchemy structures.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes these warnings with SQLALCHEMY_WARN_20=1:
RemovedIn20Warning: Using non-integer/slice indices on Row is
deprecated and will be removed in version 2.0; please use
row._mapping[<key>], or the mappings() accessor on the Result
object. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
RemovedIn20Warning: Using the 'in' operator to test for string or
column keys, or integer indexes, in a :class:`.Row` object is
deprecated and will be removed in a future release. Use the
`Row._fields` or `Row._mapping` attribute, i.e. 'key in
row._fields' (Background on SQLAlchemy 2.0 at:
https://sqlalche.me/e/b8d9)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes this warning with SQLALCHEMY_WARN_20=1:
RemovedIn20Warning: The legacy calling style of select() is deprecated
and will be removed in SQLAlchemy 2.0. Please use the new calling
style described at select(). (Background on SQLAlchemy 2.0 at:
https://sqlalche.me/e/b8d9)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes “SADeprecationWarning: The Select.column() method is deprecated
and will be removed in a future release. Please use
Select.add_columns() (deprecated since: 1.4)”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes “SADeprecationWarning: Implicit coercion of SELECT and textual
SELECT constructs into FROM clauses is deprecated; please call
.subquery() on any Core select or ORM Query object in order to produce
a subquery object.”
Signed-off-by: Anders Kaseorg <anders@zulip.com>
match_querystring is irrelevant in these cases. Fixes this warning:
/srv/zulip-py3-venv/lib/python3.7/site-packages/responses/__init__.py:340:
DeprecationWarning: Argument 'match_querystring' is deprecated. Use
'responses.matchers.query_param_matcher' or
'responses.matchers.query_string_matcher'
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The tool needs to run this function, since it uses django's send_email
directly instead of going through our zerver.lib.send_email.send_email
codepath.
The message ID is encoded in the URL, not the PATCH parameters, so
this argument was ignored. I verified that it appears to have always
matched the value present in the URL.
The new logic better matches reasonable user expectations, that if you
move all the messages, that's a whole-topic move, regardless of which
propagation mode you selected.
When moving only part of a topic, it's useful to display that
information to users in these notifications so that it's clear what's
happening.
The most important consequence is actually just increasing confidence
that when you see that the whole topic was moved, that's accurate.
Substantially modified by tabbott.
Fixes#20575.
To avoid an uncaught IntegrityError causing a 500 HTTP response in a
race between two processes trying to mute a topic, we catch the
integrity error and raise the error exception with status 400 we'd
have gotten if the second request had been a bit later.
Fixes#21011.
The S3 backend implementation of upload_emoji_image was accessing
emoji_file.name - which is redundant because emoji_file_name already
gets passed in and can be used, and an object of type IO[bytes] may not
have the .name attribute. Spotted by @Fingel.
Fixes#20132.
EMAIL_HOST_USER without EMAIL_HOST_PASSWORD is not going to be a valid
configuration, and may result from making mistake in correctly setting
it in the secrets file and end up being a non-obvious cause of failure
to send email. Logging an error will be useful for detecting it. Further
conditions can be added to the function in the future.
Calling `get_apns_context` opens (and caches) an open connection to
the APNs servers. Since `apns_enabled` is called from Django
codepaths, this means that the Django processes hold unnecessary
connections open to the APNs servers.
Switch `apns_enabled` to checking what `get_apns_context` checks when
we're just returning True/False.
aioapns 2.1 removed the loop parameter from the aioapns.APNs
constructor, because Python 3.10 removed the loop parameter from the
asyncio.Lock constructor.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The new release adds the commit:
20ac22b96d
Which allows us to get rid of the entire ugly override that was needed
to do this commit's job in our code. What we do here in this commit:
* Use django-scim2 0.17.1
* Revert the relevant parts of f5a65846a8
* Adjust the expected error message in test_exception_details_not_revealed_to_client
since the message thrown by django-scim2 in this release is slightly
different.
We do not have to add anything to set EXPOSE_SCIM_EXCEPTIONS, since
django-scim2 uses False as the default, which is what we want - and we
have the aforementioned test verifying that indeed information doesn't
get revealed to the SCIM client.
I incorrectly removed this when simplifying
dbddbee5a115b9352862cb13d4c66820865c30b6; while that commit did not
require the hunk re-added here, the later commit
3be622ffa7 added a call that did require it.
Adds request as a parameter to json_success as a refactor towards
making `ignored_parameters_unsupported` functionality available
for all API endpoints.
Also, removes any data parameters that are an empty dict or
a dict with the generic success response values.
As a preparatory step to refactoring json_success to accept
request as a parameter, change `do_report_error`, which is
called from the events queue for "error_reports", to return
None instead of json_success.
Adds an assertion error to `ErrorReporter` queue processor
and removes `JsonableError` from `do_report_error`.
It is likely that `do_error_report` was moved from a view in a
previous refactor, but was not updated to no longer return an
HttpReponse.
As a preparatory step to refactoring json_success to accept
request as a parameter, update helper function `compose_views`
in `views.streams.py` to return the response data and call
json_success from view functions that utilize `compose_views`.
Also, updates related test in `zerver.tests.test_subs.py`.
As a preparatory step to refactoring json_success to accept
request as a parameter, change interface of helper functions:
`handle_deferred_message` in `views.message_send.py` and
`mute_topic` and `unmute_topic` in `views.muting.py`, so
that they return None or data for json_success.
Instead call json_sucess in the caller function, which already
has the HttpRequest as a parameter.
Adds a check for `additionalProperties: true` when there are no
properties listed in the schema.
This currently only happens in one place, but will be helpful for
deduplicating text between the `register-queue` and `get-events`
endpoints.
Wordle has recently become a thing and it uses green, yellow and white (or
black in dark mode) large square unicode characters to let people share their
gameplay. Zulip converts the white and black large square unicode characters to
emojis, but not the green and yellow ones. This causes the Wordle grid to be
misaligned when shared on Zulip.
This commit adds green and yellow large square emojis to our emoji list to fix
the problem.
Previously, we only checked mandatory_topics setting before
sending message in frontend and there was no restriction in
backend. This commit adds the check in backend also making
sure messages without topic cannot be sent through API as
well if mandatory_topics setting is set to True.
Previously, users found it annoying that the automated "Resolve topic"
notifications triggered an unread for everyone in the stream; this
discouraged some users from using the feature on older threads for
fear of being annoying. We change this to a better default, of only
users who participated in the topic (via either messages or reactions)
being eligible for the new message being unread.
We will likely want to create global and stream-level notifications
settings to control this behavior as a follow-up -- some users, like
me, might prefer the simpler "Always unread" behavior in some streams.
Note that the automated notifications that a topic was resolved will
still result in the topic being moved to the top of the left sidebar.
This would be somewhat difficult to change, since the left sidebar
algorithm just looks at the highest message ID in the topic.
Fixes#19709.
Tests added by Aman Agrawal (amanagr@zulip.com).
In English, compound adjectives should essentially always be
hyphenated. This makes them easier to parse, especially for users who
might not recognize that the words “web public” go together as a
phrase.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
There are three user settings that are integer enums:
`color_scheme`, `demote_inactive_streams` and
`desktop_icon_count_displays`.
Unlike the other user settings, these were using the `content`
keyword instead of the `schema` keyword in their definitions,
which caused them not to be rendered correctly in the api
documentation.
Changes the keyword to `schema` and fixes the indentation
for these three user settings in the two endpoints using
them.
Removes various instances of quotation marks that are not needed,
specifically looking at instances of arrays, e.g. `- "`, in the
OpenAPI documentation.
The target realm was not being passed to create_attachment in
upload_message_file implementations. This was a bug in the edge-case of
cross-realm messages - in particular, causing a bug in the email
gateway:
When an email with an attachment is sent, the message is mirrored to
Zulip with Email Gateway Bot as the message sender and uploader of the
attachment. Due to the realm not being passed to create_attachment, the
Attachment would get created with .realm being the system bot realm,
making the attachment inaccessible under some conditions due to failing
the following condition check (that's expected to pass, provided that
the .realm is set correctly):
```
if (
attachment.is_realm_public
and attachment.realm == user_profile.realm
and user_profile.can_access_public_streams()
):
# Any user in the realm can access realm-public files
return True
```
Fixes the rendering of enums to show strings with quotation marks,
while integers will continue to be rendered without quotation marks.
This allows for an empty string to be passed as an enum value and be
rendered as such in the documentation. Null will be rendered without
quotation marks, like integer values.
Makes `edit_timestamp` and `user_id` required fields for all
`update_message` events.
Adds `rendering_only` as another required field to signal if
events are only updating the rendered content of the message,
which is currently the case for adding inline url previews.
Updates `test_event.py` so that `do_update_message` and
`do_update_embedded_data` refer to the same testing schema
for `update_message` events, and therefore reflect the same
required fields for the `update_message` event.
The OpenAPI definition for `update_message` events is also
updated to reflect the required field and descriptions of
various properties are updated for the addition of the
`rendering_only` property.
Moves details about the rate limit error object and handling to
the OpenAPI documentation description for that common error.
Previously, this information was on the general rest error
handling documentation page without clear connection to the
specific rate limit error.
Fixes a typo in the changelog (feature 36) for that same error
and also fixes a misplaced colon in the description of the error
for missing request parameters.
Adds detailed definition of objects in the `subscription_data` parameter
array for the `/update-subscription-settings` endpoint.
Fixes#20825. Follow-up to #20409.
Formats and moves whether a field of an object in a request
parameter is required or optional to be in the same location
and have the same formatting as the general api parameter
documentation.
Also formats any examples within the object detailed
description to be the same as the general api parameter
documentation.
Follow up to #20409.
Adds a line break before the descriptive text for return
values and events in the api documentation in order to
help with readability of descriptions with multiple
paragraphs of descriptive text.
Adjustments made to the CSS of list items in unordered
lists to visually group the first paragraph of text
to any following paragraphs or unordered lists.
As a consequence:
• Bump minimum supported Python version to 3.7.
• Move Vagrant environment to Debian 10, which has Python 3.7.
• Move CI frontend tests to Debian 10.
• Move production build test to Debian 10.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
regarding -
POST https://yourZulipDomain.zulipchat.com/api/v1/users/me/subscriptions
The definition of the "subscription" parameter didn't include full
information about the parameter. It only said that an array of objects
is passed as a parameter, and relied on description of the parameter
to explain what the object contained. I edited the definition to contain
the full information about the object.
Fixes#20824.
The change to curl_param_value_generators.py warrants a brief
explanation. Stream permission changes now generate a notification
message. Our curl example test for removing a reaction comes after
the two tests for updating the stream permission changes, thus the
hardcoded message ID in that test needs to be incremented by 2 to
account for the two notification messages that now come before it.
This is a part of #20289.
do_make_stream_web_public and do_change_stream_invite_only seem
to contain very similar logic that could just live inside the
do_change_stream_permission function that handles all permission
changes in one place.
queue_client.queues does not list all the queues that exist on the
server (you can’t do that over AMQP); the condition "test_suite" in
queue_client.queues was always false. So the test_suite queue could
accumulate extra messages that broke test_queue_error_json.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This also fixes a warning from
RealmExportTest.test_endpoint_local_uploads: “ResourceWarning:
unclosed file <_io.BufferedReader
name='/srv/zulip/var/…/test-export.tar.gz'>”.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Adds a check for object in parameter type that will render the details
of the object in the parameter description if they are in the object
definition in the OpenAPI documentation.
Fixes#19424.
revoke_invites_generated_by_user should send invites_changed event if it
actually revokes some invitations. This is called in the user
deactivatoin codepath.
Event of type "realm_user", op "remove", emitted by do_deactivate_user
should remove the user id from subscriptions in the state. We weren't
catching this bug, because test_do_deactivate_bot uses a newly created
bot, so no stream subscriptions are affected. The bug shows up if
deactivating e.g. cordelia - thus we want to have two tests instead,
one for testing bot deactivation and one for user deactivation.
We now use recipient_id % 24 for new stream colors
when users have already used all 24 of our canned
colors.
This fix doesn't address the scenario that somebody
dislikes one of our current canned colors, so if a
user continually changes canned color N to some other
color for new streams, their new streams will continue
to include color N (and the user will still need to
change them).
This fix doesn't address the fact that it can be expensive
during bulk-add situations to query for all the colors
that users have already used up.
See https://chat.zulip.org/#narrow/stream/3-backend/topic/assigning.20stream.20colors
for more discussion.
The limit here is purely to prevent breakage in case of a pathological
number of images in a single message; 5 images is entirely possible in
a reasonable message, and causes user confusion when they are not
expended.
Increase the limit to 10 per message.
Django 3.2 expects a list, and Django 4.1 will require one. Fixes
“RemovedInDjango41Warning: Using a boolean value for
requires_system_checks is deprecated. Use '__all__' instead of True,
and [] (an empty list) instead of False.”
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This was deprecated in Django 3.1 for being jQuery-specific, and
removed in Django 4.0. Replicate the jQuery-specific check.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
oneOf with two identical branches (modulo example) is a bug because
oneOf means exclusive or. It’s also a totally inappropriate kludge
for encoding multiple examples. The OpenAPI specification provides a
perfectly good standard way to do that:
https://spec.openapis.org/oas/v3.0.3#example-object
However, we don’t handle that in our OpenAPI documentation generator
yet, so for now just merge the examples.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This was a oneOf with two identical branches modulo example, which is
always a bug because oneOf means exclusive or. But the example for
the first branch did not fit the schema for AddSubscriptionsResponse,
which is a subset of JsonSuccessBase.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Although allOf is often used to indicate inheritance, its semantics
are that of a plain set intersection. The intersection of a nullable
property with a non-nullable property is a non-nullable property.
Therefore, if we want an inherited property to remain nullable, we
need to mark it as such.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.
Commit e7ed907cf6 (#18174) fixed this
before, but new instances have been added.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We do not accept heterogeneous arrays containing both user ids and
email addresses.
This also happens to disallow an empty array, which is fine since the
principals parameter should be omitted if the default to the calling
user is desired.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes “DeprecationWarning: 'jinja2.Markup' is deprecated and will be
removed in Jinja 3.1. Import 'markupsafe.Markup' instead.”
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Updates regex in the openapi markdown extension to match api
endpoint names that contain dashes, which is the case for
`zulip-outgoing-webhook` and `rest-error-handling`.
The subscriber list was not updating without a refresh on
reactivating user, because the subscriptions data with the
client was not updated on reactivation.
This commit adds code to send peer_add subscription events
on reactivating the user.
We do not send peer_remove events on deactivating the user,
but the subscriber list is still live-updated because we
have the data of the streams which the deactivated user is
susbcribed to and the clients itself updates the data and UI
on receiving event of deactivation of user, which it is not
possible when reactivating the user.
Fixes#20383.
Leaving old invitations valid, potentially for a very long time, is
clearly unexpected and undesired behavior under normal circumstances. A
user shouldn't be able to e.g. generate a multiuse invite link, get
banned from the organization by being deactivated and then just re-join
using the link they've created for themselves.
do_revoke_user_invite and do_revoke_multi_use_invite were using objects
after their deletion to pass the argument to notify_invites_changed. We
should avoid that. The function was only using the .realm attribute of
the received objects, so it's simpler to make it just take realm as its
argument.
The msg parameter is a string to be displayed when the expected
exception wasn’t raised, not a pattern to match against the raised
exception’s message.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Under the unicodedata distributed with Python 3.6, some Emoji are
classified as `Cn`, and not `So`:
```
$ unicode 1f929 --long
U+1F929 GRINNING FACE WITH STAR EYES
UTF-8: f0 9f a4 a9 UTF-16BE: d83edd29 Decimal: 🤩 Octal: \0374451
🤩
Category: So (Symbol, Other); East Asian width: W (wide)
Unicode block: 1F900..1F9FF; Supplemental Symbols and Pictographs
Bidi: ON (Other Neutrals)
$ python3.6 -c 'import unicodedata; print(unicodedata.category("\U0001f929"))'
Cn
$ python3.7 -c 'import unicodedata; print(unicodedata.category("\U0001f929"))'
So
```
Drop `Cn` from the list of excluded Unicode character classes, and
replace it with an explicit list of the 66 non-characters, which are
invariant.
Co-authored-by: Shlok Patel <shlokcpatel2001@gmail.com>
An explanatory note on the changes in zulip.yaml and
curl_param_value_generators is warranted here. In our automated
tests for our curl examples, the test for the API endpoint that
changes the posting permissions of a stream comes before our
existing curl test for adding message reactions.
Since there is an extra notification message due to the change in
posting permissions, the message IDs used in tests that come after
need to be incremented by 1.
This is a part of #20289.
Prior to this commit, we wrapped all incoming messages from Slack
in backticks. This led to weird formatting errors when an incom-
ing message from Slack contains backticks, to refer to a function
name, for instance.
Moves `flags` field to top part of object description because
it is always included in the event.
If a field is present only for certain types of message updates,
the description begins by stating when the field is present:
"Only present if ...".
These fields are organized by the type of message update:
stream, stream and/or topic, topic, content.
If a field is not present due to a special event, the description
ends by stating when the field is not present:
"Not present if ...".
Adds documentation for fields currently required to be returned
with any `update_message` event.
do_delete_users had two bugs:
1. Creating the replacement dummy users
with active=True
2. Creating the replacement dummy users with email domain set to
realm.uri, which may not be a valid email domain.
Prior commits fixed the bugs, and this migration fixes the pre-existing
objects.
Otherwise the dummy user can be created with an invalid email domain -
e.g. in development environment with the domain
"@http://localhost:9991". get_fake_email_domain exists exactly for
handling these kinds of scenarios.
Stop using `access_user_group_by_id` in notifications codepaths, as it
is meant to be used to check for _write_ access, not read
access (which is not limited). In the notification codepaths, there
are no ACLs to apply, and the ID is known-good; just load it
directly. The `for_mention` flag is removed, as it was not used in the
mention codepaths at all, only the notification ones.
get_remote_server_by_uuid (called in validate_api_key) raises
ValidationError when given an invalid UUID due to how Django handles
UUIDField. We don't want that exception and prefer the ordinary
DoesNotExist exception to be raised.
APNs payloads nest the zulip-custom data further than the top level,
as Android notifications do. This led to APNs data silently never
being truncated; this case was not caught in tests because the mocks
provided the wrong data for the APNs structure.
Adjust to look in the appropriate place within the APNs data, and
truncate that.
This replaces the temporary (and testless) fix in
24b1439e93 with a more permanent
fix.
Instead of checking if the user is a bot just before
sending the notifications, we now just don't enqueue
notifications for bots. This is done by sending a list
of bot IDs to the event_queue code, just like other
lists which are used for creating NotificationData objects.
Credit @andersk for the test code in `test_notification_data.py`.
As explained in the comments in the code, just doing UUID(string) and
catching ValueError is not enough, because the uuid library sometimes
tries to modify the string to convert it into a valid UUID:
>>> a = '18cedb98-5222-5f34-50a9-fc418e1ba972'
>>> uuid.UUID(a, version=4)
UUID('18cedb98-5222-4f34-90a9-fc418e1ba972')
This diff looks slightly noisy, but the main chunk of
code that we moved here has the same logic as before,
and it just gets realm_id from MentionBackend now, instead
of having our markdown processor have to supply it.
We basically want MentionData to be the gatekeeper of
mention data, and then we delegate backend tasks to
MentionBackend.
Soon we will add a cache to MentionBacked, which will
justify this change a bit more.
We now make it mandatory to pass in the Realm object.
If this function was ever called with None, I am scared
to know what the expected results were at the time of
writing.
It's slightly annoying to plumb Optional[MentionBackend]
down the stack, but it's a one-time change.
I tried to make the cache code relatively unobtrusive
for the single-message use case.
We should be able to eliminate redundant stream queries
using similar techniques.
I considered caching at the level of rendering the message
itself, but this involves nearly as much plumbing, and
you have to account for the fact that several users on
your realm may have distinct default languages (French,
Spanish, Russian, etc.), so you would not eliminate as
many query hops. Also, if multiple streams were involved,
users would get slightly different messages based on
their prior subscriptions.
When our handlers specifically reference self.md.zulip_db_data,
we now use an explicit type.
We probably want a more robust solution here, such as a semgrep
rule.
We now serialize still_url as None for non-animated emojis,
instead of omitting the field. The webapp does proper checks
for falsiness here. The mobile app does not yet use the field
(to my knowledge).
We bump the API version here. More discussion here:
https://chat.zulip.org/#narrow/stream/378-api-design/topic/still_url/near/1302573
Appending to bytes in a loop leads to a quadratic slowdown since
Python doesn’t optimize this for bytes like it does for str.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
While accepting an invitation from a user, there was no condition in
place to check if the user sending the invitation was now
now-deactivated.
Skip sending notifications about newly-joined users to users who are
now disabled.
Fixes#18569.
We don't have to go to the database to get the Recipient
fields for `user_profile.recipient`.
See also 85ed6f332a from a little
over a year ago--it's very similar.
The bug here probably didn't come up too much in
practice, but if we were adding a user to multiple
streams when they already had used all N available
colors, all the new streams would be assigned the same
color, since the size of used_colors would stay at N,
thwarting our little modulo-len hackery.
It's not a terrible bug, since users can obviously
customize their stream colors as they see fit.
Usually when we are adding a user to multiple streams,
the users are fairly new, and thus don't have many
existing streams, so I have never heard this bug
reported in the field.
Anyway, assigning the colors in bulk seems to make more
sense, and I added some tests.
For the situations where all the colors have already
been used, I didn't put a ton of thought into exactly
which repeated colors we want to choose; instead, I
just ensure they're different modulo 24. It's possible
that we should just have more than 24 canned colors, or
we should just assign the same default color every time
and let users change it themselves (once they've gone
beyond the 24, to be clear). Or maybe we can just do
something smarter here. I don't have enough time for a
deep dive on this issue.
Part of our codepath for subscribing users involves
fetching the users' existing subscriptions to make sure
we can do things like properly report to the clients
that the users were already subscribed. This codepath
used to be coupled to code that helped users maintain
unique stream colors.
Suppose you are creating a new stream, and you are
importing users from an older stream with 15k
subscribers, and each of your users is subscribed to
about 20 streams.
The prior code, instead of filtering on recipient_id,
would literally look at every subscription for every
user, which was kind of crazy if you didn't understand
the pick-stream-color complications.
Before this commit, we would fetch 300k rows with 15
columns each (granted, all but one of the columns are
bool/int). That's a total of 4.5 million tiny objects
that we had to glom into Django ORM objects and slice
and dice.
After this commit, we would fetch exactly zero rows
for the are-they-already-subscribed logic.
Yes, ZERO.
If we were to mistakenly try to re-add the same 15k
subscribers to the new stream (under the new code), we
will now fetch 15k Sub rows instead of 300k.
It is worth looking at the prior commit. We go through
great pains to ensure that users get new stream colors
when we invite them to a stream, and we still fetch a
bunch of data for that. Instead of 4.5 million cells,
it's more like 600k cells (2 columns per row), and it's
less than that insofar as some users may only
have 24 distinct colors among their many streams.
It's a lot of work.
This commit sets us up for the next commit, which will
save us a very expensive query.
If you are adding 15k users to a stream, and each user
has about 20 existing streams, then we need to retrieve
300k rows from the database to figure out which stream
colors they already have. We don't need all the extra
fields from Subscription, so now we get just the two
values we need for making a color map.
In the next commit we'll eliminate the other use case
for the big query, and I will explain in greater
depth how splitting out the color-picking code can
be a huge win. It is possible that some product decisions
could make this codepath easier. We could also do some
engineering specific to stream colors, such as caching
which colors users have already used.
This does cost us an extra round trip to the database.
Having the `wildcard_mentions_notify` setting turned on does
not necessarily mean that the user will receive notification
for that message. There is more nuance to this, as explained
in the updated comment.
We recently ran into a payload in production that didn't contain
an event type at all. A payload where we can't figure out the event
type is quite rare. Instead of letting these payloads run amok, we
should raise a more informative exception for such unusual payloads.
If we encounter too many of these, then we can choose to conduct a
deeper investigation on a case-by-case basis.
With some changes by Tim Abbott.
Given that these values are uuids, it's better to use UUIDField which is
meant for exactly that, rather than an arbitrary CharField.
This requires modifying some tests to use valid uuids.
We avoid repeating the same calculations over and
over again for the same stream.
This helps, but the real bottleneck in this function
is that send_event usually takes at least a millisecond,
and that adds up quickly if you're doing something
like subscribing 5k users to a new stream.
GIF files can be `.GIF`, and also we determine the file format by
inspecting the image data, so there's no reason to have this
assertion.
(The code for serving still images does not rely on the file being a
GIF.)
Have kept process_new_human_user out of
the atomic block because it involves many
different operations and also sends events.
Tried enclosing event in on_commit but that
would need many changes in the tests, so have
skipped it for now.
Updates testing helpers in `event_schema.py` for `do_update_message` so
that all stream message fields are present in any edits / updates to
stream messages. Adds verfication tests of events returned from private
message edits and from stream message content-only and topic-only edits.
Updates the `update_message` event type to always include a `stream_id`
field when the message being edited is a stream message. This change
aligns with the current definition of the `\get-events` endpoint
in the OpenAPI documentation.
It is better to press on, than stop halfway through due to a user
whose email no longer works. The exception is already logged, which
is sufficient here, as this is generally run interactively.
These fundamentally tested send_email, not build_email, and thus
belong in TestSendEmail, not TestBuildEmail. They also duplicated the
code in test_send_email_exceptions; reuse it.
This allows verify_uploads to use the database
as the authoritative source for what attachments
we need to look for when we're verifying the
images got exported properly, while still
also verifying attachment.json is correct.
It is better for the verifying code to just explicitly
ensure that the exported file bytes match the bytes
in the test image. This introduces a tiny bit more
of I/O.
It's easier to read the code without the intermediate
full_data dictionary that obscures where the files live.
We also avoid some unnecessary file i/o in the tests.
We do a sanity check for every table
that gets written to user.json as part of
the single-user export.
If we add more tables to the single-user export,
the test that I modified here will now ask
the author to add a new checker function, which
means we should always have at least a basic
sanity check for every exported table as long
as we stay in this new paradigm.
We also remove a little bit of old code that
became redundant.
This replaces the TERMS_OF_SERVICE and PRIVACY_POLICY settings with
just a POLICIES_DIRECTORY setting, in order to support settings (like
Zulip Cloud) where there's more policies than just those two.
With minor changes by Eeshan Garg.
We do s/TOS/TERMS_OF_SERVICE/ on the name, and while we're at it,
remove the assumed zerver/ namespace for the template, which isn't
correct -- Zulip Cloud related content should be in the corporate/
directory.
We now complain if a test author sends a stream message
that does not result in the sender getting a
UserMessage row for the message.
This is basically 100% equivalent to complaining that
the author failed to subscribe the sender to the stream
as part of the test setup, as far as I can tell, so the
AssertionError instructs the author to subscribe the
sender to the stream.
We exempt bots from this check, although it is
plausible we should only exempt the system bots like
the notification bot.
I considered auto-subscribing the sender to the stream,
but that can be a little more expensive than the
current check, and we generally want test setup to be
explicit.
If there is some legitimate way than a subscribed human
sender can't get a UserMessage, then we probably want
an explicit test for that, or we may want to change the
backend to just write a UserMessage row in that
hypothetical situation.
For most tests, including almost all the ones fixed
here, the author just wants their test setup to
realistically reflect normal operation, and often devs
may not realize that Cordelia is not subscribed to
Denmark or not realize that Hamlet is not subscribed to
Scotland.
Some of us don't remember our Shakespeare from high
school, and our stream subscriptions don't even
necessarily reflect which countries the Bard placed his
characters in.
There may also be some legitimate use case where an
author wants to simulate sending a message to an
unsubscribed stream, but for those edge cases, they can
always set allow_unsubscribed_sender to True.
These variables can be unset if the `os.path.exists` check fails.
That should be rare, since we've previously checked the files do
exist before getting here.
While races here are unlikely, it is most correct to enforce this
invariant at the database layer, and having a database-level
constraint makes the models file a bit more readable.
These are not considered to be "personal"
info, even if you upload them, so we
don't export them.
Generally the only folks who upload
these are admins, who can easily get
them in other ways. In fact, anybody
can get these via the app.
We now ensure that all message ids are sorted BEFORE
we split them into batches.
We now do a few extra "slim" queries to get message
ids up front.
But, now, when we divide them into batches, we no
longer run 2 or 3 different complicated queries in
a loop. We just basically hydrate our message ids,
so `write_message_partials` should be easy to reason
about.
This change also means that for tiny realms with
< 1000 messages you will always have just one
json file, since we aggregate the ids from the
queries before batching.
This accomplishes a few things:
* It extracts `chunkify` rather than having us
clumsily track chunking-related stuff in a
big loop that is doing other stuff.
* It makes it so that all message ids
in message-000001.json < message-000002.json.
* It makes it easier for us to customize
the messages we send to a single user
(coming soon).
BTW we probably have a slicker version of chunkify
somewhere in our codebase, but I couldn't remember
where.
Following b3c58f454f, we want to clean up
old topics that may contain the disallowed characters. The Message table
is large, so we go in batches, making sure we limit topic fetches and
UPDATE query to no more than BATCH_SIZE Message rows per query.
Now all file writes go through our three
helper functions, and we consistently
write a single log message after the file
gets written.
I killed off write_message_exports, since
all but one of its callers can call
write_table_data, which automatically
sorts data. In particular, our Message
and UserMessage data will now be sorted
by ids.
This probably just postpones the list creation until
Django builds the "IN" query, but semantically it's
good to work in sets where we don't have any
meaningful ordering of the list that gets used.
The immediate benefit of this is stronger mypy
checks (avoiding the ugly union caused by message
files).
The subsequent commit will add sorting.
We have test coverage on all these lines insofar
as if you comment out the lines, tests will
explode (i.e. more than superficial line
coverage).
The distinction here wasn't super meaningful
due to the way we order our "elif" statements,
but we want to reserver "normal_parent" for the
majority of use cases, where you simply tell
the Config what the "foreign_key" is.
For realm-wide exports, there is no reason to query
inefficiently against a list of modified users.
We move the Config out of the common child configs.
Even though Django usually treats foo__in
and foo_id__in identically for filters where
foo is a ForeignKey type, we want to insist
on somewhat more consistent syntax, because
we have the odd combo of type and type_id
in Recipient, where type_id is kinda like a
foreign key, but not a ForeignKey.
So we assert for now that all our include_rows
values end in "_id__in".
Zulip shows two guides on How to reply, first one by
the welcome bot and second one is intro_reply hotspot.
To simply and avoid redundancy, intro_reply hotspot is
removed.
Fixes#20482.
Force postgres to give reactions in ID order - which
is generally chronological order. Results in frontend
displaying reactions in said order.
Fixes#20060.
In many of our stream notification messages, we make use of the
same silent user mention syntax, the template for which was always
hardcoded. This commit adds a helper function that all relevant
callers can call to get the right syntax when mentioning users.
Thanks to Tim Abbott for this suggestion!
Removed existing empty narrow divs from app/home.html and created
a new javascript module to dynamically load empty narrow messages
using handlebar template.
Fixes#18797
The original intention of this was to prevent coding
errors with realm getters that don't, um, filter
on realm.
Unfortunately, you can still write a broken realm getter
that forgets to filter on realm, but which returns a
Set, and the new safeguards won't see any difference.
We could make all the getters return sorted lists
instead, but that's for another day.
This code does serve another purpose, which is to
prevet egregious bugs in the import itself.
The diff here is ugly, but to summarize:
BEFORE IMPORT:
define get_user_id
define get_huddle_hashes
AFTER IMPORT AND MAKING GETTERS:
check realm id
define assert_realm_values
verify emoji codes
check huddle hashes
We don't have automated test coverage on this yet,
but below are the results from manual testing.
Note that we include the realm icon and logo even
though they were not created by Cordelia.
./manage.py export_single_user cordelia@zulip.com
$ (cd /tmp/zulip-export-4v3mo802/ && find .)
.
./emoji
./emoji/2
./emoji/2/emoji
./emoji/2/emoji/images
./emoji/2/emoji/images/3.jpg
./emoji/records.json
./messages-000001.json
./realm_icons
./realm_icons/2
./realm_icons/2/night_logo.original
./realm_icons/2/night_logo.png
./realm_icons/2/icon.png
./realm_icons/2/icon.original
./realm_icons/records.json
./avatars
./avatars/2
./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.original
./avatars/2/c5125af0447f4d66ce34c1b32eac75ac27ebe0e7.png
./avatars/records.json
./uploads
./uploads/2
./uploads/2/68
./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt
./uploads/2/68/xyEkC5dTIp8m42_6HJ3kBfdt/denver.jpg
./uploads/2/96
./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim
./uploads/2/96/ol5WE6RTUntvuPDSpJUrYTim/denver.jpg
./uploads/records.json
./user.json