The commit:
1. Adds the new field as nullable.
2. Adds code that'll create new Confirmation with the field set
correctly.
3. For verifying validity of Confirmation object this still uses the old
logic in get_object_from_key() to keep things functioning until we
backfill the old objects in the next step.
Thus this commit is deployable. Next we'll have a commit to run a
backfill migration.
An integer or no argument is supposed to be passed.
These weren't caught by mypy because booleans are integers in python,
see https://github.com/python/mypy/issues/1757
This will be used to check if the narrow being requested by
spectator requires authentication without requesting the server.
Having this check locally, makes this process look snappy to
the user and doesn't result in 404s in the browser log.
aioapns already has a retry loop. By default it retries forever on
ConnectionError and ConnectionClosed, so our own retry loop would
never be reached. Remove our retry loop, and configure aioapns to
retry APNS_MAX_RETRIES times on ConnectionError like the previous
version did. It still retries forever on ConnectionClosed; that’s not
configurable but probably fine.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This utilizes the generic `BaseNotes` we added for multipurpose
patching. With this migration as an example, we can further support
more types of notes to replace the monkey-patching approach we have used
throughout the codebase for type safety.
The motive of adding `BaseNotes` was to support monokey patching
temporary attributes to objects (such as `.trigger` on `Message`) when
working on the django-stubs migration in #18777.
This way, we no longer have to manually keep the upload path code in
sync with the upload path code in zerver/lib/upload.py.
This was originally suggested in
https://github.com/zulip/zulip/pull/19478#issuecomment-911479530.
This change fixes a bug when importing into a server using the local
file uploads backend, where the `import_realm.py` copy wasn't using
our standard 256-directory approach to avoid putting too many files in
a single directory.
de04f0ad67 changed now notifications recipients were calculated, in
a manner that caused them to be sent when they should not have been.
ac70a2d2e1 was supposed to resolve this, but appears to have been
insufficient, as all three of these cases have been observed to still
happen.
Add safety checks immediately before notification, until the
underlying logic error can be sussed out.
This information can be gleaned from the stacktrace, but making it
explicit in the stringification makes it much easier to differentiate
types of errors at a glance, particularly in Sentry.
We move the emojiset_choices method from UserProfile class to
UserBaseSettings class because emojiset_choices exists in
UserBaseSettings class and this would be used for realm-level
settings as well along with existing user-level settings.
We rename the user group in the example for 'GET /user_groups'
with is_system_group=True, to be 'Moderators' as is_system_group
will be set to True for role-based user groups only.
The default is kept as no retries. Since retries with exponential
backoff are a good thing to make easy, the int form defaults to
setting a backoff_factor.
Unfortunately, urllib3 retry backoff does not implement jitter.
Switching this to use the `backoff` library[1] rather than urllib3's
native Retry is left as future extension.
[1] https://pypi.org/project/backoff/
This adds the X-Smokescreen-Role header to proxy connections, to track
usage from various codepaths, and enforces a timeout. Timeouts were
kept consistent with their previous values, or set to 5s if they had
none previously.
This commits removes some unnecessary checks for `self.md.zulip_message`,
which were put there historically, as earlier we used to add the additional
properties like mentions_user_ids, alert_words, etc. to Message dict
only. These were later moved to MessageRenderingResult class in commit
75cea329b but the checks weren't removed.
This is important because while rendering the messages imported from
other chat tools (like Rocket.Chat), the Message dict is not passed to
the markdown, due to which the checks for `self.md.zerver_message` fails
and hence, things like user mentions, stream/topic mentions are not
rendered in the imported messages properly.
The `user_activity_interval` worker calls:
```python3
last = UserActivityInterval.objects.filter(user_profile=user_profile).order_by("-end")[0]
`````
Which results in a query like:
```sql
SELECT "zerver_useractivityinterval"."id", "zerver_useractivityinterval"."user_profile_id", "zerver_useractivityinterval"."start", "zerver_useractivityinterval"."end" FROM "zerver_useractivityinterval" WHERE "zerver_useractivityinterval"."user_profile_id" = 12345 ORDER BY "zerver_useractivityinterval"."end" DESC LIMIT 1
```
For users which have at least one matching row, this results in a
query plan like:
```
Limit (cost=0.56..711.38 rows=1 width=24) (actual time=0.078..0.078 rows=1 loops=1)
-> Index Scan Backward using zerver_useractivityinterval_7f021a14 on zerver_useractivityinterval (cost=0.56..1031399.46 rows=1451 width=24) (actual time=0.077..0.078 rows=1 loops=1)
Filter: (user_profile_id = 12345)
Rows Removed by Filter: 98
Planning Time: 0.059 ms
Execution Time: 0.088 ms
```
But for users that have just been created, with no matching rows, this
is considerably more expensive:
```
Limit (cost=0.56..711.38 rows=1 width=24) (actual time=10798.146..10798.146 rows=0 loops=1)
-> Index Scan Backward using zerver_useractivityinterval_7f021a14 on zerver_useractivityinterval (cost=0.56..1031399.46 rows=1451 width=24) (actual time=10798.145..10798.145 rows=0 loops=1)
Filter: (user_profile_id = 12345)
Rows Removed by Filter: (count of every single row in the table, redacted)
Planning Time: 0.053 ms
Execution Time: 10798.158 ms
```
Regular vacuuming can force the use of the index on `user_profile_id`
as long as there are few enough users, which is fast -- however, at
some point, the query planner decides that is insufficiently specific,
always chooses the effective-whole-table-scan.
Add an index on `(user_profile_id, end)`, which is expected to be
sufficiently specific that it is used even with large numbers of user
profiles.
Ref #19250.
The transforms called from `build_message_payload` use
`lxml.html.fromstring` to parse (and stringify, and re-parse) the HTML
generated by Markdown. However, this function fails if it is passed
an empty document. "empty" is broader than just the empty string; it
also includes any document made entirely out of control characters,
spaces, unpaired surrogates, U+FFFE, or U+FFFF, and so forth. These
documents would fail to parse, and raise a ParserError.
Using `lxml.html.fragment_fromstring` handles these cases, but does by
wrapping the contents in a <div> every time it is called. As such,
replacing each `fromstring` with `fragment_fromstring` would nest
another layer of `<div>`.
Instead of each of the helper functions re-parsing, modifying, and
stringifying the HTML, parse it once with `fragment_fromstring` and
pass around the parsed document to each helper, which modifies it
in-place. This adds one outer `<div>`, requiring minor changes to
tests and the prepend-sender functions.
The modification to add the sender is left using BeautifulSoup, as
that sort of transform is much less readable, and more fiddly, in raw
lxml.
Partial fix for #19559.
This is a roundabout way to appease a semgrep complaint about
‘error_msg = error_msg % (string_id,)’ while also improving the code.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
9ac55a8cf6 introduced support for
batch updates to stories. However, that commit didn't skip label
removals, as we already do in non-batch story payloads. This led
to an exception for batch story update payloads where labels were
removed but none were added.
maybe_send_batched_emails handles batches of emails from different
users at once; as it processes each user's batch, it enqueues messages
onto the `email_senders` queue. If `handle_missedmessage_emails`
raises an exception when processing a single user's email, no events
are marked as handled -- including those that were already handled and
enqueued onto `email_senders`. This results in an increasing number
of users being sent repeated emails about the same missed messages.
Catch and log any exceptions when handling an individual user's
events. This guarantees forward progress, and that notifications are
sent at-most-once, not at-least-once.
This commit indicates that the realm_message_retention_days field can have
a special value, similar to its stream counterpart, and also explains how
the special value changed over different server versions.
With an extension from tabbott to double-enter the changelog entry.
Related discussion: https://chat.zulip.org/#narrow/stream/378-api-design/topic/realm_message_retention_days
The ability to use multiple ports has been removed a long time ago.
And the "optional" note in the help message is in fact incorrect
since `addrport` being `None` is not supported.
We do not allow any user to edit the system user groups (including
renaming, deleting, adding or removing members, etc.) from the
API. These user groups will change only by the code when a new
user is added or role of a user is changed.
This is implemented by rejecting access_user_group_by_id always
except the case when it is use to get the user group for sending
email and push notifications, as we would need to send notifications
to the mentioned user group.
We make the description parameter in create_user_group as keyword-only
to improve readability. We would also keep the is_system_group
parameter which will be added in future keyword-only.
Tuples cannot be deserialized from JSON.
While we do use these validators for other things, like event
dictionaries, we have migrated the API away from using those. The
last use was removed in 4f3d5f2d87
Signed-off-by: Anders Kaseorg <anders@zulip.com>
These changes are all independent of each other; I just didn’t feel
like making dozens of commits for them.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The auth attempt rate limit is quite low (on purpose), so this can be a
common scenario where a user asks their admin to reset the limit instead
of waiting. We should provide a tool for administrators to handle such
requests without fiddling around with code in manage.py shell.
Calling `email.save()` is only needed if we altered `email.address`;
it is unnecessary if we called `email.users.add(...)` which will have
done its own INSERT.
This fixes two bugs: the most obvious is that there is a race where a
ScheduledEmail object could be observed in the window between creation
and when users are added; this is a momentary instance when the object
has no users, but one that will resolve itself.
The more subtle is that .save() will, if no records were found to be
updated, _re-create_ the object as it exists in memory, using an
INSERT[1]. Thus, there is a race with `deliver_scheduled_emails`
between when the users are added, and when `email.save()` runs:
1. Web request creates ScheduledEmail object
2. Web request creates ScheduledEmailUsers object
3. deliver_scheduled_emails locks the former, preventing updates.
4. deliver_scheduled_emails deletes both objects, commits, releasing lock
5. Web request calls `email.save()`; UPDATE finds no rows, so it
re-creates the ScheduledEmail object.
6. Future deliver_scheduled_emails runs find a ScheduledEmail with no
attending ScheduledEmailUsers objects
Wrapping the logical creation of both of these in a single transaction
avoids both of these races.
[1] https://docs.djangoproject.com/en/3.2/ref/models/instances/#how-django-knows-to-update-vs-insert
Only clear_scheduled_emails previously took a lock on the users before
removing them; make deliver_scheduled_emails do so as well, by using
prefetch_related to ensure that the table appears in the SELECT. This
is not necessary for correctness, since all accesses of
ScheduledEmailUser first access the ScheduledEmail and lock it; it is
merely for consistency.
Since SELECT ... FOR UPDATE takes an UPDATE lock on all tables
mentioned in the SELECT, merely doing the prefetch is sufficient to
lock both tables; no `on=(...)` is needed to `select_for_update`.
This also does not address the pre-existing potential deadlock from
these two use cases, where both try to lock the same ScheduledEmail
rows in opposite orders.
No codepath except tests passes in more than one user_profile -- and
doing so is what makes the deduplication necessary.
Simplify the API by making it only take one user_profile id.
This fixes a bug where email notifications were sent for wildcard
mentions even if the `enable_offline_email_notifications` setting was
turned off.
This was because the `notification_data` class incorrectly considered
`wildcard_mentions_notify` as an indeoendent setting, instead of a wrapper
around `enable_offline_email_notifications` and `enable_offline_push_notifications`.
Also add a test for this case.
While the STREAM_LINK_REGEX and STREAM_TOPIC_LINK_REGEX
identifies the stream and topic mentions in the content
correctly (tested by printing out the matches), the
stream/topic mentions are still not linked to the
corresponding streams/topics for imported messages, as
a `zulip_message` instance is required for linking these
mentions to actual streams/topics (see `StreamPattern`
class in `markdown/__init__.py`) which is not provided
while processing the markdown for imported messages.