Calling `email.save()` is only needed if we altered `email.address`;
it is unnecessary if we called `email.users.add(...)` which will have
done its own INSERT.
This fixes two bugs: the most obvious is that there is a race where a
ScheduledEmail object could be observed in the window between creation
and when users are added; this is a momentary instance when the object
has no users, but one that will resolve itself.
The more subtle is that .save() will, if no records were found to be
updated, _re-create_ the object as it exists in memory, using an
INSERT[1]. Thus, there is a race with `deliver_scheduled_emails`
between when the users are added, and when `email.save()` runs:
1. Web request creates ScheduledEmail object
2. Web request creates ScheduledEmailUsers object
3. deliver_scheduled_emails locks the former, preventing updates.
4. deliver_scheduled_emails deletes both objects, commits, releasing lock
5. Web request calls `email.save()`; UPDATE finds no rows, so it
re-creates the ScheduledEmail object.
6. Future deliver_scheduled_emails runs find a ScheduledEmail with no
attending ScheduledEmailUsers objects
Wrapping the logical creation of both of these in a single transaction
avoids both of these races.
[1] https://docs.djangoproject.com/en/3.2/ref/models/instances/#how-django-knows-to-update-vs-insert
Only clear_scheduled_emails previously took a lock on the users before
removing them; make deliver_scheduled_emails do so as well, by using
prefetch_related to ensure that the table appears in the SELECT. This
is not necessary for correctness, since all accesses of
ScheduledEmailUser first access the ScheduledEmail and lock it; it is
merely for consistency.
Since SELECT ... FOR UPDATE takes an UPDATE lock on all tables
mentioned in the SELECT, merely doing the prefetch is sufficient to
lock both tables; no `on=(...)` is needed to `select_for_update`.
This also does not address the pre-existing potential deadlock from
these two use cases, where both try to lock the same ScheduledEmail
rows in opposite orders.
No codepath except tests passes in more than one user_profile -- and
doing so is what makes the deduplication necessary.
Simplify the API by making it only take one user_profile id.
Previously, the output would make it look like we sent an actual email
to the first user in the dry_run output, which is very confusing.
The `dry_run` code path already prints all the accounts that would
have been emailed at the end, so there's no reason to have this line
before the dry_run check.
Additionally, we move after the `get_connection` check because
failures at that stage shouldn't result in logging an attempt to send
an email.
An organization with at most 5 users that is behind on payments isn't
worth spending time on investigating the situation.
For larger organizations, we likely want somewhat different logic that
at least does not void invoices.
This makes it parallel with deliver_scheduled_messages, and clarifies
that it is not used for simply sending outgoing emails (e.g. the
`email_senders` queue).
This also renames the supervisor job to match.
Django's default SMTP implementation can raise various exceptions
when trying to send an email. In order to allow Zulip calling code
to catch fewer exceptions to handle any cause of "email not
sent", we translate most of them into EmailNotDeliveredException.
The non-translated exceptions concern the connection with the
SMTP server. They were not merged with the rest to keep some
details about the nature of these.
Tests are implemented in the test_send_email.py module.
Commit 9afde790c6 introduced a bug
concerning outgoing emails inside the development environment. These
emails are not supposed to use a real connection with a mail
server as the send_messages function is overwritten inside the
EmailLogBackEnd class.
The bug was happening inside the initialize_connection function that
was introduced in the above-mentioned commit. This function is used
to refresh the connection with an SMTP server that would have closed
it. As the socket used to communicate with the server is not
initialized inside the development environment this function was
wrongly trying to send no-op commands.
The fix just checks that the connection argument of the function is
an EmailLogBackEnd object before trying the no-op command.
Additionally as it is sometimes useful to be able to send outgoing
emails inside the development environment the get_forward_address
function is used to check if a real connection exists between Zulip
and the server. If it is the case, as EmailLogBackEnd is a subclass
of smtp.EmailBackend, the connection will be nicely refreshed.
This commit was tested manually by checking that the console prints
correctly that an email is sent to the user when it signs in inside
the development environment. It was also tested when a mail provider
is specified and the mails were correctly received.
Add a `--dry-run` flag to send_custom_email management command
in order to provide a mechanism to verify the emails of the recipients
and the text of the email being sent before actually sending them.
Add tests to:
- Check that no emails are actually sent when we are in the dry-run mode.
- Check if the emails are printed correctly when we are in the dry-run mode.
Fixes#17767
Previously the outgoing emails were sent over several SMTP
connections through the EmailSendingWorker; establishing a new
connection each time adds notable overhead.
Redefine EmailSendingWorker worker to be a LoopQueueProcessingWorker,
which allows it to handle batches of events. At the same time, persist
the connection across email sending, if possible.
The connection is initialized in the constructor of the worker
in order to keep the same connection throughout the whole process.
The concrete implementation of the consume_batch function is simply
processing each email one at a time until they have all been sent.
In order to reuse the previously implemented decorator to retry
sending failures a new method that meets the decorator's required
arguments is declared inside the EmailSendingWorker class. This
allows to retry the sending process of a particular email inside
the batch if the caught exception leaves this process retriable.
A second retry mechanism is used inside the initialize_connection
function to redo the opening of the connection until it works or
until three attempts failed. For this purpose the backoff module
has been added to the dependencies and a test has been added to
ensure that this retry mechanism works well.
The connection is closed when the stop method is called.
Fixes: #17672.
This was introduced in 8321bd3f92 to serve as a sort of drop-in
replacement for zerver.lib.queue.queue_json_publish, but its use has
been subsequently cut out (e.g. `9fcdb6c83ac5`).
Remote its last callsite.
django.utils.translation.ugettext is a deprecated alias of
django.utils.translation.gettext as of Django 3.0, and will be removed
in Django 4.0.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Amazon SES has a limit on the size of address fields, and rejects
emails with too-long "From" combinations of name and address. This
limit is set to 320 bytes and comes from an RFC limitation on the
size of addresses. This RFC standard states that an email address
should not be composed of a local part (before the '@') longer than
64 bytes and a domain part (after the '@') longer than 255 bytes.
It is possible that Amazon SES misinterprets this limitation as it
checks the length of the combination of the name and the email
address of the sender.
To ensure that this problem is not encountered in the send_email
module of Zulip the length of this combination is now checked
against this limit and the from_name field is removed to only
keep the from_address field when it is necessary in order to
stay below 320 bytes.
If the from_address field alone is longer than 320 bytes the
sending process will raise an SMTPDataError exception.
Tests for this new check are added to the backend test suite in
order to test if build_email correctly outputs an email with filled
from_name and from_address fields when the total length is lower
than 320 bytes and that it correctly throws the from_name field
away when necessary.
Fixes: #17558.
These procedures should be done atomically overall, with the exception
of the code that sends events to avoid block if there's a delay
communicating with Tornado.
We add the savepoint=False on underlying function that already
executes inside an atomic context - to avoid the overhead of creating
savepoints where they aren't needed.
The main race conditions, which actually happened in production was with
concurrent execution of deliver_email and clear_scheduled_emails.
clear_scheduled_emails could delete all email.users in the middle of
deliver_email execution, causing it to pass empty to_user_ids list to
send_email. We mitigate this by getting the list of user ids in a single
query and moving forward with that snapshot, not having to worry about
database data being mutated anymore.
clear_scheduled_emails had potential race conditions with concurrent
execution of itself due to not locking the appropriate rows upon
selecting them for the purpose of potentially deleting them. FOR UPDATE
locks need to be acquired to prevent simultaneous mutation.
Tested manually with some print+sleep debugging to make some races
happen.
fixes #zulip-2k (sentry)
Use read-only types (List ↦ Sequence, Dict ↦ Mapping, Set ↦
AbstractSet) to guard against accidental mutation of the default
value.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Fixes#2665.
Regenerated by tabbott with `lint --fix` after a rebase and change in
parameters.
Note from tabbott: In a few cases, this converts technical debt in the
form of unsorted imports into different technical debt in the form of
our largest files having very long, ugly import sequences at the
start. I expect this change will increase pressure for us to split
those files, which isn't a bad thing.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Automatically generated by the following script, based on the output
of lint with flake8-comma:
import re
import sys
last_filename = None
last_row = None
lines = []
for msg in sys.stdin:
m = re.match(
r"\x1b\[35mflake8 \|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg
)
if m:
filename, row_str, col_str, err = m.groups()
row, col = int(row_str), int(col_str)
if filename == last_filename:
assert last_row != row
else:
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
with open(filename) as f:
lines = f.readlines()
last_filename = filename
last_row = row
line = lines[row - 1]
if err in ["C812", "C815"]:
lines[row - 1] = line[: col - 1] + "," + line[col - 1 :]
elif err in ["C819"]:
assert line[col - 2] == ","
lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ")
if last_filename is not None:
with open(last_filename, "w") as f:
f.writelines(lines)
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
The test_management_commands use in particular was causing pickling
errors when the test failed, because Python 3 filter returns an
iterator, not a list.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Previously, the send_custom_email code path leaked files in paths that
were not `.gitignored`, under templates/zerver/emails.
This became problematic when we added automated tests for this code
path, as it meant we leaked these files every time `test-backend` ran.
Fix this by ensuring all the files we generate are in this special
subdirectory.
We've had a bug for a while that if any ScheduledEmail objects get
created with the wrong email sender address, even after the sysadmin
corrects the problem, they'll still get errors because of the objects
stored with the wrong format.
We solve this by using FromAddress placeholders strings in
send_future_email function, so that ScheduledEmail objects end up
setting the final `from_address` value when mail is actually sent
using the setting in effect at that time.
Fixes#11008.
Changed the requirements for UserProfile in order to allow use of
the formataddr function in send_mail.py.
Converted send_email to use formataddr in conjunction with the commit
that strengthened requirements for full_name, such that they can now be
used in the to field of emails.
Fixes#4676.
This function is an alternative to get_admin_users that we use in all
places where we explicitly want only human administrative users (not
administrative bots). The following commits will rename
get_admin_users for better clarity.
Follow up on 92dc363. This modifies the ScheduledEmail model
and send_future_email to properly support multiple recipients.
Tweaked by tabbott to add some useful explanatory comments and fix
issues with the migration.
This adds language paramater to send_future_email. As a result, this
properly internationalizes invitation reminder emails, by passing
correct language into send_future_email.
Fixes#11240.
Apparently, when we renamed these files to no longer have a .txt
extension, we accidentally removed them from the set of strings for
translation, because `manage.py makemessages` by default only
processes .txt and .html files under the templates/ directory.
Fix this by adding a .txt extension.
It appears that our i18n logic was only using the recipient's language
for logged-in emails, so even properly tagged for translation and
translated emails for functions like "Find my team" and "password
reset" were being always sent in English.
With great work by Vishnu Ks on the tests and the to_emails code path.
The previous version was also doing almost the same thing.
But checking for DEVELOPMENT_LOG_EMAILS would allow us
to control the call of send_email by altering the value
of DEVELOPMENT_LOG_EMAILS in tests.
The previous migration code path was broken in two ways:
* ScheduledEmail objects generally contain a `None` value for
whichever of `to_user_id` and `to_email` isn't in use; this could
result in us sending a [None] to send_email(), which doesn't make
sense.
* We were calling handle_send_email_format_changes in the wrong order
with respect to the JSON loading process.
Thanks to Tom Daff for the report!
Apparently, we have a second code path where we might try to call
send_email library functions on old data, namely in the
queue_processors codebase. So we apply the same migration logic here.
This adds a function that sends provided email to all administrators
of a realm, but in a single email. As a result, send_email now takes
arguments to_user_ids and to_emails instead of to_user_id and
to_email.
We adjust other APIs to match, but note that send_future_email does
not yet support the multiple recipients model for good reasons.
Tweaked by tabbott to modify `manage.py deliver_email` to handle
backwards-compatibily for any ScheduledEmail objects already in the
database.
Fixes#10896.
This commit creates a new field called delivery_email. For now, it is
exactly the same as email upon user profile creation and should stay
that way even when email is changed, and is used only for sending
outgoing email from Zulip.
The purpose of this field is to support an upcoming option where the
existing `email` field in Zulip becomes effectively the user's
"display email" address, as part of making it possible for users
actual email addresses (that can receive email, stored in the
delivery_email field) to not be available to other non-administrator
users in the organization.
Because the `email` field is used in numerous places in display code,
in the API, and in database queries, the shortest path to implementing
this "private email" feature is to keep "email" as-is in those parts
of the codebase, and just set the existing "email" ("display email")
model field to be something generated like
"username@zulip.example.com" for display purposes.
Eventually, we'll want to do further refactoring, either in the form
of having both `display_email` and `delivery_email` as fields, or
renaming "email" to "username".
Users having only account in one realm will not be distracted by realm
name in subject lines of every email. Users who have multiple
accounts in realms can turn this setting on and receive a
corresponding realm name in email's subject.
Tweaked by tabbott to rebase and address a few small issues.
Fixes#5489.
The name `create_logger` suggests something much bigger than what this
function actually does -- the logger doesn't any more or less exist
after the function is called than before. Its one real function is to
send logs to a specific file.
So, pull out that logic to an appropriately-named function just for
it. We already use `logging.getLogger` in a number of places to
simply get a logger by name, and the old `create_logger` callsites can
do the same.
This is already the loglevel we set on the root logger, so this has no
effect -- except in tests, where `test_settings.py` attempts to set
some of these same loggers to higher loglevels. Because the
`create_logger` call generally runs after we've configured settings,
it clobbers that effect.
The code in `test_settings.py` that tries to suppress logs only works
because it also sets `propagate=False`, which has nothing to do with
loglevels but does cause logs at this logger (and descendants) to be
dropped completely unless we've configured handlers for this logger
(or one of its relevant descendants.)
The rules here are fuzzy, and it's quite possible none of Zulip's emails
need an address at all. Every country has its own rules though, which makes
it hard to tell. In general, transactional emails do not need an address,
and marketing emails do.
Create a new custom email backend which would automatically
logs the emails that are send in the dev environment as
well as print a friendly message in console to visit /emails
for accessing all the emails that are sent in dev environment.
Since django.core.mail.backends.console.EmailBackend is no longer
userd emails would not be printed to the console anymore.
ScheduledJob was written for much more generality than it ended up being
used for. Currently it is used by send_future_email, and nothing
else. Tailoring the model to emails in particular will make it easier to do
things like selectively clear emails when people unsubscribe from particular
email types, or seamlessly handle using the same email on multiple realms.
Both the queue processor and ScheduledJob emails need to sometimes pass a
to_user_id and sometimes pass a to_email, and it's more convenient to just
have one function that they can call that can handle either.
Also removes the now redundant send_email_to_user.
Better to see "noreply@..." when replying to a message that you can't reply
to than to see "Zulip" (for email clients that hide the email address when
there is a display name).
Make it less likely that further development will break compatibility with
ZULIP_ADMINISTRATORs of the form "name <email>".
Note that the suggested value for this setting has been
'zulip-admin@example.com' for a while, so hopefully this commit causes no
change for most installations.