Previously, MissedMessageWorker used a batching strategy of just
grabbing all the events from the last 2 minutes, and then sending them
off as emails. This suffered from the problem that you had a random
time, between 0s and 120s, to edit your message before it would be
sent out via an email.
Additionally, this made the queue had to monitor, because it was
expected to pile up large numbers of events, even if everything was
fine.
We fix this by batching together the events using a timer; the queue
processor itself just tracks the items, and then a timer-handler
process takes care of ensuring that the emails get sent at least 120s
(and at most 130s) after the first triggering message was sent in Zulip.
This introduces a new unpleasant bug, namely that when we restart a
Zulip server, we can now lose some missed_message email events;
further work is required on this point.
Fixes#6839.
This fixes a couple things:
* process_event() is a pretty vague name
* returning tuples should generally be avoided
* we were producing the same REST parameters in both
subclasses
* relative_url_path was always blank
* request_kwargs was always empty
Now process_event() is called build_bot_request(),
and it only returns request data,
not a tuple of `rest_operation` and `request_data`.
By no longer returning `rest_operation`, there are
fewer moving parts. We just have `do_rest_call` make
a POST call.
Before this change, we instantiated base_url into a superclass
of subclasses that returned base_url into a dictionary that
gets returned to our caller.
Now we just pull base_url out of service when we need to make
the REST call.
This uses the recently introduced active_mobile_push_notification
flag; messages that have had a mobile push notification sent will have
a removal push notification sent as soon as they are marked as read.
Note that this feature is behind a setting,
SEND_REMOVE_PUSH_NOTIFICATIONS, since the notification format is not
supported by the mobile apps yet, and we want to give a grace period
before we start sending notifications that appear as (null) to
clients. But the tracking logic to maintain the set of message IDs
with an active push notification runs unconditionally.
This is designed with at-least-once semantics; so mobile clients need
to handle the possibility that they receive duplicat requests to
remove a push notification.
We reuse the existing missedmessage_mobile_notifications queue
processor for the work, to avoid materially impacting the latency of
marking messages as read.
Fixes#7459, though we'll need to open a follow-up issue for
using these data on iOS.
Private messages are not supported in Slack-format webhook.
Instead of raising a NotImplementedError, we warn the user
that PM service is not supported by sending a message to the
user.
Added tests for the same.
Fixes#9239
We already had a setting for whether these logs were enabled; now it
also controls which stream the messages go to.
As part of this migration, we disable the feature in dev/production by
default; it's not useful for most environments.
Fixes the proximal data-export issue reported in #10078 (namely, a
stream with nobody ever subscribed to having been created).
Slow queries during backend tests sends messages to Error Bot
which affects the database state causing the tests to fail.
This fixes the occasional flakes due to that.
These two classes are tricky to test, and nocoverage-ing them
allows us to mark queue_processors.py as fully covered. We
still want to cover these two workers at some point, but for
now, it's nice to enforce full coverage for any future changes
to queue_processors.py.
Fixes (sort of) #6542.
We already check in get_service_bot_events() if a bot is mentioned,
and then only pass on the call to the bot handler if it is. The
commit removes the additional check in the embedded bot queue
processor simply because it is impossible to obtain test coverage
for it (there is no meaningful way to trigger the content of the
if-clause, because there will never be a message reaching the bot
without @-mentioning it.
To alleviate the danger of a potential regression, the check is not
removed completely, but rather replaced by an assert statement.
Revert c8f034e9a "queue: Remove missedmessage_email_senders code."
As the comment in the code says, it ensures a smooth upgrade path
from 1.7.x; we can delete it in master after 1.8.0 is released.
The removal commit was merged early due to a communication failure.
External bots may call bot_handler.quit() when
they wish to terminate, e.g. due to a misconfiguration.
Currently, embedded bots ignore calls to quit(), even
though they signal a problem. This commit does the first
step in handling quit() calls by logging a warning.
If an exception was thrown inside `send_email` resulting in a retry,
we would include the `failed_tries` data in the event, which turned
out to thrown an exception itself.
This fixes that flow, including deepening the test so that it would
fail if we didn't have the new logic.
This commit just copies all the code from MissedMessageSendingWorker
class to a new EmailSendingWorker class. All the logic to send an email
through a queue was already there. This commit only makes the logic
generic. It does so by creating a special purpose queue called
'email_senders' to send any type of email. To make
MissedMessageSendingWorker still work we derive it from
EmailSendingWorker. All the tests that were testing
MissedMessageSendingWorker now run against EmailSendingWorker.
The original logic is buggy now that emails can belong to (and be
invited to) multiple realms.
The new logic in the `invites` queue worker also avoids the bug where
when the PreregistrationUser was gone by the time the queue worker got
to the invite (e.g., because it'd been revoked), we threw an exception.
[greg: fix upgrade-compatibility logic; add test; explain
revoked-invite race above]
Because we use access_stream_by_id here, and that checks for an active
subscription to interact with a private stream, this didn't work.
The correct fix to add an option to active_stream_by_id to accept an
argument indicating whether we need an active subscription; for this
use case, we definitely do not.
There's one migration required by this release:
* queue_processors: Stop passing state_handler to handle_message.
state_handler is now a property of bot_handler and thus, does
not need to be passed to bot_handler.handle_message().
The commit responsible is:
2a74ad11c5
This fixes a bug where, when a user is unsubscribed from a stream,
they might have unread messages on that stream leak. While it might
seem to be a minor problem, it can cause significant problems for
computing the `unread_msgs` data structures, since it means we need to
add an extra filter for whether the user is still subscribed, either
in the backend or in the UI.
Fixes#7095.
This fixes a regression in 25c669df52.
We were draining the queue in both the superclass and the subclass,
so by the time the subclass started processing events, there were
no events to process. Now the subclass properly uses the events
passed in from the superclass.
The os.mkdir call is straightforward and doesn't testing.
Workers relying on LoopQueueProcessingWorker are tested
with its consume method that exists solely for this purpose.
This should help make it easier to add tests coverage for these queue
processors, since they now use a system more similar to the other
queue processors.
This replaces the former non-functional StateHandler
stub with a dictionary-like state object. Accessing it will
will read and store strings in the BotUserStateData model.
Each bot has a limited state size. To enforce this limit while
keeping data updates efficient, StateHandler caches the expensive
query for getting a bot's total state size. Assignments to a key
then only need to fetch that entry's previous size, if any, and
compare it to the new entry's size.
Previously, invitation reminder emails were only being cleared after a
successful signup if newsletter_data was available, since that was the
circumstance in which we were calling the relevant queue processor
code. Now, we (1) clear them when a human user finishes signing up
and (2) correctly clear them using the 'address' field of
ScheduleEmail, not user_id.
There is no reason for either render_incoming_message() or
render_markdown() to require full UserProfile objects just to
triage alert words.
By only asking for user_ids, we save extra queries in two
callpaths and we make it easier to start using user_ids in
do_send_messages().
Add test to check if the embedded bot service being used is in the
registry or not.
Add test to check if the bot being added to the registry has a valid
bot corresponding to it.
Move 'get_bot_handler' to 'zerver/lib/bot_lib.py' as it is an independent
function, not related to the 'EmbeddedBotWorker' class that it was
previously a part of.
This fixes some error message strings and skips converting request_data
into json. From now, conversion would be the responsibility of interface.
Also, base_url is now not passed into event structure.
Splitting bot_lib.py file into 2 files led to unnecessary
redirection of the code workflow. For an embedded bot/service to
send a reply, it was being redirected 3 times.
First, the code flow comes to "EmbeddedBotHandler" class to send
reply, then it goes to the common function in "zulip_bots/lib.py",
then it would come back to "EmbeddedBotHandler". Later on, if we
create an abstract class, from where the bot work flow would
directly hit and then from there it is classified into
EmbeddedBotHandler or ExternalBotHandler and accordingly it would
get redirected.
Now, first the bot flow goes to it's handler class External or
Embedded (where we pass that this is External or Embedded bot as
parameter) and then goes to a common point and then comes back to
the same class.
This is required, since we just reorganized the python-zulip-api
repository into 3 packages.
A nice side effect is that we get to eliminate some now-unnecessary
code for editing sys.path.
- For threaded workers:
Django's autoreloader catches SIGQUIT(3) to reload the program. If
a process being watched by autoreloader exits with status code 3,
reloader will restart the process. To reload, we send SIGUSR1(10)
signal from consumers to a handler in process_queue which then
exits with status code 3.
- For single worker per process:
Catch the SIGUSR1 and quit; supervisorctl will restart the worker
automatically.
Fixes#5512
This prevents users from accidentally sending a confirmation link
specific to their account to their Zulip administrator if they reply
to the invitation, invitation reminder, account confirmation, or new
email confirmation emails.
No change in behavior.
Also makes the first step towards converting all uses of
settings.ZULIP_ADMINISTRATOR and settings.NOREPLY_EMAIL_ADDRESS to
FromAddress.*.
Once everything is converted, it will be easier to ensure that future
development doesn't break backwards compatibility with the old style of
settings emails.
This will allow for customized senders for emails, e.g. 'Zulip Digest' for
digest emails and 'Zulip Missed Messages' for missed message emails.
Also:
* Converts the sender name to always be "Zulip", if the from_email used to
be settings.NOREPLY_EMAIL_ADDRESS or settings.ZULIP_ADMINISTRATOR.
* Changes the default value of settings.NOREPLY_EMAIL_ADDRESS in the
prod_setting_template to no longer have a display name. The only use of
that display name was in the email pathway.
Similar code appeared at two places (the code is meant to remove the
leading @-mention before passing the remainder of the message to the
bot handler)—both 'zerver/worker/queue_processors.py' and
'api/bots_api/bot_lib.py'.
Remove function from the queue_processors.py file and let the entire message
be processed (includes removing of @-mention) at bot_lib.py.
Update EmbeddedBotWorker.get_bot_api_client in
'zerver/workers/queue_processors.py' to return an EmbeddedBotHandler
object instead of ExternalBotHandler object.
This change is as a followup for splitting the BotHandler class into
two separate classes for external and embedded bots.
This change is done for better understanding of the class functionality
from its class name. Now there will be 3 different classes for bots,
namely 'ExternalBotHandler', 'FlaskBotHandler' and 'EmbeddedBotHandler'.
Server settings should just be added to the context in build_email, so that
the individual email pathways (and later, the email testing framework)
doesn't have to worry about it.
This commit is a step towards the goal of replacing most of the
send_future_email pathway with a call to send_email.
Note that this commit changes the default value of sender from "Zulip
<NOREPLY_EMAIL_ADDRESS>" to "NOREPLY_EMAIL_ADDRESS". NOREPLY_EMAIL_ADDRESS
will soon be changed to have the Zulip in front.
Note that the correctness of this commit relies on the fact that
send_future_email also sets the sender to settings.NOREPLY_EMAIL_ADDRESS by
default (in the body of the function).
Also puts them into a processing queue, though the queue processor
does nothing.
Rewritten by tabbott to avoid unnecessary database queries in
do_send_messages.
- Add new 'missedmessage_email_senders' queue for sending missed messages emails.
- Add the new worker to process 'missedmessage_email_senders' queue.
- Split aggregation missed messages and sending missed messages email
to separate queue workers.
- Adapt tests for sending missed emails to the new logic.
Fixes#2607
This feature hardcoded zulip.com, and never really made much sense
("feedback" should generally go to the local server administrator, not
to the Zulip development community).
This makes life a lot easier for people inviting users to a new Zulip
organization, since they can give some form of context now.
Modified by tabbott to clean up CSS, backend code flow, and improve
the formatting of the emails.
Fixes: #1409.
Our lists of rabbitmq queues was likely to end up out of date, since
there was nothing enforcing that the various lists of queues were
correct or the same as each other.
This significantly simplify the logic for our logging process, making
it the case that websockets message sending requests always are logged
as having the exact same client as a normal AJAX request from that
server.
A lot of care has been taken to ensure we're using the realm that the
message is being sent into, not the realm of the sender, to correctly
handle the logic for cross-realm bot users such as the notifications
bot.
Removes the dependence on postmonkey, which is a wrapper around
MailChimp API v1.3. MailChimp recommends using their v3.0 API directly,
rather than through a wrapper library.
Recent changes to the API removed Client.register(), and
this change restores the correct API call, although the
codepath this affects is probably ready for eventual
deprecation.
This change adds support for displaying inline open graph previews for
links posted into Zulip.
It is designed to interact correctly with message editing.
This adds the new settings.INLINE_URL_EMBED_PREVIEW setting to control
whether this feature is enabled.
By default, this setting is currently disabled, so that we can burn it
in for a bit before it impacts users more broadly.
Eventually, we may want to make this manageable via a (set of?)
per-realm settings. E.g. I can imagine a realm wanting to be able to
enable/disable it for certain URLs.
In fe1ba6f3eb, we change our auth
decorators etc. to use request.POST/request.GET rather than the (now
removed request.REQUEST). This broke sending messages via our
websockets codepath, because we failed to update the artificial
requests we generate to use request._post as well.
This adds a few new helpful context variables that we can use to
compute URLs in all of our templates:
* external_uri_scheme: http(s)://
* server_uri: The base URL for the server's canonical name
* realm_uri: The base URL for the user's realm
This is preparatory work for making realm_uri != server_uri when we
add support for subdomains.
response.content is binary data, but code usually assumes it to
be text. Fix this by decoding response.content where required.
Don't do this in tests yet.
This change drops the memory used for Python processes run by Zulip in
development from about 1GB to 300MB on my laptop.
On the front of safety, http://pika.readthedocs.org/en/latest/faq.html
explains "Pika does not have any notion of threading in the code. If
you want to use Pika with threading, make sure you have a Pika
connection per thread, created in that thread. It is not safe to share
one Pika connection across threads.". Since this code only connects
to rabbitmq inside the individual threads, I believe this should be
safe.
Progress towards #32.
This is in some ways a regression, but because we don't have
python-postmonkey packaged right now, this is required to make the
Zulip production installation process work on Trusty.
(imported from commit 539d253eb7fedc20bf02cc1f0674e9345beebf48)
The one time use address are a unique token which maps to stored stated
in redis. We store the user_id, recipient_id, and subject. When an email
is received at this address it is sent to the stored recipient by the
stored user. Anyone with this address can send a single message as this
user.
(imported from commit 4219417bdc30c033a6cf7a0c7c0939f7d0308144)
Now that we've debugged the memory leak, I don't think we need this
anymore.
This reverts commit 1bdc7ee2f72bdebb1cdc94601247834a434614d6.
Conflicts:
puppet/zulip/files/cron.d/rabbitmq-numconsumers
puppet/zulip/files/supervisor/conf.d/zulip.conf
(imported from commit ff87f2aebcbc71013fa7a05aedb24e2dcad82ae6)
Unbundle the push notifications from the missed message queue processors
and handlers. This makes notifications more immediate, and sets things up
for better badge count handling, and possibly per-stream filtering.
(imported from commit 11840301751b0bbcb3a99848ff9868d9023b665b)
This fixes a small memory leak in our queue workers, where we don't
reset the accumulated times contained in our query logging data.
Longer-term, we may want to make something mergable for mainline where
we only store on the connection object the totals; that would be a
fixed amount of emmory per connection and thus not have this problem.
(imported from commit 914fa13acfb576f73c5f35e0f64c2f4d8a56b111)
This command should be run continuously via supervisor. It periodically
checks for new email messages to send, and then sends them. This is for
sending email that you've queued via the Email table, instead of mandrill
(as is the case for our localserver/development deploys).
(imported from commit a2295e97b70a54ba99d145d79333ec76b050b291)
Errors are sent to a queue processor that posts them to staging,
just like the feedback bot.
(imported from commit 4a8d099672a1b3e48a8bc94148d8b53db73d2c64)
We also now separate out the times for the socket overhead, the
request service time, and the queuing delays.
(imported from commit e1683f7f28b968b86ebb701b0ac29b00ac6d67c3)
One quirk here is that the Request object is built in the
message_sender worker, not Tornado. This means that the request time
only counts time taken for the actual sending and does not account
for socket overhead. For this reason, I've left the fake logging in
for now so we can compare the two times.
(imported from commit b0c60a3017527a328cadf11ba68166e59cf23ddf)
The register_json_consumer() function now expects its callback
function to accept a single argument, which is the payload, as
none of the callbacks cared about channel, method, and properties.
This change breaks down as follows:
* A couple test stubs and subclasses were simplified.
* All the consume() and consume_wrapper() functions in
queue_processors.py were simplified.
* Two callbacks via runtornado.py were simplified. One
of the callbacks was socket.respond_send_message, which
had an additional caller, i.e. not register_json_consumer()
calling back to it, and the caller was simplified not
to pass None for the three removed arguments.
(imported from commit 792316e20be619458dd5036745233f37e6ffcf43)
Subclasses of QueueProcessingWorker that don't override start() will
have their consume() functions wrapped by consume_wrapper(), which
will catch exceptions and log data from troublesome events to a log
file.
We need to do a puppet apply to create /var/log/zulip/queue_error.
(imported from commit 3bd7751da5fdef449eeec3f7dd29977df11e2b9c)
The DigestWorker isn't do any batching of events, so it can
be implemented with a simple consume() handler.
(imported from commit 0560eaf1a6e510adf97448fbe8933d9903d30016)
This prevents the ones with external side-effects (like sending real
email) from being accidentally run in dev instances.
(imported from commit 6d9861d721abb29136bfff974de01a9264051436)
New dependency: sockjs-tornado
One known limitation is that we don't clean up sessions for
non-websockets transports. This is a bug in Tornado so I'm going to
look at upgrading us to the latest version:
https://github.com/mrjoes/sockjs-tornado/issues/47
(imported from commit 31cdb7596dd5ee094ab006c31757db17dca8899b)
The queued email gets deleted if the user signs up before it gets sent.
Otherwise, they are reminded in 2 days that they still haven't signed up.
This addressses Trac #1812
(imported from commit c1bdc09c03ac576b08986e56994de72d52fd293b)
Deployment instructions: I think all the queue workers get
restarted automatically, so there is probably nothing special
to do here in the deploy itself, but we will want to monitor
it closely, and the change should make our number of locks go
down.
QueueProcessingWorker.start() now calls consume_and_commit(),
which ensures that we don't hold locks after work actions
by using Django's commit_on_success() decorator.
Obviously, workers that override start() will not call consume_and_commit()
through this code path. SlowQueryWorker calls commit_on_success()
in its start() method now, and I hope to address MissedMessageWorker soon.
(imported from commit f3f38a7f45730eee8f3b5794371ba5b994017676)
Use the commit_on_success() context manager around the call
to internal_send_message() inside of SlowQueryWorker's polling
loop, so that the pending SELECT statement from
get_status_dict_by_realm() gets committed. If we don't do
this, postgres will hold locks on zerver_userprofile, and other
tables, for a long time, which can interfere with migrations.
This is an interim solution until we switch postgres's default
commit behavior. Right now the default transaction isolation
is "read committed," so SELECT statements lead to AccessShareLocks
that do no get closed until the transaction finishes.
(imported from commit f72aeffbbe71a731e327459f15bd7dbebaf9e0b8)
Handled by the queue processor for signups. Added a management command
that accomplishes the same task, in case it's needed for manually added users,
or in case we goof and need to remove queued emails for a given user.
This addresses Trac #1807
(imported from commit 6727b82a07fa6a3ea3d827860c9e60fd0602297a)