Empirically, the retry in `_on_connection_closed` didn't actually work
-- if a reconnect failed, that was it, and the exception handler
didn't get run. A traceback would get logged, but all its frames were
in Tornado or Pika, not our own code; presumably something magic and
async was happening to the exception.
Moreover, though we would make one attempt to reconnect if we had a
connection that got closed, we didn't have any form of retry if the
original attempt at connecting failed in the first place.
Happily, upstream offers a perfectly reasonable bit of API that avoids
both of these problems: the on-open-error callback. So use that.
This method was new in Tornado 4.0. It saves us from having to get
the time ourselves and do the arithmetic -- which not only makes the
code a bit shorter, but also easier to get right. Tornado docs (see
http://www.tornadoweb.org/en/stable/ioloop.html) say we should have
been getting the time from `ioloop.time()` rather than hardcoding
`time.time()`, because the loop could e.g. be running on the
`time.monotonic()` clock.
Adding it afterward is inherently racy, and upstream's API is quite
reasonable for avoiding that -- just like we can pass an on-open
callback up front, we can do the same with the on-close callback.
This is a more thorough version of 4adf2d5c2 from back in 2013-04.
The default value of this parameter is already False upstream.
(It was already False in pika version 0.9.6, which we were
supposedly using when we introduced this in 4baeaaa52; not sure
what the story was there.)
When the RabbitMQ server disappears, we log errors like these:
```
Traceback (most recent call last):
File "./zerver/lib/queue.py", line 114, in json_publish
self.publish(queue_name, ujson.dumps(body))
File "./zerver/lib/queue.py", line 108, in publish
self.ensure_queue(queue_name, do_publish)
File "./zerver/lib/queue.py", line 88, in ensure_queue
if not self.connection.is_open:
AttributeError: 'NoneType' object has no attribute 'is_open'
During handling of the above exception, another exception occurred:
[... traceback of connection failure inside the retried self.publish()]
```
That's a type error -- a programming error, not an exceptional
condition from outside the program. Fix the programming error.
Also move the retry out of the `except:` block, so that if it also
fails we don't get the exceptions stacked on each other. This is a
new feature of Python 3 which is sometimes indispensable for
debugging, and which surfaced this nit in the logs (on Python 2 we'd
never see the AttributeError part), but in some cases it can cause a
lot of spew if care isn't taken.
This makes tests of queue processors more realistic,
by adding a parameter to `queue_json_publish` that
calls a queue's consumer function if accessed in a test.
Fixes part of #6542.
Previously these were hardcoded in zproject/settings.py to be accessed
on localhost.
[Modified by Tim Abbott to adjust comments and fix configure-rabbitmq]
The register_json_consumer() function now expects its callback
function to accept a single argument, which is the payload, as
none of the callbacks cared about channel, method, and properties.
This change breaks down as follows:
* A couple test stubs and subclasses were simplified.
* All the consume() and consume_wrapper() functions in
queue_processors.py were simplified.
* Two callbacks via runtornado.py were simplified. One
of the callbacks was socket.respond_send_message, which
had an additional caller, i.e. not register_json_consumer()
calling back to it, and the caller was simplified not
to pass None for the three removed arguments.
(imported from commit 792316e20be619458dd5036745233f37e6ffcf43)
This includes a hack to preserve humbug/backends.py as a symlink, so
that we don't need to regenerate all our old sessions.
(imported from commit b7918988b31c71ec01bbdc270db7017d4069221d)
This needs to be deployed to both staging and prod at the same
off-peak time (and the schema migration run).
At the time it is deployed, we need to make a few changes directly in
the database:
(1) UPDATE django_content_type set app_label='zerver' where app_label='zephyr';
(2) UPDATE south_migrationhistory set app_name='zerver' where app_name='zephyr';
(imported from commit eb3fd719571740189514ef0b884738cb30df1320)