Clients can now request to receive only certain kinds of events,
although they always receive restart events.
(imported from commit 1e72981f8fe763829ab2abde1e35f94cad5c34e4)
This version has several limitations that are addressed in later
commits in this series.
(imported from commit 5d452b312d4204935059c4d602af0b9a8be1a009)
When we added rabbitmq usage within Tornado, we inadvertently caused
the Tornado ioloop to be initialized in runtornado.py's imports,
before we overwrote the _poll method. The end result was that we
weren't running the our instrumented Tornado poll function.
Fix this by moving that code to its own file which we import at the
top of runtornado.py, and adding comments documenting the situation so
we don't break this in some future import reorganization.
(imported from commit 016717476f10566fef4ed2b656f29f865d2084db)
This can result in a significant performance benefit because we only
need to update the columns that changed..
(imported from commit 42bef1fcc58ad79bd864f89263fe82e90743ee5b)
The policy this implements is:
* 1 week for most persistent data (Clients, etc.)
* 1 day for messages
(imported from commit d57bb2c6b9626ffa2155c6d0ef9b60827d1f2381)
This saves 2 database queries per user in the huddle when sending the
first message to a particular huddle.
(imported from commit f71aa32df846fb4b82651a93ff9608087ffcaa5a)
Previous we had around 4 copies of the logic for deciding whether we
should publish data via a SimpleQueueClient queue, a
TornadoQueueClient queue, or to directly handle the operation, which
resulted in their getting out of sync and buggy (see e.g. the previous
commit).
We need to add a lock around adding things to the queue to work around
a bug with pika's BlockingConnection.
I should note that the previous logic in some places had a bunch of
tests of the form "elif settings.TEST_SUITE" for doing the work that
would have been done by the queue processor directly; these should
have just been "else" clauses -- since we generally want that code to
run on development environments whether or not the test suite is
currently running.
(imported from commit 16bdbed4fff04b1bda6fde3b16bee7359917720b)
Previously we had several files which initialized SimpleQueueClient()
for sending items to the UserActivity queue, even though those code
paths aren't used outside Tornado. This resulted in slower Tornado
startup times.
(imported from commit ad97021ec18d3927233744037c548c22db33c321)
This fixes an experienced bug where you couldn't subscribe to a stream
with non-ASCII characters (failing with a UnicodeEncodeError), as well
as many other potential bugs.
(imported from commit f084a4b4b597b85935655097a7b5a163811c4d71)
This is required by Pika 0.9.8. We need at least 0.9.6 for the next
commit; I had been testing with 0.9.5 previously. Anyway this way
seems more correct as well.
(imported from commit bfb9e9e78938073001f70c4d28a5e07cc4ebac32)
This will automatically fix bugs such as one in which
internal_send_message didn't properly strip() the subject argument
before sending a message.
We change the recipient_type argument to internal_send_message to take
the recipient type name (e.g. 'stream') both to better fit the API and
also because the previous code incorrectly handled huddles.
(imported from commit 78c2596d328f6bb1ce2eaa3eed9a9e48146e3b6a)
This cache should save 2 database queries whenever we send a private
message. However, previously it was per-process (which meant it was
mostly useless) and also buggy (it never stored anything in the cache,
so that it was completely useless). Switching this to our standard
memcached setup will address both problems.
(imported from commit 1d807f30704bccf28de33a80523488aedc58a9be)
Previously we only used these caches for Tornado requests, because we
were not updating memcached when e.g. the user's pointer changed, and
so functions like update_pointer would not work correctly.
Now that we are updated memcached when the User and UserProfile
objects change, we can use these for all requests.
This saves 2 database queries on every Django request to the server.
(imported from commit aa5bffd885d14bde38b95e80a226bd5ab66f253d)
This should substantially decrease the amount of server load generated
by the userpresence system.
I tested that this indeed was indeed saving one query on
/json/update_active_status requests on my laptop with 2 users from the
humbughq.com realm logged in.
(imported from commit 03e9d4eb95b9f664d489862684ae162db2076e08)
This cache filling code takes about 5 seconds to run, which means it
will finish before Tornado starts reading from this cache, but if that
were to change, it would be much better to have made at least some
progress.
(imported from commit 60a3420cdb9ddf331d83573a3fdb6be1a5ee5a4f)
Previously we were calling select_related() on Message.client, which
doesn't exist. It seems kinda poor that this doesn't raise an
exception.
I believe this issue was causing us to do very large numbers of
database queries during get_updates calls during server restarts.
(imported from commit b79bd698820fbd9dd82bd61fc175c32cd5ce6d05)
Since we flush memcached when we do a server restart, the flurry of
get_updates requests that fly in afterwards are all cache misses for
getting the User/UserProfile objects, so Tornado ends up spending
around 70ms per get_updates request rather than the usual 1-2ms.
So this should substantially improve our Tornado performance around
server restarts.
(imported from commit 07b8126bdfd4ff14e4c3362f9eda1fe5fd571c5b)
Sometimes Dropbox shares with /s/ and sometimes with /sh/,
and I'm not sure which controls it, but we should deal with both.
(imported from commit 2222450f25c418b5fbd60ab2c30477467e34c0d1)
This avoids our repeatedly retrying to fetch a tweet that doesn't
exist from the Twitter API.
(imported from commit b4ca1060d03da21e7e59e5b99e682d2e8457df15)
This should substantially improve the repeat-rendering time for pages
with large numbers of tweets since we don't need to go all the way to
twitter.com, which can take like a second, to render tweets properly.
To deploy this commit properly, one needs to run
./manage.py createcachetable third_party_api_results
(imported from commit 01b528e61f9dde2ee718bdec0490088907b6017e)
This commit adds a dependency on python-twitter,
whose upstream is at https://github.com/bear/python-twitter,
and which for now needs to manually be installed on our
servers from the Debian package in sid.
(imported from commit 80cd9f4f59a6f0de6b75ac95e412c69e2a2e2490)
This uses the unauthed v1 of the Twitter API, which is going to go
away soon, but it's fine as an interim measure.
(imported from commit 709a250271321f5479854a363875c9da43e6382d)
Prior to this change, any stream message sent by internal_send_message
could only be in the realm of the sender.
This was a problem most notably for... the tutorial bot, with the
hilarious consequence that the tutorial worked fine in humbughq.com,
but failed to start anywhere else.
(imported from commit 33a904a28e3a57e1a2cf9172c2e2a75b50967a50)
With this change,
pkill -SIGUSR1 -f runtornado
will dump the stack and SIGUSR2 will enable an interactive debugging
session.
This fixes#613 for Tornado which was the original motive for that
ticket; I'm not sure whether we want to do this for our Django
processes as well, but it would be easy to do so if we did.
(imported from commit a7de7c6070f4bf0404bed6f434e6a6b291d66a26)
Even though they look like images, they're not -- you need to
append ?dl=1 to get the image version.
(imported from commit 2a05e7c58f475c908687110d9191f8709425c660)
At Ksplice we used /usr/bin/python because we shipped dependencies as Debian /
Red Hat packages, which would be installed against the system Python. We were
also very careful to use only Python 2.3 features so that even old system
Python would still work.
None of that is true at Humbug. We expect users to install dependencies
themselves, so it's more likely that the Python in $PATH is correct. On OS X
in particular, it's common to have five broken Python installs and there's no
expectation that /usr/bin/python is the right one.
The files which aren't marked executable are not interesting to run as scripts,
so we just remove the line there. (In general it's common to have libraries
that can also be executed, to run test cases or whatever, but that's not the
case here.)
(imported from commit 437d4aee2c6e66601ad3334eefd50749cce2eca6)
Previously we were doing 100 queries for new messages being sent to
the main hacker school channels; they were faster than many similar
instances because they were all done within 1 transaction, but still,
send_message_backend would be spending up to 200ms (and 148 queries)
querying the database with the previous code on prod; this new version
should do a fixed number of database queries per message.
(imported from commit 3799e63aebb6f017932ddb0fe1f6209281c0ddcf)
To work around the issue we're having with queue draining between
parallel blocking connections, use the same rabbitmq queue for both
activity and presence events, keyed on a 'type' flag in the message
itself.
(imported from commit 188e8fda1695734e52c5740db2195072cfc81479)
This cuts the MIT Zephyr home page load time and also #subscriptions
database-sourced load time for me from about a second to about 50ms on
my laptop. The home page still takes about 600ms to load due to
templating.
(imported from commit 1cfd8c750301abe6b6a881be4b2857532be947ec)
SimpleQueueClient was not thread-safe as the pika method
calls that we are using might disconnect and end up in a bad
state if two threads enqueue messages at the same time.
In order to avoid this, each user of the rabbitmq queue
gets its own instance of a SimpleQueueClient.
(imported from commit 694083b75cd58a60b8de282a8f40eb92a864c5ce)
Note: When deploying, restarting the process-user-activity-commandline script is needed
(imported from commit 63ee795c9c7a7db4a40170cff5636dc1dd0b46a8)
Adds a new db table for storing presences, and an API for setting
an individual user's idleness as well as fetching all idle status
for all users in a realm
(imported from commit 5aad3510d4c90c49470c130d6dfa80f0d36b0057)
Code like this is dangerous:
class SimpleQueue(object):
queues = set()
because all instances will share the same 'queues' set object.
We don't really need multiple instances, but neither is there a reason to break
them. So do the normal Python thing instead.
(imported from commit a56bb8414dd549cfd312c9565df198cf9d20f08a)
This will give us flexibility in the future to add new properties to the
list.
In order to support that, we now do a list comprehension rather than just
returning the gather_subscriptions list in get_stream_colors.
(imported from commit a3c0f749a3320f647440f800105942434da08111)
Supporting ``` as a code fence marker complicates the auto-fence
closing, and as per a discussion with Keegan on code-review@, it
is not worth the extra complexity.
(imported from commit 405afb95c4295a02f4677181456caf9d49913ac4)
Trying to add a user to an invite-only stream that already
exists will result in in error
(imported from commit 910750580a122cee92096d7e83457cb0b8cce616)
We need this so that we can safely expunge old events without interfering with
the running server. See #414.
(imported from commit 4739e59e36ea69f877c158c13ee752bf6a2dacfe)
Before this is deployed, we need to install rabbitmq and pika on the
target server (see the puppet part of this commit for how).
When this is deployed, we need to start the new user activity bot:
./manage.py process_user_activity
in the screen session on the relevant server, or user_activity logs
won't be processed (which will eventually result in all users getting
notifications about how their mirrors are out of date).
(imported from commit 44d605aca0290bef2c94fb99267e15e26b21673b)
internal_send_message now has the ability to send personals as well as
stream messages.
This change was accidentally lost during a rebase.
(imported from commit 153a3929c5c64be82288293c1f0cc02fcc03c08d)
This allows us to handle the return_messages_immediately part of
get_updates requests without having to talk to the database.
(imported from commit ed0b7742d359efb21a0a4960f4fc25f4337e9ad4)
Otherwise one gets:
AttributeError: 'module' object has no attribute 'time'
when trying to use the time module from inside zephyr.lib.
(imported from commit 645368672a3eff68320278dd480edeed56721fcc)
This is syntax like
Here's [a link][]
[a link]: http://google.com
This is not very useful for short chat-style messages. It will confuse users,
especially because we don't document it. And disabling it saves the effort of
applying the same link fixups as elsewhere.
(imported from commit c23391465486db545302b79c084b4f9cd5cdcc6a)
And wire it up to our local copy of codehilite. This fixes highlighting in
fenced code blocks, e.g.
~~~~ .js
var x = function () {
return "hi";
};
~~~~
(imported from commit 0efb0c9b98a3acdf55e18bb1918af7960f3425be)
These get automatically re-numbered, which will do the wrong thing when people
split their lists across multiple messages.
Fixes#241.
(imported from commit 7f6f2c36a6ab27cef0a34008f304fc0fe25c8bd0)
Our pages are declared as HTML5:
<!DOCTYPE html>
The markdown library only supports HTML4, but that's probably closer than XHTML.
(imported from commit c78be9ae9bccf029def8d94d3647b0ccce8b2252)
This only linkifies inside angle brackets, per
print md.inlinePatterns['autolink'].getCompiledRegExp().pattern
We have our own linkification extension.
(imported from commit 20cab11aaafee075e0caf933d8d197717976988c)
This cuts the time required to import 38000 messages (with 140000
message receipts) from about 10 minutes to just under 40 seconds on my
laptop with sqlite.
(imported from commit d53b0d1360408c77f38353b58fbb25875262cb40)
Pygments renders these using tables, which breaks our client's assumptions
about what <tr>s mean.
(imported from commit 46d37395785e06fb183d17b08afed4c6f5957baa)
We'll use this internally for the commit bot. We might eventually disable it
for external users.
(imported from commit 3136cd9faadc6b81355889d2ee6472985da87fbe)