This is required because our migration is going to go in two phases.
When we do the database migration (on pushing to master), we update
all messages at that point. But prod doesn't know about the new
flags field, so any new messages sent on prod will not have the
read bit set.
When we push to prod, we want to re-run the bit of the migration script
that automatically sets read flags on messages older than the users's
pointer.
(imported from commit 961d33e972eac9ada80089bf1b1269c7fb42d56b)
We now clean up the stream subscription in more places, but some
historical tutorial streams are still around and if an error or page
reload happens during the tutorial it'll stick around.
(imported from commit 8cf0ebda26bf752c1a23296a4ba85d194bbb3004)
With this change,
pkill -SIGUSR1 -f runtornado
will dump the stack and SIGUSR2 will enable an interactive debugging
session.
This fixes#613 for Tornado which was the original motive for that
ticket; I'm not sure whether we want to do this for our Django
processes as well, but it would be easy to do so if we did.
(imported from commit a7de7c6070f4bf0404bed6f434e6a6b291d66a26)
The idea here is: part of the onboarding tutorial is going to
be you talking to the tutorial bot and it talking to you, from
our Javascript.
The reason it's driven by Javascript is that then in principle we can
do nice stuff like making popovers appear in places to point things
out to you, whereas if we were to do it strictly server-side, doing so
would be a lot harder.
The downside to doing it in Javascript is that you don't get any of
the Markdown rendering, since that happens on the server. So instead
we add this call where you give it a message, and it responds by
having the tutorial bot send you that message.
I don't think there are any security concerns here because
(1) The bot only messages you -- so you can't use it to make someone
else think that the system is telling them to do something
(2) If there were an issue associated with having the server parse
arbitrary Markdown, you could just trigger the issue by sending
a message yourself.
(imported from commit b34f594dab6be6bcb81899278ae1cbe447404468)
To work around the issue we're having with queue draining between
parallel blocking connections, use the same rabbitmq queue for both
activity and presence events, keyed on a 'type' flag in the message
itself.
(imported from commit 188e8fda1695734e52c5740db2195072cfc81479)
Note: When deploying, restarting the process-user-activity-commandline script is needed
(imported from commit 63ee795c9c7a7db4a40170cff5636dc1dd0b46a8)
The production database will need to have this user created before
this commit is pushed
(imported from commit cc8356d8afa0f0747486b7b4c82337c60499d3fd)
We need this so that we can safely expunge old events without interfering with
the running server. See #414.
(imported from commit 4739e59e36ea69f877c158c13ee752bf6a2dacfe)
Before this is deployed, we need to install rabbitmq and pika on the
target server (see the puppet part of this commit for how).
When this is deployed, we need to start the new user activity bot:
./manage.py process_user_activity
in the screen session on the relevant server, or user_activity logs
won't be processed (which will eventually result in all users getting
notifications about how their mirrors are out of date).
(imported from commit 44d605aca0290bef2c94fb99267e15e26b21673b)
This commit has the effect of eliminating all of the non-UserActivity
database queries from the Tornado process -- at least in the uncached
case.
This is safe to do, if a bit fragile, since our Tornado code only
accesses these objects (as opposed to their IDs) in a few places that
are all fine with old data, and I don't expect us to add any new ones
soon:
* UserActivity logging, which I plan to move out of Tornado entirely
* Checking whether we're authenticated in our decorators (which could
be simplified -- the actual security check is just whether the
Django session object has a particular field)
* Checking the user realm for whether we should sync to the client
notices about their Zephyr mirror being up to date, which is quite
static and I think we can move out of this code path.
But implementation constraints around mapping the user_ids to
user_profile_ids mean that it makes sense to get the actual objects
for now.
This code is not what I want to do long-term. I expect we'll be able
to clean up the dual User/UserProfile nonsense once we integrate the
upcoming Django 1.5 release, with its support for pluggable User
models, and after that I change, I expect it'll be fairly easy to make
the Tornado code only work with the user ID, not the actual objects.
(imported from commit 82e25b62fd0e3af7c86040600c63a4deec7bec06)
Otherwise one gets:
AttributeError: 'module' object has no attribute 'time'
when trying to use the time module from inside zephyr.lib.
(imported from commit 645368672a3eff68320278dd480edeed56721fcc)
Note that on local dev servers, this will print out every half second, as
Tornado polls for file changes for autoreloading. In production it will only
print out when network events occur.
(imported from commit adfe88879e4e446b7dfa6ee69e0a9ad013e9c4d4)
tornado.web already does this, based on the setting of the 'debug' kwarg.
Dropping this in production saves us waking up twice a second to stat()
a bunch of files.
We already explicitly restart the server on deploys.
(imported from commit 283bb0da609acb2699a04111a74c13224fe5124c)
So, I got annoyed that our test suite was taking forever to run:
real 2m13.443s
user 1m32.630s
sys 0m3.748s
Some quick profiling determined that the test suite is spending all of
its time loading the fixtures files (zephyr/fixtures/messages.json)
that it loads for each test case (3s to load that for each test case).
To improve this situation, I cut out from the test database used by
the test suite most of the users, subscriptions, etc. that aren't
being used directly by the test cases. The impact is a quite
significant speedup:
real 0m15.176s
user 0m9.161s
sys 0m0.508s
We're still spending over a quarter of a second per test, which isn't
great -- but this is at least no longer unbearable.
This commit doesn't make any changes to the populate_db output if you
don't pass the new --test-suite option.
(imported from commit 2334ba5399b33edab3d29ff269fde4ea77ccd48e)
This is needed to avoid exceptions trying to do internal_send_message
in any test against a simple populate_db database.
(imported from commit 36927f57cbbb7e30ae249b5f1a0549fb352827f5)
Importing zephyr.views here has the unfortunate side effect of
creating Client ids 1 and 2 automatically (via decorators.py
instantiating the two client objects it makes), before we go ahead and
delete all objects in the database as part of the populate_db startup.
(imported from commit da03cb7606334d5926e42f422ab94d1c884937b9)
Previously, the StreamColor restore code didn't properly account for
the fact that most user subscriptions were in pending_subs and thus
not yet in the database.
(imported from commit 2e28c5a68aa045494b9336d7114c23f5c3706c28)
By processing UserMessage objects in batches as we go, this avoids
consuming a large amount of memory that is linear in the size of the
messages log.
(imported from commit 0c42d97f0863da9c079836c60bebcbaeec59f849)
This is useful when testing the sigup workflow, as this script enables you
to run through a MIT signup without manually creating a new inactive user
in the database.
(imported from commit c22649cc7c561c2fbe8682d1b17d7e5aba9ac04e)
We previously were only using it at the first loop through all
messages, which meant code accessing the message type copied from one
place to another would break (potentially subtly), because things
would work if and only if the very last message happened to have the
same type as what is expected in the relevant piece of code.
(imported from commit ad9ce5efdb200e0c0d5c3ffa6db33113fdad8c5a)
This cuts a 30s operation down to about 2s on my machine.
And also move the code to run before we print the "done" message and
have logging for how long it is taking.
(imported from commit 2f20f8ca3fee714735a50fe6c6cfd630df452768)
When we changed our stream name model to treat stream names as
case-insensitive, we didn't update populate_db to do the same.
This commit makes that update, which is to use the lower-cased stream
name for dictionary lookups and deduplication, but the original-case
stream name for actually creating streams.
(imported from commit fc32ec75a5ae286bce7ec86c6e6fb6893888cbd0)
authenticated_api_view and authenticated_json_view call
update_user_activity with a client generated using
@has_request_variables with a default of e.g. get_client("website").
Because that get_client call only happens once on importing the
module, if those client objects didn't exist previously in the
fixtures, then the first test will generate objects with ids 2 and 3,
and then the second test will dump the database, restore from the
fixtures, and then eventually generate client objects with ids 4 and
5. But since the default values were only computed at module load,
we'll still end calling add_user_activity with client objects with ids
2 and 3, which don't exist in the newly restored database.
Fix this madness by just making sure those two client objects exist in
the database.
(imported from commit d940e129d077a560d9a0f96ec3daa2e16ce21c8b)
Run this script on an existing realm to create or change default
streams, which new users will get upon account creation.
(imported from commit 8938dcbd3520d97d25b4c6ca783d35c9aef52df0)
We have a lot of forged users that have bad fullnames due to
historical versions of our fullname computations; this function will
clean those up.
Also, we have a bunch of users with emails like foo|mit.edu@mit.edu
that were the result of a mirroring bug that we want to get rid of
from autocomplete -- putting them in a useless realm name will do.
(imported from commit 6e305093653ca9d327e9e28491636e99d16cfe1d)
Within 'except', 'raise' re-raises the current exception. But outside, it produces
TypeError: exceptions must be old-style classes or derived from BaseException, not NoneType
which is pretty confusing as a generic "something has gone wrong" exception.
(imported from commit 9fcd003a952b82df67726c26161dced079978a32)
This was causing us to log some requests twice, and might have more serious
consequences as well.
(imported from commit 0bb2d7207ee3e4e04679215a7f5ae637cd26aa19)
The zero-port case never actually worked, because addrport wasn't an optional
parameter in run_one. And multiple ports was implemented using the
multiprocessing library, which is just bad news. Since we have no need for
this, remove it before it can cause trouble.
(imported from commit 9d913924701f30d23ebe878b76c8f1f0da2800e2)
Here we introduce a new manage.py command, activate_mit, which takes a
number of usernames and sends out emails to the users with instructions on
how to activate their accounts.
(imported from commit f14401b55f915698e83ff27b86434f53e64685f3)
This is the behavior specified by Django. Since this was broken before,
our CSRF protection had no effect on Tornado views other than printing
a warning message :(
(imported from commit 7975d3c9b6c18915f917ac2da4592a55f6b6a658)
The client may now optionally send its current pointer during
get_updates and the server will return the latest pointer if it
differs and was updated more recently by a different session.
(imported from commit e43b377d7dfb52f83cefb0b1003863d5407caf80)
This cuts the time required to import 38000 messages (with 140000
message receipts) from about 10 minutes to just under 40 seconds on my
laptop with sqlite.
(imported from commit d53b0d1360408c77f38353b58fbb25875262cb40)
This eliminates a bunch of unnecessary code and also fixes a bunch of
places where we were improperly not using transactions.
(imported from commit f194ae9226f9229fc56a0b1b21615534f486ea0c)
This is something of a temporary hack. In the future, we should make
zephyr_mirror.py smarter about fixing up the realms.
(imported from commit bdcff1a834904616538f430b4513ec7619b95e95)