Previously we were generating API keys deterministically using a hash
of the user's email address; this is clearly not a good long-term
approach.
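For illustration only, a minimal sketch (not the actual implementation) contrasting the old deterministic scheme with a random one:
import hashlib
import secrets

def deterministic_api_key(email):
    # Old style (sketch): anyone who knows the email can derive the key.
    return hashlib.sha1(email.encode("utf-8")).hexdigest()

def random_api_key():
    # Preferable: an unguessable key with no relationship to user data.
    return secrets.token_hex(16)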
(imported from commit 14d0c7c9edbc45b3ae1d17a43765ad9726338d4d)
We get too many error reports from it, which makes it harder for us to
actually fix the other errors that we do have.
(imported from commit 8442fe4251adb15a01b4e61ebcd07bc270b08631)
Each of the messages sent out when someone subscribes many people to a new
stream causes individual database queries (and their associated
transactions). With the patched bulk_create (which sets the .id on created
objects), we can reduce this to a constant number of queries on the Message
and UserMessage tables.
Note for deployment (local dev, staging, and prod):
you must be running a patched Django, found here: https://github.com/acrefoot/django/branches
on the branch acrefoot-bulk_create_with_id-1.5.1
(relevant sha1: ac6d885b811f7e2e34f0db0da217983f7dfd357f)
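A rough sketch of the bulk-send pattern described above (the model names follow Zulip's, but this is not the actual code), assuming a bulk_create that populates .id on the objects it creates:
# With .id available on the created Message rows, the UserMessage rows can
# be built in memory and written with a second bulk_create, instead of one
# INSERT (and transaction) per recipient. new_messages and recipients_for
# are placeholders for the caller's data.
messages = Message.objects.bulk_create(new_messages)
user_messages = [UserMessage(user_profile=profile, message=message)
                 for message in messages
                 for profile in recipients_for(message)]
UserMessage.objects.bulk_create(user_messages)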
(imported from commit b0dab9dac784d3ff47751e65bf22c2dddc22edf5)
We also record the historical edits to the message in this JSON format:
[{"prev_content": "new test message 14", "timestamp": 1369157249},
{"prev_content": "new test message 13", "timestamp": 1369157118}]
but we don't actually do anything with the information as of yet.
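For example, a sketch of appending an entry in that format (the edit_history field name and storage details are assumptions here):
import json
import time

def record_edit(message, prev_content):
    # Prepend the newest revision, matching the JSON format shown above.
    history = json.loads(message.edit_history) if message.edit_history else []
    history.insert(0, {"prev_content": prev_content,
                       "timestamp": int(time.time())})
    message.edit_history = json.dumps(history)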
(imported from commit 2d5ca449b87b33ad035ab0e076a22e150c8e7267)
There was no benefit to our various link processors all doing
independent scans through the list of messages, and this makes it much
easier to understand the logic of how each link will be handled, and
also makes policies like "don't process links if there are more than 5
of them" easier to implement coherently.
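A minimal sketch of the single-scan shape this enables (handler names and the cap are illustrative):
MAX_LINKS = 5  # e.g. the "more than 5" policy mentioned above

def process_links(urls, handlers):
    # One pass over the collected links; global policies apply uniformly
    # instead of being re-implemented in each processor.
    if len(urls) > MAX_LINKS:
        return {}
    results = {}
    for url in urls:
        for handler in handlers:
            rendered = handler(url)
            if rendered is not None:
                results[url] = rendered
                break
    return results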
(imported from commit 4affdeab889ba89b99eec905fdf871e78bbc3dd4)
Currently the interface for editing messages is limited to a
command-line API tool; it's great for testing with e.g.:
./api/examples/edit-message --message=348135 --content="test $(date +%s)" --site=http://localhost:9991 --subject="test"
The next commit will add a user interface for actually doing the editing.
(imported from commit bdd408cec2946f31c2292e44f724f96ed5938791)
This, combined with acrefoot's work on sending the notifications in
bulk, resolves trac #1142 -- we do only 10 database queries and the
whole operation completes in about 300ms on my laptop.
(imported from commit 36b5bb836bc6c713903d1ca72e39af87775dc469)
Since we log our cache lookup times to statsd by cache key, using a unique
tweet id for each lookup was just filling up our cache without being useful.
Also, log database cache lookups under a further namespace to distinguish
them from the memcached caches.
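A sketch of the intended bucketing, using the python statsd client for illustration (the key prefix and stat names are assumptions):
from statsd import StatsClient

statsd = StatsClient()

def stat_name_for(cache_key):
    # Collapse per-tweet keys into one bucket so each tweet id does not
    # create its own statsd series.
    return "tweet_data" if cache_key.startswith("tweet_data:") else cache_key

def log_cache_time(cache_key, elapsed_ms, backend="memcached"):
    # Database-backed cache lookups get their own namespace so they are
    # distinguishable from memcached lookups.
    statsd.timing("cache_time.%s.%s" % (backend, stat_name_for(cache_key)),
                  elapsed_ms)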
(imported from commit a2a16b777fb7ab8cd066feee7344f9c8a3c107f5)
Users can send to any stream except invite-only streams that they
aren't subscribed to. Bots can send to any stream except invite-only
streams that neither they nor their owner is subscribed to.
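A sketch of that policy as a single check (attribute and helper names are illustrative):
def can_send_to_stream(sender, stream, is_subscribed):
    # is_subscribed(user, stream) -> bool is assumed to be provided.
    if not stream.invite_only:
        return True
    if is_subscribed(sender, stream):
        return True
    # Bots may also send if their owner is subscribed.
    if sender.is_bot and sender.bot_owner is not None:
        return is_subscribed(sender.bot_owner, stream)
    return False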
(imported from commit 623d34d249d923611ca7ca781b5b55205cd3e548)
After this change, the memcached time consumed by doing get_old_messages
for 200 and 1000 messages respectively now looks like this:
200 63ms (mem: 6ms/3) (db: 4ms/2q) /json/get_old_messages
200 178ms (mem: 67ms/2) (db: 6ms/1q) /json/get_old_messages
which might help explain where the time is going on prod for some of
our slower queries.
(imported from commit b8fe83b175914b6796922a65a1c5537f4e7a9429)
For sites that are supported, we now grab thumbnails for images and embed
code for videos, and use them in lieu of our existing embed code.
We also embed rich non-script content.
Special-casing is done so that we don't embed images twice.
Some test cases were modified to avoid triggering Embed.ly.
The manual step is to install python-embedly.
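A hedged sketch of what using python-embedly looks like (the key handling and exact fields consumed here are assumptions, not the real integration):
from embedly import Embedly

client = Embedly("EMBEDLY_API_KEY")  # the real key would come from settings

def embed_for(url):
    oembed = client.oembed(url)
    if oembed["type"] == "photo":
        return {"image": oembed["url"]}        # thumbnail for images
    if oembed["type"] in ("video", "rich"):
        return {"html": oembed["html"]}        # embed code / rich content
    return None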
(imported from commit d725bab91675c61953116c5ca741055fce49724e)
This decouples us from Chrome notifications, which gives us cross-platform
support in at least modern browsers.
We log this action so it's replayable in our message logs.
This implements the model change indicated by the previous schema commit.
(imported from commit b21213cdde54f43670bbb0bf1f607147fc732b38)
In repeated trials, the initial data fetch used to take about 1100ms.
In practice, it was often taking >2000ms, probably due to caching
effects. This commit cuts the time down to about 300ms in repeated
trials.
Note that the semantics are changed slightly in that we may no longer
get exactly 25000 messages. However, holes in the message_id
sequence are currently very rare or non-existent so this shouldn't be
a problem and we don't care about the exact number of messages
anyway.
I believe the problem was that the query planner was unable to
effectively use the LIMIT clause to figure out that only a small
subset of zephyr_message was going to be needed. Thus, it planned
for operating on the entire table and decided it could not use a more
efficient plan because work_mem, although large, would not be large
enough to execute the query over all of zephyr_message.
The original query was:
SELECT "zephyr_message"."id", "zephyr_message"."sender_id", "zephyr_message"."recipient_id", "zephyr_message"."subject", "zephyr_message"."content", "zephyr_message"."rendered_content", "zephyr_message"."rendered_content_version", "zephyr_message"."pub_date", "zephyr_message"."sending_client_id", "zephyr_userprofile"."id", "zephyr_userprofile"."password", "zephyr_userprofile"."last_login", "zephyr_userprofile"."email", "zephyr_userprofile"."is_staff", "zephyr_userprofile"."is_active", "zephyr_userprofile"."date_joined", "zephyr_userprofile"."full_name", "zephyr_userprofile"."short_name", "zephyr_userprofile"."pointer", "zephyr_userprofile"."last_pointer_updater", "zephyr_userprofile"."realm_id", "zephyr_userprofile"."api_key", "zephyr_userprofile"."enable_desktop_notifications", "zephyr_userprofile"."enter_sends", "zephyr_userprofile"."tutorial_status", "zephyr_realm"."id", "zephyr_realm"."domain", "zephyr_realm"."restricted_to_domain", "zephyr_recipient"."id", "zephyr_recipient"."type_id", "zephyr_recipient"."type", "zephyr_client"."id", "zephyr_client"."name" FROM "zephyr_message" INNER JOIN "zephyr_userprofile" ON ( "zephyr_message"."sender_id" = "zephyr_userprofile"."id" ) INNER JOIN "zephyr_realm" ON ( "zephyr_userprofile"."realm_id" = "zephyr_realm"."id" ) INNER JOIN "zephyr_recipient" ON ( "zephyr_message"."recipient_id" = "zephyr_recipient"."id" ) INNER JOIN "zephyr_client" ON ( "zephyr_message"."sending_client_id" = "zephyr_client"."id" ) ORDER BY "zephyr_message"."id" DESC LIMIT 25000;
with query plan:
Limit (cost=0.00..27120.95 rows=25000 width=362) (actual time=0.051..1121.282 rows=25000 loops=1)
-> Nested Loop (cost=0.00..5330872.99 rows=4913981 width=362) (actual time=0.048..1081.014 rows=25000 loops=1)
-> Nested Loop (cost=0.00..3932643.31 rows=4913981 width=344) (actual time=0.042..926.398 rows=25000 loops=1)
-> Nested Loop (cost=0.00..2550275.29 rows=4913981 width=334) (actual time=0.035..752.524 rows=25000 loops=1)
Join Filter: (zephyr_message.sending_client_id = zephyr_client.id)
-> Nested Loop (cost=0.00..1739467.29 rows=4913981 width=320) (actual time=0.024..217.348 rows=25000 loops=1)
-> Index Scan Backward using zephyr_message_pkey on zephyr_message (cost=0.00..362510.09 rows=4913981 width=156) (actual time=0.014..42.097 rows=25000 loops=1)
-> Index Scan using zephyr_userprofile_pkey on zephyr_userprofile (cost=0.00..0.27 rows=1 width=164) (actual time=0.003..0.004 rows=1 loops=25000)
Index Cond: (id = zephyr_message.sender_id)
-> Materialize (cost=0.00..1.17 rows=11 width=14) (actual time=0.001..0.010 rows=11 loops=25000)
-> Seq Scan on zephyr_client (cost=0.00..1.11 rows=11 width=14) (actual time=0.002..0.010 rows=11 loops=1)
-> Index Scan using zephyr_recipient_pkey on zephyr_recipient (cost=0.00..0.27 rows=1 width=10) (actual time=0.002..0.003 rows=1 loops=25000)
Index Cond: (id = zephyr_message.recipient_id)
-> Index Scan using zephyr_realm_pkey on zephyr_realm (cost=0.00..0.27 rows=1 width=18) (actual time=0.002..0.003 rows=1 loops=25000)
Index Cond: (id = zephyr_userprofile.realm_id)
Total runtime: 1141.408 ms
In the new code, we do two queries:
SELECT "zephyr_message"."id" FROM "zephyr_message" ORDER BY "zephyr_message"."id" DESC LIMIT 1
followed by:
SELECT "zephyr_message"."id", "zephyr_message"."sender_id", "zephyr_message"."recipient_id", "zephyr_message"."subject", "zephyr_message"."content", "zephyr_message"."rendered_content", "zephyr_message"."rendered_content_version", "zephyr_message"."pub_date", "zephyr_message"."sending_client_id", "zephyr_userprofile"."id", "zephyr_userprofile"."password", "zephyr_userprofile"."last_login", "zephyr_userprofile"."email", "zephyr_userprofile"."is_staff", "zephyr_userprofile"."is_active", "zephyr_userprofile"."date_joined", "zephyr_userprofile"."full_name", "zephyr_userprofile"."short_name", "zephyr_userprofile"."pointer", "zephyr_userprofile"."last_pointer_updater", "zephyr_userprofile"."realm_id", "zephyr_userprofile"."api_key", "zephyr_userprofile"."enable_desktop_notifications", "zephyr_userprofile"."enter_sends", "zephyr_userprofile"."tutorial_status", "zephyr_realm"."id", "zephyr_realm"."domain", "zephyr_realm"."restricted_to_domain", "zephyr_recipient"."id", "zephyr_recipient"."type_id", "zephyr_recipient"."type", "zephyr_client"."id", "zephyr_client"."name" FROM "zephyr_message" INNER JOIN "zephyr_userprofile" ON ( "zephyr_message"."sender_id" = "zephyr_userprofile"."id" ) INNER JOIN "zephyr_realm" ON ( "zephyr_userprofile"."realm_id" = "zephyr_realm"."id" ) INNER JOIN "zephyr_recipient" ON ( "zephyr_message"."recipient_id" = "zephyr_recipient"."id" ) INNER JOIN "zephyr_client" ON ( "zephyr_message"."sending_client_id" = "zephyr_client"."id" ) WHERE "zephyr_message"."id" > 4941883
with the message id filled in as the result of the first query. The
new query differs from the original only in that its ORDER BY and
LIMIT clauses are replaced by a WHERE clause. The second query has
query plan:
Hash Join (cost=709.30..28048.18 rows=20544 width=365) (actual time=41.678..279.261 rows=25041 loops=1)
Hash Cond: (zephyr_message.recipient_id = zephyr_recipient.id)
-> Hash Join (cost=102.98..27056.66 rows=20544 width=355) (actual time=3.686..190.730 rows=25041 loops=1)
Hash Cond: (zephyr_message.sending_client_id = zephyr_client.id)
-> Hash Join (cost=101.73..26772.94 rows=20544 width=341) (actual time=3.649..143.695 rows=25041 loops=1)
Hash Cond: (zephyr_userprofile.realm_id = zephyr_realm.id)
-> Hash Join (cost=99.99..26488.71 rows=20544 width=323) (actual time=3.578..96.746 rows=25041 loops=1)
Hash Cond: (zephyr_message.sender_id = zephyr_userprofile.id)
-> Index Scan using zephyr_message_pkey on zephyr_message (cost=0.00..26106.24 rows=20544 width=159) (actual time=0.017..41.980 rows=25041 loops=1)
Index Cond: (id > 4941883)
-> Hash (cost=83.33..83.33 rows=1333 width=164) (actual time=3.548..3.548 rows=1333 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 275kB
-> Seq Scan on zephyr_userprofile (cost=0.00..83.33 rows=1333 width=164) (actual time=0.006..1.646 rows=1333 loops=1)
-> Hash (cost=1.33..1.33 rows=33 width=18) (actual time=0.064..0.064 rows=33 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 2kB
-> Seq Scan on zephyr_realm (cost=0.00..1.33 rows=33 width=18) (actual time=0.003..0.033 rows=33 loops=1)
-> Hash (cost=1.11..1.11 rows=11 width=14) (actual time=0.027..0.027 rows=11 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on zephyr_client (cost=0.00..1.11 rows=11 width=14) (actual time=0.003..0.013 rows=11 loops=1)
-> Hash (cost=335.03..335.03 rows=21703 width=10) (actual time=37.974..37.974 rows=21761 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 893kB
-> Seq Scan on zephyr_recipient (cost=0.00..335.03 rows=21703 width=10) (actual time=0.004..18.443 rows=21761 loops=1)
Total runtime: 299.300 ms
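In Django ORM terms, the new approach is roughly (an illustrative sketch, not the exact code):
# Query 1: find the newest message id (a cheap backward index scan).
last_id = Message.objects.order_by("-id").values_list("id", flat=True)[0]

# Query 2: fetch the recent window with a WHERE clause on id, letting the
# planner hash-join a small slice instead of planning over all of
# zephyr_message.
recent_messages = (Message.objects
                   .select_related()
                   .filter(id__gt=last_id - 25000))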
(imported from commit b2a70cccc47be7970df407c6be00eccd2e8be82a)
The fact that we were dumping this cache and not refilling it seems to
be one of the causes of Tornado restarts being a lot slower on prod
than on local systems.
(imported from commit a32a759f4dfb591706ede1cce2d38f5c3704193c)
On my laptop, this saves about 80 milliseconds per 1000 messages
requested via get_old_messages queries. Since we only have one
memcached process and it does not run with special priority, this
might have significant impact on load during server restarts.
(imported from commit 06ad13f32f4a6d87a0664c96297ef9843f410ac5)
Timing out within the Twitter portion of the render causes the message
to still go through (without a preview). If we don't timeout here, it
causes the entire Markdown render to timeout, which rejects the
message in its entirety -- a far worse outcome.
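One way to bound just that portion (a sketch of the idea, not necessarily the mechanism used here):
from concurrent.futures import ThreadPoolExecutor, TimeoutError

_twitter_pool = ThreadPoolExecutor(max_workers=1)

def fetch_tweet_with_timeout(fetch_tweet, tweet_id, seconds=3):
    # Bound only the Twitter fetch; on timeout the message still renders,
    # just without a preview, rather than the whole render being rejected.
    try:
        return _twitter_pool.submit(fetch_tweet, tweet_id).result(timeout=seconds)
    except TimeoutError:
        return None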
(imported from commit f510a56f48afa46da8ec6277496fa03374cdb042)
See PEP 328[1] for details. This feature was introduced in Python 2.5 and
will become mandatory in Python 3.
[1]: http://www.python.org/dev/peps/pep-0328
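Concretely, each module opts in like this (the relative import shown is just an example):
from __future__ import absolute_import

import json                      # always the stdlib json, even if ./json.py exists
from .models import UserProfile  # intra-package imports now need an explicit dot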
(imported from commit 7444eeba8a08d5f91b94c7921848f2274979bd76)
Otherwise these logs will all end up getting split up when we switch to
the new deployment model.
(imported from commit 0514c296470be7113cab6c2f48e8dd33f1b9353d)
This commit will incorrectly list previously-online users as active, a
shortcoming that is addressed in the next commit.
(imported from commit b018767df686f88c0ca939c067c573e4d7cea357)
This avoids tens of seconds of delay when you invite several people at
once through the web UI.
(imported from commit 75acdbdb04caf62bbb08affc7796330246d8a00e)
The previous code for adding users to default streams wouldn't do so
if the user didn't have a PreregistrationUser row.
(imported from commit 25f1383f6771319542d07660b29d891368889212)
This is preparatory for removing the StreamColor model, so we also arrange
for anything that changes the StreamColor model to change the Subscription
model too.
The manual task is to run the copy_colors.py management command after
deployment to each of staging and prod.
(imported from commit 1be7523ca59f5266eb2c4dc2009e31209ed49635)