We now record each message creation or update in a log table and
update the tsvector asynchronously from a worker process.
This also means that when running locally, you now need to run the
process_fts_updates script to have new messages indexed for full text
search.
(imported from commit ebb11b08d30be2a45242dafe146e8e861a0f050a)
This creates the required model fields to use the Django permissions
framework or various other third-party frameworks.
To apply this commit, run:
python manage.py migrate zephyr
(imported from commit a14fa7552c5389522d15edecedfd8a34418bb23d)
This stop words file is just the default Postgres English stop words
file with the remaining letters of the alphabet added. Adding the
extra letters ensures that, e.g., "bed" doesn't get transformed into
"bed | b".
(imported from commit 0be3ef9a43eb524ed4f081d5081a786cf602c487)
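For reference, a stop words file like the one described above (the
default English list plus every remaining single letter) could be
generated along these lines. The function name and the sample input
are illustrative assumptions; the real file was presumably built from
the full default Postgres list.

```python
import string


def extend_stop_words(default_stop_words):
    """Return the default stop words plus every ASCII letter not already listed."""
    words = list(default_stop_words)
    seen = set(words)
    for letter in string.ascii_lowercase:
        if letter not in seen:
            words.append(letter)
            seen.add(letter)
    return words


# Tiny sample standing in for the default list, which already contains
# a few single letters such as "i", "a", "s" and "t".
sample = ["i", "a", "the", "s", "t"]
extended = extend_stop_words(sample)
```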
We haven't used this for months; not removing it earlier was just an
oversight.
(imported from commit d95c1911765a04a0c8713cc6c0dd346c123c97a3)
We also record the historical edits to the message in this JSON format:
[{"prev_content": "new test message 14", "timestamp": 1369157249},
{"prev_content": "new test message 13", "timestamp": 1369157118}]
but we don't do anything with this information yet.
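A minimal sketch of how such a history could be maintained, assuming
it is stored as a JSON string and that the newest edit goes first (as
in the example above). The function name is hypothetical.

```python
import json


def append_edit_history(edit_history_json, prev_content, timestamp):
    """Prepend the previous content to a JSON-encoded edit history."""
    history = json.loads(edit_history_json) if edit_history_json else []
    # Newest edit first, matching the ordering in the example.
    history.insert(0, {"prev_content": prev_content, "timestamp": timestamp})
    return json.dumps(history)


history = append_edit_history(None, "new test message 13", 1369157118)
history = append_edit_history(history, "new test message 14", 1369157249)
```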
(imported from commit 2d5ca449b87b33ad035ab0e076a22e150c8e7267)
South doesn't properly deal with removing the Django User model, so
this commit redoes our South history to instead start after that
migration has already been applied. This allows us to get rid of some
annoying hacks.
Note that developers and staging will need to run
./manage.py migrate --delete-ghost-migrations zephyr
in order to clear out the old versions of the migrations.
(imported from commit 7f45ea601b809dde33720f76e7dfb0ab348b0e65)
We HTML-escape the subject in Postgres to avoid a server round-trip.
Unlike the rendered_content, which is already escaped and cached on
zephyr_message, we normally escape subjects client-side. Escaping in
Django would require fetching the messages that match the query,
escaping the subjects, and then making a second query to Postgres to
insert the markup. We could instead fetch the messages with subjects
marked up using non-HTML (some unique string) that is later converted
into the correct markup either in Django or client-side, but then the
escaping problem would just be with some random string instead of
HTML. Since the function is pretty simple, doing the escaping in
Postgres itself is the least painful option.
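The Postgres function itself is not reproduced here, but the escaping
it needs to perform is roughly the following, shown in Python for
illustration. The exact set of characters and entities is an
assumption; the important detail is that "&" has to be replaced first
so that the entities produced by later replacements are not escaped a
second time.

```python
def escape_html(text):
    """Escape the characters that are significant in HTML text and attributes."""
    replacements = [
        ("&", "&amp;"),  # must come first, see the note above
        ("<", "&lt;"),
        (">", "&gt;"),
        ('"', "&quot;"),
        ("'", "&#39;"),
    ]
    for char, entity in replacements:
        text = text.replace(char, entity)
    return text
```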
(imported from commit 004931d8e496697c18650aee97b1a74c55a04cb2)
In addition to changing the trigger that updates
zephyr_message.search_tsvector to use our new text search
configuration, this migration makes the trigger build the tsvector
from rendered_content instead of content and fire only on updates to
the subject or rendered_content columns.
This migration is expected to take a long time. The
checkpoint_segments parameter in postgresql.conf should be
temporarily raised (probably to 32) while it is running.
(imported from commit 4535438bb33ce1db2a74ecbe91efc52afdb568f1)
Text search was not that great, partly because Postgres wasn't
previously using an ispell dictionary (a Postgres term). We now pull
in Hunspell and use its dictionary and affix rules.
It is OK to run with this new configuration before updating our full
text column and index, which will happen in the next few commits.
Manual steps for deploy:
1) On both postgres0 and postgres1 (both before moving on to step 2),
install the hunspell-en-us package
2) On staging, run migration 0022
3) On both postgres0 and postgres1, copy the appropriate postgresql.conf
file over
4) On both postgres0 and postgres1, run `pg_ctlcluster 9.1 main reload`
(imported from commit 706bf0f6ecc46c712cea10b73c34fd9d1dfd4767)
We accidentally lost this when we did the User/UserProfile merge (this
commit also deletes the old code to add the auth_user index in
do-destroy-rebuild-database).
The following is mostly just notes for future reference, but when
deploying this change to staging, we should consider running the
following SQL instead of using the migration directly:
CREATE UNIQUE INDEX CONCURRENTLY zephyr_userprofile_email_uniq ON zephyr_userprofile(email);
ALTER TABLE zephyr_userprofile ADD CONSTRAINT zephyr_userprofile_email_uniq UNIQUE USING INDEX zephyr_userprofile_email_uniq;
CREATE INDEX CONCURRENTLY zephyr_userprofile_email ON zephyr_userprofile(email);
But it may be fine to just run the migration directly, since the
ALTER TABLE part seems to hang if there's an open transaction working
on a UserProfile object anyway.
(imported from commit 1bf34ce242de51e97c91c8bab86b6b273e17fb43)
Adds a new db table for storing presences, and an API for setting
an individual user's idleness as well as fetching the idle status
of all users in a realm.
(imported from commit 5aad3510d4c90c49470c130d6dfa80f0d36b0057)
This needs to be done in three South migrations so that users aren't
blocked from sending messages for a long time. Adding the column
requires a write lock on the zephyr_message table, and populating the
new column takes a long time, so we can't do both in the same
transaction (which South forces on migrations). Additionally,
creating the index is computationally expensive and locks the table
unless it is done CONCURRENTLY, and a concurrent index build can't run
inside a transaction.
To do this manual change, you need to run:
python manage.py migrate zephyr 0007
ssh postgres.humbughq.com 'echo "CREATE INDEX CONCURRENTLY zephyr_message_search_tsvector ON zephyr_message USING gin(search_tsvector);" | psql'
python manage.py migrate zephyr 0008
on staging. No action is required on prod since the database is
shared.
Note that this migration must be done completely before we switch to
using the tsvector cache column.
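If the index step were scripted instead of piped through psql, it
would have to run outside a transaction. The sketch below assumes a
psycopg2-style connection object that exposes an autocommit attribute;
the helper names are hypothetical, and the connection is passed in
rather than created here.

```python
def build_index_sql(index_name, table, expression, method="gin"):
    """Build a CREATE INDEX CONCURRENTLY statement from trusted identifiers."""
    return "CREATE INDEX CONCURRENTLY {} ON {} USING {}({});".format(
        index_name, table, method, expression)


def create_index_concurrently(conn, sql):
    """Run the statement in autocommit mode, i.e. outside any transaction.

    Assumes conn is a psycopg2-style connection with no transaction
    currently open; the previous autocommit setting is restored afterwards.
    """
    previous = conn.autocommit
    conn.autocommit = True
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
        finally:
            cursor.close()
    finally:
        conn.autocommit = previous


sql = build_index_sql(
    "zephyr_message_search_tsvector", "zephyr_message", "search_tsvector")
```

The identifiers are interpolated directly into the statement, so this
is only appropriate for fixed, trusted names like the ones above.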
(imported from commit b6a27013a60c1fd196eabb095d2d11d20bba5aac)
Autogenerated schema migration:
+ Added field in_home_view on zephyr.Subscription
To do this manual change, you need to run:
python manage.py migrate zephyr 0005
on staging. No action is required on prod since there is a shared database.
(imported from commit d554f17b25631482ec2d5605a42ac0b9d6df421e)
This schema migration is only for use in automated migrations. To
deploy on the production database (the migration only needs to be
done once for both staging and prod because they share a database),
you should instead execute the following SQL manually:
$ ssh postgres.humbughq.com
$ psql
humbug=> CREATE INDEX CONCURRENTLY zephyr_message_full_text_idx ON zephyr_message USING gin(to_tsvector('english', subject || ' ' || content));
Note the addition of the "CONCURRENTLY" keyword. The problem is that
creating the index takes non-trivial time and requires a write lock
on the table while the index is being created. This would mean that
users would be unable to send messages while we were generating the
index, which isn't acceptable. We can't create the index
concurrently in the South migration because concurrent index
creations can't happen inside of a transaction and South forces a
transaction on migration functions.
Also note that this index must be created before Postgres full text
search is deployed to the app because full text search without an index
is actually much slower than plain search using the LIKE operator.
(imported from commit 8b9445c27d0e427278de997b22342bffe6d855b7)
Added field invited_at on zephyr.PreregistrationUser, with a one-time
default of Jan 1, 1970 for existing objects.
Added M2M table for streams on zephyr.PreregistrationUser
Deleted unique constraint for ['email'] on zephyr.PreregistrationUser
(imported from commit 85247acb488201f8fc51dfaae354423c27eddcb0)