I've tried to do this in a way that's scalable and easily configured,
so that we can add new such filters for customers on-demand without
needing to add anything other than a bit of configuration.
Once we're confident in the arguments to this system, I think we'll
want to move the regular expression lists into the database so that we
don't need to do a prod push to modify the regular expression lists.
The initial set of regular expressions are:
(1) Linkifying e.g. "trac #224" in the Humbug realm, so we're exercising this code.
(2) The various ticket number things CUSTOMER7 uses for the CUSTOMER7 realm.
(imported from commit 992b0937b9012c15a7c2f585eb0aacb221c52e01)
Some cache keys used by Django (like sessions) will not have the key
prefixes, but those values shouldn't change across most restarts.
(imported from commit 2fe61028111fe9d5700432214a611b3341412654)
Diff Match Patch provides more human-readable diffs. For example,
try replacing "mouse" with "sofas".
(imported from commit 7ced81202ce85d5ef69888c59912e3e44c38cfc8)
I didn't use red and green for fear of it not being visible to
color-blind users. We may need to tweak the colors.
(imported from commit 59c4f1dac549a248783e4c3b3ec472d8cb690df5)
I would really like to parse the HTML we produce from the library to
ensure that we don't generate malformed-HTML. This is unfortunately
hard because we both want pretty strict parsing and we want to parse
html5 fragments. For now, we just do a basic sanity check.
We also may want to switch to Google Diff-Match-Patch, as that can
clean up the resulting diffs.
(imported from commit 3772f92135cfd7423c335335f861f2c11462a8db)
Previously we were generating API keys deterministically using a hash
of the user's email address; this is clearly not a good long-term
approach.
(imported from commit 14d0c7c9edbc45b3ae1d17a43765ad9726338d4d)
We get too many error reports from it, which is bad for us actually
fixing the other errors that we do have.
(imported from commit 8442fe4251adb15a01b4e61ebcd07bc270b08631)
Messages that get sent out when someone subscribes many people to a new stream each
cause individual database queries (and their associated transactions). With the patched
bulk_create (which sets the .id on created objects), we can reduce this query down to a constant
number of queries on the Message and UserMessage tables.
Note for deployment (local dev, staging and prod):
you must be running a patched django, found here: https://github.com/acrefoot/django/branches
use this branch: acrefoot-bulk_create_with_id-1.5.1
on acrefoot-bulk_create_with_id-1.5.1
relevant sha1: ac6d885b811f7e2e34f0db0da217983f7dfd357f
(imported from commit b0dab9dac784d3ff47751e65bf22c2dddc22edf5)
We also record the historical edits to the message in this JSON format:
[{"prev_content": "new test message 14", "timestamp": 1369157249},
{"prev_content": "new test message 13", "timestamp": 1369157118}]
but we don't actually do anything with the information as of yet.
(imported from commit 2d5ca449b87b33ad035ab0e076a22e150c8e7267)
There was no benefit to our various link processors all doing
independent scans through the list of messages, and this makes it much
easier to understand the logic of how each link will be handled, and
also makes policies like "don't process links if there are more than 5
of then" easier to implement coherently.
(imported from commit 4affdeab889ba89b99eec905fdf871e78bbc3dd4)
Currently the interface for editing messages is limited to a
command-line API tool; it's great for testing with e.g.:
./api/examples/edit-message --message=348135 --content="test $(date +%s)" --site=http://localhost:9991 --subject="test"
The next commit will add a user interface for actually doing the editing.
(imported from commit bdd408cec2946f31c2292e44f724f96ed5938791)
This, combined with acrefoot's work on sending the notifications in
bulk, resolves trac #1142 -- we do only 10 database queries and the
whole operation completes in about 300ms on my laptop.
(imported from commit 36b5bb836bc6c713903d1ca72e39af87775dc469)