* Allow email addresses surrounded by <>
* Reject things that look like email addresses that have a path after them
This requires adding a new branch to the regex specifically for email addresses.
* Fix comment whitespace
(imported from commit 0383cd4067ae9ee31f3802e6777a200ba1cbccd6)
Be more restrictive on what characters can be part of a URL and what
characters can precede a URL to prevent linkifying other strings that
come just before a valid URL. Allow : and , before a URL.
(imported from commit f072980b39ff652edf20de0585f256f072d04e88)
* This makes bugdown.convert take a `message` parameter. Properties
for parsed mentions are added to the message object by the `Pattern`
for use in do_send_messages.
* Refactor repeated markdown rendering code into `Message` model methods.
(imported from commit 4f0ed5570104c0210f984b6de21e9048e2b53fa0)
(Before it had been disabled only on prod/staging, but we are
removing it everywhere, motivated by making tests run faster.
In particular, the call to embedly_client.is_supported() was
expensive, as it went over the Internet.)
(imported from commit ea12bf6e7ae84ce7e8023a0d314ecc4c07cbc0a8)
This saves something like 15ms on our 1000 message get_old_messages
queries, and will save even more when we start sending JSON dumps into
our memcached system.
We need to install python-ujson on servers and dev instances before
pushing this to prod.
(imported from commit 373690b7c056d00d2299a7588a33f025104bfbca)
I've tried to do this in a way that's scalable and easily configured,
so that we can add new such filters for customers on-demand without
needing to add anything other than a bit of configuration.
Once we're confident in the arguments to this system, I think we'll
want to move the regular expression lists into the database so that we
don't need to do a prod push to modify the regular expression lists.
The initial set of regular expressions are:
(1) Linkifying e.g. "trac #224" in the Humbug realm, so we're exercising this code.
(2) The various ticket number things CUSTOMER7 uses for the CUSTOMER7 realm.
(imported from commit 992b0937b9012c15a7c2f585eb0aacb221c52e01)
We get too many error reports from it, which is bad for us actually
fixing the other errors that we do have.
(imported from commit 8442fe4251adb15a01b4e61ebcd07bc270b08631)
There was no benefit to our various link processors all doing
independent scans through the list of messages, and this makes it much
easier to understand the logic of how each link will be handled, and
also makes policies like "don't process links if there are more than 5
of then" easier to implement coherently.
(imported from commit 4affdeab889ba89b99eec905fdf871e78bbc3dd4)
Since we log to statsd our cache time lookups by cache key, using a unique
tweet id for each lookup was just filling up our cache without being useful.
Also, log database cache lookups in a further namespace to distinguish between
memcached caches
(imported from commit a2a16b777fb7ab8cd066feee7344f9c8a3c107f5)
For sites that are supported, we now grab thumbnails for images + video
embed code for videos and use them in lieu of our existing embed code.
We also embed rich non-script content.
Special casing is done so that we don't embed images twice.
Some testcases were modified to avoid triggering Embed.ly
The manual step is to install python-embedly.
(imported from commit d725bab91675c61953116c5ca741055fce49724e)
Timing out within the Twitter portion of the render causes the message
to still go through (without a preview). If we don't timeout here, it
causes the entire Markdown render to timeout, which rejects the
message in its entirety -- a far worse outcome.
(imported from commit f510a56f48afa46da8ec6277496fa03374cdb042)
This will automatically fix bugs such as one in which
internal_send_message didn't properly strip() the subject argument
before sending a message.
We change the recipient_type argument to internal_send_message to take
the recipient type name (e.g. 'stream') both to better fit the API and
also because the previous code incorrectly handled huddles.
(imported from commit 78c2596d328f6bb1ce2eaa3eed9a9e48146e3b6a)
Sometimes Dropbox shares with /s/ and sometimes with /sh/,
and I'm not sure which controls it, but we should deal with both.
(imported from commit 2222450f25c418b5fbd60ab2c30477467e34c0d1)
This avoids our repeatedly retrying to fetch a tweet that doesn't
exist from the Twitter API.
(imported from commit b4ca1060d03da21e7e59e5b99e682d2e8457df15)
This should substantially improve the repeat-rendering time for pages
with large numbers of tweets since we don't need to go all the way to
twitter.com, which can take like a second, to render tweets properly.
To deploy this commit properly, one needs to run
./manage.py createcachetable third_party_api_results
(imported from commit 01b528e61f9dde2ee718bdec0490088907b6017e)
This commit adds a dependency on python-twitter,
whose upstream is at https://github.com/bear/python-twitter,
and which for now needs to manually be installed on our
servers from the Debian package in sid.
(imported from commit 80cd9f4f59a6f0de6b75ac95e412c69e2a2e2490)