We use the message a lot for the query modified
here, so I think it's worth taking the up-front
hit of getting bulkier objects to avoid O(N)
hops back to the database.
Normal hipchat exports use integer ids for their
users and "rooms," which we just borrowed during
conversion.
Atlassian Stride uses stride UUIDs for these instead, but otherwise
has the same export format.
We now introduce IdMapper to handle external ids
that aren't integer. The IdMapper will map UUID
ids to ints and remember them. For ints it just
leaves them alone.
Fixes#10805.
Our webhook-errors.log file is riddled with exceptions that are
logged when a webhook is incorrectly configured to send data in
a non-JSON format. To avoid this, api_key_only_webhook_view
now supports an additional argument, notify_bot_owner_on_invalid_json.
This argument, when True, will send a PM notification to the bot's
owner notifying them of the configuration issue.
We (lexically) remove "subject" from the conversion code. The
`build_message` helper calls `set_topic_name` under the hood,
so things still have "subject" in the JSON.
There was good code coverage on `build_message`.
We now attach zulip_db_data to the markdown engines
for classes that need it. This was the last remaining
global we had, so we remove `arguments.py` here.
The Markdown processor makes it fairly simple for
the helper classes to access the `md` engine. We
now write `_md_engine.zulip_message` to avoid having
the current message in the global namespace.
Note that we do reuse engines for multiple messages,
but each engine is specific to a realm. And we therefore
avoid even the theoretical possibility of leaking message
data between realms.
We were building the same link regex every time
we build a Markdown engine, which happens twice
per realm. It's an expensive operation due to
the complexity of the regex and us reading a file.
This is a preparator refactor for supporting hosting different Tornado
processes on different servers; to look up which Tornado server we
should be sending the event to, we'll need the realm object.
This should make it possible for there to safely be multiple Tornado
processes running on different ports on the same system.
It may also fix a rare race bug in development, where previously, it
was possible for the Tornados processes for Casper and the main
development server to interfere; I haven't investigated whether this
was a real bug or not, but now those two services will use independent
Tornado files.
We still need to add something to direct traffic between the different
Tornado processes.
This is mostly an extraction, but it does change the
way we calculate `content`. We append the markdown
links from ALL files to any content that came in the
message itself.
Separating this out also allows us to add more
test coverage for the extracted code.
We now use subscriber_map for building UserMessage
rows in Slack/Gitter conversions.
This is mostly designed to simplify the code, rather
than having to scan the entire subscribers for each
message.
I am guessing this will improve performance for most
conversions. We sort small lists on every message,
in order to be deterministic, but the sorting cost
is probably more than offset by avoiding the O(N)
scans across all subscriptions. Also, it's probably
negligible in the grand scheme of things, compared
to JSON parsing, file I/O, etc.
This commits also fixes some typos with mentioned_users_id ->
mentioned_user_ids and cleans up a test a bit as well.