datetime.timezone is available in Python ≥ 3.2. This also lets us
remove a pytz dependency from the PostgreSQL scripts.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Refactored code in actions.py and streams.py to move stream related
functions into streams.py and remove the dependency on actions.py.
validate_sender_can_write_to_stream function in actions.py was renamed
to access_stream_for_send_message in streams.py.
This is critical for importing the very first realm into an empty
server, since in 27b15a9722, we changed
the model to create the internal realm when the first real realm would
be created, but neglected the data import code path.
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".
At the time, we just needed a name quickly, but it was never a good
name, just a placeholder. This removes that placeholder name from
much of the codebase. A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
This legacy cross-realm bot hasn't been used in several years, as far
as I know. If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.
Fixes#13533.
This is adds foreign keys to the corresponding Recipient object in the
UserProfile on Stream tables, a denormalization intended to improve
performance as this is a common query.
In the migration for setting the field correctly for existing users,
we do a direct SQL query (because Django 1.11 doesn't provide any good
method for doing it properly in bulk using the ORM.).
A consequence of this change to the model is that a bit of code needs
to be added to the functions responsible for creating new users (to
set the field after the Recipient object gets created). Fortunately,
there's only a few code paths for doing that.
Also an adjustment is needed in the import system - this introduces a
circular relation between Recipient and UserProfile. The field cannot be
set until the Recipient objects have been created, but UserProfiles need
to be created before their corresponding Recipients. We deal with this
by first importing UserProfiles same way as before, but we leave the
personal_recipient field uninitialized. After creating the Recipient
objects, we call a function to set the field for all the imported users
in bulk.
A similar change is made for managing Stream objects.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
Apparently, a subtle mismatch between the filename/URL formats for our
upload codebases meant that importing Slack avatars into systems using
S3_UPLOAD_BACKEND would end up with the avatars having the wrong URLs.
Our recently-added code for rewriting user IDs on data import didn't
correctly handle wildcard mentions and mentions generated by very old
versions of Zulip (pre data-user-id).
The previous query ended up doing an awkward join that did not
guarantee use of the Recipient index on zerver_message, turning a very
fast query into something that could take much longer for a single
stream than the rest of the import combined.
lxml parser appends html and body tags to the soup object which
are not reqired. There are no other major parsing diffrences between
the two parsers as long the HTML input is perfectly formated.
lxml parser is much faster than html.parser but it hardly matters
in our case.
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#differences-
between-parsers
Previously, if you exported a Zulip organization and then re-imported
it, we'd end up renumbering the user IDs and all direct foreign key
references to them in the database, but not the data-user-id
references in mentions. Fix this by parsing the message content and
doing that renumbering.
(Because we import raw markdown, not HTML, from third-party tools,
these changes won't affect data import from slack etc.)
Fixes the high-priority part of #11293.
This field is primarily intended to support avoiding displaying the
"more topics" feature in new organizations and streams, where we might
know that all messages in the stream are already available in the
browser.
Based on original work by Roman Godov, and significantly modified by
tabbott.
The second migration involved here could be expensive on Zulip Cloud,
but is unlikely to be an issue on other servers.
In commit de65a04 we can see that if the need ever arises to modify
how stream descriptions are rendered, we would need to make changes
at 5 different call points which can be quite cumbersome. So this
functionality has been extracted to a new method called
'render_stream_descriptions'.
This commit leverages the ahocorasick algorithm to build a set of user_ids
that have their alert_words present in the message. It runs in linear time
of the order of length of the input message as opposed to number of
alert_words. This is after building a ahocorasick Automaton which runs
in O(number of alert_words in entire realm) which is usually cached.
We want to use the baseline features of bugdown, but not fancy things
like inline URL previews, since the whole structure of stream
descriptions is to have a single-line thing supporting some
formatting.
The migration part of this change fixes a bug encountered by some
organizations upgrading from older versions of Zulip.
We've for a while had logic to set plan_type to LIMITED when importing
into Zulip Cloud; we need corresponding logic to set it to SELF_HOSTED
when importing into a self-hosted server.
Fixes#11541.
This helps keep the realm.json small and easy to process; previously,
almost the entire size of that file was the analytics data.
We implement this by refactoring the analytics Config objects into a
separate subroutine that writes to a separate file, plus the
corresponding import code.
Manual testing was performed by exporting the 'analytics' realm, and
importing back to a newly created 'test' realm. The 'test' realm was
then exported and the json files were inspected. The data appeared
consistent with no abnormalities.
Fixes: #11220.
This commit does the following three things:
1. Update stream model to accomodate rendered description.
2. Render and save the stream rendered description on update.
3. Render and save stream descriptions on creation.
Further, the stream's rendered description is also sent whenever the
stream's description is being sent.
This is preparatory work for eliminating the use of the
non-authoritative marked.js markdown parser for stream descriptions.
This should eliminate the need to do manual analytics work when
importing organizations imported/exported using the zulip -> zulip
import/export tools.
The octet-stream content type is potentially under-specified, but it's
better than potentially submitting None and increases consistency of
this part of the codebase.
The boto library's s3 interface allows setting only string-format
metadata keys. So we need to cast the last_modified floating-point
timestamp into a string before storing on the S3 object.
This bug mostly broke uploading avatars when using the S3 storage backend.
Our HipChat conversion tool didn't properly handle basic avatar
images, resulting in only the medium-size avatar images being imported
properly. This fixes that bug by asking the import tool to do the
thumbnailing for the basic avatar image (from the .original file) as
well as the medium avatar image.
Fixes a bug in import_realm where secondary attributes like message
visibility weren't being set, and also makes bugs like this less likely in
the future.
Also, putting the plan_type change at the end of import_realm, so that
future restrictions to LIMITED realms don't affect the import process.
We've had a long stream of bugs existed because only one of these two
code paths was tested (usually the local uploads backend). By
deduplicating these functions, we ensure that this category of bugs no
longer happens.
Following my recent refactor, this is just a straightforward merge,
with code for one or the other backend ending up inside an if
statement.