We add a new model, ArchiveTransaction, to tie archived objects together
in a coherent way, according to the batches in which they are archived.
This enables making a better system for restoring from archive, and it
seems just more sensible to tie the archived objects in this way, rather
than the somewhat vague setting of archive_timestamp on each object using
timezone_now().
Now that we have a system for storing HTTP headers for each integration, we
should fix the send_all button. Previously, it used the same user-entered
custom HTTP header (from the GUI) for all of the fixtures, but now we
automatically determine the header with the new system instead.
For storing HTTP headers as a function of fixture name, previously
we required that the fixture_to_headers method should reside in a
separate module called headers.py.
However, since in many cases this method will only take a few lines,
we decided to move this function into the view.py file of the
integration instead of requiring a whole new file called headers.py.
This commit introduces the small change in the system architecture,
migrates the GitHub integration, and updates the docs accordingly.
In the GitHub integration we established that for many integrations,
we can directly map the fixture filename to the set of required
headers, and that by following a simple naming convention we can greatly
simplify the logic required in the fixture_to_headers method.
So to prevent the need for duplicating the logic used by the GitHub
integration, we created a method called `get_http_headers_from_filename`
which will take the name of the HTTP header (key) and then return a
corresponding method (in a decorator-like fashion) which could then be
equated to fixture_to_headers in headers.py.
The GitHub integration was modified to use this method and the docs
were updated to suggest using this when possible.
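A minimal sketch of such a factory follows. It assumes the naming
convention is that a fixture filename starts with the event name,
optionally followed by a double underscore and a description; that
convention and the example header key are assumptions made for
illustration, not details taken from this commit.

```python
from typing import Callable, Dict


def get_http_headers_from_filename(http_header_key: str) -> Callable[[str], Dict[str, str]]:
    """Return a fixture_to_headers function bound to one header key."""

    def fixture_to_headers(filename: str) -> Dict[str, str]:
        # Assumed convention: "<event>__<description>" or just "<event>".
        event_type = filename.split("__")[0]
        return {http_header_key: event_type}

    return fixture_to_headers


# Hypothetical usage in an integration module:
fixture_to_headers = get_http_headers_from_filename("HTTP_X_GITHUB_EVENT")
```

With this shape, an integration only has to pick its header key; the
per-fixture logic is shared.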
This fixes the mis-alphabetized `fluid_layout_width` in a few places in
the codebase. It also fixes the sorting order of the `property_types`
dictionary in models.py and of a few model fields of the `UserProfile`
model class.
The markup output changed but the rendering is the same, so we modified
the expected output in the tests.
There is a regression introduced in one of the new versions of KaTeX,
which produces a warning in our node tests:
```
No character metrics for ' ' in style 'Main-Bold'
```
but the rendering is correct so we can ignore it.
Tracking issue: KaTeX/KaTeX#1994
Fixes #12472.
When parsing custom HTTP headers in the integrations dev panel, HTTP
headers from the fixtures system, and headers in
send_webhook_fixture_message, we now use a single source of logic:
standardize_headers, which takes care of converting a dictionary of
input headers into the standard form that Django expects.
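As a rough illustration of what such a normalization could look like
(a sketch, not the actual implementation): Django exposes request
headers in request.META with names uppercased, hyphens turned into
underscores, and an HTTP_ prefix, except for CONTENT_TYPE and
CONTENT_LENGTH. The function body below is an assumption built on that
rule.

```python
from typing import Dict


def standardize_headers(input_headers: Dict[str, str]) -> Dict[str, str]:
    """Convert header names to the keys Django uses in request.META."""
    # Django stores these two headers without the HTTP_ prefix.
    unprefixed = {"CONTENT_TYPE", "CONTENT_LENGTH"}
    canonical: Dict[str, str] = {}
    for raw_name, value in input_headers.items():
        name = raw_name.strip().upper().replace("-", "_")
        if name not in unprefixed and not name.startswith("HTTP_"):
            name = "HTTP_" + name
        canonical[name] = str(value)
    return canonical
```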
Previously, our GitHub authentication backend just used the user's
primary email address associated with GitHub, which was a reasonable
default, but quite annoying for users who have several email addresses
associated with their GitHub account.
We fix this by adding a new screen where users can select which of
their (verified) GitHub email addresses to use for authentication.
This is implemented using the "partial" feature of the
python-social-auth pipeline system.
Each email is displayed as a button. Clicking on that button chooses
the email. The email value is stored in a hidden input above the
button. The `primary_email` is displayed on top followed by
`verified_non_primary_emails`. Backend name is also passed as
`backend` to the template, which in our case is GitHub.
Fixes #9876.
Now that we store HTTP headers in a way that is easy to retrieve
by specifying the integration name and fixture name, we should
use it to pre-load the "Custom HTTP Headers" field in the
integrations dev panel.
Using this system, we can now associate any fixture of any integration
with a particular set of HTTP headers. A helper method called
determine_http_headers was introduced, and the test suite was upgraded
to use determine_http_headers.
Comments and documentation significantly edited by tabbott.
This function is an alternative to get_admin_users that we use in all
places where we explicitly want only human administrative users (not
administrative bots). The following commits will rename
get_admin_users for better clarity.
The argument parser has default empty values set for the options
`--password` and `--password-file`, and this causes the script to try and
read a password file even when the argument was not provided.
Our recently-added code for rewriting user IDs on data import didn't
correctly handle wildcard mentions and mentions generated by very old
versions of Zulip (pre data-user-id).
The previous query ended up doing an awkward join that did not
guarantee use of the Recipient index on zerver_message, turning a very
fast query into something that could take much longer for a single
stream than the rest of the import combined.
We also document support for user IDs in the pm-with narrow operator.
Edited by tabbott to document on /api rather than in the /help page.
Fixes part of #9474.
If the event key is None, the handler content_func never gets
defined, which leads to an UnboundLocalError. This can be easily
avoided by having a dedicated function that handles the case for
when the event key is None.
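The pattern can be sketched as follows. All names here (the handlers,
the HANDLERS table, get_content) are hypothetical and only illustrate
how a dedicated handler keeps content_func bound on every path.

```python
from typing import Any, Callable, Dict, Optional


def handle_known_event(payload: Dict[str, Any]) -> str:
    return "Event received: {}".format(payload.get("title", "untitled"))


def handle_missing_event_key(payload: Dict[str, Any]) -> str:
    # Dedicated handler for payloads that carry no event key.
    return "Received a payload without an event key."


HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "created": handle_known_event,
}


def get_content(event_key: Optional[str], payload: Dict[str, Any]) -> str:
    # content_func is assigned on every branch, so it can never be unbound.
    if event_key is None:
        content_func = handle_missing_event_key
    else:
        content_func = HANDLERS[event_key]
    return content_func(payload)
```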
Namely, here we add the "plan_includes_wide_organization_logo" and
"upgrade_text_for_wide_organization_logo" to the page_params (which
is set in zerver/lib/events.py).
"plan_includes_wide_organization_logo" is True if the plan is not of
the Realm.LIMITED type. We need to add this extra boolean parameter
instead of just using "realm_plan_type" to make things a lot easier
to work with on the frontend side, especially considering that
handlebars won't allow checking for equality in its {{#if}} blocks.
When a realm's plan type is updated using "do_change_plan_type" we
notify active users of the realm. This way certain plan features
could be enabled instantaneously for active users.
If the invoice was paid then the message should simply be
"Invoice is now paid." with a link to the invoice.
Also, suppress the "status_transitions" and "payment_intent" events.
A function was written in `test_fixtures.py` to drop a test database
template if the corresponding database id doesn't belong to a file.
In addition, every file that is written is removed after 60 minutes,
meaning that any potential database template can never exist for
longer than one hour.
This follow-up work was added to deal with the potential race
conditions when running `test-backend`, ensuring that all templates
are properly dealt with.
Essentially rewritten by tabbott for cleanliness.
Fixes the remainder of #12426.
The ids that will be used for each particular run of the test suite are
written to a unique file. Each file will then be used as a time
reference of when the suite was run.
This change sets up the ability for a complete clean up of potentially
leaked database templates.
Tweaked by tabbott to remove these files after successful database
cleanup.
When running the test-backend suite in serial mode, `destroy_test_db`
double appends the database id number to the template if passed an
argument for `number`. The comment here explains this behavior.
This fixes an issue that caused LDAP synchronization to fail for
avatars. The problem occurred due to the lack of a 'name' attribute
on the BytesIO object that we pass to the upload backend (which is
only used in the S3 backend for computing Content-Type).
Fixes #12411.
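The underlying behavior can be seen with the standard library alone.
In this sketch the file name "avatar.png" and the use of mimetypes to
guess the Content-Type are assumptions for illustration; the commit
only states that the upload backend needs a 'name' attribute.

```python
import io
import mimetypes

avatar_bytes = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a real image
avatar_file = io.BytesIO(avatar_bytes)

# A bare BytesIO has no 'name' attribute, unlike a file opened from disk.
had_name_before = hasattr(avatar_file, "name")

# Hypothetical fix: attach a name so a backend can guess the Content-Type.
avatar_file.name = "avatar.png"
content_type, _ = mimetypes.guess_type(avatar_file.name)
```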
Rather than relying on the CASCADING property of the ForeignKey to the
Message table to clean up these objects, we delete them in the same
query as we archive them - since it's guaranteed that any of these
objects that we archive will be deleted due to their Message being
deleted later.
We don't have this guarantee for Attachment objects, which is why we
can't apply this scheme to them.
To ensure the database retains a consistent state if archiving gets
interrupted, we process each Messages chunk together with related
objects in a single atomic transaction.
We had two duplicate functions for archiving zerver_attachment_messages
rows, doing the same thing - archiving by message_id. One of them had a
redundant INNER JOIN, so we get rid of that too.
Since we loop over realms in the functions for archiving stream messages
and then personal+huddle messages, and also want to split cleaning up
attachments by realm - it makes sense to do it all in one single loop.
Rename notification property `enable_stream_sounds` to
`enable_stream_audible_notifications` to match with other
notification property patterns.
Fixes part of #12304
We batch queries that archive Messages, to limit the maximum number of
Message objects archived in a single query. This leads to the archiving
of other related objects being batched as well, because we loop over
chunks of archived messages and archive their related objects per-chunk.
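The chunked loop can be sketched in plain Python. The helper names and
the in-memory list of ids are hypothetical; the real code works on
database queries rather than lists.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most chunk_size ids."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(ids), chunk_size):
        yield list(ids[start:start + chunk_size])


def archive_in_batches(message_ids: Sequence[int], chunk_size: int) -> List[List[int]]:
    processed = []
    for chunk in chunked(message_ids, chunk_size):
        # Real code would archive the messages in `chunk` and then
        # their related objects here, before moving to the next chunk.
        processed.append(chunk)
    return processed
```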
N = self.parallel templates are created, and these templates were
previously named 'zulip_test_template_<1, N>'. However, to support
running multiple instances of `test-backend`, a unique
`random_id_range_start` was created for each template database.
There was no problem prior because the templates would simply be
used again and thus did not require any clean up. Now that there are
unique database names being created, every time `test-backend` is run
these templates can accumulate on disk. Instead, we clean up our
templates at the end of every complete run of the test suite, or upon a
SIGINT.
Fixes: #12426
This validation is incomplete, in large part because of the long list
of TODOs in this code. But this test should provide a ton of support
for us in avoiding regressions as we work towards having complete API
documentation.
See https://github.com/zulip/zulip/issues/12521 for a bunch of
follow-up improvements.
We add the following behavior:
If a stream has message_retention_days set to -1, archiving for it is
disabled.
If a stream has message_retention_days set to null, use the realm's
policy. If the realm has no policy, we don't archive for this stream.
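These rules can be summarized as a small function. The function name
is hypothetical, and representing "no policy" as None is an assumption
of this sketch.

```python
from typing import Optional


def effective_retention_days(stream_days: Optional[int],
                             realm_days: Optional[int]) -> Optional[int]:
    """Return the retention period in days, or None if nothing is archived."""
    if stream_days == -1:
        # Archiving is explicitly disabled for this stream.
        return None
    if stream_days is None:
        # Fall back to the realm policy; None means the realm has none.
        return realm_days
    return stream_days
```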
UserMessages no longer need special handling; they can be archived by
move_models_with_message_key_to_archive and automatically cleaned up
like the other models with a message key with CASCADING=True.
We change the archiving scheme to allow having stream based retention
policies. In the first step of the archiving process, we loop over
streams and archive their expired messages and related objects.
Then we separately archive all expired personal and huddle messages and
related objects. As the last step, we scan for redundant attachments
which can now be deleted.
To achieve this, we have to rewrite a significant portion of the
retention code and rework some of the database queries.
For the sake of simplicity, we neither archive nor delete cross-realm
messages, except cross-realm stream messages – in their case they can
be processed in the same manner as ordinary stream messages.
In the query for archiving personal and huddle messages we simply
exclude those sent by cross-realm bots.
We change the tests to adapt to these modifications.
Since we archive attachments and attachment_messages tied to a list of
ids of Messages that we just archived (so from the current realm), it's
unnecessary to check their realm in the queries. This could potentially
cause archiving of an attachment with realm_id of another realm, but
this isn't an issue, as long as we make sure we don't end up deleting
the original Attachment object incorrectly - but realm_id check is
included in delete_expired_attachments() to ensure that.
This makes it a lot more useful for understanding how our flag update
endpoints work.
With significant edits by tabbott to explain what these are.
Fixes #12092.
Previously, we didn't have validation to prevent editing certain flags
that don't make sense for a client to edit, like whether a user was
mentioned in a given message.
This isn't a security issue -- the user could only mess up their own
personal search results (etc.), but it does seem worth fixing to avoid
confusion for folks developing Zulip clients.
While we're at it, clearly document the situation in comments.
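A minimal sketch of this kind of validation is shown below. The split
of flags into editable and non-editable sets is an assumption; only the
"mentioned" case is named above, and the function name is hypothetical.

```python
# Assumed split; only "mentioned" is named in the commit message.
EDITABLE_FLAGS = {"read", "starred"}
NON_EDITABLE_FLAGS = {"mentioned"}


def validate_flag_update(flag: str) -> None:
    """Raise ValueError unless a client is allowed to edit this flag."""
    if flag in NON_EDITABLE_FLAGS:
        raise ValueError("Flag not editable: '{}'".format(flag))
    if flag not in EDITABLE_FLAGS:
        raise ValueError("Invalid flag: '{}'".format(flag))
```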
This adds a setting to control Zulip's default behavior of sorting to
bottom and graying out inactive streams. The previous logic is still
the default "automatic", but this gives users more control. See the
models.py comment for details.
Fixes #11524.
We were apparently reusing the path for both the development and test
databases, which meant that we would not always correctly run
`generate_fixtures` when changes were required.
This was a recent regression introduced when we added this cache a few
days ago.
We add RETURNING to fetch relevant message and usermessage ids in
archiving queries and use them to make other queries faster.
A side-effect of this implementation is that with cross-realm messages,
the UserMessage of the recipient and the Message will not be deleted -
but cross-realm messages are rare, will still get correctly put in the
archive tables and so failing to delete should not be a problem for now.
They will be fully handled later.
zerver_archivedmessage is already INNER JOIN-ed earlier in the query, so
we check the pub_date in it, instead of joining zerver_message, which
would just redundantly join the analogous rows.
The lxml parser appends html and body tags to the soup object, which
are not required. There are no other major parsing differences between
the two parsers as long as the HTML input is perfectly formatted.
The lxml parser is much faster than html.parser, but it hardly matters
in our case.
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#differences-between-parsers
In addition to the "+show-sender" option, we now add "+include-footers"
which disables stripping of the footer from the email body if this token
is included in the email address.
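A rough sketch of parsing such optional tokens into an options
dictionary is shown below. The function name, the option keys, and the
assumption that the base part of the address contains no `+` of its
own are all hypothetical simplifications.

```python
from typing import Dict, Tuple

# Hypothetical mapping from address token to option name.
OPTION_TOKENS = {
    "show-sender": "show_sender",
    "include-footers": "include_footers",
}


def split_address_options(local_part: str) -> Tuple[str, Dict[str, bool]]:
    """Split 'token+show-sender' into ('token', {'show_sender': True})."""
    base, *tokens = local_part.split("+")
    options: Dict[str, bool] = {}
    for token in tokens:
        if token not in OPTION_TOKENS:
            raise ValueError("Unknown option token: " + token)
        options[OPTION_TOKENS[token]] = True
    return base, options
```

Returning a dictionary means a new token only needs a new entry in the
mapping, not a new return value.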
To enable a comfortable way of adding more optional tokens in the
address (like current '+show-sender') we change decode_email_address to
return a general dictionary containing options specified through adding
these optional tokens in the To: address. For now, we only have
"+show-sender", but more can be easily added using this change.
The RealmAuditLog object ID was stored in the event sent to the
deferred_work queue as a means to update the row's extra_data field.
The extra_data field then stores the location of the export.
Instead of running `what_to_do_with_migrations` unconditionally, we
first hash and compare the files located in `*/migrations/*`. Only if
a migration file has changed (or the hash file does not exist yet) do we
call `what_to_do_with_migrations`.
It was discovered that the call to Django's `showmigrations.py` file was
causing roughly a 500ms increase in `test-backend`'s start up time.
However, this fix only saves about 100ms, apparently because a lot of
that work was importing Django dependencies we need for most tests
anyway.
Fixes: #12428.
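The hash-and-compare step can be sketched with the standard library.
The function names, the choice of SHA-1, and the layout of the hash
file are assumptions of this sketch rather than details from the
commit.

```python
import hashlib
import os
from typing import Iterable


def digest_of_files(paths: Iterable[str]) -> str:
    """Return one hex digest covering the names and contents of all paths."""
    sha = hashlib.sha1()
    for path in sorted(paths):
        sha.update(path.encode("utf-8"))
        with open(path, "rb") as f:
            sha.update(f.read())
    return sha.hexdigest()


def files_changed(paths: Iterable[str], hash_file: str) -> bool:
    """Compare against the stored digest, updating it when it differs."""
    new_digest = digest_of_files(paths)
    old_digest = None
    if os.path.exists(hash_file):
        with open(hash_file) as f:
            old_digest = f.read().strip()
    if new_digest == old_digest:
        return False
    with open(hash_file, "w") as f:
        f.write(new_digest)
    return True
```

A caller would run the expensive check only when files_changed()
returns True, for example on the files matched by `*/migrations/*`.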
The payload for when a build is cancelled was causing an error
because the build result code mapping was missing one of the
codes. This commit also fixes a minor typo in the result codes.
Ensure that the html is safe before using it. The html is considered
safe if it is in an iframe with an http/https src, based on the
recommendations here:
https://oembed.com/#section3
We directly embed the `iframe` html into the lightbox overlay.
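One way to express such a check is sketched below, using only the
standard library. The class and function names are hypothetical, the
rule "exactly one start tag, and it is an iframe" is an assumption, and
text nodes are not inspected, so this is an illustration rather than a
vetted sanitizer.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import urlparse


class _TagCollector(HTMLParser):
    """Record every start tag together with its src attribute, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.start_tags: List[Tuple[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, dict(attrs).get("src")))


def is_safe_iframe_html(html: str) -> bool:
    """Accept only a single iframe whose src uses http or https."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    if len(collector.start_tags) != 1:
        return False
    tag, src = collector.start_tags[0]
    if tag != "iframe" or not src:
        return False
    return urlparse(src).scheme in ("http", "https")
```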
We add general code that will archive models that are tied to a specific
Message (such as Reactions and SubMessages). Certain details of the
model are grabbed from a list models_with_message_key, and then used to
create queries that will archive these database tables.
We put Reaction in that list in this commit, and add appropriate tests.
To archive other analogous models (for example SubMessage),
one only needs to make an appropriate entry in the
models_with_message_key list.
Sometimes it's useful to run two copies of test-backend at the same
time. The problem with doing so is that we need to make sure no two
threads are using the same test database ID.
Previously, this worked only if at most one of those copies was
running in the single-threaded mode, because we used a random database
ID for the single-threaded code path, but the same IDs counting from 0
for the parallel code path.
Fix this, mostly, by generating a random start for the range of IDs
used by the process, and then counting off database IDs starting from
there (both in the parallel and non-parallel modes).
There's still a very low probability race, see the TODO.
Additionally, there appear to be some other races with running two
copies of test-backend at the same time not related to the database.
See https://github.com/zulip/zulip/issues/12426 for a follow-up issue
that's sorta created by this.
The test-backend parallel test runner system doesn't actually use the
zulip_test database; instead, it creates its own databases off the
zulip_test_template database.
We were accidentally running `tools/generate_fixtures` even when there
are no changes, because this function is shared with the
tools/lib/test_server.py codebase, which needs us to do the work of
creating a test database for it off the zulip_test_template database.
Fixing this saves about 1.5s of the roughly 4s runtime of a single test.
Previously, if you exported a Zulip organization and then re-imported
it, we'd end up renumbering the user IDs and all direct foreign key
references to them in the database, but not the data-user-id
references in mentions. Fix this by parsing the message content and
doing that renumbering.
(Because we import raw markdown, not HTML, from third-party tools,
these changes won't affect data import from Slack etc.)
Fixes the high-priority part of #11293.
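The renumbering step can be sketched with a regular expression. The
function name and the exact attribute syntax matched are assumptions.
This simple version only touches numeric ids, so it leaves wildcard
mentions and mentions without a data-user-id attribute unchanged.

```python
import re
from typing import Dict

USER_ID_ATTR = re.compile(r'data-user-id="(\d+)"')


def renumber_mentions(rendered_content: str, id_map: Dict[int, int]) -> str:
    """Rewrite data-user-id attributes using an old-to-new id mapping."""

    def replace(match):
        old_id = int(match.group(1))
        # Leave ids we have no mapping for untouched.
        new_id = id_map.get(old_id, old_id)
        return 'data-user-id="{}"'.format(new_id)

    # A single pass with a callback avoids remapping an id twice
    # when the old and new id ranges overlap.
    return USER_ID_ATTR.sub(replace, rendered_content)
```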
Modifies the dict with the user info to include the key `bot_owner_id`
so it can be displayed in the user info popover.
Tests concerned with changing bot owner have been modified to have
number of events=2 because while updating the bot info, two events
are fired -- updating the `realm_bot` and `realm_user` since the
key `bot_owner_id` is a part of realm user info.
Since positional arguments are interpreted differently by different
backends in Django's authentication backend system, it’s safer to
disallow them.
This had been the motivation for previously declaring the parameters
with default values when we were on Python 2, but that was not super
effective because Python has no rule against positional default
arguments and that convention for our authentication backends was
solely enforced by code review.
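In Python 3, a bare `*` in the signature enforces this at the language
level. The class below is a hypothetical stand-in, not one of our
backends, and its parameters are chosen only for illustration.

```python
from typing import Optional


class ExampleBackend:
    # The bare * makes every following parameter keyword-only, so
    # calling authenticate("user", "secret") raises TypeError.
    def authenticate(self, *, username: Optional[str] = None,
                     password: Optional[str] = None) -> Optional[str]:
        if username is None or password is None:
            return None
        return username
```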
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This commit modifies the regex used when parsing JIRA's full links of
the form `[text|link]` so that if you have two in a message, Zulip
markup conversion doesn't think that the first link extends to the
closing `]` of the second link.
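The idea can be illustrated with a pattern that forbids `]` inside the
match. The regex, the function name, and the Markdown-style output
format are assumptions of this sketch, not the exact code from this
commit.

```python
import re

# Neither part may contain ']', and the text may not contain '|', so a
# match cannot run past the first closing bracket.
JIRA_LINK = re.compile(r"\[([^\]|]+)\|([^\]]+)\]")


def convert_jira_links(text: str) -> str:
    """Turn JIRA-style [text|link] into Markdown-style [text](link)."""
    return JIRA_LINK.sub(r"[\1](\2)", text)
```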
This reordering was originally made with regard to the delete after
access feature for the public export. However, this reordering is
more correct overall, i.e., the object should be created before the
event pertaining to the object is sent.
The `queue_data` variable is an intermediate step that's unnecessary.
Instead, the values from the queue event are assigned directly.
Also, the `worker` variable is not worth an assignment as it is only
referenced a single time per test case.
A FileNotFound error was set as the side-effect of the do_export_realm
mock and the DeferredWorker was made to consume the event explicitly.
Previously, the mock of do_export_realm was producing spammy output
as a result of a FileNotFound error coming from the queue processing of
`do_write_stats_file_for_realm_export`.
A unique path was created using the `LOCAL_UPLOADS_DIR` backend, similar
to the code used in `LocalUploadBackend`. The exported tarball was
copied to the directory, and an nginx url was created to serve the file
publicly.
Tweaked by tabbott to output an actual URL.
This cleans up the pattern for how we check which user is logged in
during Zulip's backend unit tests to be much more readable (replacing
the arcane session code that does this check).
test_retention.py had various issues - we opt for keeping its essence
(what should the tests do and verify), but rewriting a lot of it in
order to have more clarity in what's happening there.
We split archive_messages code into two functions: moving to archive and
cleanup. This allows cleaning up the tests - they can call
these functions directly instead of copying several lines of
archive_messages here and there in multiple tests.
test_cross_realm_messages_archiving_two_realm_expired doesn't run the
code path patched in commit 3d1aa98b2ea344fba7fbb2373a37d4cf30f53e08,
so it can still fail. We apply the analogous change in the test as
in the cited commit.
This is probably a good idea for the production use case, since then
there's some consistency of behavior, and if we extend logging, one
knows exactly which realms were or were not executed before a logged
failure.
This fixes the nondeterministic test failures we've been seeing in CI:
if you use `-id` in that order_by, it happens consistently.
Sending a PM from hamlet (consented) to othello is a case of sending
a message from a consented user to a non-consented user. This results
in the generation of more than one message file during realm export.
_export_realm is updated to handle this case.