**Background**
User groups are expected to comply with the DAG constraint on their
many-to-many inter-group memberships. The check for this constraint has
to be performed recursively, so that we can find all direct and indirect
subgroups of the user group to be added.
This kind of check is vulnerable to phantom reads, which are possible at
the default READ COMMITTED isolation level, because we cannot guarantee
that the check is still valid by the time we add the subgroups to the
user group.
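For illustration, here is a minimal sketch of the unsafe flow under
READ COMMITTED; `is_reachable` and `GroupGroupMembership` are
hypothetical stand-ins, not the actual Zulip helpers and models:

```python
from django.db import transaction

def add_subgroup_unsafe(supergroup, subgroup):
    # Hypothetical names, for illustration only.
    with transaction.atomic():
        # Recursive check: the new edge must not create a cycle, i.e.
        # `supergroup` must not already be a direct or indirect
        # subgroup of `subgroup`.
        if is_reachable(from_group=subgroup, to_group=supergroup):
            raise ValueError("would create a cycle")
        # A concurrent transaction can commit a conflicting membership
        # row right here. Under READ COMMITTED, the check above never
        # saw that row (a phantom read), so the insert below may close
        # a cycle even though the check passed.
        GroupGroupMembership.objects.create(
            supergroup=supergroup, subgroup=subgroup
        )
```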
**Solution**
To avoid having another transaction concurrently update one of the
to-be-subgroups after the recursive check is done, but before the subgroup
is added, we use SELECT FOR UPDATE to lock the user group rows.
The lock needs to be acquired before any check has been conducted, at the
point where a group membership change is about to occur.
Suppose that we are adding subgroup B to supergroup A; the locking protocol
is as follows:
1. Acquire a lock for B and all its direct and indirect subgroups.
2. Acquire a lock for A.
For the removal of user groups, we acquire a lock on the user group to
be removed together with all its direct and indirect subgroups. This is
the special case A=B, which is still compliant with the protocol.
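A rough sketch of the addition path using Django's select_for_update()
(the names get_recursive_subgroups_with_self, UserGroup, and
direct_subgroups are placeholders with assumed semantics, not
necessarily the real Zulip code):

```python
from django.db import transaction

def add_subgroup(supergroup_id, subgroup_id):
    with transaction.atomic():
        # Step 1: lock B together with all of its direct and indirect
        # subgroups (SELECT ... FOR UPDATE on the user group rows).
        locked_subtree = list(
            get_recursive_subgroups_with_self(subgroup_id).select_for_update()
        )
        # Step 2: lock the supergroup A itself.
        supergroup = UserGroup.objects.select_for_update().get(id=supergroup_id)
        # Only now, with the rows locked, run the DAG check and apply
        # the membership change.
        if supergroup_id in {group.id for group in locked_subtree}:
            raise ValueError("would create a cycle")
        supergroup.direct_subgroups.add(subgroup_id)
```

The removal path acquires the same subtree lock for the group being
removed, matching the A=B special case above.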
**Error handling**
We currently rely on Postgres' deadlock detection to abort transactions
and show an error to the user. In the future, we might need some
recovery mechanism, or at least better error handling.
**Notes**
An important note is that we need to reuse the recursive CTE query that
finds the direct and indirect subgroups when applying the lock to the
rows, and the lock needs to be acquired in the same way for both the
addition and removal of direct subgroups.
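For instance, the combined query could take roughly the following SQL
shape (illustrative only; the table and column names are approximations
of the schema, embedded here as a raw query string):

```python
# Find the groups and all of their direct and indirect subgroups via a
# recursive CTE, and lock those user group rows in the same statement.
LOCK_RECURSIVE_SUBGROUPS_SQL = """
WITH RECURSIVE subgroups AS (
    SELECT id FROM zerver_usergroup WHERE id = ANY(%(group_ids)s)
    UNION
    SELECT m.subgroup_id
    FROM zerver_groupgroupmembership m
    JOIN subgroups s ON m.supergroup_id = s.id
)
SELECT g.id
FROM zerver_usergroup g
JOIN subgroups s ON g.id = s.id
ORDER BY g.id
FOR UPDATE OF g
"""

def lock_recursive_subgroups(cursor, group_ids):
    cursor.execute(LOCK_RECURSIVE_SUBGROUPS_SQL, {"group_ids": group_ids})
    return [row[0] for row in cursor.fetchall()]
```

PostgreSQL does not allow FOR UPDATE together with the UNION inside the
recursive CTE, which is why the outer query joins the CTE back against
the user group table and takes the locks there (FOR UPDATE OF g).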
User membership changes (as opposed to user group membership changes) are
not affected, and neither are read-only queries. The locks only protect
the critical regions where the user group dependency graph might violate
the DAG constraint; individual users are not involved in those regions.
**Testing**
We implement a transaction test case targeting some typical scenarios
in which an internal server error is expected to happen (meaning that the
user group view makes the correct decision to abort the transaction when
something goes wrong with the locks).
To achieve this, we add a development view intended only for unit tests.
It has a global BARRIER that can be shared across threads, so that we
can synchronize them to consistently reproduce certain potential race
conditions prevented by the database locks.
The transaction test case launches pairs of threads initiating possibly
conflicting requests at the same time. The tests are set up such that
exactly N of them are expected to fail with a certain error message
(though we don't know in advance which ones).
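A toy, self-contained version of the synchronization idea (plain
threading, no Django or test framework; the real development view and
test case are more involved):

```python
import threading

BARRIER = threading.Barrier(2)  # shared by the pair of "requests"
results = []

def fake_request(name):
    try:
        # Both threads block here until the other arrives, so the
        # possibly conflicting operations are issued at (nearly) the
        # same time.
        BARRIER.wait(timeout=5)
        # ... perform the possibly conflicting membership change ...
        results.append((name, "success"))
    except Exception as error:
        results.append((name, f"error: {error}"))

threads = [
    threading.Thread(target=fake_request, args=(name,))
    for name in ("request_a", "request_b")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# The real tests then assert that exactly the expected number of
# requests ended with the expected error message.
print(results)
```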
**Security notes**
get_recursive_subgroups_for_groups will no longer fetch user groups from
other realms. As a result, trying to add/remove a subgroup from another
realm results in a "UserGroup not found" error response.
We also implement subgroup-specific checks in has_user_group_access to
keep permission management in a single place. Note that the API
currently doesn't have a way to violate that check, because we are only
checking the realm ID now.
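Schematically, the subgroup-specific part of the check reduces to a
realm comparison, roughly as below (the signature and the as_subgroup
parameter are hypothetical simplifications, not the actual Zulip API):

```python
def has_user_group_access(user_group, user_profile, *, as_subgroup=False) -> bool:
    # Hypothetical simplification: for subgroup operations, the only
    # restriction the API can currently exercise is that the group
    # belongs to the acting user's realm; groups from other realms are
    # treated as nonexistent ("UserGroup not found").
    if as_subgroup:
        return user_group.realm_id == user_profile.realm_id
    # ... other access checks for the non-subgroup case ...
    return True
```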
The comment logic doesn’t make sense. Every build gets to write to
the caches; some builds do in fact add new items, and without
clean_unused_caches.py there’s no way for them to remove items.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
As discussed in the comment, it doesn't really make sense for our 4
jobs that we run in parallel for different platforms to all start by
running the backend tests. While it's true that Puppeteer will likely
fail if the backend doesn't run, and thus there's a mild prerequisite
relationship there, what is far more common is that the node tests fail
and the user doesn't get that feedback for 10 minutes unnecessarily
while all the backend jobs run; this change lets us avoid that.
This would ordinarily be determined by running ‘pnpm store path’, but
pnpm is not installed yet at that point.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
New in pnpm 8.3.0, this replaces the yarn-deduplicate check that was
removed in commit 3a27b12a7d (#24731).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Ever since we started bundling the app with webpack, there’s been less
and less overlap between our ‘static’ directory (files belonging to
the frontend app) and Django’s interpretation of the ‘static’
directory (files served directly to the web).
Split the app out to its own ‘web’ directory outside of ‘static’, and
remove all the custom collectstatic --ignore rules. This makes it
much clearer what’s actually being served to the web, and what’s being
bundled by webpack. It also shrinks the release tarball by 3%.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Using curl to POST to the CircleCI workflow endpoint on CZO:
- Doesn't work on zulip/zulip@main (CZO runs a revert)
- Sets a bad example for other orgs
- Robs us of an opportunity to dogfood our own zulip/github-actions-zulip
Refactor the Actions workflows in this repo to report failure states
using the Zulip Action, and reimplement the related helper scripts in
Python, since they'd previously mostly shelled out to Python anyway.
Before Zulip 4.9, the Zulip install process left any already-installed
rabbitmq with whatever nodename it had previously configured. Since
this encodes the name of the host when it was installed, it does not
function well with containers.
Leave rabbitmq-server uninstalled, which lets the Zulip installation
process set the nodename to `localhost`, which ensures that it is
usable across container restarts.
Silences “Warning: 1 issue was detected with this workflow: Please
make sure that every branch in on.pull_request is also in on.push so
that Code Scanning can compare pull requests against the state of the
base branch.”
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We’ve always been running CI on both push events and pull_request
events, which means it runs twice for commits that are pushed to a
pull request.
Filter the push events by branch name. Add the workflow_dispatch
event in case developers want to manually run CI on some other branch
that isn’t a pull request.
https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Comments out the steps in 'Create cache directories' that use
`actions/cache@2`, so that CI and the production build can pass
while the GitHub support issue is processed.
See https://github.com/actions/cache/issues/794 for an upstream report.
As a consequence:
• Bump minimum supported Python version to 3.8.
• Move Vagrant environment to Ubuntu 20.04, which has Python 3.8.
• Move CI frontend tests to Ubuntu 20.04.
• Move production build test to Ubuntu 20.04.
• Move 3.4 upgrade test to Ubuntu 20.04.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The job name is just the constant `production_build`. Renaming it to
include the OS in the key ensures that it is not shared across OSes (for
instance between `4.x` and `main`, which are now bionic and buster,
respectively), and also allows it to share caches with the install
step, which uses the OS name in that place.
As a consequence:
• Bump minimum supported Python version to 3.7.
• Move Vagrant environment to Debian 10, which has Python 3.7.
• Move CI frontend tests to Debian 10.
• Move production build test to Debian 10.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
It should not use the configured zulip username, but should instead
pull from the login user (likely `nagios`), or an explicitly provided
alternate PostgreSQL username. Failure to do so results in Nagios
failures, because the `nagios` login does not have permission to
authenticate as the `zulip` PostgreSQL user.
This requires CI changes, as the install tests install as the `zulip`
login username, which allowed Nagios tests to pass previously; with
the custom database and username, however, they must be passed to
process_fts_updates explicitly when validating the install.
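A sketch of the intended username selection (an illustrative helper,
not the actual process_fts_updates argument handling):

```python
import getpass

def choose_postgres_user(explicit_pg_user=None):
    # Prefer an explicitly provided PostgreSQL username; otherwise fall
    # back to the login user (e.g. `nagios` when run by the Nagios
    # checks), rather than the configured Zulip username.
    if explicit_pg_user is not None:
        return explicit_pg_user
    return getpass.getuser()
```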