Commit Graph

583 Commits

Author SHA1 Message Date
Alex Vandiver bad600e731 docs: Clarify access to port 25 is needed for local email delivery. 2023-06-07 08:56:01 -07:00
Alex Vandiver 9a6529f86a docs: Add language to code blocks. 2023-05-31 08:18:03 -07:00
Alex Vandiver adb30c4d18 docs: Remove unused link references. 2023-05-31 08:18:03 -07:00
Alex Vandiver 8212dccc91 docs: Fix missing and typo'd link references. 2023-05-31 08:18:03 -07:00
Alex Vandiver 679591ccce docs: Document postgresql.missing_dictionaries. 2023-05-31 08:18:03 -07:00
Alex Vandiver d058188fa8 docs: Update documentation for deploy hooks.
ecfb12404a updated how values were passed to hooks, but did not
update the documentation.
2023-05-30 14:52:01 -07:00
Alex Vandiver 9ca4574fae docs: Document zulip_notify deploy hook. 2023-05-30 14:52:01 -07:00
Alex Vandiver f45a6a6d99 docs: Add missing link in Sentry deploy docs. 2023-05-30 11:25:43 -07:00
Mateusz Mandera 8fb0fe96c6 saml: Save SessionIndex in session and use when making a LogoutRequest.
This is a useful improvement in general for making correct
LogoutRequests to Idps and a necessary one to make SP-initiated logout
fully work properly in the desktop application. During desktop auth
flow, the user goes through the browser, where they log in through their
IdP. This gives them a logged in  browser session at the IdP. However,
SAML SP-initiated logout is fully conducted within the desktop
application. This means that proper information needs to be given to the
the IdP in the LogoutRequest to let it associate the LogoutRequest with
that logged in session that was established in the browser. SessionIndex
is exactly the tool for that in the SAML spec.
2023-05-23 13:01:15 -07:00
Mateusz Mandera 5dd4dcdebb saml: Make SP-initiated SLO work in the desktop application. 2023-05-23 13:01:15 -07:00
Mateusz Mandera 3f55c10685 saml: Rework SP-initiated logout config to support IdP-level config.
This gives more flexibility on a server with multiple organizations and
SAML IdPs. Such a server can have some organizations handled by IdPs
with SLO set up, and some without it set up. In such a scenario, having
a generic True/False server-wide setting is insufficient and instead
being able to specify the IdPs/orgs for SLO is needed.
2023-05-23 13:01:15 -07:00
Mateusz Mandera 0bb0220ebb saml: Implement SP-initiated Logout.
Closes #20084

This is the flow that this implements:
1. A logged-in user clicks "Logout".
2. If they didn't auth via SAML, just do normal logout. Otherwise:
3. Form a LogoutRequest and redirect the user to
https://idp.example.com/slo-endpoint?SAMLRequest=<LogoutRequest here>
4. The IdP validates the LogoutRequest, terminates its own user session
and redirects the user to
https://thezuliporg.example.com/complete/saml/?SAMLRequest=<LogoutResponse>
with the appropriate LogoutResponse. In case of failure, the
LogoutResponse is expected to express that.
5. Zulip validates the LogoutResponse and if the response is a success
response, it executes the regular Zulip logout and the full flow is
finished.
2023-05-23 13:01:15 -07:00
Toyam Cox 650cdc474d docs: Also set X-Forwarded-Proto in proxies.
Django 4.0 and higher began checking the `Origin` header, which made
it important that Zulip know accurately if the request came over HTTPS
or HTTP; failure to do so would result in "CSRF verification failed"
errors.

For Zulip servers which are accessed via proxies, this means that
`X-Fowarded-Proto` must be set accurately.  Adjust the documentation
for the suggested configurations to add the header.

Fixes: #24599.

Co-authored-by: Alex Vandiver <alexmv@zulip.com>
2023-05-18 17:17:35 -04:00
Alex Vandiver a95b796a91 supervisor: Drop minfds back down from 1000000 to 40000.
1c76036c61 raised the number of `minfds` in Supervisor from 40k to
1M.  If Supervisor cannot guarantee that number of available file
descriptors, it will fail to start; `/etc/security/limits.conf` was
hence adjusted upwards as well.  However, on some virtualized
environments, including Proxmox LXC, setting
`/etc/security/limits.conf` may not be enough to raise the
system-level limits.  This causes `supervisord` with the larger
`minfds` to fail to start.

The limit of 1000000 was chosen to be arbitrarily high, assuming it
came without cost; it is not expected to ever be reached on any
deployment.  262b19346e already lowered one aspect of that
changeset, upon determining it did come with a cost.  Potentially
breaking virtualized deployments during upgrade is another cost of
that change.

Lower the `minfds` it back down to 40k, partially reverting
1c76036c61, but allow adjusting it upwards for extremely large
deployments.  We do not expect any except the largest deployments to
ever hit the 40k limit, and a frictionless deployment for the
vanishingly small number of huge deployments is not worth the
potential upgrade hiccups for the much more frequent smaller
deployments.
2023-05-18 13:04:33 -07:00
Anders Kaseorg 12310189ed install: Support Debian 12.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-05-18 11:52:22 -07:00
Alex Vandiver 6cb570f3f0 docs: Upgrade to zulip-cloud-current before exporting to Zulip Cloud.
Partial fix for #25482.
2023-05-15 14:44:11 -07:00
Alex Vandiver e0c05825ed docs: Importing from Zulip Cloud exports should use zulip-cloud-current.
Partial fix for #25482.
2023-05-15 14:44:11 -07:00
Alex Vandiver f4683de742 puppet: Switch the `rolling_restart` setting to use the bool values.
2c5fc1827c standardized which values are "true"; use them.
2023-05-11 15:54:15 -07:00
Alex Vandiver 857f79161f docs: Update documentation on compliance exports. 2023-05-11 12:01:54 -07:00
Alex Vandiver f11350f789 puppet: Add PostgreSQL 15 support.
Instead of copying over a mostly-unchanged `postgresql.conf`, we
transition to deploying a `conf.d/zulip.conf` which contains the
only material changes we made to the file, which were previously
appended to the end.

While shipping separate while `postgresql.conf` files for each
supported version is useful if there is large variety in supported
options between versions, there is not no such variation at current,
and the burden of overriding the entire default configuration is that
it must be keep up to date wit the package's version.
2023-05-10 14:06:02 -07:00
Alex Vandiver e5ae55637e install: Remove PostgreSQL 11 support.
Django 4.2 removes this support, so Zulip has not installed with
PostgreSQL 11 since 2c20028aa4.
2023-05-05 13:35:32 -07:00
Alex Vandiver 8a3236638a docs: Update sharding docs for single-org sharding option.
Co-authored-by: Tim Abbott <tabbott@zulip.com>
2023-05-01 11:28:08 -07:00
Alex Vandiver 510b96046a docs: Update production docs for local S3 caching. 2023-05-01 11:28:08 -07:00
Alex Vandiver b8a6de95d2 pg_backup_and_purge: Allow adjusting the backup concurrency.
SSDs are good at parallel random reads.
2023-04-26 10:54:51 -07:00
Alex Vandiver 19a11c9556 pg_backup_and_purge: Take backups on replicas, if present.
Taking backups on the database primary adds additional disk load,
which can impact the performance of the application.

Switch to taking backups on replicas, if they exist.  Some deployments
may have multiple replicas, and taking backups on all of them is
wasteful and potentially confusing; add a flag to inhibit taking
nightly snapshots on the host.

If the deployment is a single instance of PostgreSQL, with no
replicas, it takes backups as before, modulo the extra flag to allow
skipping taking them.
2023-04-26 10:54:51 -07:00
Alex Vandiver 7c023042cf puppet: Rotate access log files every day, not at 500M.
Since logrotate runs in a daily cron, this practically means "daily,
but only if it's larger than 500M."  For large installs with large
traffic, this is effectively daily for 10 days; for small installs, it
is an unknown amount of time.

Switch to daily logfiles, defaulting to 14 days to match nginx; this
can be overridden using a zulip.conf setting.  This makes it easier to
ensure that access logs are only kept for a bounded period of time.
2023-04-06 14:31:16 -04:00
Alex Vandiver a77c89f610 docs: Always suggest start-server, now that it is safer. 2023-04-04 10:58:56 -07:00
Alex Vandiver 5b9fb582e2 docs: Remove now-unnecessary reactivate_realm step after import.
113a8c4782 made this step unnecessary.
2023-04-04 10:58:56 -07:00
Mateusz Mandera 7ca08cb84b docs: Link to SCIM docs from SAML instructions. 2023-04-03 17:06:05 -07:00
Mateusz Mandera 1bfe48bce6 docs: Add ReadTheDocs documentation for SCIM. 2023-04-03 17:01:05 -07:00
Alya Abbott e136636715 docs: Clarify "Should I follow this installation guide?" instructions. 2023-03-30 09:08:48 -07:00
Alex Vandiver c686c5ed0f web: Save a needless 301 redirect from /plans to /plans/. 2023-03-24 14:51:01 -07:00
Alex Vandiver 262b19346e puppet: Decrease default nginx worker_connections.
Increasing worker_connections has a memory cost, unlike the rest of
the changes in 1c76036c61d8; setting it to 1 million caused nginx to
consume several GB of memory.

Reduce the default down to 10k, and allow deploys to configure it up
if necessary.  `worker_rlimit_nofile` is left at 1M, since it has no
impact on memory consumption.
2023-03-23 15:59:23 -07:00
Alex Vandiver 015a10637b docs: Document how to use SMTP without authentication.
This is the behaviour inherited from Django[^1].  While setting the
password to empty (`email_password = `) in
`/etc/zulip/zulip-secrets.conf` also would suffice, it's unclear what
the user would have been putting into `EMAIL_HOST_USER` in that
context.

Because we previously did not warn when `email_password` was not
present in `zulip-secrets.conf`, having the error message clarify the
correct configuration for disabling SMTP auth is important.

Fixes: #23938.

[^1]: https://docs.djangoproject.com/en/4.1/ref/settings/#std-setting-EMAIL_HOST_USER
2023-02-27 11:59:48 -08:00
Alex Vandiver 6969a6a92d docs: Update instructions for realm deletion to use management command.
This documentation was written in 9ece4c9f51, which predated the
`./manage delete_realm` command added in bff503feb4.
2023-02-26 17:11:07 -08:00
Alex Vandiver 8ede54fb1b docs: Add a link to installer flag docs from backup instructions.
The documentation for restoring backups referenced that it needed to
be to the same version of PostgreSQL, but did not explain how to do
that.

Link to the relevant section of the installer documentation, and name
the flag explicitly.

Fixes: #23691
2023-02-24 12:25:48 -05:00
David Rosa 3254023fa3 help: Update URLs to match "Restrict message editing and deletion" title.
Updates all references to the new URL and adds a URL redirect.

Follow up to #24329.
2023-02-10 15:56:16 -08:00
Alex Vandiver 3109d40b21 puppet: Add a sentry release class.
This installs the Sentry CLI, and uses it to send API events to Sentry
when a release is started and completed.
2023-02-10 15:53:10 -08:00
Alex Vandiver 840884ec89 upgrade-zulip: Provide directories to run hooks before/after upgrade.
These hooks are run immediately around the critical section of the
upgrade.  If the upgrade fails for preparatory reasons, the pre-deploy
hook may not be run; if it fails during the upgrade, the post-deploy
hook will not be run.  Hooks are called from the CWD of the new
deploy, with arguments of the old version and the new version.  If
they exit with non-0 exit code, the deploy aborts.
2023-02-10 15:53:10 -08:00
Alex Vandiver 7ab4fdf250 memcached: Allow overriding the max-item-size.
This is necessary for organizations with extremely large numbers of
members (20k+).
2023-02-09 12:04:29 -08:00
Mateusz Mandera d23b0a1f08 docs: Document how LDAP email address changes work (manually).
We will hopefully be able to just this in #16208 to document what
users need to configure in order to do this manually, but the content
here will be useful for anyone who hasn't set that up regardless.
2023-02-06 15:57:44 -08:00
Alessandro Toppi ff89590558 auth: Add JWT-based user API key fetch.
This adds a new endpoint /jwt/fetch_api_key that accepts a JWT and can
be used to fetch API keys for a certain user. The target realm is
inferred from the request and the user email is part of the JWT.

A JSON containing an user API key, delivery email and (optionally)
raw user profile data is returned in response.
The profile data in the response is optional and can be retrieved by
setting the POST param "include_profile" to "true" (default=false).

Co-authored-by: Mateusz Mandera <mateusz.mandera@zulip.com>
2023-02-03 15:23:35 -08:00
Alex Vandiver 68f4071873 puppet: Allow choice of timesync tool. 2023-01-31 14:20:41 -08:00
David Rosa a6abf959bb contributor docs: Improve the first sentence in "Upgrade Zulip". 2023-01-27 12:41:56 -08:00
David Rosa 538801c651 contributor docs: Rename "Customize Zulip" -> "Server configuration".
- Renames "Customize Zulip" to "Server configuration".
- Cross-links "Server configuration" with "System and deployment
configuration".

Fixes part of #23984.
2023-01-27 12:41:56 -08:00
David Rosa 08e9686cd2 contributor docs: Rename "Upgrade or modify Zulip" -> "Upgrade Zulip".
Fixes part of #23984.
2023-01-27 12:41:56 -08:00
David Rosa af39a1a554 contributor docs: Migrate "Modify Zulip" to its own page.
Splits /production/upgrade-or-modify.md to improve the organization
of production documentation.

Fixes #23984.
2023-01-27 12:41:56 -08:00
Tran Sang 3bea65b39c puppet: Set /etc/mailname based on postfix.mailname configuration.
The `postfix.mailname` setting in `/etc/zulip.conf` was previously
only used for incoming mail, to identify in Postfix configuration
which messages were "local."

Also set `/etc/mailname`, which is used by Postfix to set how it
identifies to other hosts when sending outgoing email.

Co-authored-by: Alex Vandiver <alexmv@zulip.com>
2023-01-27 15:08:22 -05:00
Alex Vandiver e19d4e5e0a docs: Mention probably needing to allow port 22 for SSH access. 2023-01-19 17:31:13 -08:00
Alex Vandiver 04cf68b45e uploads: Serve S3 uploads directly from nginx.
When file uploads are stored in S3, this means that Zulip serves as a
302 to S3.  Because browsers do not cache redirects, this means that
no image contents can be cached -- and upon every page load or reload,
every recently-posted image must be re-fetched.  This incurs extra
load on the Zulip server, as well as potentially excessive bandwidth
usage from S3, and on the client's connection.

Switch to fetching the content from S3 in nginx, and serving the
content from nginx.  These have `Cache-control: private, immutable`
headers set on the response, allowing browsers to cache them locally.

Because nginx fetching from S3 can be slow, and requests for uploads
will generally be bunched around when a message containing them are
first posted, we instruct nginx to cache the contents locally.  This
is safe because uploaded file contents are immutable; access control
is still mediated by Django.  The nginx cache key is the URL without
query parameters, as those parameters include a time-limited signed
authentication parameter which lets nginx fetch the non-public file.

This adds a number of nginx-level configuration parameters to control
the caching which nginx performs, including the amount of in-memory
index for he cache, the maximum storage of the cache on disk, and how
long data is retained in the cache.  The currently-chosen figures are
reasonable for small to medium deployments.

The most notable effect of this change is in allowing browsers to
cache uploaded image content; however, while there will be many fewer
requests, it also has an improvement on request latency.  The
following tests were done with a non-AWS client in SFO, a server and
S3 storage in us-east-1, and with 100 requests after 10 requests of
warm-up (to fill the nginx cache).  The mean and standard deviation
are shown.

|                   | Redirect to S3      | Caching proxy, hot  | Caching proxy, cold |
| ----------------- | ------------------- | ------------------- | ------------------- |
| Time in Django    | 263.0 ms ±  28.3 ms | 258.0 ms ±  12.3 ms | 258.0 ms ±  12.3 ms |
| Small file (842b) | 586.1 ms ±  21.1 ms | 266.1 ms ±  67.4 ms | 288.6 ms ±  17.7 ms |
| Large file (660k) | 959.6 ms ± 137.9 ms | 609.5 ms ±  13.0 ms | 648.1 ms ±  43.2 ms |

The hot-cache performance is faster for both large and small files,
since it saves the client the time having to make a second request to
a separate host.  This performance improvement remains at least 100ms
even if the client is on the same coast as the server.

Cold nginx caches are only slightly slower than hot caches, because
VPC access to S3 endpoints is extremely fast (assuming it is in the
same region as the host), and nginx can pool connections to S3 and
reuse them.

However, all of the 648ms taken to serve a cold-cache large file is
occupied in nginx, as opposed to the only 263ms which was spent in
nginx when using redirects to S3.  This means that to overall spend
less time responding to uploaded-file requests in nginx, clients will
need to find files in their local cache, and skip making an
uploaded-file request, at least 60% of the time.  Modeling shows a
reduction in the number of client requests by about 70% - 80%.

The `Content-Disposition` header logic can now also be entirely shared
with the local-file codepath, as can the `url_only` path used by
mobile clients.  While we could provide the direct-to-S3 temporary
signed URL to mobile clients, we choose to provide the
served-from-Zulip signed URL, to better control caching headers on it,
and greater consistency.  In doing so, we adjust the salt used for the
URL; since these URLs are only valid for 60s, the effect of this salt
change is minimal.
2023-01-09 18:23:58 -05:00