From a95b796a916762aa81031a43622ff6db568c6cf2 Mon Sep 17 00:00:00 2001
From: Alex Vandiver
Date: Wed, 17 May 2023 15:06:03 +0000
Subject: [PATCH] supervisor: Drop minfds back down from 1000000 to 40000.

1c76036c61d8 raised the number of `minfds` in Supervisor from 40k to
1M.  If Supervisor cannot guarantee that number of available file
descriptors, it will fail to start; `/etc/security/limits.conf` was
hence adjusted upwards as well.  However, on some virtualized
environments, including Proxmox LXC, setting
`/etc/security/limits.conf` may not be enough to raise the
system-level limits.  This causes `supervisord` with the larger
`minfds` to fail to start.

The limit of 1000000 was chosen to be arbitrarily high, assuming it
came without cost; it is not expected to ever be reached on any
deployment.  262b19346e4e already lowered one aspect of that
changeset, upon determining it did come with a cost.  Potentially
breaking virtualized deployments during upgrade is another cost of
that change.

Lower `minfds` back down to 40k, partially reverting 1c76036c61d8,
but allow adjusting it upwards for extremely large deployments.  We
do not expect any except the largest deployments to ever hit the 40k
limit, and a frictionless deployment for the vanishingly small number
of huge deployments is not worth the potential upgrade hiccups for
the much more frequent smaller deployments.
---
 docs/production/deployment.md                        | 11 +++++++++++
 puppet/zulip/manifests/supervisor.pp                 |  1 +
 .../zulip/templates/supervisor/supervisord.conf.erb  |  2 +-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/docs/production/deployment.md b/docs/production/deployment.md
index 4c3adede51..ee41bcb0b1 100644
--- a/docs/production/deployment.md
+++ b/docs/production/deployment.md
@@ -760,6 +760,17 @@ once. This decreases the number of 502's served to clients, at the cost
 of slightly increased memory usage, and the possibility that different
 requests will be served by different versions of the code.
 
+#### `service_file_descriptor_limit`
+
+The number of file descriptors which [Supervisor is configured to allow
+processes to use][supervisor-minfds]; defaults to 40000. If your Zulip deployment
+is very large (hundreds of thousands of concurrent users), your Django processes
+may hit this limit and refuse connections to clients. Raising it above this default
+may require changing system-level limits, particularly if you are using a
+virtualized environment (e.g. Docker, or Proxmox LXC).
+
+[supervisor-minfds]: http://supervisord.org/configuration.html?highlight=minfds#supervisord-section-values
+
 #### `s3_memory_cache_size`
 
 Used only when the [S3 storage backend][s3-backend] is in use.
diff --git a/puppet/zulip/manifests/supervisor.pp b/puppet/zulip/manifests/supervisor.pp
index a661a7a217..9d21d2a899 100644
--- a/puppet/zulip/manifests/supervisor.pp
+++ b/puppet/zulip/manifests/supervisor.pp
@@ -97,6 +97,7 @@ class zulip::supervisor {
     }
   }
 
+  $file_descriptor_limit = zulipconf('application_server', 'service_file_descriptor_limit', 40000)
   concat { $zulip::common::supervisor_conf_file:
     ensure  => 'present',
     require => Package[supervisor],
diff --git a/puppet/zulip/templates/supervisor/supervisord.conf.erb b/puppet/zulip/templates/supervisor/supervisord.conf.erb
index 221e4989ed..3794711b5c 100644
--- a/puppet/zulip/templates/supervisor/supervisord.conf.erb
+++ b/puppet/zulip/templates/supervisor/supervisord.conf.erb
@@ -9,7 +9,7 @@ chown=zulip:zulip
 logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
 pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
 childlogdir=/var/log/supervisor ; ('AUTO' child log dir, default $TEMP)
-minfds=1000000 ; file descriptor limit for children
+minfds=<%= @file_descriptor_limit %> ; file descriptor limit for children
 
 ; the below section must remain in the config file for RPC
 ; (supervisorctl/web interface) to work, additional interfaces may be