zulip

Commit Graph

Author	SHA1	Message	Date
Greg Price	23e6a2e579	puppet: Update memcached config to turn on this decade's technology. We've been running this change on zulipchat.com for a couple of months now. Before then, we used to regularly get exceptions like this: File "./zerver/views/messages.py", line 749, in get_messages_backend setter=stringify_message_dict) File "./zerver/lib/cache.py", line 275, in generic_bulk_cached_fetch cache_set_many(items_for_remote_cache) File "./zerver/lib/cache.py", line 215, in cache_set_many get_cache_backend(cache_name).set_many(items, timeout=timeout) File "/home/zulip/deployments/2017-09-28-21-04-12/zulip-py3-venv/lib/python3.5/site-packages/django/core/cache/backends/memcached.py", line 150, in set_many self._cache.set_multi(safe_data, self.get_backend_timeout(timeout)) pylibmc.Error: error 48 from memcached_set_multi This error means memcached was unable to find space for the new value. You might think that because memcached provides an LRU cache, this shouldn't happen because it would just evict something... but in fact * memcached splits its data into "slabs" by object size, and * until recently, once a 1MiB "chunk" is allocated to a given "slab" i.e. size class, it wouldn't be reclaimed to allocate to another. So once the cache has been filled up with objects of some distribution of sizes, if some objects come in that would go in a different size class, we have no chunks for that size class / slab, and can't get one. And that's exactly what was happening on zulipchat.com. Useful background can be found in https://github.com/memcached/memcached/wiki/ServerMaint#slab-imbalance https://github.com/memcached/memcached/wiki/ReleaseNotes1411 https://github.com/memcached/memcached/wiki/ReleaseNotes1425 https://github.com/memcached/memcached/wiki/ReleaseNotes150 We're already running v1.4.25, which provides an "automover" that should be well equipped to fix this; v1.5.0 turns it on by default. With this commit, adopt the "modern start line" recommended in the release notes for our v1.4.25, including turning on the automover.	2018-02-08 16:34:49 -08:00
Vishnu Ks	bf2961418b	puppet: Remove comment about period of soft deactivate users. This often becomes wrong over time as it is currently.	2018-01-24 17:15:08 -08:00
Vishnu Ks	a11b742984	messages: Calculate value of first visible message ID using cron job. [greg: Fixed buggy time conversion in estimate_recent_messages.]	2018-01-24 17:15:08 -08:00
Tim Abbott	9ed2a94b8c	nagios: Add configuration designed for full-stack servers. This doesn't yet pass all Nagios checks correctly, and still has a few flaws: * The ideal setup code for the `nagios` user in the database isn't included. * Some of the other details are a bit off; we need to split some host roles. But it's better than nothing, and we can iterate from here.	2018-01-24 14:16:03 -08:00
Aditya Bansal	dd0e6c8025	reminders: Fix issue with log file permissions in production.	2018-01-24 03:33:40 +05:30
Tim Abbott	2365b13b68	puppet: Move postgres Nagios plugin to main postgres-common. This plugins package is required in order to use Nagios checks to verify the Zulip postgres database, and thus belongs in the default package set.	2018-01-23 10:31:48 -08:00
Aditya Bansal	ec1297c1e8	schedulemessages: Add delivery system for scheduled message.	2018-01-10 09:18:02 -05:00
Umair Khan	68513952fb	email-worker: Create EmailSendingWorker. This commit just copies all the code from MissedMessageSendingWorker class to a new EmailSendingWorker class. All the logic to send an email through a queue was already there. This commit only makes the logic generic. It does so by creating a special purpose queue called 'email_senders' to send any type of email. To make MissedMessageSendingWorker still work we derive it from EmailSendingWorker. All the tests that were testing MissedMessageSendingWorker now run against EmailSendingWorker.	2017-12-20 19:36:27 -08:00
Tim Abbott	f423dc4930	check_send_receive_time: Fix parsing bug. This was a regression introduced with the argparse migration.	2017-11-27 14:01:30 -08:00
rht	e55898850a	Replace optparse with argparse in remaining tools. Tweaked by tabbott to fix various bugs with the usage output.	2017-11-21 21:34:38 -08:00
Vishnu Ks	766511e519	actions: Mark all messages as read when user unsubscribes from stream. This fixes a bug where, when a user is unsubscribed from a stream, they might have unread messages on that stream leak. While it might seem to be a minor problem, it can cause significant problems for computing the `unread_msgs` data structures, since it means we need to add an extra filter for whether the user is still subscribed, either in the backend or in the UI. Fixes #7095.	2017-11-21 20:09:17 -08:00
Greg Price	ae901309fc	certbot: Control auto-renew with a zulip.conf setting. This causes the cron job to run only when a Zulip-managed certbot install is actually set up. Inside `install`, zulip.conf doesn't yet exist when we run setup-certbot, so we write the setting later. But we also give setup-certbot the ability to write the setting itself, so that we can recommend it in instructions for adopting certbot in an existing Zulip installation.	2017-11-15 21:50:41 -08:00
Greg Price	dacf65b301	certbot: Move verification webroot under /var/lib/zulip . If we were making an old-fashioned webroot where hand-written static HTML files went, somewhere under `/srv` would be most appropriate. Here, this webroot is really more of an implementation detail of the certbot set up by the Zulip installer/packaging, containing transient state. So someplace under `/var` is appropriate, and specifically under `/var/lib/zulip` in order to properly namespace it. For background on `/var/www` and friends, see the top couple of answers on https://unix.stackexchange.com/questions/47436/why-web-server-var-www	2017-11-15 21:50:41 -08:00
Tim Abbott	2afc3b9e50	certbot: Move path to /usr/local/sbin. [greg: fixed typo bug]	2017-11-15 21:50:41 -08:00
rht	97ec56276c	certbot: Add certbot renew cron job to puppet. Tweaked by tabbott to use the proper command.	2017-11-15 21:50:41 -08:00
Tim Abbott	94554c65da	certbot: Modify nginx configuration to support automated renewal.	2017-11-08 12:32:26 -08:00
Tim Abbott	62bb465896	puppet: Modify lb0 nginx configuration.	2017-11-08 12:32:26 -08:00
rht	ccf2792c1c	refactor: Remove six.moves.configparser import.	2017-11-07 10:51:44 -08:00
rht	549a26860f	refactor: Remove six.moves.range import.	2017-11-07 10:46:42 -08:00
Tim Abbott	acb0b6ee43	process_fts_updates: Fix pgroonga search in development. For some reason, we have the USING_PGROONGA setting on in development right now. I'm going to disable that in another commit to match what we're doing in production, but we'll still want that setting to work in development. The problem here was that process_fts_updates only attempted to read the USING_PGROONGA setting from a /etc/zulip/zulip.conf source, and thus would just not be updating the index in development.	2017-10-30 11:44:04 -07:00
Tim Abbott	0d1194811f	mypy: Remove ignores for a few typeshed bugs fixed upstream.	2017-10-27 17:09:00 -07:00
Tim Abbott	89b97e7480	python3: Fix REMOTE_USER Apache configuration for Python 3. We were previously still installing the Python 2 version of mod_wsgi, which of course doesn't work and can't use the Zulip virtualenv.	2017-10-24 11:48:14 -07:00
Tim Abbott	15f3d5f714	nginx: Fix some buggy gzip compression configuration. We weren't compressing SVG, while at the same time were incorrectly compressing octet-stream (Which meant downloading .tar.gz files in Chrome would get double-compressed).	2017-10-20 11:01:28 -07:00
Tim Abbott	540cae19a8	puppet: Remove obsolete sparkle configuration. Sparkle was the auto-update system used by the legacy desktop app. We haven't been capable of using it for auto-update in years, so there's no reason to keep around the configuration. The new Electron app uses a different system anyway.	2017-10-19 16:35:55 -07:00
rht	b57289aacd	py3: Remove all `from __future__ import print_function`. Except for these files: - tools/linter_lib/* - tools/lib - tools/lister.py	2017-10-18 12:07:19 -07:00
rht	2f3ae84e5a	py3: Remove all `__future__ import division`.	2017-10-17 23:09:12 -07:00
Tim Abbott	6a5cb0e48c	puppet: Make problems with Zephyr mirroring pageable. Generally this indicates sending messages is completely broken.	2017-10-12 00:16:32 -07:00
rht	de30400fc5	pg_backup_and_purge.py: Remove .py extension.	2017-10-08 15:32:43 -07:00
Tim Abbott	47c5aae5b2	log2zulip: Enforce using python 3 in cron job. We aren't guaranteed to have the Zulip dependencies installed on Python 2.	2017-10-06 16:37:17 -07:00
Tim Abbott	0f2e4a55c0	soft deactivation: Shorten management command name. This command is really for soft deactivation; there's just an undo feature.	2017-10-06 08:48:43 -07:00
Tim Abbott	f2055397c1	nagios: Update apache configuration to be generated. Since this is basically just stock Apache configuration for Nagios with a hostname put in, we can just fetch the hostname from our configuration.	2017-10-05 21:51:29 -07:00
Tim Abbott	3af01bed85	puppet: Simplify zulip_ops nginx configuration. Whatever dist/ functionality this had in 2014 is now served by zulip.org, and since this serves as a sample, it should be as simple as possible. Previously, this was more cluttered than it needed to be.	2017-10-05 21:17:57 -07:00
Tim Abbott	e6e7bcf6e1	nagios: Move camo_check_url into configuration.	2017-10-05 21:09:24 -07:00
Tim Abbott	82cee4fde9	check_worker_memory: Increase limits for what leaking means. The old limits were such that these would sometimes oscillate too high and page erroneously. The purpose of this check is to prevent large memory leaks, and it will still achieve that with a higher limit.	2017-10-05 20:54:03 -07:00
Tim Abbott	1c453fdf2a	puppet: Add redis_password file for Nagios. This allows the Nagios user to access redis without having full access to the redis system. Ideally, this would eventually use a password that only has statistics read access, but I'm not sure redis supports that.	2017-10-05 20:42:07 -07:00
Tim Abbott	13a36d9af3	puppet: Make old redis_tunnel configuration usable. This old puppet configuration was never really used, and regardless hardcoded an ancient zulip.net hostname. We fix this to use the zulipconf system to get the host domain (though not, at present, the hostname).	2017-10-05 20:40:22 -07:00
Tim Abbott	96c3014da0	nagios: Automate configuration of outgoing email with msmtp. Now we no longer need to check in a bunch of hostnames in order to configure Nagios.	2017-10-05 20:29:47 -07:00
Tim Abbott	5b4c260c3f	puppet: Add munin apache auth configuration. This is completely stock configuration, and seems to be required for munin to run properly.	2017-10-05 20:17:12 -07:00
Tim Abbott	ba7be4102e	puppet: Update munin tunnels configuration to use zulipconf. This eliminates another old hardcoding of zulip.net.	2017-10-05 20:14:43 -07:00
Tim Abbott	162eaf8917	nagios: Modify check for swap to allow no swap. If a machine is configured with no swap intentionally, that shouldn't be a Nagios problem. This alert is intended to flag machines which are swapping.	2017-10-05 20:07:44 -07:00
Tim Abbott	80a16bf873	nagios: Fix path to source zulip_nagios.cfg. Arguably, we should make this a symlink, but it's probably a good idea to have every change in the production Nagios configuration go through the zulip-puppet-apply diff experience.	2017-10-05 20:06:50 -07:00
Tim Abbott	886a8853ac	nagios: Move server-specific config into hostgroups. These new hostgroups exist so we can eliminate explicit references to individual hosts in services.cfg.	2017-10-05 20:06:48 -07:00
Tim Abbott	b6ce9583a9	nagios: Fetch list of hosts from zulip.conf. This makes this much more configurable and much less hardcoded.	2017-10-05 20:06:30 -07:00
Tim Abbott	5193936bc3	nagios: Add Memcached and Redis monitoring. These are standard Nagios plugins that might be sometimes helpful.	2017-10-05 20:06:16 -07:00
Tim Abbott	f7d554d533	nagios: Rename zmirror2 to zmirrorp in configuration. The "p" stands for "personals", aka zephyr private messages, which is what this host manages.	2017-10-05 20:06:08 -07:00
Tim Abbott	062d280914	puppet: Clean up unnecessary pagerduty_nagios.cfg.	2017-10-05 19:23:33 -07:00
Tim Abbott	7e328ba865	nagios: Move email addresses for contacts into variables.	2017-10-05 19:23:33 -07:00
Tim Abbott	6017d3dec5	puppet: Move contacts.cfg to be a template.	2017-10-05 19:23:33 -07:00
Tim Abbott	09aec3e467	puppet: Move hosts.cfg to be managed by a template.	2017-10-05 19:23:33 -07:00
Tim Abbott	692f4b77d1	puppet: Remove messy Nagios crontab.	2017-10-05 19:23:33 -07:00

641 Commits