Commit Graph

682 Commits

Author SHA1 Message Date
Tim Abbott 4e8487c886 nagios: Bump maximum processes limits.
These seemed to be flapping for no good reason.
2018-05-02 11:12:47 -07:00
Tim Abbott 2f937d81e2 puppet: Add Zulip specific postgres configuration for 9.6.
Mostly, this involves adding the big block at the bottom and making
9.6 a variable so that it's easier to compare different versions of
these.
2018-04-23 18:33:49 -07:00
Tim Abbott 9930e3de09 puppet: Add a stock Postgres 9.6 configuration file from Debian.
This will make it easier to see what we customize.
2018-04-23 18:29:02 -07:00
Tim Abbott 718492638b puppet: Fix name for dhcpcd5 package.
Apparently the name dhcpcd isn't installable.
2018-04-23 11:32:07 -07:00
Tim Abbott 56b0479656 puppet: Clean up indentation in various manifests.
These are inspired by puppet-lint, though we didn't take all of their
changes, since some seem to be bugs in the tool.
2018-04-23 00:15:54 -07:00
Tim Abbott b493748ddb puppet: Use single quotes where valid.
This brings our puppet codebase more in line with the standard puppet
style guide.  Changes done via `puppet-lint --fix`.
2018-04-23 00:15:54 -07:00
Tim Abbott 871078db30 puppet: Fix alignment of arrow operators.
This was done using puppet-lint --fix.
2018-04-23 00:15:54 -07:00
Tim Abbott 19cee30bf8 puppet: Fix use of under-scoped variables. 2018-04-22 23:53:34 -07:00
Tim Abbott 6e55aa2ce6 puppet: Fix mispelled variable name.
Apparently, we weren't uninstalling the old WSGI module properly.
2018-04-22 23:53:34 -07:00
Tim Abbott 6988f13201 puppet: Move safepackage definitions out of class definitions.
Also, deduplicate it while we're at it.

This fixes a puppet-lint issue that becomes an error with puppet 4.
2018-04-22 23:53:22 -07:00
Tim Abbott a6aa7042a2 puppet: Fix some unnecessarily quoted strings.
Flagged by puppet-lint.
2018-04-22 23:42:35 -07:00
Tim Abbott 35aa4f0377 puppet: Sort ensure attributes to be always first.
This inconsistency was flagged by puppet-lint.
2018-04-22 23:41:49 -07:00
Tim Abbott a56968ce68 puppet: Fix variables not clearly enclosed.
This improves readability and robustness.  Found and fixed via puppet-lint.
2018-04-22 23:35:33 -07:00
Tim Abbott 169ee5d8a1 puppet: Fix use of tab-based whitespace. 2018-04-22 23:34:30 -07:00
Tim Abbott e103c2ff2d puppet: Switch to modern quoted, octal file modes.
This is one of the prerequisite tasks for Puppet 4 support.

Constructed using puppet-lint.
2018-04-22 23:30:48 -07:00
Tim Abbott a06c7bc247 puppet: Allow manual configuration of postfix_mailname.
This allows users to configure a mailname for postfix in
/etc/zulip/zulip.conf
2018-04-19 14:41:05 -07:00
Tim Abbott 62b12e0c34 zulip_ops: Add missing dependency on dhcpcd. 2018-04-19 14:27:48 -07:00
Aditya Bansal 4898fe7ebc uploads: Change Content-Security-Policy to fix issue with pdf's.
Our recent addition of Content-Security-Policy to the file uploads
backend broke in-browser previews of PDFs.

The content-types change in the last commit fixed loading PDFs for
most users; but the result was ugly, because e.g. Chrome would put the
PDF previewer into a frame (so there were 2 left scrollbars).

There were two changes needed to fix this:
* Loading the style to use the plugin.  We corrected this by adding
  `style-src 'self' 'unsafe-inline';`
* Loading the plugin.  Our CSP blocked loading the PDf viewer plugin.
  To correct this, we add object-src 'self', and then limit the
  plugin-type to just the one for application/pdf.

We verified this new CSP using https://csp-evaluator.withgoogle.com/
in addition to manual testing.
2018-04-17 12:23:24 -07:00
Tim Abbott 568a12e254 nginx: Add PDF files to the content-types list.
Previously, user-uploaded PDF files were not properly rendered by
browsers with the local uploads backend, because we weren't setting
the correct content-type.
2018-04-17 11:50:10 -07:00
Tim Abbott a463743107 puppet: Add Content-Security-Policy for user avatars.
This adds a basic Content-Security-Policy for user-uploaded avatars
served by the LOCAL_UPLOADS backend.

I think this is for now an unnecessary follow-up to
d608a9d315, but is worth doing because
we may later change what can be uploaded in the avatars directory.
2018-04-10 14:43:08 -07:00
Aditya Bansal d608a9d315 uploads: Add Content-Security-Policy for user uploads.
This adds a basic Content-Security-Policy for user-uploaded files with local uploads.

While over time, we plan to add CSP for the main site as well, this CSP is particularly
important for the local-uploads backend, which often shares a domain with the main site.
2018-04-09 14:43:02 -07:00
Tim Abbott 0d35bbc464 install: Install the wget package.
We depend on it for installing node, and it's a standard package, not
a required one, so we do need to explicitly declare the dependency.
2018-03-29 16:03:44 -07:00
Ghislain Antony Vaillant 00dd86967b puppet: Add Debian 9 package names to definitions.
This doesn't add support for Debian 9, but it will save time for folks
working on Stretch support in the future.
2018-03-28 12:33:45 -07:00
neiljp (Neil Pilgrim) 090b47ed19 mypy: Add explicit Optional for default=None parameters in various files. 2018-03-28 12:31:51 -07:00
neiljp (Neil Pilgrim) f32f3cbf72 mypy: Amend zulip-ec2-configure-interfaces to avoid None. 2018-03-23 11:39:54 -07:00
Tim Abbott d98be2f19f puppet: Only run analytics Nagios checks on machine running cron.
Running this on additional machines would be redundant; additionally,
the FillState checker cron job runs only on cron systems, so this will
crash on other app frontends.
2018-03-06 13:38:27 -08:00
Tim Abbott 8e8faab006 puppet: Move clearsessions cron job to app_frontend_once.
While this is a different system than I'd written up in #8004, I think
this is a better solution to the general problem of cron jobs to run
on just one server.

Fixes #8004.
2018-03-06 13:35:51 -08:00
Tim Abbott 9a74ef5056 puppet: Move some cron jobs to app_frontend_once.pp.
Several cron jobs had incorrectly ended up in the app_frontend.pp
template, and thus would only run on voyager instances.
2018-03-06 13:35:51 -08:00
Tim Abbott 3ae645ed12 puppet: Rename analytics.pp to app_frontend_once.pp. 2018-03-06 13:35:51 -08:00
Tim Abbott ad7f38ab3b puppet: Move analytics cron job to analytics.pp.
This better groups it with the related code.
2018-03-06 13:35:51 -08:00
Tim Abbott 24b6106c9c puppet: Dsiable checking for evictions in memcached nagios.
Zulip's caching model for message history is such that it is normal
and healthy for there to eventually be a nontrivial volume of
evictions.
2018-03-06 13:34:02 -08:00
Greg Price 4475950ddf queue: Restore prematurely-cut upgrade path.
Revert c8f034e9a "queue: Remove missedmessage_email_senders code."
As the comment in the code says, it ensures a smooth upgrade path
from 1.7.x; we can delete it in master after 1.8.0 is released.
The removal commit was merged early due to a communication failure.
2018-02-28 11:15:53 -08:00
Tim Abbott 65767e5226 localhost_sso: Fix missing enabling of mod_wsgi.
This is apparently required on Ubuntu Xenial, at least.
2018-02-22 10:09:29 -08:00
Umair Khan c8f034e9a0 queue: Remove missedmessage_email_senders code.
After 68513952fb, all emails are sent through email_senders queue.
This commit removes code related to the legacy queue.
2018-02-21 16:43:56 -08:00
Aditya Bansal efe8545303 local-uploads: Start running authentication checks on file requests.
From here on we start to authenticate uploaded file request before
serving this files in production. This involves allowing NGINX to
pass on these file requests to Django for authentication and then
serve these files by making use on internal redirect requests having
x-accel-redirect field. The redirection on requests and loading
of x-accel-redirect param is handled by django-sendfile.

NOTE: This commit starts to authenticate these requests for Zulip
servers running platforms either Ubuntu Xenial (16.04) or above.

Fixes: #320 and #291 partially.
2018-02-16 05:06:37 +05:30
Greg Price 20c734c90a puppet: Fix type error in new Nagios check for analytics state. 2018-02-09 17:46:46 -08:00
Tim Abbott 005b0fb566 puppet: Clean up ssh authorized_keys configuration rules. 2018-02-09 16:37:03 -08:00
Tim Abbott aca25b6f0a puppet: Move ssh configuration to use notify.
This handles more correctly the case where we're using the upstream
sshd_config file.
2018-02-09 16:37:03 -08:00
Tim Abbott 486de8abfc puppet: Edit some rules to support chat.zulip.org.
This should make it possible to use the zulip_ops base rules
successfully on chat.zulip.org.  Many of the changes in this commit
are hacks and probably can be cleaned up later, but given that we plan
to drop trusty support soon, it's likely that most of them will simply
be deleted then.
2018-02-09 16:37:03 -08:00
Rishi Gupta 1d581a9c6e nagios: Add nagios check for analytics state.
This should help us detect issues where the analytics cron jobs aren't
running properly.

The cron/nagios part of the implementation done by tabbott.
2018-02-09 16:36:05 -08:00
Greg Price 7df29e7a7c puppet: Only use those "modern" options when on xenial.
On trusty, we of course have an older version -- 1.4.14 -- and it is
not so modern, so this just gives an error.
2018-02-08 18:11:52 -08:00
Greg Price 23e6a2e579 puppet: Update memcached config to turn on this decade's technology.
We've been running this change on zulipchat.com for a couple of months
now.  Before then, we used to regularly get exceptions like this:

     File "./zerver/views/messages.py", line 749, in get_messages_backend
       setter=stringify_message_dict)
     File "./zerver/lib/cache.py", line 275, in generic_bulk_cached_fetch
       cache_set_many(items_for_remote_cache)
     File "./zerver/lib/cache.py", line 215, in cache_set_many
       get_cache_backend(cache_name).set_many(items, timeout=timeout)
     File "/home/zulip/deployments/2017-09-28-21-04-12/zulip-py3-venv/lib/python3.5/site-packages/django/core/cache/backends/memcached.py", line 150, in set_many
       self._cache.set_multi(safe_data, self.get_backend_timeout(timeout))
   pylibmc.Error: error 48 from memcached_set_multi

This error means memcached was unable to find space for the new value.
You might think that because memcached provides an LRU cache, this
shouldn't happen because it would just evict something... but in fact
  * memcached splits its data into "slabs" by object size, and
  * until recently, once a 1MiB "chunk" is allocated to a given "slab"
    i.e. size class, it wouldn't be reclaimed to allocate to another.

So once the cache has been filled up with objects of some distribution
of sizes, if some objects come in that would go in a different size
class, we have no chunks for that size class / slab, and can't get one.
And that's exactly what was happening on zulipchat.com.

Useful background can be found in
  https://github.com/memcached/memcached/wiki/ServerMaint#slab-imbalance
  https://github.com/memcached/memcached/wiki/ReleaseNotes1411
  https://github.com/memcached/memcached/wiki/ReleaseNotes1425
  https://github.com/memcached/memcached/wiki/ReleaseNotes150
We're already running v1.4.25, which provides an "automover" that should
be well equipped to fix this; v1.5.0 turns it on by default.

With this commit, adopt the "modern start line" recommended in the
release notes for our v1.4.25, including turning on the automover.
2018-02-08 16:34:49 -08:00
Vishnu Ks bf2961418b puppet: Remove comment about period of soft deactivate users.
This often becomes wrong over time as it is currently.
2018-01-24 17:15:08 -08:00
Vishnu Ks a11b742984 messages: Calculate value of first visible message ID using cron job.
[greg: Fixed buggy time conversion in estimate_recent_messages.]
2018-01-24 17:15:08 -08:00
Tim Abbott 9ed2a94b8c nagios: Add configuration designed for full-stack servers.
This doesn't yet pass all Nagios checks correctly, and still has a few
flaws:
* The ideal setup code for the `nagios` user in the database isn't included.
* Some of the other details are a bit off; we need to split some host roles.

But it's better than nothing, and we can iterate from here.
2018-01-24 14:16:03 -08:00
Aditya Bansal dd0e6c8025 reminders: Fix issue with log file permissions in production. 2018-01-24 03:33:40 +05:30
Tim Abbott 2365b13b68 puppet: Move postgres Nagios plugin to main postgres-common.
This plugins package is required in order to use Nagios checks to
verify the Zulip postgres database, and thus belongs in the default
package set.
2018-01-23 10:31:48 -08:00
Aditya Bansal ec1297c1e8 schedulemessages: Add delivery system for scheduled message. 2018-01-10 09:18:02 -05:00
Umair Khan 68513952fb email-worker: Create EmailSendingWorker.
This commit just copies all the code from MissedMessageSendingWorker
class to a new EmailSendingWorker class. All the logic to send an email
through a queue was already there. This commit only makes the logic
generic. It does so by creating a special purpose queue called
'email_senders' to send any type of email. To make
MissedMessageSendingWorker still work we derive it from
EmailSendingWorker. All the tests that were testing
MissedMessageSendingWorker now run against EmailSendingWorker.
2017-12-20 19:36:27 -08:00
Tim Abbott f423dc4930 check_send_receive_time: Fix parsing bug.
This was a regression introduced with the argparse migration.
2017-11-27 14:01:30 -08:00