Commit Graph

983 Commits

Author SHA1 Message Date
Alex Vandiver 49ad188449 rate_limit: Add a flag to lump all TOR exit node IPs together.
TOR users are legitimate users of the system; however, that system can
also be used for abuse -- specifically, by evading IP-based
rate-limiting.

For the purposes of IP-based rate-limiting, add a
RATE_LIMIT_TOR_TOGETHER flag, defaulting to false, which lumps all
requests from TOR exit nodes into the same bucket.  This may allow a
TOR user to deny other TOR users access to the find-my-account and
new-realm endpoints, but this is a low cost for cutting off a
significant potential abuse vector.

If enabled, the list of TOR exit nodes is fetched from their public
endpoint once per hour, via a cron job, and cached on disk.  Django
processes load this data from disk, and cache it in memcached.
Requests are spared from the burden of checking disk on failure via a
circuitbreaker, which trips of there are two failures in a row, and
only begins trying again after 10 minutes.
2021-11-16 11:42:00 -08:00
Mateusz Mandera 2d3d0f862a process_queue: Rename Threaded_worker to ThreadedWorker.
Threaded_worker doesn't fit the python naming convention we rely on.
2021-11-16 11:21:05 -08:00
Mateusz Mandera 7cc345d7b1 process_queue: Improve handling of exceptions in process_queue.
Unhandled exceptions propagating to process_queue were not caught there,
causing improper logging - errors didn't land in errors.log as expected.
Exceptions should be caught and explicitly logged by the process_queue
logger. Exceptions occurring during consuming events are caught and
handled inside the worker's logic - however those that happen while
setting up the worker were not addressed at all, and that's the core bug
we mean to address here.

Furthermore, in multi-threaded mode we want the autoreload mechanism to
be working - which it doesn't without catching the exceptions. The
correct approach is to - again - catch the exception, log it and then
send SIGUSR1 signal to trigger exit and autoreload.
2021-11-16 11:21:05 -08:00
Alex Vandiver bc5539d871 tornado: Move SIGTERM shutdown handler into a callback.
A SIGTERM can show up at any point in the ioloop, even in places which
are not prepared to handle it.  This results in the process ignoring
the `sys.exit` which the SIGTERM handler calls, with an uncaught
SystemExit exception:

```
2021-11-09 15:37:49.368 ERR  [tornado.application:9803] Uncaught exception
Traceback (most recent call last):
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/http1connection.py", line 238, in _read_message
    delegate.finish()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/httpserver.py", line 314, in finish
    self.delegate.finish()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/routing.py", line 251, in finish
    self.delegate.finish()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2097, in finish
    self.execute()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2130, in execute
    **self.path_kwargs)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
    yielded = next(result)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 1510, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 150, in get
    request = self.convert_tornado_request_to_django_request()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 113, in convert_tornado_request_to_django_request
    request = WSGIRequest(environ)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/django/core/handlers/wsgi.py", line 66, in __init__
    script_name = get_script_name(environ)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/event_queue.py", line 611, in <lambda>
    signal.signal(signal.SIGTERM, lambda signum, stack: sys.exit(1))
SystemExit: 1
```

Supervisor then terminates the process with a SIGKILL, which results
in dropping data held in the tornado process, as it does not dump its
queue.

The only command which is safe to run in the signal handler is
`ioloop.add_callback_from_signal`, which schedules the callback to run
during the course of the normal ioloop.  This callbacks does an
orderly shutdown of the server and the ioloop before exiting.
2021-11-12 09:57:23 -08:00
Shlok Patel 893c9bc896 export: Remove `--delete-after-upload` flag in realm export.
For export realm following changes have been made:
- `./manage.py export --upload` would delete `.tar.gz` and unpacked dir
- `./manage.py export` would only delete `unpacked dir`

Besides, we have removed `--delete-after-upload` as we have set it as
the default.

Fixes #20081
2021-11-03 11:14:02 -07:00
Anders Kaseorg 58920affd4 python: Remove re.UNICODE flag (redundant in Python 3).
https://docs.python.org/3/library/re.html#re.A

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-10-22 13:42:29 -07:00
Eeshan Garg b325a4f1be realm: Rename plan type constants to be more descriptive.
It is confusing to have the plan type constants not be namespaced
by the thing they represent. We already have a namespacing
convention in place for constants, so we should use it for
Realm.plan_type as well.
2021-10-19 12:20:39 -07:00
Mateusz Mandera 73a6f2a1a7 auth: Add support for using SCIM for account management. 2021-10-14 12:29:10 -07:00
Alex Vandiver 4d428490fd outgoing_http: Use OutgoingSession subclasses in more places.
This adds the X-Smokescreen-Role header to proxy connections, to track
usage from various codepaths, and enforces a timeout.  Timeouts were
kept consistent with their previous values, or set to 5s if they had
none previously.
2021-09-01 05:34:13 -07:00
Anders Kaseorg 3e78de4ce8 sync_ldap_user_data: Log all exceptions.
This is a roundabout way to appease a semgrep complaint about
‘error_msg = error_msg % (string_id,)’ while also improving the code.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-24 07:51:48 -07:00
Mateusz Mandera 7ef1a024db management: Rename clear_auth_rate_limit_history command. 2021-08-23 11:52:35 -07:00
PIG208 460119986b management: Fix typing for management scripts.
There are some remaining errors related to the django `Manager[T]` and
the `List[T]` type that we use to annotate the `Manage[T]` objects.
2021-08-20 05:54:18 -07:00
PIG208 50ce906f31 tornado: Update the `addrport` argument.
The ability to use multiple ports has been removed a long time ago.
And the "optional" note in the help message is in fact incorrect
since `addrport` being `None` is not supported.
2021-08-20 05:49:35 -07:00
Anders Kaseorg 0d061f44c1 actions: Remove acting_client parameter from bulk_remove_subscriptions.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-08-19 01:51:37 -07:00
Mateusz Mandera fdbde59b07 rate_limit: Add management command to reset auth rate limit.
The auth attempt rate limit is quite low (on purpose), so this can be a
common scenario where a user asks their admin to reset the limit instead
of waiting. We should provide a tool for administrators to handle such
requests without fiddling around with code in manage.py shell.
2021-08-19 00:45:17 -07:00
Alex Vandiver 4c518c2bba scheduled_email: Consistently lock users table.
Only clear_scheduled_emails previously took a lock on the users before
removing them; make deliver_scheduled_emails do so as well, by using
prefetch_related to ensure that the table appears in the SELECT.  This
is not necessary for correctness, since all accesses of
ScheduledEmailUser first access the ScheduledEmail and lock it; it is
merely for consistency.

Since SELECT ... FOR UPDATE takes an UPDATE lock on all tables
mentioned in the SELECT, merely doing the prefetch is sufficient to
lock both tables; no `on=(...)` is needed to `select_for_update`.

This also does not address the pre-existing potential deadlock from
these two use cases, where both try to lock the same ScheduledEmail
rows in opposite orders.
2021-08-19 00:44:33 -07:00
Tim Abbott 36d15d85e0 send_custom_email: Only send to long_term_idle users. 2021-08-05 10:14:44 -07:00
Alex Vandiver e94b6afb00 nagios: Remove broken check_email_deliverer_* checks and related code.
These checks suffer from a couple notable problems:
 - They are only enabled on staging hosts -- where they should never
   be run.  Since ef6d0ec5ca, these supervisor processes are only
   run on one host, and never on the staging host.
 - They run as the `nagios` user, which does not have appropriate
   permissions, and thus the checks always fail.  Specifically,
   `nagios` does not have permissions to run `supervisorctl`, since
   the socket is owned by the `zulip` user, and mode 0700; and the
   `nagios` user does not have permission to access Zulip secrets to
   run `./manage.py print_email_delivery_backlog`.

Rather than rewrite these checks to run on a cron as zulip, and check
those file contents as the nagios user, drop these checks -- they can
be rewritten at a later point, or replaced with Prometheus alerting,
and currently serve only to cause always-failing Nagios checks, which
normalizes alert failures.

Leave the files installed if they currently exist, rather than
cluttering puppet with `ensure => absent`; they do no harm if they are
left installed.
2021-08-03 16:07:13 -07:00
Alex Vandiver 6fe67f0143 delete_realm: Allow deletion of realms with empty customers.
This is effectively a step closer to what was proposed in
https://github.com/zulip/zulip/pull/18678#discussion_r644490540 when
this code was written in #18678.

If the Customer object has neither of a Stripe id, nor any historical
plans, then there's no real billing association contained in the
existence of the Customer object, and it's safe to delete.
2021-08-02 22:29:16 -07:00
Tim Abbott 9968fb5081 send_custom_email: Fix emailing single users with TOS_VERSION set.
This code path previously threw an exception.
2021-08-02 17:57:16 -07:00
Tim Abbott d1dd34d7e0 send_custom_email: Add option for sending marketing emails. 2021-08-01 21:45:34 -07:00
Priyansh Garg 24dd0ff96c data_import: Add rocket chat import tool.
This commit allows to import the following from rocketchat:
* All users
* All public/private channels
* All teams and its public/private channels
* All discussion rooms as topics in their parent channel
* All the messages in all the channels
* All private conversations
* Reactions on messages (except for custom emojis)
* Mentions in messages (except @all, @here mentions)
2021-07-28 15:28:56 -07:00
Anders Kaseorg fb3ddf50d4 python: Fix mypy no_implicit_reexport errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-16 14:02:31 -07:00
Mateusz Mandera c6bfd1aa88 management: Add change_password command.
Zulip identifies users by realm+delivery_email which means that the
Django changepassword command doesn't work well -
since it looks only at the .email field.
Thus we fork its code to our own change_password command.
2021-07-09 12:34:39 -07:00
Anders Kaseorg 325cb89cbf check_redis: Fix for key format change and Python 3.
Commit 81d7dd1fda broke this nearly
eight years ago, so probably nobody cares except the ever-watchful eye
of mypy.

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-07-05 12:23:06 -07:00
Gaurav Pandey 1d12fcd168 management: Delete fix_unreads command.
This command was part of the complex migration to introduce the
`unread_msgs` data structure as the source of truth for unreads.
Effectively, it's a migration to remove anomalies that we ran several
times before turning it into the final 0104_fix_unreads.py migration.

Fixes part of #18898.
2021-06-25 09:14:45 -07:00
Gaurav Pandey af08bcdb3f management: Delete send_stats command.
This command is part of a statsd infrastructure that we stopped
supporting years ago. Its only purpose for some time has been to
provide sample code for how the restart script might trigger a
notification to a graphing system, which doesn't justify maintaining
it.

Fixes part of #18898.
2021-06-25 09:13:48 -07:00
Gaurav Pandey 7db3d76ecd management: Delete dump_messages command.
This is basically a hacky version of `manage.py export`, implemented
primarily for development work.

Fixes part of #18898.
2021-06-25 09:13:19 -07:00
Gaurav Pandey 785895c265 management: Delete turn_off_digests command.
This command was part of early prototyping of the digests feature, and
in particular its purpose is better served via the organization-level
setting to control digest emails for the organization.

Fixes part of #18898.
2021-06-25 09:12:50 -07:00
Gaurav Pandey 461c6bd791 management: Delete generate_multiuse_invite_link command.
This command was written to allow generating multiuse invite links
before the "Invite a user" UI supported them. It no longer has a
purpose and can be safely deleted.

Fixes part of #18898.
2021-06-25 09:12:17 -07:00
Gaurav Pandey ef7ec4dbf2 management: Delete generate_invite_links command.
This command predates there being a normal UI for inviting users to
Zulip. It no longer has a role for which it's a better way to do
things. (Especially with upcoming API documentation for the endpoint).

Fixes part of #18898.
2021-06-25 09:11:27 -07:00
Gaurav Pandey 331ba801ae management: Delete set_message_flags command.
This command was introduced in 2013 via
6d6c3364dc as part of implementing
marking messages as read in a separate process for performance reasons.

We fixed the performance issues and removed that pipeline years ago,
but forgot to delete this.

Fixes part of #18898.
2021-06-25 09:09:53 -07:00
Anders Kaseorg 342834ee9c python: Simplify stdio flushing using print(…, flush=True).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-06-09 14:05:31 -07:00
Tim Abbott d041bb0674 import: Fix propagation of subdomain error messages.
The previous logic would provide a very confusing error message if the
subdomain was already in use.
2021-06-09 12:58:05 -07:00
Alex Vandiver 4525543ec6 delete_realm: Disallow deletion of realms we've had billing plans with. 2021-06-03 09:38:12 -07:00
Alex Vandiver 66f77d3b4b send_test_email: Capture and show SMTP log on errors. 2021-06-02 13:18:11 -07:00
Alex Vandiver 4ffda1be87 send_email: Fix sleep logic.
This was broken in the refactor in 1e67e0f218.
2021-05-27 22:49:28 -07:00
Alex Vandiver 94e4f33b29 data_import: Support importing from Slack conversions in a directory.
Sometimes the Slack import zip file we get isn't quite the canonical
form that Slack produces -- often because the user has unzip'd it,
looked at it, and re-zip'd it, resulting in extra nested directories
and the like.

For such cases, support passing in a path to an unpacked Slack export
tree.
2021-05-27 22:46:58 -07:00
Abhijeet Prasad Bodas ec8a931761 message send: Pass individual parameters instead of single Dict.
This will allow for stronger type checking and better readability.
2021-05-20 11:06:19 -07:00
Tim Abbott 323152c5fd send_custom_email: Never email users who haven't agreed to ToS.
The `create_user` API and data import tools can result in our having
active users in the database who haven't intentionally created a Zulip
account or agreed to the ToS; we should never email such users.

The check for `TOS_VERSION is not None` is necessary for the
development environment, which has `TERMS_OF_SERVICE` set but not
`TOS_VERSION`.

It's likely that we will want this check in other places as well.
2021-05-19 13:47:36 -07:00
Alex Vandiver 670c7e7ba4 settings: Remove now-unnecessary EMAIL_DELIVERER_DISABLED setting. 2021-05-18 12:39:28 -07:00
Alex Vandiver 1e67e0f218 deliver_scheduled_*: SELECT FOR UPDATE the relevant rows.
`deliver_scheduled_emails` and `deliver_scheduled_messages` use their
respective tables like a queue, but do not have guarantees that there
was only one consumer (besides the EMAIL_DELIVERER_DISABLED setting),
and could send duplicate messages if multiple consumers raced in
reading rows.

Use database locking to ensure that the database only feeds a given
ScheduledMessage or ScheduledEmail row to a single consumer.  A second
consumer, if it exists, will block until the first consumer commits
the transaction.
2021-05-18 12:39:28 -07:00
Alex Vandiver 82797dd53c settings: Standardize the name of the deliver_scheduled_messages logs.
This makes it match its command name, and other logfile name.
2021-05-18 12:39:28 -07:00
Alex Vandiver 0f1611286d management: Rename the deliver_email command to deliver_scheduled_email.
This makes it parallel with deliver_scheduled_messages, and clarifies
that it is not used for simply sending outgoing emails (e.g. the
`email_senders` queue).

This also renames the supervisor job to match.
2021-05-11 13:07:29 -07:00
Mateusz Mandera 20f99f429d actions: Extract get_active_bots_owned_by_user function. 2021-05-10 15:38:24 -07:00
Mateusz Mandera 8261f7e801 commands: Add delete_user management command and document it. 2021-05-10 15:38:14 -07:00
Mateusz Mandera 6b056ca332 retention: Change default behavior of restore_messages command to noop.
Having "restore everything" as the default is a very bad idea. An
argument should explicitly be given for that behavior.
2021-05-10 15:13:10 -07:00
Alex Vandiver 27012e343b edit_linkifiers: Show the right command name in help text. 2021-05-10 13:51:22 -07:00
Siddharth Asthana 8f864c5107 commands: Add change_realm_subdomain management command.
Add command to change realm's subdomain. This command takes two
arguments, realm's string_id and new subdomain name.
2021-05-10 12:30:58 -07:00
Tim Abbott 2c01354569 management: Use required kwargs in add_realm_args.
This makes management commands more readable, since one doesn't need
to know details of how the library works to read based code.
2021-05-10 12:30:58 -07:00