**Background**
User groups are expected to comply with the DAG constraint for the
many-to-many inter-group membership. The check for this constraint has
to be performed recursively so that we can find all direct and indirect
subgroups of the user group to be added.
This kind of check is vulnerable to phantom reads which is possible at
the default read committed isolation level because we cannot guarantee
that the check is still valid when we are adding the subgroups to the
user group.
**Solution**
To avoid having another transaction concurrently update one of the
to-be-subgroup after the recursive check is done, and before the subgroup
is added, we use SELECT FOR UPDATE to lock the user group rows.
The lock needs to be acquired before a group membership change is about
to occur before any check has been conducted.
Suppose that we are adding subgroup B to supergroup A, the locking protocol
is specified as follows:
1. Acquire a lock for B and all its direct and indirect subgroups.
2. Acquire a lock for A.
For the removal of user groups, we acquire a lock for the user group to
be removed with all its direct and indirect subgroups. This is the special
case A=B, which is still complaint with the protocol.
**Error handling**
We currently rely on Postgres' deadlock detection to abort transactions
and show an error for the users. In the future, we might need some
recovery mechanism or at least better error handling.
**Notes**
An important note is that we need to reuse the recursive CTE query that
finds the direct and indirect subgroups when applying the lock on the
rows. And the lock needs to be acquired the same way for the addition and
removal of direct subgroups.
User membership change (as opposed to user group membership) is not
affected. Read-only queries aren't either. The locks only protect
critical regions where the user group dependency graph might violate
the DAG constraint, where users are not participating.
**Testing**
We implement a transaction test case targeting some typical scenarios
when an internal server error is expected to happen (this means that the
user group view makes the correct decision to abort the transaction when
something goes wrong with locks).
To achieve this, we add a development view intended only for unit tests.
It has a global BARRIER that can be shared across threads, so that we
can synchronize them to consistently reproduce certain potential race
conditions prevented by the database locks.
The transaction test case lanuches pairs of threads initiating possibly
conflicting requests at the same time. The tests are set up such that exactly N
of them are expected to succeed with a certain error message (while we don't
know each one).
**Security notes**
get_recursive_subgroups_for_groups will no longer fetch user groups from
other realms. As a result, trying to add/remove a subgroup from another
realm results in a UserGroup not found error response.
We also implement subgroup-specific checks in has_user_group_access to
keep permission managing in a single place. Do note that the API
currently don't have a way to violate that check because we are only
checking the realm ID now.
We have historically cached two types of values
on a per-request basis inside of memory:
* linkifiers
* display recipients
Both of these caches were hand-written, and they
both actually cache values that are also in memcached,
so the per-request cache essentially only saves us
from a few memcached hits.
I think the linkifier per-request cache is a necessary
evil. It's an important part of message rendering, and
it's not super easy to structure the code to just get
a single value up front and pass it down the stack.
I'm not so sure we even need the display recipient
per-request cache any more, as we are generally pretty
smart now about hydrating recipient data in terms of
how the code is organized. But I haven't done thorough
research on that hypotheseis.
Fortunately, it's not rocket science to just write
a glorified memoize decorator and tie it into key
places in the code:
* middleware
* tests (e.g. asserting db counts)
* queue processors
That's what I did in this commit.
This commit definitely reduces the amount of code
to maintain. I think it also gets us closer to
possibly phasing out this whole technique, but that
effort is beyond the scope of this PR. We could
add some instrumentation to the decorator to see
how often we get a non-trivial number of saved
round trips to memcached.
Note that when we flush linkifiers, we just use
a big hammer and flush the entire per-request
cache for linkifiers, since there is only ever
one realm in the cache.
Failing to remove all of the rules which were added causes action at a
distance with other tests. The two methods were also only used by
test code, making their existence in zerver.lib.rate_limiter clearly
misplaced.
This fixes one instance of a mis-balanced add/remove, which caused
tests to start failing if run non-parallel and one more anonymous
request was added within a rate-limit-enabled block.
An implicit coercion from an untyped dict to the TypedDict was hiding
a type error: CapturedQuery.sql was really str, not bytes. We should
always prefer dataclass over TypedDict to prevent such errors.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This commit renames reset_emails_in_zulip_realm function to
reset_email_visibility_to_everyone_in_zulip_realm which makes
it more clear to understand what the function actually does.
This commit also adds a comment explaining what this function
does.
This commits update the code to use user-level email_address_visibility
setting instead of realm-level to set or update the value of UserProfile.email
field and to send the emails to clients.
Major changes are -
- UserProfile.email field is set while creating the user according to
RealmUserDefault.email_address_visbility.
- UserProfile.email field is updated according to change in the setting.
- 'email_address_visibility' is added to person objects in user add event
and in avatar change event.
- client_gravatar can be different for different users when computing
avatar_url for messages and user objects since email available to clients
is dependent on user-level setting.
- For bots, email_address_visibility is set to EVERYONE while creating
them irrespective of realm-default value.
- Test changes are basically setting user-level setting instead of realm
setting and modifying the checks accordingly.
Black 23 enforces some slightly more specific rules about empty line
counts and redundant parenthesis removal, but the result is still
compatible with Black 22.
(This does not actually upgrade our Python environment to Black 23
yet.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This adds CapturedQueryDict to provide a more accurate type annotation
for the return value of queries_captured. We also replace "Generator"
with "Iterator" because the latter two type parameters were unused.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
These were useful as a transitional workaround to ignore type errors
that only show up with django-stubs, while avoiding errors about
unused type: ignore comments without django-stubs. Now that the
django-stubs transition is complete, switch to type: ignore comments
so that mypy will tell us if they become unnecessary. Many already
have.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Note that django_stubs_ext is required to be placed within common.in
because we need the monkeypatched types in runtime; django-stubs
itself is for type checking only.
In the future, we would like to pin to a release instead of a git
revision, but several patches we've contributed upstream have not
appeared in a release yet.
We also remove the type annotation for RealmAuditLog.event_last_message_id
here instead of earlier because type checking fails otherwise.
Fixes#11560.
This works around some regression in moto 1.3.15 that I bisected to
b8820009e8
where ‘tools/test-backend test_transfer’ fails when run by itself.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Extends the URL redirect system used for documentation pages to corporate
landing pages. This makes it easier and consistent for contributors who
work on both areas to create new URL redirects when needed.
Creates `zerver.lib.url_redirects.py` to record old and new URLs
for documentation pages that have been renamed/moved and need URL
redirects.
This file is then used by `zproject.urls.py` to redirect links and
by `zerver.test.test_urls.py` to test that all of the old URLs
return a success response with a common page header/text depending
on the type of redirect (help center, policy, or API).
Adds a section to contributor docs on writing documentation for
how to use this redirect system when renaming a help center or api
documentation page.
Fixes#21946. Fixes#17897.
This eliminates the possibility of having `request.user` as
`RemoteZulipServer` by refactoring it as an attribute of `RequestNotes`.
So we can effectively narrow the type of `request.user` by testing
`user.is_authenticated` in most cases (except that of `SCIMClient`) in
code paths that require access to `.format_requestor_for_logs` where we
previously expect either `UserProfile` or `RemoteZulipServer` backed by
the implied polymorphism.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
`BaseNotes(str, str).get_notes` does not do anything here.
It was introduced in
53888e5a26
by unintendedly.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
We use this decorator on subclasses of `MigrationsTestCase`, which does
not have `self`s being `MigrationsTestCase`, but the corresponding
subclass.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
A request that has went through the auth middleware shouldn't have
`.user` being `None`. We should use `AnonymousUser` by default to
represent unauthenticated users.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
We wrap methods of the django test client for the test suite, and
type keyword variadic arguments as `ClientArg` as it might called
with a mix of `bool` and `str`.
This is problematic when we call the original methods on the test
client as we attempt to unpack the dictionary of keyword arguments,
which has no type guarantee that certain keys that the test client
requires to be bool will certainly be bool.
For example, you can call
`self.client_post(url, info, follow="invalid")` without getting a
mypy error while the django test client requires `follow: bool`.
The unsafely typed keyword variadic arguments leads to error within
the body the wrapped test client functions as we call
`django_client.post` with `**kwargs` when django-stubs gets added,
making it necessary to refactor these wrappers for type safety.
The approach here minimizes the need to refactor callers, as we
keep `kwargs` being variadic while change its type from `ClientArg`
to `str` after defining all the possible `bool` arguments that might
previously appear in `kwargs`. We also copy the defaults from the
django test client as they are unlikely to change.
The tornado test cases are also refactored due to the change of
the signature of `set_http_headers` with the `skip_user_agent` being
added as a keyword argument. We want to unconditionally set this flag to
`True` because the `HTTP_USER_AGENT` is not supported. It also removes a
unnecessary duplication of an argument.
This is a part of the django-stubs refactorings.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
This prevents us from relying on a side-effect of `allocate_handler_id`
that monkey-patches `handler_id` on the `AsyncDjangoHandler` object,
allowing mypy to acknowledge the existence of `handler_id` as an `int`.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Since `HttpResponse` is an inaccurate representation of the
monkey-patched response object returned by the Django test client, we
replace it with `_MonkeyPatchedWSGIResponse` as `TestHttpResponse`.
This replaces `HttpResponse` in zerver/tests, analytics/tests, coporate/tests,
zerver/lib/test_classes.py, and zerver/lib/test_helpers.py with
`TestHttpResponse`. Several files in zerver/tests are excluded
from this substitution.
This commit is auto-generated by a script, with manual adjustments on certain
files squashed into it.
This is a part of the django-stubs refactorings.
Signed-off-by: Zixuan James Li <p359101898@gmail.com>
As detailed in the comments, the default behavior is undesirable for us
because we can't really predict all possibilities of exceptions that may
be raised - and thus putting str(e) in the http response is potentially
insecure as it may leak some unexpected sensitive information that was
in the exception.
As a hypothetical example - KeyError resulting from some buggy
some_dict[secret_string] call would leak information. Though of course
we aim to never write code like that.