We raise MissingAuthenticationError now, which adds
`www_authenticate=session` header to the error response. This
stops modern web-browsers from displaying a login form everytime
a 401 response it sent to the client.
Having both of these is confusing; TORNADO_SERVER is used only when
there is one TORNADO_PORT. Its primary use is actually to be _unset_,
and signal that in-process handling is to be done.
Rename to USING_TORNADO, to parallel the existing USING_RABBITMQ, and
switch the places that used it for its contents to using
TORNADO_PORTS.
This system can't update stats while the queue is idle, without using
threads for this, but at least we ensure to update the file after
consuming an event if more than MAX_SECONDS_BEFORE_UPDATE_STATS passed
since the last update, regardless of the number of iterations done so
far.
The race condition is described in the comment block removed by this
commit. This leaves room for another, remaining race condition
that should be virtually impossible, but nevertheless it seems
worthwhile to have it documented in the code, so we put a new comment
describing it.
As a final note, this is not a new race condition,
it was hypothetically possible with the old code as well.
This mimics the backend logic for adding the data-attribute -
to know what Pygments language was used to highlight the code
block - in locally echoed messages.
New test added checks our logic for canonicalizing pygments alias
(for both frontend and backend).
Other fixtures and tests amended.
In ae58ed5a7 we decided to echo back the text, when no Pygments lexer
matching that language was found. When we do so, we must take care to
HTML escape the lang before wrapping it in a data-code-language attribute.
Tweaked by tabbott to make clear the escaping is defensive.
In development and test, we keep the Tornado port at 9993 and 9983,
respectively; this allows tests to run while a dev instance is
running.
In production, moving to port 9800 consistently removes an odd edge
case, when just one worker is on an entirely different port than if
two workers are used.
tornado.web.Application does not share any inheritance with Django at
all; it has a similar router interface, but tornado.web.Application is
not an instance of Django anything.
Refold the long lines that follow it.
While urllib3 retries all connection errors, it only retries a subset
of read errors, since not all requests are safe to retry if they are
not idempotent, and the far side may have already processed them once.
By default, the only methods that are urllib3 retries read errors on
are GET, TRACE, DELETE, OPTIONS, HEAD, and PUT. However, all of the
requests into Tornado from Django are POST requests, which limits the
effectiveness of bb754e0902.
POST requests to `/api/v1/events/internal` are safe to retry; at worst,
they will result in another event queue, which is low cost and will be
GC'd in short order.
POST requests to `/notify_tornado` are _not_ safe to retry, but this
codepath is only used if USING_RABBITMQ is False, which only occurs
during testing.
Enable retries for read errors during all POSTs to Tornado, to better
handle Tornado restarts without 500's.
Without an explicit port number, the `stdout_logfile` values for each
port are identical. Supervisor apparently decides that it will
de-conflict this by appending an arbitrary number to the end:
```
/var/log/zulip/tornado.log
/var/log/zulip/tornado.log.1
/var/log/zulip/tornado.log.10
/var/log/zulip/tornado.log.2
/var/log/zulip/tornado.log.3
/var/log/zulip/tornado.log.7
/var/log/zulip/tornado.log.8
/var/log/zulip/tornado.log.9
```
This is quite confusing, since most other files in `/var/log/zulip/`
use `.1` to mean logrotate was used. Also note that these are not all
sequential -- 4, 5, and 6 are mysteriously missing, though they were
used in previous restarts. This can make it extremely hard to debug
logs from a particular Tornado shard.
Give the logfiles a consistent name, and set them up to logrotate.
Calling `render()` in a middleware before LocaleMiddleware has run
will pick up the most-recently-set locale. This may be from the
_previous_ request, since the current language is thread-local. This
results in the "Organization does not exist" page occasionally being
in not-English, depending on the preferences of the request which that
thread just finished serving.
Move HostDomainMiddleware below LocaleMiddleware; none of the earlier
middlewares call `render()`, so are safe. This will also allow the
"Organization does not exist" page to be localized based on the user's
browser preferences.
Unfortunately, it also means that the default LocaleMiddleware catches
the 404 from the HostDomainMiddlware and helpfully tries to check if
the failure is because the URL lacks a language component (e.g.
`/en/`) by turning it into a 304 to that new URL. We must subclass
the default LocaleMiddleware to remove this unwanted functionality.
Doing so exposes a two places in tests that relied (directly or
indirectly) upon the redirection: '/confirmation_key'
was redirected to '/en/confirmation_key', since the non-i18n version
did not exist; and requests to `/stats/realm/not_existing_realm/`
incorrectly were expecting a 302, not a 404.
This regression likely came in during f00ff1ef62, since prior to
that, the HostDomainMiddleware ran _after_ the rest of the request had
completed.
This commit moves docs for users/{user_id}/subscriptions/{stream_id}
enndpoint to be after users/me/subscriptions/muted_topics docs.
We are rearranging the docs because after adding the new patch
endpoint for users/{user_id}/subscriptions/{stream_id}, openapi_core
validator tries to match 'users/me/subscriptions/muted_topics'
with 'users/{user_id}/subscriptions/{stream_id}' path in zulip.yaml
and thus gives error while running tests.
This is a bug in 'openapi_core' as it does not follows OpenAPI specs
to match concrete paths before their templated counterparts. Thus,
this commit rearranges the docs such that openapi_core validator
tries to match muted_topics endpoint with the correct path in
zulip.yaml docs.
When converting fenced code markdown, we add the language (if specified)
in a data-attribute by tweaking the HTML generated. Doing so, allows the
frontend to make use of this attr to display view-in-playground option
for codeblocks.
We use pygments to get the lexer subclass name and use that instead of
directly using the language in the data-attribute. Doing so, helps us
map different language aliases (like `js` and `javascript`) into a common
variable (like `JavaScript`) - and avoids the client from dealing with
multiple tags corresponding to the same language.
The html structure for a message like this:
``` js
..content..
```
would now be:
<div class="codehilite" data-codehilite-language="JavaScript">
<pre>..content..</pre>
</div>
Tests and fixtures amended.
This was a broken abstraction that returned to its caller within
multiple forked processes on exceptions, and encouraged ignoring the
error code (as all of its callers did).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Fixes#16284.
Most of the work for this was done when we implemented correct
behavior for guest users, since they treat public streams like private
streams anyway.
The general method involves moving the messages to the new stream with
special care of UserMessage.
We delete UserMessages for subs who are losing access to the message.
For private streams with protected history, we also create UserMessage
elements for users who are not present in the old stream, since that's
important for those users to access the moved messages.
Previously, S3UploadBackend.delete_export_tarball failed to strip the
leading ‘/’ from the export path. This mistake is now caught by Moto
1.3.15. I expect it caused deletion failures in the real S3, although
I haven’t verified this.
We store export_path in the audit log with a leading ‘/’, but the
actual S3 keys do not have a leading ‘/’. Changing either system
would require a migration. So the new convention is that the
variables named ‘export_path’ have a leading ‘/’, while variables
named ‘path_id’ or ‘key’ do not.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Previously, the GitLab webhook code, namely the `get_objects_assignee`
method first tried to get a single assignee and if that failed then it
looks for multiple assignees and then it would return the first
assignee that it found (there's actually a code smell here - a loop
which would always return on the first iteration).
Instead, this commit will change that behavior to first check for
multiple assignees first then for a single assignee if we can't find
multiple assignees. Ultimately it will return a list of all of the
assignees (however many that might be [0, n]). This method has then
aptly been renamed to `get_assignees`.
Finally, we tweked the code using this method to always use it's
output as an "assignees" parameter to templates (there's also an
assignee parameter which we want to avoid here for consistency).
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
For some reasons, some of the fixtures had the +x bit set, while
some didn't. What this commit does is make sure that no fixture
is marked as "executable" (for anyone).
Signed-off-by: Hemanth V. Alluri <hdrive1999@gmail.com>
The previous code only worked by accident and hyperlink 20.0.0 breaks
it.
>>> hyperlink.parse("example.com").replace(scheme="https")
DecodedURL(url=URL.from_text('https:example.com'))
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Django treats path("<name>") like re_path(r"(?P<name>[^/]+)") and
path("<path:name>") like re_path(r"(?P<name>.+)").
This is more readable and consistent than the mix of slightly
different regexes we had before, and fixes various bugs:
• The r'apps/(.*)$' regex was missing a start anchor ^, so it
incorrectly matched all URLs that included apps/ as a substring
anywhere.
• The r'accounts/login/(google)/$' regex was missing a start anchor ^,
so it incorrectly matched all URLs that ended with
accounts/login/google/.
• The type annotation of zerver.views.realm_export.delete_realm_export
takes export_id as an int, but it was previously passed as a string.
• The type annotation of zerver.views.users.avatar takes medium as a
bool, but it was previously passed as a string.
• The [0-9A-Za-z]+ pattern for uidb64 was missing the - and _
characters that can validly be part of a base64url encoded
string (although I think the id is actually a decimal integer here,
in which case only 012345ADEIMNOQTUYcgjkwxyz are present in its
base64url encoding).
Signed-off-by: Anders Kaseorg <anders@zulip.com>
$ref siblings are ignored according to the OpenAPI specification, and
the referenced definitions already have examples.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Replace default root logger with zulip.auth.apple for apple auth
in file zproject/backends.py and update the test cases
accordingly in file zerver/tests/test_auth_backends.py
Replaced mock.patch with assertLogs for testing log outputs
in file test_auth_backends.py.
This change requires adjusting
test_log_into_subdomain_when_email_is_none to use an explicit token
since that appears in the log output.
This clears it out of the data sent to Sentry, where it is duplicative
with the indexed metadata -- and potentially exposes PHI if Sentry's
"make this issue public" feature is used.
The previous link was to "extended callable" types, which are
deprecated in favor of callback protocols. Unfortunately, defining a
protocol class can't express the typing -- we need some sort of
variadic generics[1]. Specifically, we wish to support hitting the
endpoint with additional parameters; thus, this protocol is
insufficient:
```
class WebhookHandler(Protocol):
def __call__(request: HttpRequest, api_key: str) -> HttpResponse: ...
```
...since it prohibits additional parameters. And allowing extra
arguments:
```
class WebhookHandler(Protocol):
def __call__(request: HttpRequest, api_key: str,
*args: object, **kwargs: object) -> HttpResponse: ...
```
...is similarly problematic, since the view handlers do not support
_arbitrary_ keyword arguments.
[1] https://github.com/python/typing/issues/193
`zulip.zerver.lib.webhooks.common` was very opaque previously,
especially since none of the logging was actually done from that
module.
Adjust to a more explicit logger name.
Any exception is an "unexpected event", which means talking about
having an "unexpected event logger" or "unexpected event exception" is
confusing. As the error message in `exceptions.py` already explains,
this is about an _unsupported_ event type.
This also switches the path that these exceptions are written to,
accordingly.
8e10ab282a moved UnexpectedWebhookEventType into
`zerver.lib.exceptions`, but left the import into
`zserver.lib.webhooks.common` so that webhooks could continue to
import the exception from there.
This clutters things and adds complexity; there is no compelling
reason that the exception's source of truth should not move alongside
all other exceptions.
The main race conditions, which actually happened in production was with
concurrent execution of deliver_email and clear_scheduled_emails.
clear_scheduled_emails could delete all email.users in the middle of
deliver_email execution, causing it to pass empty to_user_ids list to
send_email. We mitigate this by getting the list of user ids in a single
query and moving forward with that snapshot, not having to worry about
database data being mutated anymore.
clear_scheduled_emails had potential race conditions with concurrent
execution of itself due to not locking the appropriate rows upon
selecting them for the purpose of potentially deleting them. FOR UPDATE
locks need to be acquired to prevent simultaneous mutation.
Tested manually with some print+sleep debugging to make some races
happen.
fixes #zulip-2k (sentry)
There are three functional side effects:
• Correct an insignificant but mathematically offensive bias toward
repeated characters in generate_api_key introduced in commit
47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased
from 190.52864 bits to 190.53428 bits.
• Use the base32 alphabet in confirmation.models.generate_key; its
entropy is reduced from 124.07820 bits to the documented 120 bits, but
now it uses 1 syscall instead of 24.
• Use the base32 alphabet in get_bigbluebutton_url; its entropy is
reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall
instead of 10.
(The base32 alphabet is A-Z 2-7. We could probably replace all of
these with plain secrets.token_urlsafe, since I expect most callers
can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without
problems.)
Signed-off-by: Anders Kaseorg <anders@zulip.com>
For web-public streams, clients can access full topic history
without being authenticated. They only need to additionally
send "streams:web-public" narrow with their request like all
the other web-public queries.
This verifies that we actually do enqueue a record when there is an
error on non-staging. With the previous commit, it verifies that that
data serializes correctly.
The return type of `ugettext_lazy('...')` (aliased as `_`) is a
promise, which is only forced into a string when it is dealt with in
string context. This `django.utils.functional.lazy.__proxy__` object
is not entirely transparent, however -- it cannot be serialized by
`orjson`, and `isinstance(x, str) == False`, which can lead to
surprising action-at-a-distance.
In the two places which will serialize the role value (either into
Zulip's own error reporting queue, or Sentry's), force the return
value. Failure to do this results in errors being dropped
mostly-silently, as they cannot be serialized and enqueued by the
error reporter logger, which has no recourse but to just log a
warning; see previous commit.
When we do this forcing, explicitly override the language to be the
realm default. Failure to provide this override would translate the
role into the role in the language of the _request_, yielding varying
results.
AdminNotifyHandler is used to notify admins of errors; it is a
critical piece of logic. Failures in reporting errors will compound,
since its `except Exception` clauses cannot generate logging at the
`error` or `exception` level, as that would be recursive. It must
settle for logging at the `warning` level, and hope that admins are
vigilant to the logging there.
Increase the chances of being notified of failures in this logger, by
bubbling up those exceptions to Sentry, which is an orthogonal
reporting stack.
When user requests for a realm that doesn't exists, we raise
a InvalidSubdomainError.
This reduces our effort at repeatedly ensuring realm is valid
in request in web-public queries.
If there are unsupported keys, we still log an error,
but we now also send a message to the stream. (This
is a good tradeoff for the github webhook, since users
can just turn off notifications if they find it spammy.
Also, we intend to support "repository" soon.)
This is a bit of an experiment to see how this plays
in the field:
* will customers notice the change?
* will Sentry reports look any different?
The main thing fixed here is that we weren't turning
on our keys into a list. And then I refined the message
a bit more, including sorting the keys.
I also avoid the unnecessary "else".
The EVENT_FUNCTION_MAPPER maps a string event name
to a function handler. Before this we circumvented
mypy checks with a call to get_body_function_based_on_type,
which specified Any as the type of our event function.
Now the types are rigorous.
This change was impossible without the recent commit
to introduce the Helper class.
The Helper class will soon grow, but the immediate
problem it solves is the need to jankily inspect
the parameters of our get_*_body function.
Most of the changes were handled by an ad hoc
munge.py script.
The substantive changes were adding the Helper
class and passing it in.
And then the linter discovered a place where
the optional include_title parameter wasn't used
(which is one of the reasons to avoid the janky
inspect-signature technique).
As a side note, none of the include_title parameters
needed a default value of False, as we always passed
in an explicit value.
We test cover both sides of include_title, which
you can verify by hard coding it to either True or
False (and seeing the relevant failures), although I
suspect most individual codepaths
only test one value, based on whether "topic" is in
the fixture or not.
Finally, I know Helper is not a great name, but I
intend to evolve the class a bit before deciding
whether a more descriptive name is helpful here.
(For example, an upcoming commit will add a
log_unexpected helper method.)
We get the header_event one level up the call
stack now, too.
It's somewhat annoying that we have our own
concept of "event" here, instead of just returning
our event handlers directly, or just calling them
directly, but it's a bit non-trivial to fix that
right away.
In passing, I remove the strange OR for "ping",
which is already a key in EVENT_FUNCTION_MAPPER.
See https://github.com/zulip/zulip/issues/16258 for
possible follow up here.
We now ignore the following two new pull_request
actions (as well as the three existing ones
from before):
approved
converted_to_draft
As the issue above indicates, we may want to actually
support "approved" if we can find somebody to work
on the webhook. (And then the issue goes a little
broader than what changed here.)
We consolidate the tests and remove the fixtures, which
just have a lot of noisy fields that we ignore. Also,
pull_request__request_review_removed was named improperly.
Before this the only way we took advantage
of the summary from UnexpectedWebhookEventType
was by looking at exc_info().
Now we just explicitly add it to the log
message, which also sets us up to call
log_exception_to_webhook_logger directly
with some sort of "summary" info
when we don't actually want a real
exception (for example, we might want to
report anomalous webhook data but still
continue the transaction).
A minor change in passing is that I move
the payload parameter lexically.
Our isort configuration was almost Black-compatible, but we were
missing ensure_newline_before_comments.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
Some users setup zulip with trailing / at end, like 'https://meet.jit.si/
leading to extra / on clients while generating video chat link.
This commit removes trailing '/' if it exists to make it consistent. Manual
testing was done by generating jitsi url.
Fixes#16225
We eliminate optional parameters and replace `request_body`
with `payload`.
There is much less confusion if we just pass in `payload`,
and then we optionally re-format it if it's json.
For unclear reasons the original code was trying to
do `request_body = str(payload)` when `request_body`
was no longer being used.
The query to finds and marks all unread UserMessages in the stream as read
can be quite expensive, so we'll move that work to the deferred_work
queue and split it into batches.
Fixes#15770.
In 468c5b9a58 we changed the method of
getting the list of management commands. Using app_config.path has a
caveat in that the value depends on the path from which we're executing.
An example of things breaking can be reproduced by calling
/home/vagrant/zulip/tools/test-backend TestCommandsCanStart
This makes the app_config.path values to start with /home/vagrant/zulip,
but DEPLOY_ROOT in the dev environment is set to /srv/zulip.
/home/vagrant/zulip is a soft link to /srv/zulip, so it's a valid path
to call test-backend through, but it causes self.commands to end up
being an empty list. We fix this by converting app_config.path to the
real path.
Rather than catching, checking action type, and possibly re-raising,
instead return None explicitly from `get_subject_and_body`, which
already signals for a blank success result. This collocates the logic
of the action types in one place, and removes the complexity of the
re-raise.
Sentry may get reported multiple exceptions stacks, in the case where
a `raise ...` was caught, and a new exception was `raise`d. In this
case, the `filename` is the most recent exception -- but the
exceptions are stored in the `exception` key in the order in which
they occurred. As such, taking the first value with a `stacktrace`
will result in showing the wrong line, or in no stack trace being
resolved at all.
Look from the last `exception` backwards, for matching stacks.
Since ALL_HOTSPOTS is a global object, it is initialized
at the time the backend server is started. Hence, the
title and description is translated only once. Using
ugettext_lazy makes sure that the strings are translated
in each and every request according to the language
of the user.
Fixes#16224
This commit fixes examples in "400" response for deactivating user
endpoints to have msg as "Cannot deactivate the last organization
owner" instead of "Cannot deactivate the last organization
administrator".
We had already removed the restriction on deactivating last admin
and added it for last owner, while adding owner role.
The `typing: stop` event did not have any tests in test_events
hence its documentation wasn't added. So add tests and relevant
documentation for the typing stop event. Also edit the documentation
of `typing: start` to include the fact that servers should use
their own timeout incase `stop` event event isn't received.
Fixes#16122.
We display the text of the consent message, and then continue with the
export, which will scroll the content off the screen. Allow the
administrator time to examine the contents of the message, and decide
whether to proceed based on that and the fraction of users that have
responded so far.
We raise two types of json_unauthorized when
MissingAuthenticationError is raised. Raising the one
with www_authenticate let's the client know that user needs
to be logged in to access the requested content.
Sending `www_authenticate='session'` header with the response
also stops modern web-browsers from showing a login form to the
user and let's the client handle it completely.
Structurally, this moves the handling of common authentication errors
to a single shared middleware exception handler.