The code comment explains this issue in some detail, but essentially
in Kubernetes and Docker Swarm systems, the container overlayer
network has a relatively short TCP idle lifetime (about 15 minutes),
which can lead to it killing the connection between Tornado and
RabbitMQ.
We fix this by setting a TCP keepalive on that connection shorter than
15 minutes.
Fixes#10776.
This is following the change to the /users endpoint where we allow
an optional parameter "include_custom_profile_fields" which would
allow the client to request for users' custom profile fields along
with their other standard data.
The previous example no longer gives a good enough idea of what the user
can expect when the `include_custom_profile_fields` boolean parameter is
set to true.
The url scheme is now /accounts/login/social/saml/{idp_name} to initiate
login using the IdP configured under "idp_name" name.
display_name and display_logo (the name and icon to show on the "Log in
with" button) can be customized by adding the apprioprate settings in
the configured IdP dictionaries.
login_context now gets the social_backends list through
get_social_backend_dicts and we move display_logo customization
to backend class definition.
This prepares for easily adding multiple IdP support in SAML
authentication - there will be a social_backend dict for each configured
IdP, also allowing display_name and icon customization per IdP.
This changes the way django_to_ldap_username works to make sure the ldap
username it returns actually has a corresponding ldap entry and raise an
exception if that's not possible. It seems to be a more sound approach
than just having it return its best guess - which was the case so far.
Now there is a guarantee that what it returns is the username of an
actual ldap user.
This allows communicating to the registration flow when the email being
registered doesn't belong to ldap, which then will proceed to register
it via the normal email backend flow - finally fixing the bug where you
couldn't register a non-ldap email even with the email backend enabled.
These changes to the behavior of django_to_ldap_username require small
refactorings in a couple of other functions that call it, as well as
adapting some tests to these changes. Finally, additional tests are
added for the above-mentioned registration flow behavior and some
related corner-cases.
Instead of mocking the _LDAPUser class, these tests can now take
advantage of the test directory that other ldap are using. After these
changes, test_query_email_attr also verifies that query_ldap can
successfully be used to query by user email, if email search is
configured.
Fixes#11878
Instead of a confusing mix of django_auth_backed applying
ldap_to_django_username in its internals for one part of the
translation, and then custom logic for grabbing it from the email
attribute of the ldapuser in ZulipLDAPAuthBackend.get_or_build_user
for the second part of the translation,
we put all the logic in a single function user_email_from_ldapuser
which will be used by get_or_build of both ZulipLDAPUserPopulator and
ZulipLDAPAuthBackend.
This, building on the previous commits with the email search feature,
fixes the ldap sync bug from issue #11878.
If we can get upstream django-auth-ldap to merge
https://github.com/django-auth-ldap/django-auth-ldap/pull/154, we'll
be able to go back to using the version of ldap_to_django_username
that accepts a _LDAPUser object.
With this, django_to_ldap_username can take an email and find the ldap
username of the ldap user who has this email - if email search is
configured.
This allows successful authenticate() with ldap email and ldap password,
instead of ldap username. This is especially useful because when
a user wants to fetch their api key, the server attempts authenticate
with user_profile.email - and this used to fail if the user was an ldap
user (because the ldap username was required to authenticate
succesfully). See issue #9277.
This fixes a collection of bugs surrounding LDAP configurations A and
C (i.e. LDAP_APPEND_DOMAIN=None) with EmailAuthBackend also enabled.
The core problem was that our desired security model in that setting
of requiring LDAP authentication for accounts managed by LDAP was not
implementable without a way to
Now admins can configure an LDAPSearch query that will find if there
are users in LDAP that have the email address and
email_belongs_to_ldap() will take advantage of that - no longer
returning True in response to all requests and thus blocking email
backend authentication.
In the documentation, we describe this as mandatory configuration for
users (and likely will make it so soon in the code) because the
failure modes for this not being configured are confusing.
But making that change is pending work to improve the relevant error
messages.
Fixes#11715.
The value of realm attribute in confirmation object used to be empty
before. We are not currently using the realm attribute of reactivation
links anywhere. The value of realm stored in content_object is currently
used.
We currently have code to calculate the value of realm_icon_url,
admin_emails and default_discount in two diffrent places. With
the addition of showing confirmation links it would become three.
The easiest way to deduplicate the code and make the view cleaner
is by doing the calculations in template. Alternatively one can
write a function that takes users, realms and confirmations as
arguments and sets the value of realm_icon_url, admin_emails and
default_discount appropriately in realm object according to the
type of the confirmation. But that seems more messy than passing
the functions directly to template approach.
Most of the failures were due to parameters that are not intended to
be used by third-party code, so the correct fix for those was the set
intentionally_undocumented=True.
Fixes#12969.
MigrationsTestCase is intentionally omitted from this, since migrations
tests are different in their nature and so whatever setUp()
ZulipTestCase may do in the future, MigrationsTestCase may not
necessarily want to replicate.
new_name and description params should be valid JSON
strings. The format of these params are marked as
json so that the curl example genenrator can convert
them into json strings.
Previously, the logic for determining whether to provide an LDAP
password prompt on the registration page was incorrectly including it
if any LDAP authentication was backend enabled, even if LDAP was
configured with the populate-only backend that is not responsible for
authentication (just for filling in name and custom profile fields).
We fix this by correcting the conditional, and add a test.
There's still follow-up work to do here: We may still end up
presenting a registration form in situations where it's useless
because we got all the data from SAML + LDAP. But that's for a future
issue.
This fixes a bug reported in #13275.
This is a follow-up to b69213808a.
We now actually send messages from the notification_bot, which
is the real usecase for this code.
Also, this cleans up the code and removes needless asserts like
`assertNotEqual(zulip_realm, lear_realm)` making the test easier
to read.
A confirmation object is already created when
do_send_confirmation_email is called just above.
Tweaked by tabbott to remove an unnecessary somewhat hacky database
query.
There are a few outstanding issues that we expect to resolve beforce
including this in a release, but this is good checkpoint to merge.
This PR is a collaboration with Tim Abbott.
Fixes#716.
Priviously, we rendered the topic links using the msg.sender.realm.
This resulted in issues with Zulip's internal bots not having access
to the realm_filters of the destination stream's realm. For example,
sending a message via the email gateway or notification would not
linkify any realm filters that a user would expect them to.
Even though required attribute of stream and stream_id params is marked
false in openapi specification, the API expects atleast one of the
params to be set. There is no way to specify relationships like this
openapi and they dont seem to have any plan to implement this in future.
https://github.com/OAI/OpenAPI-Specification/issues/256
This limit was introduced in c588c79 as a part of the
feature and not due to performance crisis. So we are
increasing this limit to 7 days. Since topics tends to
naturally fizzle after day or two so 7 days limit
would be good enough.
One small change in behavior is that this creates an array with all the
row_objects at once, rather than creating them 1000 at a time.
That should be fine, given that the client batches these in units of
10000 anyway, and so we're just creating 10K rows of a relatively
small data structure in Python code here.
Fixes#1727.
With the server down, apply migrations 0245 and 0246. 0246 will remove
the pub_date column, so it's essential that the previous migrations
ran correctly to copy data before running this.
1. Apply migration 0243 to add date_sent column.
2. Apply migration 0244 to copy pub_date over to date_sent. Can be done
with the server running.
3. With the server down (for consistency between memory and
database state of Django objects), verify consistency with
Message.objects.exclude(date_sent=F("pub_date")).count() == 0
Apparently, our change in b8a1050fc4 to
stop caching responses on API endpoints accidentally ended up
affecting uploaded files as well.
Fix this by explicitly setting a Cache-Control header in our Sendfile
responses, as well as changing our outer API caching code to only set
the never cache headers if the view function didn't explicitly specify
them itself.
This is not directly related to #13088, as that is a similar issue
with the S3 backend.
Thanks to Gert Burger for the report.
The previous code for ensuring the sort order of emoji choices was
correct relied on an OrderedDict structure, which isn't guaranteed to
be preserved when passed to the frontend via JSON (in fact, it isn't,
since we converted the way page_params is passed to use
sort_keys=True). Switch it to a list of dictionaries to correct this.
Fixes#13220.
Previously, we were hardcoding the domain s3.amazonaws.com. Given
that we already have an interface for configuring the host in
/etc/zulip/boto.cfg (which in turn, automatically configures boto), we
just need to actually use the value configured in boto for what S3
hostname to use.
We don't have tests for this new use case, in part because they're
likely annoying to write with `moto` and there hasn't been a huge
amount of demand for it. Since this doesn't regress existing S3
backend support, it seems worth merging.
This patches an issue in f37535044 where we mistakenly tried to send
the function as part of the page_params. Instead, we should just try
to send the list of configuration options (in their user displayable
form).
This change adds the OpenAPI data needed to document the POST and
DELETE methods associated with this endpoint.
Descriptions edited slightly by tabbott.
Apparently, the Zulip notifications (and resulting emails) were
correct, but the download links inside the Zulip UI were incorrectly
not including S3 prefix on the URL, making them not work.
While we're at this, we rewrite the somewhat convoluted previous
system for formatting the data export output.
When using our EMAIL_ADDRESS_VISIBILITY_ADMINS feature, we were
apparently creating bot users with different email and delivery_email
properties, due to effectively an oversight in how the code was
written (the initial migration handled bots correctly, but not bots
created after the transition).
Following the refactor in the last commit, the fix for this is just
adding the missing conditional, a test, and a database migration to
fix any incorrectly created bots leaked previously.
This is also a useful preparatory refactor for having a user setting
controlling whether one's own email address is publicly available
within the organization.
This should dramatically improve the queue processor's performance in
cases where there's a very high volume of requests on a given endpoint
by a given user, as described in the new docstring.
Until we test this more broadly in production, we won't know if this
is a full solution to the problem, but I think it's likely. We've
never seen the UserActivityInterval worker end up backlogged without a
total queue processor outage, and it should have a similar workload.
Fixes#13180.
We don't actually need to go to the memcached (falling back to the
database) to fetch either user or client objects on every event. For
user objects, we actually can just pass through the user ID
transparently; for client objects, we can use an in-process cache,
since the mapping of string to ID never changes.
With the way these tests are, it's unnecessary to have 3 separate
classes, and it makes it confusing to decide where to add a potential
additional mm email test.
This simple backwards-compatible change saves approximately 12% in the
compressed size of the chat.zulip.org page_params. We can do much,
much better by changing the format, but this seems like a good
intermediate step.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
In a gigantic realm where we send several MB of `page_params`, it’s
slightly better to have the rest of the `<body>` available to the
browser earlier, so it can show the “Loading…” spinner and start
fetching subresources.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
any_oauth_backend_enabled is all about whether we will have extra
buttons on the login/register pages for logging in with some non-native
backends (like Github, Google etc.). And this isn't about specifically
oauth backends, but generally "social" backends - that may not rely
specifically rely on Oauth. This will have more concrete relevance when
SAML authentication is added - which will be a "social" backend,
requiring an additional button, but not Oauth-based.
This sidesteps tricky escaping issues, and will make it easier to
build a strict Content-Security-Policy.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This sidesteps tricky escaping issues, and will make it easier to
build a strict Content-Security-Policy.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Previously, incorrectly passing an existing directory to the
`manage.py export --output` option would remove its contents without
warning. Abort instead.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
* Whitelist a small number of image/ types to be served as
non-attachments.
* Serve the file using the type that we validated rather than relying
on an independent guess to match.
This issue can lead to a stored XSS security vulnerability for older
browsers that don't support Content-Security-Policy.
It primarily affects servers using Zulip's local file uploads backend
for servers running Ubuntu 16.04 Xenial or newer; the legacy local
file upload backend for (now EOL) Ubuntu 14.04 Trusty was not affected
and it has limited impact for the S3 upload backend (which uses an
unprivileged S3 bucket domain to serve files).
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This brings us in line, and also allows us to style these more like
unordered lists, which is visually more appealing.
On the backend, we now use the default list blockprocessor + sane list
extension of python-markdown to get proper list markup; on the
frontend, we mostly return to upstream's code as they have followed
CommonMark on this issue.
Using <ol> here necessarily removes the behaviour of not renumbering
on lists written like 3, 4, 7; hopefully users will be OK with the
change.
Fixes#12822.
Also cleans up the interface between the management command and the
LDAP backends code to not guess/recompute under what circumstances
what should be logged.
Co-authored-by: mateuszmandera <mateusz.mandera@protonmail.com>
The order of operations for our LDAP synchronization code wasn't
correct: We would run the code to sync avatars (etc.) even for
deactivated users.
Thanks to niels for the report.
Co-authored-by: mateuszmandera <mateusz.mandera@protonmail.com>
Fixes#13130.
django_auth_ldap doesn't give any other way of detecting that LDAPError
happened other than catching the signal it emits - so we have to
register a receiver. In the receiver we just raise our own Exception
which will properly propagate without being silenced by
django_auth_ldap. This will stop execution before the user gets
deactivated.
So the reason 38f8cf612c seems
to be flaking is because the value of harry id switches between
1 and 2 in Xenial while in Bionic it would be fixed at 2. The
reason behind this is that Bionic ships with Python3.6 which
preserves dict insert order while Python3.5 that ships with Xenial
dont preserve the order. In initialize_stream_membership_dicts
we iterate user_data_map dict and the order in which the iteration
happens affects the ID of the users.
Papertrail sends requests with the content type
`application/x-www-form-urlencoded`, with the payload parameter holding the
JSON body. This commit fixes the papertrail integration to use the payload
parameter in the request's POST data instead of trying to parse the
request's entire body as JSON.
Papertrail documentation here:
https://help.papertrailapp.com/kb/how-it-works/web-hooks#encoding
We have a very useful piece of code, _RateLimitFilter, which is
designed to avoid sending us a billion error emails in the event that
a Zulip production server is down in a way that throws the same
exception a lot. The code uses memcached to ensure we send each
traceback roughly once per Zulip server per 10 minutes (or if
memcached is unavailable, at most 1/process/10 minutes, since we use
memcached to coordinate between processes)
However, if memcached is down, there is a logging.error call internal
to the Django/memcached setup that happens inside the cache.set() call,
and those aren't caught by the `except Exception` block around it.
This ends up resulting in infinite recursion, eventually leading to
Fatal Python error: Cannot recover from stack overflow., since this
handler is configured to run for logging.error in addition to
logging.exception.
We fix this using a thread-local variable to detect whether we are
being called recursively.
This change should prevent some nasty failure modes we've had in the
past where memcached being down resulted in infinite recursion
(resulting in extra resources being consumed by our error
notifications code, and most importantly, the error notifications not
being sent).
Fixes#12595.
There's no reason for this to be a category of error that emails the
server administrator, since there's a good chance that fixing it will
need to be done in the Zulip codebase, not administrator action.
Fixes#9401.
This adds a FAKE_EMAIL_DOMAIN setting, which should be used if
EXTERNAL_HOST is not a valid domain, and something else is needed to
form bot and dummy user emails (if email visibility is turned off).
It defaults to EXTERNAL_HOST.
get_fake_email_domain() should be used to get this value. It validates
that it's correctly set - that it can be used to form valid emails.
If it's not set correctly, an exception is raised. This is the right
approach, because it's undesirable to have the server seemingly
peacefully operating with that setting misconfigured, as that could
mask some hidden sneaky bugs due to UserProfiles with invalid emails,
which would blow up the moment some code that does validate the emails
is called.
Apparently, due to poor naming of the outer capture group we use to
separate the actual match from the surrounding whitespace (etc.) we
use to determine if the syntax is a possible linkifier start/end, if
you created a linkifier using "name" as the capture group, we'd try to
compile a pattern with two capture groups called "name", which would
500, preventing anyone from accessing the organization.
We’re about to start using PostgreSQL-specific syntax that can’t be
stringified without a specified dialect.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This requires part 1 (which can take hours to run but generally
doesn't require downtime) to be completed first.
This portion of the migration will require the server to be completely
down for a brief period; for chat.zulip.org with 250M UserMessage
rows, it took about 60s to run; that time will vary depending on
hardware details like whether the server has an SSD, but fundamentally
shouldn't be long.
Our upgrade-zulip and upgrade-zulip-from-git tools can apply this
migration correctly; nothing special needs to be done.
Fixes#13040.
As part of adding support for more than 2B UserMessage rows in a Zulip
server, we need to change UserMessage.id (a field we don't access but
is needed by Django) from an int to a bigint. This commit is a series
of migrations which create a `bigint_id` column and populates it correctly.
This migration will take a long time to run; on chat.zulip.org (a
server with a lot of history), it took about 4 hours to complete.
How to migrate with minimal downtime:
1. Run `upgrade-zulip-from-git` through this commit. It will install
migration 0238 and then more or less hang while applying migration
0239. Once migration 0238 is completed, however, your server should
be able to be started back up safely while migration 0239 is running.
2. Run `/home/zulip/deployments/next/scripts/restart-server` in a
separate terminal to get Zulip running again.
3. When the `upgrade-zulip-from-git` command finishes, it will
automatically re-restart the Zulip server, leaving you in a consistent
state and ready to do part 2 of the migration.
A useful `manage.py shell` query for checking the state after this
commit is consistent is this:
assert UserMessage.objects.exclude(bigint_id=F("id")).count() == 0
Part of #13040.
Previously, several of our URL patterns accidentally did not end with
`$`, and thus ended up controlling just the stated URL, but actually a
much broader set of URLs starting with it.
I did an audit and fixed what I believe are all instances of this URL
pattern behavior. In the process, I fixed a few tests that were
unintentionally relying on the behavior.
Fixes#13082.
Historically, Zulip's implementation of wildcard mentions never
triggered either email or push notifications, instead being limited to
desktop notifications and the "mentions" counter.
We fix this just by plumbing the "wildcard_mentioned" flag through our
system.
Implements much of
https://github.com/zulip/zulip/issues/6040#issuecomment-510157264.
We're also now ready to seriously work on #3750.
We added default ToS for the development environment a few months
back; as a side effect, we now need to accept ToS when going through
the development environment registration flow, including for our
one-click account creation buttons.
After a new user joins an active organization, it isn't obvious what
to do next; this change causes there to be recent unread messages in
the stream sidebar for the user to click on to get a feel for what's
happening in the organization and experiment with Zulip.
Fixes#6512.
This commit wraps up the major work that we held back when upgrading
py-markdown 2.6.11 to 3.0.1. Since we were making our custom changes
to the link syntax, at the time we stuck to using the old method of
parsing links. This lays the groundwork for further changes to our
link and image link handling, and brings us on par with upstream.
Also, we now better document the ways in which our link handling is
different from upstream.
Previously, the unread_msgs data structure accounting (used for both
the web and mobile apps to determine the "Unread mentions" count
displayed in the UI) did not include wildcard mentions at all.
We fix this by adding the logic required to include properly that
data, with tests. As discussed in #6040, it makes sense to include
muted streams and topics for the purpose of this calculation.
Fixes part of #6040.
Apparently, get_active_presence_idle_user_ids, which is carefully
optimized to only fetch data for users who might actually need
notification processing, was only considering PMs and direct mentions,
not wildcard mentions or alert words.
This caused some pretty weird failure modes when working on adding
support for broader mention notifications, because users who had one
of these types of notifications would be treated as never
presence-idle, which was just confusing.
This is part of adding support for notifications for wildcard mentions
and alert words; it's worth merging this as an early commit because
the consequence of not doing this are very difficult to debug.
Rather than continually resetting the contents of an existing event
queue, we allocate a new one for each subtest.
We also fix a rather confusing bundle of comments.