We now use realm_id for querying UserPresence
instead of building a big WHERE clause from the
list of user_ids.
This commit may be a bit hard to measure, since
we still get the list of user_ids for the PushToken
query in the same method.
We now validate streams with a separate
function from PM recipients.
It's confusing enough all the ways you can
encode a stream or encode the PM recipients,
but trying to do it all in one function was
hard to reason about and led to at least one
bug.
In particular, there was a bug where streams
with commas in them would get split. Now
we just don't ever split on commas inside
of `extract_stream_indicator`.
Fixes#13836
After removing internal_send_message() in a recent
commit, we now have only two callers for
extract_recipients, and they are both related
to our REQ mechanism that always passes strings
to converters. (If there are default values,
REQ does not call the converters.)
We therefore make two changes:
- use the more strict annotation of "str"
for the `s` parameter
- don't bother with the isinstance check
This index is intended to optimize the performance of the very
frequently run query of "what is the presence status of all users in a
realm?".
Main changes:
- add realm_id to UserPresence
- add index for realm_id
- backfill realm_id for old rows
- change all writes to UserPresence to include
realm_id
The index is of this form:
"zerver_userpresence_realm_id_5c4ef5a9" btree (realm_id)
We will create an index on (realm_id, timestamp) in a
future commit, but I think it's a bit faster if you do
the backfill before the index.
There's also a minor tweak to the populate_db script.
This is just a refactoring to the more modern API
for sending internal messages.
To make this work we now plumb the email_gateway
flag through `internal_send_stream_message` instead
of `internal_send_message`.
We also change `send_zulip` to have its callers
pass in a full UserProfile object (which one of
them already had).
We prefer this to internal_send_message().
We are trying to deprecate `internal_send_message`,
which has extra moving parts related to
`extract_recipients` and `Addressee.legacy_build`.
There are two chunks of code that I touch here
that look pretty similar, but I'm not quite
sure they're worth de-duplicating, since they
use different topics and different message
content.
Instead of having `notify_new_user` delegate
all the heavy lifting to `send_signup_message`,
we just rename `send_signup_message` to be
`notify_new_user` and remove the one-line
wrapper.
We remove a lot of obsolete complexity:
- `internal` was no longer ever set to True
by real code, so we kill it off as well
as well as killing off the internal_blurb code
and the now-obsolete test
- the `sender` parameter was actually an
email, not a UserProfile, but I think
that got past mypy due to the caller
passing in something from settings.py
- we were only passing in NOTIFICATION_BOT
for the sender, so we just hard code
that now
- we eliminate the verbose
`admin_realm_signup_notifications_stream`
parameter and just hard code it to
"signups"
- we weren't using the optional realm
parameter
There's also a long ugly comment in
`get_recipient_info` related to this code
that I amended for now.
We should try to take action in a subsequent
commit.
This avoids an unnecessary join to UserProfile.
To verify this, you can do `print(queries)` in the
`test_get_custom_profile_fields_from_api` test. It's
kinda noisy, so I excerpted them below...
Before:
SELECT ...
FROM "zerver_customprofilefieldvalue"
INNER JOIN "zerver_userprofile" ON ("zerver_customprofilefieldvalue"."user_profile_id" = "zerver_userprofile"."id")
INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
WHERE "zerver_userprofile"."realm_id" = 2
After:
SELECT ...
FROM "zerver_customprofilefieldvalue"
INNER JOIN "zerver_customprofilefield" ON ("zerver_customprofilefieldvalue"."field_id" = "zerver_customprofilefield"."id")
WHERE "zerver_customprofilefield"."realm_id" = 2'
I don't have any way to measure the two queries with
realistic data, but I would assume the second
query is significantly faster on most of our instances,
since CustomProfileField should be tiny.
The line removed here is a noop, as both sides of the
immediately following conditional reassign the
same variable.
This harmless cruft was the result of the recent commit
1ae5964ab8, which added
support for single-user GETs.
Apparently, the arguments passed to template_database_status were
incorrect for the manual testing development database, in that we
didn't pass a status_dir when calling into that code from provision.
The result was that provisioning before running `test-backend` would
ignore changes to the list of check_files (etc.) made after rebasing,
and vice versa.
The cleanest fix is to compute status_dir from other values passed in;
I'm also going to open a follow-up issue for creating a better overall
interface here.
This adds a new API endpoint for querying basic data on a single other
user in the organization, reusing the existing infrastructure (and
view function!) for getting data on all users in an organization.
Fixes#12277.
This code is a bit flatter and just preps the data
for a single user. There is never any interaction
between the data for user A and user B, so we can
mostly avoid complicated nested data structures
and do most of the data-crunching on a per-user basis.
We also do an explicit sort of the data before
running it through groupby. The explicit sort
simplifies how we calculate `most_recent_info`
and also avoids needing to add `dt` to an intermediate
data structure.
Finally, when it comes to the individual client data,
the code has relied on the assumption that there is
only one row per client, which I believe to be true,
but now the code is more explicit about that.
django-phonenumber-field 2.4.0 adds tighter phone number validation
that rejects +12223334444 for having an invalid area code. This was
reverted in 4.0.0, but django-two-factor-auth still requires <3.99.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This commit includes a new `stream_post_policy` setting,
by replacing the `is_announcement_only` field from the Stream model,
which is done by mirroring the structure of the existing
`create_stream_policy`.
It includes the necessary schema and database migrations to migrate
the is_announcement_only boolean field to stream_post_policy,
a smallPositiveInteger field similar to many other settings.
This change is done to allow organization administrators to restrict
new members from creating and posting to a stream. However, this does
not affect admins who are new members.
With many tweaks by tabbott to documentation under /help, etc.
Fixes#13616.
This flag affects page_params and the
payload you get back from POSTs to this
url:
users/me/presence
The flag does not yet affect the
presence events that get sent to a
client.
This should ensure that folks rebasing past this commit from an older
database model get their database rebuilt in the way that will
match the test_subs.py query count of 40.
We will want to raise RateLimited in authenticate() in rate limiting
code - Django's authenticate() mechanism catches PermissionDenied, which
we don't want for RateLimited. We want RateLimited to propagate to our
code that called the authenticate() function.
As more types of rate limiting of requests are added, one request may
end up having various limits applied to it - and the middleware needs to
be able to handle that. We implement that through a set_response_headers
function, which sets the X-RateLimit-* headers in a sensible way based
on all the limits that were applied to the request.
While the result of this change doesn't completely do what we need, it
does remove a huge amount of duplicated lists of fields. With a bit
more similar work, we should be able to eliminate a broad category of
potential bugs involving Stream and Subscription objects being
represented inconsistently in the API.
Work towards #13787.
This has the side of effect of making new fields we add to Stream be
automatically included, which will help maintain this code as we
upgrade it.
This commit adds is_web_public, history_public_to_subscribers, and
email_notifications fields to the dictionary.
This modifies get_cross_realm_dicts in zerver.lib.users to call
format_user_row. This is done to remove current and prevent future
inconsistencies between in the dictionary formats for get_raw_user_data
and get_cross_realm_dicts.
Implementation substantially rewritten by tabbott.
Fixes#13638.
This moves get_cross_realm_dicts (from zerver.lib.actions),
get_raw_user_data and get_custom_profile_field_values (from
zerver.lib.events) to zerver.lib.users.
This extracts the user_data inner function from get_raw_user_data as a
reusable function. We intend to reuse it for cross-realm user dicts.
A few changes were made while extracting it:
* Renaming the UserProfile argument to acting_user, so we can do loops
over a local user_profile variable.
* Moved it to zerver.lib.users, as that's a more appropriate home for
this function formatting data on users.
* Simplified the calling convention for passing custom profile fields
to reflect the fact that this function processes a single user (and
is expected to be called in a loop).
"Zulip Voyager" was a name invented during the Hack Week to open
source Zulip for what a single-system Zulip server might be called, as
a Star Trek pun on the code it was based on, "Zulip Enterprise".
At the time, we just needed a name quickly, but it was never a good
name, just a placeholder. This removes that placeholder name from
much of the codebase. A bit more work will be required to transition
the `zulip::voyager` Puppet class, as that has some migration work
involved.
A wart that has long been present inin Zulip's get_messages API is how
to request "the latest messages" in the API. Previously, the
recommendation was basically to pass anchor=10000000000000000 (for an
appropriately huge number). An accident of the server's implementation
meant that specific number of 0s was actually important to avoid a
buggy (or at least wasteful) value of found_newest=False if the query
had specified num_after=0 (since we didn't check).
This was the cause of the mobile issue
https://github.com/zulip/zulip-mobile/issues/3654.
The solution is to allow passing a special value of anchor='newest',
basically a special string-type value that the server can interpret as
meaning the user precisely just wants the most recent messages. We
also add an analogous anchor='oldest' or similar to avoid folks
needing to write a somewhat ugly anchor=0 for fetching the very first
messages.
We may want to also replace the use_first_unread_anchor argument to be
a "first_unread" value for the anchor parameter.
While it's not always ideal to make a value have a variable type like
this, in this case it seems like a really clean way to express the
idea of what the user is asking for in the API.
This is required for the upcoming type behavior of the "anchor"
parameter.
This change is the minimal work required to have our OpenAPI code not
fail when checking a union-type value of this form. We'll likely want
to, in the future, do something nicer, but it'd require more extensive
infrastructure for parsing of OpenAPI data that it's worth with our
current approach (we may want to switch to using a library).
The proximal issue here is that in upcoming commits, we're going to
change the type of the `anchor` field in `get_messages_backend` to
support passing either an integer or a string.
Many of our tests using POSTRequestMock currently define a query
object that uses integer values for the integer fields we're going to
pass into it, e.g. {'num_after': 0}. That is the correct type for
that field in the Zulip API, before HTTP encoding turns it into a
string. However, because POSTRequestMock didn't use HTTP encoding at
all (which will convert the 0 into a '0'), it ended up passing an
integer to a function that can't possible receive one as an argument.
Ideally, we'd just get rid of POSTRequestMock, since it's a hack, and
just do real HTTP requests instead.
But since it's used in a lot of places making doing so somewhat
impractical, we can get past this issue by just making POSTRequestMock
convert integers to strings.
Using logging.info() rather than logger.info() meant that our
zulip.soft_deactivation logger configuration (which, in particular,
included not logging to the console) was not active on this log line,
resulting in the `manage.py soft_deactivate_users` cron job sending
emails every time it ran.
Fixes#13750.
Previously, we didn't track opening and closing fences separately,
with led to bugs like not parsing a list that was immediately after
a quoted fence; we treated each ``` as a new fence.
This commit rewrites the function to maintain a stack of currently
open fences. If any of the parent fences is a code fence, we do not
insert a new line before a list.
We also add some test cases specifically to test this behavior with
complexly nested lists.
Fixes#13745.
The desktop otp flow (to be added in next commits) will want to generate
one-time tokens for the app that will allow it to obtain an
authenticated session. log_into_subdomain will be the endpoint to pass
the one-time token to. Currently it uses signed data as its input
"tokens", which is not compatible with the otp flow, which requires
simpler (and fixed-length) token. Thus the correct scheme to use is to
store the authenticated data in redis and return a token tied to the
data, which should be passed to the log_into_subdomain endpoint.
In this commit, we replace the "pass signed data around" scheme with the
redis scheme, because there's no point having both.
This extracts a function for computing show_invites and
show_add_streams, for better readability and testability.
This commit was substantially cleaned up by tabbott.
This legacy cross-realm bot hasn't been used in several years, as far
as I know. If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.
Fixes#13533.
If an email is sent with the .prefer-html option, but it has no html
body, it's better to fall back to plaintext content instead of treating
it as a user error.
Closes#13484.
These options tell zulip whether to prefer the plaintext or html version
of the email message. prefer-text is the default behavior, so including
the option doesn't change anything as of now, but we're adding it to
prepare to potentially change the default behavior in the future.
As we add more address options, which will have different behavior than
simply setting option_name=True, we need to migrate this subsystem to
something that better supports more complex logic and will allow
encapsulating it, instead of needing to be put all over the
decode_email_address function.
This essentially unused legacy variable was causing Zulip to query the
database at import time, which is generally not something we aim to
do.
Combined with the issue fixed in the previous commit, this variable
resulted in test-backend providing an unhelpful crash when provision
hadn't updated the unit testing database.
Since the intent of our testing code was clearly to clear this cache
for every test, there's no reason for it to be a module-level global.
This allows us to remove an unnecessary import from test_runner.py,
which in combination with DEFAULT_REALM's definition was causing us to
run models code before running migrations inside test-backend.
(That bug, in turn, caused test-backend's check for whether migrations
needs to be run to happen sadly after trying to access a Realm,
trigger a test-backend crash if the Realm model had changed since the
last provision).
Due to a known but unfixed bug in the Python standard library’s
urllib.parse module (CVE-2015-2104), a crafted URL could bypass the
validation in the previous patch and still achieve an open redirect.
https://bugs.python.org/issue23505
Switch to using django.utils.http.is_safe_url, which already contains
a workaround for this bug.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Fixes#13416
We used to search only one level in depth through the MIME structure,
and thus would miss attachments that were nested deeper (which can
happen with some email clients). We can take advantage of message.walk()
to iterate through each MIME part.
Previously, if you tried to invite a user whose account had been
deactivated, we didn't provide a clear path forward for reactivating
the users, which was confusing.
We fix this by plumbing through to the frontend the information that
there is an existing user account with that email address in this
organization, but that it's deactivated. For administrators, we
provide a link for how to reactivate the user.
Fixes#8144.
This experimental setting disables sending private messages in Zulip
in a crude way (i.e. users get an error when they try to send one).
It makes no effort to adjust the UI to avoid advertising the idea of
sending private messages.
Fixes#6617.
We should take adventage of the recipient field being denormalized into
the Stream model. We don't need to make queries to figure out a stream's
recipient id, so we take advantage of that to eliminate some of
those redundant queries and simplify StreamRecipientMap.
This improves the approach of creating multiple parallel processes by
using subprocess.Popen() instead of run_parallel() and
subprocess.call() while exporting an organization's message
history. This prevents forking twice for individual subprocess.
While this has some performance benefit, the main reason to fix this
is that it fixes an issue with the data export web UI introduced in
run_parallel forks exited).
Fixes#12904.
process_missed_message did nothing other than calling
send_to_missed_message_address with the same arguments, so there's no
reason to have these as separate functions.
Addresses point 1 of #13533.
MissedMessageEmailAddress objects get tied to the specific that was
missed by the user. A useful benefit of that is that email message sent
to that address will handle topic changes - if the message that was
missed gets its topic changed, the email response will get posted under
the new topic, while in the old model it would get posted under the
old topic, which could potentially be confusing.
Migrating redis data to this new model is a bit tricky, so the migration
code has comments explaining some of the compromises made there, and
test_migrations.py tests handling of the various possible cases that
could arise.