Previously, we didn't pass customized HTTP_HOST headers when making
network requests. As we move towards a world where everything is on a
subdomain, we'll want to start doing that.
The vast majority of our test code is written to interact with the
default "zulip" realm, which has a subdomain of "zulip". While
probably longer-term, we'll wish this was the root domain, for now, we
need to make our HTTP requests match what is expected by the test
code.
This commit almost certainly introduces some weird bugs where code was
expecting a different subdomain but the tests doesn't fail yet. It's
not clear how to find all of these, but I've done some grepping.
get_realm_by_email_domain was intended to be registration flow code
not used in other code, but it was leaked to a few places. This
removes one of the main remaining references to it outside the
registration code path.
The older fixture for this event assumed that the "assignee" key
had a value of '{}' if no one was assigned to a PR anymore.
However, that is no longer true because testing with requestbin
showed that in the latest JSON payload for this event, the key
"assignee" now has the value of 'null' (None when converted to
Python) when a user is unassigned from a PR. The current code
didn't handle this correctly. This commit makes sure it does!
Its unclear as to whether the old fixture was simply wrong or
whether GitHub changed its payloads in any way.
Previously, Zulip's server logs would not show which user or client
was involved in login or user registration actions, which made
debugging more annoying than it needed to be.
This is mostly pure code extraction.
It also removes some dead code in update_muted_topic, where
were updating muted_topics spuriously before calling
do_update_muted_topic.
Previously, realm.uri and realm.host didn't support using a subdomain
of the empty string (""), aka using the root domain.
Also, since we're already accessing self.subdomain, we don't need to
check REALMS_HAVE_SUBDOMAINS again.
Unlike creating a stream, there's really no reason one would want to
call the function to create a realm while uncertain whether that realm
already existed.
For filters like has:link, where the web app doesn't necessarily
want to guess whether incoming messages meet the criteria of the
filter, the server is asked to query rows that match the query.
Usually these queries are search queries, which have fields for
content_matches and subject_matches. Our logic was handling those
correctly.
Non-search queries were throwing an exception related to tuple
unpacking. Now we recognize when those fields are absent and
do the proper thing.
There are probably situations where the web app should stop hitting
this endpoint and just use its own filters. We are making the most
defensive fix first.
Fixes#6118
Most of the code in show_unreads is for diagnosising unread
counts issues, and we may not use that often.
We're creating a dedicated fix_unreads management command with
less clutter.
This change is mostly based on a similar commit from hackerkid
in a feature branch. It borrows both code and ideas. Some of
it's my own stuff, as I was working on a newer branch.
We now call get_user_including_cross_realm_email() inside of
user_profiles_from_unvalidated_emails(), instead of using
get_user_profile_by_email.
This requires a few of our callers to pass down sender into us.
One consequence of this change is that we change the symptoms
for trying to send to emails outside of your realm. In some
cases, we simply raise an error that an email is invalid to us
instead of getting into the deeper validate_recipient_user_profiles
check.
We are trying to convert emails to user_profiles earlier in
the codepath. This may cause subtle changes in which errors
appear, but it's probably generally good to report on bad
addressees sooner than later.
This class simplifies the calling sequence to methods like
check_message and _internal_prep_message, and it's also more
type safe.
Checking for message types is encapsulated with calls to is_stream()
and is_private(). There are also shortcut constructors when you
know that the type of the address (stream vs. private), which is often.
In this we basically seed a single message for the user which will
be soft deactivated by sending a stream message / group PM to
ensure that is has at least one UserMessage row, since in real
world every human user will always have at least one User Message
row.
This causes `upgrade-zulip-from-git`, as well as a no-option run of
`tools/build-release-tarball`, to produce a Zulip install running
Python 3, rather than Python 2. In particular this means that the
virtualenv we create, in which all application code runs, is Python 3.
One shebang line, on `zulip-ec2-configure-interfaces`, explicitly
keeps Python 2, and at least one external ops script, `wal-e`, also
still runs on Python 2. See discussion on the respective previous
commits that made those explicit. There may also be some other
third-party scripts we use, outside of this source tree and running
outside our virtualenv, that still run on Python 2.
An expression like `force_bytes(chr(...))`, on Python 3 where the
`force_bytes` finds itself with something to do because `chr` returns
a text string, gives the UTF-8 encoding of the given value as a
Unicode codepoint.
Here, we don't want that -- rather we want the given value as a
single byte. We can do that with `struct.pack`.
This fixes an issue where the "Link with Webathena" flow was producing
invalid credential caches when run on Python 3, breaking the Zephyr
mirror for any user who went through it anew.
This management command creates the same indexes as migrations
82, 83, and 95, which are all indexes on the huge UserMessage
table. (*)
This command quickly no-ops with clear messaging when the
indexes already exist, so it's idempotent in that regard. (If
somebody somehow creates an index by the same name incorrectly,
they can always drop it in dbshell and re-run this command.)
If any of the migrations have not been run, which we detect simply
by the existence of the indexes, then we create them using a
`CREATE INDEX CONCURRENTLY` command. This functionality in
postgres allows you to create indexes against large tables
without disrupting queries against those tables. The tradeoff
here is that creating indexes concurrently takes significantly
longer than doing them non-concurrently.
Since most tables are small, we typically just use regular
Django migrations and run them during a brief interval while
the app is down.
For indexes on big tables, we will want to run this command
as part of the upgrade process, and we will want to run
it while the app is still up, otherwise it's pointless.
All the code in create_indexes() is literally copy/pasted
from the relevant migrations, and that scheme should work
going forward. (It uses a different implementation of
create_index_if_not_exist than the migrations use, but the
code is identical lexically in the function.)
If we ever do major restructuring of our large tables, such
as UserMessage, and we end up droppping some of these indexes,
then we will need to make this command migrations-aware. For
now it's safe to assume that indexes are generally additive in
nature, and the sooner we create them during the upgrade process,
the better.
(*) UserMessage is huge for large installations, of course.
Before this change, server searches for both
`is:mentioned` and `is:alerted` would return all messages
where the user is specifically mentioned (but not
at-all mentions).
Now we follow the JS semantics:
is:mentioned -- all mentions, including wildcards
is:alerted -- has an alert word
Here is one relevant JS snippet:
} else if (operand === 'mentioned') {
return message.mentioned;
} else if (operand === 'alerted') {
return message.alerted;
And here you see that `mentioned` is OR'ed over both mention flags:
message.mentioned = convert_flag('mentioned') || convert_flag('wildcard_mentioned');
The `alerted` flag on the JS side is a simple mapping:
message.alerted = convert_flag('has_alert_word');
Fixes#5020
Given typeahed and the fact that this only worked if the person had a
full name that didn't contain whitespace, this side effect of the
original @shortname mentionfeature that we removed was experienced by
users as a bug.
Fixes#6142.
We apparently were using the default of num_before=1, not
num_before=0, which meant that if the very last randomly generated
message was one by cordelia mentioning lunch,
test_get_messages_with_search would fail because there were actually 3
matches.
This adds the authors to the Zulip repository on GitHub from
/authors/ along with re-styling the page to fit the same
aesthetic as /for/open-source/ and other product-pages.
This fixes the significant duplication of code between the
authenticate_log_and_execute_json code path and the `validate_api_key`
code path.
These's till a bit of duplication, in the form of `process_client` and
`request._email` interactions, but it is very minor at this point.
The old iOS app has been gone from the app store for 8 months, never
had a huge userbase, and its latest version didn't need this hack. So
this code is unlikely to do anything in the future; remove it to
declutter our authentication decorators codebase.
The check itself was correct, but the error message was in fact the
opposite of what this check is for. In other words, the only things
these users can do is post messages, and the error message when you
tried to do something else was to tell you that the user can't post
messages.
This technically changes the behavior in the case that
!settings.ZILENCER_ENABLED but is_remote_zulip_server(role).
Fortunately, that case is mostly irrelevant (in that remote zulip
servers is a Zilencer feature). The old behavior was also probably
slightly wrong, in that you'd get a zilencer-specific error message in
that case.
The new endpoints are:
/json/mark_stream_as_read: takes stream name
/json/mark_topic_as_read: takes stream name, topic name
The /json/flags endpoint no longer allows streams or topics
to be passed in as parameters.
This function optimizes marking streams and topics as read,
by using UserMessage.where_unread(), which uses a partial
index on the "read" flag.
This also simplifies the code path for ordinary message
flag updates.
In order to keep 100% line coverage, I simplified the
logging in update_message_flags, so now all requests
will show the "actually" format.
This is an interim step toward creating dedicated endpoints
for marking streams/topics as reads, so we do error checking
with asserts for flag/operation, so we don't introduce a
temporary translation string.
This is mostly a pure code extraction, except that we now
disregard the `messages` option for stream/topic updates,
since the web app always passes in an empty list (and this
commit is really just an incremental step toward creating
new endpoints.)
This is the first part of a larger migration to convert Zulip's
reactions storage to something based on the codepoint, not the emoji
name that the user typed in, so that we don't need to worry about
changes in the names we're using breaking the emoji storage.
We recently changed the populate_db data set to include more variable
message content, which happened to include the possibility of the word
"lunch" appearing in the test messages. This caused occasional
failures of the search tests that looked for messages containing
"lunch" starting at the beginning of time, not the beginning of the
test.
This commits adds new helper functions which are:
* get_users_for_soft_deactivation(): This function can be used to
fetch a list of human users which pass the criteria of minimum
inactivity period (in days) passed as a parameter to the function.
* do_soft_activate_users(): Given a list of users this function
reactivates them and help them catch up with the missing message
rows for them in the UserMessage table.
This function will help us in creating undisturbed experience for
returning soft deactivated users.
Tweaked by tabbott to fix minor performance and clarity issues.
This field is convenient for bankruptcy checks. Clients could
calculate it from page_params.unread_msgs before this change, but
it would kind of a painful calculation.
To add count, we had to simplify the mypy annotations, which weren't
really accurate before.
Update Email, Beanstalk, Hubot, JIRA, and Trello integrations
links.
The Hubot integrations section (/integrations#hubot-integrations)
was removed in an earlier redesign of /integrations. This commit
replaces the link with the hubot-scripts organization on
Github, which displays the comprehensive list of all integrations
available via Hubot.
Fixes#5875.
We were exiting this function in certain cases before updating
mentions. This bug was always there, but it was flaky in terms
of database setup whether the tests would fail, so now the
relevant test sends three consecutive messages.
We also avoid putting duplicate message ids in mentions.
This should significantly improve the user experience for new users
signing up with GitHub/Google auth. It comes complete with tests for
the various cases. Further work may be needed for LDAP to not prompt
for a password, however.
Fixes#886.
This allows us to go to Registration form directly. This behaviour is
similar to what we follow in GitHub oAuth. Before this, in registration
flow if an account was not found, user was asked if they wanted to go to
registration flow. This confirmation behavior is followed for login
oauth path.
It's hard to find literature with the community tone we're going for, that
is consistent with the Zulip code of conduct, etc.
This commit removes the special tooling for Gutenberg plays, and changes the
text to be some mixture of scigen, Communications From Elsewhere,
chat.zulip.org, and various books from the public domain.
We apparently were not correctly clearing the user_profile's email
address from caches when changing email addresses, which meant that
trying to look up the old email in the user_profile caches would still
work.
Fixes#6035.
This no longer does the correct thing (in terms of onboarding emails,
default streams, etc), and is tempting for new server admins to use.
Once we remove it we'll also have the invariant that we can't have a realm
without a user, which will simplify accounts_register a bit.
The "all" option for 'message/flags' was dangerous, as it could
apply to any of our flags. The only flag it made sense for, the
"read" flag, now has a dedicated endpoint.
This change simplifies how we mark all messages as read. It also
speeds up the backend by taking advantage of our partial index
for unread messages. We also use a new statsd indicator.
This now breaks the process of cleaning up unread counts for
non-active streams into a three step process.
This allows us to use our unread message flags index, at least
in testing on dev. Here is the relevant excerpt from explain
analyze:
Bitmap Index Scan on zerver_usermessage_unread_message_id
This makes supervisor see the service as cheerfully running
and let it alone, rather than constantly retry starting it.
Because the crash/restart loop means repeatedly spending a
couple of seconds loading Django and the app, separated by
brief periods while supervisor notices the crash and acts
on it, it was actually consuming about 30-50% CPU on the
zulipchat.com staging server.
zerver/message.py used it in this way previously when the type was not
a stream, so the type has been set to match usage and implementation.
Also added docstring to clarify this for the specific function.
Create a generator script to pull lines from a play, enhancing
random lines with emoji, Markdown and other flair.
With numerous contributions from Rein Zustand and Tim Abbott to finish
the project.
Fixes: #1666.
Some of this code was only used by the `active_user_stats`
management command deleted in the previous commit. Other
code appears to have already been dead. Remove it all.
This completes the major endpoint migrations to eliminate legacy API
endpoints from Zulip.
There's a few other things that will happen naturally, so I believe
this fixes#611.
While we do have some known cases where syntax diverges intentionally,
this change should make it a lot easier to maintain
markdown.contains_backend_only_syntax over time.
interface_type select menu will be used to choose the interface
for outgoing webhooks. It will be displayed only when the selected
bot type is OUTGOING WEBHOOK type. The default value is GENERIC
interface type (1).
This is to be used for the case of container orchestration instead of
shell arg to prevent snooping by any user account on the server via `ps
-ef` or any superuser with read access to the user\'s bash history.
We are adding a new list of unread message ids grouped by
conversation to the queue registration result. This will allow
clients to show accurate unread badges without needing to load an
unbound number of historic messages.
Jason started this commit, and then Steve Howell finished it.
We only identify conversations using stream_id/user_id info;
we may need a subsequent version that includes things like
stream names and user emails/names for API clients that don't
have data structures to map ids -> attributes.
In anticipation of have all unread message ids available to the
web app in page_params (via a separate effort), we are simplifying
the /topics endpoint to no longer return unread counts.
Instead we have a list of tiny dictionaries with these fields:
name - name of the topic
max_id - max message id for the topic (aka most recent)
The items in the list are order by most-recent-topic-first.
This route is called only in `js/compose.js`, to handle autosubscribe.
That code doesn't check this "exists" field, because there's no need
-- the same information is already carried in whether the result was
success or failure. So just eliminate it.
This makes the logic here a little simpler. It also eliminates
another usage of the `data` parameter to `json_error`. I have half a
mind to eliminate that parameter, in favor of making `JsonableError`
subclasses whenever there's structured data to include, in particular
to get the benefits of typing. There are a couple of places where
that change isn't locally a clear win, but this is not one of them.
This error isn't saying that any kind of authentication or
authorization failed -- it's just a validation error like
any other validation error in the values the user is asking to
set. The thought of authentication comes into it only because
the setting happens to be *about* authentication.
Fix the error to look like the other validation errors around it,
rather than give a 403 HTTP status code and a "reason" field that
mimics the "reason" fields in `api_fetch_api_key`.
This allows us to reliably parse the error in code, rather than
attempt to parse the error text. Because the error text gets
translated into the user's language, this error-handling path
wasn't functioning at all for users using Zulip in any of the
seven non-English languages for which we had a translation for
this string.
Together with 709c3b50f which fixed a similar issue in a
different error-handling path, this fixes#5598.
Process the unicode emojis in twitter link previews and render them
properly. Before this we were not processing the unicode emojis in
twitter link previews and hence on the systems which don't have
fonts for displaying them they were rendered as blank boxes.
Fixes: #5427.
This commit renames list named `to_linkify` in twitter link processor
to `to_process` and adds a `type` field to each entry in it to
indicate the type of data represented by that particular entry.
Add test to check if the embedded bot service being used is in the
registry or not.
Add test to check if the bot being added to the registry has a valid
bot corresponding to it.
Move 'get_bot_handler' to 'zerver/lib/bot_lib.py' as it is an independent
function, not related to the 'EmbeddedBotWorker' class that it was
previously a part of.
This fixes the original issue that #5598 was the root cause of; when
the user returns to a Zulip browser tab after they've been idle past
the timeout (10 min, per IDLE_EVENT_QUEUE_TIMEOUT_SECS), we now
correctly reload the page even if they're using Zulip in German or
another non-English language where we have a translation for the
relevant error message.
The one purpose this exception was serving was to carry a message
in `msg`. We can do that with `JsonableError`, and as a bonus replace
a repetition of the familiar "'result': 'error', ..." JSON pattern
with a call to a common implementation.
Also wrap the error messages for translation -- we hadn't been doing
that, oops. Our linter notices that issue now that it's the familiar
JsonableError class.
There's one other potential change in behavior here: this
except-clause might now catch a JsonableError raised from some other
code. That seems like a bonus, if so; the handler isn't doing
anything actually specific to this code, and the more exceptions it
successfully turns into proper error responses to the client and lines
in the log, the better.
All JsonableError subclasses now have corresponding ErrorCode values
of their own, reducing the number of different patterns for using
the new JsonableError API.
This provides the main infrastructure for fixing #5598. From here,
it's a matter of on the one hand upgrading exception handlers -- the
many except-blocks in the codebase that look for JsonableError -- to
look beyond the string `msg` and pass on the machine-readable full
error information to their various downstream recipients, and on the
other hand adjusting places where we raise errors to take advantage
of this mechanism to give the errors structured details.
In an ideal future, I think all exception handlers that look (or
should look) for a JsonableError would use its contents in structured
form, never mentioning `msg`; but the majority of error sites might
continue to just instantiate JsonableError with a string message. The
latter is the simplest thing to do, and probably most error types will
never have code looking for them specifically.
Because the new API refactors the `to_json_error_msg` method which was
designed for subclasses to override, update the 4 subclasses that did
so to take full advantage of the new API instead.
This simplifies things for all codepaths not involving this feature.
Using this feature becomes slightly easier when you're already
defining a subclass, but now requires you to define a subclass.
Currently we use it just once out of >100 uses of JsonableError, and
that use already has a subclass, so this seems like a win.
With #5598 there will soon be an application-level error code
optionally associated with a `JsonableError`, so rename this
field to make clear that it specifically refers to an
HTTP status code.
Also take this opportunity to eliminate most of the places
that refer to it, which only do so to repeat the default value.
The file `zerver/lib/request.py` doesn't have type annotations
of its own; if they did, they would duplicate the annotations that
exist in its stub file `zerver/lib/request.pyi`. The latter exists
so that we can provide types for the highly dynamic `REQ` and
`has_request_variables`, which are beyond the type-checker's ken
to type-check, but we should minimize the scope of code that gets
that kind of treatment and `JsonableError` is not at all the sort of
code that needs it.
So move the definition of `JsonableError` into a file that does
get type-checked.
In doing so, the type-checker points out one issue already:
`__str__` should return a `str`, but we had it returning a `Text`,
which on Python 2 is not the same thing. Indeed, because the
message we pass to the `JsonableError` constructor is generally
translated, it may well be a Unicode string stuffed full of
non-ASCII characters. This is potentially a bit of a landmine.
But (a) it can only possibly matter in Python 2 which we intend to
be off before long, and (b) AFAIK it hasn't been biting us in
practice, so we've probably reasonably well worked around it where
it could matter. Leave it as is.
The whole thing is an error, so "message" is a more apt word for the
error message specifically. We abbreviate that as `msg` in the actual
HTTP responses and in the signatures of `json_error` and friends, so
do the same here.
In order to benefit from the modern conveniences of type-checking,
add concrete, non-Any types to the interface for JsonableError.
Relatedly, there's no need at this point to duck-type things at
the places where we receive a JsonableError and try to use it.
Simplify those by using straightforward standard typing.