While this functionality to post slow queries to a Zulip stream was
very useful in the early days of Zulip, when there were only a few
hundred accounts, it's long since been useless since (1) the total
request volume on larger Zulip servers run by Zulip developers, and
(2) other server operators don't want real-time notifications of slow
backend queries. The right structure for this is just a log file.
We get rid of the queue and replace it with a "zulip.slow_queries"
logger, which will still log to /var/log/zulip/slow_queries.log for
ease of access to this information and propagate to the other logging
handlers. Reducing the amount of queues is good for lowering zulip's
memory footprint and restart performance, since we run at least one
dedicated queue worker process for each one in most configurations.
We use cloud-config for setting up the SSH keys and executing
some commands. When cloud-config sets the SSH key it doesn't override
the existing keys. So we need to set the SSH keys manually using a command
instead. This means we no longer require cloud config. We can instead
pass a bash script as the user data instead of cloud-config.
I also included a command to set the SSH key of the root.
size_slug represents the plan the droplet should be created on.
Since the new base droplet is created on the cheaper but more
feature rich new plan we have to update the slug_size as well
to take advantage of the cheaper plan.
Previously a YAML syntax error resulted in an
UnhandledPromiseRejectionWarning and a successful exit code; now it
gives a clear message and a failing exit code.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
The major PROVISION_VERSION bump would not be needed, but it was
missing in commit 5ab62a3514 (#14834),
so I’m doing it here.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
We now use the `--streamlined` options for `run-dev.py`
when we use `test_server_running` for `test-api` and
`test-js-with-casper` (and its experimental
replacement, `test-js-with-puppeteer`).
This means we don't slow anything down with
processes like thumbor, process_fts_updates, etc.,
which aren't meaningfully exercised by these tests.
We may eventually want some tests to meaningfully
exercise those processes, and when that day comes,
we will need to add an extra argument to
`test_server_running`, probably, but until then,
we just always set `--streamlined` in that codepath.
There is actually a tool called `./tools/test-run-dev`
that we run in CI, and it will use the full mode.
It just doesn't verify much stuff--it mostly polls
the server without testing specific features.
This seems to save about 1s of the startup time on a system I use
(~10.6s -> ~9.7s).
For basic testing (either manual or automated), we
generally only need the server and tornado running.
Obviously, it's nice to test the complete system,
but if you're on a slow PC, the overhead can be
annoying.
Note that we don't launch any of these processes
in `--streamlined` mode:
process_queue
process_fts_updates
deliver_scheduled_messages
thumbor
And then by not launching process_queue, we avoid
several child processes.
Basic functionality like sending messages will
still work here.
The streamlined mode may be helpful in debugging
our generally slow server startup time. Obviously,
some of the problem with startup is the auxiliary
processes here, but removing them as a variable
could help us focus on getting the core stuff fast.
Note that we still have the webpack watcher running
in streamlined mode.
For the particular case of thumbor, note that we
modify the proxy server to explicitly print and
return an error if we get a `/thumbor/*` request.
We clean up the code related to launching
processes here.
We extract:
server_processes
We also extract these helper for webpack
stuff:
do_one_time_webpack_compile
start_webpack_watcher
And then we move the code to actually launch
them lexically within the file (so as not to
be obscured by various function definitions).
Here is the new output for displaying ports:
Zulip services will listen on ports:
9991: web proxy
9992: Django
9993: Tornado
9994: webpack
9995: Thumbor
Note to Vagrant users: Only the proxy port (9991) is exposed.
I tone down the yellow for the Vagrant warning, and I show
the web proxy in cyan to emphasize it.
I also extracted the code into a function, and I don't call
that function until after `app.listen()`. (The users probably
won't notice much difference in the timing of this message, but
the message won't show if the `listen` step fails for some
reason, which I think is what we want here.)
We remove the import-tools code that was plunked
right into the middle of our command line
arguments.
Then we add a local var called `DESCRIPTION` to
fix some ugly code formatting, and we stop with the
unnecessary `r` prefix to the multi-line string.
This does not rely on the desktop app being able to register for the
zulip:// scheme (which is problematic with, for example, the AppImage
format).
It also is a better interface for managing changes to the system,
since the implementation exists almost entirely in the server/webapp
project.
This provides a smoother user experience, where the user doesn't need
to do the paste step, when combined with
https://github.com/zulip/zulip-desktop/pull/943.
Fixes#13613.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
This will give help up write new digest only if the db rebuild
succeeds. We were relying on the caller to
be successful in building db, this was hacky and unreliable.
We write new db digest once the caller succeeds, this ensures
that we write new digest after every successful attempt.
This fixes the anomality we were facing that Databases were rebuild
on the 2nd provision attempt with no changes to files or migrations.
This was happening because we didn't write a new digest for db
after the first provision (The case of DB didn't exist).
During the 1st provision, we check the template_status() of
Database both Dev and Test, but database_exists() of Databases
obviously returned false, and we rebuild the database,
but forgot to write_new_digest and hence the anomaly in the
second provision explained above.
Yes, it's slightly janky to create an
argparse.Namespace object like this, but it
saves us from shelling out to a script whose
only real value-add is parsing a single
`threshold_days` argument.
This saves about 130ms for a no-op provision.
We now just have two modes for setting up a dev/test
database. This makes it easy to see these things
side-by-side, when you're trying to understand how
the two different databases get built:
# dev:
USERNAME=zulip
DBNAME=zulip
STATUS_FILE_NAME=migration_status_dev
# test:
USERNAME=zulip_test
DBNAME=zulip_test
STATUS_FILE_NAME=migration_status_test
And then we make it more explicit the things that
are common between dev and test (which are
important things to understand when troubleshooting
provision-related glitches):
SEARCH_PATH=zulip,public
PASSWORD=$("$(dirname "$0")/../../scripts/get-django-setting" LOCAL_DATABASE_PASSWORD)
DBNAME_BASE=${DBNAME}_base
We lose some "generality" here, but passing in arbitrary
combinations of username/dbname/status_file to the script
would cause chaos for our digest checks, and all the different
template/base databases could cause confusion too.
We now prevent these variations:
* <hr/>
* <hr />
* <br/>
* <br />
We could enforce similar consistency for other void
tags, if we wished, but these two are particularly
prevalent.
Instead of figuring out the image path from the integration name in the
puppeteer script, we do it in the `generate-integration-docs-screenshot`
script and pass it as an argument to `message-screenshot.js`.
Since in travis we don't have root access so we used to add different
srv path. As now we shifted our production suites to Circle CI
we don't need that code so removed it.
Also we used a hacky code in commit-lint-message for travis which is
now of no use.
The test-all tool is useful if you want to be really sure everything
passes, but it's almost always better to run a smaller subsuite when
doing active development, and just run test-all (or CI) at the end.
Our docs already covered this issue well, but this presents the idea
in a place one is definitely looking at when running test-all.
Previously, we added support for 'none', 'plain' and 'noop' and a
function `lang = remap_language(lang)`. This also had the potential
to encourage adding more remappings- something that we deliberatly
want to keep to a minimum.
For context, Anders K doesn't want us to keep any remapping (only
keeping 'text' which is the default no-op lexer that pygments has)
and Tim wants to keep 'plain' and 'text'. We should only document
and advertise 'text'.
In Django 2.1, the preferred way to express a nullable BooleanField
changed from NullBooleanField to passing null=True to BooleanField.
This updates our codebase to use the preferred API. Tweaked by
tabbott to update the linter rules.
The migration is a noop for Django accounting only.
Part of #11341.
This allows to run scripts between extraction and install
process.
It will be used to restore npm caches for production install jobs.
We extract the tarball in the working directory so that yarn.lock and
package.json are available to restore cache.
(And also so the path is deterministic).
Now that we've cleaned up this tool's output, there's no reason to use
an awkward mechanism to hide its output; we can just print it out like
a normal program.
Fixes#14644; resolves#14701.
We now forbid tags of the form `<foo ... />` in most
places, and we also forbid it even for several void
tags.
We make exceptions for tags that are already formatted
in two different ways in our codebase. This is mostly
svg tags, plus these common cases:
- br
- hr
- img
- input
It would be nice to lock down a convention for these,
even though the HTML specification is unopinionated
on these. We'll probably want to stay flexible for
svg tags, since they are sometimes copy/pasted from
other sources (although it's probably rare enough for
them that we can tolerate just doing minor edits as
needed).
If folks put something like '<br/>' in the HTML,
we would think the tag's name was "br/" instead
of "br". I think we were assuming most folks
would write either "<br>" or <br />".
ASIDE:
We should probably have a consistent
preference among these styles:
* <br>
* <br/>
* <br />
I prefer the first.
Runs a test at the end of tools/ci/backend to check if any untracked
files have been created during this ci tests. Exits with error code 1
if untracked files are found, otherwise exits successfully.
Fixes#14691.
We recently changed our droplet setup such that their
host names no longer include zulipdev.org. This caused
a few things to break.
The particular symptom that this commit fixes is that
we were trying to server static assets from
showell:9991 instead of showell.zulipdev.org:9991,
which meant that you couldn't use the app locally.
(The server would start, but the site's pretty unusable
without static assets.)
Now we rely 100% on `dev_settings.py` to set
`EXTERNAL_HOST` for any droplet users who don't set
that var in their own environment. That allows us to
remove some essentially duplicate code in `run-dev.py`.
We also set `IS_DEV_DROPLET` explicitly, so that other
code doesn't have to make inferences or duplicate
logic to detemine whether we're a droplet or not.
And then in `settings.py` we use `IS_DEV_DROPLET` to
know that we can use a prod-like method of calculating
`STATIC_URL`, instead of hard coding `localhost`.
We may want to iterate on this further--this was
sort of a quick fix to get droplets functional again.
It's possible we can re-configure droplets to have
folks get reasonable `EXTERNAL_HOST` settings in their
bash profiles, or something like that, although that
may have its own tradeoffs.
Avatar images for bots used by the tool to generate integration
documentation screenshots are pre-generated and committed to the repository.
The `generate-integration-docs-screenshot` tool now uses these images,
instead of trying to create these avatar images on the fly.
Also, deleted the unused `create_png_from_svg` function.
We now have helpers for the two places where
we create databases.
There was already one helper in place, and
I gave it a more concrete name, to match
its actual database name in postgres.
So `generate-fixtures` only ever did 9 lines of code (really
3 lines of actual code) in its normal mode of operation.
But it was cluttered with lots of stuff that really only
happened when you called it with the `-force` option, which
was only invoked by `rebuild-test-database`.
Now we inline most of the code into `rebuild-test-database`.
And now `generate-fixtures` is simple (and doesn't support
a `-force` flag.
We remove the `generate_fixtures` option here mostly
for simplicity, but in particular to facilitate
an upcoming commit to simplify the job of
`generate-fixtures` (and remove its `--force` option).
The command line option here for `test-backend`
was really calling `generate_fixtures --force`,
which we're about to rename `tools/rebuild-test-database`.
The `test-backend` tools is already smart about catching
up on migrations, so we generally don't need to tell it
to repair the database.
And if the database does get corrupt, you can just do
it directly with `tools/rebuild-test-database`.
This eliminates the `use_force` flag in
`update_test_databases_if_required`, which was easy
to confuse with `rebuild_test_database`.
The other caller wasn't using `use_force`.
The new tools now have more concise, more parallel names:
- rebuild-dev-database
- rebuild-test-database
The actual implementations are still pretty different:
rebuild-dev-database:
mostly delegates to 5 management scripts
rebuild-test-database:
is a very thin wrapper for generate-fixtures
We'll try to clean that up a bit soon.
This tool was part of a very ad hoc investigation
during 2017 into our JS dep dependencies.
It's very out of date, and it has a non-trivial
maintenance cost, as these type of tools seem
to come up in every code sweep.
Commit 35577a1f66 (#9406) moved the
OpenAPI definitions to zerver/openapi, and this script was not
updated, so it has been validating nothing for two years.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Restart postgres service if provision is called in production test suite.
This is required because terminate-psql-sessions script (used
in tools/ci/setup-production) throws error if postgres service is not running.
Restart rabbitmq service if provision is called in production test suite.
This is done to start the node as Circle CI don't start services on installation.
Removed memcached restart as flush-memcached script (which is furthur
used in tools/ci/production) throws UNKNOWN READ FAILURE if memcached is restarted
in development.
Since now we want to use production suites on Circle CI so there
is no need to set TRAVIS in env while running scripts.
CIRCLECI is set default in the enviroment of Circle CI builds
so we can use it directly.
Also Travis CI had rabbitmq-server installed so we had to add workaround
in install script to avoid the error. That workaround is removed.
Used postgres 10 inplace of postgres 9.5 as it is used in Bionic.
Upgraded nginx version in success-http-headers which is in CircleCI
Bionic enviroment.
This seems to be a complicated wrapper around `sed`.
It's not used anywhere or documented anywhere. I
assume it was used in some code sweep or something.
See cec45b0ae5
I built this in 2013 to help me quickly build
some sample data for our very early node tests,
I believe.
Ever since then it's just been changed for various
code sweeps. Also, if we wanted to resurrect this
idea for some reason, we now have the template
parser that was written since then.
This tool was last modified in 2016, has no
documentation, and requires `dot` to run, so
I think we can remove it to reduce some clutter
in the tools directory.
This tool is hackish and incomplete, but it's still
worth looking at if somebody wants to hunt down
obsolete CSS. There are probably better ways to
tackle this problem, so we should eventually just
remove this tool, but it's pretty low maintenance.
Current output looks like this:
$ ./tools/find-unused-css
actions_hovered
actions_link
bookend_tr
btn-skip
company-name
flatpickr-months
label_for_text
loading_more_messages_indicator_box
loading_more_messages_indicator_box_container
logoimage
messages-collapse
messages-expand
numInputWrapper
page_loading_indicator_box
page_loading_indicator_box_container
skinny-user-gravatar
sp-input
summary_colorblock
summary_row
summary_row_private_message
tutorial-done-button
Else for each retry the duplicate commits would be removed
again and again from the contributor's zulip/zulip commits.
This is a bug in the original commit
6bed6ccdcf that added the
functionality to remove duplicate commits.
We now have two functions related to digests
for processes:
is_digest_obsolete
write_digest_file
In most cases we now **wait** to write the
digest file until after we've successfully
run a process with its new inputs.
In one place, for database migrations, we
continue to write the digest optimistically.
We'll want to fix this, but it requires a
little more code cleanup.
Here is the typical sequence of events:
NEVER RUN -
is_digest_obsolete returns True
quickly (we don't compute a hash)
write_digest_file does a write (duh)
AFTER NO CHANGES -
is_digest_obsolete returns False
after reading one file for old
hash and multiple files to compute
hash
most callers skip write_digest_file
(no files are changed)
AFTER SOME CHANGES -
is_digest_obsolete returns False
after doing full checks
most callers call write_digest_file
*after* running a process
There's no real reason to do the lazy import any
more, as we use this unconditionally inside `main`
(indirectly), and `provision_inner` runs after we
have set up the venv.
I make these all functions for consistency,
and in particular I want to continue to avoid
`glob.glob` calls until we are actually
computing hashes.
This is mostly a prep to allow us to do
hashing in two separate places:
- check hashes
- update hashes
We would only update hashes **after** running
processes anew.
For `provision_inner` I considered using a
class to put the three path-related helpers
into a mini namespace, but it felt too heavy.
It wouldn't be completely implausible here
to extract something like a JSON config
file that has a list of globs for each
process that we do path-hashing for, but I
want to clean up other stuff first.
Importing cairosvg in fails in production, because `libgtk-3-dev` is not
available. This commit moves the import for `cairosvg` into the function
where it is used. This code will soon be moved into a separate script that
will not be run on production.
This guarantees that we don’t accidentally upgrade one without the
other, which could happen for example due to different third-party
version constraints between the two.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We no longer need to maintain duplicate code
related to where we set up the emoji
cache directory.
And we no longer need two extra steps for
people doing advanced (i.e. manual) setup.
There was no clear benefit to having provision
build the cache directory for `build_emoji`,
when it was easy to make `build_emoji` more
self-sufficient. The `build_emoji` tool
was already importing the library that has
`run_as_root`, and it was already responsible
for 99% of the create-directory kind of tasks.
(We always call `build_emoji` unconditionally from
`provision`, so there's no rationale in terms
of avoiding startup time or something.)
ASIDE:
Its not completely clear to me why we need
to put this directory in "/srv", instead of
somewhere more local (like we already do for
Travis), but maybe it's just to be like
its siblings in "/srv":
node_modules
yarn.lock
zulip-emoji-cache
zulip-npm-cache
zulip-py3-venv
zulip-thumbor-venv
zulip-venv-cache
zulip-yarn
I guess the caches that we keep in var are
dev-only, although I think some of what's under
`zulip-emoji-cache` is also dev-only in nature?
./var/webpack-cache
./var/mypy-cache
In `docs/subsystems/emoji.md` we say this:
```
The `build_emoji` tool generates the set of files under
`static/generated/emoji` (or really, it generates the
`/srv/zulip-emoji-cache/<sha1>/emoji` tree, and
`static/generated/emoji` is a symlink to that tree;we do this in
order to cache old versions to make provisioning and production
deployments super fast in the common case that we haven't changed the
emoji tooling). [...]
```
I don't really understand that rationale for the development
case, since `static/generated` is as much ignored by `git` as
'/srv' is, without the complications of needing `sudo` to create it.
And in production, I'm not sure how much time we're really saving,
as it takes me about 1.4s to fully rebuild the cache in dev, not to
mention we're taking on upgrade risk by sharing files between versions.
So, `source_emoji_dump` is not the greatest variable
name, but at least we now define it relative to its
parent instead of the `.success-stamp` file. (And
then `success_stamp` is just another join.)
If the directory `templates/zerver/emails/compiled/`
is missing, then we need to run `inline_email_css`
again.
This can happen if somebody gets overzealous about
cleaning untracked files.
This is more encapsulated and more efficient.
In the cases where `is_force` is `True` or
`pygments_data.json` is missing, we now avoid
the unnecessary step of importing `pygments`, at
least up front.
(Of course, we probably import that once we generate
the artifacts.)
If somebody is having issues with provision, it's
plausible they'll do something like `git clean -fX`
to clean up old artifacts of earlier provision runs,
as part of debugging things.
We defend against this by detecting the most obvious
symptom as cheaply as possible.
I remove `is_force` from `file_or_package_hash_updated`
and modernize its mypy annotations.
If `is_force` is `True`, we just now run the thing
we want to force-run without having to call
`file_or_package_hash_updated` to expensively
and riskily return `True`.
Another nice outcome of this change is that if
`file_or_package_hash_updated` returns `True`,
you can know that the file or package has
indeed been updated.
For the case of `build_pygments_data` we also
skip an `os.path.exists` check when `is_force`
is `True`.
We will short-circuit more logic in the next
few commits, as well as cleaning up some of
the long/wrapper lines in the `if` statements.
We change the message for skipping RabbitMQ
configuration to match nearby messages:
No need to run `tools/setup/build_pygments_data`.
No need to run `scripts/setup/inline_email_css.py`.
No need to run `scripts/setup/configure-rabbitmq.
No need to regenerate the dev DB.
No need to regenerate the test DB.
No need to run `manage.py compilemessages`.
For upgrade-zulip-from-git to work, we need to be able to run
update-prod-static on production systems, which means provision code
like this cairosvg logic needs to be there for now.
When creating a webhook integration or creating a new one, it is a pain to
create or update the screenshots in the documentation. This commit adds a
tool that can trigger a sample notification for the webhook using a fixture,
that is likely already written for the tests.
Currently, the developer needs to take a screenshot manually, but this could
be automated using puppeteer or something like that.
Also, the tool does not support webhooks with basic auth, and only supports
webhooks that use json fixtures. These can be fixed in subsequent commits.
We figure out the dev host using the same logic as
dev_settings.py, so that we don't use wrong things
like 127.0.0.1 for droplet users.
And we display the link in cyan.
Generated by `pyupgrade --py3-plus --keep-percent-format` on all our
Python code except `zthumbor` and `zulip-ec2-configure-interfaces`,
followed by manual indentation fixes.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
If a request fails the tool sleeps for some time before making
further requests. The sleep time is a random number between
0 and 2^failures capped at 64 seconds. More details about the
algorithm can be found at https://chat.zulip.org/#narrow/stream/
92-learning/topic/exponential.20backoff.20--.20with.20jitter
This is a prep commit for generating /team page data
using cron job. zerver/tests directory is not present in
production installation. So moving the file from the directory
tests to tools.
As described in the commit that added this function, this fixes one
quite annoying bug and one at least in-principle bug:
* On Windows, the simple version (lacking `git update-index
--refresh`) routinely gives false positives, making the tools
that rely on it basically unusable.
* If you have uncommitted changes in the index but manage to have
the worktree nevevertheless match HEAD, the simple version will
give a false negative and we'd blow away those changes.
This is verbatim from Git upstream, at an older version. (The one
change since then is to add localization for the messages like "You
have unstaged changes" -- which complicates the code, is important and
worth it for Git itself, but for our tools we can do without.)
This function will replace our use of `git diff-index --quiet HEAD`
in several scripts. The key differences in behavior are:
* The `git update-index --refresh`. Without this, on Windows
apparently `git diff-index` routinely (but not all the time!)
reports that tons of files have changed. See report:
https://chat.zulip.org/#narrow/stream/9-issues/topic/.2E.2Ftools.2Ffetch-pull-request.20issue/near/834435
* Instead of one command comparing the worktree to HEAD, we
separately compare the worktree to the index and the index to
HEAD, and abort if either diff is nonempty. This one is obvious,
but rather an edge case (it matters only if you've managed to
make the worktree and HEAD agree while the index has some
changes), and the extra code is annoying if written out in every
script that needs it. But that's what a subroutine is for. :-)
We'll make a few tweaks before actually switching to use this.
The Git commands we're invoking to do the real work are useful to
print, for transparency to see what's happening and that there's no
magic here.
The boring shell stuff like `remote=${2:-"upstream"}` is not so
helpful, and nor is the rather arcane and in any case read-only
command `git diff-index --quiet HEAD`. Those only add noise that
obscures the interesting parts. So, move the `set -x` down to when
we're done with the boring preparatory stuff and ready to perform
the commands that do the work.
Add sgrep (sgrep.dev) to tooling and include simple rule as
proof of concept. Included rule detects use of old django render
function.
Also added a rule that looks for if-else statements where both
code paths are identical.
While we could fix this issue by changing the markdown processor,
doing so is not a robust solution, because even a momentary bug in the
markdown processor could allow cached messages that do not follow our
security policy.
This change ensures that even if our markdown processor has bugs that
result in rendered content that does not properly follow our policy of
using rel="noopener noreferrer" on links, we'll still do something
reasonable.
Co-authored-by: Tim Abbott <tabbott@zulipchat.com>
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Folks can have issues connecting to Casper
as zulipdev.com when they are not connected to
the internet or just have a bad connection, since
the DNS record is on the internet. Folks can
work around this by just creating an /etc/hosts
entry for zulipdev.com, but people don't always
know.
This fix moves the symptom slightly earlier in
the process--we don't advertise that the server
is "up" if you can't actually connect to it as
"zulipdev.com".
In the past it has blocked Python library security updates with overly
strict version bounds, and we don’t use it as a library, only as a
binary.
Skip the PROVISION_VERSION bump because we can use the tx binary from
either location.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This reverts commit 36a8e61e67 (#13934).
The Django 2.2 autoreloader works by forking into a child process that
exits with status 3 when a file changes, and a parent process that
restarts the child when it exits with status 3. Setting this
environment variable had the effect of pretending we were already the
child process, without a parent process to restart it. Therefore,
changing any code used by the queue processor caused it to exit rather
than restart.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
python-dev will be depreciated in Focal but can be used as python2-dev
so removed it from common dockerfile.template and added it
as an extra package in .circleci/config.yml.
Previously, we only did apt updates when our sources.list files or
keys changed, which could result in provisioning errors for
development systems that don't routinely update their apt cache
(probably including ~all Vagrant environments).
cgi.escape is deprecated in python3.2 and removed in python3.8.
This function was unsafe because quote is false by default, hence
removed and replaced with a safer html.escape.
Used get_venv_dependencies function to return the correct dependencies
for RHEL, Centos, Fedora rather than importing them as separate
COMMON_YUM_DEPENDENCIES in provision and create-production-venv.
We now restrict emails on the zulip realm, and now
`email` and `delivery_email` will be different for
users.
This change should make it more likely to catch
errors where we leak delivery emails or use the
wrong field for lookups.
Tests for these links often result in rate-limiting from GitHub,
leading to the builds failing in Circle CI. We temporarily mark
github.com/zulip links as external to keep the builds passing.
Added a get_venv_dependencies() function in setup_venv.py which
returns VENV_DEPENDENCIES according to the vendor and os_version.
The reason for adding this function was because python-dev will be
depreciated in Focal but can be used as python2-dev so when adding
support for Focal VENV_DEPENDENCIES should to be os_version dependent.
There were two problems with the previous code-
1) The code glob.glob("scripts/lib/build-") should be
glob.glob("scripts/lib/build-*) otherwise it would always return [].
2) The part of the code where we included scripts/lib/build-* for sha1 sum
check would only run when debian is not in os_families(). This wasn't
correct as we could have a situation where we have to build pgroonga
from source even in case of debian and so we need to improve the
condition on it.
Now since we only have build-pgroonga there its better to just directly hash
its content with the condition of BUILD_PGROONGA_FROM_SOURCE.
This should fix spurious failures, where test-run-dev would occasionally
freeze. What exactly about these changes was causing that is still to
be potentially investigated. This is merely meant as a fix to the
failures.
This reverts commit 19429c3ad7.
We plan to use these records to check and record the schema of Zulip's
events for the purposes of API documentation.
Based on an original messier commit by tabbott.
In theory, a nicer version of this would be able to work directly off
the mypy type system, but this will be good enough for our use case.
Before this test, we were validating the behavior
of `i18next`, but we weren't validating our light
layer that sits on top of `i18next`, which currently
resides in the slightly misnamed `translations.js`
file.
The translations module is now so small that I'll
just quote it verbatim here:
import i18next from 'i18next';
i18next.init({
lng: 'lang',
resources: {
lang: {
translation: page_params.translation_data,
},
},
nsSeparator: false,
keySeparator: false,
interpolation: {
prefix: "__",
suffix: "__",
},
returnEmptyString: false, // Empty string is not a valid translation.
});
window.i18n = i18next;
We now just do `zrequire('translations')` to initialize
the `i18next` library, which allows us to have simpler
test setup and to actually exercise the above call to
`i18next.init`.
This change now gives us 100% line coverage of `translations.js`,
which of course isn't that hard to acheive (see above).
isort 5 knows not to reorder imports across function calls, so this
will stop isort from breaking our code.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This gives them cache-compatible URLs, and also avoids some extra
copies of the sprite sheet images.
Comments on the Octopus emoji added by tabbott.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
Now the caller simply imports the debug ‘require’ function as a
module, deciding for itself how to expose it and with what name (in
our case, we expose it as ‘require’ with expose-loader). Also, remove
a stray console.log.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This adds a global require() function that makes JS modules accessible
to the browser console without adding them to the global window
object:
» const typeahead = require("./static/shared/js/typeahead");
» typeahead.popular_emojis
Array(6) [ "1f44d", "1f389", "1f642", "2764", "1f6e0", "1f419" ]
The list of known modules is exposed via the keys of require.ids
object.
This will allow us to migrate more modules to ES6 without losing
access to this debugging functionality.
I’ll probably upload this plugin to NPM at some point, but I figured
I’ll let it bake in-tree first.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
This moves some code from settings_display.js
into the new module settings_config.js.
Extracting this module breaks some dependencies
on settings_display.js (which has some annoying
transitive dependencies, including jQuery).
In particular this isolates stream_data from
from settings_display.js.
Two of the three structures that we moved here
weren't even directly used by settings_display.js,
since we do a lot of rendering in the modules
admin.js and setting.js.
We make get_all_display_settings() a function
to avoid a require-time dependency on page_params.
Breaking the dependencies simplifies a few
node tests.
Most of the node test complexity came from the
following commit in March 2019:
5a130097bf
The commit itself seems harmless enough, but
dependencies can have a somewhat "viral" nature,
where making stream_data depend on settings_display
caused us to modify four different node tests.
This cleans up a few things:
- just yield values so we don't have to do
tedious max logic
- use values() instead of items() for
skin_variations loop
In the ideal world the emoji.json would reduce this
code to `get_square_size = lambda data: data['square_size']`,
but I don't think we can get the square size explicitly.
This commit changes the calculation of the
background-size parameter that we use to
render emojis from sprite sheets.
In particular, it now makes the parameter
match the sizes of our latest sprite
sheets from Twitter/Google.
This should fix the geometry aspect of #13959,
but we also need to fix some issues with the
cache being sticky.
There is also some minor cleanup:
- Remove obsolete -moz/-webkit CSS.
- Remove needless precision in percentages.
- Fix the transposed nrows/ncols names.
- Add extensive commenting.
Finally, we add a minor bump to the provision
number. This commit should be merged in the
same series as the other fix for this issue,
which will probably have a major bump, and we'll
need to rebase this appropriately.
While it's a bit of extra complexity to do this check, which I'm not
excited about, we've had multiple folks spend significant time being
confused rebasing past d7d8632525 into
deleting `pygments_data.json`, with provision not rebuilding it, so
this seems worth merging as a transitional fix even if we decide to
remove it in 2 months.
Without calling cov.erase() the data file seems to persist and even
pollute future test runs if not removed. Registering an atexit handler
seems like a good, and reasonably clean way to ensure the cleanup
happens.
Fixes#13933.
This allows us to collect coverage for Handlebars templates, and also
improves the readability of Handlebars-related stack traces.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
In Django 2.2 the autoreload system has changed.
DJANGO_AUTORELOAD_ENV env variable should be set when calling code
that'll use the autoreloader. Otherwise there's some kind of race
condition in the autoreload code when SIGINT is sent, where
restart_with_reloader() (called only if the env variable isn't set)
has the subprocess module calling p.kill() on a process that's already
exited, raising ProcessLookupError and printing an ugly traceback. This
causes non-deterministic test-run-dev failures.
We used to have a block of code doing this just in the presence
endpoint because that's where we'd had error-handling problems with it
not being present, but it seems more correct for it to run
unconditionally on all HTTP requests.
This requires adding a dependency of channel on reload_state, which we
record in the webpack configuration for now.
This should return us to a situation where we won't get blueslip
browser error reporting for users created while a device was offline
just before it reloads.
1) Created a new class `DatabaseType` and access its objects inside
`template_database_status()` instead of sending five arguments with
default values.
2) Made `check_files` and `setting_name` local variables instead of
function parameters since they had same value(None) for every call.
Fixes#13845.
webpack optimizes JSON modules using JSON.parse("{…}"), which is
faster than the normal JavaScript parser.
Update the backend to use emoji_codes.json too instead of the three
separate JSON files.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
I believe we can remove these and rely on
other parts of our testing/code-review
to ensure template quality.
These tests never really exercised our
app code, as evidenced by us not regressing
any of the 100%-line-coverage files.
We have a couple other ways that we verify
the correct format of the templates:
- webpack (can they compile?)
- check-templates (are they nicely indented?)
For deep testing, we have Casper, which
exercises most of our most important templates
in some meaningful way.
I think it's pretty rare that we get bugs
now that are directly caused by bad templates,
and an even smaller subset of them would
have been caught by the node tests.
If that trend changes in the future, I would prefer to
just do something "greenfield" to address
any common problems rather than resurrect
this code, but we could always resurrect it
from git.
The template node tests did check a little bit of
detail about which fields are there, but not
in an integrated way, so that aspect of the tests
wasn't very useful either.
This adds Ubuntu 19.10 as a valid provisioning target.
The release test in setup-apt-repo was changed from a list of values to
a regex check for brevity.
Every CLI program should have a usage message.
Also add a mention in the `push-to-pull-request` usage message of
its participation in the `refs/remotes/pr/` pseudo-remote feature.
This gives us the right behavior when using the `url.*.insteadOf`
mechanism for aliases in Git remote URLs. For example, if
one's ~/.gitconfig has:
[url "git@github.com:"]
insteadOf = gh:
then `git remote add upstream gh:zulip/zulip` will work great, as
the nice, short, mnemonic `gh:` prefix gets expanded to the more
finicky `git@github.com:`. I use just such a prefix routinely.
But the feature does require that scripts go through the right
abstractions. In particular `git remote get-url`, since Git 2.7
(from 2016), exists for exactly this reason. A plain `git config`
command bypasses the expansion, getting the verbatim `gh:...`
version, which doesn't work.
So, switch to that.
As a bonus, we get to behave correctly if for some reason the user
has configured a push URL distinct from the fetch URL for this
remote, just by adding `--push`. With `git config`, we'd have had
to manually implement the fallback from `remote.upstream.pushUrl` to
`remote.upstream.url` in order to properly handle that case.
We prefer this to internal_send_message().
We are trying to deprecate `internal_send_message`,
which has extra moving parts related to
`extract_recipients` and `Addressee.legacy_build`.
There are two chunks of code that I touch here
that look pretty similar, but I'm not quite
sure they're worth de-duplicating, since they
use different topics and different message
content.
Instead of having `notify_new_user` delegate
all the heavy lifting to `send_signup_message`,
we just rename `send_signup_message` to be
`notify_new_user` and remove the one-line
wrapper.
We remove a lot of obsolete complexity:
- `internal` was no longer ever set to True
by real code, so we kill it off as well
as well as killing off the internal_blurb code
and the now-obsolete test
- the `sender` parameter was actually an
email, not a UserProfile, but I think
that got past mypy due to the caller
passing in something from settings.py
- we were only passing in NOTIFICATION_BOT
for the sender, so we just hard code
that now
- we eliminate the verbose
`admin_realm_signup_notifications_stream`
parameter and just hard code it to
"signups"
- we weren't using the optional realm
parameter
There's also a long ugly comment in
`get_recipient_info` related to this code
that I amended for now.
We should try to take action in a subsequent
commit.
We now have 100% line coverage on 71 JS files.
This is thanks to about 150 people who have
contributed code to frontend/node_tests.
And then 126 files are still short of 100% line
coverage.
We now enforce line coverage with a set called
EXEMPT_FILES, which are the files for which
we do NOT expect to have 100% coverage.
Using an exemption list makes it so that adding
a new JS file to the project without 100% line
coverage will cause the build to fail. This will
encourage folks to be intentional about their
lack of test coverage.
If a file that had 100% coverage somehow regressed
to 0% coverage, we would report an error to the
console, but we weren't treating it as an actual
failure.
We've probably always had this bug, but it probably
rarely was an issue, since devs might have seen
the error locally, or hopefully whatever crazy
thing you did to totally remove coverage would
have had other symptoms.
If this was intentional, I suspect it might have
had something to do with wanting to get coverage
reports when you just run individual tests. But
a while back we changed it so that when you run
individual tests, we don't do the line coverage
enforcement.
My upstream remote is named origin, so commit-message-lint was always
complaining at me. Detect the right remote name from the output of
`git remote -v`.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
We now use vdom-ish techniques to track the
list items for the pm list. When we go to update
the list, we only re-render nodes whose data
has changed, with two exceptions:
- Obviously, the first time we do a full render.
- If the keys for the items have changed (i.e.
a new node has come in or the order has changed),
we just re-render the whole list.
If the keys are the same since the last re-render, we
only re-render individual items if their data has
changed.
Most of the new code is in these two modules:
- pm_list_dom.js
- vdom.js
We remove all of the code in pm_list.js that is
related to updating DOM with unread counts.
For presence updates, we are now *never*
re-rendering the whole list, since presence
updates only change individual line items and
don't affect the keys. Instead, we just update
any changed elements in place.
The main thing that makes this all work is the
`update` method in `vdom`, which is totally generic
and essentially does a few simple jobs:
- detect if keys are different
- just render the whole ul as needed
- for items that change, do the appropriate
jQuery to update the item in place
Note that this code seems to play nice with simplebar.
Also, this code continues to use templates to render
the individual list items.
FWIW this code isn't radically different than list_render,
but it's got some key differences:
- There are fewer bells and whistles in this code.
Some of the stuff that list_render does is overkill
for the PM list.
- This code detects data changes.
Note that the vdom scheme is agnostic about templates;
it simply requires the child nodes to provide a render
method. (This is similar to list_render, which is also
technically agnostic about rendering, but which also
does use templates in most cases.)
These fixes are somewhat related to #13605, but we
haven't gotten a solid repro on that issue, and
the scrolling issues there may be orthogonal to the
redraws. But having fewer moving parts here should
help, and we won't get the rug pulled out from under
us on every presence update.
There are two possible extensions to this that are
somewhat overlapping in nature, but can be done
one a time.
* We can do a deeper vdom approach here that
gets us away from templates, and just have
nodes write to an AST. I have this on another
branch, but it might be overkill.
* We can avoid some redraws by detecting where
keys are moving up and down. I'm not completely
sure we need it for the PM list.
If this gets merged, we may want to try similar
things for the stream list, which also does a fairly
complicated mixture of big-hammer re-renders and
surgical updates-in-place (with custom code).
BTW we have 100% line coverage for vdom.js.
This legacy cross-realm bot hasn't been used in several years, as far
as I know. If we wanted to re-introduce it, I'd want to implement it
as an embedded bot using those common APIs, rather than the totally
custom hacky code used for it that involves unnecessary queue workers
and similar details.
Fixes#13533.
responses is an module analogous to httpretty for mocking external
URLs, with a very similar interface (potentially cleaner in that it
makes use of context managers).
The most important (in the moment) problem with httpretty is that it
breaks the ability to use redis in parts of code where httpretty is
enabled. From more research, the module in general has tendency to
have various troublesome bugs with breaking URLs that it shouldn't be
affecting, caused by it working at the socket interface layer. While
those issues could be fixed, responses seems to be less buggy (based
on both third-party reports like ckan/ckan#4755 and our own experience
in removing workarounds for bugs in httpretty) and is more actively
maintained.
In this commit, we basically match any kinda of jinja2 start tag,
no matter its special kind (eg. jinja2_whitespace_stripped_start)
to any kinda jinja2 end tag (eg. jinja2_whitespace_stripped_end)
Idea is special operators like `-` do not change the meaning of
inline tag and thus matching shouldn't depend upon this.
Zulip has had a small use of WebSockets (specifically, for the code
path of sending messages, via the webapp only) since ~2013. We
originally added this use of WebSockets in the hope that the latency
benefits of doing so would allow us to avoid implementing a markdown
local echo; they were not. Further, HTTP/2 may have eliminated the
latency difference we hoped to exploit by using WebSockets in any
case.
While we’d originally imagined using WebSockets for other endpoints,
there was never a good justification for moving more components to the
WebSockets system.
This WebSockets code path had a lot of downsides/complexity,
including:
* The messy hack involving constructing an emulated request object to
hook into doing Django requests.
* The `message_senders` queue processor system, which increases RAM
needs and must be provisioned independently from the rest of the
server).
* A duplicate check_send_receive_time Nagios test specific to
WebSockets.
* The requirement for users to have their firewalls/NATs allow
WebSocket connections, and a setting to disable them for networks
where WebSockets don’t work.
* Dependencies on the SockJS family of libraries, which has at times
been poorly maintained, and periodically throws random JavaScript
exceptions in our production environments without a deep enough
traceback to effectively investigate.
* A total of about 1600 lines of our code related to the feature.
* Increased load on the Tornado system, especially around a Zulip
server restart, and especially for large installations like
zulipchat.com, resulting in extra delay before messages can be sent
again.
As detailed in
https://github.com/zulip/zulip/pull/12862#issuecomment-536152397, it
appears that removing WebSockets moderately increases the time it
takes for the `send_message` API query to return from the server, but
does not significantly change the time between when a message is sent
and when it is received by clients. We don’t understand the reason
for that change (suggesting the possibility of a measurement error),
and even if it is a real change, we consider that potential small
latency regression to be acceptable.
If we later want WebSockets, we’ll likely want to just use Django
Channels.
Signed-off-by: Anders Kaseorg <anders@zulipchat.com>
I added this tool a few years ago, and I did have
a vision for how it would improve our codebase, but
I can't remember exactly where I was going with it.
At this point the tool is just a little too noisy
to be helpful. An example of it creating confusion
was a recent PR where somebody was patching
user_circle_class in the PM list, and we already
had similar code in the buddy list, because they
use the same CSS. I mean, there was possibly a way
that the code could have been structured to remove
some of the duplication, but it probably would have
just moved the complexity around.
I just don't think it's worth maintaining the tool
at this point.