This code is going to end up pretty complex -- each stat has multiple levels
of aggregation (UserCount, RealmCount, InstallationCount), and refinement
(subgroups), and soon we'll have charts that take data from multiple stats
as input.
Not sure what the best way to present it is, but hopefully this simplifies
it a bit.
We use "Everyone" for the button labels already.
Soon we'll support "Everyone" meaning either the installation or the realm,
depending on the URL route used to access the stats.
In this commit:
Two new URLs are added, to make all realms accessible for server
admins. One is for the stats page itself and another for getting
chart data i.e. chart data API requests.
For the above two new URLs corresponding two view functions are
added.
I've wanted this when looking at a tab from the day before.
Also provides the date and time in UTC, which is handy for
interpreting some of the data.
Pretty sure this is not the world's cleanest way to do this in the
front-end code. It'll do for now.
Substantively, this makes the table more readable by grouping users
into expanding sets by level of activity: active in last day, active
in last week, have an account at all. The class "active in last week",
as opposed to "active in last week but not in last day", makes more
natural comparisons both between realms and for one realm through time,
and it's less sensitive to the details of our definitions.
This also makes the terminology more standard. We already made that
change in the display, in the previous commit; as we go through the
logic here, we adjust the terminology in the code too.
Previously, entering a non-UTC end time for a daily stat would give you
incorrect results. This is because:
* All daily stats are collected at and have end_times in the database in
midnight UTC.
* For daily stats, time_range returns a list of datetimes at midnight in the
timezone of its end argument. These datetimes are the only ones we look
for when looking for rows corresponding to the stat in the database.
* Previously, we passed on the end argument from the API to time_range,
without modification.
This consists of the `zulip_ops::stats` Puppet class, which has apparently
not been used since 2014, and a number of files that I believe were
only used for that. Also a couple of tiny loose ends in other files.
sort_client_labels sorts first by total, and then to ensure deterministic
outcomes, sorts (reverse) alphabetically by label.
Fixes regression introduced in 0c0e539.
Previously we showed the total number of users with an active account. This
changes it to show only the number of users that have logged in in the past
two weeks.
Rename 'zulip_internal' decorator to 'require_server_admin', add
documentation for 'server_admin', explaining how to give permission
for ./activity page.
Fixes: #1463.
Groundwork for allowing stats like "Monthly Active Users".
CountStat.interval is no longer as clean a value as before, so removed it
from views.get_chart_data. It wasn't being used by the frontend anyway.
Removing interval from logger calls in counts.py is not a big loss since we
now include the frequency (which is typically also the interval) in
CountStat.property.
Originally, all the client names in populate_analytics_db started with
underscores to make it easy to selectively delete and regenerate them when
re-running populate_analytics_db.
We eventually want to merge populate_analytics_db into populate_db though,
in which case it makes more sense for them to share client names, and not
worry about the case where we run (or re-run) populate_analytics_db
independently of populate_db.
It will simplify the logic needed to process the "Sent by Me" view in
Messages Sent Over Time in stats.js.
Also, we gzip the data sent from our server, so there is little additional
network usage by doing this.
Django 1.10 has changed the implementation of this function to
match our custom implementation; in addition to this, we prefer
render().
Fixes#1914 via #4093.
API: Adds a "display_order" to the response, which is a suggested order of
importance for the clients or recipient types respectively.
frontend: Changes messages_sent_by_{client,recipient_type} to use a fixed
order for any given user.
Also includes a number of changes to messages_sent_by_recipient_type that
were convenient to do at the same time, since the two charts share a lot of
code.
This adds a frontend for the analytics system we've had for a few
months, showing several graphs of the data in Zulip.
There's a ton more that we can do with this tooling, but this initial
version is enough to provide users with a pretty good experience.
Fixes#2052.
Makes a number of simplications to the analytics views code. The main one is
that we now return the entire data series, even if the data is eventually
going to go into a pie chart. This was prompted by us wanting several
different pie charts for each stat (one for last 30 days, one for all time,
etc), but I think it is also a more natural API. The total amount of data
being sent for the pie charts now is maybe half of what is being sent for
our single 'hourly' stat, or maybe up to 10,000 ints per year the
organization has been around.
The other big change is that the data being sent back is now always explicit
about whether it is data about the realm (stored in data['realm'], or data
about the user (stored in data['user']).
Not sure if this would actually be a performance problem in practice, but
this was originally making a database query for each subgroup (instead of
just a single query getting data for all the subgroups).
Also removed the filter against the interval column, which will soon not be
needed (interval will be uniquely determined by the property).
interval refers to a time interval, and frequency refers to something that
semantically means something closer to 'hourly' or 'daily'.
Currently, interval can have values 'hour', 'day', or 'gauge', and frequency
can only have values 'hour' and 'day'.
Type of parameter for function `is_recent`(line no.812) is `datetime`.
MyPy errors out, however, when the parameter is defined as `datetime`.
To get around, type `Any` is used.
This results in a substantial performance improvement for all of
Zulip's backend templates.
Changes in templates:
- Change `block.super` to `super()`.
- Remove `load` tag because Jinja2 doesn't support it.
- Use `minified_js()|safe` instead of `{% minified_js %}`.
- Use `compressed_css()|safe` instead of `{% compressed_css %}`.
- `forloop.first` -> `loop.first`.
- Use `{{ csrf_input }}` instead of `{% csrf_token %}`.
- Use `{# ... #}` instead of `{% comment %}`.
- Use `url()` instead of `{% url %}`.
- Use `_()` instead of `{% trans %}` because in Jinja `trans` is a block tag.
- Use `{% trans %}` instead of `{% blocktrans %}`.
- Use `{% raw %}` instead of `{% verbatim %}`.
Changes in tools:
- Check for `trans` block in `check-templates` instead of `blocktrans`
Changes in backend:
- Create custom `render_to_response` function which takes `request` objects
instead of `RequestContext` object. There are two reasons to do this:
1. `RequestContext` is not compatible with Jinja2
2. `RequestContext` in `render_to_response` is deprecated.
- Add Jinja2 related support files in zproject/jinja2 directory. It
includes a custom backend and a template renderer, compressors for js
and css and Jinja2 environment handler.
- Enable `slugify` and `pluralize` filters in Jinja2 environment.
Fixes#620.
get_realm is better in two key ways:
* It uses memcached to fetch the data from the cache and thus is faster.
* It does a case-insensitive query and thus is more safe.
To make this give accurate numbers, we need to filter out the
automated traffic from Zephyr mirroring and our internal monitoring.
(imported from commit 83642bc9a9d8d01dd9dc5dc7b3e3dee6c9705162)
This shows the number of messages sent by humans for the last
eight 24-hour periods, for each realm. "Messages sent" isn't a
perfect metric of activity, but it's easier to query with our
current data model than certain other statistics.
(imported from commit 9de3c479640a0b9dbc017b245dda21d951f4efa4)
This makes it easier to see how many messages are being sent
by webhook bots. This assumes a 1:1 relationship between
hitting webhook endpoints and sending messages, which is probably
valid enough in the near future.
(imported from commit eb272cd38b9cabd54d317ce2dfdf12099d302fce)
For legacy reasons, this template wanted each tab's content as
a one-key dictionary, instead of a string. Each tab already has
a tuple to allow for fields like title, so this wasn't really giving us
any long term flexibility; it was just crufting up the calling
code.
(imported from commit 2a316107ec223a83efa8735f4810a6fa43107541)