The previous zerver_* names were unwieldy and not very readable. This also
puts more of the useful information in one place; in particular, makes it
easier to skim a CountStat declaration and see if we're collecting it at a
user/stream granularity or a realm granularity.
It turned out to not be that useful once we added subgroup. The previous
design of the CountStat object also assumed more reuseability of the *_query
strings than what ended up happening.
The filter_args also had some carrying costs:
* It's hard to be confident that filter_args other than the ones explicitly
in our tests would have had expected behavior.
* The filter_args/join_args system is the most complex part of the CountStat
object, and makes understanding the *_query strings unnecessarily
difficult for a new contributor.
Groundwork for allowing stats like "Monthly Active Users".
CountStat.interval is no longer as clean a value as before, so removed it
from views.get_chart_data. It wasn't being used by the frontend anyway.
Removing interval from logger calls in counts.py is not a big loss since we
now include the frequency (which is typically also the interval) in
CountStat.property.
The code in fixtures.py is only called from populate_analytics_db, and is
only used for generating pretty fixture data for manual testing. This commit
adds tests for a few things that were easy to add tests for, and provides
some minimal coverage of the file, but is not meant to be comprehensive.
Originally, all the client names in populate_analytics_db started with
underscores to make it easy to selectively delete and regenerate them when
re-running populate_analytics_db.
We eventually want to merge populate_analytics_db into populate_db though,
in which case it makes more sense for them to share client names, and not
worry about the case where we run (or re-run) populate_analytics_db
independently of populate_db.
It will simplify the logic needed to process the "Sent by Me" view in
Messages Sent Over Time in stats.js.
Also, we gzip the data sent from our server, so there is little additional
network usage by doing this.
Django 1.10 has changed the implementation of this function to
match our custom implementation; in addition to this, we prefer
render().
Fixes#1914 via #4093.
count_message_type_by_user_query is in a different format (no WHERE clause)
from the rest since I'm having a hard time reasoning about how that would
interact with the LEFT JOIN, especially given that there are %(join_args)s.
Analytics database tables are getting big, and so we're likely moving to a
model where ~all stats are day stats, and we keep hourly stats only for the
last N days.
Also changed the name because:
* messages_sent_* suggests the counts (summed over subgroup) should be the
same as the other messages_sent stats, but they are different (these don't
include PMs).
* messages_sent_by_stream:is_bot:day is longer than 32 characters, the max
allowable length for a BaseCount.property.
Includes a database migration to remove the old stat from the analytics
tables.
When you pass a naive datetime to the Django ORM, it uses settings.TIME_ZONE
for the time zone. In the development environment, both settings.TIME_ZONE
and datetime.now() use 'America/New_York', so there is no change in behavior
there. (fromtimestamp with no tz argument uses the same timezone as
datetime.now)
We are soon going to change settings.TIME_ZONE to UTC, so need to remove
naive datetimes from queries to the ORM.
This actually fixes previously broken behavior, since 'date' here gets
turned into the 'day' argument of seconds_active_during_day(day), where
tzinfo is set to UTC.
API: Adds a "display_order" to the response, which is a suggested order of
importance for the clients or recipient types respectively.
frontend: Changes messages_sent_by_{client,recipient_type} to use a fixed
order for any given user.
Also includes a number of changes to messages_sent_by_recipient_type that
were convenient to do at the same time, since the two charts share a lot of
code.