Commit Graph

52 Commits

Author SHA1 Message Date
Tim Abbott e8b0880320 analytics: Log updates to analytics counts. 2017-02-01 17:02:46 -08:00
umkay 76f3d02590 analytics: Add cron job to run analytics jobs.
This adds a cron job to update the Zulip analytics counts, complete
with locking etc.

Substantially tweaked by tabbott.
2017-02-01 17:02:46 -08:00
Tim Abbott 4e171ce787 lint: Clean up E126 PEP-8 rule. 2017-01-23 22:06:13 -08:00
Tim Abbott 9640a9e864 lint: Clean up E712 PEP-8 rule. 2017-01-23 21:11:18 -08:00
Rishi Gupta 734ca4644c analytics: Add random_seed argument to generate_time_series_data. 2017-01-17 15:54:57 -08:00
Rishi Gupta 37bdc7c010 analytics: Remove COUNT_STATS['messages_sent:hour'].
Having both messages_sent:hour and messages_sent:is_bot:day is confusing,
since a single messages_sent:is_bot:hour would have a superset of the
information and take less total space. This commit and its parent together
replace the two stats with a single messages_sent:is_bot:hour.
2017-01-17 15:54:57 -08:00
Rishi Gupta b593ac9d7c analytics: Change messages_sent:is_bot to hourly frequency.
In preparation for replacing messages_sent.
2017-01-17 15:54:57 -08:00
Rishi Gupta 68fcb4152f analytics: Remove interval field from *Count tables.
Includes a database migration. The interval field was originally there to
facilitate time aggregation (e.g. aggregate_hour_to_day), but we now do such
aggregations in views code or in the frontend.
2017-01-17 15:54:57 -08:00
Rishi Gupta a8f2ebb443 analytics: Include interval in COUNT_STATS property names. 2017-01-17 15:54:57 -08:00
Rishi Gupta 690002aef8 analytics: Add fixtures for several CountStats. 2017-01-17 15:54:57 -08:00
Rishi Gupta 2710a944e8 analytics: Refactor fixture creation to make it more general.
Also less verbose, in preparation for adding a bunch more fixtures.
2017-01-17 15:54:57 -08:00
Rishi Gupta 1f4a4e5e26 analytics: Force --clear-existing-data option in populate_analytics_db.
Makes more sense for a fixture generating script to just clear the existing
data every time.
2017-01-17 15:54:57 -08:00
Rishi Gupta 680e7f75e1 analytics: Change generate_time_series_data argument from length to days.
Previously, this function seemed ambivalent about whether it was generating
a series of abstract data points or a series of data points that would
correspond to times. Switch firmly to the latter, so e.g. if the frequency
changes, so will the length of the output sequence.
2017-01-17 15:54:57 -08:00
Rishi Gupta 6374596a77 analytics: Add initial fixture for testing views. 2017-01-10 17:48:07 -08:00
Rishi Gupta ac29928d91 Remove domain from analytics management commands. 2017-01-09 11:26:08 -08:00
Rishi Gupta 552d626ef2 analytics: Fix FillState.last_modified not being updated.
We were updating FillState with FillState.objects.filter(..).update(..),
which does not update the last_modified field (which has auto_now=True).
The correct incantation is the save() method of the actual FillState
object.
2017-01-08 23:36:34 -08:00
Rishi Gupta f5899dd14b analytics: Add lib/ function to drop all analytics tables. 2017-01-08 17:24:51 -08:00
Rishi Gupta 2b0a7fd0ba Rename models.get_realm_by_string_id to get_realm.
Finishes the refactoring started in c1bbd8d. The goal of the refactoring is
to change the argument to get_realm from a Realm.domain to a
Realm.string_id. The steps were

* Add a new function, get_realm_by_string_id.

* Change all calls to get_realm to use get_realm_by_string_id instead.

* Remove get_realm.

* (This commit) Rename get_realm_by_string_id to get_realm.

Part of a larger migration to remove the Realm.domain field entirely.
2017-01-04 17:12:23 -08:00
reyha 82e32ad255 Access realm by `string_id` in management commands.
`Realm.string_id` replaces 'Realm.domain'
in the management commands.

Fixes #2325.
2016-12-14 10:38:03 -08:00
Sidhant Bhavnani 8c0c12c1d9 pep8: Fix E303 violations. 2016-12-02 15:34:11 -08:00
Rafid Aslam c5316b4002 lint: Fix E127 pep8 violations.
Fix pep8: E127 continuation line over-indented for visual indent
style issue.
2016-12-01 10:23:55 -08:00
Anders Kaseorg 207cf6302b Always start python via shebang lines.
This is preparation for supporting using Python 3 in production.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2016-11-26 14:46:37 -08:00
Umair Khan 682aa1f298 Django 1.10: Use add_argument for options in BaseCommand. 2016-11-04 10:20:23 -07:00
Rishi Gupta 655ee51e35 analytics: Add table to keep track of fill state.
Adds two simplifying assumptions to how we process analytics stats:
* Sets the atomic unit of work to: a stat processed at an hour boundary.
* For any given stat, only allows these atomic units of work to be processed
  in chronological order.

Adds a table FillState that, for each stat, keeps track of the last unit of
work that was processed.
2016-10-14 10:18:37 -07:00
umkay 721529b782 analytics: Remove HuddleCount for now.
Planned changes to the underlying analytics model will require potentially
complicated changes to huddle queries.
2016-10-14 10:18:37 -07:00
Tim Abbott 273c17a072 update_analytics_counts: Add missing future imports. 2016-10-05 17:13:46 -07:00
umkay 5d0bed8673 Add script to clear analytics tables. 2016-10-05 17:11:13 -07:00
Tim Abbott 3973ae5dbb update_analytics_counts: Fix buggy argument parsing. 2016-10-04 20:43:19 -07:00
umkay d260a22637 Add a new statistics/analytics framework.
This is a first pass at building a framework for collecting various
stats about realms, users, streams, etc. Includes:
* New analytics tables for storing counts data
* Raw SQL queries for pulling data from zerver/models.py tables
* Aggregation functions for aggregating hourly stats into daily stats, and
  aggregating user/stream level stats into realm level stats
* A management command for pulling the data

Note that counts.py was added to the linter exclude list due to errors
around %%s.
2016-10-04 17:18:54 -07:00
Taranjeet a137bf15ed Wrap some lines with length greater than 120.
With some tweaks by tabbott.
2016-07-06 14:35:16 -07:00
Tim Abbott a1a27b1789 Annotate most Zulip management commands. 2016-06-04 10:12:06 -07:00
Eklavya Sharma 94e4b39112 Replace python2.7 by python everywhere. 2016-05-29 05:03:08 -07:00
Tim Abbott b869be9301 style: Use 'not in' consistently rather than `not foo in`. 2016-05-09 17:00:10 -07:00
Tim Abbott 191201bd10 Fix unnecessary whitespace between % and (. 2016-05-04 14:22:52 -07:00
Tim Abbott 54022ac204 Fix unnecessary whitespace between , and ). 2016-05-04 14:16:53 -07:00
Ashish 6356584f84 Replace /json/update_pointer with REST style route. 2016-04-11 21:38:23 -07:00
Ashish 41993ef2f5 Replace /json/update_message_flags with REST style route. 2016-04-11 21:38:22 -07:00
Tim Abbott b8c82d5b43 Add PEP-484 type annotations to analytics/. 2016-04-03 15:40:23 -07:00
Tim Abbott df1670ef59 Fix various float initialization to use 0.0 instead of 0.
This is needed to type-check these values.
2016-02-03 19:29:07 -08:00
Tim Abbott 1f44417fc1 Switch to using Python 3 style division everywhere.
Also add testing for this to our Python 3 compatibility test suite.
2016-01-26 21:09:43 -08:00
Tim Abbott a79e89b28f Cleanup remaining usage of % comprehensions without explicit tuples. 2015-12-05 15:29:42 -08:00
Tim Abbott f7878a61e1 Apply Python 3 futurize transform libmodernize.fixes.fix_xrange_six. 2015-11-01 09:35:06 -08:00
Tim Abbott b3ac668779 Apply Python 3 futurize transform libmodernize.fixes.fix_filter. 2015-11-01 09:26:16 -08:00
Tim Abbott f3783fb4a1 Apply Python 3 futurize transform libfuturize.fixes.fix_print_with_import. 2015-11-01 09:26:16 -08:00
Tim Abbott 8c34c40924 Apply Python 3 futurize transform lib2to3.fixes.fix_except. 2015-11-01 08:08:33 -08:00
Steven Oud d5435fad1d Consistently use /usr/bin/env python2.7 in shebangs and commands. 2015-10-21 22:58:21 +00:00
Tim Abbott 71a06d58de Convert uses of Realm.objects.get() to get_realm().
get_realm is better in two key ways:
* It uses memcached to fetch the data from the cache and thus is faster.
* It does a case-insensitive query and thus is more safe.
2015-10-15 09:16:58 -04:00
Reid Barton ae0ae3dde8 Django 1.8: declare positional arguments in management commands
(imported from commit d9efca1376de92c8187d25f546c79fece8d2d8c6)
2015-08-20 23:35:40 -07:00
Tim Abbott 3b7bf691e7 Add tool to query our usage stats as of a given date.
This contains the various fixes that needed to be made in order to get
accurate statistics.

Most notably, the active_users_between function in the previous
version of zerver/lib/statistics.py was broken for end dates in the
past, because it used the UserActivity table to get its data -- so in
fact it really was querying "users last active between".

This commit isn't super clean, but I figure we're probably better off
having our latest code for historical usage data in git so it doesn't
bitrot and anyone can improve on it.

(imported from commit 24ff2f24a22e5bdc004ea8043d8da12deb97ff2f)
2013-12-17 15:34:44 -05:00
Leo Franchi 1e7a22f14e Replace other non-zerver uses of iPhone client
(imported from commit 0988e2c9bd0499a0711daed97f89aa672776f085)
2013-12-03 14:35:24 -05:00