zulip

Author	SHA1	Message	Date
Rishi Gupta	30024d0a8f	models: Remove Realm.domain.	2017-03-25 19:55:48 -07:00
Rishi Gupta	9f60dd8387	analytics: Send zeros for data.user.bot in Messages Sent Over Time. It will simplify the logic needed to process the "Sent by Me" view in Messages Sent Over Time in stats.js. Also, we gzip the data sent from our server, so there is little additional network usage by doing this.	2017-03-25 14:18:23 -07:00
Tim Abbott	a474f4359d	tests: Set maxDiff to None unconditionally.	2017-03-21 07:34:16 -07:00
Tim Abbott	8041ebf579	mypy: Annotate maxDiff variable.	2017-03-21 07:31:37 -07:00
Tim Abbott	20a7609018	analytics: Rename message count types to use standard Zulip casing.	2017-03-21 00:09:54 -07:00
hollywoodno	dd067c761a	analytics: Separate private messages from group private messages. This makes it possible for our graphs to show the group private message counts as separate from 1:1 private messages. Fixes #4102.	2017-03-20 11:46:29 -07:00
Tim Abbott	0c0e5397c4	analytics: Fix nondeterministic ordering of labels.	2017-03-20 11:39:08 -07:00
Rishi Gupta	ceac6d9c59	analytics: Remove stray comment from test_counts.py. The "actual test that would be nice to do" was indeed done!	2017-03-17 21:58:51 -07:00
Umair Khan	4442703011	jinja2: No need for custom render_to_response. Django 1.10 has changed the implementation of this function to match our custom implementation; in addition to this, we prefer render(). Fixes #1914 via #4093.	2017-03-17 13:57:34 -07:00
Umair Khan	6511f929cc	analytics: Change render_to_response to render. Related to #4093	2017-03-17 13:52:59 -07:00
Rishi Gupta	7c6f0033ed	analytics: Add test for do_drop_all_analytics_tables.	2017-03-14 16:59:54 -07:00
Rishi Gupta	87981a2bf1	analytics: Fix direct import of models in migrations.	2017-03-14 16:59:54 -07:00
Rishi Gupta	ebebd04587	analytics: Fix ValueErrors affecting test coverage. Pathways that only catch internal code errors should use AssertionError so that they are not included when computing test coverage.	2017-03-14 16:59:54 -07:00
Rishi Gupta	b18bfe6771	analytics: Standardize format of zerver count queries. count_message_type_by_user_query is in a different format (no WHERE clause) from the rest since I'm having a hard time reasoning about how that would interact with the LEFT JOIN, especially given that there are %(join_args)s.	2017-03-14 16:59:54 -07:00
Rishi Gupta	e33ef1c788	analytics/models: Remove extended_id and key_model. They are unused / were part of a previous design.	2017-03-14 16:59:54 -07:00
Rishi Gupta	35f854a2fd	analytics: Add test for do_aggregate_to_summary_table.	2017-03-04 16:46:09 -08:00
Rishi Gupta	8feea6c598	analytics: Add LoggingCountStat for number of users.	2017-03-04 16:46:09 -08:00
Rishi Gupta	51b7677db7	Add RealmAuditLog table and record user activation/deactivation events. The RealmAuditLog will make it easier for server admins to replay history.	2017-03-04 16:45:44 -08:00
Raghav Jajodia	a3a03bd6a5	mypy: Added Dict, List and Set imports. Fixed mypy errors associated with the upgrade.	2017-03-04 14:33:44 -08:00
Rishi Gupta	1453a5bfda	Change string_id of test zephyr realm from mit to zephyr. Also changes Realm.is_zephyr_mirror_realm to use string_id=zephyr instead of domain=mit.edu. Part of a larger migration away from Realm.domain.	2017-03-04 12:18:01 -08:00
Rishi Gupta	8bea47d6b5	analytics: Do a stylistic cleanup of TestProcessCountStat.	2017-03-03 16:12:12 -08:00
Rishi Gupta	6c784d6321	analytics: Refactor COUNT_STATS declaration to not repeat itself.	2017-03-03 16:11:28 -08:00
Rishi Gupta	20255e48a4	analytics: Change messages_sent_to_stream to a daily stat. Analytics database tables are getting big, and so we're likely moving to a model where ~all stats are day stats, and we keep hourly stats only for the last N days. Also changed the name because: * messages_sent_* suggests the counts (summed over subgroup) should be the same as the other messages_sent stats, but they are different (these don't include PMs). * messages_sent_by_stream:is_bot:day is longer than 32 characters, the max allowable length for a BaseCount.property. Includes a database migration to remove the old stat from the analytics tables.	2017-03-03 16:11:28 -08:00
Rishi Gupta	4dc791f393	Clean up timestamps.py and add a test.	2017-03-01 23:03:56 -08:00
Rishi Gupta	562bc6429c	Replace datetime.now() with timezone.now() in Django ORM queries. When you pass a naive datetime to the Django ORM, it uses settings.TIME_ZONE for the time zone. In the development environment, both settings.TIME_ZONE and datetime.now() use 'America/New_York', so there is no change in behavior there. (fromtimestamp with no tz argument uses the same timezone as datetime.now) We are soon going to change settings.TIME_ZONE to UTC, so need to remove naive datetimes from queries to the ORM.	2017-03-01 22:54:28 -08:00
Rishi Gupta	01a4615f6e	Change datetime.now to timezone.now in active_user_stats_by_day. This actually fixes previously broken behavior, since 'date' here gets turned into the 'day' argument of seconds_active_during_day(day), where tzinfo is set to UTC.	2017-03-01 22:54:28 -08:00
Rishi Gupta	2b2be8120f	Change datetime.now(tz=X) to timezone.now(). datetime.now with a timezone set is equivalent to timezone.now() if it's never being printed out, but the latter is cleaner and more idiomatic.	2017-03-01 22:54:28 -08:00
Rishi Gupta	eee5cb5197	analytics: Add tests for views code.	2017-02-11 14:51:01 -08:00
Rishi Gupta	d6ce017a58	analytics: Minor cleanup of views.py.	2017-02-11 14:51:01 -08:00
Rishi Gupta	480fc0874b	analytics: Break ties deterministically in sort_client_labels.	2017-02-11 14:51:01 -08:00
Tim Abbott	f944ac8902	mypy: Fix incorrect annotation for by_used_time.	2017-02-10 23:53:44 -08:00
Rishi Gupta	19d1fc6223	stats: Pass user data to the frontend for messages sent over time.	2017-02-10 14:41:18 -08:00
Rishi Gupta	68a7f91022	stats: Add a fixed display order to summary charts. API: Adds a "display_order" to the response, which is a suggested order of importance for the clients or recipient types respectively. frontend: Changes messages_sent_by_{client,recipient_type} to use a fixed order for any given user.	2017-02-10 14:41:18 -08:00
Rishi Gupta	cf3ae2eafe	stats: Turn messages_sent_by_client into a bar chart. Also includes a number of changes to messages_sent_by_recipient_type that were convenient to do at the same time, since the two charts share a lot of code.	2017-02-10 14:41:18 -08:00
Rishi Gupta	ce89c64f43	stats.js: Move name_map computation to the backend.	2017-02-10 14:41:18 -08:00
Rishi Gupta	a1b1ffe1e4	analytics: Base default views end_time on FillState, not current time.	2017-02-10 14:41:07 -08:00
Rishi Gupta	6ab31d1bac	analytics: Move time computation to later in get_chart_data.	2017-02-10 14:40:14 -08:00
Tim Abbott	ec52322ae1	stats: Include Zulip and realm name in heading.	2017-02-07 11:22:57 -08:00
Tim Abbott	6c4eaf3d14	analytics: Map client names to user-facing versions. This makes the pie charts on /stats more readable.	2017-02-05 22:19:10 -08:00
Tim Abbott	161522e04c	analytics: Add comment explaining server admin routes.	2017-02-02 16:23:10 -08:00
Rishi Gupta	5eb5fa3f31	analytics: Change time_range to not include current day/hour. Current day/hour will always be 0, since we haven't computed it yet for the CountStat tables.	2017-02-02 10:59:52 -08:00
Tim Abbott	e8b0880320	analytics: Log updates to analytics counts.	2017-02-01 17:02:46 -08:00
umkay	76f3d02590	analytics: Add cron job to run analytics jobs. This adds a cron job to update the Zulip analytics counts, complete with locking etc. Substantially tweaked by tabbott.	2017-02-01 17:02:46 -08:00
Tim Abbott	b7df84d5a8	analytics: Add indexes to optimize performance of aggregation. These indexes fix some slow queries used in updating the analytics tables, resulting in the analytics system consuming far less total resources.	2017-02-01 15:47:49 -08:00
Amy Liu	0a39e354dc	analytics: Add graphs of usage statistics on /stats. This adds a frontend for the analytics system we've had for a few months, showing several graphs of the data in Zulip. There's a ton more that we can do with this tooling, but this initial version is enough to provide users with a pretty good experience. Fixes #2052.	2017-01-31 22:18:54 -08:00
Tim Abbott	4e171ce787	lint: Clean up E126 PEP-8 rule.	2017-01-23 22:06:13 -08:00
Tim Abbott	d6e38e2a5c	lint: Clean up E123 PEP-8 rule.	2017-01-23 21:34:26 -08:00
Tim Abbott	9cc83f87fc	lint: Clean up E241 PEP-8 rule.	2017-01-23 21:21:14 -08:00
Tim Abbott	9640a9e864	lint: Clean up E712 PEP-8 rule.	2017-01-23 21:11:18 -08:00
Rishi Gupta	29799d93c6	analytics/views.py: Always return time series data for stats. Makes a number of simplifications to the analytics views code. The main one is that we now return the entire data series, even if the data is eventually going to go into a pie chart. This was prompted by us wanting several different pie charts for each stat (one for last 30 days, one for all time, etc), but I think it is also a more natural API. The total amount of data being sent for the pie charts now is maybe half of what is being sent for our single 'hourly' stat, or maybe up to 10,000 ints per year the organization has been around. The other big change is that the data being sent back is now always explicit about whether it is data about the realm (stored in data['realm']) or data about the user (stored in data['user']).	2017-01-19 17:44:17 -08:00
Rishi Gupta	734ca4644c	analytics: Add random_seed argument to generate_time_series_data.	2017-01-17 15:54:57 -08:00
Rishi Gupta	37bdc7c010	analytics: Remove COUNT_STATS['messages_sent:hour']. Having both messages_sent:hour and messages_sent:is_bot:day is confusing, since a single messages_sent:is_bot:hour would have a superset of the information and take less total space. This commit and its parent together replace the two stats with a single messages_sent:is_bot:hour.	2017-01-17 15:54:57 -08:00
Rishi Gupta	b593ac9d7c	analytics: Change messages_sent:is_bot to hourly frequency. In preparation for replacing messages_sent.	2017-01-17 15:54:57 -08:00
Rishi Gupta	68fcb4152f	analytics: Remove interval field from *Count tables. Includes a database migration. The interval field was originally there to facilitate time aggregation (e.g. aggregate_hour_to_day), but we now do such aggregations in views code or in the frontend.	2017-01-17 15:54:57 -08:00
Rishi Gupta	a8f2ebb443	analytics: Include interval in COUNT_STATS property names.	2017-01-17 15:54:57 -08:00
Rishi Gupta	c466036c80	analytics: Remove unneeded references to interval from test_counts.py.	2017-01-17 15:54:57 -08:00
Rishi Gupta	12d277d4f4	analytics: Change messages_sent:client stat to daily frequency. A few reasons: * Our two other subgroup'd message stats in UserCount are at CountStat.DAY frequency (messages_sent:is_bot and messages_sent:message_type). * Keeping this stat at hourly frequency would likely double the size of our analytics table, given the current stats. (Counterpoint: if there are roughly as many active streams as active users, and we keep messages_sent_to_stream:is_bot at hourly frequency, then maybe this stat is only a 30% or 50% increase). * We're currently only showing this on the frontend as a pie chart anyway.	2017-01-17 15:54:57 -08:00
Rishi Gupta	690002aef8	analytics: Add fixtures for several CountStats.	2017-01-17 15:54:57 -08:00
Rishi Gupta	2710a944e8	analytics: Refactor fixture creation to make it more general. Also less verbose, in preparation for adding a bunch more fixtures.	2017-01-17 15:54:57 -08:00
Rishi Gupta	1f4a4e5e26	analytics: Force --clear-existing-data option in populate_analytics_db. Makes more sense for a fixture generating script to just clear the existing data every time.	2017-01-17 15:54:57 -08:00
Rishi Gupta	680e7f75e1	analytics: Change generate_time_series_data argument from length to days. Previously, this function seemed ambivalent about whether it was generating a series of abstract data points or a series of data points that would correspond to times. Switch firmly to the latter, so e.g. if the frequency changes, so will the length of the output sequence.	2017-01-17 15:54:57 -08:00
Rishi Gupta	3712fda30d	analytics: Ensure fixture data points are non-negative.	2017-01-17 15:54:57 -08:00
Rishi Gupta	ecfc336a15	analytics: Add views for remaining /stats graphs.	2017-01-17 15:54:57 -08:00
Rishi Gupta	73c0c4c52e	analytics/views.py: Increase efficiency of get_time_series_by_subgroup. Not sure if this would actually be a performance problem in practice, but this was originally making a database query for each subgroup (instead of just a single query getting data for all the subgroups). Also removed the filter against the interval column, which will soon not be needed (interval will be uniquely determined by the property).	2017-01-17 15:54:57 -08:00
Rishi Gupta	d873902755	analytics/views.py: Refactor get_messages_sent_by_humans_and_bots. Refactor out the reusable parts, since we're about to add several more views.	2017-01-17 15:54:57 -08:00
Rishi Gupta	3a72b5cda9	analytics: Rename messages_sent_to_realm. Several additional stats in the pipeline that also relate to messages sent to the realm.	2017-01-17 15:54:57 -08:00
Rishi Gupta	cdb1c96169	analytics tests: Refactor assertCountEquals calls to be more readable.	2017-01-17 15:54:57 -08:00
Rishi Gupta	59d50c3a47	analytics tests: Make it easy to refer to users in test realm.	2017-01-17 15:54:57 -08:00
Rishi Gupta	54e66e6079	analytics: Add remaining backend tests in TestCountStats.	2017-01-17 15:54:57 -08:00
aakash-cr7	b373f2ef0f	analytics: Add backend test for messages_sent_to_stream:is_bot.	2017-01-17 15:54:57 -08:00
Amy Liu	10c0c2b16d	analytics: Add backend tests for messages_sent:message_type.	2017-01-17 15:54:57 -08:00
Rishi Gupta	f30b174199	analytics: Set property and interval defaults in assertCountEquals.	2017-01-17 15:54:57 -08:00
Rishi Gupta	a563a15f88	analytics: Make TestCountStats tests more robust. Adds two things to TestCountStats.setUp(): * A realm with no messages, that generally should not show up in Count tables. * Users/streams/messages created at 0, 1, 61, and 1441 (just over a day) minutes ago (previously was 0, 60), to better test the start_time/end_time in the queries, and the frequency/interval setting in the CountStats.	2017-01-17 15:54:57 -08:00
Rishi Gupta	e94bc8f142	analytics tests: Autogenerate names for create* functions.	2017-01-17 15:54:57 -08:00
Amy Liu	f7ce76fb63	analytics: Add create_stream_with_recipient and create_huddle_with_recipient. This commit replaces AnalyticsTestCase.create_stream with create_stream_with_recipient and adds the method create_huddle_with_recipient.	2017-01-17 15:54:57 -08:00
Rishi Gupta	f375caed46	/activity: Fix URL route for analytics.views.get_realm_activity. analytics.views.get_realm_activity was taking a 'realm_str', but the URL route was expecting a 'realm'. Changed the URL route to take a 'realm_str'.	2017-01-12 15:21:06 -08:00
Rishi Gupta	3f2a002c6e	analytics/lib/counts.py: Fix one of the COUNT_STATS definitions. Fixes an error in the definition of COUNT_STATS['messages_sent_to_stream:is_bot']. The CountStat needs a group_by argument since it is supposed to group by UserProfile.is_bot.	2017-01-10 20:41:07 -08:00
Rishi Gupta	977f5b9178	analytics/lib/counts.py: Fix error in count_message_type_by_user_query. This query counts the number of messages each user has sent, subgroup'd by whether the message was a private_message (PM or sent to a huddle), sent to a 'private_stream', or sent to a 'public_stream'. We need to join on zerver_stream to find out whether stream messages were sent to public streams or private streams, but it needs to be a LEFT JOIN rather than a JOIN so that we preserve the messages sent to non-streams.	2017-01-10 20:41:07 -08:00
Rishi Gupta	6374596a77	analytics: Add initial fixture for testing views.	2017-01-10 17:48:07 -08:00
Tim Abbott	3f8d4193da	lint: Fix % comprehensions being used without a tuple.	2017-01-09 11:45:11 -08:00
Rishi Gupta	ac29928d91	Remove domain from analytics management commands.	2017-01-09 11:26:08 -08:00
Rishi Gupta	e14f575979	Remove domain from analytics/views.py.	2017-01-09 11:26:08 -08:00
Rishi Gupta	552d626ef2	analytics: Fix FillState.last_modified not being updated. We were updating FillState with FillState.objects.filter(..).update(..), which does not update the last_modified field (which has auto_now=True). The correct incantation is the save() method of the actual FillState object.	2017-01-08 23:36:34 -08:00
Rishi Gupta	190d320afa	analytics: Change CountStat.property from Text to str.	2017-01-08 17:24:51 -08:00
Rishi Gupta	a07757c127	analytics/views: Fix query in get_messages_sent_to_realm.	2017-01-08 17:24:51 -08:00
Rishi Gupta	f8962d521d	analytics: Fix uses of 'interval' in arguments and variable names. interval refers to a time interval, and frequency refers to something that semantically means something closer to 'hourly' or 'daily'. Currently, interval can have values 'hour', 'day', or 'gauge', and frequency can only have values 'hour' and 'day'.	2017-01-08 17:24:51 -08:00
Rishi Gupta	f5899dd14b	analytics: Add lib/ function to drop all analytics tables.	2017-01-08 17:24:51 -08:00
Rishi Gupta	73dc904e9c	analytics: Move time_range from views.py to lib/time_utils.py	2017-01-08 17:24:51 -08:00
Tommy Ip	28abfca565	analytics: Fix bare except clause.	2017-01-08 16:25:22 -08:00
Rishi Gupta	2b0a7fd0ba	Rename models.get_realm_by_string_id to get_realm. Finishes the refactoring started in `c1bbd8d`. The goal of the refactoring is to change the argument to get_realm from a Realm.domain to a Realm.string_id. The steps were * Add a new function, get_realm_by_string_id. * Change all calls to get_realm to use get_realm_by_string_id instead. * Remove get_realm. * (This commit) Rename get_realm_by_string_id to get_realm. Part of a larger migration to remove the Realm.domain field entirely.	2017-01-04 17:12:23 -08:00
Rishi Gupta	605361ec86	makemessages: Fix string with unnamed arguments in analytics/views.py.	2016-12-30 16:52:24 -08:00
Rishi Gupta	9e5325a164	Add /stats page with basic stats graph. Adds a new url route and a new json endpoint.	2016-12-29 14:20:13 -08:00
Rishi Gupta	31efe858ef	Clean up imports in analytics/views.py.	2016-12-29 14:20:13 -08:00
Rishi Gupta	717afcb408	Remove calls to get_realm in preparation for its deprecation. Also removes two calls to email_to_domain.	2016-12-26 17:53:32 -08:00
Rishi Gupta	c7c0e36508	analytics: Add InstallationCount checks to prototype TestCountStat. Was enabled by commit `41e8ee3` where we moved TIME_ZERO to before the realms created by populate_db.py. Also removes the stub for TestAggregates, since the remaining thing to be tested was the aggregation from RealmCount to InstallationCount, and the end to end checks provided by the TestCountStat tests should be sufficient.	2016-12-20 12:03:23 -08:00
Rishi Gupta	dbc94d0fc0	analytics: Remove test for no longer supported behavior. In a previous design, there was no FillState table, and one could run any CountStat at any time. This is no longer supported. This test was making sure that if one ran a CountStat at a certain hour, and then ran it at a previous hour, the old rows would still be there.	2016-12-20 12:03:23 -08:00
Rishi Gupta	e09aaf1020	analytics: Remove tests that will be subsumed by TestCountStats.	2016-12-20 12:03:23 -08:00
Rishi Gupta	6748b72ccc	analytics: Remove tests now covered by test_active_users_by_is_bot.	2016-12-20 12:03:23 -08:00
Rishi Gupta	2211b8b102	analytics: Change count_message_by_stream to join on UserProfile. It seems unlikely we will need count_message_by_stream without the UserProfile table in the future, so write count_message_by_stream_and_is_bot in the usual query form and replace count_message_by_stream with it. This also has the benefit of shortening our list of "special case" queries from two to one. The pathways of the removed test will be covered more thoroughly in the new TestCountStats tests.	2016-12-20 12:03:23 -08:00
Rishi Gupta	6992f9784c	analytics: Update TestCountStat prototype.	2016-12-20 12:03:23 -08:00
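
Commit 977f5b9178 in the list above notes that the join on zerver_stream has to be a LEFT JOIN rather than a JOIN so that messages sent to non-streams are preserved. The following is a minimal sketch of that effect, not the actual Zulip query: the two-table schema, the column names, and the count_by_type helper are hypothetical simplifications, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical, simplified schema: a message either points at a stream
# or has stream_id NULL (standing in for a private message).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stream (id INTEGER PRIMARY KEY, invite_only INTEGER);
CREATE TABLE message (id INTEGER PRIMARY KEY, stream_id INTEGER);
INSERT INTO stream VALUES (1, 0), (2, 1);
INSERT INTO message VALUES (10, 1), (11, 2), (12, NULL), (13, NULL);
""")


def count_by_type(join_kind):
    """Count messages per message type using the given join keyword."""
    sql = f"""
        SELECT CASE
                 WHEN s.id IS NULL THEN 'private_message'
                 WHEN s.invite_only = 1 THEN 'private_stream'
                 ELSE 'public_stream'
               END AS message_type,
               COUNT(*)
        FROM message m
        {join_kind} stream s ON m.stream_id = s.id
        GROUP BY message_type
    """
    return dict(conn.execute(sql).fetchall())


# An inner JOIN silently drops the two messages that have no stream row.
inner = count_by_type("JOIN")
# A LEFT JOIN keeps them, with NULLs in the stream columns.
left = count_by_type("LEFT JOIN")
```

With the inner join, the rows that have no matching stream never reach the GROUP BY, so the private-message subgroup is missing from the result rather than reported as zero.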

276 Commits