zulip

Commit Graph

Author	SHA1	Message	Date
Tim Abbott	9ac3e1099c	analytics: Remove last_modified field from FillState. This field wasn't used for anything, and I think it has very limited use for debugging, since fundamentally, it'll almost always have a value within the hour of the actual timestamp in FillState, and any more fine-grained logging we might want would be available in the analytics job's own logs. The proximal reason to remove it is that apparently Django's model_to_dict doesn't support auto_now fields, and that caused some trouble when working on adding more complete import/export support for analytics data.	2020-01-26 20:38:26 -08:00
Tim Abbott	8e7ce7cc79	python: Sort migrations/management command imports with isort. This is a preparatory commit for using isort for sorting all of our imports, merging changes to files where we can easily review the changes as something we're happy with. These are also files with relatively little active development, which means we don't expect much merge conflict risk from these changes.	2020-01-14 13:07:47 -08:00
Anders Kaseorg	4bd28f7ae6	migrations: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:01:04 -08:00
Rishi Gupta	85f7ac8172	analytics: Remove Anomaly model.	2019-02-01 18:48:18 -08:00
Tim Abbott	c679920c01	python: Fix unnecessary uses of str_utils library.	2018-11-27 11:44:09 -08:00
Tim Abbott	f0ef335412	models: Remove unused ModelReprMixin class. It appeared to be used as a base class in various Django migrations, but because it didn't define any model fields, it wasn't actually.	2018-05-15 19:11:22 -07:00
rht	8106a25e61	django-2.0: Add on_delete on ForeignKeys. In Django 2.0, one must specify the on_delete behavior for all ForeignKeys explicitly.	2018-01-30 10:53:54 -08:00
rht	d1689b5884	analytics: Use python 3 syntax for typing.	2017-11-17 13:16:49 -08:00
Tim Abbott	2b43a0302a	python: Sort imports in smaller apps.	2017-11-15 15:55:49 -08:00
rht	b2ad8fd747	py3: Remove all `from __future__ import unicode_literals`. This was mostly used in migrations, so it's a pretty safe change.	2017-10-17 23:07:42 -07:00
Umair Khan	c74f125b7c	analytics: Add on_delete in foreign keys. on_delete will be a required arg for ForeignKey in Django 2.0. Set it to models.CASCADE on models and in existing migrations if you want to maintain the current default behavior. See https://docs.djangoproject.com/en/1.11/ref/models/fields/#django.db.models.ForeignKey.on_delete	2017-06-13 15:13:49 -07:00
Rishi Gupta	dfbeab73b5	analytics: Change update_analytics_counts to only use hour boundaries. Fixes a recent regression where analytics were not being run on hour boundaries. Includes a migration that dumps all the analytics data.	2017-04-28 16:15:07 -07:00
hollywoodno	dd067c761a	analytics: Separate private messages from group private messages. This makes it possible for our graphs to show the group private message counts as separate from 1:1 private messages. Fixes #4102.	2017-03-20 11:46:29 -07:00
Rishi Gupta	87981a2bf1	analytics: Fix direct import of models in migrations.	2017-03-14 16:59:54 -07:00
Rishi Gupta	20255e48a4	analytics: Change messages_sent_to_stream to a daily stat. Analytics database tables are getting big, and so we're likely moving to a model where ~all stats are day stats, and we keep hourly stats only for the last N days. Also changed the name because: * messages_sent_* suggests the counts (summed over subgroup) should be the same as the other messages_sent stats, but they are different (these don't include PMs). * messages_sent_by_stream:is_bot:day is longer than 32 characters, the max allowable length for a BaseCount.property. Includes a database migration to remove the old stat from the analytics tables.	2017-03-03 16:11:28 -08:00
Tim Abbott	b7df84d5a8	analytics: Add indexes to optimize performance of aggregation. These indexes fix some slow queries used in updating the analytics tables, resulting in the analytics system consuming far less total resources.	2017-02-01 15:47:49 -08:00
Rishi Gupta	68fcb4152f	analytics: Remove interval field from *Count tables. Includes a database migration. The interval field was originally there to facilitate time aggregation (e.g. aggregate_hour_to_day), but we now do such aggregations in views code or in the frontend.	2017-01-17 15:54:57 -08:00
umkay	a94599fca7	analytics/models.py: Add subgroup column to unique_together constraints.	2016-11-01 16:53:56 -07:00
umkay	e92604ab78	analytics: Alter field length for property and interval in BaseCount.	2016-10-27 16:33:58 -07:00
umkay	610e92b94e	analytics: Add subgroup column to analytics tables. This is a major change to the analytics schema, and is the first step in a number of refactorings and performance improvements. For instance, it allows * Grouping sets of similar CountStats in the Count tables. For instance, active{_humans,_bots} will now have the same property, but have different subgroup values. * Combining queries that differ only in their value on 1 filter clause, so that we make fewer passes through the zerver tables. For instance, instead of running a query for each of messages_sent_to_public_streams and messages_sent_to_private_streams, we can now run a single query with a group by on Stream.invite_only, and store the group by value in the subgroup column.	2016-10-27 16:33:58 -07:00
Rishi Gupta	655ee51e35	analytics: Add table to keep track of fill state. Adds two simplifying assumptions to how we process analytics stats: * Sets the atomic unit of work to: a stat processed at an hour boundary. * For any given stat, only allows these atomic units of work to be processed in chronological order. Adds a table FillState that, for each stat, keeps track of the last unit of work that was processed.	2016-10-14 10:18:37 -07:00
umkay	721529b782	analytics: Remove HuddleCount for now. Planned changes to the underlying analytics model will require potentially complicated changes to huddle queries.	2016-10-14 10:18:37 -07:00
umkay	78477ea071	Reorder the columns in analytics tables inherited from BaseCount. This is primarily implemented through altering the migration file in order to move the columns, but also we try to make the defaults a little better for future tables inherited from BaseCount.	2016-10-06 17:51:01 -07:00
umkay	d260a22637	Add a new statistics/analytics framework. This is a first pass at building a framework for collecting various stats about realms, users, streams, etc. Includes: * New analytics tables for storing counts data * Raw SQL queries for pulling data from zerver/models.py tables * Aggregation functions for aggregating hourly stats into daily stats, and aggregating user/stream level stats into realm level stats * A management command for pulling the data. Note that counts.py was added to the linter exclude list due to errors around %%s.	2016-10-04 17:18:54 -07:00

24 Commits