zulip

Author	SHA1	Message	Date
Alex Vandiver	50c3dd88e6	models: Migrate ids of all non-Message-related tables to bigint. Migrate all `ids` of anything which does not have a foreign key from the Message or UserMessage table (and would thus require walking those) to be `bigint`. This is done by removing explicit `BigAutoField`s, trading them for explicit `AutoField`s on the tables to not be migrated, while updating `DEFAULT_AUTO_FIELD` to the new default. In general, the tables adjusted in this commit are small tables -- at least compared to Messages and UserMessages. Many-to-many tables without their own model class are adjusted by a custom Operation, since they do not automatically pick up migrations when `DEFAULT_AUTO_FIELD` changes[^1]. Note that this does multiple scans over tables to update foreign keys[^2]. Large installs may wish to hand-optimize this using the output of `./manage.py sqlmigrate` to join multiple `ALTER TABLE` statements into one, to speed up the migration. This is unfortunately not possible to do generically, as constraint names may differ between installations. This leaves the following primary keys as non-`bigint`: - `auth_group.id` - `auth_group_permissions.id` - `auth_permission.id` - `django_content_type.id` - `django_migrations.id` - `otp_static_staticdevice.id` - `otp_static_statictoken.id` - `otp_totp_totpdevice.id` - `two_factor_phonedevice.id` - `zerver_archivedmessage.id` - `zerver_client.id` - `zerver_message.id` - `zerver_realm.id` - `zerver_recipient.id` - `zerver_userprofile.id` [^1]: https://code.djangoproject.com/ticket/32674 [^2]: https://code.djangoproject.com/ticket/24203	2024-06-05 11:48:27 -07:00
Alex Vandiver	4f4725f810	analytics: Migrate models' id columns to bigint. This helps prevent wraparound on exceedingly large and old installs, particularly Zulip Cloud. These are relatively simple migrations since they are not referenced by any other tables; however, they are quite large, and are actively used from Django by running servers, making this not a migration which is possible to run without stopping the server. Use the escape hatch in the previous commit to temporarily pause analytics writes while the migration happens. This should make the migration transparent to users, at the small cost of an artificial dip in statistics (specifically, to push notification counts, and unread message counts) while the migration runs.	2024-06-05 11:48:27 -07:00
Anders Kaseorg	8a7916f21a	python: Consistently use from…import for datetime. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-12-05 12:01:18 -08:00
Anders Kaseorg	a50eb2e809	mypy: Enable new error explicit-override. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-10-12 12:28:41 -07:00
Anders Kaseorg	7e707270f0	models: Convert deprecated index_together option to indexes. index_together is slated for removal in Django 5.1: https://docs.djangoproject.com/en/4.2/internals/deprecation/#deprecation-removed-in-5-1 We set the optional index names to match the previously generated index names to avoid adding new migrations. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-12 07:12:43 -07:00
Anders Kaseorg	2d9b2a2a05	models: Remove type prefixes from __str__ values. The Django convention is for __repr__ to include the type and __str__ to omit it. In fact its default __repr__ implementation for models automatically adds a type prefix to __str__, which has resulted in the type being duplicated: >>> UserProfile.objects.first() <UserProfile: <UserProfile: emailgateway@zulip.com <Realm: zulipinternal 1>>> Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-03-08 22:56:55 -08:00
Zixuan James Li	4c3c976174	models: Implicitly type model fields with django-stubs. Previously, we typed the model fields manually with explicit, approximate type annotations, because Django lacked types. django-stubs provides more specific types for all these fields, which are incompatible with our previous approximate annotations. So now we can remove the inline type annotations and rely on the types defined in the stubs. This allows mypy to infer the types of the model fields for us. Signed-off-by: Zixuan James Li <p359101898@gmail.com>	2022-10-05 16:15:56 -07:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Vishnu KS	235a347639	analytics: Move last_successful_fill to CountStat. This is a prep commit. Currently we only pass CountStat.property to the last_successful_fill function, but it needs access to CountStat.time_increment as well. We could pass the entire CountStat object to the function as a workaround, but making last_successful_fill a property of CountStat seems much cleaner.	2020-12-22 16:44:31 -08:00
Anders Kaseorg	365fe0b3d5	python: Sort imports with isort. Fixes #2665. Regenerated by tabbott with `lint --fix` after a rebase and change in parameters. Note from tabbott: In a few cases, this converts technical debt in the form of unsorted imports into different technical debt in the form of our largest files having very long, ugly import sequences at the start. I expect this change will increase pressure for us to split those files, which isn't a bad thing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-11 16:45:32 -07:00
Anders Kaseorg	69730a78cc	python: Use trailing commas consistently. Automatically generated by the following script, based on the output of lint with flake8-comma: import re import sys last_filename = None last_row = None lines = [] for msg in sys.stdin: m = re.match( r"\x1b\[35mflake8 \\|\x1b\[0m \x1b\[1;31m(.+):(\d+):(\d+): (\w+)", msg ) if m: filename, row_str, col_str, err = m.groups() row, col = int(row_str), int(col_str) if filename == last_filename: assert last_row != row else: if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) with open(filename) as f: lines = f.readlines() last_filename = filename last_row = row line = lines[row - 1] if err in ["C812", "C815"]: lines[row - 1] = line[: col - 1] + "," + line[col - 1 :] elif err in ["C819"]: assert line[col - 2] == "," lines[row - 1] = line[: col - 2] + line[col - 1 :].lstrip(" ") if last_filename is not None: with open(last_filename, "w") as f: f.writelines(lines) Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-06-11 16:04:12 -07:00
Anders Kaseorg	67e7a3631d	python: Convert percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-10 15:02:09 -07:00
Anders Kaseorg	fead14951c	python: Convert assignment type annotations to Python 3.6 style. This commit was split by tabbott; this piece covers the vast majority of files in Zulip, but excludes scripts/, tools/, and puppet/ to help ensure we at least show the right error messages for Xenial systems. We can likely further refine the remaining pieces with some testing. Generated by com2ann, with whitespace fixes and various manual fixes for runtime issues: - invoiced_through: Optional[LicenseLedger] = models.ForeignKey( + invoiced_through: Optional["LicenseLedger"] = models.ForeignKey( -_apns_client: Optional[APNsClient] = None +_apns_client: Optional["APNsClient"] = None - notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - signup_notifications_stream: Optional[Stream] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) + signup_notifications_stream: Optional["Stream"] = models.ForeignKey('Stream', related_name='+', null=True, blank=True, on_delete=CASCADE) - author: Optional[UserProfile] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) + author: Optional["UserProfile"] = models.ForeignKey('UserProfile', blank=True, null=True, on_delete=CASCADE) - bot_owner: Optional[UserProfile] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) + bot_owner: Optional["UserProfile"] = models.ForeignKey('self', null=True, on_delete=models.SET_NULL) - default_sending_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) - default_events_register_stream: Optional[Stream] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_sending_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) + default_events_register_stream: Optional["Stream"] = models.ForeignKey('zerver.Stream', null=True, related_name='+', on_delete=CASCADE) -descriptors_by_handler_id: Dict[int, ClientDescriptor] = {} +descriptors_by_handler_id: Dict[int, "ClientDescriptor"] = {} -worker_classes: Dict[str, Type[QueueProcessingWorker]] = {} -queues: Dict[str, Dict[str, Type[QueueProcessingWorker]]] = {} +worker_classes: Dict[str, Type["QueueProcessingWorker"]] = {} +queues: Dict[str, Dict[str, Type["QueueProcessingWorker"]]] = {} -AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional[LDAPSearch] = None +AUTH_LDAP_REVERSE_EMAIL_SEARCH: Optional["LDAPSearch"] = None Signed-off-by: Anders Kaseorg <anders@zulipchat.com>	2020-04-22 11:02:32 -07:00
arpit551	f299f31340	analytics: Fix missing unique constraint when subgroup is null. Replaced unique_together with UniqueConstraint in models that covered nullable fields, because the database indexes created by unique_together do not enforce uniqueness where subgroup=None. So added a conditional unique index to handle invalid duplicate Count data. Added 0015_clear_duplicate_counts migration to handle existing data that violates the constraints. Also corrected a test case in test_counts.py which didn't clear its state properly and thus was accidentally taking advantage of this database schema bug.	2020-03-06 11:10:04 -08:00
Tim Abbott	9ac3e1099c	analytics: Remove last_modified field from FillState. This field wasn't used for anything, and I think it has very limited use for debugging, since fundamentally, it'll almost always have a value within the hour of the actual timestamp in FillState, and any more fine-grained logging we might want would be available in the analytics job's own logs. The proximal reason to remove it is that apparently Django's model_to_dict doesn't support auto_now fields, and that caused some trouble when working on adding more complete import/export support for analytics data.	2020-01-26 20:38:26 -08:00
Anders Kaseorg	f5197518a9	analytics/zilencer/zproject: Remove unused imports. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2019-02-02 17:31:45 -08:00
Rishi Gupta	85f7ac8172	analytics: Remove Anomaly model.	2019-02-01 18:48:18 -08:00
Aditya Bansal	5adf983c3c	analytics: Change use of typing.Text to str.	2018-05-10 14:19:49 -07:00
rht	8106a25e61	django-2.0: Add on_delete on ForeignKeys. In Django 2.0, one must specify the on_delete behavior for all ForeignKeys explicitly.	2018-01-30 10:53:54 -08:00
rht	6c286b5eb6	analytics: Use Python 3 syntax for typing (part 2).	2017-11-22 12:16:58 -08:00
Tim Abbott	2b43a0302a	python: Sort imports in smaller apps.	2017-11-15 15:55:49 -08:00
rht	51c1a6dfc9	analytics: Text-wrap long lines exceeding 110. License: Apache-2.0 Signed-off-by: rht <rhtbot@protonmail.com>	2017-11-10 16:22:00 -08:00
rht	5cfffb0e51	analytics: Remove inheritance from object.	2017-11-06 08:53:48 -08:00
rht	dcc831f767	refactor: Replace all __unicode__ methods with __str__. Close #6627.	2017-11-02 11:01:47 -07:00
rht	e51d98cd96	refactor: Remove usage of ModelReprMixin.	2017-11-02 11:01:47 -07:00
Christian Hudon	c80e6edb4e	mypy: Declare models with null=True Optional.	2017-05-23 14:36:40 -07:00
Aditya Bansal	2ca1f60ac5	pep8: Add compliance with rule E261 to analytics/models.py.	2017-05-07 23:21:50 -07:00
hackerkid	5c8f011d66	Remove unused timezone import.	2017-04-16 12:28:56 -07:00
Rishi Gupta	e33ef1c788	analytics/models: Remove extended_id and key_model. They are unused / were part of a previous design.	2017-03-14 16:59:54 -07:00
Rishi Gupta	4dc791f393	Clean up timestamps.py and add a test.	2017-03-01 23:03:56 -08:00
Rishi Gupta	a1b1ffe1e4	analytics: Base default views end_time on FillState, not current time.	2017-02-10 14:41:07 -08:00
Tim Abbott	b7df84d5a8	analytics: Add indexes to optimize performance of aggregation. These indexes fix some slow queries used in updating the analytics tables, resulting in the analytics system consuming far less total resources.	2017-02-01 15:47:49 -08:00
Rishi Gupta	68fcb4152f	analytics: Remove interval field from *Count tables. Includes a database migration. The interval field was originally there to facilitate time aggregation (e.g. aggregate_hour_to_day), but we now do such aggregations in views code or in the frontend.	2017-01-17 15:54:57 -08:00
Rishi Gupta	552d626ef2	analytics: Fix FillState.last_modified not being updated. We were updating FillState with FillState.objects.filter(..).update(..), which does not update the last_modified field (which has auto_now=True). The correct incantation is the save() method of the actual FillState object.	2017-01-08 23:36:34 -08:00
Rishi Gupta	d95fb33d8d	analytics: Add subgroups to unicode representations in models.py.	2016-12-20 12:03:23 -08:00
anirudhjain75	beaa62cafa	mypy: Convert several directories to use typing.Text. Specifically, these directories are converted: [analytics/, scripts/, tools/, zerver/management/, zilencer/, zproject/]	2016-12-07 20:51:05 -08:00
umkay	a94599fca7	analytics/models.py: Add subgroup column to unique_together constraints.	2016-11-01 16:53:56 -07:00
umkay	e92604ab78	analytics: Alter field length for property and interval in BaseCount.	2016-10-27 16:33:58 -07:00
umkay	610e92b94e	analytics: Add subgroup column to analytics tables. This is a major change to the analytics schema, and is the first step in a number of refactorings and performance improvements. For instance, it allows * Grouping sets of similar CountStats in the Count tables. For instance, active{_humans,_bots} will now have the same property, but have different subgroup values. * Combining queries that differ only in their value on 1 filter clause, so that we make fewer passes through the zerver tables. For instance, instead of running a query for each of messages_sent_to_public_streams and messages_sent_to_private_streams, we can now run a single query with a group by on Stream.invite_only, and store the group by value in the subgroup column.	2016-10-27 16:33:58 -07:00
Rishi Gupta	82b814a1cd	analytics: Simplify frequency and measurement interval options. Change the CountStat object to take an is_gauge variable instead of a smallest_interval variable. Previously, (smallest_interval, frequency) could be any of (hour, hour), (hour, day), (hour, gauge), (day, hour), (day, day), or (day, gauge). The current change is equivalent to excluding (hour, day) and (day, hour) from the list above. This change, along with other recent changes, allows us to simplify how we handle time intervals. This commit also removes the TimeInterval object.	2016-10-14 10:18:37 -07:00
Rishi Gupta	655ee51e35	analytics: Add table to keep track of fill state. Adds two simplifying assumptions to how we process analytics stats: * Sets the atomic unit of work to: a stat processed at an hour boundary. * For any given stat, only allows these atomic units of work to be processed in chronological order. Adds a table FillState that, for each stat, keeps track of the last unit of work that was processed.	2016-10-14 10:18:37 -07:00
umkay	721529b782	analytics: Remove HuddleCount for now. Planned changes to the underlying analytics model will require potentially complicated changes to huddle queries.	2016-10-14 10:18:37 -07:00
Rishi Gupta	929b69397b	analytics: Change string representation of BaseCount models. Previously we showed both the value and the id of the BaseCount record, which is confusing in a typical case where you only care about the value, and both the value and id are smallish ints.	2016-10-09 16:09:04 -07:00
umkay	78477ea071	Reorder the columns in analytics tables inherited from BaseCount. This is primarily implemented through altering the migration file in order to move the columns, but also we try to make the defaults a little better for future tables inherited from BaseCount.	2016-10-06 17:51:01 -07:00
umkay	d260a22637	Add a new statistics/analytics framework. This is a first pass at building a framework for collecting various stats about realms, users, streams, etc. Includes: * New analytics tables for storing counts data * Raw SQL queries for pulling data from zerver/models.py tables * Aggregation functions for aggregating hourly stats into daily stats, and aggregating user/stream level stats into realm level stats * A management command for pulling the data Note that counts.py was added to the linter exclude list due to errors around %%s.	2016-10-04 17:18:54 -07:00
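
Commit f299f31340 above describes a multi-column unique index that fails to reject duplicates when subgroup is null, and a conditional unique index added to close the gap. The following is a minimal sketch of that behavior, not the actual Zulip schema or migration: it uses the standard-library sqlite3 module as a stand-in database, and the table name `realm_count`, its columns, the index names, and the function `rows_stored` are all hypothetical.

```python
import sqlite3


def rows_stored(with_conditional_index: bool) -> int:
    """Insert the same NULL-subgroup row twice; return how many inserts succeeded."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, loosely modeled on a *Count table with a nullable subgroup.
    conn.execute(
        "CREATE TABLE realm_count ("
        "property TEXT NOT NULL, subgroup TEXT, "
        "end_time TEXT NOT NULL, value INTEGER NOT NULL)"
    )
    # Plain multi-column unique index, the kind unique_together produces.
    # NULLs compare as distinct, so it never rejects rows whose subgroup is NULL.
    conn.execute(
        "CREATE UNIQUE INDEX uniq_all ON realm_count (property, subgroup, end_time)"
    )
    if with_conditional_index:
        # Conditional (partial) unique index covering only the NULL-subgroup rows.
        conn.execute(
            "CREATE UNIQUE INDEX uniq_null_subgroup "
            "ON realm_count (property, end_time) WHERE subgroup IS NULL"
        )
    row = ("active_users", None, "2020-03-06T00:00:00Z", 1)
    stored = 0
    for _ in range(2):
        try:
            conn.execute("INSERT INTO realm_count VALUES (?, ?, ?, ?)", row)
            stored += 1
        except sqlite3.IntegrityError:
            # The duplicate was rejected by a unique index.
            pass
    conn.close()
    return stored
```

Calling `rows_stored(False)` lets both duplicate rows through, while `rows_stored(True)` rejects the second one, which is the effect the commit message attributes to the conditional index.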

46 Commits
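
Commit 2d9b2a2a05 in the table above notes that a `__repr__` which wraps `__str__` with the type name duplicates the prefix when `__str__` already includes it. The sketch below reproduces that effect without Django. The `Model` base class is a stand-in that assumes a `<TypeName: str(self)>` repr format, and `PrefixedCount` and `PlainCount` are hypothetical classes, not Zulip models.

```python
class Model:
    """Stand-in base class whose __repr__ wraps __str__ with the type name."""

    def __repr__(self) -> str:
        return f"<{type(self).__name__}: {self}>"


class PrefixedCount(Model):
    # __str__ repeats the type prefix, so repr() shows it twice.
    def __str__(self) -> str:
        return "<PrefixedCount: messages_sent 5>"


class PlainCount(Model):
    # __str__ omits the type prefix, so repr() shows it exactly once.
    def __str__(self) -> str:
        return "messages_sent 5"
```

Here `repr(PrefixedCount())` carries the type name twice, while `repr(PlainCount())` carries it once, which is why the commit strips the prefix from the `__str__` values.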