zulip/analytics/management/commands/analyze_mit.py

import datetime
import logging
import time
from typing import Any, Dict

from django.core.management.base import BaseCommand, CommandParser

from zerver.lib.timestamp import timestamp_to_datetime
from zerver.models import Message, Recipient

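def compute_stats(log_level: int) -> None:
    """Log, for the past week of zephyr-realm stream messages, what share was
    sent through clients other than zephyr_mirror."""
    logger = logging.getLogger()
    logger.setLevel(log_level)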

    one_week_ago = timestamp_to_datetime(time.time()) - datetime.timedelta(weeks=1)
    mit_query = Message.objects.filter(sender__realm__string_id="zephyr",
                                       recipient__type=Recipient.STREAM,
                                       pub_date__gt=one_week_ago)
    # Exclude senders whose addresses start with a known bot prefix.
    for bot_sender_start in ["imap.", "rcmd.", "sys."]:
        mit_query = mit_query.exclude(sender__email__startswith=bot_sender_start)
    # Filtering for "/" covers tabbott/extra@ and all the daemon/foo bots.
    mit_query = mit_query.exclude(sender__email__contains=("/"))
    mit_query = mit_query.exclude(sender__email__contains=("aim.com"))
    mit_query = mit_query.exclude(
        sender__email__in=["rss@mit.edu", "bash@mit.edu", "apache@mit.edu",
                           "bitcoin@mit.edu", "lp@mit.edu", "clocks@mit.edu",
                           "root@mit.edu", "nagios@mit.edu",
                           "www-data|local-realm@mit.edu"])
    # Tally messages per sender email, broken down by sending client name.
    user_counts = {}  # type: Dict[str, Dict[str, int]]
    for m in mit_query.select_related("sending_client", "sender"):
        email = m.sender.email
        user_counts.setdefault(email, {})
        user_counts[email].setdefault(m.sending_client.name, 0)
        user_counts[email][m.sending_client.name] += 1

    # Roll the per-sender tallies up into per-client and per-sender totals.
    total_counts = {}  # type: Dict[str, int]
    total_user_counts = {}  # type: Dict[str, int]
    for email, counts in user_counts.items():
        total_user_counts.setdefault(email, 0)
        for client_name, count in counts.items():
            total_counts.setdefault(client_name, 0)
            total_counts[client_name] += count
            total_user_counts[email] += count

    logging.debug("%40s | %10s | %s" % ("User", "Messages", "Percentage Zulip"))
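    # For each bucket size N, accumulate the mean percent_zulip over the N most
    # active senders. The sum is always divided by N, so a bucket larger than
    # the number of senders is diluted by the missing entries.
    top_percents = {}  # type: Dict[int, float]
    for size in [10, 25, 50, 100, 200, len(total_user_counts)]:
        top_percents[size] = 0.0
    for i, email in enumerate(sorted(total_user_counts.keys(),
                                     key=lambda x: -total_user_counts[x])):
        # Share of this sender's messages that did not come through zephyr_mirror.
        percent_zulip = round(100 - (user_counts[email].get("zephyr_mirror", 0)) * 100. /
                              total_user_counts[email], 1)
        for size in top_percents:
            if i < size:
                top_percents[size] += (percent_zulip * 1.0 / size)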

        logging.debug("%40s | %10s | %s%%" % (email, total_user_counts[email],
                                              percent_zulip))

    logging.info("")
    for size in sorted(top_percents.keys()):
        logging.info("Top %6s | %s%%" % (size, round(top_percents[size], 1)))

    grand_total = sum(total_counts.values())
    print(grand_total)
    logging.info("%15s | %s" % ("Client", "Percentage"))
    for client in total_counts.keys():
        logging.info("%15s | %s%%" % (client, round(100. * total_counts[client] / grand_total, 1)))

class Command(BaseCommand):
    help = "Compute statistics on MIT Zephyr usage."

    def add_arguments(self, parser: CommandParser) -> None:
        parser.add_argument('--verbose', default=False, action='store_true')

    def handle(self, *args: Any, **options: Any) -> None:
        level = logging.INFO
        if options["verbose"]:
            level = logging.DEBUG
        compute_stats(level)
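
The tallying and percentage logic in compute_stats can be tried without a database. The following is a minimal sketch, not the command's actual code path: the helper names tally_messages and percent_zulip, the sample addresses, and the "website" client name are all hypothetical, and messages are assumed to arrive as plain (sender email, client name) pairs rather than as Message rows.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tally_messages(messages: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count messages per sender email and per sending client name."""
    user_counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for email, client_name in messages:
        user_counts[email][client_name] += 1
    # Convert to plain dicts so lookups of missing keys do not create entries.
    return {email: dict(counts) for email, counts in user_counts.items()}


def percent_zulip(client_counts: Dict[str, int]) -> float:
    """Percentage of one sender's messages not sent through zephyr_mirror."""
    total = sum(client_counts.values())
    mirrored = client_counts.get("zephyr_mirror", 0)
    return round(100 - mirrored * 100.0 / total, 1)


# Hypothetical sample data: (sender email, sending client name) pairs.
sample = [
    ("alice@mit.edu", "website"),
    ("alice@mit.edu", "zephyr_mirror"),
    ("alice@mit.edu", "website"),
    ("alice@mit.edu", "website"),
    ("bob@mit.edu", "zephyr_mirror"),
]
counts = tally_messages(sample)
for email in sorted(counts):
    print(email, percent_zulip(counts[email]))
```

Using defaultdict here replaces the repeated setdefault calls; whether that reads better than the original style is a matter of taste.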
python: Sort imports in smaller apps. 2017-11-16 00:55:49 +01:00			`import datetime`
			`import logging`
			`import time`
mypy: Added Dict, List and Set imports. Fixed mypy errors associated with the upgrade. 2017-03-03 19:01:52 +01:00			`from typing import Any, Dict`
Annotate most Zulip management commands. 2016-06-04 16:52:18 +02:00
Django 1.10: Use add_argument for options in BaseCommand. 2016-11-03 10:22:19 +01:00			`from django.core.management.base import BaseCommand, CommandParser`
python: Sort imports in smaller apps. 2017-11-16 00:55:49 +01:00
[manual] Rename Django app from zephyr to zerver. This needs to be deployed to both staging and prod at the same off-peak time (and the schema migration run). At the time it is deployed, we need to make a few changes directly in the database: (1) UPDATE django_content_type set app_label='zerver' where app_label='zephyr'; (2) UPDATE south_migrationhistory set app_name='zerver' where app_name='zephyr'; (imported from commit eb3fd719571740189514ef0b884738cb30df1320) 2013-07-29 23:03:31 +02:00			`from zerver.lib.timestamp import timestamp_to_datetime`
python: Sort imports in smaller apps. 2017-11-16 00:55:49 +01:00			`from zerver.models import Message, Recipient`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00
analytics: Use python 3 syntax for typing. 2017-11-05 06:54:00 +01:00			`def compute_stats(log_level: int) -> None:`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`logger = logging.getLogger()`
			`logger.setLevel(log_level)`

			`one_week_ago = timestamp_to_datetime(time.time()) - datetime.timedelta(weeks=1)`
Change string_id of test zephyr realm from mit to zephyr. Also changes Realm.is_zephyr_mirror_realm to use string_id=zephyr instead of domain=mit.edu. Part of a larger migration away from Realm.domain. 2017-03-04 09:19:37 +01:00			`mit_query = Message.objects.filter(sender__realm__string_id="zephyr",`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`recipient__type=Recipient.STREAM,`
			`pub_date__gt=one_week_ago)`
			`for bot_sender_start in ["imap.", "rcmd.", "sys."]:`
Access the UserProfile's new email field rather than using User. This is preparatory for stopping using the User model. (imported from commit a1b0808c8cc2ddd19a25163f91c4f18620c9ce90) 2013-03-28 20:43:34 +01:00			`mit_query = mit_query.exclude(sender__email__startswith=(bot_sender_start))`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`# Filtering for "/" covers tabbott/extra@ and all the daemon/foo bots.`
Access the UserProfile's new email field rather than using User. This is preparatory for stopping using the User model. (imported from commit a1b0808c8cc2ddd19a25163f91c4f18620c9ce90) 2013-03-28 20:43:34 +01:00			`mit_query = mit_query.exclude(sender__email__contains=("/"))`
			`mit_query = mit_query.exclude(sender__email__contains=("aim.com"))`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`mit_query = mit_query.exclude(`
Access the UserProfile's new email field rather than using User. This is preparatory for stopping using the User model. (imported from commit a1b0808c8cc2ddd19a25163f91c4f18620c9ce90) 2013-03-28 20:43:34 +01:00			`sender__email__in=["rss@mit.edu", "bash@mit.edu", "apache@mit.edu",`
			`"bitcoin@mit.edu", "lp@mit.edu", "clocks@mit.edu",`
			`"root@mit.edu", "nagios@mit.edu",`
			`"www-data|local-realm@mit.edu"])`
pep8: Add compliance with rule E261 to analyze_mit.py. 2017-05-07 16:31:31 +02:00			`user_counts = {} # type: Dict[str, Dict[str, int]]`
Access the UserProfile's new email field rather than using User. This is preparatory for stopping using the User model. (imported from commit a1b0808c8cc2ddd19a25163f91c4f18620c9ce90) 2013-03-28 20:43:34 +01:00			`for m in mit_query.select_related("sending_client", "sender"):`
			`email = m.sender.email`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`user_counts.setdefault(email, {})`
			`user_counts[email].setdefault(m.sending_client.name, 0)`
			`user_counts[email][m.sending_client.name] += 1`

pep8: Add compliance with rule E261 to analyze_mit.py. 2017-05-07 16:31:31 +02:00			`total_counts = {} # type: Dict[str, int]`
			`total_user_counts = {} # type: Dict[str, int]`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`for email, counts in user_counts.items():`
			`total_user_counts.setdefault(email, 0)`
			`for client_name, count in counts.items():`
			`total_counts.setdefault(client_name, 0)`
			`total_counts[client_name] += count`
			`total_user_counts[email] += count`

Change Humbug => Zulip in text/comments. (imported from commit 2f9d73431ae40e1b9e9e11bc2f4f62f566ae758a) 2013-08-06 21:32:15 +02:00			`logging.debug("%40s | %10s | %s" % ("User", "Messages", "Percentage Zulip"))`
pep8: Add compliance with rule E261 to analyze_mit.py. 2017-05-07 16:31:31 +02:00			`top_percents = {} # type: Dict[int, float]`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`for size in [10, 25, 50, 100, 200, len(total_user_counts.keys())]:`
Fix various float initialization to use 0.0 instead of 0. This is needed to type-check these values. 2016-01-26 04:08:05 +01:00			`top_percents[size] = 0.0`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`for i, email in enumerate(sorted(total_user_counts.keys(),`
			`key=lambda x: -total_user_counts[x])):`
Change humbug => zulip in some local variables. (imported from commit 88caa4a87ea0fd269ab741645c124c5d07d69c0a) 2013-08-06 22:20:02 +02:00			`percent_zulip = round(100 - (user_counts[email].get("zephyr_mirror", 0)) * 100. /`
lint: Fix E127 pep8 violations. Fix pep8: E127 continuation line over-indented for visual indent style issue. 2016-11-30 14:17:35 +01:00			`total_user_counts[email], 1)`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`for size in top_percents.keys():`
			`top_percents.setdefault(size, 0)`
			`if i < size:`
Change humbug => zulip in some local variables. (imported from commit 88caa4a87ea0fd269ab741645c124c5d07d69c0a) 2013-08-06 22:20:02 +02:00			`top_percents[size] += (percent_zulip * 1.0 / size)`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00
			`logging.debug("%40s | %10s | %s%%" % (email, total_user_counts[email],`
Change humbug => zulip in some local variables. (imported from commit 88caa4a87ea0fd269ab741645c124c5d07d69c0a) 2013-08-06 22:20:02 +02:00			`percent_zulip))`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00
			`logging.info("")`
			`for size in sorted(top_percents.keys()):`
			`logging.info("Top %6s | %s%%" % (size, round(top_percents[size], 1)))`

			`grand_total = sum(total_counts.values())`
Apply Python 3 futurize transform libfuturize.fixes.fix_print_with_import. 2015-11-01 17:11:06 +01:00			`print(grand_total)`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`logging.info("%15s | %s" % ("Client", "Percentage"))`
			`for client in total_counts.keys():`
			`logging.info("%15s | %s%%" % (client, round(100. * total_counts[client] / grand_total, 1)))`

			`class Command(BaseCommand):`
			`help = "Compute statistics on MIT Zephyr usage."`

analytics: Use python 3 syntax for typing. 2017-11-05 06:54:00 +01:00			`def add_arguments(self, parser: CommandParser) -> None:`
Django 1.10: Use add_argument for options in BaseCommand. 2016-11-03 10:22:19 +01:00			`parser.add_argument('--verbose', default=False, action='store_true')`

analytics: Use python 3 syntax for typing. 2017-11-05 06:54:00 +01:00			`def handle(self, *args: Any, **options: Any) -> None:`
Add tool to analyze fraction of Zephyrs sent using Humbug. (imported from commit b491961b21e845471b1c52eae2b7069cc5328103) 2013-01-29 00:34:32 +01:00			`level = logging.INFO`
			`if options["verbose"]:`
			`level = logging.DEBUG`
			`compute_stats(level)`