zulip

Commit Graph

Author	SHA1	Message	Date
rht	1c84f02f57	slack import: Convert threads to nicely named Zulip topics. Fixes #9006.	2023-05-30 16:35:19 -07:00
Anders Kaseorg	9797de52a0	ruff: Fix RUF010 Use conversion in f-string. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-05-26 22:09:18 -07:00
Alex Vandiver	567d1d54e7	upload: Rename upload_message_file to use word "attachment". For consistency with the table, which is named Attachment.	2023-03-02 16:36:19 -08:00
Alex Vandiver	fe654b76b7	data_import: Stop tar'ing up converted data. `./manage.py import` does not take a tarball; it takes a directory. Making a separate tarball is a waste of CPU time and disk, as it is never used. This was included in the commit of the initial Slack conversion code in `5b37c5562b` and propagated from there into every conversion tool. Remove the unnecessary tarball creation.	2023-02-26 17:42:01 -08:00
Alex Vandiver	92c8c17190	import: Add the UTF-8 flag on file entries in zipfiles from Slack. Fixes: #22533.	2023-01-31 16:07:48 -08:00
Anders Kaseorg	e5d671bf2b	ruff: Fix SIM210 Use `bool(…)` instead of `True if … else False`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-23 11:18:36 -08:00
Anders Kaseorg	8f7a7877fe	python: Clean up janky URL matching code with urlsplit. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-18 17:25:46 -05:00
Alex Vandiver	7c0d414aff	uploads: Split out S3 and local file backends into separate files. The uploads file is large, and conceptually the S3 and local-file backends are separable.	2023-01-09 18:23:58 -05:00
Anders Kaseorg	e1ed44907b	ruff: Fix SIM118 Use `key in dict` instead of `key in dict.keys()`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-04 16:25:07 -08:00
Anders Kaseorg	872f4b41c1	ci: Check that non-scripts aren’t marked executable. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-12-07 09:54:01 -08:00
M@	280523ad48	slack import: Merge and dedupe same-base emoji reactions and userlists. The naive solution #23465 creates situations where the same user can have multiple reactions as the base emojis are not unique, e.g. +1::skin2 and +1::skin4 would both reduce to +1 but the userlists are separate. This solution handles the reduction, merges the same-base reactions, and deduplicates the userlist. Co-authored-by: Alex Vandiver <alexmv@zulip.com> Co-authored-by: rht <rhtbot@protonmail.com>	2022-11-16 11:11:43 -08:00
rht	5ce2103b87	Slack import: Cache emoji.json in static/generated/emoji. Previously, emoji.json was read from "$ZULIP_PATH/node_modules/emoji-datasource-google/emoji.json". This path doesn't exist in production when installing from scratch from a release tarball. And so, we ensure emoji.json exists by copying it to `static/generated/emoji`. With tweaks to comments by tabbott. Fixes: #23469	2022-11-15 10:43:11 -08:00
Matt Keller	958b58f174	slack: Skip reactions for deleted users. Fixes #23552.	2022-11-14 13:08:15 -08:00
Matt Keller	8aa7ff4bbb	slack: Parse emoji skin tone variants. Fixes part of #23276.	2022-11-07 14:25:49 -08:00
Matt Keller	4d87bf291c	slack: Skip files where file_access: file_not_found.	2022-10-25 12:18:20 -07:00
Matt Keller	c5f106ce1b	slack: Skip files where file_access: access_denied. These stubs are incomplete and should be treated akin to tombstones.	2022-10-11 10:53:16 -07:00
Mateusz Mandera	00b3546c9f	models: Add denormalized .realm column to Message. This commit adds the OPTIONAL .realm attribute to Message (and ArchivedMessage), with the server changes for making new Messages have this set. Old Messages still have to be migrated to backfill this, before it can be non-nullable. Appropriate test changes to correctly set .realm for Messages the tests manually create are included here as well.	2022-10-07 10:09:38 -07:00
Mateusz Mandera	2811a1228f	import_util: Make build_message only take kwargs. build_message has a lot of arguments, so it's hard to verify correctness of callers that just try to get the order right. It's much clearer to be explicit via kwargs. mattermost.py and rocketchat.py already do this, so let's bring slack.py and gitter.py up to par.	2022-09-27 15:04:48 -07:00
Matt Keller	fd996c286e	slack: Filter out non-.json files for processing.	2022-09-23 09:59:34 -07:00
rht	a7cff0f091	Slack import: Translate emoji name to codepoint using iamcal data. Because Slack emoji naming is different from Zulip's. According to https://emojipedia.org/slack/, Slack's emoji shortcodes are derived from https://github.com/iamcal/emoji-data. There are probably some deviations from that dataset, but this PR should at least catch the ones that are identical to iamcal's.	2022-09-17 12:04:07 -07:00
Mateusz Mandera	eed8800573	long_term_idle_helper: Change all_user_ids arg to an Iterator.	2022-08-29 11:03:27 -07:00
Mateusz Mandera	75f26bb8ff	long_term_idle_helper: Take list of user_ids as arg instead of dicts. Only ["id"] is accessed on the dicts (representing the external tool users). Given that for some tools the id may be under a different name etc. due to different user dicts format, it's best to just pass those ids to the function so that it can stay generalized and not reliant on a specific user dict format.	2022-08-29 11:03:27 -07:00
Mateusz Mandera	c4c270380a	slack: Use get_timestamp_from_message helper function where relevant. get_timestamp_from_message was extracted in the previous commit. We can deduplicate and make the code a bit cleaner by using it where appropriate instead of message["ts"].	2022-08-29 11:03:27 -07:00
Mateusz Mandera	9e56e71afe	long_term_idle_helper: Take timestamp_from_message callable arg. message["ts"] is slack-specific. For this to be a general util function it needs to take a callable that will grab a timestamp from the message dict (which has varying formats depending on what we're importing from).	2022-08-29 11:03:27 -07:00
Alex Vandiver	1b1faa3907	import_util: Factor out long_term_idle_helper.	2022-08-29 11:03:27 -07:00
Anders Kaseorg	b945aa3443	python: Use a real parser for email addresses. Now that we can assume Python 3.6+, we can use the email.headerregistry module to replace hacky manual email address parsing. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-07-29 15:47:33 -07:00
Alex Vandiver	2e50ead9d1	data_import: Fix bot email address de-duplication. `4815f6e28b` tried to de-duplicate bot email addresses, but instead caused duplicates to crash: ``` Traceback (most recent call last): File "./manage.py", line 157, in <module> execute_from_command_line(sys.argv) File "./manage.py", line 122, in execute_from_command_line utility.execute() File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute self.fetch_command(subcommand).run_from_argv(self.argv) File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv self.execute(args, cmd_options) File "/srv/zulip-venv-cache/56ac6adf406011a100282dd526d03537be84d23e/zulip-py3-venv/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute output = self.handle(args, **options) File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/management/commands/convert_slack_data.py", line 59, in handle do_convert_data(path, output_dir, token, threads=num_threads) File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 1320, in do_convert_data ) = slack_workspace_to_realm( File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 141, in slack_workspace_to_realm ) = users_to_zerver_userprofile(slack_data_dir, user_list, realm_id, int(NOW), domain_name) File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 248, in users_to_zerver_userprofile email = get_user_email(user, domain_name) File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 406, in get_user_email return SlackBotEmail.get_email(user["profile"], domain_name) File "/home/zulip/deployments/2022-03-16-22-25-42/zerver/data_import/slack.py", line 85, in get_email email_prefix += cls.duplicate_email_count[email] TypeError: can only concatenate str (not "int") to str ``` Fix the stringification, make it case-insensitive, append with a dash for readability, and add tests for all of the above.	2022-03-31 11:10:18 -07:00
Anders Kaseorg	b0ce4f1bce	docs: Fix many spelling mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-02-07 18:51:06 -08:00
rht	58b19761b8	slack import: Fix requests.get usage of get_slack_api_data. We also rewrite the tests using the `responses` module to avoid the problematic mocking that made this bug possible. Fixes #19833.	2021-10-07 11:46:23 -07:00
rht	d8e1409fe5	Slack import: Use Python ZipFile to unzip. This should handle the case when non-ASCII Unicode folder names are created on Windows. Fixes #19899.	2021-10-07 09:24:19 -07:00
Priyansh Garg	4815f6e28b	data_import: Make slack bot emails unique. Slack bot emails generated by us can be duplicated across two bots. If such a case occurs, append a counter to the email to make it unique. For maintaining the counter of duplicate emails and the final email assigned to each bot, a class based approach is used with static variables and static (class) methods. This keeps all the data related to slack bot emails at the same place and easily accessible from anywhere inside the module (without defining any class object and passing it around). Fixes: #16793	2021-08-03 16:18:14 -07:00
Anders Kaseorg	5483ebae37	python: Convert "".format to Python 3.6 f-strings. Generated automatically by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-08-02 15:53:52 -07:00
Anders Kaseorg	3665deb93a	python: Remove unnecessary intermediate lists. Generated automatically by pyupgrade. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-08-02 15:53:52 -07:00
rht	1bbd36d181	slack_import: Remove obsolete SlackImportAttachment placeholder. This was introduced in `f4ad464d82`, and incompletely removed in e037c2f93e649c28a71c02559b5ae7a3333f42a8; here we finish removing it.	2021-08-02 13:13:28 -07:00
Alex Vandiver	ff9126ac1e	data_import: Protect better against bad Slack tokens. An invalid token would be treated the same as a token with no scopes; differentiate these better.	2021-05-27 22:46:58 -07:00
Alex Vandiver	94e4f33b29	data_import: Support importing from Slack conversions in a directory. Sometimes the Slack import zip file we get isn't quite the canonical form that Slack produces -- often because the user has unzip'd it, looked at it, and re-zip'd it, resulting in extra nested directories and the like. For such cases, support passing in a path to an unpacked Slack export tree.	2021-05-27 22:46:58 -07:00
Alex Vandiver	8228ea2a17	import_data: Do some quick verification of Slack import formats.	2021-05-27 22:46:58 -07:00
Anders Kaseorg	544bbd5398	docs: Fix capitalization mistakes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-05-10 09:57:26 -07:00
Cyril Pletinckx	ba7da6d5c0	import/export: Fix deprecated authentication method for Slack. The query string parameter authentication method is now deprecated for newly created Slack applications since the 24th of February[1]. This causes Slack imports to fail, claiming that the token has none of the required scopes. Two methods can be used to solve this problem: either include the authentication token in the header of an HTTP GET request, or include it in the body of an HTTP POST request. The former is preferred, as the code was already written to use HTTP GET requests. Change the way the parameters are passed to the "requests.get" method calls, to pass the token via the `Authorization` header. [1] https://api.slack.com/changelog/2020-11-no-more-tokens-in-querystrings-for-newly-created-apps Fixes: #17408.	2021-03-08 12:56:37 -08:00
Anders Kaseorg	6e4c3e41dc	python: Normalize quotes with Black. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Anders Kaseorg	11741543da	python: Reformat with Black, except quotes. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2021-02-12 13:11:19 -08:00
Alex Vandiver	7c849fa940	slack: Check token access scopes before importing. The Slack API always (even for failed requests) puts the access scopes of the token passed in, into "X-OAuth-Scopes"[1], which can be used to determine if any are missing -- and if so, which. [1] https://api.slack.com/legacy/oauth-scopes#working-with-scopes	2020-12-15 11:33:15 -08:00
Tim Abbott	067cd3a97a	docs: Remove incorrect references to chat.zulip.org. Most of these are Help Center links that should be pointing to the production Help Center.	2020-10-29 16:46:40 -07:00
Anders Kaseorg	4e9d587535	python: Pass query parameters as a dict when making GET requests. This provides automatic URL-encoding. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-27 13:47:02 -07:00
Anders Kaseorg	72d6ff3c3b	docs: Fix more capitalization issues. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-10-23 11:46:55 -07:00
Anders Kaseorg	b7b7475672	python: Use standard secrets module to generate random tokens. There are three functional side effects: • Correct an insignificant but mathematically offensive bias toward repeated characters in generate_api_key introduced in commit 47b4283c4b4c70ecde4d3c8de871c90ee2506d87; its entropy is increased from 190.52864 bits to 190.53428 bits. • Use the base32 alphabet in confirmation.models.generate_key; its entropy is reduced from 124.07820 bits to the documented 120 bits, but now it uses 1 syscall instead of 24. • Use the base32 alphabet in get_bigbluebutton_url; its entropy is reduced from 51.69925 bits to 50 bits, but now it uses 1 syscall instead of 10. (The base32 alphabet is A-Z 2-7. We could probably replace all of these with plain secrets.token_urlsafe, since I expect most callers can handle the full urlsafe_b64 alphabet A-Z a-z 0-9 - _ without problems.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-09-09 15:52:57 -07:00
Anders Kaseorg	61d0417e75	python: Replace ujson with orjson. Fixes #6507. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-08-11 10:55:12 -07:00
Steve Howell	c44500175d	database: Remove short_name from UserProfile. A few major themes here: - We remove short_name from UserProfile and add the appropriate migration. - We remove short_name from various cache-related lists of fields. - We allow import tools to continue to write short_name to their export files, and then we simply ignore the field at import time. - We change functions like do_create_user, create_user_profile, etc. - We keep short_name in the /json/bots API. (It actually gets turned into an email.) - We don't modify our LDAP code much here.	2020-07-17 11:15:15 -07:00
Steve Howell	0b65abcdf5	pointer: Remove pointer from UserProfile. Most of the changes here are just that we no longer need to provide a value for pointer when we create UserProfile objects.	2020-07-03 13:08:40 +00:00
Anders Kaseorg	74c17bf94a	python: Convert more percent formatting to Python 3.6 f-strings. Generated by pyupgrade --py36-plus. Now including %d, %i, %u, and multi-line strings. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2020-06-14 23:27:22 -07:00
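
The merge-and-dedupe behavior described in commit 280523ad48 (same-base emoji reactions such as +1::skin2 and +1::skin4 both reducing to +1) can be illustrated with a minimal sketch. This is not the code from that commit: the function name `merge_same_base_reactions` and the input shape (a list of dicts with `name` and `users` keys) are assumptions made for illustration only.

```python
def merge_same_base_reactions(reactions):
    """Merge reactions that share a base emoji and dedupe their user lists.

    Hypothetical helper: `reactions` is assumed to be a list of dicts like
    {"name": "+1::skin2", "users": ["U1", "U2"]}.
    """
    merged = {}  # base emoji name -> ordered list of unique user ids
    for reaction in reactions:
        # Drop any skin tone suffix: "+1::skin2" reduces to "+1".
        base = reaction["name"].split("::", 1)[0]
        users = merged.setdefault(base, [])
        for user in reaction["users"]:
            # Keep each user at most once per base emoji.
            if user not in users:
                users.append(user)
    return [{"name": name, "users": users} for name, users in merged.items()]


example = [
    {"name": "+1::skin2", "users": ["U1", "U2"]},
    {"name": "+1::skin4", "users": ["U2", "U3"]},
    {"name": "tada", "users": ["U1"]},
]
result = merge_same_base_reactions(example)
```

With the example input, the two +1 variants collapse into a single reaction whose user list contains each user once, so no user ends up with two reactions of the same base emoji.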


131 Commits