zulip

Commit Graph

Author	SHA1	Message	Date
Anders Kaseorg	50e6cba1af	ruff: Fix UP032 Use f-string instead of `format` call. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-19 16:14:59 -07:00
Anders Kaseorg	052984bc14	utils: Remove make_safe_digest wrapper. It’s unclear what was supposed to be “safe” about this wrapper. The hashlib API is fine without it, and we don’t want to encourage further use of SHA-1. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-07-19 10:54:05 -07:00
Steve Howell	67cdf1a7b4	emojis: Use get_emoji_data. The previous function was poorly named, asked for a Realm object when realm_id sufficed, and returned a tuple of strings that had different semantics. I also avoid calling it duplicate times in a couple places, although it was probably rarely the case that both invocations actually happened if upstream validations were working. Note that there is a TypedDict called EmojiInfo, so I chose EmojiData here. Perhaps a better name would be TinyEmojiData or something. I also simplify the reaction tests with a verify helper.	2023-07-17 09:35:53 -07:00
Alex Vandiver	21aeb4a040	slack: Handle the special case of permissions denied on team.info call. This is a follow-up to `4c8915c8e4`, for the case when the `team:read` permission is missing, which causes the `team.info` call itself to fail. The error message supplies information about the provided and missing permissions -- but it also still sends the `X-OAuth-Scopes` header which we normally read, so we can use that as normal.	2023-06-27 11:04:41 -07:00
Lauryn Menard	d3f7cfccbc	zerver: Update comments with "private message" or "PM". Updates comments/doc-strings that use "private message" or "PM" in files in the `/zerver` directory to instead use "direct message".	2023-06-23 11:24:13 -07:00
Lauryn Menard	2eeeda7694	mattermost: Update references to "private message" and "PM". Updates references to "private message" and "PM" in the data import and related tests for Mattermost to be "direct message" or "DM" instead.	2023-06-23 11:24:13 -07:00
Alex Vandiver	4c8915c8e4	slack: Provide more information when a Slack token fails to validate.	2023-06-23 11:09:45 -07:00
rht	1c84f02f57	slack import: Convert threads to nicely named Zulip topics. Fixes #9006.	2023-05-30 16:35:19 -07:00
Anders Kaseorg	9797de52a0	ruff: Fix RUF010 Use conversion in f-string. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-05-26 22:09:18 -07:00
Alex Vandiver	724de9cd49	rocketchat: Treat users with "bot" roles as bots when importing. We previously relied on `type`, but we have observed bots typed with a `bot` role as well.	2023-05-16 15:10:58 -07:00
Alex Vandiver	34394cec9a	rocketchat: Handle users with no email address set. Fixes: #25596.	2023-05-16 15:10:58 -07:00
Mateusz Mandera	ffa3aa8487	auth: Rewrite data model for tracking enabled auth backends. So far, we've used the BitField .authentication_methods on Realm for tracking which backends are enabled for an organization. This however made it a pain to add new backends (requiring altering the column and a migration - particularly troublesome if someone wanted to create their own custom auth backend for their server). Instead this will be tracked through the existence of the appropriate rows in the RealmAuthenticationMethods table.	2023-04-18 09:22:56 -07:00
Alex Vandiver	567d1d54e7	upload: Rename upload_message_file to use word "attachment". For consistency with the table, which is named Attachment.	2023-03-02 16:36:19 -08:00
Alex Vandiver	fe654b76b7	data_import: Stop tar'ing up converted data. `./manage.py import` does not take a tarball; it takes a directory. Making a separate tarball is a waste of CPU time and disk, as it is never used. This was included in the commit of the initial Slack conversion code in `5b37c5562b` and propagated from there into every conversion tool. Remove the unnecessary tarball creation.	2023-02-26 17:42:01 -08:00
Anders Kaseorg	df001db1a9	black: Reformat with Black 23. Black 23 enforces some slightly more specific rules about empty line counts and redundant parenthesis removal, but the result is still compatible with Black 22. (This does not actually upgrade our Python environment to Black 23 yet.) Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-02-02 10:40:13 -08:00
Alex Vandiver	92c8c17190	import: Add the UTF-8 flag on file entries in zipfiles from Slack. Fixes: #22533.	2023-01-31 16:07:48 -08:00
Anders Kaseorg	e5d671bf2b	ruff: Fix SIM210 Use `bool(…)` instead of `True if … else False`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-23 11:18:36 -08:00
Anders Kaseorg	b8b29dc3ad	ruff: Fix SIM110 Use `return any(…)` instead of `for` loop. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-23 11:18:36 -08:00
Anders Kaseorg	8f7a7877fe	python: Clean up janky URL matching code with urlsplit. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-18 17:25:46 -05:00
Alex Vandiver	7c0d414aff	uploads: Split out S3 and local file backends into separate files. The uploads file is large, and conceptually the S3 and local-file backends are separable.	2023-01-09 18:23:58 -05:00
Anders Kaseorg	e1ed44907b	ruff: Fix SIM118 Use `key in dict` instead of `key in dict.keys()`. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2023-01-04 16:25:07 -08:00
Anders Kaseorg	edab4ec997	rocketchat: Import timezone-aware datetimes. The bson library creates naive datetime objects by default. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-12-27 10:34:30 -08:00
Anders Kaseorg	872f4b41c1	ci: Check that non-scripts aren’t marked executable. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-12-07 09:54:01 -08:00
M@	280523ad48	slack import: Merge and dedupe same-base emoji reactions and userlists. The naive solution #23465 creates situations where the same user can have multiple reactions as the base emojis are not unique, e.g. +1::skin2 and +1::skin4 would both reduce to +1 but the userlists are separate. This solution handles the reduction, merges the same-base reactions, and deduplicates the userlist. Co-authored-by: Alex Vandiver <alexmv@zulip.com> Co-authored-by: rht <rhtbot@protonmail.com>	2022-11-16 11:11:43 -08:00
Anders Kaseorg	0258fba345	ruff: Fix N811 constant imported as non-constant. Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-11-16 09:29:11 -08:00
rht	5ce2103b87	Slack import: Cache emoji.json in static/generated/emoji. Previously, emoji.json was read from "$ZULIP_PATH/node_modules/emoji-datasource-google/emoji.json". This path doesn't exist in production when installing from scratch from a release tarball. And so, we ensure emoji.json exists by copying it to `static/generated/emoji`. With tweaks to comments by tabbott. Fixes: #23469	2022-11-15 10:43:11 -08:00
Matt Keller	958b58f174	slack: Skip reactions for deleted users. Fixes #23552.	2022-11-14 13:08:15 -08:00
Matt Keller	8aa7ff4bbb	slack: Parse emoji skin tone variants. Fixes part of #23276.	2022-11-07 14:25:49 -08:00
Matt Keller	4d87bf291c	slack: Skip files where file_access: file_not_found.	2022-10-25 12:18:20 -07:00
Matt Keller	c5f106ce1b	slack: Skip files where file_access: access_denied. These stubs are incomplete and should be treated akin to tombstones.	2022-10-11 10:53:16 -07:00
Mateusz Mandera	00b3546c9f	models: Add denormalized .realm column to Message. This commit adds the OPTIONAL .realm attribute to Message (and ArchivedMessage), with the server changes for making new Messages have this set. Old Messages still have to be migrated to backfill this, before it can be non-nullable. Appropriate test changes to correctly set .realm for Messages the tests manually create are included here as well.	2022-10-07 10:09:38 -07:00
Anders Kaseorg	47c5deeccd	python: Mark dict parameters with defaults as read-only. Found by semgrep 0.115 more accurately applying the rule added in commit `0d6c771baf` (#15349). Signed-off-by: Anders Kaseorg <anders@zulip.com>	2022-10-06 13:48:28 -07:00
Mateusz Mandera	2811a1228f	import_util: Make build_message only take kwargs. build_message has a lot of arguments, so it's hard to verify correctness of callers that just try to get the order right. It's much clearer to be explicit via kwargs. mattermost.py and rocketchat.py already do this, so let's bring slack.py and gitter.py up to par.	2022-09-27 15:04:48 -07:00
Matt Keller	fd996c286e	slack: Filter out non-.json files for processing.	2022-09-23 09:59:34 -07:00
rht	a7cff0f091	Slack import: Translate emoji name to codepoint using iamcal data. Because Slack emoji naming is different from Zulip's. According to https://emojipedia.org/slack/, Slack's emoji shortcodes are derived from https://github.com/iamcal/emoji-data. There are probably some deviations from that dataset, but this PR should at least catch the ones that are identical to iamcal's.	2022-09-17 12:04:07 -07:00
Florian Pritz	a276603766	rocketchat: Deduplicate and ignore huddle rooms with same users. If there is more than one room with the same set of users, the import will fail due to a unique constraint on the huddle_hash. Figuring out why and which room is causing this database error is kinda difficult. We deduplicate those cases here and simply merge the rooms together. Note however, that the deduplication does not work as expected so we simply ignore them altogether for now and only raise an exception along with some logging output. At least this way, it is pretty clear what is wrong and you do not have to wait to get a database error during the actual import. We also ignore empty huddle rooms since those are the duplicates that caused problems for me and if they are empty, ignoring them is easier than trying to get the merge to work. Not sure where those channels come from since we discovered this with production data. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	3677aabcbd	rocketchat: Ignore mention mapping failures. Not sure where those come from since we discovered this with production data. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	c308799133	rocketchat: Only set message content if it exists. Not sure where those come from since we discovered this with production data. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	1cc2764d45	rocketchat: Ignore reactions from nonexistent users. Not sure where those come from since we discovered this with production data. Somehow there were reactions with usernames that were old and no longer existed. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	26fe028534	rocketchat: Truncate long stream names. These will lead to an error during import otherwise. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	3a27919b5b	rocketchat: Ignore rocketchat attachments without types. Not sure where those come from since we discovered this with production data. There only was a single instance of this in my entire batch of data in an old message from the time when we started using Rocket.Chat. This might be an old issue or it might require some special settings that were later changed. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	5ec8f4ef09	rocketchat: Ignore missing rocketchat attachments. Not sure where those come from since we discovered this with production data. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Florian Pritz	96fa0991f8	rocketchat: Handle long or invalid rocketchat attachment names. Signed-off-by: Florian Pritz <bluewind@xinu.at>	2022-09-09 16:57:24 -07:00
Mateusz Mandera	5bcf78e0cb	import: Fix timestamp check in long_term_idle_helper. This is supposed to be 60 days, but timestamps are in seconds.	2022-08-29 15:18:00 -07:00
Mateusz Mandera	d350406991	gitter: Make imported Realm start with only GitHub auth enabled. Users will only be able to login via GitHub, because imported users get GitHub's generated noreply email addresses - so this should be the only auth method enabled at first, to avoid confusion.	2022-08-29 11:10:18 -07:00
Mateusz Mandera	eed8800573	long_term_idle_helper: Change all_user_ids arg to an Iterator.	2022-08-29 11:03:27 -07:00
Mateusz Mandera	4c7a9816ff	gitter: Soft deactivate appropriate imported users. We want to use the long_term_idle_helper logic for gitter imports just like we do for slack.	2022-08-29 11:03:27 -07:00
Mateusz Mandera	75f26bb8ff	long_term_idle_helper: Take list of user_ids as arg instead of dicts. Only ["id"] is accessed on the dicts (representing the external tool users). Given that for some tools the id may be under a different name etc. due to different user dicts format, it's best to just pass those ids to the function so that it can stay generalized and not reliant on a specific user dict format.	2022-08-29 11:03:27 -07:00
Mateusz Mandera	7ac31223e8	gitter: Extract get_user_from_message helper.	2022-08-29 11:03:27 -07:00
Mateusz Mandera	c4c270380a	slack: Use get_timestamp_from_message helper function where relevant. get_timestamp_from_message was extracted in the previous commit. We can deduplicate and make the code a bit cleaner by using it where appropriate instead of message["ts"].	2022-08-29 11:03:27 -07:00
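
Commit `280523ad48` above describes reducing skin-tone variants such as +1::skin2 and +1::skin4 to the base emoji +1, merging the reactions that share a base, and deduplicating the combined user list. The following is only a minimal sketch of that idea, not the code from the commit: the function name `merge_reactions` and the reaction dict shape (`name`, `users`, `count`) are assumptions made for illustration.

```python
def merge_reactions(reactions):
    """Merge reactions that share a base emoji and dedupe their users.

    Hypothetical input shape: each reaction is a dict with a "name"
    (optionally suffixed with "::<variant>") and a "users" list.
    """
    merged = {}  # base emoji name -> ordered list of unique user ids
    for reaction in reactions:
        # "+1::skin2" and "+1::skin4" both reduce to the base "+1".
        base = reaction["name"].split("::", 1)[0]
        users = merged.setdefault(base, [])
        for user in reaction["users"]:
            if user not in users:  # a user reacts at most once per base emoji
                users.append(user)
    return [
        {"name": name, "users": users, "count": len(users)}
        for name, users in merged.items()
    ]


example = [
    {"name": "+1::skin2", "users": ["U1", "U2"]},
    {"name": "+1::skin4", "users": ["U2", "U3"]},
    {"name": "tada", "users": ["U1"]},
]
result = merge_reactions(example)
```

With this input, the two +1 variants collapse into a single reaction with users U1, U2, and U3, so no user ends up holding two reactions with the same base emoji.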


288 Commits
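
Several of the ruff commits in this log name the lint rule they fix (UP032, RUF010, SIM210, SIM110, SIM118). As a rough illustration of what each rule asks for, the sketch below pairs the flagged pattern with its suggested rewrite. All variable and function names here are made up for the example and do not come from the Zulip code.

```python
name, count = "zulip", 288

# UP032: use an f-string instead of a `format` call.
old_format = "{} has {} commits".format(name, count)
new_format = f"{name} has {count} commits"

# RUF010: use a conversion flag in the f-string instead of calling str().
old_conversion = f"{str(count)} commits"
new_conversion = f"{count!s} commits"

# SIM210: use bool(...) instead of `True if ... else False`.
old_flag = True if count else False
new_flag = bool(count)


# SIM110: use `return any(...)` instead of a `for` loop.
def has_bot_role_old(roles):
    for role in roles:
        if role == "bot":
            return True
    return False


def has_bot_role_new(roles):
    return any(role == "bot" for role in roles)


# SIM118: use `key in dict` instead of `key in dict.keys()`.
users = {"alice": 1}
old_member = "alice" in users.keys()
new_member = "alice" in users
```

In each pair the two forms evaluate to the same value; the rewrite is purely a readability change.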