markdown: Catch urllib3 exceptions which are raised during streaming.

requests translates the base urllib3 exceptions into subclasses of
requests.RequestException, but only within code that requests itself
is running.  When parsing the streaming body in fetch_open_graph_image,
the read itself (inside lxml) may cause urllib3 to raise its own
timeout error, which escapes the existing catch of
requests.RequestException.

Catch both requests and urllib3 exceptions.
Alex Vandiver 2024-05-24 19:38:46 +00:00 committed by Tim Abbott
parent c98bf184bb
commit b7bf9e41c7
1 changed file with 2 additions and 1 deletion

@@ -44,6 +44,7 @@ import re2
import regex
import requests
import uri_template
+import urllib3.exceptions
from django.conf import settings
from markdown.blockparser import BlockParser
from markdown.extensions import codehilite, nl2br, sane_lists, tables
@@ -474,7 +475,7 @@ def fetch_open_graph_image(url: str) -> Optional[Dict[str, Any]]:
elif element.get("property") == "og:description":
og["desc"] = element.get("content")
-except requests.RequestException:
+except (requests.RequestException, urllib3.exceptions.HTTPError):
return None
return None if og["image"] is None else og
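
The reasoning in the commit message can be illustrated with a small sketch. This is not the code from fetch_open_graph_image: read_body and timing_out_read are hypothetical names introduced only for illustration, and the timeout is simulated rather than coming from a real connection. It assumes requests and urllib3 are installed.

```python
from typing import Callable, Optional

import requests
import urllib3.exceptions


def read_body(read: Callable[[], bytes]) -> Optional[bytes]:
    """Hypothetical helper: call `read`, returning None on any fetch failure."""
    try:
        return read()
    except (requests.RequestException, urllib3.exceptions.HTTPError):
        # Covers errors wrapped by requests as well as raw urllib3 errors
        # raised while a third-party parser reads the streaming body.
        return None


def timing_out_read() -> bytes:
    # Stand-in for a read on the raw streaming body that times out.
    # The pool argument is None here only because no real pool exists.
    raise urllib3.exceptions.ReadTimeoutError(
        None, "https://example.com/", "Read timed out."
    )


# urllib3's timeout is not a requests.RequestException, so a handler for
# requests.RequestException alone would let it propagate.
assert not issubclass(
    urllib3.exceptions.ReadTimeoutError, requests.RequestException
)
assert read_body(timing_out_read) is None
```

Catching urllib3.exceptions.HTTPError, the base class of urllib3's exceptions, rather than only the read-timeout subclass also covers other urllib3 failures that could surface during the read.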