zulip/zerver/lib/url_preview/parsers/generic.py

from typing import Dict, Optional
from zerver.lib.url_preview.parsers.base import BaseParser


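class GenericParser(BaseParser):
    """Extract preview data from generic HTML elements.

    Uses <title>, <h1>, <meta name="description">, <p> and <img>.
    """

    def extract_data(self) -> Dict[str, Optional[str]]:
        return {
            'title': self._get_title(),
            'description': self._get_description(),
            'image': self._get_image(),
        }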

    def _get_title(self) -> Optional[str]:
        """Return the <title> text, else the first <h1> text, else None."""
        soup = self._soup
        if soup.title and soup.title.text != '':
            return soup.title.text
        if soup.h1 and soup.h1.text != '':
            return soup.h1.text
        return None

    def _get_description(self) -> Optional[str]:
        """Return the page description, or None if none is found.

        Tries, in order: the <meta name="description"> content, the first
        <p> after the first <h1>, and the first <p> in the document.
        """
        soup = self._soup
        meta_description = soup.find('meta', attrs={'name': 'description'})
        if meta_description and meta_description.get('content', '') != '':
            return meta_description['content']
        first_h1 = soup.find('h1')
        if first_h1:
            first_p = first_h1.find_next('p')
            if first_p and first_p.text != '':
                return first_p.text
        first_p = soup.find('p')
        if first_p and first_p.text != '':
            return first_p.text
        return None

    def _get_image(self) -> Optional[str]:
        """
        Find the first <img> sibling that follows the first h1 header.
        Presumably it will be the main image.
        """
        soup = self._soup
        first_h1 = soup.find('h1')
        if first_h1:
            first_image = first_h1.find_next_sibling('img')
            # Use .get() so an <img> without a src attribute does not
            # raise KeyError.
            if first_image and first_image.get('src', '') != '':
                return first_image['src']
        return None
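
The title lookup above tries `<title>` first and falls back to the first `<h1>`. As a rough, dependency-free illustration of that fallback order, the sketch below uses the standard library's `html.parser` instead of BeautifulSoup. `_TitleCollector` and `fallback_title` are hypothetical names that are not part of Zulip, and the sketch is not the parser's actual implementation.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Set


class _TitleCollector(HTMLParser):
    """Hypothetical helper: collects the text of the first <title> and first <h1>."""

    def __init__(self) -> None:
        super().__init__()
        self._open: Optional[str] = None  # tag whose text is being captured
        self._done: Set[str] = set()      # tags already captured once
        self.text: Dict[str, List[str]] = {"title": [], "h1": []}

    def handle_starttag(self, tag, attrs):
        # Start capturing only for the first occurrence of each tracked tag.
        if tag in self.text and tag not in self._done and self._open is None:
            self._open = tag

    def handle_endtag(self, tag):
        if tag == self._open:
            self._done.add(tag)
            self._open = None

    def handle_data(self, data):
        if self._open is not None:
            self.text[self._open].append(data)


def fallback_title(html: str) -> Optional[str]:
    """Return the <title> text, else the first <h1> text, else None."""
    collector = _TitleCollector()
    collector.feed(html)
    collector.close()
    for tag in ("title", "h1"):
        text = "".join(collector.text[tag])
        if text != "":
            return text
    return None
```

An empty `<title>` is treated the same as a missing one, so a page that only has a heading still yields a title for the preview.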