MRSS And RSS Feed Requirements

Feed validation requirements

This document specifies general rules for feed validation, as well as specific rules for the individual feed types: RSS, MRSS, YouTube, Atom.

General feed validation

All feeds must be valid XML documents. The characters '&', '<' and '>' are therefore not allowed inside text such as titles and descriptions, and ' and " are not allowed inside attribute values. Encode them as XML entities instead: &amp;, &lt;, &gt;, &apos; and &quot;.
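A feed generator can apply this escaping with standard library helpers. The following is a minimal sketch in Python; the function names escape_text and escape_attribute are illustrative and not part of the Crawling Service.

```python
from xml.sax.saxutils import escape


def escape_text(value: str) -> str:
    """Escape '&', '<' and '>' for use as element text (titles, descriptions)."""
    return escape(value)


def escape_attribute(value: str) -> str:
    """Escape '&', '<', '>' plus both quote characters for use in an attribute value."""
    return escape(value, {'"': "&quot;", "'": "&apos;"})


print(escape_text("Tom & Jerry <live>"))
print(escape_attribute('He said "hi" & left'))
```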

For each feed item, the Crawling Service looks for the following information: title, description, image url, video url, and link to the html page. The Crawler then tries to parse the link (if present) and checks whether the page contains additional information for these fields (specified using the og protocol). If the same information is present in both the feed item and the html page (e.g. an RSS feed item has a <title> and the html page has og:title), the information in the feed item takes precedence. The exception is the image url: og:image takes precedence, as it is likely to be of higher relevance and quality.
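The precedence rule can be sketched as a small merge function. This is an illustration only: the field names and the merge_item_info helper are assumptions made for the example, not the Crawler's actual code.

```python
FIELDS = ("title", "description", "image_url", "video_url", "link")


def merge_item_info(feed_info: dict, page_info: dict) -> dict:
    """Combine values from the feed item with og values from the html page."""
    merged = {}
    for field in FIELDS:
        feed_value = feed_info.get(field)
        page_value = page_info.get(field)
        if field == "image_url":
            # Exception: og:image from the page wins over the feed's image.
            merged[field] = page_value or feed_value
        else:
            # Default: the feed item wins, the page only fills gaps.
            merged[field] = feed_value or page_value
    return merged


feed = {"title": "Feed title", "image_url": "feed.png"}
page = {"title": "Page title", "image_url": "og.png", "video_url": "v.mp4"}
print(merge_item_info(feed, page))
```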

Not all the retrieved information is required for each media type. See table below.

 

             Article    Video      Highlights
Title        Required   Required   Optional
Description  Optional   Optional   Optional
Image Url    Optional   Optional   Required
Video Url    Optional   Required   Optional
Link         Required   Optional   Required
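The requirements table translates directly into a lookup. The sketch below shows how an item could be checked before submission; the media type keys, field names and the missing_fields helper are hypothetical names chosen for the example.

```python
# Required fields per media type, taken from the table above.
REQUIRED_FIELDS = {
    "article": {"title", "link"},
    "video": {"title", "video_url"},
    "highlights": {"image_url", "link"},
}


def missing_fields(media_type: str, item: dict) -> list:
    """Return the required fields that are absent or empty for this media type."""
    required = REQUIRED_FIELDS[media_type]
    return sorted(name for name in required if not item.get(name))


print(missing_fields("video", {"title": "Clip"}))
```

An empty list means the item carries everything that media type requires.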

Feeds with credentials

We support authentication credentials for feeds and for subitems of a feed (e.g. a video or image url), as long as the url has the form http://username:password@url or https://username:password@url.

Examples:

https://tudor:pass123@demo.connatix.com/Tudor/feed.xml

https://tudor:pass123@demo.connatix.com/Tudor/image.png
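To check that a url carries credentials in the expected form, it can be split with the Python standard library. This is a minimal sketch; split_credentials is a hypothetical helper, and how the Crawling Service handles credentials internally is not described here.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url: str):
    """Return (url without credentials, username, password)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean_url = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean_url, parts.username, parts.password


print(split_credentials("https://tudor:pass123@demo.connatix.com/Tudor/feed.xml"))
```

For a url without credentials, the username and password come back as None.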

og protocol

Information from html pages is specified using the og (Open Graph) protocol. See the specs here: https://wordpress.org/plugins/open-graph-protocol-framework/

Examples (note that for each og:<<property>> we also support og:<<property>>:url and og:<<property>>:secure_url as alternatives):

<meta charset="utf-8" property="og:title" content="MyPage" />

<meta charset="utf-8" property="og:description" content="Description" />

<meta charset="utf-8" property="og:image" content="index2.jpg" />

<meta charset="utf-8" property="og:url" content="https://www.google.com" />

<meta charset="utf-8" property="og:video" content="http://demo.connatix.com/Tudor/test.mp4" />

<meta charset="utf-8" id="og_title" content="MyPage" />

<meta charset="utf-8" id="og_description" content="Description" />

<meta charset="utf-8" id="og_image" content="index2.jpg" />

<meta charset="utf-8" id="og_url" content="https://www.google.com" />

<meta charset="utf-8" id="og_video" content="http://demo.connatix.com/Tudor/test.mp4" />
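To preview what a page exposes through these tags, the meta elements can be collected with Python's built-in HTML parser. The sketch below is an approximation under stated assumptions: it treats property="og:..." and id="og_..." as equivalent (as the examples above suggest), folds the :url and :secure_url variants into the base property, and keeps the first value seen. The OgMetaParser class and extract_og function are illustrative names.

```python
from html.parser import HTMLParser


class OgMetaParser(HTMLParser):
    """Collect og values from <meta> tags into a dict keyed by property name."""

    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        content = attr.get("content")
        if not content:
            return
        prop = attr.get("property") or ""
        meta_id = attr.get("id") or ""
        if prop.startswith("og:"):
            key = prop[3:]
        elif meta_id.startswith("og_"):
            key = meta_id[3:]
        else:
            return
        # og:image:url and og:image:secure_url count as og:image, and so on.
        for suffix in (":secure_url", ":url"):
            if key.endswith(suffix):
                key = key[: -len(suffix)]
                break
        self.og.setdefault(key, content)  # first value wins (an assumption)


def extract_og(html_text: str) -> dict:
    parser = OgMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.og


page = (
    '<html><head>'
    '<meta property="og:title" content="MyPage" />'
    '<meta property="og:image:secure_url" content="https://example.org/index2.jpg" />'
    '<meta id="og_video" content="http://demo.connatix.com/Tudor/test.mp4" />'
    '</head></html>'
)
print(extract_og(page))
```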

RSS feed validation

If a valid XML document has <rss> as its document element then it is an RSS feed. If, in addition, the <rss> element has the xmlns:media attribute set to either http://search.yahoo.com/mrss, http://search.yahoo.com/mrss/, http://www.rssboard.org/media-rss or http://www.rssboard.org/media-rss/ then it is considered to be an MRSS feed (see below).

Item validation for RSS feeds

In order to correctly identify the information inside an RSS feed item we look for the following child elements of <item>:

Info         Contained in
Title        <title>
Description  <description>
Link         <link>
Image        <image> or url attribute inside <media:thumbnail> element (this technically exceeds the RSS spec)

An RSS feed item cannot directly contain a link to a video. The video link may instead be specified as og:video in the page referenced by the <link> element.

Example RSS feed:

<rss>

  <channel>

    <title>RSS Title</title>

    <description>This is an example of an RSS feed</description>

    <link>http://demo.connatix.com/Tudor</link>

    <lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000</lastBuildDate>

    <pubDate>Sun, 06 Sep 2009 16:20:00 +0000</pubDate>

    <ttl>1800</ttl>

    <item>

      <title>Example entry</title>

      <description>Here is some text containing an interesting description.</description>

      <link>http://demo.connatix.com/Tudor</link>

      <guid isPermaLink="false">7bd204c6-1655-4c27-aeee-53f933c5395f</guid>

      <image>index.png</image>

      <pubDate>Sun, 06 Sep 2009 16:20:00 +0000</pubDate>

    </item>

  </channel>

</rss>
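The RSS item rules can be tried against a feed with a few lines of Python. This is a sketch of the lookup described in this section, not the Crawling Service itself; parse_rss_items is a hypothetical helper, and letting <image> win when both <image> and <media:thumbnail> are present is an assumption.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text: str) -> list:
    """Return one dict per <item> with title, description, link and image_url."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        info = {"title": None, "description": None, "link": None, "image_url": None}
        for child in item:
            text = (child.text or "").strip() or None
            if child.tag in ("title", "description", "link"):
                info[child.tag] = text
            elif child.tag == "image":
                info["image_url"] = text
            elif child.tag.endswith("}thumbnail") and info["image_url"] is None:
                # <media:thumbnail url="..."/>, whatever namespace the prefix is bound to
                info["image_url"] = child.get("url")
        items.append(info)
    return items


SAMPLE = """<rss>
  <channel>
    <title>RSS Title</title>
    <item>
      <title>Example entry</title>
      <description>Here is some text containing an interesting description.</description>
      <link>http://demo.connatix.com/Tudor</link>
      <image>index.png</image>
    </item>
  </channel>
</rss>"""

print(parse_rss_items(SAMPLE))
```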

MRSS feed validation

MRSS (Media RSS) is an extension of the RSS standard that allows media (such as video) to be specified as part of the feed item.

If a valid XML document has <rss> as its document element then it is an RSS feed. If, in addition, the <rss> element has the xmlns:media attribute set to either http://search.yahoo.com/mrss, http://search.yahoo.com/mrss/, http://www.rssboard.org/media-rss or http://www.rssboard.org/media-rss/ then it is considered to be an MRSS feed.

Item validation for MRSS feeds

In order to correctly identify the information inside an MRSS feed item we look for the following child elements of <item>:

Info         Contained in
Title        <title> or <media:title>
Description  <description> or <media:description>
Link         <link>
Image Url    <media:thumbnail> as a child of <item>, <media:content> or <media:group>
Video Url    url attribute of <media:content>, if the type attribute is "video/mp4" or the medium attribute is set to "video"; <media:content> can be a child of either <item> or <media:group>

Note that the type is optional if the file has the .mkv, .flv or .mp4 extension.
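The MRSS lookup rules can be sketched in Python as follows. Several points are assumptions made for the example: the helper names are hypothetical, the thumbnail url is read from the url attribute, and a <media:content> without type or medium is accepted as video when its url ends in .mkv, .flv or .mp4 (one reading of the note above).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

MRSS_NAMESPACES = (
    "http://search.yahoo.com/mrss",
    "http://search.yahoo.com/mrss/",
    "http://www.rssboard.org/media-rss",
    "http://www.rssboard.org/media-rss/",
)
VIDEO_EXTENSIONS = (".mkv", ".flv", ".mp4")


def media_children(parent, name):
    """Direct children named media:<name> in any accepted MRSS namespace."""
    found = []
    for uri in MRSS_NAMESPACES:
        found.extend(parent.findall(f"{{{uri}}}{name}"))
    return found


def text_of(element):
    if element is None or element.text is None:
        return None
    return element.text.strip() or None


def is_video(content):
    if content.get("type") == "video/mp4" or content.get("medium") == "video":
        return True
    path = urlsplit(content.get("url") or "").path.lower()
    return path.endswith(VIDEO_EXTENSIONS)


def first_media_text(item, name):
    return next((text_of(e) for e in media_children(item, name) if text_of(e)), None)


def parse_mrss_item(item):
    groups = media_children(item, "group")
    # <media:content> may sit directly under <item> or inside <media:group>.
    contents = media_children(item, "content")
    for group in groups:
        contents.extend(media_children(group, "content"))
    # <media:thumbnail> may sit under <item>, <media:content> or <media:group>.
    thumbnails = media_children(item, "thumbnail")
    for parent in contents + groups:
        thumbnails.extend(media_children(parent, "thumbnail"))
    return {
        "title": text_of(item.find("title")) or first_media_text(item, "title"),
        "description": text_of(item.find("description"))
        or first_media_text(item, "description"),
        "link": text_of(item.find("link")),
        "image_url": next((t.get("url") for t in thumbnails if t.get("url")), None),
        "video_url": next(
            (c.get("url") for c in contents if c.get("url") and is_video(c)), None
        ),
    }


SAMPLE = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <media:title>Clip</media:title>
      <link>http://example.com/clip</link>
      <media:group>
        <media:content url="http://example.com/clip.mp4">
          <media:thumbnail url="http://example.com/clip.jpg"/>
        </media:content>
      </media:group>
    </item>
  </channel>
</rss>"""

print(parse_mrss_item(ET.fromstring(SAMPLE).find("channel/item")))
```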

Atom feed validation

If a valid XML document has <feed> as its document element then it is an Atom feed. If, in addition, the <feed> element has the xmlns:yt attribute set to http://www.youtube.com/xml/schemas/ then it is a YouTube feed (see below).

Item validation for Atom feeds

In order to correctly identify the information inside an Atom feed item we look for the following child elements of <entry>:

Info         Contained in
Title        <title>
Description  <summary> or <content>
Link         "href" attribute of <link> if the "rel" attribute is set to "alternate"
Image Url    "href" attribute of <link> if the "rel" attribute is set to "enclosure" and the "type" attribute is set to "image"
Video Url    "href" attribute of <link> if the "rel" attribute is set to "enclosure" and the "type" attribute is set to "video"

Example Atom feed:

<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Example Feed</title>

 <subtitle>A subtitle.</subtitle>

 <link href="http://example.org/feed/" rel="self" />

 <link href="http://example.org/" />

 <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>

 <updated>2003-12-13T18:30:02Z</updated>

 <entry>

  <title>Atom-Powered Robots Run Amok</title>

  <link rel="alternate" type="text/html" href="http://example.org/2003/12/13/atom03.html"/>

  <link rel="enclosure" type="image" href="http://example.org/2003/12/13/atom03.jpg"/>

  <link rel="enclosure" type="video" href="http://example.org/2003/12/13/atom03.mp4"/>

  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>

  <updated>2003-12-13T18:30:02Z</updated>

  <summary>Some text.</summary>

  <content type="xhtml">

   <div xmlns="http://www.w3.org/1999/xhtml">

    <p>This is the entry content.</p>

   </div>

  </content>

  <author>

   <name>John Doe</name>

   <email>johndoe@example.com</email>

  </author>

 </entry>

</feed>
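The Atom entry rules can be sketched the same way. The example below assumes the standard Atom namespace shown in the feed above and matches the "type" attribute exactly as "image" or "video", as the table states; parse_atom_entry and text_of are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def text_of(element):
    """Whitespace-normalised text of an element and its descendants, or None."""
    if element is None:
        return None
    text = " ".join("".join(element.itertext()).split())
    return text or None


def parse_atom_entry(entry):
    info = {
        "title": text_of(entry.find(ATOM + "title")),
        "description": text_of(entry.find(ATOM + "summary"))
        or text_of(entry.find(ATOM + "content")),
        "link": None,
        "image_url": None,
        "video_url": None,
    }
    for link in entry.findall(ATOM + "link"):
        rel, link_type = link.get("rel"), link.get("type")
        if rel == "alternate":
            key = "link"
        elif rel == "enclosure" and link_type == "image":
            key = "image_url"
        elif rel == "enclosure" and link_type == "video":
            key = "video_url"
        else:
            continue
        if info[key] is None:  # keep the first match (an assumption)
            info[key] = link.get("href")
    return info


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
 <entry>
  <title>Atom-Powered Robots Run Amok</title>
  <link rel="alternate" type="text/html" href="http://example.org/2003/12/13/atom03.html"/>
  <link rel="enclosure" type="image" href="http://example.org/2003/12/13/atom03.jpg"/>
  <link rel="enclosure" type="video" href="http://example.org/2003/12/13/atom03.mp4"/>
  <summary>Some text.</summary>
 </entry>
</feed>"""

print(parse_atom_entry(ET.fromstring(SAMPLE).find(ATOM + "entry")))
```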

YouTube feed validation

If a valid XML document has <feed> as its document element then it is an Atom feed. If, in addition, the <feed> element has the xmlns:yt attribute set to http://www.youtube.com/xml/schemas/ then it is a YouTube feed.
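Taken together, the detection rules for the four feed types can be sketched as below. The sketch reads the namespace declarations on the document element, because ElementTree does not expose xmlns attributes as ordinary attributes. The function names are illustrative, and the YouTube namespace is matched exactly as written in this document.

```python
import io
import xml.etree.ElementTree as ET

MRSS_NAMESPACES = {
    "http://search.yahoo.com/mrss",
    "http://search.yahoo.com/mrss/",
    "http://www.rssboard.org/media-rss",
    "http://www.rssboard.org/media-rss/",
}
YOUTUBE_NAMESPACE = "http://www.youtube.com/xml/schemas/"


def root_and_namespaces(xml_text: str):
    """Return the document element's tag and its prefix -> URI declarations."""
    namespaces = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            namespaces[prefix] = uri
        else:
            # The first "start" event is the document element; its namespace
            # declarations have all been reported before this point.
            return payload.tag, namespaces
    raise ValueError("document has no root element")


def detect_feed_type(xml_text: str):
    """Return "rss", "mrss", "atom", "youtube", or None for other documents."""
    tag, namespaces = root_and_namespaces(xml_text)
    name = tag.rsplit("}", 1)[-1]  # drop a "{namespace}" prefix if present
    if name == "rss":
        return "mrss" if namespaces.get("media") in MRSS_NAMESPACES else "rss"
    if name == "feed":
        return "youtube" if namespaces.get("yt") == YOUTUBE_NAMESPACE else "atom"
    return None


print(detect_feed_type('<rss xmlns:media="http://search.yahoo.com/mrss/"><channel/></rss>'))
```

A document that is not well-formed XML raises xml.etree.ElementTree.ParseError, which corresponds to failing the general validation step.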

Item validation for YouTube feeds

In order to correctly identify the information inside a YouTube feed item we look for the following child elements of <entry>:

Info         Contained in
Title        <title>
Description  <media:description> child of <media:group>
Link         "href" attribute of <link>
Image Url    "url" attribute of <media:thumbnail>; <media:thumbnail> is a child of <media:group>

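The YouTube entry rules can be sketched as follows. The namespace URIs bound to the Atom and media prefixes are assumptions (this document does not state them), parse_youtube_entry is a hypothetical helper, and the sample entry is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace bindings for the default (Atom) and media prefixes.
ATOM = "{http://www.w3.org/2005/Atom}"
MEDIA = "{http://search.yahoo.com/mrss/}"


def text_of(element):
    if element is None or element.text is None:
        return None
    return element.text.strip() or None


def parse_youtube_entry(entry):
    group = entry.find(MEDIA + "group")
    link = entry.find(ATOM + "link")
    thumbnail = group.find(MEDIA + "thumbnail") if group is not None else None
    description = group.find(MEDIA + "description") if group is not None else None
    return {
        "title": text_of(entry.find(ATOM + "title")),
        "description": text_of(description),
        "link": link.get("href") if link is not None else None,
        "image_url": thumbnail.get("url") if thumbnail is not None else None,
    }


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:yt="http://www.youtube.com/xml/schemas/"
      xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <title>Sample video</title>
    <link rel="alternate" href="https://example.org/watch/sample"/>
    <media:group>
      <media:description>About the video</media:description>
      <media:thumbnail url="https://example.org/thumb.jpg"/>
    </media:group>
  </entry>
</feed>"""

print(parse_youtube_entry(ET.fromstring(SAMPLE).find(ATOM + "entry")))
```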
 
