SEO strategist Kyle Sutton of ClickSeed noted today on Google+ that Google's guide to markup properties for article rich snippets* on the Google Developers site has been considerably expanded, and that this revision was doubtlessly "in preparation for the launch of AMP [Google's Accelerated Mobile Pages project] next year."
He goes on to note that the Google Structured Data Testing Tool now contains a filter for "AMP Articles" that returns warnings and errors for non-compliant markup.
The number of
Article properties described in the guidelines has increased from six to seventeen. Only three properties were previously listed as required, whereas now the markup specifications list fourteen required properties, along with one recommended property.
A more detailed examination of these changes provides us, I think, with some insights on the direction Google is taking in their display, distribution and handling of articles moving forward, and very much substantiates Kyle's proposition that these changes have been made in support of the AMP initiative.
mainEntityOfPage added as a recommended property
Google now recommends declaring the canonical URL of the article for the
mainEntityOfPage property "when the article is the primary topic of the article page" (which may not be the most edifying of annotations).
mainEntityOfPage usage is complicated enough that the schema.org site provides a background note on this specific property, where it says "the
mainEntityOfPage property serves more to clarify which of several entities is the main one for that page." I'm thankfully spared the embarrassment of trying to adequately explain how
mainEntityOfPage works because my colleague Jarno van Driel has already written an excellent article that answers everything you ever wanted to know about
mainEntityOfPage but were afraid to ask.
A couple of brief notes, though, in regard to
mainEntityOfPage as it pertains specifically to article rich snippets and AMP.
First, the fact that Google recommends using this for articles suggests that the entity it really cares about when you feed its bots an article is, well, the article. That is, the focus from a structured data consumption point of view is on the entity being consumed (the article) rather than what it talks about (the topic of the article). This in turn suggests that the benefit Google hopes to gain from
Article markup, in general, is the ability to better manipulate articles in the Google ecosystem due to an improved understanding of an article's components and provenance, rather than using the markup to better understand what an article is about.
Second, the syntax Google recommends employing here is likely a little obscure for many webmasters, and itself varies from the syntax employed in the AMP project documentation. The JSON-LD example on the Google Developers site uses
@type and @id, JSON-LD keywords used to set the data type and uniquely identify the thing being described, respectively. The AMP guidelines on metadata don't mention
mainEntityOfPage, but the current JSON-LD example on the AMP GitHub repository does use it, where a simple URL declaration is employed.
All of this to say that, should one prefer to use it, the simpler syntax is both supported by the AMP example and passes muster in the Google Structured Data Testing Tool.
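To make the two syntaxes concrete, here is a minimal sketch in Python that builds both forms as plain dictionaries and serializes one of them as JSON-LD. The `WebPage` type, the example.com URL and the helper name `main_entity_url` are illustrative assumptions, not something taken from either set of documentation.

```python
import json

ARTICLE_URL = "https://example.com/my-article"  # hypothetical canonical URL

# Nested form: @type sets the data type, @id uniquely identifies the page.
# "WebPage" is an assumed type chosen for illustration.
nested_form = {
    "@context": "http://schema.org",
    "@type": "Article",
    "mainEntityOfPage": {"@type": "WebPage", "@id": ARTICLE_URL},
}

# Simple form: a bare URL declaration.
simple_form = {
    "@context": "http://schema.org",
    "@type": "Article",
    "mainEntityOfPage": ARTICLE_URL,
}


def main_entity_url(article):
    """Return the page URL from either mainEntityOfPage syntax (hypothetical helper)."""
    value = article.get("mainEntityOfPage")
    if isinstance(value, dict):
        return value.get("@id")
    return value


print(json.dumps(simple_form, indent=2))
```

Both dictionaries point at the same page; the only difference is whether the URL is wrapped in a typed node or declared directly.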
headline now has a character count
An annotation has been added to the usage guidelines for the
headline property saying that "Headlines should not exceed 110 characters."
Organic search marketers are well acquainted with the 65-ish characters displayed in the linked title portion of a Google search snippet, which this considerably exceeds. So it seems likely this is related to the display of AMP articles.
In the post announcing the AMP project the news carousel pictured shows headlines of moderate length displayed without truncation, but it remains to be seen if that will be the case for articles that use the full 110-character allotment.
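A publisher who wants to catch over-long headlines before they reach the testing tool could run a check along these lines. This is only a sketch: the function name is hypothetical, and it assumes that counting Python string characters is close enough to however Google counts them, which the guidelines don't specify.

```python
MAX_HEADLINE_CHARS = 110  # limit quoted in the revised guidelines


def headline_within_limit(headline, limit=MAX_HEADLINE_CHARS):
    """Return True when the headline does not exceed the character limit."""
    return len(headline) <= limit


print(headline_within_limit("Google expands its article markup guidelines"))
```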
image information is now required
The previous guidelines for the use of the
image property required only an image URL, and the example listed an array of image URLs – consistent with the guideline that the data declared should be "[a] URL, or list of URLs pointing to the representative image file(s)."
The revised guidelines now make reference to a single
image declaration, which should be "[t]he representative image of the article… that directly belongs to the article." Google, apparently, does not want to have to guess which of many images to use when displaying the article. Nor does it want to have to upsample, with the prior minimum image dimensions of "160×90 pixels" now replaced with the directive that "[i]mages should be at least 696 pixels wide."
The additional new requirement that image height and width now be declared (requiring the use of a nested
ImageObject, which means that a simple URL declaration for
image is no longer supported for article rich snippets) points squarely at support for AMP.
In brief, many AMP elements are concerned with the layout of an article, and explicitly declaring image dimensions gives Google the ability to render images associated with AMP articles correctly on different devices (one of the potential optimizations of AMP is to "[r]eplace image references with images sized to the viewer’s viewport"). The requirement to declare image dimensions is also consistent with the
amp-img requirement that the tag "[m]ust include an explicit width and height."
The url declaration in the new example is an absolute URL rather than the relative URL used in the prior example. This is consistent with the broad AMP markup directive – found in a comment in some AMP example code – that "[a]ll marked-up URLs should be absolute."
In aggregate Google doesn't want to have to parse code in order to calculate image dimensions and absolute image URLs: it now wants image data upfront.
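The image requirements described above (a nested ImageObject, declared height and width, a width of at least 696 pixels, an absolute URL) can be expressed as a small pre-flight check. The sketch below is an illustration under assumptions: the image URL, the dimensions and the helper name `image_problems` are made up, and the real validation rules live in Google's testing tool, not here.

```python
from urllib.parse import urlparse

MIN_IMAGE_WIDTH = 696  # minimum width quoted in the revised guidelines

# Hypothetical image declaration in the nested ImageObject form.
image = {
    "@type": "ImageObject",
    "url": "https://example.com/images/lead.jpg",
    "height": 800,
    "width": 1200,
}


def image_problems(image):
    """Return a list of human-readable problems found in an image declaration."""
    if not isinstance(image, dict) or image.get("@type") != "ImageObject":
        return ["image should be a nested ImageObject, not a bare URL"]
    problems = []
    parsed = urlparse(str(image.get("url") or ""))
    if not (parsed.scheme and parsed.netloc):
        problems.append("url should be absolute")
    for key in ("height", "width"):
        if not isinstance(image.get(key), int):
            problems.append(f"{key} should be declared in pixels")
    width = image.get("width")
    if isinstance(width, int) and width < MIN_IMAGE_WIDTH:
        problems.append(f"width should be at least {MIN_IMAGE_WIDTH} pixels")
    return problems


print(image_problems(image))
```

Running the same function on an old-style declaration, such as a bare relative URL, reports that a nested ImageObject is expected.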
publisher data, including
logo, now required
The publisher property wasn't previously required, and in fact wasn't referenced at all in the prior version of the specifications or examples for article rich snippets. Without recourse to speculation,
publisher data obviously makes it easier for Google to support a diversity of sources in news carousels and to correctly identify the publication source of AMP articles in various Google environments (I'm a little surprised a
sameAs declaration didn't find its way into the requirement).
The requirement that publisher
logo dimensions be declared is a variation on the
image requirements discussed above. The requirement for a logo itself can (along with the specifics of the size and aspect ratio requirements), again, be readily linked to the way in which AMP articles are displayed in Google's examples to date.
The footprint of the logo container in the Google AMP demo SERP screenshot displayed above is 600×60 px, exactly the size recommended in the revised article rich snippet documentation.
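As a rough illustration, publisher markup with a logo that fits the 600×60 px footprint might be assembled and checked as follows. The `Organization` type, the `name` property, the example.com URL and the helper `logo_fits` are assumptions made for the sake of the sketch, and treating 600×60 as a maximum bounding box is my reading, not a quoted rule.

```python
LOGO_WIDTH = 600   # recommended footprint width in pixels
LOGO_HEIGHT = 60   # recommended footprint height in pixels

# Hypothetical publisher declaration; type, name and URL are illustrative.
publisher = {
    "@type": "Organization",
    "name": "Example News",
    "logo": {
        "@type": "ImageObject",
        "url": "https://example.com/logo.png",
        "width": 600,
        "height": 60,
    },
}


def logo_fits(publisher, max_width=LOGO_WIDTH, max_height=LOGO_HEIGHT):
    """Return True when the logo declares dimensions that fit the footprint."""
    logo = publisher.get("logo")
    if not isinstance(logo, dict):
        return False  # a bare URL carries no dimensions to check
    width, height = logo.get("width"), logo.get("height")
    if not isinstance(width, int) or not isinstance(height, int):
        return False
    return width <= max_width and height <= max_height


print(logo_fits(publisher))
```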
Joining the already-required
datePublished property is the
dateModified property. Obviously this is good information for any search engine to have when it's pertinent, both for calculating the currency of article information and for generating visible article time-stamps and labels (like "updated").
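For illustration, the two date properties can be generated as ISO 8601 strings (the format schema.org expects for date values) and compared to decide whether an "updated" label would be warranted. The dates and the helper name `was_updated` are hypothetical, and how Google actually derives such labels is not documented in the guidelines.

```python
from datetime import datetime, timezone

# Hypothetical publication and revision times.
published = datetime(2015, 12, 1, 8, 0, tzinfo=timezone.utc)
modified = datetime(2015, 12, 2, 9, 30, tzinfo=timezone.utc)

dates = {
    "datePublished": published.isoformat(),
    "dateModified": modified.isoformat(),
}


def was_updated(markup):
    """Return True when dateModified is later than datePublished."""
    first = datetime.fromisoformat(markup["datePublished"])
    last = datetime.fromisoformat(markup["dateModified"])
    return last > first


print(dates["datePublished"], dates["dateModified"], was_updated(dates))
```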
author must be named
As with publisher,
author was not referenced at all in the previous version of the article rich snippets documentation, but is now a required property.
The relationship to AMP here is less obvious, but nonetheless the Structured Data Testing Tool generates an error for pages that lack the required author declaration. It's worth noting briefly that, AMP aside, explicit information about authors is extremely useful for search engines in coming to some sort of understanding of an author's topical expertise and authority (although, like the
publisher markup guidelines, there's no requirement for a
sameAs URI that could be used to disambiguate authors with the same name).
The expected type for
author in the guidelines is
Person, which at first blush is limiting, as many organizations – including news organizations – often publish articles that aren't ascribed to a specific author. However, the second expected type for author in schema.org,
Organization, also passes validation for AMP articles in the Structured Data Testing Tool when that type is used for author.
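A small sketch of the two author forms discussed here, a named Person and a named Organization, is shown below. The helper `make_author`, the `name` property and the sample names are assumptions for illustration; the set of accepted types reflects only what the testing tool was observed to pass, not a documented list.

```python
# Types observed to pass validation for author, per the discussion above.
ACCEPTED_AUTHOR_TYPES = {"Person", "Organization"}


def make_author(name, author_type="Person"):
    """Build an author node, rejecting types outside the accepted set."""
    if author_type not in ACCEPTED_AUTHOR_TYPES:
        raise ValueError(f"unsupported author type: {author_type}")
    return {"@type": author_type, "name": name}


# Hypothetical bylines: an individual writer and an unsigned staff piece.
individual = make_author("Jane Doe")
staff = make_author("Example News", author_type="Organization")
print(individual, staff)
```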
The AMP articles filter and the morphing definition of "article rich snippets"
While the updated documentation on Google Developers is still headed "Enabling Rich Snippets for Articles", not including newly-required
Article properties does not generate error messages in the Google Structured Data Testing Tool when the "Articles Rich Snippets" filter is selected. It does, however, when the newly-added "AMP Articles" filter is selected.
This suggests that whatever Google is calling these guidelines, these are their de facto code requirements for Accelerated Mobile Pages metadata.
I underline "their" because Google requirements aren't AMP requirements, even if Google's article rich snippets requirements are a proxy for Google's AMP page requirements.
Dan Scott noted in a comment on Kyle's post that "schema.org markup is only a recommendation of the AMP HTML spec". Both in that spec and in documentation on the main site, schema.org use is only recommended, and the requirements listed on the Google Developers site are absent. The site documentation is careful to point out (emphasis in the original):
For some platforms, this metadata is additional, for others it is a requirement, meaning they won’t show links to your content if you didn’t include the right metadata. Make sure you include the right metadata for the platforms you want your content to appear on.
Although AMP is a Google initiative, it's an open source project of which tech lead Malte Ubl says "[n]othing is set in stone." So it makes sense that Google would want to keep its metadata requirements separate from those of the likely-to-change AMP. And it also makes sense for AMP to keep its metadata requirements as lean as possible so that AMP is easier to integrate with other platforms.
Whether the AMP specifications will ever align with those on the Google Developers site remains to be seen, but all the evidence suggests that this revision of Google's article rich snippets documentation was undertaken in support of AMP.
Yes, these changes are specific to AMP, not for Article markup in general. We'll clarify that in the docs.
* I know that the Google name for these is "articles rich snippets", but I just can't bring myself to use that awkward construction, so I use the singular throughout except when quoting Google documentation.