Structured Data for Author Pages and Linked Snippets

by Aaron Bradley on September 24, 2013

in Semantic Web, SEO

Structured Data for Author Profile Pages and Linked Snippets

Author pages and the way in which articles link to them are important sources of information for search engines and other data consumers. Information about authors and the content they create is increasingly used by Google and Bing to help determine the topical expertise and stature of authors, and may be used by the search engines in the ranking of content or generation of rich snippets in search results.

While search engines do their best to understand authors and their publishing environment by crawling and indexing whatever documents they encounter, providing the search engines with structured data about authorship provides them with much more precise information about authors and their works.

"Structured data" is a broad term that encompasses various standards and encoding mechanisms but, at the end of the day, refers to information provided to data consumers specifically for machine consumption. This machine-readable code is closely allied to, but separate from, the presentation layer that we humans consume when we read a web page, providing a way for dumb machines to better understand the entities and connections between entities present in a piece of content.

Benefits of structured data for author pages and linked author snippets

While the purpose of this article is to provide practical examples of structured data use related to authorship rather than (at least to any great degree) to make a case for its use, there are some obvious benefits to providing search engines and other data consumers with structured data about authors and their works.

  • Entity disambiguation

    By providing data about an author and, especially, interlinking content and profile pages, structured data makes it much easier for search engines to know which author is being referred to when authors have identical or similar names – to know which "John Smith" wrote a given blog post, for example. This not only allows the search engines to provide better answers for queries about the author, but to properly link authors to their works and, from that, link authors to the topics about which they write – improving the chances that an author's work will appear in the results for relevant topical queries.

  • Rich snippets

    Structured data linking authors to their content and, especially, correctly verified author pages are used by search engines and other web services to show multiple pieces of information about an author in one brief citation, such as a snippet on a search engine results page. Rich snippets have been consistently correlated with higher-than-average click-through rates from search result pages.

  • Mapped entity relationships

    Information about an author like their job title, where they live and who they work for can be used by search engines to map relationships between different authors. This allows them to answer a broader range of queries more precisely ("reporter for salon in missouri") and gives them more data points with which to make topical inferences (for example, that Bob works at Samsung would support other signals that he knows something about electronics). The upshot is that an author may appear in a greater number of search results, more often.

And, of course, aside from the traffic that any one of these search benefits may provide, having greater visibility as an author in the SERPs is going to expose more people to your work, provide them with more information about you, and facilitate the type of communication with – and sharing and linking by – your readers that results in even more search engine visibility.

Structured data is only as valuable as the content of that data, and the existence of machine-readable information about authorship is no substitute for richly detailed author profile pages, helpful biographical snippets on article pages, and readily digestible bylines that your readers can understand and appreciate – all things that encourage readers to learn more about you and to share with you. For some pointers on making the most of the presentation layer, see the companion piece to this article:

Example code and files used in this article

I've provided example code and, where possible, validator output for the most commonly-used methods to provide structured data about authorship. These code examples all rest on two document templates.

There's an article template which simply links an article to an author profile page. The byline includes "Dr." and "PhD" in order to demonstrate how multiple properties about an author can be encoded on an article page, but a variant on this template is used to illustrate more common, straightforward links to an author profile page.

The microdata, RDFa Lite and microformat markup and JSON-LD code found below all declare the same item types, properties and values for the main example article page.

  • An article
    • that has an author
      • who has the name prefix
        • Dr.
      • and the name
        • Samuel Jones
      • and the name suffix
        • PhD
      • that can be found at the URL
        • http://authors.airshock.com/author/samuel-jones-md.html

As you can see, the example embeds, or nests, properties. That is, in the words of schema.org, the value of an item property is itself "another item with its own set of properties." So the value of the article property author is not the name "Samuel Jones", but the schema.org type Person, which has for its property name the value "Samuel Jones".

The "brief" version simply omits the name prefix and suffix.

There's also an author profile page that displays an author's name, picture, biography and linked email address.

The microdata, RDFa Lite and microformat markup and JSON-LD code found below all declare the same item types, properties and values for the example author profile page.

  • A person
    • who has the name prefix
      • Dr.
    • and the name
      • Samuel Jones
    • and the name suffix
      • PhD
    • and the image URL
      • http://authors.airshock.com/images/samuel-jones.jpg
    • and the job title
      • Director of Authorship Studies
    • and works for
      • an educational organization
        • that has the name
          • Woodlands University
    • and has an address
      • that is a postal address
        • that has the town or city name
          • Briar Park
        • and the state or province name
          • Manitoba
    • and has the email address URL
      • mailto:samjones@woodlands.edu

I've tried to use the minimum of HTML possible for these examples, and so they're – to say the least – unassuming web pages.

Templates used for example code cited in this article

An index page linked to full HTML versions of all examples used in this post can be found here.

While I've made every effort to validate the code used in this piece, I welcome corrections and suggestions (keeping in mind there's often more than one "correct" way of using specific vocabularies or syntaxes): leave a comment or drop me a line directly.

Name property use in schema.org and microformats

Note that when used in conjunction with a person, both schema.org (schema.org/Person) and microformats (hCard) permit the use of a name property where the string used for the value is interpreted as "FirstName LastName" when two words are used, or the use of alternate properties to specify first and last names separately.

For example, in schema.org a first and last name can be used together as the value for the property name, or split between the properties givenName and familyName.
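A minimal microdata sketch of the two approaches might look like the following. These are illustrative fragments rather than part of the validated example files, and the wrapping p elements are only an assumption made to keep the fragments short.

```html
<!-- Option 1: the full name as a single value for "name" -->
<p itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Samuel Jones</span>
</p>

<!-- Option 2: first and last names declared separately -->
<p itemscope itemtype="http://schema.org/Person">
  <span itemprop="givenName">Samuel</span>
  <span itemprop="familyName">Jones</span>
</p>
```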

schema.org and data-vocabulary.org

schema.org and data-vocabulary.org are both vocabularies that can be used to mark up information about authors and the works they create (schema.org is, more precisely – among other things – a collection of schemas, but it's not incorrect to call it a vocabulary).

data-vocabulary.org is the predecessor to schema.org, and in many ways has been superseded by it. schema.org has a broader range of types and a greater number of properties than data-vocabulary.org, and is being actively developed while data-vocabulary.org is not (at time of writing, the data-vocabulary.org home page is dominated by a pointer to schema.org).

So while data-vocabulary.org had a rich set of definitions that could be used to describe people and organizations, I've not used any data-vocabulary examples here.

Both vocabularies consist of items (also referred to as types, or classes, or item types) which have one or more defined properties associated with them, and each property used is assigned a value. The schema.org type Person, for example, has properties that include jobTitle, for which a possible value is "CEO".

schema.org properties have expected types. For example, it's expected that the value of an image property will be a URL – as opposed, say, to a number or a text string.
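To make the idea of expected types a little more concrete, here is a small microdata sketch (illustrative only; the image path is borrowed from the example author profile page and the alt text is an assumption). In microdata, an itemprop on an img element takes its value from the src attribute, so image yields a URL, while an itemprop on a span takes the element's text, so jobTitle yields a text string.

```html
<div itemscope itemtype="http://schema.org/Person">
  <!-- jobTitle takes text: the value is the text content, "CEO" -->
  <span itemprop="jobTitle">CEO</span>
  <!-- image takes a URL: the value is read from the src attribute -->
  <img itemprop="image" src="/images/samuel-jones.jpg" alt="Samuel Jones">
</div>
```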

Further resources – schema.org

  • schema.org/Person
    The main schema.org item relevant to author information.
  • schema.org/author
    The main CreativeWork property relevant to author information, with an expected value of Person (see above) or Organization
  • schema.org FAQ
    The main article on schema.org found on Google Webmaster Tools.
  • Google Structured Data Testing Tool
    The tool provided by Google for testing schema.org (along with other types of structured data).

schema.org and microdata

Microdata is an HTML specification used to add semantic information to existing content on web pages. Along with RDFa, it is one of the two recommended mechanisms by which existing HTML can be marked up with schema.org information.

While schema.org documentation originally recommended microdata as the exclusive method of marking up web pages with schema.org, RDFa was soon after recognized as an equally valid syntax for use with schema.org.

Microdata employs the itemscope attribute to declare the item and the scope of the descendant elements that contain information about that element, the itemtype attribute to declare the schema.org item being described, and the itemprop attribute to declare a property of the item being described and to specify that property's value.

As noted above, microdata supports the nesting of one item type within another, as you'll see in the examples below. The schema.org site provides this concise explanation of embedding one item within another using microdata:

Sometimes the value of an item property can itself be another item with its own set of properties. For example, we can specify that the director of the movie is an item of type Person and the Person has the properties name and birthDate. To specify that the value of a property is another item, you begin a new itemscope immediately after the corresponding itemprop.
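The pattern described in that passage can be sketched roughly as follows. The movie title, director name and birth date are placeholder values invented for illustration; they are not taken from schema.org.

```html
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Example Movie</h1>
  <!-- itemprop="director" names the property; the itemscope on the same
       element makes the property's value a new, nested Person item -->
  <p itemprop="director" itemscope itemtype="http://schema.org/Person">
    Directed by <span itemprop="name">Jane Doe</span>
    (born <time itemprop="birthDate" datetime="1970-01-01">January 1, 1970</time>)
  </p>
</div>
```

The name inside the nested paragraph belongs to the Person, not to the Movie, because it sits inside the inner itemscope.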

Article page – schema.org / microdata

Article using schema.org with microdata

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [Microdata - Brief]</title>
  </head>
  <body itemscope itemtype="http://schema.org/Article">
    <div>
    <h1 itemprop="name">What's in a Name?</h1>
    <p>By <span itemprop="author" itemscope itemtype="http://schema.org/Person"><a href="/author/samuel-jones-md.html" itemprop="url"><span itemprop="name">Samuel Jones</span></a></span></p>
	<p>[Article body]</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Article using schema.org with microdata

Article using schema.org with microdata - Google Structured Data Testing Tool output

Article using schema.org with microdata (verbose author properties)

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [Microdata]</title>
  </head>
  <body itemscope itemtype="http://schema.org/Article">
    <div>
    <h1 itemprop="name">What's in a Name?</h1>
    <p>By <span itemprop="author" itemscope itemtype="http://schema.org/Person"><a href="/author/samuel-jones-md.html" itemprop="url"><span itemprop="honorificPrefix">Dr.</span> <span itemprop="name">Samuel Jones</span>, <span itemprop="honorificSuffix">PhD</span></a></span></p>
	<p>[Article body]</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Article using schema.org with microdata (verbose author properties)

Article using schema.org with microdata (verbose author properties) - Google Structured Data Testing Tool output

Author profile page – schema.org / microdata

Author profile page using schema.org with microdata

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Samuel Jones [Microdata]</title>
  </head>
  <body>
    <div itemscope itemtype="http://schema.org/Person">
      <h1><span itemprop="honorificPrefix">Dr.</span> <span itemprop="name">Samuel Jones</span>, <span itemprop="honorificSuffix">PhD</span></h1>
      <p><img src="/images/samuel-jones.jpg" itemprop="image"></p>
      <p>Samuel Jones is the author of the book <i>My Name is Sammy Jones</i>.<br>
	  He is <span itemprop="jobTitle">Director of Authorship Studies</span> at <span itemprop="worksFor" itemscope itemtype="http://schema.org/EducationalOrganization"><span itemprop="name">Woodlands University</span></span>.<br>
He lives in <span itemprop="address" itemscope itemtype="http://schema.org/PostalAddress"><span itemprop="addressLocality">Briar Park</span>, <span itemprop="addressRegion">Manitoba</span></span> with his wife Tammy.<br>
You can reach him <a itemprop="email" href="mailto:samjones@woodlands.edu">here</a>.</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Author profile page using schema.org with microdata

Author profile page using schema.org with microdata - Google Structured Data Testing Tool output

"worksFor" vs. "affiliation"

Astute observers may have noticed that the rich snippet preview for the microdata example above (and the RDFa Lite example below) does not display the author's place of work – "Woodlands University" – whereas the preview for the hCard example below does. What's going on?

For some reason Google does not consider the property worksFor to be a strong enough association to display the name specified by Organization (in this case the EducationalOrganization) alongside other information about a person.

However, the property affiliation does trigger the appearance of the organization in the rich snippet preview (oddly, as it should be evident that a person is affiliated with an organization if they work for that organization).

In my opinion Google has this inverted: it makes more sense to see the organization for which one works appear in a rich snippet than an organization for which one is merely affiliated. Indeed, the official schema.org description of affiliation makes it pretty clear that this could be applied to basically any organization with which you have any sort of relationship:

An organization that this person is affiliated with. For example, a school/university, a club, or a team.

I once belonged to an eight-ball pool league team called the Ferrule Cats. As much as I loved the Cats, I don't think the following rich snippet would be an appropriate one for me, even though this is what's generated if I supply the value "Ferrule Cats" for affiliation and my workplace, InfoMine, as the value for worksFor (nested in the items SportsTeam and Organization respectively).

Rich snippet preview for an author profile page using both affiliation and worksFor properties - Google Structured Data Testing Tool

Seriously? Seriously.

So (currently, at least) to ensure the organization you want to be associated with you appears in a rich snippet, ensure it is declared using affiliation, or declare affiliation alongside worksFor.

He is <span itemprop="jobTitle">Director of Authorship Studies</span> at <span itemprop="worksFor affiliation" itemscope itemtype="http://schema.org/EducationalOrganization"><span itemprop="name">Woodlands University</span></span>

This produces the desired snippet (as it does if similarly added to the appropriate RDFa property).

Author profile page using both worksFor and affiliation properties for schema.org/Person - Google Structured Data Testing Tool rich snippet preview

schema.org/ProfilePage

While schema.org does not describe the item ProfilePage except to say "Web page type: Profile page" one would think on the strength of its name that this would be a good wrapper for an author profile page.

However, declaring a web page with ProfilePage and using the about property to provide information about the page's subject does not produce a rich snippet with information about the described person's location, role or affiliation – as does simply using Person to declare a block of information about a person on a page.

While Google's Structured Data Testing Tool has no difficulty in extracting structured data about a Person when ProfilePage is used, the Tool's failure to generate a rich snippet from this item suggests that simply employing Person may be more beneficial.

Author profile page declared as schema.org/ProfilePage (fragment)

  <body itemscope itemtype="http://schema.org/ProfilePage">
    <div itemprop="about" itemscope itemtype="http://schema.org/Person">
      [...]
    </div>
  </body>

Google Structured Data Testing Tool output
Preview for code example above when <body itemscope itemtype="http://schema.org/ProfilePage"> is used

Google Structured Data Testing Tool preview for when ProfilePage is used

Further resources – Microdata

  • Microdata
    The chapter on microdata from Mark Pilgrim's classic work Dive Into HTML5 remains a definitive resource. While the examples in this chapter reference data-vocabulary.org, they have the additional benefit (in the context of this article) of chiefly using examples for the Person type.
  • An Uber-comparison of RDFa, Microdata and Microformats
    A great piece from Manu Sporny describing the difference between these three approaches to structured data markup. An equally important resource (obviously) for the sections on RDFa and microformats you'll find below.

schema.org and RDFa Lite

One of the reasons that schema.org documentation originally advocated the use of microdata was the perception that RDFa was too technically challenging for many webmasters.

RDFa Lite is a "minimal subset of RDFa" which, by reducing the number of attributes employed to a manageable minimum, makes it much easier to mark up documents with schema.org information in RDFa, while being fully upwards compatible with the full set of RDFa 1.1 attributes. schema.org gave RDFa 1.1 Lite its full blessing in November 2011.

For marking up HTML with schema.org the RDFa Lite attribute vocab declares the value "http://schema.org/", while the typeof attribute is used to declare the item being described (e.g. the value "Person" for the type schema.org/Person). Item properties are declared using the attribute property.

As with microdata, RDFa permits one item and its properties to be embedded within another by using a new typeof immediately after the corresponding property.

If RDFa Lite seems eerily similar to microdata this is because schema.org markup in RDFa Lite, in the words of schema.org, "looks almost isomorphic to the Microdata version."
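The near one-to-one correspondence is easiest to see side by side. This sketch (an illustrative fragment, not one of the validated example files) marks up the same name and job title first in microdata and then in RDFa Lite: itemscope plus itemtype corresponds to vocab plus typeof, and itemprop corresponds to property.

```html
<!-- Microdata -->
<p itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Samuel Jones</span>,
  <span itemprop="jobTitle">Director of Authorship Studies</span>
</p>

<!-- RDFa Lite -->
<p vocab="http://schema.org/" typeof="Person">
  <span property="name">Samuel Jones</span>,
  <span property="jobTitle">Director of Authorship Studies</span>
</p>
```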

Article page – schema.org / RDFa Lite

Article using schema.org with RDFa Lite

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [RDFa Lite - Brief]</title>
  </head>
  <body vocab="http://schema.org/" typeof="Article">
    <div>
      <h1>What's in a Name?</h1>
      <p>By <span property="author" typeof="Person"><a property="url" href="/author/samuel-jones-rl.html"><span property="name">Samuel Jones</span></a></span></p>
      <p>[Article body]</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Article using schema.org with RDFa Lite

Article using schema.org with RDFa Lite - Google Structured Data Testing Tool output

Article using schema.org with RDFa Lite (verbose author properties)

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [RDFa Lite]</title>
  </head>
  <body vocab="http://schema.org/" typeof="Article">
      <div>
      <h1>What's in a Name?</h1>
      <p>By <span property="author" typeof="Person"><a property="url" href="/author/samuel-jones-rl.html"><span property="honorificPrefix">Dr.</span> <span property="name">Samuel Jones</span>, <span property="honorificSuffix">PhD</span></a></span></p>
      <p>[Article body]</p>
    </div>
   </body>
</html>

Google Structured Data Testing Tool output
For code example above – Article using schema.org with RDFa Lite (verbose author properties)

Article using schema.org with RDFa Lite - Google Structured Data Testing Tool output (verbose author properties)

Author profile page – schema.org / RDFa Lite

Author profile page using schema.org with RDFa Lite

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Samuel Jones [RDFa Lite]</title>
  </head>
  <body>
    <div vocab="http://schema.org/" typeof="Person">
      <h1><span property="honorificPrefix">Dr.</span> <span property="name">Samuel Jones</span>, <span property="honorificSuffix">PhD</span></h1>
      <p><img property="image" src="/images/samuel-jones.jpg"></p>
      <p>Samuel Jones is the author of the book <i>My Name is Sammy Jones</i>.<br>
	  He is <span property="jobTitle">Director of Authorship Studies</span> at <span property="worksFor" typeof="EducationalOrganization"><span property="name">Woodlands University</span></span>.<br>
He lives in <span property="address" typeof="PostalAddress"><span property="addressLocality">Briar Park</span>, <span property="addressRegion">Manitoba</span></span> with his wife Tammy.<br>
You can reach him <a property="email" href="mailto:samjones@woodlands.edu">here</a>.</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Author profile page using schema.org with RDFa Lite

Author profile page using schema.org with RDFa Lite - Google Structured Data Testing Tool output

RDFa / Play visualization
For code example above and below – Author profile page using schema.org with RDFa

Author profile page using schema.org with RDFa or RDFa Lite - RDFa / Play visualization

Note that for legibility the above visualization does not show the parent of "Item 1", "Web Page".

First item - Web Page - shown in the RDFa / Play visualization

Using worksFor and affiliation in RDFa Lite

As with microdata, Google will not display the organization associated with a person in a rich snippet preview if it is declared using worksFor, but it will if it is declared using affiliation.

As with microdata, in RDFa Lite the same value can be assigned to two or more properties by listing them in the property declaration, separated by spaces.

He is <span property="jobTitle">Director of Authorship Studies</span> at <span property="worksFor affiliation" typeof="EducationalOrganization"><span property="name">Woodlands University</span></span>.

This produces a snippet preview that displays the value of Organization.

Author profile page using both worksFor and affiliation properties for schema.org/Person with RDFa Lite - Google Structured Data Testing Tool rich snippet preview

Further resources – RDFa Lite

  • RDFa Lite 1.1
    The official W3C recommendation.
  • RDFa / Play
    A great (beta) tool for visualizing the output of RDFa Lite (and RDFa) code on rdfa.info.
  • Rich Snippets – Organizations on Google Webmaster Tools
    With the exception of a single example on a documentation page, at time of writing schema.org does not show examples in RDFa (any flavor). And the Google Webmaster Tools RDFa example for people uses rdf.data-vocabulary.org marked up with RDFa 1.0 (full). However, Google is slowly populating its RDFa examples for rich snippets with schema.org marked up with RDFa Lite (as is the case with the markup for organizations linked here), and now explicitly (again, on this page about organizations) recommends using schema.org instead of data-vocabulary.org "as it offers a larger vocabulary and wider compatibility."

Author information in RDFa 1.0 and 1.1 (full)

While Google and other data consumers generally have little problem parsing RDFa, the majority of webmasters who want to use RDFa for schema.org will find RDFa Lite much easier than any full RDFa version, and perfectly adequate for the task at hand. As the RDFa Lite recommendation says, RDFa Lite works "for most day-to-day needs and can be learned by most Web authors in a day."

If you do encounter RDFa you want to use, improve upon or understand better, a couple of quick notes.

  • Be wary of Google Webmaster Tools examples that employ RDFa. Among other things, they tend to use a combination of rel and property attributes, which is fraught with peril.
  • In my testing, the Google Structured Data Testing Tool did not generate a rich snippet in the preview for RDFa, while it did for RDFa Lite employing the same properties. Caveat emptor.

Further resources – RDFa (full)

  • RDFa 1.1 Primer
    The primer from W3C, subtitled "Rich Structured Data Markup for Web Documents".
  • RDFa Core 1.1 – Second Edition
    The official W3C recommendation, subtitled "Syntax and processing rules for embedding RDF through attributes".

schema.org and JSON-LD

JSON-LD is a lightweight linked data format based on JSON (JavaScript Object Notation). Unlike microdata or RDFa, JSON-LD is not a method of marking up HTML with structured data, although JSON-LD "data islands" can be placed in HTML documents using the <script> tag.

JSON-LD was only added to the list of formats recommended for use with schema.org in June 2013, where it was noted that – in contrast to microdata and RDFa – there were "often cases when data is exchanged in pure JSON or as JSON within HTML."

Since these cases are not likely to be encountered by most webmasters interested in providing structured data about authorship (the JSON-LD examples you'll currently find on schema.org are restricted to the Action type), I'll limit myself to providing JSON-LD code for the example files already presented.

However, I am including them because there might be situations where JSON-LD provides an efficient way of facilitating the exchange of schema.org data, especially in pure JSON. Any JSON-LD implementation should be at pains to ensure that JSON-LD encoded data is identical to the equivalent data presented in HTML.

Note that the JSON-LD used in these examples is modeled on the examples used on schema.org and the examples provided in the Google Developers section devoted to adding schema.org markup to emails, and was successfully tested using the schema validator found on the Google Developers site. In other words, this is a very Google-centric view of JSON-LD, which may differ from other implementations of JSON-LD, and may not validate or display correctly when plugged into different JSON-LD tools.

Article page – schema.org / JSON-LD

Article using schema.org with JSON-LD

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <script type="application/ld+json">
    {
      "@context": "http://schema.org/",
      "@type": "Article",
      "name": "What's in a Name?",
      "author": {
        "@type": "Person",
        "url": "http://authors.airshock.com/author/samuel-jones-jsonld.html",
        "honorificPrefix": "Dr.",
        "name": "Samuel Jones",
        "honorificSuffix": "PhD"
      }
    }
    </script>
    <title>What's in a Name? [JSON-LD]</title>
  </head>
  <body>
    <div>
    <h1>What's in a Name?</h1>
    <p>By <a href="/author/samuel-jones-jsonld.html">Dr. Samuel Jones, PhD</a></p>
	<p>[Article body]</p>
    </div>
  </body>
</html>

Schema Validator (beta) output
For code example above – Article using schema.org with JSON-LD

Article using schema.org with JSON-LD - Schema Validator (beta) output

Author profile page – schema.org / JSON-LD

Author profile page using schema.org with JSON-LD

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <script type="application/ld+json">
    {
      "@context": "http://schema.org/",
      "@type": "Person",
      "image": "http://authors.airshock.com/images/samuel-jones.jpg",
      "honorificPrefix": "Dr.",
      "name": "Samuel Jones",
      "honorificSuffix": "PhD",
      "jobTitle": "Director of Authorship Studies",
      "worksFor": {
        "@type": "EducationalOrganization",
        "name": "Woodlands University"
      },
      "address": {
        "@type": "PostalAddress",
        "addressLocality": "Briar Park",
        "addressRegion": "Manitoba"
      },
      "email": "mailto:samjones@woodlands.edu"
    }
    </script>
    <title>Samuel Jones [JSON-LD]</title>
  </head>
  <body>
    <div>
    <h1>Dr. Samuel Jones, PhD</h1>
    <p><img src="/images/samuel-jones.jpg"></p>
    <p>Samuel Jones is the author of the book <i>My Name is Sammy Jones</i>.<br>
	He is Director of Authorship Studies at Woodlands University.<br>
	He lives in Briar Park, Manitoba with his wife Tammy.<br>
	You can reach him <a href="mailto:samjones@woodlands.edu">here</a>.</p>
    </div>
  </body>
</html>

Schema Validator (beta) output
For code example above – Author profile page using schema.org with JSON-LD

Author profile page using schema.org with JSON-LD - Schema Validator (beta) output

Further resources – JSON-LD

Microformats for authorship

Microformats predate schema.org and schema.org markup syntaxes. Like schema.org, microformats allow data to be embedded in HTML documents. Unlike schema.org, however, the syntax used for marking up information about items and properties for a given microformat is itself integral to the format (microformats use the HTML class attribute to declare properties and their values). These differences are deftly summarized on webdatacommons.org:

Microformats uses a set of well-known HTML constructs to add semantics to HTML elements. However, it is limited in its expressivity, as only a limited number of formats for well-defined use cases exist. On the other hand, RDFa and Microdata can use arbitrary vocabularies, and – together with their flexible data formats – are therefore able to express arbitrary data.

In a nutshell, microformats are not well suited to extensibility. Furthermore, while schema.org is actively being refined and extended, microformats are no longer being developed in any significant way (and such development would make little sense now because of schema.org).

Having said this, microformats are ubiquitous, and are perhaps the structured data markup format best understood by the search engines. And while I would not recommend marking up freshly-minted authorship information with the microformat used for providing information about people and organizations – hCard – it's anything but rare for a webmaster to encounter this markup, and it's likely to be around for many years to come. Accordingly I've provided the standard code samples marked up with hCard for reference, along with some other information about microformats relevant to authorship.

The difference between vCard and hCard

vCard is a file format for electronic business cards – an early standard that emerged in the mid-1990s. hCard is a microformat developed later that allowed vCard properties to be marked up in HTML.

Put another way, the vCard is the chunk of HTML, and hCard is the thing that chunk of HTML is marked up with to make it a vCard.

It's stupid confusing – so much so that Google (sensibly) has a disclaimer about the lingo employed on its microformats help page:

In the first line, class="vcard" indicates that the HTML enclosed in the <div> describes a Person. (The microformat used to describe people is called hCard and is referred to in HTML as vcard. This isn't a typo.)

Thankfully this is not the model followed by other types of microformats. If you were to mark up a review using hReview, for example, you'd be happy to see the class declared is hreview, not vreview.

The hCard required property fn

hCard has a required property: if an hCard is declared with class="vcard", either the fn (formatted name) property or the n (structured name) property is required.
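The rule above can be checked mechanically. What follows is only a minimal sketch in Python, using the standard library's html.parser: the names HCardNameChecker and has_name_property are my own inventions for illustration, and the code assumes reasonably well-formed markup (every non-void element is closed). It is not a full microformats parser.

```python
from html.parser import HTMLParser


class HCardNameChecker(HTMLParser):
    """Collects class names used on or inside elements with class="vcard"."""

    # Elements that never have a closing tag (hypothetical short list).
    VOID = {"img", "br", "hr", "meta", "link", "input"}

    def __init__(self):
        super().__init__()
        self.depth = 0      # 0 means "outside any vcard element"
        self.found = set()  # class names seen inside a vcard

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        if self.depth == 0 and "vcard" not in classes:
            return  # not inside an hCard, and this element does not start one
        self.found |= classes
        if tag not in self.VOID:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in self.VOID:
            self.depth -= 1


def has_name_property(html):
    """True if an hCard in the markup carries an fn or n property."""
    checker = HCardNameChecker()
    checker.feed(html)
    return bool(checker.found & {"fn", "n"})


sample = '<p class="vcard">By <span class="fn url"><a href="/author/samuel-jones-mf.html">Samuel Jones</a></span></p>'
print(has_name_property(sample))
```

A check like this might be useful when auditing legacy templates, but a validator such as the Structured Data Testing Tool remains the better judge of what a search engine actually extracts.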

Article page – microformats (hCard)

Article page with hCard

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [hCard - Brief]</title>
  </head>
  <body>
    <div>
    <h1>What's in a Name?</h1>
    <p class="vcard">By <span class="fn url"><a href="/author/samuel-jones-mf.html">Samuel Jones</a></span></p>
    <p>[Article body]</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Article page with hCard markup

The error message displayed at the bottom of the tool output is not problematic, as the markup is not so much about a person as about an article that links to a page that's about a person.

Article using hCard microformat - Google Structured Data Testing Tool output

Article page with hCard (verbose author properties)

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [hCard]</title>
  </head>
  <body>
    <div>
    <h1>What's in a Name?</h1>
    <p class="vcard">By <span class="fn n url"><a href="/author/samuel-jones-mf.html"><span class="honorific-prefix">Dr.</span> <span class="given-name">Samuel</span> <span class="family-name">Jones</span>, <span class="honorific-suffix">PhD</span></a></span></p>
    <p>[Article body]</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Article page with hCard markup (verbose author properties)

Article using hCard microformat - Google Structured Data Testing Tool output (verbose author properties)

Author profile page – microformats (hCard)

Author profile page with hCard

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Samuel Jones [hCard]</title>
  </head>
  <body>
    <div class="vcard">
      <h1 class="fn n"><span class="honorific-prefix">Dr.</span> <span class="given-name">Samuel</span> <span class="family-name">Jones</span>, <span class="honorific-suffix">PhD</span></h1>
      <p><img class="photo" src="/images/samuel-jones.jpg" alt="Samuel Jones"></p>
      <p>Samuel Jones is the author of the book <i>My Name is Sammy Jones</i>.<br>
      He is <span class="title">Director of Authorship Studies</span> at <span class="org">Woodlands University</span>.<br>
      He lives in <span class="adr"><span class="locality">Briar Park</span>, <span class="region">Manitoba</span></span> with his wife Tammy.<br>
      You can reach him <a class="email" href="mailto:samjones@woodlands.edu">here</a>.</p>
    </div>
  </body>
</html>

Google Structured Data Testing Tool output
For code example above – Author profile page with hCard markup

Author profile page using hCard microformat (i.e. vCard) - Google Structured Data Testing Tool output
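To get a rough feel for what a data consumer might pull out of the profile page above, here is a toy extractor in Python. It is a sketch under several assumptions: the whole fragment is treated as a single hCard, only a handful of property names (my own selection) are collected, the markup is well-formed, and the names HCardExtractor and hcard_properties are hypothetical. Real microformats parsers handle many more cases.

```python
from html.parser import HTMLParser


class HCardExtractor(HTMLParser):
    """Gathers the text content of elements whose class names are hCard properties."""

    PROPS = {"fn", "given-name", "family-name", "title", "org", "locality", "region"}
    VOID = {"img", "br", "hr", "meta", "link", "input"}

    def __init__(self):
        super().__init__()
        self.open = []  # one entry per open element: the properties it declares
        self.card = {}  # property name -> accumulated text

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        self.open.append([c for c in classes if c in self.PROPS])

    def handle_endtag(self, tag):
        if tag not in self.VOID and self.open:
            self.open.pop()

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for props in self.open:
            for prop in props:
                self.card[prop] = self.card.get(prop, "") + data


def hcard_properties(html):
    extractor = HCardExtractor()
    extractor.feed(html)
    # Collapse runs of whitespace left over from the HTML layout.
    return {k: " ".join(v.split()) for k, v in extractor.card.items()}


profile = (
    '<div class="vcard">'
    '<h1 class="fn n"><span class="honorific-prefix">Dr.</span> '
    '<span class="given-name">Samuel</span> <span class="family-name">Jones</span>, '
    '<span class="honorific-suffix">PhD</span></h1>'
    '<p>He is <span class="title">Director of Authorship Studies</span> at '
    '<span class="org">Woodlands University</span></p>'
    '</div>'
)
print(hcard_properties(profile))
```

Note how fn picks up the text of all of its child spans, which is why the verbose markup still yields a single readable name.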

Further resources – Microformats

  • hCard 1.0
    Information about hCard on the microformats wiki.

Structured linking for Google authorship

First introduced in 2011, Google authorship is a mechanism by which Google formally associates authors with the content they produce.

The primary benefits of Google authorship are twofold:

  • Rich snippets
    When an article by an author recognized by Google appears in the SERPs, it is accompanied by a picture of the author, a linked instance of their name, and a linked count of the number of people that have added the author to their Google+ circles. This rich snippet, in turn, provides the following benefits to the author referenced:
    • A higher click-through rate from the SERPs to the article page referenced (because the snippet stands out in the SERPs).
    • A mechanism – the linked author's name – by which searchers can explore other works by (and other pages related to) the author.
    • A mechanism – the linked circle count – by which searchers can visit the author's Google+ profile page.
  • Authorship metrics
    Author statistics (clicks and impressions) are provided in Google Webmaster Tools (for sites in Webmaster Tools where the author is able to log in with the account associated with their Google+ profile).

More speculatively, it is thought that by being able to disambiguate individual authors and associate them with their works, Google is able to use this information in the display and ranking of those author-associated pages in the search results. While it makes a great deal of sense that Google should use authorship to help inform their understanding of an individual's authority and that individual's areas of topical expertise, "Author Rank" nonetheless remains speculative at this time.

There are three hard requirements for Google authorship: a Google+ profile, "a good, recognizable headshot" used for the profile photo, and a method of linking your content to your Google+ profile.

The easiest way of linking content to a Google+ profile is to verify an email address with the same domain as the site hosting your content.

The alternate method is by structured linking of your content, which is necessary when you don't have an email address on the same domain as your content – and, obviously, is the method relevant in the context of this post.

Google authorship – ?rel=author

The currently-promoted method of linking content to a Google+ profile page has two components: first, a hyperlink from the author's content to their Google+ profile, with ?rel=author appended to the profile URL; and second, a reciprocal link from the author's Google+ profile to each site (domain) on which that linked content appears (in the "Contributor To" section of the author's Google+ profile).

The anchor used for the link to an author's Google+ profile is unimportant: all that is required is the link itself, with the ?rel=author parameter appended to it.

<a href="https://plus.google.com/100000000000000000000?rel=author">Google+</a>

Here's the code integrated into our example article page.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name? [Google Authorship - ?rel=author Google+ Link]</title>
  </head>
  <body>
    <div>
    <h1>What's in a Name?</h1>
    <p>By <a href="/author/samuel-jones-gl.html">Dr. Samuel Jones, PhD</a></p>
    <p>You can find Samuel on <a href="https://plus.google.com/100000000000000000000?rel=author">Google+</a>.</p>
    <p>[Article body]</p>
    </div>
  </body>
</html>

Running this through the Google Structured Data Testing Tool will not produce a Google authorship rich snippet in the preview, as there is no reciprocal link back to the content domain.

Article without a Contributor-To link pointing to the domain - Google Structured Data Testing Tool output

However, when a functioning Google+ URL is used (mine), and the code placed on a domain (www.seoskeptic.com) to which the author has linked from the "Contributor To" section of his Google+ profile, the authorship rich snippet is generated.

Article with a Contributor-To link pointing to the domain - Google Structured Data Testing Tool output
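If author links are generated by a template or script, the ?rel=author parameter described in this section can be appended (and checked for) programmatically. The following Python sketch uses only urllib.parse from the standard library; the function names with_rel_author and has_rel_author are hypothetical, and the profile URL is the same placeholder used in the examples above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_rel_author(profile_url):
    """Return profile_url with rel=author in its query string (added only once)."""
    parts = urlsplit(profile_url)
    # Keep any other query parameters, but drop an existing rel value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "rel"]
    query.append(("rel", "author"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def has_rel_author(url):
    """True if the URL's query string contains rel=author."""
    return ("rel", "author") in parse_qsl(urlsplit(url).query)


link = with_rel_author("https://plus.google.com/100000000000000000000")
print(link)
print(has_rel_author(link))
```

Bear in mind that this only handles the outbound half of the relationship: the reciprocal "Contributor To" link still has to be added by hand on the Google+ profile.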

Google authorship – rel="author" / rel="me" (deprecated)

An older method of establishing Google authorship uses two different values for the rel attribute, depending on the hyperlinks being encoded.

Using this method, an article is linked to the author's profile page on that domain using the rel="author" attribute, indicating the author found at the linked URL is the author of the article.

By <a rel="author" href="/author/samuel-jones.html">Dr. Samuel Jones, PhD</a>

The linked profile page then must include a link to the author's Google+ profile page, and that link must employ the rel="me" attribute, indicating that the person identified on the author profile page and Google+ profile page are one and the same.

You can reach him <a href="mailto:samjones@woodlands.edu">here</a> or on <a href="https://plus.google.com/100000000000000000000" rel="me">Google+</a>.

And, as with the ?rel=author parameter linking method, the author's Google+ profile must include a link back (in the "Links" section) to the domain on which the author's content is found.

While this is a deprecated method (?rel=author was presumably introduced as a simpler alternative), it has the benefit of providing information to data consumers about the relationship between an article and profile page on the same domain.

When a real Google+ profile URL with the appropriate reciprocal link is used, you can see how the Google Structured Data Testing Tool is able to extract information about the relationship between the article page and profile page.

Article using rel=author (in conjunction with a profile page using rel=me) - Google Structured Data Testing Tool output
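To illustrate how a data consumer might follow the rel="author" and rel="me" chain, here is a minimal Python sketch that collects link targets by rel value. The class RelLinkCollector and the function rel_links are hypothetical names of my own; the sketch also picks up link elements in the document head, and it makes no attempt to verify the reciprocal link on the Google+ side.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collects href values of <a> and <link> elements, keyed by each rel token."""

    def __init__(self):
        super().__init__()
        self.links = {}  # rel token -> list of hrefs

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, e.g. rel="author me".
        for token in (attrs.get("rel") or "").lower().split():
            self.links.setdefault(token, []).append(href)


def rel_links(html):
    collector = RelLinkCollector()
    collector.feed(html)
    return collector.links


article = 'By <a rel="author" href="/author/samuel-jones.html">Dr. Samuel Jones, PhD</a>'
profile = (
    'You can reach him <a href="mailto:samjones@woodlands.edu">here</a> or on '
    '<a href="https://plus.google.com/100000000000000000000" rel="me">Google+</a>.'
)

print(rel_links(article).get("author"))  # profile page(s) the article points to
print(rel_links(profile).get("me"))      # identity link(s) on the profile page
```

The first lookup finds the on-site profile page from the article, and the second finds the Google+ profile from that page, which mirrors the two hops described above.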

Google authorship – <link rel="author"> (deprecated)

Yes, there's yet another way of linking an author to a Google+ profile page – namely a <link> tag in the <head> of a document that points to the Google+ profile in question.

While it still works, the use of this method is not recommended – for the simple reason that Google has ceased to recommend it. However, you may still encounter it (and it does provide a method of last resort for webmasters who can only add data to the document <head>).

The syntax for this method is straightforward, but it obviously requires the ability to place code in the <head> section of an HTML document (which is why it may have been retired as a going concern).

<link rel="author" href="https://plus.google.com/100000000000000000000" />

Despite being a deprecated method of establishing authorship for Google, today a page linked in this fashion to a valid Google+ profile page (with a reciprocal link back to the domain on which the link is found) still generates a rich snippet in the Structured Data Testing Tool.

Article using link rel=author - Google Structured Data Testing Tool output

Dublin Core Metadata Initiative (DCMI)

The Dublin Core Metadata Initiative (DCMI) maintains a set of vocabulary terms used to describe resources and the relationships between them.

Because work on DCMI began in 1995, it was for a long time one of the best methods for marking up authorship information. It is still alive and well, and immensely useful in arenas where the focus is on information resources (especially libraries), but marking up an author profile page with DCMI isn't going to provide much additional value in a typical web authorship environment.

I'd be remiss, however, not to note that many publishing platforms still encode authorship information in RSS using DCMI, such as you'd find if you examined this site's RSS feed (only DCMI-specific code shown).

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:creator>Aaron Bradley</dc:creator>
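For reference, here is roughly how a consumer might read that author information back out of a feed. This is a Python sketch using the standard library's xml.etree.ElementTree; since only the DCMI-specific lines are shown above, the surrounding channel, item and title elements (and the function name item_creators) are assumptions added to make the example a complete document.

```python
import xml.etree.ElementTree as ET

# Namespace prefix map for Dublin Core elements, matching the xmlns:dc declaration.
DC = {"dc": "http://purl.org/dc/elements/1.1/"}

# Assumed feed structure: only the rss element and dc:creator come from the article.
feed = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Example post</title>
      <dc:creator>Aaron Bradley</dc:creator>
    </item>
  </channel>
</rss>"""


def item_creators(rss_xml):
    """Return the dc:creator value of every item in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    return [el.text for el in root.findall("./channel/item/dc:creator", DC)]


print(item_creators(feed))
```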

FOAF (Friend-of-a-Friend)

FOAF is an RDF/XML vocabulary facilitating the creation "of machine-readable pages describing people, the links between them and the things they create and do" (from the FOAF wiki).

As the name suggests, FOAF is especially well-suited to encoding authorship information in a way that machines can understand. A FOAF profile (an RDF file) is, however, of limited additional value to authors that have already marked up their author profile pages with structured data. And while it's perfectly possible to mark up HTML with FOAF properties (using, say, RDFa), it is an unlikely contemporary markup choice given the search engines' stated preference for schema.org.

Acknowledgments

For their ongoing help, support and patience as I navigate the (to me) murky waters of structured data, my sincere thanks to Dan Brickley, Manu Sporny, Gregg Kellogg, Martin Hepp and Kingsley Idehen.

A special shout-out to Stéphane Corlosquet and Niklas Lindström for their invaluable help with the RDFa sections of this article.

