JSON-LD, the Google Knowledge Graph and schema.org SEO

by Aaron Bradley on March 13, 2014

in Semantic Web, SEO

Finally got your head around using RDFa or microdata for marking up HTML documents with schema.org? Prepare to come to terms with a new protocol that will almost certainly become a vital item in the contemporary digital marketer's tool kit: JSON-LD.

Google today announced two types of information that will be integrated into their Knowledge Graph results.

Official tour date information will now start appearing in the Knowledge Graph, as will contact phone numbers for companies.

In both cases the route to getting into the Knowledge Graph is by employing existing schema.org item types on official websites: ContactPoint nested within Organization for contact phone numbers, and MusicEvent for band concert dates.
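To make that concrete, here is a rough sketch of what the two kinds of data might look like when expressed in JSON-LD. The type and property names (Organization, ContactPoint, MusicEvent, MusicGroup, Place, Offer, telephone, contactType, startDate and so on) are existing schema.org vocabulary, but the organization, band, venue, URLs, price and phone number are invented placeholders, and the choice of properties is an assumption rather than a statement of what Google requires. Check Google's help pages for the properties they actually expect.

```html
<!-- Contact phone number: a ContactPoint nested within an Organization.
     All values are hypothetical placeholders. -->
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Organization",
  "url": "http://www.example.com",
  "contactPoint": {
    "@type": "ContactPoint",
    "telephone": "+1-800-555-0199",
    "contactType": "customer service"
  }
}
</script>

<!-- Tour date: a MusicEvent. Band, venue, price and URLs are hypothetical. -->
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "MusicEvent",
  "name": "Example Band live in Vancouver",
  "startDate": "2014-06-01T20:00",
  "url": "http://www.example.com/tour/vancouver",
  "performer": {
    "@type": "MusicGroup",
    "name": "Example Band"
  },
  "location": {
    "@type": "Place",
    "name": "Example Hall",
    "address": "123 Example Street, Vancouver, BC"
  },
  "offers": {
    "@type": "Offer",
    "url": "http://www.example.com/tickets/vancouver",
    "price": "25.00",
    "priceCurrency": "CAD"
  }
}
</script>
```

Note that the explanatory comments sit outside the script elements: JSON itself has no comment syntax, so anything inside the script block has to be valid JSON.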

These are well-established and well-understood schema.org types. And the appearance of band tour dates in the Knowledge Graph has received extensive coverage in search engine marketing circles – although the business contact numbers have flown a bit lower under the radar, as only the help article referenced above has so far been published (thanks Manu Sporny for the heads-up on that).

The bigger news here is that this is the first time that Google has officially endorsed JSON-LD as a way of providing schema.org information.

It's big news because JSON-LD is a significantly different method of providing structured data to search engines (and other data consumers).

RDFa and microdata – the only two methods of adding schema.org to a website previously sanctioned by Google – are both markup syntaxes. That is, they rely on adding schema.org information directly to the HTML code already present on a page.

JSON-LD (JavaScript Object Notation – Linked Data), by contrast, is an alternative to using HTML markup. JSON-LD is a "JSON-based format to serialize Linked Data," meaning it relies on JSON to provide that same schema.org information to data consumers.

So while RDFa and microdata require HTML, JSON-LD can be provided as islands embedded in HTML, or used directly with data-based web services and in application environments.

Here, for example, is some HTML code containing schema.org authorship information marked up with microdata:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>What's in a Name?</title>
  </head>
  <body itemscope itemtype="http://schema.org/Article">
    <div>
    <h1 itemprop="name">What's in a Name?</h1>
    <p>By <span itemprop="author" itemscope itemtype="http://schema.org/Person"><a href="/author/samuel-jones-md.html" itemprop="url"><span itemprop="name">Samuel Jones</span></a></span></p>
    <p>A name is a terrible thing to waste.</p>
    </div>
  </body>
</html>

Here are those same data provided using JSON-LD:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <script type="application/ld+json">
    {
      "@context": "http://schema.org/",
      "@type": "Article",
      "name": "What's in a Name?",
      "author": {
        "@type": "Person",
        "url": "http://authors.airshock.com/samuel-jones-rl.html",
        "honorificPrefix": "Dr.",
        "name": "Samuel Jones",
        "honorificSuffix": "PhD"
      }
    }
    </script>
    <title>What's in a Name?</title>
  </head>
  <body>
    <div>
    <h1>What's in a Name?</h1>
    <p>By <a href="/author/samuel-jones-jsonld.html">Dr. Samuel Jones, PhD</a></p>
    <p>A name is a terrible thing to waste.</p>
    </div>
  </body>
</html>

As you can see, the JSON-LD – unlike the microdata – is entirely separate from the HTML code where the schema.org values are found, although at the end of the day the same property/value pairs are provided to Google with both protocols.

This represents both a challenge and an opportunity for SEO. The challenge is keeping the JSON-LD data in sync with what appears on the page, as it is important for the search engines that the data you're providing to them (via JSON-LD) is the same as the data you're providing to humans (via HTML).

It's an opportunity insofar as SEOs are freed from including structured data within HTML documents. It could conceivably be provided directly in JSON-LD without HTML, or as <script>-encoded islands within documents that might be difficult to mark up (such as AJAX-based web pages).

With this nascent JSON-LD support also come two new structured data markup testing tools, both of which accept, parse and provide feedback for JSON-LD code: one for musical events (the Events Markup Tester), and another for corporate contact information (the Corporate Contacts Markup Tester).

This is what the Events Markup Tester returns when the first block of example JSON-LD music event code from the Google Webmaster Tools help page on music events is run through it (complete with a helpful suggestion to include a ticket price):

Sample Output from the Google Event Markup Tester

For Google, this is a limited foray into JSON-LD support for schema.org. Aside from these two narrow categories of data, Google has not indicated that JSON-LD is a method of providing them with structured data that they'll respect. And in part, they are almost certainly using these initial integrations in order to test how well JSON-LD works in this context, and to what degree webmasters avail themselves of this (relatively recently developed) protocol.

However, it's a pretty clear sign that JSON-LD is going to loom larger on the SEO stage (it has already been embraced in a major way by the semantic web community, and especially among developers in that community).

And – as a concluding aside – the integration of music events and contact phone numbers in the Knowledge Graph demonstrates, again, how being an early adopter of structured data technologies can pay off.
