Who's Canonical Now, Revisited: Rel=Canonical Use in 2011

by Aaron Bradley on January 19, 2011

in SEO

Use of Rel Canonical by Major Websites - Updated

It's been just under two years since Google revealed support for the link rel="canonical" tag. In April of 2009, just a couple of months after the tag (or, to be precise, the "canonical" value for the rel attribute of the <link> tag) was announced, I surveyed 37 sites to see which of them were employing this code. Perhaps unsurprisingly after such a short time, I was only able to find two sites employing rel="canonical."

Fast forward to January 2011. Of those 37 sites, now 19 of them are employing rel="canonical" in some form or another. The adoption of rel="canonical" has been highest for blogs, online stores and news sites, but at least one or two sites in all categories I polled have started to employ rel="canonical." Perhaps more interesting than the overall results are some of the specific site implementations of rel="canonical" and Google's handling of them.
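A survey like this one can be partly automated. The following is a minimal sketch, not the method used for this post (the pages were checked by hand): it pulls the canonical URL out of a page's markup using Python's standard library. The names `CanonicalFinder` and `find_canonical` are hypothetical, and the sketch assumes the page's HTML has already been fetched into a string.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link> whose rel attribute includes "canonical"."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by default.
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the markup, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


page = '<head><link rel="canonical" href="http://www.seomoz.org/dp/new-about" /></head>'
print(find_canonical(page))  # http://www.seomoz.org/dp/new-about
```

A page with no canonical declared returns `None`, which corresponds to the "not used" entries in the lists that follow.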

SEO Industry Sites

http://www.seo.com/

April 2009 – not used.  January 2011 – employed site-wide.

http://www.seochat.com/

April 2009 – not used.  January 2011 – not used.

http://www.seobook.com/

April 2009 – not used.  January 2011 – not used.

http://www.seomoz.org/

April 2009 – not used.  January 2011 – used selectively. Used on the site home page and at least one other page.  See note below on SEOmoz.

http://www.davidnaylor.co.uk/

April 2009 – not used. January 2011 – not employed on site pages; not used on blog home page, but employed on blog post pages.

http://sphinn.com/

April 2009 – not used. January 2011 – not used.

http://searchenginewatch.com/

April 2009 – not used. January 2011 – not used.

http://www.mattcutts.com/blog/

April 2009 – not used. January 2011 – employed site-wide.

Search Engines and Search Engine Blogs

http://www.yahoo.com/

April 2009 – not used.  January 2011 – not used.

http://ysearchblog.com/

April 2009 – not used.  January 2011 – not used.

http://www.live.com/ – now compared to http://www.bing.com/

April 2009 – not used. January 2011 – not used.

http://blogs.msdn.com/livesearch/ – now compared to http://www.bing.com/community/site_blogs/b/search/default.aspx

April 2009 – not used. January 2011 – not used.

http://www.google.com/

April 2009 – not used. January 2011 – not used (in any search capacity).

http://googleblog.blogspot.com/

April 2009 – not used. January 2011 – employed site-wide (including labels).

http://googlewebmastercentral.blogspot.com/

April 2009 – not used. January 2011 – employed site-wide (including labels).

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=35291

April 2009 – employed.  January 2011 – employed. Appears to be employed on all /support/ pages, but not necessarily all upper-level google.com pages.

Ecommerce Sites

http://www.shopping.com/

April 2009 – not used.  January 2011 – not used.

http://www.shopzilla.com/

April 2009 – not used.  January 2011 – not used.

http://shopping.yahoo.com/

April 2009 – not used.  January 2011 – not used.

http://www.overstock.com/

April 2009 – not used. January 2011 – employed on all category and product pages; not used on home page and other select upper-level or non-shopping cart pages.

http://www.bizrate.com/

April 2009 – not used. January 2011 – not used.

http://www.amazon.com/

April 2009 – not used. January 2011 – employed on all product pages, but not on category or other pages.

Blogs and Social Media Sites

http://en.wikipedia.org/

April 2009 – not used. January 2011 – not used.

http://technorati.com/

April 2009 – observed on home page. January 2011 – not used.

http://www.myspace.com/

April 2009 – not used. January 2011 – with the exception of some upper-level pages, used site-wide.

http://www.facebook.com/

April 2009 – not used. January 2011 – employed on profile pages when user is not logged in. See note below on Facebook.

http://wordpress.org/

April 2009 – not used. January 2011 – employed on a couple of site pages, but not site-wide; employed on blog post pages, but not category pages.

http://dailykos.com / http://www.dailykos.com/

April 2009 – not used. January 2011 – not employed on home page (which still lacks a canonical domain). Employed on blog post pages, but not tag or other pages. See note below on Daily Kos.

Poker Sites
I had added this category because online gaming sites are supposed to care so much about SEO.

http://poker.bodoglife.com/ – now compared to http://poker.bodog.com

April 2009 – not used. January 2011 – not used.

http://www.pokerstars.com/

April 2009 – not used. January 2011 – employed on the home page, but not observed on other pages.

http://www.partypoker.com/

April 2009 – not used. January 2011 – employed on the home page, but not observed on other pages.

Media and Content Sites

http://www.latimes.com/

April 2009 – not used. January 2011 – used on home page and most (but not all) upper-level pages. Used on most (but not all) individual news stories; not used on tag-like topic pages.

http://www.bbc.co.uk/

April 2009 – not used. January 2011 – used on home page and most news category and story pages (on the www subdomain).

http://www.theglobeandmail.com/

April 2009 – not used. January 2011 – not used.

http://www.suite101.com/

April 2009 – not used. January 2011 – used on article and category pages, but not on the home page or author profile pages.

http://www.about.com/

April 2009 – not used. January 2011 – not used.

(I hadn't checked eHow in 2009, but it now appears they're using the tag site-wide.)

http://www.nytimes.com/

April 2009 – not used. January 2011 – not used on home page, section pages or topic pages, but employed for news stories. Used on blogs.nytimes.com post pages.

The SEOmoz "About" Page

On SEOmoz, the on-site navigation links to this "About" page:
http://www.seomoz.org/about
That page contains the following code:
<link rel="canonical" href="http://www.seomoz.org/dp/new-about" />

So the canonical URL is, of course, different from the link target in the primary navigation. As I've noticed before in similar situations, Google favors the specified canonical page over the other version, despite the weight provided by strong internal linking. Not only does the canonical page rank for likely keyword queries, but the URL linked from the on-site navigation cannot be found in Google's index. This extends to all SEOmoz "about" pages with a canonical URL that differs from the on-site URL link target, such as the "team" or "contact" pages. I doubt that this has any material impact on the performance of these pages, though one could argue that this is not ideal for controlling link flow, and that it unnecessarily introduces duplicate content issues. I have, however, seen situations where specifying the wrong canonical on a page results in the page housing the mismatched canonical tag disappearing from the SERPs altogether. Interestingly, at the time of writing Bing was either returning the "/dp/" version of about pages or a mix of canonical and non-canonical pages.

(For posterity, and for normalization of personalized/geolocated results, here's a screenshot of the three queries referenced above.)

Use of Rel="Canonical" on Facebook Pages

If you search for a brand with a Facebook page on Google, you'll usually see something like this result returned for a "starbucks facebook" query:
www.facebook.com/Starbucks
That page contains the following code:
<link rel="canonical" href="http://www.facebook.com/Starbucks?v=app_338375791266" />

Clearly Google isn't being overly swayed by the canonical URL, but neither is it ignoring it. It is returning the URL in a sitelink, as well as in a separate snippet in the stack (which actually makes three Google links to the actual page content from this query when you include the first result).

Canonical URL for Starbucks on Facebook


Obviously Facebook is trying to return the correct view of a brand page for the search engines, though this might be a case (on Facebook's part) of conflating canonicalization with user experience, since what users see when they arrive at the main Starbucks Facebook URL has nothing to do with the canonical URL. Sometimes for brands (as for individual user profiles) there is no such variance between the "clean" version of a URL and the parameterized canonical; in cases where the default view is a page's wall – as with Starbucks' Frappuccino – the specified canonical is the "clean" version of the URL, as the wall is presumably the default view.
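To make the distinction between a "clean" and a parameterized canonical concrete, here is a small sketch using Python's standard `urllib.parse`. The function name `extra_canonical_params` is hypothetical, and the rule it applies (same host and path, extra query parameters on the canonical) is my own simplification rather than anything Facebook or Google has documented.

```python
from urllib.parse import parse_qsl, urlsplit


def extra_canonical_params(page_url, canonical_url):
    """Return the query parameters the canonical adds to (or changes on) the page URL.

    Returns None when host or path differ, i.e. when the canonical points
    at a different page rather than a parameterized view of the same one.
    """
    page, canon = urlsplit(page_url), urlsplit(canonical_url)
    if (page.netloc.lower(), page.path) != (canon.netloc.lower(), canon.path):
        return None
    page_params = dict(parse_qsl(page.query))
    return {k: v for k, v in parse_qsl(canon.query) if page_params.get(k) != v}


print(extra_canonical_params(
    "http://www.facebook.com/Starbucks",
    "http://www.facebook.com/Starbucks?v=app_338375791266",
))  # {'v': 'app_338375791266'}
```

An empty dictionary means the canonical is the "clean" URL itself, as in the wall-as-default case described above.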

The Strange Case of Daily Kos

Daily Kos has two primary URLs for each of its posts. The first is the post as linked by "Permalink" anchors, sidebars and (301 redirected) RSS feed URLs, such as:
http://www.dailykos.com/storyonly/2011/1/18/937062/-Midday-open-thread
They have a separate URL for this content with the addition of comments the post has received, linked as "Discuss" from their home page, and "View Comments" from the main story page:
http://www.dailykos.com/story/2011/1/18/937062/-Midday-open-thread
Both carry this code:
<link rel="canonical" href="http://www.dailykos.com/storyonly/2011/1/18/937062/-Midday-open-thread" />

Here's the curiosity. Google seemingly ignores the canonical (/storyonly/) version, and indexes only the version with comments (/story/). The canonical versions are not to be found in Google's index, at least when using the site: command.
site:www.dailykos.com/storyonly/ – 0 results
site:www.dailykos.com/story/ – 282,000 results
Bing does not show the same bias:
site:www.dailykos.com/storyonly/ – 299,000 results
site:www.dailykos.com/story/ – 141,000 results

Is this a case of Google saying "thanks, but no thanks" to the "strong hint" offered them by Daily Kos? The content doesn't seem blocked by a robots.txt disallow or page-level meta robot control (and offering Google a canonical URL that they're not supposed to index doesn't make sense anyway), and while I don't know what (if any) URLs are being served up by XML sitemaps, sitemaps are seemingly used for URL discovery rather than preferential indexing. Without speculating further about Google's handling of Daily Kos' content (and morass of duplicate content – there's a /story/ version, and multiple mobile versions to boot), suffice it to note that Google is capable of ignoring a URL that is the specified canonical and strongly linked.
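The Daily Kos arrangement described above, two URLs declaring one canonical, is easy to spot when auditing a site if pages are grouped by the canonical they declare. The following is an illustrative sketch only: `group_by_canonical` is a hypothetical helper, and the input mapping is assumed to have been collected already (for example, by reading each page's source).

```python
from collections import defaultdict


def group_by_canonical(declared):
    """Group page URLs by the canonical URL each one declares.

    `declared` maps page URL -> canonical href (or None if the page has none).
    Pages without a canonical are grouped under their own URL.
    """
    groups = defaultdict(list)
    for page_url, canonical in declared.items():
        groups[canonical or page_url].append(page_url)
    return dict(groups)


storyonly = "http://www.dailykos.com/storyonly/2011/1/18/937062/-Midday-open-thread"
story = "http://www.dailykos.com/story/2011/1/18/937062/-Midday-open-thread"

clusters = group_by_canonical({storyonly: storyonly, story: storyonly})
for canonical, pages in clusters.items():
    print(canonical, "<-", len(pages), "page(s)")
```

Each key is the URL the site is asking search engines to index; comparing those keys against what a site: query actually returns is how the discrepancy noted above shows up.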

Use of Rel="Canonical" by Ecommerce Sites

Since Google in their original announcement used an ecommerce product details page as their example, and as it certainly seems as though ecommerce sites – with a URL environment typically muddied by both internally- and externally-spawned parameters added to pages – stand to benefit from the tag, I thought I'd take a look at rel="canonical" adoption by some major online stores.

Site            Home Page   Category Pages   Product Pages
1-800-Flowers   Yes         Yes              Yes
Amazon          No          No               Yes
Apple Store     Yes         Yes              Yes
Best Buy        Yes         No               No
Dell            Yes         Yes              Yes
eBay            Yes         Yes              Yes
JCPenney        No          No               No
Kohl's          No          No               No
Overstock       No          Yes              Yes
Sears           Yes         Yes              Yes
Target          No          No               No
Walmart         Yes         Yes              Yes
Zappos          Yes         Yes              Yes

One can observe the employment of rel="canonical" most prominently at the level where one would think it would have the most impact – individual product detail pages (9 of 13 sites). This is analogous to its observed employment on sites with an integrated blog, where upper-level pages don't necessarily carry the tag, but individual blog post pages do. Does this make for superior ecommerce SEO? I would only point out many of the sites on the list above that are employing link rel="canonical" don't seem to have many problems achieving visibility in the SERPs (though I make no claim of a direct causal relationship; it would seem as though those online stores already adept at SEO would be the most likely to adopt a Google-approved method of describing canonical URLs for any given page).
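For anyone wanting to check the tally, here is a short sketch that counts adoption per page level from the table above. The data is transcribed from the table; the names `ROWS` and `tally` are just illustrative.

```python
# (site, home page, category pages, product pages), transcribed from the table above
ROWS = [
    ("1-800-Flowers", True, True, True),
    ("Amazon", False, False, True),
    ("Apple Store", True, True, True),
    ("Best Buy", True, False, False),
    ("Dell", True, True, True),
    ("eBay", True, True, True),
    ("JCPenney", False, False, False),
    ("Kohl's", False, False, False),
    ("Overstock", False, True, True),
    ("Sears", True, True, True),
    ("Target", False, False, False),
    ("Walmart", True, True, True),
    ("Zappos", True, True, True),
]


def tally(rows):
    """Count how many sites use rel="canonical" at each page level."""
    return {
        "home": sum(home for _, home, _, _ in rows),
        "category": sum(category for _, _, category, _ in rows),
        "product": sum(product for _, _, _, product in rows),
    }


print(tally(ROWS), "of", len(ROWS))
# {'home': 8, 'category': 8, 'product': 9} of 13
```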

Concluding Observations and a Quick Tip

While by no means universal, link rel="canonical" has certainly been adopted by many sites, including at the enterprise level. It is also apparent that it is not always deployed in an optimal fashion, and that – as many others have pointed out – there can be far-ranging consequences if rel="canonical" is not used properly. Clearly Google gives the tag a lot of weight but, at the same time, doesn't regard it as a piece of metadata gospel. It will be interesting to see when Bing starts to provide support for rel="canonical," how it will process it, and whether support from another search engine will encourage wider use.

I have been aided in this post by use of the SearchStatus extension for Firefox and SeaMonkey by Quirk eMarketing. Where a canonical URL is specified on a page, a "C" appears in the address bar.

SearchStatus Plugin - Canonical URL in Location Bar

A greyed-out "C" appears when the currently loaded page matches the specified canonical URL; a bright blue "C" appears when the specified canonical URL differs from the currently loaded page (it has always returned correct information for me, but sometimes won't appear until I view the source of or reload a page, which is probably just a conflict with another plugin). I consider this a must-have for SEOs working on sites that employ rel="canonical," as it allows you to quickly identify mismatched canonical tags (and other aspects of the extension are helpful too, particularly the mozRank display).
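The same three-way check (no canonical, canonical matches the loaded page, canonical differs) can be sketched in a few lines of Python. This is not how SearchStatus is implemented, which I haven't examined; `normalize` and `canonical_status` are hypothetical names, and the normalization rules (lowercase scheme and host, empty path treated as "/", fragment dropped) are assumptions of mine.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, treat an empty path as "/", drop the fragment."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def canonical_status(page_url, canonical_href):
    """Return "none", "match" or "mismatch" for a page and its declared canonical."""
    if not canonical_href:
        return "none"
    # A relative href is resolved against the page URL before comparing.
    resolved = urljoin(page_url, canonical_href)
    return "match" if normalize(resolved) == normalize(page_url) else "mismatch"


print(canonical_status("http://www.seomoz.org/about",
                       "http://www.seomoz.org/dp/new-about"))  # mismatch
```

Here "match" corresponds to the greyed-out "C" and "mismatch" to the bright blue one; stricter or looser normalization (trailing slashes, parameter order) would change which pages count as matching.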
