The SEO Skeptic Manifesto

On the Speculative Nature of SEO

The practice of search engine optimization is speculative, for the simple and obvious reason that the search engines do not tell the world how they rank websites.  When a change is made with the purpose of improving a web page's ranking for a specific keyword in the search engines, that change must be classified as a hypothetical improvement until a favorable increase in rankings is observed.

Search engine optimization as a professional field is, by extension, a speculative one, and – by necessity – exchanges between SEO practitioners are largely discussions concerning possibilities, rather than facts.

That SEO is speculative, though, does not mean that uncovering and applying techniques that can influence search engine rankings is a hopeless pursuit.  It means that results – sometimes statistically sound results – can be deduced from observation and experimentation, but that these results (especially reported rather than directly observed results) are almost always subject to a degree of statistical uncertainty.

So, as the name suggests, the aim of SEO Skeptic is to promote a robust examination of discussions pertaining to search engine algorithms, optimization techniques and factors observed to improve or depress search engine rankings.

It bears pointing out that advocating a skeptical approach to optimization techniques and strategies is not an indictment of the practice of SEO.  However imprecise our understanding of search engine ranking factors, it clearly is possible to influence a website's search engine ranking (or, more generally, the visibility of an online resource in the search engines).  That is to say, SEO Skeptic is not skeptical about the benefits of optimizing websites for search.  Since the basic value of SEO has been the subject of an ongoing debate (oddly, to my eyes), I've addressed it in a separate article.

Data:  Good, Bad and Indifferent

Whenever a claim is made about the efficacy of a particular optimization technique, the first question a skeptic should ask is "where are the data that support your conclusion?"  In most situations these data are not shared.  This does not mean that supporting data are absent; rather, for a variety of reasons they are not made available, usually because the correspondent is unable to divulge corporately protected information, or because it is – understandably – not in their best interests to share that information with real and potential competitors.

On occasion hard data from testing is offered in support of a technique's effectiveness, and this is much more trustworthy – to a point.  Tests to determine the impact of SEO techniques are fraught with peril, and all too often test data is digested uncritically as having validated the testing hypothesis.  SEO testing is not as straightforward as, say, A/B testing of landing pages, chiefly because of the testing environment.

A normal, scientifically conducted experiment – such as a drug trial – consists of an experimental subject and an unmodified control.  For testing the effectiveness of an on-page optimization tactic, such an experiment would require a web page and a copy of that page modified in a single respect.  As acknowledged by the search engines themselves, duplicate content has an impact on ranking.  This impact is indirect, as the handling of duplicate content is seemingly a matter of filtering, rather than the imposition of a penalty per se.  So creating two very similar versions of a page in order to test a single element (for example, one page with an on-page header in bold, and another with that header in an <H1> tag) will almost certainly skew results due to the engine's handling of the duplicate content.
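To make the problem concrete, here is a minimal sketch of what such a pair of test pages might look like (the page content itself is invented for illustration):

    <!-- Variant A: header rendered in bold -->
    <body>
      <p><b>Blue Widget Reviews</b></p>
      <p>Our annual roundup of the best blue widgets...</p>
    </body>

    <!-- Variant B: identical except the header is an <h1> -->
    <body>
      <h1>Blue Widget Reviews</h1>
      <p>Our annual roundup of the best blue widgets...</p>
    </body>

Everything apart from the header markup must be identical if the test is to isolate a single variable, and it is exactly that similarity which invites duplicate content filtering.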

It is even probable that a search engine may, in a quest to return the canonical or "correct" version of the page, toss one of the two pages out of the equation altogether.  And removal or preferential ranking of one copy may be due to temporal factors rather than the optimization technique being tested (for example, which version of the page is crawled first).  One may test two pieces of similar rather than exact content but, since a query is always tied to a keyword or keyword phrase, the aggregate of words used on a page is going to itself impact the ranking of that page.

There is also the issue of page linking dynamics.  In order to index a page a search engine spider either needs to discover a link to that page, or otherwise be informed of its existence.  Two pages require two links, and since linking dynamics themselves play a role in the ranking of target pages, creating two equal links may be difficult.  For example, in the case of two stacked hyperlinks to very similar pages, will being the first link encountered provide additional ranking value to the first page?  If linked from two separate pages, might one of those pages be considered a more meaningful link source?  It is possible to exclude linking factors if two URLs are submitted via a sitemap (as in the sketch below) or even input into a submission form, but the weak rankings associated with orphan pages may themselves degrade the usefulness of the data.
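A minimal sitemap submitting both test pages, with hypothetical URLs, might look like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- Both test variants submitted directly, with no links pointing at them -->
      <url><loc>http://www.example.com/widget-test-a.html</loc></url>
      <url><loc>http://www.example.com/widget-test-b.html</loc></url>
    </urlset>

This removes linking variables from the test, but at the cost of testing orphan pages that may rank weakly to begin with.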

Having said all this, testing can be enormously informative, and I wish there were more of it.  You want to know if the text of an <img> alt attribute is indexed?  Throw up a page, get it indexed, and query the clever non-word or nonsense string you've encoded.  For such technical tasks, especially those related to crawling and indexing, testing can return very hard information.  But once the observed effect is upon rankings, the multiple, complex and largely opaque components of a search engine algorithm come into play, and – especially when compared to real-world pages, using actual words, with real-world linking relationships – those effects are difficult to measure with precision in a test environment.
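A minimal sketch of such an indexation test might look like this (the nonsense string and filename are, of course, invented):

    <!DOCTYPE html>
    <html>
    <head><title>Alt attribute indexation test</title></head>
    <body>
      <!-- The nonsense string appears nowhere on the page except in the alt attribute -->
      <img src="widget.jpg" alt="zyxquabblefrotz">
    </body>
    </html>

Once the page is indexed, query the nonsense string: if the page is returned, the alt text was indexed.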

On the flip side of all this are detractors who claim particular techniques are ineffective, typically in regard to architectural or code choices that are intended to improve search performance.  These assertions, too, require statistical validation but, in fact, such claims rarely cite any data at all.  The well-worn skeptical adage "absence of evidence is not evidence of absence" is worth consideration here.  That a website built with <table> tags can rank well is not proof that using <div> tags with CSS is not a superior coding scheme for SEO.
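To illustrate that comparison with a contrived sketch, the same navigation block could be coded either way:

    <!-- Table-based layout -->
    <table>
      <tr><td><a href="/widgets">Widgets</a></td></tr>
      <tr><td><a href="/gadgets">Gadgets</a></td></tr>
    </table>

    <!-- The same content in <div> tags, styled with CSS -->
    <div class="nav">
      <div><a href="/widgets">Widgets</a></div>
      <div><a href="/gadgets">Gadgets</a></div>
    </div>

That pages using the first pattern can rank well tells us nothing, absent data, about whether the second pattern is superior for SEO.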

I do, however, believe that the cumulative data from multiple correspondents over time deserve respect, even if any individual experiment may be flawed, or even absent.  For example, we do not require large-scale controlled experiments to prove that the <title> tag impacts ranking, or that a page produced in HTML will generally perform better in search than that same content rendered in Flash.  In terms of benefit, of course, these observed "best practices" can hardly be considered revolutionary advances that are going to give you a leg up on your competition, but such techniques can at least be reliably employed on a wide scale.

The best data are invariably going to be those which you observe yourself.  So a robust critical approach to SEO entails replicating experiments you judge to be meritorious on sites you control.

The Value of Information Offered by the Search Engines

All of the major engines offer standing or periodic nuggets of advice on how to rank well in search.  In many cases, the advice offered is so broad as to be no help at all.  "Create great content" or "don't duplicate the same content on different pages of your site" will not provide any eureka moments for professional SEOs.  Occasionally, however, this advice – particularly technical advice on the employment of specific types of metadata – is useful, and should be heeded.
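One example of the sort of concrete, documented guidance worth following is the robots meta tag; a minimal sketch of its use, on a hypothetical page you wish to exclude:

    <head>
      <!-- Documented by the major engines: keep this page out of the index,
           but continue to follow its links -->
      <meta name="robots" content="noindex, follow">
    </head>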

Be aware, however, that the information search engine blogs and spokespeople disseminate is always selective and, especially in cases when the advice is to avoid the employment of a particular technique, purposeful.  I have seen no evidence of search engines ever misleading webmasters, but plenty of occasions where they have tried to dissuade webmasters from employing techniques they do not like with vague admonitions of dire consequences.

Put another way, the fact that a search engine does not want you employing a specific technique is not sufficient reason for you to avoid it, particularly when you judge that the technique honestly represents your site's content and will help the search engines discover it.  Should you shy away from an image replacement technique that renders the exact text of a header image as an <H1> because Google has typified this as a "trick"?  You have to assess the risks, of course, but while there is no court of appeal if you incur a Google penalty, you should also consider whether such a penalty would be reasonable under the circumstances.
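For reference, here is a minimal sketch of that sort of image-replacement pattern (the class name, image file and header text are invented):

    <style>
      /* Display the header image while pushing the identical text off-screen */
      h1.masthead {
        width: 300px;
        height: 60px;
        background: url(masthead.png) no-repeat;
        text-indent: -9999px;
        overflow: hidden;
      }
    </style>

    <h1 class="masthead">Acme Blue Widgets</h1>

The text in the <H1> and the text the image displays are identical, which is the crux of the argument that the technique honestly represents the page's content.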

The Value of Information from the SEO Community

I adore being a member of the SEO community, and love the marketplace of ideas that this community creates.  The information shared by fellow professionals on blogs, in forums and at conferences can provide great actionable insight into SEO, and help your site rise to the top of the SERPs.  It can also be utter, unmitigated crap.

Because SEO is speculative, discussion of optimization techniques between practitioners is highly speculative as well.  That is, we more often talk of what we think is beneficial than what we know is beneficial.  This makes for very interesting and invigorating conversations, but there are obvious things to watch out for before acting on any particular suggestion.

First of all, the very best information is the least likely to be divulged publicly.  Search marketers are not secretive because of their personalities, but because successful deployment of a favorable SEO technique provides a competitive advantage.  The last thing you want to do when you are on top is give somebody else the tools to dethrone you.  Community-spirited SEOs will often speak generally of such successes, and these generalized tips deserve attention, but they are difficult to act on forcefully.

Second, well-meaning SEOs may also share information that is honest, but wrong-headed.  That is, they will attribute ranking success to a single change, without considering other changes to their site, their inbound link profile, search engine algorithms or any other of hundreds of factors that might account for their rise in fortunes.  In the absence of controlled experiments, few active SEOs make singular changes to their site, and it can be very difficult to isolate the impact of a single change among many.  Again, look at the data:  if it is incomplete, muddy or absent, be wary.

Finally, the very active and gregarious nature of the SEO community can result in the dreaded network effect.  One idea, strategy or technique is put forward, gains a toehold, and then erupts in an echo chamber of increasingly ill-informed cheerleading.  There was once a widespread belief that the future of SEO lay in understanding and applying the principles of latent semantic indexing – despite the fact that perhaps one in a thousand SEOs had the faintest idea of what it was.  Or the multitude of posts that cited the miraculous impact of PageRank sculpting – during the one-year period that Google was quietly ignoring internal nofollow tags altogether.

So what constitutes good (if not necessarily wholly reliable) community information?  Some of the best information comes from sources that have absolutely no self-aggrandizing motives.  One of the best sources of this type of data is the not-so-sexy "Crawling, indexing & ranking" help forum on Google Webmaster Central, along with the other Google Groups there.  The motivation to post there – generally unlike an SEO blog – is to seek help with a particular SEO problem, and the grateful receiver of successful advice is usually gracious enough to broadcast the results (and almost always desperate enough, when the advice fails, to continue to seek solutions).

In a similar vein are the SEO forums on WebmasterWorld.  While, like the Google forums, webmasters go there to seek help, they also report on both good and bad shifts they have observed in their sites' rankings, changes in search engine behavior, and other relatively factual items related to SEO.  Trawling forums can be a bit of a slog and a time sink; thankfully Barry Schwartz summarizes notable activity from both these sources in his Daily Search Forum Recap.

The very best information you can receive comes directly from other search marketers, and this is in itself reason for an SEO to be an active member of the community by posting to a personal blog, contributing to forums and engaging with other search marketers on Twitter.  When you forge personal relationships with others in your profession, you will start to have access to data that is pure gold.  And in the spirit of quid pro quo, the more you are willing to share with those you trust, the more you are likely to receive.

The Manifesto Summarized

Trust but verify

  • If presented with information that seems reasonable, trust it to be possibly beneficial, but verify it through testing or limited deployment before making use of it on a large scale.

Critically evaluate sources of information

  • Be wary of claims unsubstantiated by data.
  • Take the search engines at their word, but be circumspect about their motives.
  • Be wary of information based on limited or imperfect testing.
  • Be wary of second- or third-hand information.

Share and share alike

  • Get to know other professionals you respect, in order that you may help one another.