Perhaps the most important innovation of Google, and the force behind its creation, was the set of ideas in a research paper that Google’s founders, Sergey Brin and Larry Page, wrote as doctoral students at Stanford: ‘The Anatomy of a Large-Scale Hypertextual Web Search Engine’. The main premise boiled down to a simple idea. When people can vote for themselves, they will sometimes abuse the privilege.
Search engines of the time evaluated the relevance of a web page to a user’s query by examining the page itself: its visible content and the ‘keyword’ metadata (data contained within the document that does not display) placed there by the web developer. This led unscrupulous web publishers to play all sorts of games, making the primary page topic appear to the search engine as one thing while the visible content was something entirely different (think Viagra).
The premise of the paper was that you could achieve more useful and relevant search results by considering links or citations from high quality external sites (think the NY Times) when deciding whether and where to include the page in the search results. Of course, once Google’s search engine came into widespread use, it didn’t take long for the bad guys to game this new system with techniques such as ‘link farms’ (pages with nothing but irrelevant links), so Google began modifying its approach in an attempt to plug the leaks.
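To make the idea of links as votes concrete, here is a minimal sketch of a simplified PageRank-style calculation. It is an illustration only, not Google’s production algorithm: the page names, the tiny link graph, the damping factor of 0.85 and the iteration count are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank sketch.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: two established sites link to "your-page",
# while "spam-page" is linked only from an unlinked "link-farm".
web = {
    "news-site": ["your-page"],
    "blog": ["your-page", "news-site"],
    "link-farm": ["spam-page"],
    "your-page": ["news-site"],
    "spam-page": [],
}

ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page linked from well-connected sites ends up with a higher score than the page linked only from the link farm, because the link farm itself has almost no rank to pass along.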
Certainly progress has been made by both Google and its competitors. It is also more difficult and costly today for legitimate websites to appear anywhere near the top of the SERP (search engine results page). One of the primary difficulties in search engine optimization (SEO) is obtaining ‘quality’ links from third party sites.
Some site owners try to charge for links to other sites. Google and the other search services frown on this, and paid links usually do more harm than good. As a result, a whole industry has developed where people provide ‘legitimate’ links by publishing content on blogs and content sites, but you have to pay someone to write the content. This is big business, and there are some really grey areas in it.
There are also highly legitimate ways to obtain links. One of the best is to publish quality content that someone may want to reference on their own site. You can also write articles and speak at conferences, and earn links back to a page on your site.
The primary function of high quality links is to raise the ‘quality score’ of your website, and the pages on it.
If you have a Google toolbar, you may notice a little rounded-corner rectangle that is partly green and partly grey. When you mouse over it, it will indicate the ‘pagerank’ of the page being displayed, which is a number between 0 and 10. The pagerank indicated is not quite the same as the internal one that Google actually uses in its search algorithm, and Google is starting to remove this item from its newer toolbars, because of concern that people are focusing too much on a single number.
Regardless, there is a concept of pagerank (and now siterank) that is used, along with relevance, to determine where and when your page is listed in the search results. Without a reasonable rank, you may be on page 270,000 in the search results, even if your page is right on topic with the user’s query.
Not all links, even from the same website or page, are of equal value. One reason for this is the ‘link text’.
A web page links to another using a ‘hyperlink’. Basically, an element on the page (some text or an image) has an HTML ‘anchor tag’ wrapped around it, providing your browser with instructions to load another page when the user clicks on it. If the link is placed around text, the search algorithm will use those words, along with other data (such as the title, headlines and text on the target page), to attempt to determine the subject of the page being linked to. If the link text says ‘click here’ and comes from a page with a pagerank of 9, your page will be considered very important on that topic. Since your page is probably not about ‘click here’, this link may not help that much. Even if the link text is your company name, that may not help much if the user’s search terms are related to something you sell.
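As a rough illustration of what link text looks like in practice, the sketch below uses Python’s standard html.parser module to pull the target address and the link text out of two anchor tags. The HTML snippet, the example.com address and the class name AnchorTextParser are made up for the example. It shows only how the link text can be read, not how any search engine actually weighs it.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collects (href, link text) pairs from anchor tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Two hypothetical links to the same page with very different link text.
html = """
<p>For details, <a href="https://example.com/widgets">click here</a>.</p>
<p>Read our guide to <a href="https://example.com/widgets">blue widget repair</a>.</p>
"""

parser = AnchorTextParser()
parser.feed(html)
for href, text in parser.links:
    print(href, "->", text)
```

Both links point to the same page, but only the second one says anything about what that page covers.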
Getting external links is hard enough. Asking another site owner to use words you choose as link text is much harder.
That brings us to the subject of external vs. internal links.
While it’s hard to obtain external links, and even harder to control their link text, it’s much easier to control links within your own site. Of course, internal links don’t count as heavily as external ones, but they do count. They not only show that you consider the page important enough to link to, but the link text from other pages in your site can add legitimacy to the subject of the target page.
If you are concerned at all about search optimization, you should work with your web developer to create a link strategy that emphasizes to the search engines the pages that you consider important, and the topics that each pertains to. This will certainly include menus (potentially including the main menu, sidebar menus and the footer). It should also include links within the content area of your pages, providing cross references or citations to other pages. This is what hyperlinks were originally designed to do.
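A first step in reviewing a link strategy is simply knowing which links on a page are internal and which are external. The sketch below shows one way to sort them using Python’s standard urllib.parse module. The function name classify_links and the example addresses are hypothetical, and treating only an exact host name match as ‘internal’ is a simplifying assumption (subdomains, for instance, are counted as external here).

```python
from urllib.parse import urljoin, urlparse


def classify_links(page_url, hrefs):
    """Split hrefs found on page_url into internal and external lists.

    Assumption: a link is internal only if its host name matches the
    host name of the page it appears on exactly.
    """
    site_host = urlparse(page_url).netloc.lower()
    internal, external = [], []
    for href in hrefs:
        # Resolve relative links such as "/about" against the page address.
        absolute = urljoin(page_url, href)
        host = urlparse(absolute).netloc.lower()
        if host == site_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Hypothetical links collected from one page of a site.
internal, external = classify_links(
    "https://www.example.com/products/",
    [
        "/about",
        "widgets.html",
        "https://www.example.com/contact",
        "https://other.example.org/review",
    ],
)

print("internal:", internal)
print("external:", external)
```

Run across every page of a site, a list like this makes it easier to see which of your own pages receive many internal links and which receive almost none.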
Links are by far the most costly element of search engine optimization (SEO), simply because they are so time-consuming. Even internal links can be a daunting task on a large site, and most links (read: citations) require link-worthy text on the receiving end. Someone has to write this text, and that takes time and money. The days when you could optimize your site for search by placing a string of keywords in the keywords meta tag are long since gone.