The rel="canonical" attribute can be used to avoid duplicate content penalties when a domain has more than one page displaying very similar or identical content. Used correctly, it tells search engines which version of the page to index, and it also passes link juice from the other similar or identical pages to the preferred version you have specified.

This method is useful if, for instance, you have a website with price lists in different currencies, where there could be several pages with almost identical content. Specifying a canonical page will not only allow you to avoid duplicate content penalties for these pages, but Google will also treat all inbound links to the similar pages as though they pointed to the single page you specify as "preferred" in the canonical attribute. This can have a major positive effect on that page's position in Google's results.

How to Specify a Canonical Page

First, determine which page you want to use as the original version of your similar pages. This is the page to which all incoming links currently spread across the duplicate content pages will be attributed.

Once you have determined this, add the following element to the HEAD section of each of the other pages that has similar content (do not add it to the one you have chosen as your original version):

<link rel="canonical" href="full_original_page_url_goes_here" />
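If you have more than a handful of similar pages, it may be easier to generate the tag than to type it by hand. The following is only a rough Python sketch of that idea: the function names and the price-list URLs are made up for illustration, and it simply follows the advice above of leaving the tag off the preferred page.

```python
from html import escape


def canonical_tag(preferred_url):
    """Build the <link> element that points at the preferred page."""
    # Escape the URL so characters such as & are valid inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(preferred_url, quote=True))


def tags_for_pages(pages, preferred_url):
    """Map every duplicate page to the tag it should carry in its HEAD.

    The preferred page itself is skipped, as recommended above.
    """
    tag = canonical_tag(preferred_url)
    return {url: tag for url in pages if url != preferred_url}


# Hypothetical price-list pages in different currencies.
pages = [
    "http://www.yourdomain.com/prices-gbp.html",
    "http://www.yourdomain.com/prices-usd.html",
    "http://www.yourdomain.com/prices-eur.html",
]
preferred = "http://www.yourdomain.com/prices-gbp.html"

for url, tag in tags_for_pages(pages, preferred).items():
    print(url, "->", tag)
```

Note that the tag uses plain straight quotes. Curly quotes pasted in from a word processor are not valid attribute delimiters in HTML.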

There is another aspect to canonicalization, which applies to home pages on a domain and the variations of the URL that can be used to link to that page, for instance the versions with and without the www. prefix.

In this instance it is feasible, and highly likely, that you would have inbound links pointing to both versions of the same page. Although this would not attract a duplicate content penalty, it would result in only one version of the page being indexed, so the link juice pointing to the de-indexed version goes to waste. But here's the good bit: you can use the rel="canonical" attribute, with the URL of the indexed version as its href, to pass the link juice from the de-indexed version to the version that IS indexed. This could result in a potentially massive increase in inbound links for that page and have a major effect on its search engine position.
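Once the tag is in place it is worth checking that it is actually there and points where you expect. Here is a small Python sketch, using only the standard library, that pulls the canonical URL out of a page's HTML. The class and function names and the two sample pages are my own assumptions for illustration; in practice you would feed it the HTML you have downloaded from each version of the URL.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical home page markup: one version carries the tag, the other does not.
tagged_home = '<html><head><link rel="canonical" href="http://www.yourdomain.com/" /></head><body></body></html>'
untagged_home = "<html><head><title>Home</title></head><body></body></html>"

print(find_canonical(tagged_home))    # http://www.yourdomain.com/
print(find_canonical(untagged_home))  # None
```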

Webmasters who host their pages on Apache servers have an altogether different solution available to stop the links attributed to a single page from being split between different URLs, although I would only recommend this technique for brand new domains/websites, as the non-www URLs no longer serve pages themselves. This is achieved by editing the .htaccess file in the server's public root folder.

For instance, adding the following directives to the bottom of the .htaccess file:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^yourdomain\.com
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=permanent,L]

would permanently redirect anybody who requests your site without the www. prefix to the www. version, so inbound links to either URL end up at the same address.
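If you want to reason about what rules like these do before touching a live server, the logic can be imitated in a few lines of Python. This is only a sketch and not how Apache works internally: the function name is made up, and I am assuming the redirect target is http://www.yourdomain.com/ followed by the requested path, which is what forcing the www. prefix implies.

```python
import re

# Same pattern as the RewriteCond: the host must start with yourdomain.com.
BARE_HOST = re.compile(r"^yourdomain\.com")


def redirect_for(host, path):
    """Return (status, location) if the request would be redirected, else None."""
    if not BARE_HOST.match(host):
        return None  # already on www. (or another host), so nothing happens
    # In an .htaccess file the leading slash is removed before the rule is matched.
    relative = path[1:] if path.startswith("/") else path
    captured = re.match(r"^(.*)$", relative).group(1)
    # R=permanent corresponds to an HTTP 301 response.
    return 301, "http://www.yourdomain.com/" + captured


print(redirect_for("yourdomain.com", "/prices.html"))
print(redirect_for("www.yourdomain.com", "/prices.html"))
```

The first call returns a 301 pointing at the www. address and the second returns None, because the host already carries the prefix.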