Prevent Duplicate Content using the Canonical Url Tag

I was recently doing an SEO audit of a small ecommerce website. One of the first things I did was run a ‘site:www.domain.com’ query in Google. Amazingly, the site in question had approximately 8,200 pages indexed in Google. This was quite surprising given that the store sold fewer than 1,500 unique products. The site used a horrible ecommerce module bolted onto PHP-Nuke and had a horrible URL structure, appending lots of unnecessary querystring data onto the URL.

Whilst looking through the Google results for this site, I found that the majority of indexed pages looked like this:

/index.php?tab=123&txtSearch=ALL&List=oasc&Sort=PName%2CPName&CreatedUserID=1&pageindex=40&Language=en-GB

The site has an advanced search page where you can sort products using a variety of options such as ascending order, descending order, size, price etc. Having every one of these views indexed is bad for a number of reasons, but mainly because of duplicate content (not to mention lower SERP rankings, traffic loss and decreased page relevancy). A page of results in ascending order and the same page in descending order are essentially the same page, simply a different view of your data. You can help search engines here by using the relatively new canonical URL link tag.

To illustrate, I’ll use an example of a typical category page where you can sort a list of products in ascending and descending order, leaving you with a number of URLs as follows:

http://www.shop.com/category.php?catName=Shirts&sortOrder=ASC
http://www.shop.com/category.php?catName=Shirts&sortOrder=DESC

In this example, the part of the querystring creating the duplicate content is the sortOrder parameter. The catName parameter should stay, as you would want your separate categories indexed.

The solution is quite simple. In the head section of your page, add the following:

<link rel="canonical" href="http://www.shop.com/category.php?catName=Shirts" />

By adding this to your category page you are telling search engines (currently Google, Yahoo, Ask and Bing use this tag) that this page is a copy of http://www.shop.com/category.php?catName=Shirts. Indicators such as Google PageRank are also transferred to your preferred URL.
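As a rough illustration of how a category page could work out its own canonical URL, here is a minimal Python sketch. It assumes a whitelist approach: only parameters that change what the page is about (here just catName, from the example above) are kept, and everything else, such as sortOrder, is dropped. The function names and the whitelist are my own assumptions for the example, not part of any particular shopping cart.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical whitelist: parameters that identify the content of the page.
CONTENT_PARAMS = {"catName"}


def canonical_url(url, keep=CONTENT_PARAMS):
    """Return the URL with every query parameter outside `keep` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep
    ]
    # Rebuild the URL without the dropped parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Render the <link> element to place in the page's head section."""
    return '<link rel="canonical" href="%s" />' % escape(canonical_url(url), quote=True)


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.shop.com/category.php?catName=Shirts&sortOrder=ASC"
    ))
```

Both the ASC and DESC versions of the category URL end up pointing at the same preferred URL, which is the whole point of the tag.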

The canonical url tag has many uses and can be used to help with the following issues:

Pages that contain session IDs appended to the querystring
Search results pages that append search data to the querystring
Print versions of pages
Duplicate content for www. and non-www. pages (in your canonical tag you would include your preferred URL)
Same content contained in multiple categories, e.g. a product contained in multiple categories on an online store
Removing affiliate IDs in the URL
Preventing multiple pages of a discussion topic with comments from being indexed, e.g. shop.com/post.php?id=123&page=1
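Several of the cases in the list above boil down to the same two steps: pick one preferred host, and throw away parameters that do not change the content. The sketch below shows that idea in Python, this time as a blacklist. The host names and the parameter names (PHPSESSID, sid, affid, print, page) are assumptions chosen for the example; your own site will have its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed for illustration: the www. version is the preferred one.
PREFERRED_HOST = "www.shop.com"
BARE_HOST = "shop.com"

# Hypothetical parameter names for session IDs, affiliate IDs,
# print views and comment pagination. Matching is case sensitive.
NOISE_PARAMS = {"PHPSESSID", "sid", "affid", "print", "page"}


def preferred_url(url):
    """Map a URL variant onto the single URL we want indexed."""
    parts = urlsplit(url)

    # www. versus non-www.: always point at the preferred host.
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST

    # Drop parameters that only change the view, not the content.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in NOISE_PARAMS
    ]
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(preferred_url("http://shop.com/post.php?id=123&page=1"))
    print(preferred_url("http://www.shop.com/product.php?id=9&affid=42&PHPSESSID=abc"))
```

A blacklist like this is easier to bolt onto an existing site, but any new junk parameter slips through until you add it to the list. A whitelist of known content parameters is stricter if you control the URL structure.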

You can read more about the canonical tag at the official Google Webmaster Blog. Matt Cutts also has a 20-minute video explaining the canonical tag in more depth.

The main point to consider is that the canonical tag is simply a hint and not a directive. It is another way to help search engines index your content. This is very useful when working on existing sites already indexed by Google. However, on new sites a bit more planning can help. For instance, in a previous article I covered 301 redirects for SEO using .htaccess, which showed how to set a preferred version of your site. On an ecommerce store you could avoid appending search data to the querystring in the first place.

EDIT: WordPress and the All in One SEO plugin generate canonical link tags for blog posts. For example, comments are separated into multiple pages, e.g.


https://www.web-design-talk.co.uk/157/how-to-deal-with-difficult-clients-using-split-testing/comment-page-1/#comment-344

With the actual content being at:


https://www.web-design-talk.co.uk/157/how-to-deal-with-difficult-clients-using-split-testing

If you have a quick look at the source code of the comments page you’ll see the following has been added:

<link rel="canonical" href="https://www.web-design-talk.co.uk/157/how-to-deal-with-difficult-clients-using-split-testing/" />
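To make the mapping concrete, here is a small Python sketch that turns a paginated comment URL like the one above into the URL of the post itself. This is only an illustration of the idea and not how WordPress or the plugin does it internally. The regular expression and the function name are my own assumptions, based purely on the /comment-page-N/ pattern seen in the example URL.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed URL pattern: a trailing /comment-page-<number>/ path segment.
COMMENT_PAGE = re.compile(r"/comment-page-\d+/?$")


def post_url(comment_page_url):
    """Strip the comment pagination segment and the #fragment from a URL."""
    parts = urlsplit(comment_page_url)
    path = COMMENT_PAGE.sub("/", parts.path)
    # An empty fragment drops anchors such as #comment-344.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


if __name__ == "__main__":
    print(post_url(
        "https://www.web-design-talk.co.uk/157/"
        "how-to-deal-with-difficult-clients-using-split-testing/"
        "comment-page-1/#comment-344"
    ))
```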

10 thoughts on “Prevent Duplicate Content using the Canonical Url Tag”

Web Company says:

January 10, 2010 at 2:42 pm

I’m pretty sure you don’t need this, as you can use 301 redirects and the meta robots tag (noindex, nofollow) to prevent it

Rob says:

January 11, 2010 at 8:37 pm

Hi,

Thanks for the comment. The canonical tag (like 301 redirects and the meta tag you mention) is meant as another hint to search engines to index the correct, unique page. It’s another sign you can send to search engines saying where the correct content is located (I’d advise doing it alongside a method such as 301 redirects, or the method I described in a previous post on www. and non-www. versions of your site). For the effort required to implement this in something like WordPress or your own website, I think it’s definitely worth it.

It’s also worth noting that overall you’re not only trying to help users, but also consolidating your page rankings from multiple URLs into a single one, thereby avoiding duplicate content penalties.

I personally believe the more methods we have to prevent duplicate content, the better. For example, manually integrating the canonical tag into your WordPress install is quite pain free. There’s an excellent, simple article here on how to edit WordPress to prevent duplicate content (you only have to edit your header.php file, simple!)

There, I wrote way too much again for a comment reply 🙂

web design new york says:

February 16, 2010 at 11:41 pm

Thanks, we have often wondered how to stop multiple search pages getting indexed.

Pingback: Multiple Categorisation for SEO | Web Design Talk
Daniel White says:

March 6, 2011 at 8:32 pm

Possibly good for other major search engines, as Google no longer penalizes (and hasn’t for some time) for duplicate content but rather just filters it out.

1. Aikkyam says:
  
  September 25, 2014 at 10:01 pm
  
  When we add the canonical tag we can tell Google that this URL is a copy of that one.
  http://www.example.com and example.com are different, right?
  
  Do we have to add this in the head tag?
  
  1. Rob Allport says:
    
    September 26, 2014 at 4:47 pm
    
    Yup, as in the example, needs to go into the head section 🙂
    
Rob says:

March 6, 2011 at 11:27 pm

@Daniel White

Do you have any articles or links? I’m still seeing sites being penalised for duplicate content.

Pingback: Just Search SEO Review, justsearchseo.co.uk Reviews
Pingback: Using Opencart Product Filters in Opencart 1.5.5.1


Published by

Rob Allport
