Category Archives: General

Prevent Duplicate Content using the Canonical Url Tag

I was recently doing an SEO audit of a small ecommerce website. One of the first things I did was to do a ‘’ in Google. Amazingly, the site in question had approximately 8,200 pages indexed in Google. This was quite surprising given that the store sold fewer than 1,500 unique products. The site used a horrible ecommerce module bolted onto PHP-Nuke and had a horrible URL structure, appending lots of unnecessary querystring data onto the URL.

Whilst looking through the Google results for this site, I found that the majority of pages were as follows:


The site has an advanced search page, whereby you can sort products using a variety of options such as ascending order, descending order, size, price etc. This is bad for a number of reasons, but mainly due to duplicate content (not to mention lower SERP rankings, traffic loss and decreased page relevancy). A page of results in ascending and descending order is essentially the same page, simply a different view of your data – you can help search engines by using the relatively new canonical URL link tag.

To illustrate, I’ll use an example of a typical category page, whereby you can sort a list of products in ascending and descending order, leaving you with a number of URLs as follows:

In this example the part of the querystring creating the duplicate content would be the sortOrder parameter – as you would still want your separate categories indexed.

The solution is quite simple. In your head tag add the following:

<link rel="canonical" href="" />

By adding this to your category page you are telling search engines (currently Google, Yahoo, Ask and Bing use this tag) that this page is a copy of the URL given in the href attribute. Indicators such as Google PageRank are also transferred to your preferred URL.
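As a rough sketch of the idea, the preferred address for a sorted category page can be worked out by stripping the sort parameter from the requested URL and printing the result into the link tag. The example.com address and the categoryId parameter below are made up for illustration; only sortOrder comes from the example above.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters that only change the view of the data (assumption:
# sortOrder is the only one, as in the example above).
VIEW_ONLY_PARAMS = {"sortOrder"}


def canonical_url(url):
    """Return the URL with view-only query parameters removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in VIEW_ONLY_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the link element to place inside the page's head."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


if __name__ == "__main__":
    # Hypothetical category URL, sorted in descending order.
    print(canonical_link_tag(
        "http://www.example.com/category.php?categoryId=3&sortOrder=desc"))
```

Both the ascending and descending views of the category then point at the same unsorted address, which is the one you want indexed.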

The canonical url tag has many uses and can be used to help with the following issues:

  • Pages that contain session IDs appended to the querystring
  • Search results pages that append search data to the querystring
  • Print versions of pages
  • Duplicate content for www. and non-www. pages – in your canonical tag you would include your preferred URL
  • Same content contained in multiple categories – E.g. a product contained in multiple categories on an online store
  • Removing affiliate IDs in the URL
  • Preventing multiple pages of a discussion topic with comments from being indexed E.g.

You can read more about the canonical tag at the official Google Webmaster Blog. Matt Cutts also has a 20-minute video explaining the canonical tag in more depth.

The main point to consider is that the canonical tag is simply a hint and not a directive. It is another method to give search engines help in indexing your content. This is very useful when working on existing sites already indexed by Google. However, on new sites a bit more planning can help. For instance, in a previous article I covered 301 redirects for SEO using htaccess – how to set a preferred version of your site via htaccess. On an ecommerce store you could avoid appending search data to the querystring.

EDIT: WordPress and the All in One SEO plugin generate canonical link tags for blog posts. For example, comments are separated into multiple pages E.g.

With the actual content being at:

If you have a quick look at the source code to the comments page you’ll see the following has been added:

<link rel="canonical" href="" />

How to Deal With Difficult Clients Using Split Testing

Sometimes you can be in the process of trying to tell a client that their idea simply won’t work. It might be a flimsy campaign idea or an extra design element that you know through experience will not produce the desired KPIs for a client project. You can even show the client links, articles and examples of why their idea will fail to deliver results. However, a potential or existing client who hears this is likely to go elsewhere, to a company willing to follow their every word without consideration – I have come across web companies who will do this.

Recently an existing client came to me asking why his site wasn’t showing up when people searched for a particular long tail search term. Now, his existing site used a pretty awful content management system that didn’t even allow him to set his own page titles or meta descriptions. Furthermore, he was lacking inbound links, which the sites ranked above him did have. This all sounds simple and straightforward, but even after I had explained (in quite clear and non-technical language, may I add) the merits of good SEO and on-page content, the client simply wouldn’t accept this as a solution. He had his own short term and less costly solutions – basically revolving around the idea of me resubmitting his sitemap page to all the major search engines each day. I’m not disputing that submitting a sitemap is a good idea, because it is. However, the client’s main KPI for this project was increased site enquiries.

After much discussion we had both come to a bit of an awkward silence – not a good thing, if you’ve ever experienced this in client meetings. This is quite a delicate situation to be in as it can damage your client relationship quite quickly. For some reason I remembered back to my university days, where I had read something about split testing (or A/B testing) – a way to turn a negative situation into a positive one.

The idea was to use the client’s idea for a period of time and my idea for a period of time. At the start of this I would install Google Analytics (I was tempted to use Google’s Website Optimizer, but decided against it) and let the statistics do the talking – as no one can argue with statistics.

This method has been very useful previously when demonstrating the merits of creating a dedicated landing page for Google AdWords campaigns, but can be used anywhere if you’re willing to do a little bit extra.

This method is beneficial for the following reasons:

  • The client’s ideas are not being dismissed as wrong (however right you think you are)
  • You are showing the client that you care enough to demonstrate your ideas
  • Occasionally the client will back down as soon as you explain your plan of attack
  • It prevents those awful awkward silences
  • You have a real world example to use in your other client meetings
  • You are speaking the client’s language in that you are demonstrating how your actions lead to increased conversions
  • You are being direct, which I personally think is always a good thing – as such statistics are often a huge eye opener for clients
  • If and when the client comes to the same conclusion as you, they won’t blame you

There is always the argument that the client is the client and that it’s all business at the end of the day. However, I personally pride myself on doing things properly. Others will say just get on with it, do what the client wants and forget about it – you can only offer your opinion. This is a good point, but it can still damage your client relationships when they return later on and you need to charge them again. It all depends on whether you require long or short term client relationships – as they are definitely an investment.

Tracking Twitter Performance Using Google Analytics

If you use the ever popular Twitter there’s a high chance you’ll be linking to your company website or personal blog in your tweets or profile link. As the aim is to use Twitter as a marketing tool to drive traffic, you can use Google Analytics to track the link you placed in your Twitter profile – just like an email campaign or PPC advert.

If you use Twitter as a marketing tool to drive traffic to your site then you should treat it in exactly the same way as you would a newsletter, a PPC advert or a banner and track each Tweet’s performance beyond simple click data. How many visits do you get, how long do they stay on your site, how deep do they go, what is the bounce rate like and how much revenue do they generate?

The benefit of ‘tagging’ this link is that Google Analytics will record more than just basic click data – you can record a whole host of advanced user data such as how visitors navigate your site and the length of their visit. By default Google will track such links, but traffic from some services will be dumped into the direct traffic area of Google Analytics. The steps to get tagging up and running are quite simple:

1: Go to Google’s URL builder to generate a URL. Enter the following information:

Website URL: your website address

Campaign Source: enter a relevant source here to identify your campaign E.g. twitter

Campaign Name: enter a name used to identify the campaign in Google Analytics E.g. twittertracking

2: Click generate URL and something similar to the following will be created:

3: If posting to Twitter you can paste this URL directly into the tweet box, as Twitter will automatically shorten the URL.

4: After approximately 24 hours data will appear in your analytics account. Simply navigate to Traffic Sources. If you’ve used the same terms to build the URL as above you’ll see an entry called ‘twitter / social’. You can also view information by navigating to Traffic Sources > Campaigns where you can click the campaign name (‘twittertracking’ was used in the example above).
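The generated URL from step 2 has not survived in this post, so here is a minimal sketch of what the URL builder does, assuming the standard utm_source, utm_medium and utm_campaign query parameters. The www.example.com address is a placeholder, and the ‘social’ medium is inferred from the ‘twitter / social’ entry mentioned in step 4.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_url(url, source, medium, campaign):
    """Append Google Analytics campaign parameters to a URL."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),      # Campaign Source, e.g. twitter
        ("utm_medium", medium),      # Campaign Medium, e.g. social
        ("utm_campaign", campaign),  # Campaign Name, e.g. twittertracking
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))


if __name__ == "__main__":
    # www.example.com stands in for your own website address.
    print(tag_url("http://www.example.com/", "twitter", "social",
                  "twittertracking"))
```

Any query parameters already on the address are kept, so the same helper could be used for links to individual posts as well as the home page.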

Google Analytics Once Tracking is Installed

Getting htaccess Mod-rewrite rules working locally with XAMPP

After spending a whole 2 hours of my life trying to get Apache mod-rewrite rules working with XAMPP on a local computer, I thought I’d share my results as I seemingly tried everything. The problem: I have a simple mod-rewrite rule in my htaccess file. When I upload this to my online web host everything is fine – the working htaccess file for my online host:

RewriteBase /
RewriteEngine on
RewriteRule amnesia/resetpass(.*) recover-password.php$1 [PT]

So typing in a URL under amnesia/resetpass does a simple rewrite to recover-password.php, without the user ever knowing. All is fine. However, when I tried to get this seemingly simple rule to work with XAMPP I ran into problems, getting 404 and 500 responses from the server – obviously quite a pain as this essentially means I can’t test the site using my own web server (E.g. localhost). The site is hosted from my computer via the normal setup E.g. xampp/htdocs/mysite. I’ll jump straight to the solution and then explain exactly what was changed – the working htaccess file is below:

RewriteEngine on
RewriteBase /mysite
Options +FollowSymLinks
RewriteRule amnesia/resetpass(.*) recover-password.php$1 [PT]

Firstly, the extra line that uses the +FollowSymLinks option was added. To explain this I’ll quote straight from the Apache documentation:

To enable the rewriting engine for per-directory configuration files, you need to set “RewriteEngine On” in these files and “Options FollowSymLinks” must be enabled. If your administrator has disabled override of FollowSymLinks for a user’s directory, then you cannot use the rewriting engine. This restriction is needed for security reasons.

The rewrite base has been changed to the relative path of the website directory. To finish up, open the httpd.conf file (the default settings for XAMPP, which get overridden by your .htaccess file rules on a per-directory basis), located by default at C:\xampp\apache\conf\httpd.conf. Find all occurrences of AllowOverride None and change them to AllowOverride All. After restarting XAMPP everything should work. In a nutshell, the AllowOverride directive in httpd.conf declares which directives in .htaccess files can override directives from httpd.conf. This is discussed in more depth over here, but basically by having this directive set to None, you’re stopping individual htaccess files from working locally.
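If you want to sanity-check the rule itself before blaming the server configuration, the pattern can be tried out with an ordinary regular expression. The sketch below only approximates mod_rewrite: RewriteBase is modelled as a plain prefix that is stripped before matching and added back afterwards, and the /abc123 token in the usage example is hypothetical.

```python
import re

# The pattern and substitution from the RewriteRule above
# ($1 in Apache syntax becomes \1 in Python).
PATTERN = re.compile(r"amnesia/resetpass(.*)")
SUBSTITUTION = r"recover-password.php\1"


def rewrite(path, base="/mysite/"):
    """Approximate how the rule rewrites a request path under `base`."""
    if not path.startswith(base):
        return path
    relative = path[len(base):]
    match = PATTERN.search(relative)
    if match is None:
        return path  # the rule does not apply, so the request is left alone
    # On a match the whole relative path is replaced by the substitution.
    return base + match.expand(SUBSTITUTION)


if __name__ == "__main__":
    print(rewrite("/mysite/amnesia/resetpass/abc123"))
    print(rewrite("/mysite/index.php"))
```

Passing base="/" mimics the online host, where the site lives at the web root rather than in a sub-directory of htdocs.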

SEO Friendly URLs With Mod Rewrite

So-called ‘dirty URLs’ (E.g. not only look untidy but also pose a security risk as they expose the underlying technology used, in this case, ASP. A much preferred URL in this case would be or even better, The latter URL structure not only improves usability for your site (the URL makes more sense to the user) but is also argued to improve search engine rankings. There is a lot of debate on this subject, but everyone agrees that these so-called pretty URLs don’t hurt anything and mainly improve user experience. Google has also recently posted a video (obviously not giving much away) saying that SEO friendly URLs do in fact make a small difference and don’t hurt SERPs.

Take the example of this very blog. Pretty URLs are used to display the post title and id within the URL. There is the option to simply include the title, but this is reported to slow down the general performance of your blog. I digress, let’s get onto some examples where simple URL rewriting with mod rewrite is useful.
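To give a feel for what a ‘pretty’ address containing both the id and the title looks like, here is a small sketch that turns a post title into a URL-safe slug. The /id/slug/ layout and the post id are assumptions for illustration and are not necessarily the structure this blog uses.

```python
import re


def slugify(title):
    """Lower-case a title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def pretty_path(post_id, title):
    """Build a path containing both the post id and the title slug."""
    return "/%d/%s/" % (post_id, slugify(title))


if __name__ == "__main__":
    # Hypothetical post id.
    print(pretty_path(42, "SEO Friendly URLs With Mod Rewrite"))
```

A rewrite rule would then map a path of this shape back onto the real script, using the id to look up the post.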

Continue reading

CSS Sprites and Website Optimization

One of the latest and most well established design practices is CSS sprites. This is the practice of combining multiple images into a single, larger composite image. By using the CSS background-position property, selected portions of that master image (or sprite) are displayed. The main question here is how a larger image with a larger file size can be beneficial, especially when compared to several smaller images. The answer lies in HTTP requests, and Yahoo’s 80/20 rule explains this much better than I could! To summarise, the number of HTTP requests to the website is drastically cut, so the page loads much faster, with the images arriving in a single request. Another major benefit is that no JavaScript is required for mouseover code, so you can make image rollovers easily. I have used this technique in the past, but like a lot of people never knew it was called sprites.
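To make the background-position idea a little more concrete, the sketch below works out the offsets for icons stacked vertically in a single sprite and prints one CSS rule per icon. The class names, the sprite.png file name and the 16px icon size are all made up for illustration.

```python
def sprite_rules(class_names, icon_height, gap=0, image="sprite.png"):
    """Return CSS rules for icons stacked vertically in one sprite image."""
    rules = []
    for index, name in enumerate(class_names):
        # Shift the background up so that this icon lines up with the element.
        offset = index * (icon_height + gap)
        position = "0 0" if offset == 0 else "0 -%dpx" % offset
        rules.append(
            ".%s { background: url(%s) no-repeat %s; height: %dpx; }"
            % (name, image, position, icon_height)
        )
    return rules


if __name__ == "__main__":
    # Three hypothetical 16px icons with 4px of spacing between them.
    for rule in sprite_rules(["home", "about", "contact"], 16, gap=4):
        print(rule)
```

Every rule points at the same image, so the browser fetches it once and only the offset changes from element to element.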

In fact, using sprites is so effective that many of the internet’s biggest sites are using them, all in slightly different ways. On such sites a truly huge number of requests are saved every day. For example, YouTube, Google, AOL, Amazon and Apple all use CSS sprites. Take the minimalist example of Google:

Google CSS Sprite

YouTube does things slightly differently and uses an absolutely massive sprite 2800px in height! You’ll notice that some sprites are tightly packed together and others have spacing around each image. This is to allow for browser based text sizing, which would otherwise display multiple background images. A good example of planning for such an occurrence is Digg’s sprite, where each image is highly spaced out.

The CSS is relatively simple. Take the example of a simple unordered list with a different image for each item. I made a very simple sprite and applied the background-position property to each link class. Here’s the simple HTML used for the list:

Continue reading

Reasons to let Google Host your jQuery Files

It’s often the case that I see busy sites hosting copies of the jQuery library locally. E.g.

<script src="/js/jQuery.min.js" type="text/javascript"></script>

The preferred and better way is to load jQuery from Google E.g.

<script src="" type="text/javascript"></script>

So, why is this better? Well there are several valid reasons:

CDN (Content Delivery Network) – Google’s datacenters are spread over a range of locations, and when a user requests content the closest location is automatically chosen. This is better because it does not force users to download from a single server location (E.g. your server), and the chances are Google will be able to serve content faster than your webhost. A similar approach is used for the popular web based game called quakelive. Usually CDNs are a service you pay for, but you’re getting this free through Google!

Less server load – When all your website’s files are located on a single server, downloading them simultaneously increases server load and some users will experience delays while files download. By having an external location for your jQuery library, this is no longer an issue.

Improved caching – This is the biggest benefit, as users will not have to re-download content. Hosting jQuery on your own server will cause a first time visitor to download the whole file, even if they already have several copies of the same file cached from other sites. Through Google’s CDN, the file is served with instructions to cache it for up to one year, so a visitor who has already loaded the same file via another site does not need to download it again.

Local Bandwidth savings – by letting Google host the file for you, you are in essence saving bandwidth. For personal sites this may not be an issue, but busy sites will notice significant bandwidth savings.

Google actually suggests using its google.load() function to load the library (see below), but this not only interferes with jQuery’s killer feature (document.ready), but also causes an extra HTTP request. Personally I prefer the old fashioned script method, even though there are several other valid reasons to use the google.load() method.

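<script type="text/javascript" src=""></script>
<script type="text/javascript">
  google.load("jquery", "1.3.2");
  google.setOnLoadCallback(function() {
    // Initialisation code goes here instead of inside $(document).ready()
  });
</script>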

Fixing Common W3C Validation Errors and SEO

Yet another thing to check when doing SEO is that your site validates via the W3C validation checker. A site that is valid XHTML is said to receive more frequent search engine crawls and, more importantly, longer crawl times. I won’t bore you with further details about why validation is a good thing (it’s a huge subject), but if you must, there is a great article about the subject right here. Creating a site to a valid XHTML standard encourages better coding practice and more semantic coding – making your site easier to crawl. You are also giving your site a better chance of displaying the same across multiple and future browsers.

Another less known theory is that spiders get ‘full’ when crawling a page, and that semantic coding practice allows for cleaner and more lightweight code. For instance, when crawling a badly coded page with lots of inline styles and JavaScript (i.e. content not useful to a spider) the spider may become full too quickly and leave – missing your important content contained further on within the page.

Validating your site to at least XHTML 1.0 Transitional (the less strict version, compared to XHTML 1.0 Strict) is highly encouraged and is an area often ignored by developers. Below, I’ll quickly outline some of the common validation errors and how to easily fix them:

cannot generate system identifier for general entity X – 99% of the time this relates to errors with entity references such as ampersands in URLs. E.g. having a URL like product.php?id=2&mode=view would result in this error because the ampersand was written as a bare ‘&’ rather than as the entity ‘&amp;’.
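If your links are generated by a script, the simplest fix is to escape the URL at the point where it is written into the markup. A minimal sketch using the URL from the example above (the link text is made up):

```python
from html import escape

raw_url = "product.php?id=2&mode=view"

# escape() turns "&" into "&amp;" (and also escapes <, > and quotes).
href = escape(raw_url, quote=True)
link = '<a href="%s">View product</a>' % href

print(link)
```

The browser still requests the original address, as the entity is decoded back to a plain ampersand when the attribute is read.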

required attribute “alt” not specified – simply find the line number and add an alt attribute to the image. The presence of an alt attribute is required for both transitional and strict doc types.
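On a large page it can be quicker to list every offending image in one go rather than fixing one validator message at a time. The following is only a rough sketch built on Python’s standard html.parser module, with a made-up snippet of markup; it is not a replacement for the validator.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the line numbers of img tags that have no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; alt="" still counts as present.
        if tag == "img" and "alt" not in dict(attrs):
            line, _column = self.getpos()
            self.missing.append(line)


def find_missing_alt(html_text):
    """Return the line numbers of images without an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    # Hypothetical markup: only the second image lacks an alt attribute.
    page = '<p>\n<img src="a.png" alt="Logo" />\n<img src="b.png" />\n</p>'
    print(find_missing_alt(page))
```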

XML Parsing Error: Opening and ending tag mismatch – Depending on how organised you are when coding, this fix can take a matter of seconds or a lot longer. It relates to unclosed block level tags, such as a table or div. One plus point is that fixing such an error often results in several validation errors being fixed at once.

Continue reading