Fixing Common W3C Validation Errors and SEO

Yet another thing to check when doing SEO is that your site validates with the W3C validation checker. A site that is valid XHTML is thought to receive more frequent search engine crawls and, more importantly, longer crawl times. I won’t bore you with further details about why validation is a good thing (it’s a huge subject), but if you must, there is a great article about the subject right here. Building a site to a valid XHTML standard encourages better coding practice and more semantic markup, making your site easier to crawl. You are also giving your site a better chance of displaying the same across multiple and future browsers.

Another lesser-known theory is that spiders get “full” when crawling a page, and that semantic coding practice produces cleaner, more lightweight code. For instance, when crawling a badly coded page with lots of inline styles and JavaScript (i.e. content not useful to a spider), the spider may become full too quickly and leave, missing important content further down the page.

Validating your site to at least XHTML 1.0 Transitional (the less strict version, compared to XHTML 1.0 Strict) is highly encouraged and is an area often ignored by developers. Below, I’ll quickly outline some of the common validation errors and how to easily fix them:

cannot generate system identifier for general entity X – 99% of the time this relates to errors with entity references such as ampersands in URLs. E.g. having a URL like product.php?id=2&mode=view would result in this error because ‘&amp;’ wasn’t used in place of the bare ‘&’ within the URL.

required attribute “alt” not specified – simply find the line number and add an alt attribute to the image. The alt attribute is required for both Transitional and Strict doctypes.

XML Parsing Error: Opening and ending tag mismatch – Depending on how organised you are when coding, this fix can take a matter of seconds or a lot longer. It relates to unclosed tags, usually block-level ones such as a table or div. One plus point is that fixing such an error often results in several validation errors being fixed at once.
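
If you would rather catch these three errors before running the validator, here is a rough sketch in Python using only the standard library. The function and class names (escape_href, MissingAltFinder, find_missing_alt, first_wellformedness_error) and the sample markup are my own, made up for illustration. The well-formedness check treats the markup as plain XML, so I am assuming a fragment without named entities such as &nbsp;, which the XML parser would reject without a DTD.

```python
import xml.etree.ElementTree as ET
from html import escape
from html.parser import HTMLParser


def escape_href(url):
    """Escape a raw URL for use inside an HTML attribute (& becomes &amp;)."""
    return escape(url, quote=True)


class MissingAltFinder(HTMLParser):
    """Collect (line, src) for every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> also end up here by default.
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            line, _column = self.getpos()
            self.missing.append((line, attributes.get("src")))


def find_missing_alt(markup):
    """Return a list of (line, src) pairs for images lacking an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


def first_wellformedness_error(xhtml):
    """Return (line, column) of the first XML parsing error, or None if none."""
    try:
        ET.fromstring(xhtml)
    except ET.ParseError as err:
        return err.position
    return None


# Hypothetical sample page, used only to exercise the helpers above.
page = """<div>
<img src="logo.png" alt="Company logo" />
<img src="banner.png" />
</div>"""

print(escape_href("product.php?id=2&mode=view"))  # product.php?id=2&amp;mode=view
print(find_missing_alt(page))                     # [(3, 'banner.png')]
print(first_wellformedness_error(page))           # None
print(first_wellformedness_error("<div>\n<p>unclosed\n</div>"))
```

Be careful not to run escape_href over a URL that is already escaped, or you will end up with &amp;amp; in your markup. An empty alt="" counts as present here, which matches what the validator accepts.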


301 Redirects for SEO Using htaccess

301 Redirects Prevent 404 Errors

Google treats www.website.com and website.com as two totally different websites. This is very bad for your (or even a client’s) website as it may lead to duplicate content and to PageRank being split between the two versions. Unless you tell it otherwise, Google is left to “canonicalize” the URL itself, i.e. to pick which version it treats as the preferred one, and that is bad from an SEO standpoint.

In essence, a web server could return totally different results for each of those hosts. I have also encountered the situation where clients have set their preferred domain in Google Webmaster Tools, have given out the opposite version for SEM and wonder why they don’t see results :) You can easily check the above by using the “site:” operator in Google search, e.g. site:www.website.com and site:website.com

You can use “mod_rewrite” rules as a powerful method for redirecting many URLs from one location to another. This is a simple server-level technique for handling redirects. Which version of the URL you standardise on is purely a personal choice; the method below redirects to the www version, and it can be altered to direct to the non-www version instead.

The .htaccess file is simply an ASCII file created with any normal text editor. You need to save the file as ‘.htaccess’ (no filename, .htaccess is the extension!). Open your newly created .htaccess file in your favoured text editor and add the following lines of code, replacing domain.com with your domain:

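# Requires mod_rewrite to be enabled on the server
RewriteEngine On
# Match requests for the bare domain only (dots escaped, case-insensitive)
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
# Permanently redirect to the same path on the www host
RewriteRule (.*) http://www.domain.com/$1 [R=301,L]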

Upload the .htaccess file to the root folder of your website and you’re done. All your traffic will be permanently redirected from the non-www version of your website to the www version. To do the opposite (direct all traffic to the non-www version), use the code below in the .htaccess file:

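RewriteEngine On
# Match requests for the www host only (dots escaped, case-insensitive)
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
# Permanently redirect to the same path on the bare domain
RewriteRule (.*) http://domain.com/$1 [R=301,L]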
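
Before uploading, it can help to try your host pattern against a few hostnames. Here is a small sketch that does that with Python’s re module, which behaves the same as Apache’s regular expressions for simple patterns like these. The names LOOSE, STRICT and redirect_target are my own, and the function only imitates the non-www to www rule; it is not how Apache itself evaluates it. The point is that an unescaped dot matches any character and a missing “$” lets the pattern match longer hostnames that merely start with your domain.

```python
import re

# A host pattern with unescaped dots and no end anchor.
LOOSE = re.compile(r"^domain.com")
# Dots escaped, end anchored, case-insensitive (like the [NC] flag).
STRICT = re.compile(r"^domain\.com$", re.IGNORECASE)


def redirect_target(host, path, pattern=STRICT):
    """Imitate the non-www to www rule.

    Returns the URL a matching request would be redirected to,
    or None when the host does not match and no redirect happens.
    """
    if pattern.search(host):
        return f"http://www.domain.com/{path}"
    return None


print(redirect_target("domain.com", "about.html"))      # http://www.domain.com/about.html
print(redirect_target("www.domain.com", "about.html"))  # None

# The loose pattern also fires for hosts it was never meant to match.
print(bool(LOOSE.search("domain-com.example.net")))     # True
print(bool(STRICT.search("domain-com.example.net")))    # False
```

In per-directory .htaccess context the path is matched without its leading slash, which is why the sketch joins host and path with a single “/”.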