Site Diagnosis – are all your pages indexed properly?

I have just released a new feature for the Link Diagnosis Firefox extension that makes it easy to diagnose which pages on your website are indexed.

A couple of weeks ago I faced the tedious task of finding out which of the 100k pages on a site were indexed and which were not. I knew that some of them could have been marked as duplicate content, or that Google simply hadn’t indexed them because of the size of the website.

First I installed Google Webmaster Tools, hoping that Google would tell me. Unfortunately, the Indexed Pages tab just pointed me to the site: command.

I don’t trust the site: command. The page count in particular is very inaccurate: I know I have 100k pages, yet Google tells me I have 150k pages indexed.

Also, there is no easy way to see more than 1000 results (you can play with inurl: commands, but it takes ages and you can get banned).

Because of these problems I decided to code a tool that automates the check: Site Diagnosis.

The internal algorithm of the tool works as follows:

1. Go through every URL in the XML Sitemap file and run a simple check: inurl:http://www.samplesite.com/dir/url1
2. A URL that does not show up for the inurl: command may still be indexed, so a miss here is not conclusive.
3. For every URL in the XML Sitemap I fetch the page title and run this check: site:http://www.samplesite.com sample title
4. If the page does not rank in the top 10 for its own title within the site, then something is probably wrong.
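
The steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not the extension's actual code: the function names, the result labels and the `search` callback are all made up for the example. The sketch assumes `search` takes a query string and returns the result URLs in ranking order, and it deliberately does not talk to any real search engine.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List
from urllib.parse import urlsplit

# Namespace used by standard XML Sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> List[str]:
    """Return every <loc> value found in an XML Sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]


def inurl_query(url: str) -> str:
    """Step 1: the simple inurl: check for a single URL."""
    return f"inurl:{url}"


def site_title_query(url: str, title: str) -> str:
    """Step 3: search for the page title, restricted to the page's own site."""
    parts = urlsplit(url)
    return f"site:{parts.scheme}://{parts.netloc} {title}"


def diagnose(url: str, title: str,
             search: Callable[[str], List[str]]) -> str:
    """Classify one URL. `search` is a hypothetical callback that
    returns result URLs in ranking order for a query string."""
    if url in search(inurl_query(url)):
        return "found by inurl"
    # Step 2: a miss on inurl: is not conclusive, so try the title check.
    top10 = search(site_title_query(url, title))[:10]
    if url in top10:
        return "found by title"
    # Step 4: not in the top 10 for its own title, so it needs a closer look.
    return "suspect"
```

To use it you would loop over `sitemap_urls(...)`, fetch each page's title yourself, and pass in whatever `search` function you have available. Pages that come back as "suspect" are the ones worth inspecting by hand.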

This check is surprisingly accurate. Most of the pages that fail it have problems such as duplicate content, missing titles, missing content or not enough content. These troubled pages usually don’t appear in the search results when you search for any text on the page, not even when you enclose whole sentences in quotes.

Obviously, one goal of search engine optimization is to fix these pages, so Site Diagnosis will hopefully prove essential in identifying them.

  • nice points. thanks for sharing.
  • I searched and found! Excellent tool.
  • your tool is really great. i like it. its fast and you can see all information directly after the scan.
  • I know this is off the topic but I found this site by searching on Yahoo for hotel marketing. How did you optimize your site to rank so high in the search engine results?
  • That's new.I hadn't never thought the evidently simple ways Google worked in. The truth of the issue is that even though Google looks at your page multiple times, it takes a metric tonne of work on your part in order to get a page to become intriguing to Google. This will add to my understanding of search engine optimization.
  • This is actually a feature I had been hoping someone would come up with. As far as I'm concerned, if pages aren't indexed in Google they are potentially losing me money. So being able to see what pages aren't indexed I can simply get a few links to them. Thanks.
  • Now that's a useful tool. That's the problem I always have because of the limitations the search engines give. Nice.
  • Great Tool. Very helpful!!
  • 4everdesign
    Thanks. FF ext is especially cool.
  • Hej - a great tool! Really love it. And very interesting insights you share in your articles - so many thanks to you!
  • Great tool
  • I love your tool. I have read that there is a paid version, or will there be one?
  • VERY USEFUL INSTRUMENT
  • Very nice tool. Thanks for the application.
  • the best thing since sliced bread.
    - Great stuff
  • Randy Moore
    Great Tool!
  • Great tool! Thanks a lot! Very useful!
  • It's really a nice tool: useful and simple!
  • Thank you for that awesome tool. I'm a frequent user and I have to say that all the features are very useful and functional.
  • Very nice tool. Thanks for sharing.
  • Janusz
    Billy> I am not planning to do any white-labels sorry. You might try Blogstorm hosted link analysis tool.
  • Billy
    Hi, is there a way to private label your tool?