Importance of the Robots.txt File

I try to monitor how search engines like Google and Yahoo are indexing my blog. Over the past two weeks Google had indexed about 35 pages, posts, and tags. For a blog that is just over one month old, I thought that wasn’t too bad. Then I checked my site’s listing on Google yesterday and I had 131 listings. For those of you who don’t know how to find out how search engines like Google and Yahoo are indexing your site, it is very simple. Just go to either search engine and type
site:www.yourdomain.com
and it will list all of the pages that have been indexed. When I saw this I knew that something wasn’t right, because I didn’t have anywhere near that many posts or pages for Googlebot to have crawled. So I went through all of the listings to see what Googlebot had found on my site. The more I looked, the clearer it became that Googlebot had indexed my wp-content folder, which means that it had listed my theme folder along with my plugins folder. This is not a big problem; I simply hadn’t configured my robots.txt file. So last night I set up my robots.txt file and submitted it to Google, and I am currently waiting for my listings in Google to reflect the change.
Here is a very simple example of a robots.txt file:
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /i/
Disallow: /f/
Disallow: /t/
Disallow: /wget/
Disallow: /httpd/
Disallow: /c/
Disallow: /j/
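Before submitting the file, it can help to check locally that the rules block what you expect. The following is a minimal sketch using the robots.txt parser in Python's standard library. The helper name is_allowed and the example post URL are made up for illustration, and only the first few rules from the example above are included. Python's parser follows the basic prefix-matching rules, so Googlebot's own interpretation may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the example rules above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if user_agent may crawl url under the given rules.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The theme folder is under /wp-content, so it should be blocked.
theme_allowed = is_allowed(ROBOTS_TXT, "http://www.yourdomain.com/wp-content/themes/")
# An ordinary post URL (made up here) matches no Disallow rule.
post_allowed = is_allowed(ROBOTS_TXT, "http://www.yourdomain.com/my-first-post/")
print(theme_allowed, post_allowed)
```

Each Disallow line is treated as a path prefix, so "/wp-content" also covers everything beneath it, including the theme and plugins folders.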
Once you have created your robots.txt, it is important that you go to Google Webmaster Tools and make sure that Googlebot is able to access it. This can be done in six easy steps:
Step 1 - Log in to your Google Account and click on Webmaster Tools.
Step 2 - Select the domain that you are adding the robots.txt file to.
Step 3 - Click on the Tools tab and select robots.txt.
Step 4 - Paste the text from your robots.txt file into the first box you see. The box should already contain the default text “User-agent: *” followed by “Disallow:” on the next line; replace it with your own rules.
Step 5 - Make sure your domain URL is listed in the second box.
Step 6 - Click on Check at the bottom and Googlebot will download your robots.txt file, and you’re done.
Note that this is the only time you will need to enter your robots.txt in Google Webmaster Tools. To make changes later, edit the robots.txt file on your server and Googlebot will download the changes.
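Since crawlers download the file from your server, it is worth confirming that the copy there is the one you expect. Here is a rough sketch, assuming the file is served at the root of the site, which is where crawlers look for it. The function names robots_url and fetch_robots are hypothetical, and www.yourdomain.com is the same placeholder used earlier.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(site_url: str) -> str:
    """Build the robots.txt URL for a site from any URL on that site."""
    # A leading slash makes urljoin resolve against the site root.
    return urljoin(site_url, "/robots.txt")


def fetch_robots(site_url: str, timeout: float = 10.0) -> str:
    """Download the robots.txt that is currently live on the server."""
    with urlopen(robots_url(site_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Building the URL needs no network access.
print(robots_url("http://www.yourdomain.com/blog/"))
```

Calling fetch_robots with your real domain and printing the result shows the file as it is served, which you can compare against the version you edited.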
