Monday, November 2, 2009

Create Robots.txt File - Robots Text File

Creating a robots.txt file will not improve your search engine positioning, but it does tell robots which files you do not want crawled and indexed by the search engines.

When a robot crawls your site, it looks for the robots.txt file. If it doesn't find one, it assumes that it may crawl and index the entire site. Not having a robots.txt file can also create unnecessary 404 errors in your server logs, making it more difficult to track "real" 404 errors.
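To make the lookup concrete: robots that follow the convention request the file from the root of the host, whatever page they started from. A minimal Python sketch of that step follows. The function name robots_url and the host www.example.com are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url.

    Hypothetical helper for illustration; not part of any crawler.
    """
    parts = urlsplit(page_url)
    # Keep only the scheme and host; drop the page's own path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.example.com/images/logo.png"))
# prints http://www.example.com/robots.txt
```

If that request comes back as a 404, you get the log entry described above.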

What is a simple robots.txt file?

Please note: This allows all robots to crawl and index all files.

User-agent: *
Disallow:

What if I don't want a particular file crawled?
Please note: Disallowing a specific file from being crawled will normally keep it from being indexed, so it should not show up in the search engines. HOWEVER, this is only effective for friendly robots. Robots can choose to ignore your instructions.

This allows all robots to crawl all files except those in the images directory.

User-agent: *
Disallow: /images/

This allows all robots to crawl all files except those in the images directory and the stats directory.

User-agent: *
Disallow: /images/
Disallow: /stats/
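If you want to check rules like these before publishing them, one option is the urllib.robotparser module in Python's standard library. The sketch below feeds it the rules above and asks which paths a robot may fetch. The robot name ExampleBot, the host www.example.com and the helper allowed are made up for illustration, and a real robot may interpret the file differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
RULES = """\
User-agent: *
Disallow: /images/
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) robot may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ("/index.html", "/images/logo.png", "/stats/2009.html"):
    print(path, allowed(path))
```

Only /index.html is reported as allowed. The other two paths fall under the Disallow lines.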

What if I want to disallow a particular robot?
Occasionally you may find that you would like to disallow specific robots from crawling your site or limit which files they may have access to.

This denies Googlebot-Image access to every file on your domain.

User-agent: Googlebot-Image
Disallow: /

This denies Googlebot-Image access to your images directory only.

User-agent: Googlebot-Image
Disallow: /images/
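A rule aimed at one robot leaves every other robot unaffected. As a rough check, the sketch below runs the rules above through Python's urllib.robotparser and compares Googlebot-Image with a second robot. The name ExampleBot and the host www.example.com are made up for illustration, and the result shows how this parser reads the file, not necessarily how every robot does.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above: only Googlebot-Image is restricted.
RULES = """\
User-agent: Googlebot-Image
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

SITE = "http://www.example.com"  # hypothetical host

# Googlebot-Image is kept out of /images/ but may fetch everything else.
image_bot_images = parser.can_fetch("Googlebot-Image", SITE + "/images/photo.jpg")
image_bot_home = parser.can_fetch("Googlebot-Image", SITE + "/index.html")

# A robot not named in the file has no rules applied to it.
other_bot_images = parser.can_fetch("ExampleBot", SITE + "/images/photo.jpg")

print(image_bot_images, image_bot_home, other_bot_images)
# prints False True True
```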

