Search Engine Optimization Software


What is the robots.txt file

To rank well with any search engine, you need to make sure that the spiders visiting your site are able to index all of it.

Be sure to use a robots.txt file so the search engines know what they are allowed to index on your site.

User-agent: *
Disallow: /cgi-bin/

The above is what I have in my robots.txt, because I do not want the search engines to spider my cgi-bin.
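To check what a record like the one above actually blocks, here is a minimal sketch using Python's standard-library urllib.robotparser. The robot name "ExampleBot" and the test paths are made up for illustration, and other crawlers may interpret rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The same two lines shown above.
rules = """\
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) robot may fetch the given path."""
    return parser.can_fetch(agent, path)


blocked_script = allowed("/cgi-bin/search.pl")   # inside the blocked directory
home_page = allowed("/index.html")               # not covered by any Disallow line
```

Here blocked_script comes out False and home_page comes out True, so only the cgi-bin directory is off limits.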

User-agent

The User-agent line specifies which robot the record applies to. For example:

User-agent: googlebot

You may also use the wildcard character "*" to address all robots:

User-agent: *
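The sketch below shows how a named record and a wildcard record sit side by side, again using Python's urllib.robotparser. The /private/ path and the robot name "OtherBot" are assumptions chosen only for the example. A robot that finds a record naming it follows that record; every other robot falls back to the "*" record.

```python
from urllib.robotparser import RobotFileParser

# Two records separated by a blank line: one for googlebot, one for everyone else.
rules = """\
User-agent: googlebot
Disallow: /private/

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# googlebot is matched by its own record, so only /private/ is blocked for it.
googlebot_private = parser.can_fetch("googlebot", "/private/page.htm")
googlebot_cgi = parser.can_fetch("googlebot", "/cgi-bin/form.cgi")

# A robot with no record of its own (hypothetical "OtherBot") uses the "*" record.
other_private = parser.can_fetch("OtherBot", "/private/page.htm")
other_cgi = parser.can_fetch("OtherBot", "/cgi-bin/form.cgi")
```

Note that googlebot is not bound by the wildcard record here, so if you want a rule to apply to a named robot as well, repeat it in that robot's record.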

To see the names of the spider robots that are coming to your site, check your own server logs for requests to robots.txt.
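As a rough sketch of that log check, the following Python counts the user-agent strings on requests for /robots.txt. It assumes your server writes the common "combined" access log format; the sample lines, IP addresses, and the "ExampleBot/1.0" name are invented for illustration, so adjust the pattern to match your own logs.

```python
import re
from collections import Counter

# Assumed combined log format:
# ip - - [date] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robots_txt_agents(lines):
    """Count user-agent strings on requests for /robots.txt."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        # Ignore any query string when comparing the requested path.
        if match and match.group("path").split("?")[0] == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts


# Made-up sample lines; in practice, pass an open log file instead.
sample = [
    '203.0.113.5 - - [01/Jan/2024:00:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 42 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [01/Jan/2024:00:00:02 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Jan/2024:00:00:03 +0000] "GET /robots.txt HTTP/1.1" 200 42 "-" "ExampleBot/1.0"',
]
agents = robots_txt_agents(sample)
```

The name to put on a User-agent line is usually the first token of the logged string (the part before the "/").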

Disallow

The second part of a record consists of the Disallow lines. These lines specify the files and directories that you DO NOT want the spider robots to index. Each value is a path that starts with "/" and is relative to the root of your site. For example, the following line tells the spiders that they may not download (index) members.htm:

Disallow: /members.htm

You may also specify directories:

Disallow: /cgi-bin/

This would block spiders from your cgi-bin directory.
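The file and directory rules can be tried out together with Python's urllib.robotparser. This is only a sketch: the robot name "ExampleBot" and the extra test paths are assumptions, and the result for a path written without a leading "/" is specific to this parser.

```python
from urllib.robotparser import RobotFileParser


def build(lines):
    """Parse robots.txt lines and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


rules = build([
    "User-agent: *",
    "Disallow: /members.htm",   # a single file
    "Disallow: /cgi-bin/",      # a whole directory
])

file_blocked = rules.can_fetch("ExampleBot", "/members.htm")
dir_blocked = rules.can_fetch("ExampleBot", "/cgi-bin/form.cgi")
other_page = rules.can_fetch("ExampleBot", "/about.htm")

# Disallow values are matched as prefixes, so this longer name is blocked too.
prefix_match = rules.can_fetch("ExampleBot", "/members.html")

# Without the leading "/", this parser never matches the rule.
no_slash = build(["User-agent: *", "Disallow: members.htm"])
no_slash_result = no_slash.can_fetch("ExampleBot", "/members.htm")
```

The first two checks and prefix_match return False, while other_page and no_slash_result return True, which is why the path should always begin with "/".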

If you leave the value of the Disallow line blank, it indicates that ALL files may be retrieved. At least one Disallow line must be present in each User-agent record for the record to be valid.

A completely empty robots.txt file is the same as if it were not present.
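Both cases can be confirmed with a short sketch using Python's urllib.robotparser; the robot name "ExampleBot" and the test path are placeholders, not anything a real crawler requires.

```python
from urllib.robotparser import RobotFileParser

# A record with a blank Disallow value: nothing is blocked.
blank_disallow = RobotFileParser()
blank_disallow.parse(["User-agent: *", "Disallow:"])

# A completely empty robots.txt file: no records at all.
empty_file = RobotFileParser()
empty_file.parse([])

blank_result = blank_disallow.can_fetch("ExampleBot", "/cgi-bin/form.cgi")
empty_result = empty_file.can_fetch("ExampleBot", "/cgi-bin/form.cgi")
```

Both results are True, meaning the robot may fetch every path in either case.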

