“A best website is the balance of SEO and an excellent design”

Wednesday, August 27, 2008

What is robots.txt


Many web designers turn a deaf ear to robots.txt, saying that it is merely the job of a search engine optimizer. But anyone who wants a website to do well in both design and SEO will try to learn such features, so it is worth a look. Let's explore the concept of robots.txt.

We all want our site to be listed at the top of Google search results, or more precisely, above our competitors' sites. We submit our site to directories, we blog daily, and so on. But sometimes we do not want Google to index specific pages. This is especially the case when pages are under construction. In such cases we include a robots.txt file. This convention is called the Robots Exclusion Protocol.

So how do I create a robots.txt file for my site, and where should I put it? Those are the obvious questions. Let's go through them.

A robots.txt file can be created with a simple text editor such as Notepad. If you are familiar with Google Webmaster Tools, you can easily create one with the robots.txt generator tool available there. The filename should be in lowercase.

The robots.txt file should be placed in the root of the domain. Don't create a directory named robots and put the robots.txt file inside it. The proper path for robots.txt would be http://www.yousite.com/robots.txt. To see any site's robots.txt file, enter the above URL with the site name replaced.

The simplest robots.txt file syntax includes two lines:

User-agent:

Disallow:

The user-agent is the search engine's robot (a program that browses the web automatically). You can set a rule for a specific robot by listing its name, or cover all robots by putting an asterisk in the User-agent line. In the Disallow line you list the pages that you want to hide from the robot. Each entry should begin with a forward slash.

For example:

User-agent: *

Disallow: /

This will block the entire site from all robots.
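If you want to check what a rule like this does before uploading it, here is a minimal sketch using the robots.txt parser from Python's standard library. The helper name is_allowed and the robot name ExampleBot are made up for illustration, and a real search engine's robot may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two-line file from the example above.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if user_agent may fetch url under the given rules.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a made-up robot name used only for this sketch.
for agent in ("Googlebot", "ExampleBot"):
    allowed = is_allowed(BLOCK_EVERYTHING, agent, "http://www.yousite.com/page.html")
    print(agent, "allowed" if allowed else "blocked")
```

With the rules above, both robots should be reported as blocked, because the asterisk covers every robot and the single slash matches every path.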

To block a directory we use:  Disallow: /directory-name

To block a specific page we use:  Disallow: /page.html
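The directory and page rules can be tried out the same way. The following is a rough sketch, again with Python's standard robots.txt parser. The function name blocked_paths and the sample paths are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Both kinds of rule from the text, applied to all robots.
RULES = """\
User-agent: *
Disallow: /directory-name
Disallow: /page.html
"""


def blocked_paths(rules_text, paths, user_agent="*"):
    """Return the paths that the rules hide from user_agent.

    blocked_paths is a hypothetical helper name used for illustration.
    """
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


# Sample paths invented for this sketch.
sample = [
    "/index.html",
    "/page.html",
    "/directory-name/photo.html",
    "/other/page.html",
]
print(blocked_paths(RULES, sample))
```

Here /page.html and /directory-name/photo.html should come back as blocked. Rules are matched against the start of the path, so /other/page.html stays open even though the file name is the same.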

That covers the basics of the Robots Exclusion Protocol. Thanks!

2 comments:

  1. This comment has been removed by the author.

  2. As a webmaster, you definitely should use user-agent headers to manage server traffic. But understand that this is purely a pragmatic tactic and not a serious security measure.

    I wrote more about this here:

    Webmaster Tips: Blocking Selected User-Agents
    http://faseidl.com/public/item/213126

