Web search robots

Search robots, also known as bots, wanderers, spiders, and crawlers, are the tools many web search engines, such as Google, Bing, and Yahoo!, use to build their databases. Most robots work like web browsers, except they don't require user interaction.

Robots retrieve web pages and typically follow the links on each page to discover other pages and sites. They can index titles, summaries, or the entire contents of documents much more quickly and thoroughly than a human could.
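The link-following step can be illustrated with a minimal sketch that uses only the Python standard library. It is not how any particular search engine works; the names LinkCollector and extract_links and the example.com URLs are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the links found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content and URL, for illustration only.
page = '<a href="/about.html">About</a> <a href="https://example.org/">Elsewhere</a>'
print(extract_links(page, "https://example.com/index.html"))
```

A real robot would then fetch each discovered URL in turn, which is why a poorly constructed one can generate a large number of requests in a short time.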

While their speed and efficiency make them very appealing to the managers of search engines, search robots, especially poorly constructed ones, can overwhelm some servers. Administrators can exclude or limit robot access by placing a robots.txt file at the top level of a website; the file specifies which parts of the site robots may access.

IU Pages websites come with a default robots.txt file that denies access to all search robots. Account owners can edit the file at any time to allow search engines to index the website.
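As a rough sketch of how a well-behaved robot interprets such a file, the following uses Python's standard urllib.robotparser module. The two robots.txt bodies are the conventional "deny all" and "allow all" forms and are assumptions; the exact contents of the IU Pages default file are not shown here. The robot name ExampleBot, the helper may_fetch, and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Conventional "deny everything" file (assumed form, not the actual IU Pages file).
DENY_ALL = """\
User-agent: *
Disallow: /
"""

# An empty Disallow value means nothing is disallowed.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/index.html"  # hypothetical URL
print(may_fetch(DENY_ALL, "ExampleBot", url))   # False
print(may_fetch(ALLOW_ALL, "ExampleBot", url))  # True
```

Changing the file from the first form to the second is one way an account owner could open a site to indexing; more selective rules can list individual paths instead.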

For more about excluding robot searches, see About /robots.txt and A Standard for Robot Exclusion.

If you have your own pages on a system that is not protected with a robots.txt file and you wish to exclude robot searches, you can add the following tag to the <head> section of each page:

  <meta name="robots" content="noindex,nofollow">

For more about how to use the <meta> tag with name="robots" to regulate robot searches of your pages, see About the Robots <META> tag.
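To show how a cooperating robot might read this tag, here is a minimal sketch using Python's standard html.parser module. The class RobotsMetaParser and the function robots_directives are hypothetical names; real robots may parse pages differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and not case-sensitive.
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots directives declared in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
found = robots_directives(page)
print("may index:", "noindex" not in found)          # may index: False
print("may follow links:", "nofollow" not in found)  # may follow links: False
```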

Unfortunately, compliance is voluntary, and not all robots honor robot exclusions and limitations.

In addition to regulating whether robots search your page, you can also use the <meta> tag with name="description" to improve how your page is summarized in search results. Search engines use descriptions to describe your page, which can be especially useful if your page contains little text. For example, a web page about Darth Vader might include this <meta> tag:

  <meta name="description" content="Darth Vader: More than just another pretty face">
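If descriptions are generated by a script rather than typed by hand, any quotation marks or ampersands in the text need to be escaped so the tag stays valid. The following is a small sketch using Python's standard html.escape function; the helper name description_meta and the second sample text are hypothetical.

```python
from html import escape


def description_meta(text):
    """Build a description <meta> tag with the text safely escaped."""
    # Collapse runs of whitespace so the description stays on one line.
    summary = " ".join(text.split())
    return '<meta name="description" content="{}">'.format(escape(summary, quote=True))


print(description_meta("Darth Vader: More than just another pretty face"))
print(description_meta('Vader says "I am your father" & more'))
```

The first call reproduces the tag shown above unchanged; the second replaces the double quotes and the ampersand with &quot; and &amp; so that they do not end the content attribute early.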

For more about these tags and <meta> tags in general, see HTTP-EQUIV (HTTP header) Index. For more about robots, including a FAQ and a list of known robots, see Frequently Asked Questions.