Indiana University
University Information Technology Services
  

What are web search robots, and how do they affect me?

Search robots, also known as bots, wanderers, spiders, and crawlers, are the tools that many web search engines, such as Google, Bing, and Yahoo!, use to build their databases. Most robots work like web browsers, except that they don't require user interaction.

Robots retrieve web pages and typically follow the links on each page to discover other pages and sites. They can index titles, summaries, or the entire contents of documents much more quickly and thoroughly than a human could.

While their speed and efficiency make them very appealing to the managers of search engines, search robots, especially poorly constructed ones, can overwhelm some servers. Administrators can exclude or limit robot access by placing robots.txt files on their servers that outline how their sites are to be accessed.
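
As a rough illustration of how a well-behaved robot might consult these rules, the following Python sketch uses the standard library's urllib.robotparser module. The two robots.txt bodies, the ExampleBot user agent, and the example.org URLs are hypothetical and chosen only for illustration; they are not taken from any real server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes every robot from the whole site.
DENY_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only limits access to one directory.
LIMITED = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(DENY_ALL, "ExampleBot", "https://example.org/index.html"))
    print(is_allowed(LIMITED, "ExampleBot", "https://example.org/index.html"))
    print(is_allowed(LIMITED, "ExampleBot", "https://example.org/private/notes.html"))
```

A cooperative robot would run a check like this before every request. Nothing in the file itself enforces the rules; compliance is up to the robot.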

At Indiana University, the personal web page server, mypage.iu.edu, has a robots.txt file that denies access to all robots. Faculty and staff may change their robots.txt search engine crawl settings at Search Engine Crawl Settings for Mypage. For more about the Mypage service, as well as alternative ways to publicize your Mypage page, see At IU, what is Mypage, and how can I publish a web page there?

For more about excluding robot searches, see About /robots.txt and A Standard for Robot Exclusion.

If you have your own pages on a system that is not protected with a robots.txt file and you wish to exclude robot searches, you can add the following tag to the <head> section of each page:

<meta name="robots" content="noindex,nofollow">

For more about how to use the <meta> tag with the name attribute set to "robots" to regulate robot searches of your pages, see About the Robots <META> tag.

Unfortunately, not all robots honor robot exclusions and limitations.
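
To make the mechanism concrete, here is a minimal Python sketch of how a cooperative robot might read the robots <meta> tag before deciding whether to index a page or follow its links. It uses the standard library's html.parser module; the class and function names (RobotsMetaParser, robots_directives) are hypothetical and do not come from any particular search engine.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    found = robots_directives(page)
    print("may index:", "noindex" not in found)
    print("may follow links:", "nofollow" not in found)
```

As with robots.txt, the tag is only a request: a robot that never performs this check will index the page regardless.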

In addition to regulating whether robots search your page, you can also use <meta> tags named "keywords" and "description" to improve the results that robots get. Search engines use the description to summarize your page in their results, which can be especially useful if your page contains little text. Search engines may also index the keywords in addition to the text in the title and body of your document. For example, a web page about Darth Vader might include these <meta> tags:

<meta name="keywords" content="evil leader, darkside, sith, choke, empire, asthmatic">
<meta name="description" content="Darth Vader: More than just another pretty face">
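
For illustration, the following Python sketch shows one way an indexing robot could pull the description and keyword list out of a page like the one above. It assumes the conventional "keywords" and "description" names; MetaCollector and describe_page are hypothetical names, and real search engines may weigh or ignore these tags differently.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Records the content of every named <meta> tag on a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        if name:
            self.meta[name] = attr_map.get("content") or ""


def describe_page(html: str):
    """Return (description, keywords) as a string and a list of strings."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    raw_keywords = collector.meta.get("keywords", "")
    keywords = [k.strip() for k in raw_keywords.split(",") if k.strip()]
    return collector.meta.get("description", ""), keywords


if __name__ == "__main__":
    page = (
        '<head>'
        '<meta name="keywords" content="evil leader, darkside, sith">'
        '<meta name="description" content="Darth Vader: More than just another pretty face">'
        '</head>'
    )
    description, keywords = describe_page(page)
    print(description)
    print(keywords)
```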

For more about these tags and <meta> tags in general, see HTTP-EQUIV (HTTP header) Index. For more about robots, including a FAQ and a list of known robots, see Frequently Asked Questions.

This is document aeub in domain all.
Last modified on February 19, 2013.
