Indiana University
University Information Technology Services
  

In Stata, how can I randomly select a certain number of observations from a data set?

In Stata, the sample command draws a random sample from the data set in memory and drops the observations that were not selected. (The leading period in the examples below is the Stata prompt, not part of the command.)

Suppose you want to randomly draw a sample of 100 observations from the current data set. First, load a data set, and then run the following command with the count option:

. sample 100, count
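
Because sample draws at random, the selected observations will differ from run to run unless the random-number seed is fixed. The following is a minimal sketch of a reproducible draw. The nlsw88 example data set shipped with Stata and the seed value 12345 are assumptions chosen for illustration; substitute your own data set and seed.

```stata
* Load an example data set shipped with Stata (illustrative assumption)
sysuse nlsw88, clear

* Fix the random-number seed so the same observations are drawn on every run
* (12345 is an arbitrary illustrative value)
set seed 12345

* Keep 100 randomly chosen observations and drop the rest
sample 100, count

* Confirm how many observations remain in memory
count
```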

If you want to take a 20% sample of the current data set, omit the count option; the number is then interpreted as a percentage:

. sample 20

If you want to take a sample that maintains the same proportion of each group, use the by() option. The following command selects 20% of the observations within each of the male (male=1) and female (male=0) groups.

. sample 20, by(male)
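
One way to check a grouped draw is to tabulate the grouping variable before and after sampling. The sketch below assumes the auto example data set shipped with Stata, with its foreign variable standing in for male; the seed value is also an arbitrary assumption. Since sample drops observations from memory, preserve and restore are used here to bring the full data set back afterward.

```stata
* Load an example data set shipped with Stata (illustrative assumption)
sysuse auto, clear

* Group sizes in the full data set
tabulate foreign

* Save a temporary copy of the data so the full set can be recovered
preserve

* Arbitrary illustrative seed for a reproducible draw
set seed 12345

* Draw 20% of the observations separately within each level of foreign
sample 20, by(foreign)

* Group sizes in the sample
tabulate foreign

* Bring the full data set back into memory
restore
```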

The sample command draws a sample without replacement. If you want to sample with replacement, use the bsample command instead.
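
As a minimal sketch of drawing with replacement, the example below again assumes the auto example data set shipped with Stata and an arbitrary seed. The number given to bsample is the sample size in observations (not a percentage); if it is omitted, bsample draws as many observations as are in memory.

```stata
* Load an example data set shipped with Stata (illustrative assumption)
sysuse auto, clear

* Arbitrary illustrative seed for a reproducible draw
set seed 12345

* Draw 50 observations with replacement; the same car may be selected more than once
bsample 50

* Report how many times each value of make appears in the resulting sample
duplicates report make
```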

For more about statistical and mathematical software, email the UITS Stat/Math Center, visit the center's web page, or phone 812-855-4724 (IUB) or 317-278-4740 (IUPUI). The center is located in Bloomington at 410 N. Park Avenue, and is open for consultation by appointment Monday-Friday 9am-5pm.

This is document awja in domain all.
Last modified on May 04, 2011.
