Indiana University
University Information Technology Services
  
In SAS, how can I randomly select a certain number of observations from a dataset?

In SAS version 8.0 and later, you can use the SURVEYSELECT procedure for random sampling. The procedure supports several methods for selecting probability-based random samples from an existing data set: simple random sampling without replacement (METHOD=SRS), unrestricted random sampling, that is, with replacement (METHOD=URS), systematic random sampling (METHOD=SYS), and sequential random sampling (METHOD=SEQ). It also supports probability-proportional-to-size (PPS) sampling.

Suppose you want to randomly draw 100 observations from the data set pop with 7,000 observations. Consider the following SAS code:

PROC SURVEYSELECT DATA=pop OUT=sample METHOD=SRS SAMPSIZE=100 SEED=1234567; RUN;

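The METHOD=SRS option specifies simple random sampling, in which observations are selected with equal probability and without replacement. The SAMPSIZE option specifies the number of observations to draw. The SEED option specifies the seed used in random number generation, so that rerunning the step with the same seed and the same input data reproduces the same sample. The 100 observations drawn are stored in the data set sample.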
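As a quick check of the SEED behavior, the following sketch draws the sample twice with the same seed and compares the two results. It first builds a stand-in pop data set so that it can run on its own; the variable id and the data set names sample_a and sample_b are illustrative assumptions, not part of the original example.

```sas
/* Hypothetical stand-in for pop: 7,000 observations with one variable, id. */
/* Skip this step if you already have a data set named pop.                 */
DATA pop;
   DO id = 1 TO 7000;
      OUTPUT;
   END;
RUN;

/* Draw a simple random sample of 100 observations, twice, with the same seed. */
/* NOPRINT suppresses the procedure's summary output.                          */
PROC SURVEYSELECT DATA=pop OUT=sample_a METHOD=SRS SAMPSIZE=100 SEED=1234567 NOPRINT;
RUN;

PROC SURVEYSELECT DATA=pop OUT=sample_b METHOD=SRS SAMPSIZE=100 SEED=1234567 NOPRINT;
RUN;

/* Compare the two samples observation by observation. */
PROC COMPARE BASE=sample_a COMPARE=sample_b;
RUN;
```

If the seed works as described, PROC COMPARE should report no unequal values between sample_a and sample_b. Changing the seed in one of the two steps should produce a different set of 100 observations.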

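If you have to use SAS 6.12, which does not have the SURVEYSELECT procedure, you must write DATA step code to randomly select observations. The following example reads pop once and, for each observation, compares a random number drawn from a uniform distribution on the interval (0, 1) with k/n, where k is the number of observations still to be selected and n is the number of observations not yet read. The random numbers come from the UNIFORM() function, an alias of RANUNI():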

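DATA sample2;
   RETAIN n 7000 k 100;   /* n = observations not yet read, k = observations still to select */
   SET pop;
   prob = k/n;            /* chance of selecting the current observation */
   IF UNIFORM(1234567) < prob THEN DO;
      OUTPUT;
      k = k - 1;
   END;
   n = n - 1;
RUN;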

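In the above SAS 6.12 code, the value of prob is not the same across observations: it is the conditional probability of selecting the current observation, given how many observations remain to be read (n) and how many still need to be selected (k). Which observations end up in the sample depends on the seed value. Because k and n are updated after every observation, the step outputs exactly 100 observations, and each of the 7,000 observations has the same overall chance (100/7000) of being selected. The result is therefore a simple random sample without replacement.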
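To see how the k/n rule behaves over many draws, here is a small simulation sketch. It applies the same selection logic to a hypothetical population of 10 positions with a sample size of 3, repeats the draw 10,000 times, and tabulates how often each position is selected. The population size, sample size, number of replications, and the names picks, rep, and id are all assumptions chosen for illustration.

```sas
/* Repeat the k/n selection rule 10,000 times on a population of 10 positions. */
DATA picks;
   DO rep = 1 TO 10000;
      n = 10;                    /* positions not yet visited in this replication */
      k = 3;                     /* positions still to select in this replication */
      DO id = 1 TO 10;
         /* Successive calls with the same seed continue one random number stream. */
         IF UNIFORM(1234567) < k/n THEN DO;
            OUTPUT;              /* one row per selected position */
            k = k - 1;
         END;
         n = n - 1;
      END;
   END;
   KEEP rep id;
RUN;

/* Tabulate how often each position was selected across all replications. */
PROC FREQ DATA=picks;
   TABLES id;
RUN;
```

Each replication contributes exactly 3 rows to picks, so the data set should contain 30,000 rows. If every position is equally likely to be included, each value of id should account for roughly 10 percent of the rows in the PROC FREQ table.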

For more about statistical and mathematical software, email the UITS Stat/Math Center, visit the center's web page, or phone 812-855-4724 (IUB) or 317-278-4740 (IUPUI). The center is located in Bloomington at 410 N. Park Avenue, and is open for consultation by appointment Monday-Friday 9am-5pm.

This is document aeji in domain all.
Last modified on January 27, 2011.
