ARCHIVED: In SAS, how can I randomly select a certain number of observations from a data set?

This content has been archived, and is no longer maintained by Indiana University. Information here may no longer be accurate, and links may no longer be available or reliable.

You can use the SURVEYSELECT procedure for random sampling. It selects probability-based random samples from an existing data set and supports several selection methods: simple random sampling (SRS), unrestricted random sampling (URS), systematic random sampling (SYS), and sequential random sampling (SEQ). It also supports probability-proportional-to-size (PPS) sampling.

Suppose you want to randomly draw 100 observations from a data set named pop that contains 7,000 observations. Consider the following SAS code:

  /* Draw a simple random sample of 100 observations from pop */
  PROC SURVEYSELECT DATA=pop OUT=sample METHOD=SRS
    SAMPSIZE=100 SEED=1234567;
  RUN;

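The METHOD=SRS option specifies simple random sampling, in which each observation has an equal chance of selection and is drawn without replacement. The SAMPSIZE=100 option sets the number of observations to draw. The SEED option specifies the seed for random number generation, so rerunning the code with the same seed and the same input data reproduces the same sample. The 100 observations drawn are stored in the data set sample.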
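The effect of the SEED option can be checked with a short sketch. The code below is illustrative rather than part of the original example: it assumes a stand-in data set pop built with a hypothetical variable id, draws two samples with the same seed into the hypothetical data sets sample1 and sample2, and compares them. NOPRINT only suppresses the procedure's printed summary.

```sas
/* Build a stand-in population of 7,000 rows.
   The variable name id is an assumption for this sketch. */
DATA pop;
  DO id = 1 TO 7000;
    OUTPUT;
  END;
RUN;

/* Draw two samples of 100 using the same method, size, and seed. */
PROC SURVEYSELECT DATA=pop OUT=sample1 METHOD=SRS
  SAMPSIZE=100 SEED=1234567 NOPRINT;
RUN;

PROC SURVEYSELECT DATA=pop OUT=sample2 METHOD=SRS
  SAMPSIZE=100 SEED=1234567 NOPRINT;
RUN;

/* Compare the two samples row by row; with identical seeds and
   identical input, no differences are expected. */
PROC COMPARE BASE=sample1 COMPARE=sample2;
RUN;
```

Changing the seed in one of the two calls should produce a different set of 100 observations. If a sampling fraction is more convenient than a fixed count, the SAMPRATE= option can be used in place of SAMPSIZE=.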

If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.

This is document aeji in the Knowledge Base.
Last modified on 2023-05-09 14:37:27.