Merge two data sets in SAS

To match-merge two or more data sets in SAS, first sort each data set by the shared variable on which the merge will be based, and then use the MERGE statement together with a BY statement in a DATA step. If you use MERGE without a BY statement, called one-to-one merging, SAS combines observations by their position rather than by matching values. When the data sets share a variable name, values from the data set listed later overwrite those from the data set listed earlier, regardless of whether the two observations refer to the same case.
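
As a minimal sketch of one-to-one merging, the following uses two hypothetical data sets, a and b (they are illustrative only and not part of the example in this document), and a MERGE statement with no BY statement. Because observations are paired by position, the id values from b replace those from a:

/* Hypothetical data sets a and b, for illustration only */
DATA a;
 INPUT id x;
 DATALINES;
 1 10
 2 20
 ;
RUN;
DATA b;
 INPUT id y;
 DATALINES;
 2 200
 3 300
 ;
RUN;
/* No BY statement: first row of a pairs with first row of b, and so on */
DATA combined;
 MERGE a b;
RUN;
/* combined: id=2 x=10 y=200, then id=3 x=20 y=300 */
PROC PRINT DATA=combined;
RUN;

Here x=10 ends up alongside id=2 even though it belonged to id=1 in data set a, which is why a BY statement is usually needed when rows should be matched on a key.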

Suppose you create two data sets (one and two below) with a common variable, id. The SAS code below shows how they can be sorted and merged:

DATA one;
 INPUT id v1 v2;
 DATALINES;
 1 10 100
 2 15 150
 3 20 200
 ;
PROC SORT DATA=one;
 BY id;
RUN;
DATA two;
 INPUT id v3 v4;
 DATALINES;
 1 1000 10000
 2 1500 15000
 3 2000 20000
 4  800 30000
 ;
PROC SORT DATA=two;
 BY id;
RUN;
DATA three;
 MERGE one two;
 BY id;
RUN;
PROC PRINT DATA=three;
RUN;

In the example above, data set three is created by merging data sets one and two by id. It has five variables (id and v1 through v4) and four observations. For the observation where id=4, which appears only in data set two, variables v1 and v2 are missing.
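
If you would rather keep only the observations that appear in both data sets, one possible approach is the IN= data set option, sketched below. The data set name matched and the flag variable names in_one and in_two are illustrative choices, not part of the original example; the step assumes data sets one and two have already been created and sorted by id as shown above:

/* in_one and in_two are temporary flags: 1 if that data set contributed to the current observation */
DATA matched;
 MERGE one (IN=in_one) two (IN=in_two);
 BY id;
 /* Subsetting IF: keep only ids found in both data sets */
 IF in_one AND in_two;
RUN;
PROC PRINT DATA=matched;
RUN;

With the sample data, matched should contain the three observations with id 1 through 3 and omit id=4. The flag variables are not written to the output data set.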

If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.

This is document afin in the Knowledge Base.
Last modified on 2023-10-02 13:29:42.