Indiana University
University Information Technology Services
  
What are archived documents?
Login>>
Login

Login is for authorized groups (e.g., UITS, OVPIT, and TCC) that need access to specialized Knowledge Base documents. Otherwise, simply use the Knowledge Base without logging in.

Close

In SAS, how do I update one data set with another?

In SAS, you can use either the MERGE statement or the UPDATE statement in a DATA step to update the values of observations in a master data set. Both statements should be followed by the BY statement, which specifies the primary key (variable) sorted by the SORT procedure.

The MERGE statement automatically replaces all existing values with new values, whether they are missing or not. In contrast, the UPDATE statement updates values only when corresponding data are not missing. The default option, UPDATEMODE=MISSINGCHECK, prevents missing values in a transaction data set from replacing values in a master data set. The UPDATEMODE=NOMISSINGCHECK option is similar to the MERGE statement in that all values in the master data set are overwritten by corresponding data in the transaction data set. Consequently, the UPDATE statement is more flexible than the MERGE statement.

Suppose you have two data sets, master and transact. The master data set has four variables, id, name, gender, and weight, as follows:

01 Perry M 165 02 Miller M 145 03 Davis F 127

The transact data set has five variables: id, name, gender, weight, and height. Missing values are indicated by  .  (a period) in this data set:

02 Miller . 160 5.9 03 Bush . 157 . 05 Elliot F 125 5.2 02 . M 170 .

Now, sort the data by id:

PROC SORT data=master; BY id; RUN; PROC SORT data=transact; BY id; RUN;

The following example uses the UPDATE statement:

DATA new_master1; UPDATE master transact; BY id; RUN;

The output would appear as follows:

01 Perry M 165 . 02 Miller M 170 5.9 03 Bush F 157 . 05 Elliot F 125 5.2

The new variables (height) and observation (05 Elliot) in the transaction data set that do not exist in the master data set will be added to the output data set, new_master1. If a transaction data set has multiple observations with the same primary identification key, they will be sequentially updated.

The following example uses the MERGE statement:

DATA new_master2; MERGE master transact; BY id; RUN;

The output is given below:

01 Perry M 165 . 02 Miller . 160 5.9 02 . M 170 . 03 Bush . 157 . 05 Elliot F 125 5.2

Note that Miller and Bush look different from those in the above example. The last observation of the transact data set was added, although it has the same id (02) as Miller.

If you have questions about using statistical and mathematical software at Indiana University, email UITS Research Analytics (formerly known as the Stat/Math Center). Research Analytics is located on the IU Bloomington campus at Woodburn Hall 200, and is open for consultation by appointment Monday-Friday 9am-5pm. For more, visit Research Analytics on the web, or call 812-855-4724 (IUB) or 317-278-4740 (IUPUI).

This is document aljg in domain all.
Last modified on September 26, 2012.

I need help with a computing problem

  • Fill out this form to submit your issue to the UITS Support Center.
  • Please note that you must be affiliated with Indiana University to receive support.
  • All fields are required.



Please provide your IU email address. If you currently have a problem receiving email at your IU account, enter an alternate email address.

I have a comment for the Knowledge Base

  • Fill out this form to submit your comment to the IU Knowledge Base.
  • If you are affiliated with Indiana University and need help with a computing problem, please use the I need help with a computing problem section above, or contact your campus Support Center.