In Stata, how do I merge two data sets?
Note: For a one-to-many or many-to-one match merge,
use merge 1:m or merge m:1; see "In Stata, how do I merge two data sets in the many-to-one relationship?"
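To merge two data sets in Stata, first identify the key variables on
which the merge will be based. In Stata version 10 or older, each data
set must be sorted on the key variables before merging; in version 11,
merge sorts the data sets itself if necessary, so sorting beforehand is
harmless but not required. Then, use the merge command followed by the
key variable(s) and the data set(s). In Stata version 11: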
merge 1:1 varlist using filename [, options]
Note: If you're using Stata version 10 or older, omit
the 1:1 specification. In a one-to-one match merge, the key
variables must uniquely identify the observations in each data set.
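One way to check the uniqueness requirement before merging is the isid command, which stops with an error if the listed variables do not uniquely identify the observations. The sketch below assumes two data sets saved as stat.dta and math.dta in the current working directory, with key variables id and name; these names match the example that follows.

```stata
* Assumption: stat.dta and math.dta are in the current working directory
use stat, clear
isid id name          // error if id and name do not uniquely identify observations

use math, clear
isid id name
```

If isid reports an error, duplicates report id name shows how many observations share the same key values.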
Suppose we have two key variables id and
name in two data sets stat and
math. The following code sorts and saves the
stat data set and then sorts the math data
set. Then, while the math data set is still in memory,
it merges (using the stat data set) on the key variables
id and name:
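A minimal sketch of the steps just described, assuming the data sets are saved as stat.dta and math.dta in the current working directory (the file locations are an assumption):

```stata
* Sort the stat data set on the key variables and save it
use stat, clear
sort id name
save stat, replace

* Load and sort the math data set; it stays in memory as the master
use math, clear
sort id name

* Merge, using stat as the secondary (using) data set
merge 1:1 id name using stat

* _merge records whether each observation came from the master,
* the using data set, or both
tabulate _merge
```

In Stata version 10 or older, the merge line would instead read merge id name using stat.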
If the two data sets share variables besides the key variables, merge
keeps the values in the master file (in memory) by default. Use the
update option to replace missing values in the master file with
corresponding non-missing values in the secondary (using) file. Add the
replace option, which must be specified together with update, to also
replace non-missing values in the master file with corresponding
non-missing values in the secondary file.
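As an illustration, the two options could be used as follows. This sketch again assumes the stat.dta and math.dta files and the key variables id and name from the example above.

```stata
* Fill in missing values in the master (math) from the using file (stat)
use math, clear
merge 1:1 id name using stat, update

* Also overwrite non-missing master values with non-missing
* values from the using file
use math, clear
merge 1:1 id name using stat, update replace
```

With update specified, _merge takes additional values that flag observations whose missing values were updated and observations where the two files held conflicting non-missing values.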
To use the drop-down menu in Stata version 11:
Data > Combine Datasets > Merge Two Datasets
Last modified on March 25, 2011.







