Merge two data sets in Stata
For a one-to-many or many-to-one match merge, use merge 1:m or merge m:1; see Merge two data sets in the many-to-one relationship in Stata.
To merge two data sets in Stata, first sort each data set on the key variables on which the merge will be based. Then, use the merge command followed by a list of key variable(s) and data set(s). In Stata version 11 and later:
merge 1:1 varlist using filename [, options]
Note the 1:1 specification. In a one-to-one match merge, the key variables should uniquely identify the observations in each data set.
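Uniqueness of the key variables can be checked before merging. The following is a minimal sketch, assuming a data set file named stat.dta with key variables id and name (the same example names used in the rest of this document); the isid and duplicates commands are standard Stata commands, but adapt the file and variable names to your own data:

```stata
* Load the data set to be checked
use stat.dta, clear
* isid stops with an error if id and name do not jointly identify each observation
isid id name
* If isid fails, summarize how many duplicate key combinations exist
duplicates report id name
```

If isid reports an error, a 1:1 merge on those keys will also fail, and a 1:m or m:1 merge (or cleaning the duplicates) may be what you need.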
Suppose you have two key variables, id and name, in two data sets, stat and math. The following code sorts and saves the stat data set and then sorts the math data set. Then, while the math data set is still in memory, it merges (using the stat data set) on the key variables id and name:
use stat.dta, clear
sort id name
save stat.dta, replace
use math.dta, clear
sort id name
merge 1:1 id name using stat.dta
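After a merge, Stata adds a variable named _merge that records where each observation came from: 1 for the master file only, 2 for the using file only, and 3 for observations matched in both. As a sketch of how the result might be inspected (keeping only matched observations is an assumption about what you want, not a required step):

```stata
* Tabulate the match results created by the merge
tabulate _merge
* Optionally keep only observations found in both data sets
keep if _merge == 3
* Drop _merge so that a later merge can create it again
drop _merge
```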
If the two data sets share variables besides the key variables, use the update option to replace missing values in the master file (in memory) with corresponding non-missing values in the secondary file. Use update replace to also replace non-missing values in the master file with corresponding non-missing values in the secondary file.
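As an illustration of these options, here is a short sketch that reuses the stat.dta and math.dta example files from above and assumes they share at least one non-key variable. Options follow a comma at the end of the merge command:

```stata
* Load the master data set
use math.dta, clear
* Fill in missing values in the master file from stat.dta
merge 1:1 id name using stat.dta, update
* Alternative: also overwrite non-missing master values that differ
* merge 1:1 id name using stat.dta, update replace
* With update, _merge can also take the values 4 (missing updated)
* and 5 (non-missing values in conflict)
tabulate _merge
```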
To use the drop-down menu in Stata version 11 and later, select .
If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.
This is document azck in the Knowledge Base.
Last modified on 2023-07-12 16:28:38.