In Stata, how do I create a new variable based on existing data?
Following are examples of how to create new variables in Stata using the gen (short for generate) and egen commands:
To create a new variable (e.g., newvar) and set its value to 0, use:

    gen newvar = 0

To create a new variable (e.g., total) from the transformation of existing variables (e.g., the sum of v1, v2, v3, and v4), use:

    gen total = v1 + v2 + v3 + v4

Alternatively, use egen with the built-in rowtotal() function:

    egen total = rowtotal(v1 v2 v3 v4)

Note: The rowtotal() function treats missing values as 0, whereas the gen version returns a missing value if any of v1, v2, v3, or v4 is missing.

To create a variable (e.g., avg) that stores the average of four variables (e.g., v1, v2, v3, and v4), use:

    gen avg = (v1 + v2 + v3 + v4) / 4

Note: Use the / (slash) to denote division and an * (asterisk) for multiplication.

Alternatively, use egen with the built-in rowmean() function:

    egen avg = rowmean(v1 v2 v3 v4)

Note: The rowmean() function averages only the nonmissing values in each row, so its result can differ from the gen version when data are missing.
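The difference between gen and egen when a value is missing is easiest to see on a small example. The following is a minimal sketch on made-up data; the values and the variable names total2 and avg2 are hypothetical and chosen only so that both versions can sit side by side.

```stata
* Sketch: compare gen and egen on made-up data with one missing value
clear
input v1 v2 v3 v4
1 2 3 4
5 . 7 8
end

* gen: the result is missing whenever any operand is missing
gen total = v1 + v2 + v3 + v4
gen avg   = (v1 + v2 + v3 + v4) / 4

* egen rowtotal(): missing values are treated as 0
* (total2 is a hypothetical name used only for this comparison)
egen total2 = rowtotal(v1 v2 v3 v4)

* egen rowmean(): mean of the nonmissing values in each row
* (avg2 is a hypothetical name used only for this comparison)
egen avg2 = rowmean(v1 v2 v3 v4)

list
```

In the second row, total and avg are missing because v2 is missing, while total2 is 20 and avg2 is 20/3, the mean of the three nonmissing values. Which behavior is appropriate depends on how missing data should be handled in your analysis.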
Stata also lets you take advantage of built-in functions for variable transformations. For example, to take the natural log of v1 and create a new variable (e.g., v1_log), use:

    gen v1_log = log(v1)
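As a hedged sketch of such a transformation, the example below applies the natural log to a small made-up dataset. The data values and the second variable name v1_log2 are assumptions for illustration. In Stata, log() and ln() are synonyms for the natural logarithm, and both return a missing value for zero, negative, or missing input.

```stata
* Sketch: natural-log transformation on made-up data
clear
input v1
1
10
0
.
end

* log() returns the natural logarithm; zero and missing inputs yield missing
gen v1_log = log(v1)

* Optional: state the valid range explicitly
* (v1_log2 is a hypothetical name; the result matches v1_log here)
gen v1_log2 = ln(v1) if v1 > 0 & !missing(v1)

list
```

Checking how many observations end up missing after the transformation (for example, with count if missing(v1_log)) is a quick way to spot nonpositive values in the source variable.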
For additional help, see the help files within Stata (for each of the following topics, enter the corresponding help command):

Topic              Command
Using functions    help functions
Using gen          help gen
Using egen         help egen
