In Stata, how do I create a new variable based on existing data?
The following examples show how to create new variables in Stata using the gen (short for generate) and egen commands:
To create a new variable (e.g., newvar) and set its value to 0, use:

    gen newvar = 0

To create a new variable (e.g., total) from the transformation of existing variables (e.g., the sum of v1, v2, v3, and v4), use:

    gen total = v1 + v2 + v3 + v4

Alternatively, use egen with the built-in rowtotal() function:

    egen total = rowtotal(v1 v2 v3 v4)

Note: The egen rowtotal() function treats missing values as 0, whereas the gen expression above yields a missing value whenever any of the variables is missing.

To create a variable (e.g., avg) that stores the average of four variables (e.g., v1, v2, v3, and v4), use:

    gen avg = (v1 + v2 + v3 + v4) / 4

Note: Use the / (slash) to denote division and an * (asterisk) for multiplication.

Alternatively, use egen with the built-in rowmean() function:

    egen avg = rowmean(v1 v2 v3 v4)
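
The difference between the gen and egen approaches shows up most clearly when a value is missing. The following is a minimal sketch using made-up data; the variable names total_gen, total_egen, avg_gen, and avg_egen are hypothetical and chosen only to keep the two approaches side by side.

```stata
* Illustrative data only: three observations, with v3 missing in the second
clear
input v1 v2 v3 v4
1 2 3 4
5 6 . 8
2 2 2 2
end

* gen with + propagates missing values: total_gen is missing in observation 2
gen total_gen = v1 + v2 + v3 + v4

* egen rowtotal() treats missing values as 0: total_egen is 19 in observation 2
egen total_egen = rowtotal(v1 v2 v3 v4)

* The gen-based average is likewise missing in observation 2
gen avg_gen = (v1 + v2 + v3 + v4) / 4

* egen rowmean() averages the nonmissing values only: (5 + 6 + 8) / 3
egen avg_egen = rowmean(v1 v2 v3 v4)

list
```

Which behavior is appropriate depends on the analysis: the gen versions flag incomplete observations, while the egen versions compute a result from whatever values are present.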
Stata also lets you take advantage of built-in functions for variable transformations. For example, to take the natural log of v1 and create a new variable (e.g., v1_log), use:

    gen v1_log = log(v1)
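
As a rough illustration of function-based transformations, the sketch below applies log() to a small made-up dataset. The data values and the extra variables v1_sqrt and v1_abs are assumptions added for illustration, not part of the original example.

```stata
* Illustrative data only: v1 includes a zero and a negative value
clear
input v1
1
10
0
-5
end

* log() returns the natural logarithm; ln() is a synonym
gen v1_log = log(v1)

* log() yields missing for zero or negative arguments, so count them
count if missing(v1_log)

* Other built-in functions follow the same pattern
* sqrt() yields missing where v1 is negative
gen v1_sqrt = sqrt(v1)
gen v1_abs = abs(v1)

list
```

Stata reports how many missing values each gen command produced, which is a quick check that a transformation was defined for every observation.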
For additional help, see the help files within Stata (for each of the following topics, enter the corresponding help command):

Topic              Command
Using functions    help functions
Using gen          help gen
Using egen         help egen

At Indiana University, see the UITS Research Analytics Stata page.
If you have questions about using statistical and mathematical software at Indiana University, email UITS Research Analytics (formerly known as the Stat/Math Center). Research Analytics is located on the IU Bloomington campus at Woodburn Hall 200, and is open for consultation by appointment Monday-Friday 9am-5pm. For more, visit Research Analytics on the web, or call 812-855-4724 (IUB) or 317-278-4740 (IUPUI).
Last modified on January 13, 2014.