In Stata, how do I create a new variable based on existing data?

The following examples show how to create new variables in Stata using the gen (short for generate) and egen commands:

  • To create a new variable (e.g., newvar) and set its value to 0, use: gen newvar = 0
  • To create a new variable (e.g., total) by transforming existing variables (e.g., summing v1, v2, v3, and v4), use: gen total = v1 + v2 + v3 + v4

    Alternatively, use egen with the built-in rowtotal() function:

    egen total = rowtotal(v1 v2 v3 v4)

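    Note: Unlike the + operator in gen, which returns a missing value if any of the variables is missing, egen's rowtotal() function treats missing values as 0.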

  • To create a variable (e.g., avg) that stores the average of four variables (e.g., v1, v2, v3, and v4), use: gen avg = (v1 + v2 + v3 + v4) / 4

    Note: Use / (slash) for division and * (asterisk) for multiplication.

    Alternatively, use egen with the built-in rowmean() function:

    egen avg = rowmean(v1 v2 v3 v4)
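
The way gen and egen differ when a value is missing can be seen in a small sketch. The dataset below is made up for illustration (three observations, one with a missing v2), and the names total_gen, total_egen, avg_gen, and avg_egen are hypothetical, chosen only so the two approaches can sit side by side:

```stata
* Build a small illustrative dataset (values are made up)
clear
input v1 v2 v3 v4
1 2 3 4
5 . 7 8
2 2 2 2
end

* gen: any missing operand makes the whole result missing
gen total_gen = v1 + v2 + v3 + v4
gen avg_gen   = (v1 + v2 + v3 + v4) / 4

* egen: rowtotal() treats missing values as 0;
* rowmean() averages over the nonmissing values only
egen total_egen = rowtotal(v1 v2 v3 v4)
egen avg_egen   = rowmean(v1 v2 v3 v4)

list
```

In the second observation, total_gen and avg_gen are missing, while total_egen is 20 and avg_egen is 20/3 (the mean of the three nonmissing values). Which behavior is appropriate depends on what a missing value means in your data.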

Stata also provides built-in functions for variable transformations. For example, to create a new variable (e.g., v1_log) containing the natural log of v1, use:

gen v1_log = log(v1)
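
Because the natural log is undefined for zero and negative numbers, log() returns a missing value for those observations. As a minimal sketch, assuming made-up values of v1, the following shows one way to check how many observations are affected:

```stata
* Build a small illustrative dataset (values are made up)
clear
input v1
1
10
0
-3
end

* log() is the natural log (ln() is a synonym);
* zero or negative input produces a missing value
gen v1_log = log(v1)

list

* Count observations where v1 is present but its log is missing
count if missing(v1_log) & !missing(v1)
```

Here the last two observations have a missing v1_log, so it is worth checking the range of a variable (e.g., with summarize) before transforming it.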

For additional help, see the help files within Stata (for each of the following topics, enter the corresponding help command):

Topic             Command
Using functions   help functions
Using gen         help gen
Using egen        help egen

At Indiana University, see the UITS Research Analytics Stata page.

If you have questions about using statistical and mathematical software at Indiana University, email UITS Research Analytics (formerly known as the Stat/Math Center). Research Analytics is located on the IU Bloomington campus at Woodburn Hall 200, and is open for consultation by appointment Monday-Friday 9am-5pm. For more, visit Research Analytics on the web, or call 812-855-4724 (IUB) or 317-278-4740 (IUPUI).

This is document afrg in domain all.
Last modified on January 13, 2014.

