Create a new variable based on existing data in Stata

The following examples show how to create new variables in Stata using the gen (short for generate) and egen commands:

  • To create a new variable (for example, newvar) and set its value to 0, use:
    gen newvar = 0
  • To create a new variable (for example, total) from the transformation of existing variables (for example, the sum of v1, v2, v3, and v4), use:
    gen total = v1 + v2 + v3 + v4

    Alternatively, use egen with the built-in rowtotal function:

    egen total = rowtotal(v1 v2 v3 v4)
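    Note:
    The rowtotal function treats missing values as 0, whereas the gen expression above returns a missing value for any observation in which at least one of the variables is missing.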
  • To create a variable (for example, avg) that stores the average of four variables (for example, v1, v2, v3, and v4), use:
    gen avg = (v1 + v2 + v3 + v4) / 4
    Note:
    Use the / (slash) to denote division and an * (asterisk) for multiplication.

    Alternatively, use egen with the built-in rowmean function, which averages only the nonmissing values in each observation:

    egen avg = rowmean(v1 v2 v3 v4)
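
The gen and egen versions of the examples above can give different results when some values are missing. The following is a minimal sketch of that difference, assuming a small made-up dataset; the data values and the variable names total_gen, avg_gen, total_egen, avg_egen, and total_miss are illustrative only and are not part of the examples above.

```stata
* Hypothetical toy data: rows 2 and 3 contain missing values (.)
clear
input v1 v2 v3 v4
1 2 3 4
1 . 3 4
. . . .
end

* Arithmetic in gen returns missing if any input is missing
gen total_gen = v1 + v2 + v3 + v4
gen avg_gen = (v1 + v2 + v3 + v4) / 4

* rowtotal() treats missing values as 0; rowmean() ignores them
egen total_egen = rowtotal(v1 v2 v3 v4)
egen avg_egen = rowmean(v1 v2 v3 v4)

* With the missing option, rowtotal() returns missing when
* every input in the row is missing
egen total_miss = rowtotal(v1 v2 v3 v4), missing

list
```

In this sketch, the second row gets a missing total_gen but a total_egen of 8, and the third row gets a total_egen of 0 unless the missing option is used. Which behavior is appropriate depends on what a missing value means in your data.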

Stata also lets you take advantage of built-in functions for variable transformations. For example, to take the natural log of v1 and create a new variable (for example, v1_log), use:

gen v1_log = log(v1)
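
As a brief illustration of how the log transformation behaves, the sketch below uses a small made-up dataset; the data values and the variable names v1_ln and v1_log10 are assumptions for this example only. It shows that log() returns a missing value for zero or negative input, and that ln() and log10() are related built-in functions.

```stata
* Hypothetical toy data, including a zero and a negative value
clear
input v1
1
10
0
-5
end

* Natural log: missing for the rows where v1 is 0 or -5
gen v1_log = log(v1)

* ln() is a synonym for log()
gen v1_ln = ln(v1)

* Base-10 log
gen v1_log10 = log10(v1)

list
```

Stata reports how many missing values it generated, which is a quick way to check whether a variable contains values outside the function's domain.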

For additional help, see the help files within Stata (for each of the following topics, enter the corresponding help command):

Topic              Command
Using functions    help functions
Using gen          help gen
Using egen         help egen

If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.

This is document afrg in the Knowledge Base.
Last modified on 2023-11-20 14:28:45.