In Stata, how do I create a new variable based on existing data?

The following examples show how to create new variables in Stata using the gen (short for generate) and egen commands:

  • To create a new variable (e.g., newvar) and set its value to 0, use:
     gen newvar = 0
  • To create a new variable (e.g., total) from the transformation of existing variables (e.g., the sum of v1, v2, v3, and v4), use:
     gen total = v1 + v2 + v3 + v4

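    Alternatively, use egen with the built-in rowtotal() function:

     egen total = rowtotal(v1 v2 v3 v4)
    Unlike gen, which returns a missing value for any observation in which one of the variables is missing, rowtotal() treats missing values as 0.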
  • To create a variable (e.g., avg) that stores the average of four variables (e.g., v1, v2, v3, and v4), use:
     gen avg = (v1 + v2 + v3 + v4) / 4
    Use / (slash) for division and * (asterisk) for multiplication.

    Alternatively, use egen with the built-in rowmean() function, which averages only the nonmissing values in each observation:

     egen avg = rowmean(v1 v2 v3 v4)
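
The gen and egen approaches above give different results when some values are missing. The following is a minimal sketch of that difference, assuming a small made-up dataset; the data values and the variable names total_gen, avg_gen, total_egen, avg_egen, and total_strict are hypothetical and chosen only for illustration:

```stata
* Hypothetical four-observation dataset, for illustration only
clear
input v1 v2 v3 v4
1 2 3 4
1 . 3 4
. . . .
0 0 0 0
end

* gen propagates missing values: any missing input gives a missing result
gen total_gen = v1 + v2 + v3 + v4
gen avg_gen = (v1 + v2 + v3 + v4) / 4

* egen row functions skip missing values within each observation
egen total_egen = rowtotal(v1 v2 v3 v4)
egen avg_egen = rowmean(v1 v2 v3 v4)

* With the missing option, rowtotal() returns missing (instead of 0)
* when every input variable is missing for an observation
egen total_strict = rowtotal(v1 v2 v3 v4), missing

list
```

In the second observation, total_gen is missing while total_egen is 8. In the third observation, where every value is missing, total_egen is 0 but total_strict and avg_egen are missing, so the missing option may be the safer choice when a total of 0 would be misleading.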

Stata also provides built-in functions for transforming variables. For example, to create a new variable (e.g., v1_log) that stores the natural log of v1, use:

 gen v1_log = log(v1)
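
The log of zero or a negative number is undefined, and Stata stores a missing value in those cases. The following sketch, which assumes a small made-up dataset (the values are hypothetical), adds an if qualifier to make that restriction explicit and then counts the observations left missing:

```stata
* Hypothetical data, for illustration only
clear
input v1
1
10
0
-5
end

* log() is the natural logarithm in Stata; ln() is a synonym.
* The if qualifier limits the calculation to positive values;
* all other observations are left missing.
gen v1_log = log(v1) if v1 > 0

* Count how many observations have no log value
count if missing(v1_log)
```

Checking the count of missing values after a transformation like this can help catch zero or negative inputs before they affect later analysis.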

For additional help, see the help files within Stata (for each of the following topics, enter the corresponding help command):

Topic              Command
Using functions    help functions
Using gen          help gen
Using egen         help egen

