In Stata, how do I create a new variable based on existing data?

Following are examples of how to create new variables in Stata using the gen (short for generate) and egen commands:

  • To create a new variable (e.g., newvar) and set its value to 0, use:
     gen newvar = 0
  • To create a new variable (e.g., total) from the transformation of existing variables (e.g., the sum of v1, v2, v3, and v4), use:
     gen total = v1 + v2 + v3 + v4

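    Alternatively, instead of the gen command above (Stata will not create total a second time if it already exists), use egen with the built-in rowtotal() function: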

     egen total = rowtotal(v1 v2 v3 v4)
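    Note:
    The rowtotal() function treats missing values as 0, whereas the gen command above returns a missing value for any observation in which one or more of the variables is missing.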
  • To create a variable (e.g., avg) that stores the average of four variables (e.g., v1, v2, v3, and v4), use:
     gen avg = (v1 + v2 + v3 + v4) / 4
    Note:
    Use the / (slash) to denote division and an * (asterisk) for multiplication.

    Alternatively, use egen with the built-in rowmean() function, which averages only the nonmissing values in each observation:

     egen avg = rowmean(v1 v2 v3 v4)
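
The difference between gen and egen matters most when some values are missing. The following is a minimal sketch of that difference, assuming a small made-up dataset typed in with the input command; the variable names total_gen, avg_gen, total_egen, and avg_egen are hypothetical and chosen only so that both versions can exist side by side:

```stata
* Illustrative data only: two observations, the second has a missing v2
clear
input v1 v2 v3 v4
1 2 3 4
5 . 7 8
end

* gen: a missing value in any input makes the result missing
gen total_gen = v1 + v2 + v3 + v4
gen avg_gen = (v1 + v2 + v3 + v4) / 4

* egen: rowtotal() counts missing as 0; rowmean() averages nonmissing values
egen total_egen = rowtotal(v1 v2 v3 v4)
egen avg_egen = rowmean(v1 v2 v3 v4)

list
```

In the second observation, total_gen and avg_gen are missing, while total_egen is 20 and avg_egen is the mean of the three nonmissing values. Which behavior is appropriate depends on how you want missing data handled in your analysis.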

Stata also lets you take advantage of built-in functions for variable transformations. For example, to take the natural log of v1 and create a new variable (e.g., v1_log), use:

 gen v1_log = log(v1)
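
As a short sketch of how built-in functions behave, the following example assumes a made-up v1 that includes a zero. In Stata, log() and ln() both return the natural log, and the log of a nonpositive number is stored as missing. The variables v1_ln and v1_sqrt are hypothetical names used only for illustration:

```stata
* Illustrative data only: v1 includes a zero
clear
input v1
1
10
0
end

* log() and ln() are synonyms; both return the natural log
gen v1_log = log(v1)
gen v1_ln = ln(v1)

* Another built-in function, shown for comparison
gen v1_sqrt = sqrt(v1)

* Count observations where the log could not be computed
count if missing(v1_log)
```

Here the third observation has a missing v1_log because the log of 0 is undefined, so it is worth checking for nonpositive values before applying a log transformation.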

For additional help, see the help files within Stata (for each of the following topics, enter the corresponding help command):

Topic              Command
Using functions    help functions
Using gen          help gen
Using egen         help egen

If you have questions about using statistical and mathematical software at Indiana University, contact Research Analytics. Research Analytics is located on the IU Bloomington campus at Woodburn Hall 200; staff are available for consultation Monday-Friday 9am-noon and by appointment.

This is document afrg in the Knowledge Base.
Last modified on 2016-11-15 08:34:27.
