How can I create multiple dummy (indicator) variables in Stata?

Researchers often need to create several indicator variables from a single categorical variable. For example, the variable region (where 1 indicates Southeast Asia, 2 indicates Eastern Europe, and so on) may need to be converted into twelve indicator variables, each coded 1 or 0 to show whether an observation belongs to that region (Southeast Asia or not, Eastern Europe or not, etc.). One option is to run the generate and replace commands once for each of the twelve indicator variables:

. generate dregion1 = 0
. replace dregion1 = 1 if region==1
. generate dregion2 = 0
. replace dregion2 = 1 if region==2
...and so on.
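Each generate/replace pair above can also be written as a single generate with a logical expression. The following is a minimal sketch on made-up toy data (the six observations and the one missing value are assumptions for illustration, not part of any real dataset):

```stata
* Hypothetical toy data: six observations with region codes 1-5 and one missing value
clear
set obs 6
generate region = _n
replace region = . in 6

* (region == 1) evaluates to 1 when true and 0 when false.
* The if qualifier leaves the indicator missing where region is missing.
generate dregion1 = (region == 1) if !missing(region)
generate dregion2 = (region == 2) if !missing(region)

list region dregion1 dregion2
```

Without the if qualifier, an observation with a missing region would be coded 0, which is also what the generate/replace pairs above do. Whether that is appropriate depends on the analysis.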

Repeating this code twelve times is tedious and error-prone. An alternative is the tabulate command with the generate() option, which creates one indicator variable for each observed value of the tabulated variable. To generate twelve indicator variables based on the variable region, execute the following code in Stata:

. tabulate region, generate(dregion)

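This single command displays the frequency table for region and generates twelve indicator variables (dregion1, dregion2, etc.), one for each of the twelve observed values of region, numbered in ascending order of those values. For example, if region is coded 1 through 12, dregion10 takes the value of 1 when region equals 10, and is 0 otherwise. Observations with a missing value of region are set to missing in every indicator variable.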
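To see how the new variables are numbered, it can help to run the command on a small example. The sketch below uses hypothetical data in which region takes only the values 1, 2, and 5, plus one missing value; these values are assumptions chosen to show that the suffix follows the sorted order of observed values rather than the codes themselves:

```stata
* Hypothetical toy data: observed region codes are 1, 2, and 5, plus one missing value
clear
input region
1
2
5
5
.
end

* quietly suppresses the frequency table; the indicators are still created
quietly tabulate region, generate(dregion)

* Three observed values give three indicators, so dregion3 flags region 5
assert dregion3 == (region == 5) if !missing(region)

* The observation with a missing region is missing in every indicator
assert missing(dregion1) in 5

list region dregion1 dregion2 dregion3
```

When the codes run consecutively from 1, as in the twelve-region example, the suffix and the code coincide.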

If you have questions about using statistical and mathematical software at Indiana University, contact Research Analytics. Research Analytics is located on the IU Bloomington campus at Woodburn Hall 200; staff are available for consultation Monday-Friday 9am-noon and by appointment.

This is document bajq in the Knowledge Base.
Last modified on 2015-06-26 00:00:00.
