Create multiple dummy (indicator) variables in Stata

Researchers often need to create multiple indicator variables from a single categorical variable. For example, the variable region (where 1 indicates Southeast Asia, 2 indicates Eastern Europe, etc.) may need to be converted into twelve indicator variables with values of 1 or 0 that describe whether the region is Southeast Asia or not, Eastern Europe or not, etc. One way to do this is to run the generate and replace commands twelve times, once for each indicator variable. The first two of the twelve pairs look like this:

. generate dregion1 = 0
. replace dregion1 = 1 if region==1
. generate dregion2 = 0
. replace dregion2 = 1 if region==2
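
Each generate/replace pair can also be collapsed into a single generate that evaluates a logical expression. This is a minimal sketch, assuming region is numeric and the dregion variables do not already exist. The if !missing(region) qualifier is an addition that is not part of the code above: it leaves an indicator missing where region is missing, whereas the generate/replace pair codes those observations as 0.

. * Alternative to the generate/replace pairs above (run one approach or the other, not both)
. generate dregion1 = (region == 1) if !missing(region)
. generate dregion2 = (region == 2) if !missing(region)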

Repeating this code twelve times is tedious and error-prone. An alternative is the tabulate command with its generate() option, which creates a set of indicator variables based on the observed values of the tabulated variable. To generate twelve indicator variables based on the variable region, execute the following code in Stata:

. tabulate region, generate(dregion)

This single command generates twelve indicator variables (dregion1, dregion2, etc.), one for each of the twelve observed values of region. Assuming region is coded 1 through 12, dregion10 takes the value of 1 when region equals 10, and is 0 otherwise.
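
To check what the command produced, it can help to run it on a small made-up dataset first. The sketch below is illustrative only: the six observations, the region codes 1, 2, and 5, and the missing value are assumptions invented for the example, not data from this document. It illustrates two behaviors worth confirming on your own data: the indicators are numbered by the sorted order of the observed values rather than by the values themselves, and an observation with a missing region is missing in every indicator.

. * Toy data (hypothetical): region takes the values 1, 2, and 5, plus one missing value
. clear
. set obs 6
. generate region = cond(mod(_n, 3) == 0, 5, mod(_n, 3))
. replace region = . in 6
. tabulate region, generate(dregion)
. * dregion3 corresponds to the third observed value, which is 5 here
. assert dregion3 == (region == 5) if !missing(region)
. * Observation 6 has a missing region, so its indicators are missing as well
. assert missing(dregion1) in 6
. list region dregion1 dregion2 dregion3

If either assert fails, Stata stops with an error. If both pass, they print nothing, and the list output shows each observation next to its three indicators.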

If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.

This is document bajq in the Knowledge Base.
Last modified on 2023-06-23 15:06:11.