If the SUM() and MEAN() functions keep cases with missing values in SPSS
Statistical functions in SPSS, such as SUM(), MEAN(), and SD(), perform calculations using all available values. SPSS does not drop a case just because one of the function's arguments is missing; instead, it excludes the missing values from the calculation. The MEAN() function, for example, computes the mean from all non-missing values.
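As a minimal sketch of this behavior, the syntax below builds a small hypothetical dataset (the variable names q1, q2, q3, and avg and the data values are assumptions for illustration only) and computes a mean across the three variables:

```spss
* Hypothetical data with q2 missing for the second case.
DATA LIST FREE / q1 q2 q3.
BEGIN DATA
4 5 6
4 . 6
2 2 2
END DATA.

* MEAN averages whatever values are present for each case.
COMPUTE avg = MEAN(q1, q2, q3).
EXECUTE.
LIST.
```

For the second case, avg should be based on the two available values (4 and 6) rather than being set to missing.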
However, problems can arise when you want to exclude cases with missing values and base results only on observations with complete information. For example, suppose two variables (v1 and v2) are summed to create an index variable (v3). While v1 has ten valid cases with no missing values, v2 has eight valid cases and two missing values. Use the following syntax to add the two variables and create the index variable:
COMPUTE V3 = SUM(V1, V2).
EXECUTE .
The resulting index variable v3 has ten cases and no missing values. When SPSS encounters a missing value of v2, it ignores it and sets v3 equal to v1. In effect, SPSS treats the missing values of v2 as zeroes, which can make the results misleading.
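One way to see the problem is to reproduce it on a few rows. The sketch below uses a hypothetical four-case dataset (not the ten-case example described above) and adds a helper variable, n_miss, whose name is an assumption for illustration; it uses the NMISS function to count how many of the summed variables are missing for each case:

```spss
* Hypothetical data with v2 missing for the last two cases.
DATA LIST FREE / v1 v2.
BEGIN DATA
3 4
5 1
2 .
6 .
END DATA.

* SUM adds whatever values are present so v3 equals v1 when v2 is missing.
COMPUTE v3 = SUM(v1, v2).
* NMISS counts the missing arguments and flags cases where v3 is incomplete.
COMPUTE n_miss = NMISS(v1, v2).
EXECUTE.
LIST.
```

Cases with n_miss greater than zero are the ones where v3 reflects only part of the intended index.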
To ensure that v3 is equal to the sum of v1 and v2, and that v3 is set to missing for any case with a missing value rather than computed from partial data, specify the minimum number of valid values that SPSS must have to calculate a given function. For example, to create an index variable v3 using only observations without missing values, use the following syntax:
COMPUTE V3 = SUM.2(V1, V2).
EXECUTE .
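The .2 appended to the SUM function in the above example can be any positive integer up to the number of arguments. Use it to indicate the minimum number of valid (non-missing) values a case must have for SPSS to perform a given calculation; cases with fewer valid values receive a system-missing result.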
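The same suffix works with other statistical functions and with more than two arguments. The following sketch assumes three hypothetical items (q1, q2, q3) and illustrative result names (avg2, total3, total_plus); it also shows the + operator for comparison, since arithmetic on a missing operand returns a system-missing result:

```spss
* Hypothetical items where the second and third cases have missing values.
DATA LIST FREE / q1 q2 q3.
BEGIN DATA
4 5 6
4 . 6
2 . .
END DATA.

* Require at least two valid values before computing the mean.
COMPUTE avg2 = MEAN.2(q1, q2, q3).
* Require all three values so any missing item makes the total missing.
COMPUTE total3 = SUM.3(q1, q2, q3).
* The plus operator also yields system-missing when any operand is missing.
COMPUTE total_plus = q1 + q2 + q3.
EXECUTE.
LIST.
```

Under these assumptions, avg2 is computed for the first two cases only, while total3 and total_plus are computed only for the first case, which has complete data.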
If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.
This is document adjn in the Knowledge Base.
Last modified on 2023-06-26 11:06:17.