ARCHIVED: In SPSS, why were the results of my merge with MATCH FILES scrambled?

This content has been archived, and is no longer maintained by Indiana University. Information here may no longer be accurate, and links may no longer be available or reliable.

If the results of your merge using the MATCH FILES command in SPSS are scrambled, follow the example below to fix your data. If you prefer to use the SPSS graphical interface to fix the problem, see SPSS graphical interface below.

The MATCH FILES command in SPSS merges two data sets to add or update variables. Without a /BY subcommand, the command simply places the two data sets side by side, pairing cases by their order in each file, which can produce odd results. Even with /BY, the merge goes wrong when the index (key) variable does not uniquely identify each case.
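As a minimal sketch of a merge without /BY, the syntax below combines the two example files used later in this document by position only. The /RENAME subcommand and the variable name ID2 are illustrative additions, not part of the original example; they keep the ID from the second file so that you can see which cases were paired.

```spss
* Parallel merge with no BY subcommand, so cases are paired by position.
* ID2 is a hypothetical name that keeps the ID from DATA2 for comparison.
MATCH FILES
  /FILE='c:\spss\DATA1.sav'
  /FILE='c:\spss\DATA2.sav'
  /RENAME=(ID=ID2).
EXECUTE.
```

If ID and ID2 differ on any row, the values from the second file were attached to the wrong case.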

Consider the following data sets and the MATCH FILES command:

  c:\spss\DATA1.sav        c:\spss\DATA2.sav
  ID V1 V2                 ID V3 V4
  1  1  1                  1  5  5
  1  2  2                  2  6  6
  2  3  3
  2  4  4

  MATCH FILES 
    /FILE='c:\spss\DATA1.sav' 
    /FILE='c:\spss\DATA2.sav' 
    /BY ID. 
  EXECUTE.

Because the index variable ID has duplicate values in DATA1.sav, SPSS issues the warning message "Warning # 5132 Duplicate key in a file..." and produces the following result. Notice that when SPSS finds a duplicate key, it matches only the first case with that key to DATA2.sav, sets the variables from DATA2.sav (V3 and V4) to missing for the remaining cases with that key, and then goes on to the next key value.

  ID V1 V2 V3 V4
  1  1  1  5  5
  1  2  2  .  .
  2  3  3  6  6
  2  4  4  .  .
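
Before merging, it can help to check which key values are duplicated. One possible sketch, assuming DATA1.sav is the file you want to inspect, counts the cases per ID with AGGREGATE. The variable name NDUP is hypothetical.

```spss
* Count how many cases share each ID value in DATA1.
GET FILE='c:\spss\DATA1.sav'.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=ID
  /NDUP=N.
* Any value of NDUP greater than 1 marks a duplicated key.
FREQUENCIES VARIABLES=NDUP.
```

In the example data, every case in DATA1.sav would have NDUP equal to 2, because each ID value occurs twice.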

To avoid this, use the /TABLE subcommand to tell SPSS that one file will be used as a lookup table for the other data set. For each case in the other file, SPSS will then look up the matching ID in the file you defined with the /TABLE subcommand and copy its values to that case, including cases with duplicate IDs. The key values in the table file itself must be unique. For example:

  MATCH FILES 
    /FILE='c:\spss\DATA1.sav' 
    /TABLE='c:\spss\DATA2.sav' 
    /BY ID. 
  EXECUTE.

The command with /TABLE will produce the following data set:

  ID V1 V2 V3 V4
  1  1  1  5  5
  1  2  2  5  5
  2  3  3  6  6
  2  4  4  6  6

SPSS graphical interface

To merge data using the SPSS graphical interface:

  1. Open the data file Data1.sav.
  2. From the Data menu, select Merge Files and then Add Variables....
  3. Select the file to merge (e.g., Data2.sav), and then click Continue.
  4. In the "Add variables" dialog box, select Match cases on key variables in sorted files and check Non-active dataset is keyed table. Under "Key Variables:", select ID. Click OK.

    This will produce the same results as the example using the MATCH FILES command with the /TABLE option above. It is important to sort both of the data files by key variable(s) before you merge them.
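
In command syntax, the sorting step can be sketched as follows. The file name DATA2_sorted.sav is a hypothetical name chosen for this illustration, and /FILE=* refers to the active data set.

```spss
* Sort the lookup table by the key variable and save a sorted copy.
GET FILE='c:\spss\DATA2.sav'.
SORT CASES BY ID.
SAVE OUTFILE='c:\spss\DATA2_sorted.sav'.

* Sort the main file by the same key, then merge against the sorted table.
GET FILE='c:\spss\DATA1.sav'.
SORT CASES BY ID.
MATCH FILES
  /FILE=*
  /TABLE='c:\spss\DATA2_sorted.sav'
  /BY ID.
EXECUTE.
```

Saving the sorted table under a new name leaves the original DATA2.sav unchanged.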

If you have questions about using statistical and mathematical software at Indiana University, contact the UITS Research Applications and Deep Learning team.

This is document afit in the Knowledge Base.
Last modified on 2023-05-09 14:37:15.