MATRIX DATA (PSPP)

MATRIX DATA
        VARIABLES = columns
        [/FILE={'file_name' | INLINE}]
        [/FORMAT= [{LIST | FREE}]
                  [{UPPER | LOWER | FULL}]
                  [{DIAGONAL | NODIAGONAL}]]
        [/N= n]
        [/SPLIT= split_variables].

The MATRIX DATA command is used to input data in the form of matrices, which can subsequently be used by other commands. If the FILE subcommand is omitted or takes the value ‘INLINE’, then the command should be immediately followed by BEGIN DATA (see BEGIN DATA).

There is one mandatory subcommand, VARIABLES, which defines the columns of the matrix. Normally, the columns should include an item called ‘ROWTYPE_’. The ‘ROWTYPE_’ column is used to specify the purpose of a row in the matrix.

matrix data
    variables = rowtype_ var01 TO var08.

begin data.
mean  24.3  5.4  69.7  20.1  13.4  2.7  27.9  3.7
sd    5.7   1.5  23.5  5.8   2.8   4.5  5.4   1.5
n     92    92   92    92    92    92   92    92
corr 1.00
corr .18  1.00
corr -.22  -.17  1.00
corr .36  .31  -.14  1.00
corr .27  .16  -.12  .22  1.00
corr .33  .15  -.17  .24  .21  1.00
corr .50  .29  -.20  .32  .12  .38  1.00
corr .17  .29  -.05  .20  .27  .20  .04  1.00
end data.

In the above example, the first three rows have ROWTYPE_ values of ‘mean’, ‘sd’, and ‘n’. These indicate that the rows contain mean values, standard deviations and counts, respectively. All subsequent rows have a ROWTYPE_ of ‘corr’ which indicates that the values are correlation coefficients.

Note that in this example, the upper right part of each ‘corr’ row is blank, and in each row the rightmost value is unity. This is because the FORMAT subcommand defaults to ‘LOWER DIAGONAL’, which indicates that only the lower triangle, including the diagonal, is provided in the data. The opposite triangle is automatically inferred. One could instead specify the upper triangle as follows:

matrix data
    variables = rowtype_ var01 TO var08
    /format = upper nodiagonal.

begin data.
mean  24.3 5.4  69.7  20.1  13.4  2.7  27.9  3.7
sd    5.7  1.5  23.5  5.8   2.8   4.5  5.4   1.5
n     92    92   92    92    92    92   92    92
corr         .17  .50  -.33  .27  .36  -.22  .18
corr               .29  .29  -.20  .32  .12  .38
corr                    .05  .20  -.15  .16  .21
corr                         .20  .32  -.17  .12
corr                              .27  .12  -.24
corr                                  -.20  -.38
corr                                         .04
end data.

In this example the ‘NODIAGONAL’ keyword is used. Accordingly, the diagonal values of the matrix are omitted, so there is one fewer ‘corr’ line than there are variables. If the ‘FULL’ option is passed to the FORMAT subcommand, then all the matrix elements must be provided, including the diagonal elements.
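
The way a triangle is expanded into a full matrix can be illustrated outside PSPP. The following Python sketch is not PSPP's implementation; the function name `rebuild_symmetric` and its parameters are hypothetical, and it assumes that an omitted diagonal of a correlation matrix is filled with 1.

```python
def rebuild_symmetric(rows, triangle="lower", diagonal=True):
    """Rebuild a full symmetric matrix from triangular 'corr' rows.

    rows     -- list of lists, one per 'corr' line, in the order given
    triangle -- "lower" or "upper" (hypothetical stand-in for FORMAT)
    diagonal -- False mimics NODIAGONAL: one fewer row than variables
    """
    n = len(rows) if diagonal else len(rows) + 1
    # Assumption: a missing diagonal is unity, as for correlations.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k, values in enumerate(rows):
        if triangle == "lower":
            i = k if diagonal else k + 1
            cols = list(range(0, i + 1)) if diagonal else list(range(0, i))
        else:
            i = k
            cols = list(range(i, n)) if diagonal else list(range(i + 1, n))
        if len(values) != len(cols):
            raise ValueError(f"row {k} has {len(values)} values, expected {len(cols)}")
        for j, value in zip(cols, values):
            matrix[i][j] = value  # the value as given
            matrix[j][i] = value  # mirrored into the opposite triangle
    return matrix


# Three rows for four variables, upper triangle without the diagonal.
upper = rebuild_symmetric([[9, 8, 7], [6, 5], [4]], triangle="upper", diagonal=False)
# The same matrix given as a lower triangle with the diagonal.
lower = rebuild_symmetric([[1], [9, 1], [8, 6, 1], [7, 5, 4, 1]])
```

Both calls describe the same four-variable matrix, which shows why the two layouts are interchangeable.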

In the preceding examples, each matrix row has been specified on a single line. If you pass the keyword FREE to FORMAT, then data for several matrix rows may be specified on the same line, or a single row may be split across lines.
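
The idea behind FREE can be sketched in Python: rows are delimited by how many values each row needs, not by line breaks. This is an illustration only, not PSPP's parser. The function `split_free_tokens` is hypothetical, and it assumes the default ‘LOWER DIAGONAL’ layout and that every row begins with its ROWTYPE_ value.

```python
def split_free_tokens(text, num_vars):
    """Split whitespace-separated tokens into (rowtype, values) records.

    Assumptions (not taken from PSPP itself): 'corr' rows follow the
    LOWER DIAGONAL layout, so the k-th 'corr' row carries k values;
    every other row type carries one value per variable.
    """
    tokens = text.split()  # line breaks are treated like any other space
    records = []
    corr_rows = 0
    pos = 0
    while pos < len(tokens):
        rowtype = tokens[pos].lower()
        if rowtype == "corr":
            corr_rows += 1
            count = corr_rows
        else:
            count = num_vars
        values = [float(t) for t in tokens[pos + 1 : pos + 1 + count]]
        if len(values) != count:
            raise ValueError(f"too few values for {rowtype!r} row")
        records.append((rowtype, values))
        pos += 1 + count
    return records


# Several rows share a line, and one 'corr' row is split across lines.
records = split_free_tokens(
    "mean 34 35 36 sd 22 11 55\ncorr 1 corr 9\n1 corr 8 6 1", num_vars=3
)
```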

The N subcommand may be used to specify the number of valid cases for each variable. It should not be used if the data contains a record whose ROWTYPE_ column is ‘N’ or ‘N_VECTOR’. It implies an ‘N’ record whose values are all n. That is to say,

matrix data
    variables = rowtype_  var01 TO var04
    /format = upper nodiagonal
    /n = 99.
begin data.
mean 34 35 36 37
sd   22 11 55 66
corr 9 8 7
corr 6 5
corr 4
end data.

is equivalent to

matrix data
    variables = rowtype_  var01 TO var04
    /format = upper nodiagonal.
begin data.
n    99 99 99 99
mean 34 35 36 37
sd   22 11 55 66
corr 9 8 7
corr 6 5
corr 4
end data.
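
The effect of the N subcommand can be sketched as a small transformation on the parsed records. This Python fragment is a hypothetical illustration, not PSPP code; the name `apply_n_subcommand` and the `(rowtype, values)` record shape are assumptions made for the example.

```python
def apply_n_subcommand(records, n, num_vars):
    """Return records with an implied 'n' row whose values are all n.

    records -- list of (rowtype, values) tuples read from the data
    Raises ValueError when the data already carries an N or N_VECTOR
    record, since /N should not be combined with one.
    """
    for rowtype, _ in records:
        if rowtype.lower() in ("n", "n_vector"):
            raise ValueError("/N given but data already has an N record")
    return [("n", [n] * num_vars)] + list(records)


# Records of the first example above, which uses /n = 99.
with_subcommand = apply_n_subcommand(
    [("mean", [34, 35, 36, 37]), ("sd", [22, 11, 55, 66])], n=99, num_vars=4
)
# Records of the second example, which spells the 'n' row out.
explicit = [
    ("n", [99, 99, 99, 99]),
    ("mean", [34, 35, 36, 37]),
    ("sd", [22, 11, 55, 66]),
]
```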

The SPLIT subcommand is used to indicate that variables are to be considered as split variables. For example, the following defines two matrices using the variable ‘S1’ to distinguish between them.

matrix data
    variables = s1 rowtype_  var01 TO var04
    /split = s1
    /format = full diagonal.

begin data.
0 mean 34 35 36 37
0 sd   22 11 55 66
0 n    99 98 99 92
0 corr 1 9 8 7
0 corr 9 1 6 5
0 corr 8 6 1 4
0 corr 7 5 4 1
1 mean 44 45 34 39
1 sd   23 15 51 46
1 n    98 34 87 23
1 corr 1 2 3 4
1 corr 2 1 5 6
1 corr 3 5 1 7
1 corr 4 6 7 1
end data.
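
How a split variable separates the records into distinct matrices can be pictured with a short Python sketch. It is an illustration under stated assumptions, not PSPP's behaviour in detail: the helper `group_by_split` is hypothetical and handles exactly one split column placed before ROWTYPE_, as in the example above.

```python
def group_by_split(lines):
    """Group data lines by the value of a leading split column.

    Each line is expected to look like '<split> <rowtype> <values...>'.
    Returns a dict mapping each split value to its (rowtype, values)
    records, in the order the lines were given.
    """
    groups = {}
    for line in lines:
        split_value, rowtype, *values = line.split()
        record = (rowtype, [float(v) for v in values])
        groups.setdefault(split_value, []).append(record)
    return groups


# A shortened version of the data above: two rows per split value.
groups = group_by_split([
    "0 mean 34 35 36 37",
    "0 sd   22 11 55 66",
    "1 mean 44 45 34 39",
    "1 sd   23 15 51 46",
])
```

Each key of `groups` then holds the rows of one matrix, here one for S1 = 0 and one for S1 = 1.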
