MATCH FILES
MATCH FILES
Per input file:
/{FILE,TABLE}={*,'FILE_NAME'}
[/RENAME=(SRC_NAMES=TARGET_NAMES)...]
[/IN=VAR_NAME]
[/SORT]
Once per command:
/BY VAR_LIST[({D|A})] [VAR_LIST[({D|A})]...]
[/DROP=VAR_LIST]
[/KEEP=VAR_LIST]
[/FIRST=VAR_NAME]
[/LAST=VAR_NAME]
[/MAP]
MATCH FILES merges sets of corresponding cases in multiple input files into single cases in the output, combining their data.

MATCH FILES shares the bulk of its syntax with other PSPP commands for combining multiple data files (see Common Syntax for details).
How MATCH FILES matches up cases from the input files depends on whether BY is specified:
- If BY is not used, MATCH FILES combines the first case from each input file to produce the first output case, then the second case from each input file for the second output case, and so on. If some input files have fewer cases than others, then the shorter files do not contribute to cases output after their input has been exhausted.
- If BY is used, MATCH FILES combines cases from each input file that have identical values for the BY variables.

  When BY is used, TABLE subcommands may be used to introduce "table lookup files". TABLE has the same syntax as FILE, and the RENAME, IN, and SORT subcommands may follow a TABLE in the same way as FILE. Regardless of the number of TABLEs, at least one FILE must be specified. Table lookup files are treated in the same way as other input files for most purposes and, in particular, table lookup files must be sorted on the BY variables or the SORT subcommand must be specified for that TABLE.

  Cases in table lookup files are not consumed after they have been used once. This means that data in table lookup files can correspond to any number of cases in FILE input files. Table lookup files are analogous to lookup tables in traditional relational database systems.

  If a table lookup file contains more than one case with a given set of values for the BY variables, only the first case is used.
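
The two matching modes described above can be sketched in PSPP syntax as follows. This is an illustrative sketch only: the file names (first.sav, second.sav, orders.sav, customers.sav) and the variable name customer_id are hypothetical, and the second command assumes that orders.sav is already sorted on customer_id.

```
* Sequential matching: no BY, so cases are paired by position.
MATCH FILES
  /FILE='first.sav'
  /FILE='second.sav'.

* Keyed matching with a table lookup file (hypothetical names).
* SORT is given for the table because it is assumed to be unsorted.
MATCH FILES
  /FILE='orders.sav'
  /TABLE='customers.sav' /SORT
  /BY customer_id.
```

In the second command, each case in the TABLE file can be matched against any number of cases in the FILE input that share the same customer_id, since table cases are not consumed after use.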
When MATCH FILES creates an output case, variables that are only in files that are not present for the current case are set to the system-missing value for numeric variables or spaces for string variables.
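
As a hedged illustration of this behavior, the sketch below matches two hypothetical files on a hypothetical key variable, respondent_id, and assumes both files are already sorted on it. The IN subcommand appears in the syntax summary above; its exact semantics are covered under Common Syntax rather than on this page, so the flag variables here (in_2023, in_2024) are only assumed to indicate whether each file contributed to a given output case.

```
* Hypothetical file and variable names throughout.
MATCH FILES
  /FILE='survey_2023.sav' /IN=in_2023
  /FILE='survey_2024.sav' /IN=in_2024
  /BY respondent_id.
```

For a respondent_id that occurs only in the first file, any variable found only in the second file would be system-missing in the output case if it is numeric, or filled with spaces if it is a string.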