All files in the corpus have this record at offset 0x100 with length
0xb0 (but readers should find this record, like the others, via the
records table in the directory). Its format is:
    uint16              one0;
    char                product[62];
    flt64               sysmis;
    uint32              zero0;
    uint32              zero1;
    uint16              one1;
    uint16              compressed;
    uint16              nominal_case_size;
    uint16              n_cases0;
    uint16              weight_index;
    uint16              zero2;
    uint16              n_cases1;
    uint16              zero3;
    char                creation_date[8];
    char                creation_time[8];
    char                label[64];
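As a rough illustration, the layout above maps directly onto a Python `struct` format string. This is a minimal sketch rather than PSPP's own reader: it assumes little-endian byte order and no padding between fields (neither is stated in this section), and the names `HEADER_FORMAT`, `HEADER_FIELDS`, and `parse_main_header` are invented for the example.

```python
import struct

# Assumption: fields are little-endian and packed without padding.
HEADER_FORMAT = "<H62sdIIHHHHHHHH8s8s64s"
HEADER_FIELDS = (
    "one0", "product", "sysmis", "zero0", "zero1", "one1", "compressed",
    "nominal_case_size", "n_cases0", "weight_index", "zero2", "n_cases1",
    "zero3", "creation_date", "creation_time", "label",
)


def parse_main_header(record: bytes) -> dict:
    """Unpack the 0xb0-byte main header record into a dict of raw fields."""
    if len(record) != struct.calcsize(HEADER_FORMAT):
        raise ValueError("main header record must be 0xb0 bytes long")
    return dict(zip(HEADER_FIELDS, struct.unpack(HEADER_FORMAT, record)))
```

The string fields are returned as raw bytes, including their space padding, so that callers can decide how to decode and trim them.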
uint16 one0;
uint16 one1;
Always set to 1.
uint32 zero0;
uint32 zero1;
uint16 zero2;
uint16 zero3;
Always set to 0.
It seems likely that one of these variables is set to 1 if weighting is enabled, but none of the files in the corpus is weighted.
char product[62];
Name of the program that created the file. Only the following unique values have been observed, in each case padded on the right with spaces:
    DESPSS/PC+ System File Written by Data Entry II
    PCSPSS SYSTEM FILE. IBM PC DOS, SPSS/PC+
    PCSPSS SYSTEM FILE. IBM PC DOS, SPSS/PC+ V3.0
    PCSPSS SYSTEM FILE. IBM PC DOS, SPSS for Windows
Thus, it is reasonable to use the presence of the string ‘SPSS’ at offset 0x104 as a simple test for an SPSS/PC+ data file.
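The test just described can be sketched in a few lines of Python. The function name `looks_like_spss_pc` is hypothetical, and the check is only a quick heuristic on the raw file contents; it does not consult the directory's records table.

```python
def looks_like_spss_pc(data: bytes) -> bool:
    """Return True if the bytes 'SPSS' appear at offset 0x104 of the file."""
    return data[0x104:0x108] == b"SPSS"
```

Both observed prefixes pass, because `product` starts at offset 0x102 and ‘SPSS’ follows the two-character ‘DE’ or ‘PC’ prefix.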
flt64 sysmis;
The system-missing value, as described previously (see SPSS/PC+ System File Format).
uint16 compressed;
Set to 0 if the data in the file is not compressed, 1 if the data is compressed with simple bytecode compression.
uint16 nominal_case_size;
Number of data elements per case. This is the number of variables, except that long string variables add extra data elements (one for every 8 bytes after the first 8). String variables in SPSS/PC+ system files are limited to 255 bytes.
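The counting rule can be written out as a small helper. This is a sketch under assumptions: it treats a partial group of 8 bytes as a full extra element (the text does not say how a remainder is handled), it uses a width of 0 to mean a numeric variable (a convention of this example only), and `data_elements` is a hypothetical name.

```python
def data_elements(width: int) -> int:
    """Data elements occupied by one variable.

    width is the string width in bytes; 0 stands for a numeric variable
    (a convention of this sketch, not of the file format).
    """
    if not 0 <= width <= 255:
        raise ValueError("string variables are limited to 255 bytes")
    if width <= 8:
        return 1
    # One element for the first 8 bytes, plus one per further 8 bytes.
    # Assumption: a trailing partial group still takes a whole element.
    return 1 + (width - 8 + 7) // 8
```

Summing `data_elements` over the variables gives the value this sketch would expect in `nominal_case_size`.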
uint16 n_cases0;
uint16 n_cases1;
The number of cases in the data record. Both values are the same. Some files in the corpus contain data for the number of cases noted here, followed by garbage that somewhat resembles data.
uint16 weight_index;
0 if the file is unweighted; otherwise, a 1-based index into the data record of the weighting variable, e.g. 4 for the first variable after the 3 system-defined variables.
char creation_date[8];
The date that the file was created, in ‘mm/dd/yy’ format. Single-digit days and months are not prefixed by zeros. The string is padded with spaces on the right, the left, or both, e.g. ‘_2/4/93_’, ‘10/5/87_’, and ‘_1/11/88’ (with ‘_’ standing in for a space) are all actual examples from the corpus.
char creation_time[8];
The time that the file was created, in ‘HH:MM:SS’ format. Single-digit hours are padded on the left with a space. Minutes and seconds are always written as two digits.
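Because of the variable padding, it is easiest to strip both fields before splitting them. The following Python sketch combines the date and time fields into one `datetime` value. The function name `parse_creation` is hypothetical, and the `century` parameter is an assumption of this example: the file stores only a two-digit year, so the caller has to supply the century.

```python
import datetime


def parse_creation(date_field: bytes, time_field: bytes,
                   century: int = 1900) -> datetime.datetime:
    """Combine the creation_date and creation_time fields.

    Assumption: the two-digit year belongs to the given century.
    """
    month, day, yy = (
        int(part) for part in date_field.decode("ascii").strip().split("/"))
    hour, minute, second = (
        int(part) for part in time_field.decode("ascii").strip().split(":"))
    return datetime.datetime(century + yy, month, day, hour, minute, second)
```

A malformed field raises `ValueError`, which a reader might prefer to catch and report as a warning rather than treat as fatal.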
char label[64];
File label declared by the user, if any (see FILE LABEL in PSPP Users Guide). Padded on the right with spaces.
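The invariants noted in the field descriptions can be checked before trusting the rest of the record. The sketch below is one possible approach, not PSPP's actual validation: it assumes little-endian, unpadded fields, the name `check_main_header` is invented, and it reports observations rather than raising errors, since the constant fields are only known from an unweighted corpus.

```python
import struct


def check_main_header(record: bytes) -> list:
    """Return a list of notes about unexpected values; empty if none."""
    # Assumption: little-endian, no padding. Only the fields up to zero3
    # are unpacked; the trailing date, time, and label are not needed here.
    (one0, _product, _sysmis, zero0, zero1, one1, compressed, _case_size,
     n_cases0, _weight_index, zero2, n_cases1, zero3) = struct.unpack_from(
        "<H62sdIIHHHHHHHH", record)
    notes = []
    if (one0, one1) != (1, 1):
        notes.append(f"one0/one1 are {one0}/{one1}, expected 1/1")
    if any((zero0, zero1, zero2, zero3)):
        notes.append("a zero field is nonzero (possibly a weighted file)")
    if compressed not in (0, 1):
        notes.append(f"unexpected compressed value {compressed}")
    if n_cases0 != n_cases1:
        notes.append(f"n_cases0 ({n_cases0}) != n_cases1 ({n_cases1})")
    return notes
```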