The data record must follow all other records in the system file.
Every system file must have a data record that specifies data for at
least one case. The format of the data record varies depending on the
value of compression in the file header record:
With no compression (compression 0), data is arranged as a series of 8-byte elements.
Each element corresponds to
the variable declared in the respective variable record (see Variable Record). Numeric values are given in
flt64 format; string
values are literal character strings, padded on the right when
necessary to fill out 8-byte units.
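As an illustration of the 8-byte element layout, the following minimal sketch decodes a run of uncompressed elements. The function name `decode_elements` and the `kinds` list are hypothetical. The sketch also assumes little-endian byte order by default and that string padding consists of spaces; a real reader would take both the byte order and the variable types from the earlier records.

```python
import struct

def decode_elements(raw, kinds, byte_order="<"):
    """Decode consecutive 8-byte elements.

    kinds[i] is "num" or "str" (a hypothetical stand-in for the
    information carried by the variable records).
    """
    if len(raw) != 8 * len(kinds):
        raise ValueError("expected exactly 8 bytes per element")
    values = []
    for i, kind in enumerate(kinds):
        chunk = raw[8 * i : 8 * i + 8]
        if kind == "num":
            # flt64: an 8-byte IEEE 754 double
            values.append(struct.unpack(byte_order + "d", chunk)[0])
        else:
            # Assumption: right padding is made of spaces
            values.append(chunk.rstrip(b" "))
    return values
```

For example, an element holding the double 5.0 followed by the string "ab" padded to 8 bytes decodes to `[5.0, b"ab"]`.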
With bytecode compression (compression 1), the first 8 bytes of the data record are divided into a series of 1-byte command codes. These codes have meanings as described below:
0: Ignored. If the program writing the system file accumulates compressed data in blocks of fixed length, 0 bytes can be used to pad out extra bytes remaining at the end of a fixed-size block.
1 through 251: A number with value code - bias, where code is the value of the compression code and bias is the bias from the file header. For example, code 105 with bias 100.0 (the normal value) indicates a numeric variable of value 5. One file has been seen written by SPSS 14 that contained such a code in a string field with the value 0 (after the bias is subtracted) as a way of encoding null bytes.
252: End of file. This code may or may not appear at the end of the data stream. PSPP always outputs this code but its use is not required.
253: A numeric or string value that is not compressible. The value is stored in the 8 bytes following the current block of command bytes. If this value appears twice in a block of command bytes, then it indicates the second group of 8 bytes following the command bytes, and so on.
254: An 8-byte string value that is all spaces.
255: The system-missing value.
The end of the 8-byte group of bytecodes is followed by any 8-byte blocks of non-compressible values indicated by code 253. After that follows another 8-byte group of bytecodes, then those bytecodes’ non-compressible values. The pattern repeats to the end of the file or a code with value 252.
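The interleaving of command bytes and literal values can be sketched as a small decoder. This is an illustrative sketch rather than PSPP's implementation: the name `decode_bytecode` and the token tuples are hypothetical, the code assignments used are 0 for padding, 1 to 251 for biased numbers, 252 for end of file, 253 for a literal value, 254 for spaces, and 255 for system-missing, and the caller is assumed to interpret literal 8-byte values as numbers or strings according to the variable they belong to.

```python
def decode_bytecode(data, bias=100.0):
    """Expand bytecode-compressed data into (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos + 8 <= len(data):
        codes = data[pos : pos + 8]   # one group of 8 command bytes
        pos += 8                      # literal values start right after it
        for code in codes:
            if code == 0:
                continue              # padding, ignored
            elif code <= 251:
                tokens.append(("number", code - bias))
            elif code == 252:
                return tokens         # explicit end of file
            elif code == 253:
                if pos + 8 > len(data):
                    raise ValueError("truncated literal value")
                # Each 253 consumes the next unread 8-byte literal
                tokens.append(("raw", data[pos : pos + 8]))
                pos += 8
            elif code == 254:
                tokens.append(("spaces", b" " * 8))
            else:                     # code == 255
                tokens.append(("sysmis", None))
    return tokens
```

Because `pos` already points past the command group when a 253 is met, the first 253 picks up the first literal, the second 253 the second literal, and the next command group is read from wherever the literals end.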
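With ZLIB compression (compression 2), the data record consists of the following, in order: a 24-byte ZLIB data header, one or more variable-length blocks of ZLIB compressed data, and a variable-length ZLIB data trailer.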
The ZLIB data header has the following format:
    int64 zheader_ofs;
    int64 ztrailer_ofs;
    int64 ztrailer_len;
zheader_ofs: The offset, in bytes, of the beginning of this structure within the system file.
ztrailer_ofs: The offset, in bytes, of the first byte of the ZLIB data trailer.
ztrailer_len: The number of bytes in the ZLIB data trailer. This and the previous field sum to the size of the system file in bytes.
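The two consistency rules stated for this header (the first field equals the header's own position, and the last two fields sum to the file size) lend themselves to a quick check. The sketch below is illustrative: `parse_zlib_header` and its parameters are hypothetical names, and little-endian byte order is assumed by default.

```python
import struct

def parse_zlib_header(buf, actual_ofs, file_size, byte_order="<"):
    """Unpack the three int64 header fields and sanity-check them."""
    if len(buf) < 24:
        raise ValueError("ZLIB data header needs 24 bytes")
    zheader_ofs, ztrailer_ofs, ztrailer_len = struct.unpack(
        byte_order + "3q", buf[:24]
    )
    # The header records its own offset within the system file
    if zheader_ofs != actual_ofs:
        raise ValueError("zheader_ofs does not match header position")
    # The trailer is the last thing in the file
    if ztrailer_ofs + ztrailer_len != file_size:
        raise ValueError("trailer does not end at end of file")
    return zheader_ofs, ztrailer_ofs, ztrailer_len
```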
The data header is followed by
(ztrailer_len - 24) / 24 ZLIB
compressed data blocks. Each ZLIB compressed data block begins with a
ZLIB header as specified in RFC 1950, e.g. hex bytes
78 01 (the only header yet observed in practice). Each block
decompresses to a fixed number of bytes (in practice only
0x3ff000-byte blocks have been observed), except that the last
block of data may be shorter. The last ZLIB compressed data block
ends just before offset ztrailer_ofs.
The result of ZLIB decompression is bytecode compressed data as described above for compression format 1.
The ZLIB data trailer begins with the following 24-byte fixed header:
    int64 int_bias;
    int64 zero;
    int32 block_size;
    int32 n_blocks;
int_bias: The compression bias as a negative integer, e.g. if bias in the file header record is 100.0, then int_bias is -100 (this is the only value yet observed in practice).
zero: Always observed to be zero.
block_size: The number of bytes in each ZLIB compressed data block, except possibly the last, following decompression. Only 0x3ff000 has been observed so far.
n_blocks: The number of ZLIB compressed data blocks, always exactly (ztrailer_len - 24) / 24.
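Since the trailer is a 24-byte fixed header plus one 24-byte descriptor per block, the block count can be cross-checked against the trailer length. The following sketch shows one way to do that; `parse_ztrailer_header` is a hypothetical name, little-endian byte order is an assumption, and the `zero` field is returned unchecked because the text only says it has been observed to be zero.

```python
import struct

def parse_ztrailer_header(buf, bias, ztrailer_len, byte_order="<"):
    """Unpack the 24-byte fixed trailer header and cross-check it."""
    if len(buf) < 24:
        raise ValueError("trailer fixed header needs 24 bytes")
    int_bias, zero, block_size, n_blocks = struct.unpack(
        byte_order + "qqii", buf[:24]
    )
    # int_bias is the file header's bias, negated, as an integer
    if int_bias != -bias:
        raise ValueError("int_bias does not match file header bias")
    # 24-byte fixed header followed by n_blocks 24-byte descriptors
    if (ztrailer_len - 24) % 24 != 0 or n_blocks != (ztrailer_len - 24) // 24:
        raise ValueError("n_blocks inconsistent with trailer length")
    return int_bias, zero, block_size, n_blocks
```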
The fixed header is followed by
n_blocks 24-byte ZLIB data
block descriptors, each of which describes the compressed data block
corresponding to its offset. Each block descriptor has the following format:
    int64 uncompressed_ofs;
    int64 compressed_ofs;
    int32 uncompressed_size;
    int32 compressed_size;
uncompressed_ofs: The offset, in bytes, that this block of data would have in a similar system file that uses compression format 1. This is zheader_ofs in the first block descriptor, and in each succeeding block descriptor it is the sum of the previous descriptor’s uncompressed_ofs and uncompressed_size.
compressed_ofs: The offset, in bytes, of the actual beginning of this compressed data block. This is zheader_ofs + 24 in the first block descriptor, and in each succeeding block descriptor it is the sum of the previous descriptor’s compressed_ofs and compressed_size. The final block descriptor’s compressed_ofs and compressed_size sum to ztrailer_ofs.
uncompressed_size: The number of bytes in this data block, after decompression. This is block_size in every data block except the last, which may be smaller.
compressed_size: The number of bytes in this data block, as stored compressed in this system file.
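The chained offsets in the block descriptors can be verified while decompressing. The sketch below walks the descriptors, checks each against the running totals, and concatenates the decompressed blocks, which then form bytecode compressed data. It is a sketch under assumptions: `read_zlib_blocks` is a hypothetical name, the whole file is assumed to be in memory as `file_bytes`, and little-endian byte order is assumed by default.

```python
import struct
import zlib

def read_zlib_blocks(file_bytes, zheader_ofs, ztrailer_ofs,
                     n_blocks, block_size, byte_order="<"):
    """Validate the block descriptors and return the decompressed data."""
    expected_unc = zheader_ofs        # first uncompressed_ofs
    expected_cmp = zheader_ofs + 24   # first block follows the 24-byte header
    out = bytearray()
    for i in range(n_blocks):
        # Descriptors follow the 24-byte fixed trailer header
        start = ztrailer_ofs + 24 + 24 * i
        unc_ofs, cmp_ofs, unc_size, cmp_size = struct.unpack(
            byte_order + "qqii", file_bytes[start : start + 24]
        )
        if unc_ofs != expected_unc or cmp_ofs != expected_cmp:
            raise ValueError(f"descriptor {i}: offsets do not chain")
        last = i == n_blocks - 1
        if unc_size > block_size or (not last and unc_size != block_size):
            raise ValueError(f"descriptor {i}: unexpected uncompressed_size")
        chunk = zlib.decompress(file_bytes[cmp_ofs : cmp_ofs + cmp_size])
        if len(chunk) != unc_size:
            raise ValueError(f"descriptor {i}: decompressed size mismatch")
        out += chunk
        expected_unc += unc_size
        expected_cmp += cmp_size
    # The last compressed block ends just before the trailer
    if expected_cmp != ztrailer_ofs:
        raise ValueError("compressed blocks do not end at ztrailer_ofs")
    return bytes(out)
```

Checking the offsets before decompressing each block keeps a corrupt descriptor from sending the reader to an arbitrary position in the file.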