The data record must follow all other records in the system file.
Every system file must have a data record that specifies data for at
least one case. The format of the data record varies depending on the
value of compression in the file header record:
With no compression (compression value 0), data is arranged as a series of 8-byte elements.
Each element corresponds to the variable declared in the respective variable record (see Variable Record). Numeric values are given in flt64 format; string values are literal character strings, padded on the right when necessary to fill out 8-byte units.
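The uncompressed layout can be read with a loop like the one below. This is only a sketch: the function name, the `kinds` list (one "num" or "str" entry per 8-byte element) and the little-endian default are assumptions made for illustration, not part of the format description.

```python
import struct

def read_elements(data, kinds, endian="<"):
    """Split uncompressed data into 8-byte elements.

    kinds is a hypothetical list with one entry per element:
    "num" for a flt64 value, "str" for 8 bytes of string data.
    endian is an assumption; use ">" for big-endian files.
    """
    values = []
    for i, kind in enumerate(kinds):
        chunk = data[8 * i : 8 * i + 8]
        if len(chunk) != 8:
            raise ValueError("truncated 8-byte element")
        if kind == "num":
            values.append(struct.unpack(endian + "d", chunk)[0])
        else:
            # String data is returned as-is, including right padding.
            values.append(chunk)
    return values
```

A string variable wider than 8 bytes would occupy several consecutive "str" elements, which the caller can concatenate.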
With bytecode compression (compression value 1), the first 8 bytes of the data record are divided into a series of 1-byte command codes. These codes have meanings as described below:
0
Ignored. If the program writing the system file accumulates compressed data in blocks of fixed length, 0 bytes can be used to pad out extra bytes remaining at the end of a fixed-size block.
1 through 251
A number with value code - bias, where code is the value of the compression code and bias is the variable bias from the file header. For example, code 105 with bias 100.0 (the normal value) indicates a numeric variable of value 5.
A code of 0 (after subtracting the bias) in a string field encodes null bytes. This is unusual, since a string field normally encodes text data, but it exists in real system files.
252
End of file. This code may or may not appear at the end of the data stream. PSPP always outputs this code but its use is not required.
253
A numeric or string value that is not compressible. The value is stored in the 8 bytes following the current block of command bytes. If this value appears twice in a block of command bytes, then it indicates the second group of 8 bytes following the command bytes, and so on.
254
An 8-byte string value that is all spaces.
255
The system-missing value.
The end of the 8-byte group of bytecodes is followed by any 8-byte blocks of non-compressible values indicated by code 253. After that follows another 8-byte group of bytecodes, then those bytecodes’ non-compressible values. The pattern repeats to the end of the file or a code with value 252.
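The repeating pattern of command bytes and non-compressible values can be expanded with a loop such as the following. It is a minimal sketch under stated assumptions: the function name is hypothetical, the system-missing value is represented as None, and values stored after the command bytes are returned as raw 8-byte strings because the decoder does not know each variable's type.

```python
def decompress_bytecode(data, bias=100.0):
    """Expand bytecode-compressed data into a list of elements.

    Returns floats for biased codes, raw 8-byte values for code 253,
    eight spaces for code 254, and None (an assumed placeholder)
    for the system-missing value.
    """
    out = []
    pos = 0
    while pos + 8 <= len(data):
        codes = data[pos : pos + 8]
        pos += 8  # raw values for this group start here
        for code in codes:
            if code == 0:
                continue  # padding, ignored
            if code == 252:
                return out  # end of file
            if code == 253:
                raw = data[pos : pos + 8]
                if len(raw) != 8:
                    raise ValueError("truncated non-compressible value")
                out.append(raw)
                pos += 8  # next 253 in this group uses the next 8 bytes
            elif code == 254:
                out.append(b" " * 8)
            elif code == 255:
                out.append(None)
            else:
                out.append(code - bias)
    return out
```

The caller still has to map each element onto its variable, for example by unpacking a raw value as flt64 for a numeric variable or by treating a biased code in a string field as described above.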
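With ZLIB compression (compression value 2), the data record consists of the following, in order: a 24-byte ZLIB data header, one or more variable-length blocks of ZLIB compressed data, and a variable-length ZLIB data trailer.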
The ZLIB data header has the following format:
    int64 zheader_ofs;
    int64 ztrailer_ofs;
    int64 ztrailer_len;
int64 zheader_ofs;
The offset, in bytes, of the beginning of this structure within the system file.
int64 ztrailer_ofs;
The offset, in bytes, of the first byte of the ZLIB data trailer.
int64 ztrailer_len;
The number of bytes in the ZLIB data trailer. This and the previous field sum to the size of the system file in bytes.
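The three header fields and the two consistency rules stated for them can be checked as sketched below. The function name, its parameters, and the little-endian default are assumptions for illustration.

```python
import struct

def parse_zheader(buf, offset, file_size, endian="<"):
    """Read the 24-byte ZLIB data header found at `offset` in `buf`.

    buf is assumed to hold the whole system file; endian is an
    assumption and should match the file's integer byte order.
    """
    zheader_ofs, ztrailer_ofs, ztrailer_len = struct.unpack_from(
        endian + "3q", buf, offset
    )
    # zheader_ofs records the header's own position in the file.
    if zheader_ofs != offset:
        raise ValueError("zheader_ofs does not match header position")
    # The trailer runs to the end of the file.
    if ztrailer_ofs + ztrailer_len != file_size:
        raise ValueError("trailer offset and length do not sum to file size")
    return zheader_ofs, ztrailer_ofs, ztrailer_len
```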
The data header is followed by (ztrailer_len - 24) / 24 ZLIB compressed data blocks. Each ZLIB compressed data block begins with a ZLIB header as specified in RFC 1950, e.g. hex bytes 78 01 (the only header yet observed in practice). Each block decompresses to a fixed number of bytes (in practice only 0x3ff000-byte blocks have been observed), except that the last block of data may be shorter. The last ZLIB compressed data block ends just before offset ztrailer_ofs.
The result of ZLIB decompression is bytecode compressed data as described above for compression format 1.
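Because each block is a complete RFC 1950 stream, the blocks can be inflated one at a time and concatenated. The sketch below assumes the caller already knows each block's position and stored size, passed as hypothetical (compressed_ofs, compressed_size) pairs; the function name is likewise an assumption.

```python
import zlib

def inflate_blocks(file_bytes, blocks):
    """Concatenate the decompressed contents of ZLIB data blocks.

    blocks is an iterable of (compressed_ofs, compressed_size) pairs,
    a hypothetical input describing where each block is stored.
    """
    out = bytearray()
    for ofs, size in blocks:
        # Each slice is a self-contained ZLIB stream with its own header.
        out += zlib.decompress(file_bytes[ofs : ofs + size])
    return bytes(out)
```

The returned bytes are bytecode compressed data and would still need to be expanded as for compression format 1.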
The ZLIB data trailer begins with the following 24-byte fixed header:
    int64 int_bias;
    int64 zero;
    int32 block_size;
    int32 n_blocks;
int64 int_bias;
The compression bias as a negative integer, e.g. if bias in the file header record is 100.0, then int_bias is −100 (this is the only value yet observed in practice).
int64 zero;
Always observed to be zero.
int32 block_size;
The number of bytes in each ZLIB compressed data block, except possibly the last, following decompression. Only 0x3ff000 has been observed so far.
int32 n_blocks;
The number of ZLIB compressed data blocks, always exactly (ztrailer_len - 24) / 24.
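The 24-byte fixed header of the trailer can be unpacked and cross-checked against ztrailer_len as in this sketch. The function name and the little-endian default are assumptions for illustration.

```python
import struct

def parse_ztrailer_header(buf, ztrailer_ofs, ztrailer_len, endian="<"):
    """Read the fixed header at the start of the ZLIB data trailer.

    buf is assumed to hold the whole system file; endian is an
    assumption and should match the file's integer byte order.
    """
    int_bias, zero, block_size, n_blocks = struct.unpack_from(
        endian + "qqii", buf, ztrailer_ofs
    )
    # The trailer is the fixed header plus one 24-byte entry per block.
    if (ztrailer_len - 24) % 24 != 0:
        raise ValueError("ztrailer_len is not 24 plus a multiple of 24")
    if n_blocks != (ztrailer_len - 24) // 24:
        raise ValueError("n_blocks does not match ztrailer_len")
    return int_bias, zero, block_size, n_blocks
```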
The fixed header is followed by n_blocks 24-byte ZLIB data block descriptors, each of which describes the compressed data block corresponding to its offset. Each block descriptor has the following format:
    int64 uncompressed_ofs;
    int64 compressed_ofs;
    int32 uncompressed_size;
    int32 compressed_size;
int64 uncompressed_ofs;
The offset, in bytes, that this block of data would have in a similar system file that uses compression format 1. This is zheader_ofs in the first block descriptor, and in each succeeding block descriptor it is the sum of the previous descriptor’s uncompressed_ofs and uncompressed_size.
int64 compressed_ofs;
The offset, in bytes, of the actual beginning of this compressed data block. This is zheader_ofs + 24 in the first block descriptor, and in each succeeding block descriptor it is the sum of the previous descriptor’s compressed_ofs and compressed_size. The final block descriptor’s compressed_ofs and compressed_size sum to ztrailer_ofs.
int32 uncompressed_size;
The number of bytes in this data block, after decompression. This is block_size in every data block except the last, which may be smaller.
int32 compressed_size;
The number of bytes in this data block, as stored compressed in this system file.
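The running-sum relationships among the descriptor fields lend themselves to a validation pass while reading. The following is a sketch only: the function name, the tuple layout of the result, and the little-endian default are assumptions, and a real reader might prefer warnings over exceptions.

```python
import struct

def parse_block_descriptors(buf, zheader_ofs, ztrailer_ofs,
                            block_size, n_blocks, endian="<"):
    """Read and cross-check the ZLIB data block descriptors.

    Returns a list of (uncompressed_ofs, compressed_ofs,
    uncompressed_size, compressed_size) tuples.
    """
    descriptors = []
    expect_u = zheader_ofs        # first uncompressed_ofs
    expect_c = zheader_ofs + 24   # first block follows the data header
    for i in range(n_blocks):
        # Descriptors start right after the 24-byte fixed header.
        pos = ztrailer_ofs + 24 + 24 * i
        u_ofs, c_ofs, u_size, c_size = struct.unpack_from(
            endian + "qqii", buf, pos
        )
        if u_ofs != expect_u or c_ofs != expect_c:
            raise ValueError("descriptor %d has unexpected offsets" % i)
        is_last = i == n_blocks - 1
        if u_size > block_size or (not is_last and u_size != block_size):
            raise ValueError("descriptor %d has unexpected size" % i)
        descriptors.append((u_ofs, c_ofs, u_size, c_size))
        expect_u += u_size
        expect_c += c_size
    # The last compressed block ends where the trailer begins.
    if n_blocks and expect_c != ztrailer_ofs:
        raise ValueError("compressed blocks do not end at ztrailer_ofs")
    return descriptors
```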