SPSS/PC+, first released in 1984, was a simplified version of SPSS for IBM PC and compatible computers. It used a data file format related to the one described in the previous chapter, but simplified and incompatible. The SPSS/PC+ software became obsolete in the 1990s, so files in this format are rarely encountered today. Nevertheless, for completeness, and because it is not very difficult, it seems worthwhile to support at least reading these files. This chapter documents this format, based on examination of a corpus of about 60 files from a variety of sources.
System files use four data types: 8-bit characters, 16-bit unsigned integers, 32-bit unsigned integers, and 64-bit floating-point numbers, called here char, uint16, uint32, and flt64, respectively. Data is not necessarily aligned on a word or double-word boundary.
SPSS/PC+ ran only on IBM PC and compatible computers. Therefore, values in these files are always in little-endian byte order. Floating-point numbers are always in IEEE 754 format.
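As an illustration of the conventions above, the following Python sketch decodes each of the four types from an arbitrary, possibly unaligned, byte offset. The names FORMATS and read_value and the sample buffer are hypothetical, chosen only for this example; the "<" prefix in the struct module selects little-endian byte order and disables alignment padding.

```python
import struct

# Little-endian format codes for the four data types named above.
# The "<" prefix also disables alignment padding, which matches the
# statement that data is not necessarily aligned.
FORMATS = {
    "char": "<c",
    "uint16": "<H",
    "uint32": "<I",
    "flt64": "<d",
}


def read_value(buf, offset, type_name):
    """Decode one value of the given type at an arbitrary byte offset."""
    (value,) = struct.unpack_from(FORMATS[type_name], buf, offset)
    return value


# Hypothetical sample: one pad byte, then a uint16, a uint32, and a flt64,
# so that every multi-byte value sits at an odd (unaligned) offset.
sample = b"\x00" + struct.pack("<HId", 513, 2, 1.5)
```

With this sample, read_value(sample, 1, "uint16") reads the 16-bit integer at offset 1, and the 32-bit integer and the floating-point number follow at offsets 3 and 7.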
SPSS/PC+ system files represent the system-missing value as -1.66e308, or f5 1e 26 02 8a 8c ed ff expressed as hexadecimal bytes in file order. (This is an unusual choice: it is close to, but not equal to, the most negative finite 64-bit IEEE 754 value, which is about -1.8e308.)
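The byte pattern can be checked with a short Python sketch that decodes it as a little-endian flt64. The names SYSMIS_BYTES, SYSMIS, and is_sysmis are hypothetical. Comparing raw bytes rather than floating-point values is a design choice made for this example, since it avoids any dependence on how a decimal approximation such as -1.66e308 rounds.

```python
import struct
import sys

# The eight bytes quoted above, in the order they appear in the file.
SYSMIS_BYTES = bytes.fromhex("f5 1e 26 02 8a 8c ed ff")

# Decoded as a little-endian IEEE 754 double.
SYSMIS = struct.unpack("<d", SYSMIS_BYTES)[0]


def is_sysmis(raw8):
    """Return True if the 8 raw bytes of a flt64 are the system-missing value."""
    return bytes(raw8) == SYSMIS_BYTES


# The value is negative and finite, but not the most negative finite double.
assert SYSMIS > -sys.float_info.max
```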
Text in SPSS/PC+ system files is encoded in ASCII-based 8-bit MS-DOS codepages. The files in the corpus used to investigate the format were all ASCII-only.
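A reader therefore has to choose a codepage when converting text to Unicode. The sketch below is only an illustration: the function name decode_text is hypothetical, and the default of codepage 437 is an assumption made for this example, not something the format specifies. Because every supported codepage is ASCII-based, ASCII-only text decodes the same way whichever one is chosen.

```python
def decode_text(raw, encoding="cp437"):
    """Decode text bytes using an MS-DOS codepage.

    The default, cp437, is an assumption for this sketch; a real reader
    would probably let the user override it.
    """
    return raw.decode(encoding)


# ASCII-only text is unaffected by the choice of codepage.
assert decode_text(b"AGE", "cp437") == decode_text(b"AGE", "cp850")
```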
An SPSS/PC+ system file begins with the following 256-byte directory:
    uint32 two;
    uint32 zero;
    struct {
        uint32 ofs;
        uint32 len;
    } records[15];
    char filename[128];
uint32 two;
uint32 zero;
Always set to 2 and 0, respectively.
These fields could be used as a signature for the file format, but the product field in record 0 seems more likely to be unique (see Record 0: Main Header Record).
struct { … } records[15];
Each of the elements in this array identifies a record in the system file. The ofs field is a byte offset, from the beginning of the file, that identifies the start of the record. The len field specifies the length of the record, in bytes. Many records are optional or not used. If a record is not present, ofs and len for that record are both zero.
char filename[128];
In most files in the corpus, this field is entirely filled with spaces. In one file, it contains a file name, followed by a null byte, followed by spaces to fill the remainder of the field. The meaning is unknown.
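The directory layout described above can be sketched in Python as follows. This is a minimal illustration, not the PSPP implementation: the names DIRECTORY, Directory, and parse_directory are hypothetical, and rejecting files whose first two fields are not 2 and 0 is a choice made for this example.

```python
import struct
from typing import NamedTuple

# two, zero, 15 (ofs, len) pairs, and the 128-byte filename field,
# all little-endian with no padding: 4 + 4 + 15 * 8 + 128 = 256 bytes.
DIRECTORY = struct.Struct("<II" + "II" * 15 + "128s")


class Directory(NamedTuple):
    records: list    # 15 (ofs, len) tuples, (0, 0) where a record is absent
    filename: bytes  # raw 128-byte field; its meaning is unknown


def parse_directory(header):
    """Parse the 256-byte directory at the start of a system file."""
    if len(header) < DIRECTORY.size:
        raise ValueError("file too short to contain a directory")
    fields = DIRECTORY.unpack_from(header)
    if (fields[0], fields[1]) != (2, 0):
        raise ValueError("unexpected values in the 'two' and 'zero' fields")
    pairs = fields[2:32]
    records = list(zip(pairs[0::2], pairs[1::2]))
    return Directory(records, fields[32])
```

The filename field is returned as raw bytes because its meaning is unknown; a caller that wants to display it could strip the trailing spaces and null byte itself.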
The following sections describe the contents of each record, identified by the index into the records array.
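Looking up a record by its index might look like the following sketch. The function name extract_record is hypothetical, and it assumes the records array has already been decoded into a list of (ofs, len) tuples. The bounds check is a precaution added for this example, not a requirement stated by the format.

```python
def extract_record(data, records, index):
    """Return the bytes of record `index`, or None if the record is absent.

    `data` is the whole file as bytes; `records` is a list of (ofs, len)
    tuples decoded from the directory.
    """
    ofs, length = records[index]
    if ofs == 0 and length == 0:
        return None  # both fields zero: record not present
    if ofs + length > len(data):
        raise ValueError("record %d extends past the end of the file" % index)
    return data[ofs:ofs + length]
```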