Encrypted File Wrappers
SPSS 21 and later can package multiple kinds of files inside an encrypted wrapper. The wrapper has a common format, regardless of the kind of the file that it contains.
⚠️ Warning: The SPSS encryption wrapper is poorly designed. When the password is unknown, it is much cheaper and faster to decrypt a file encrypted this way than if a well designed alternative were used. If you must use this format, use a 10-byte randomly generated password.
Common Wrapper Format
An encrypted file wrapper begins with the following 36-byte header,
where xxx
identifies the type of file encapsulated: SAV
for a system
file, SPS
for a syntax file, SPV
for a viewer file. PSPP code for
identifying these files just checks for the ENCRYPTED
keyword at
offset 8, but the other bytes are also fixed in practice:
0000 1c 00 00 00 00 00 00 00 45 4e 43 52 59 50 54 45 |........ENCRYPTE|
0010 44 xx xx xx 15 00 00 00 00 00 00 00 00 00 00 00 |Dxxx............|
0020 00 00 00 00 |....|
Following the fixed header is essentially the regular contents of the encapsulated file in its usual format, with each 16-byte block encrypted with AES-256 in ECB mode.
To make the plaintext an even multiple of 16 bytes in length, the encryption process appends PKCS #7 padding, as specified in RFC 5652 section 6.3. Padding appends 1 to 16 bytes to the plaintext, in which each byte of padding is the number of padding bytes added. If the plaintext is, for example, 2 bytes short of a multiple of 16, the padding is 2 bytes with value 02; if the plaintext is a multiple of 16 bytes in length, the padding is 16 bytes with value 0x10.
The AES-256 key is derived from a password in the following way:
-
Start from the literal password typed by the user. Truncate it to at most 10 bytes, then append as many null bytes as necessary until there are exactly 32 bytes. Call this
password
. -
Let
constant
be the following 73-byte constant:0000 00 00 00 01 35 27 13 cc 53 a7 78 89 87 53 22 11 0010 d6 5b 31 58 dc fe 2e 7e 94 da 2f 00 cc 15 71 80 0020 0a 6c 63 53 00 38 c3 38 ac 22 f3 63 62 0e ce 85 0030 3f b8 07 4c 4e 2b 77 c7 21 f5 1a 80 1d 67 fb e1 0040 e1 83 07 d8 0d 00 00 01 00
-
Compute
CMAC-AES-256(password, constant)
. Call the 16-byte resultcmac
. -
The 32-byte AES-256 key is
cmac || cmac
, that is,cmac
repeated twice.
Example
Consider the password pspp
. password
is:
0000 70 73 70 70 00 00 00 00 00 00 00 00 00 00 00 00 |pspp............|
0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
cmac
is:
0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
The AES-256 key is:
0000 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
0010 3e da 09 8e 66 04 d4 fd f9 63 0c 2c a8 6f b0 45
Checking Passwords
A program reading an encrypted file may wish to verify that the password it was given is the correct one. One way is to verify that the PKCS #7 padding at the end of the file is well formed. However, any plaintext that ends in byte 01 is well formed PKCS #7, meaning that about 1 in 256 keys will falsely pass this test. This might be acceptable for interactive use, but the false positive rate is too high for a brute-force search of the password space.
A better test requires some knowledge of the file format being wrapped, to obtain a "magic number" for the beginning of the file.
-
The plaintext of system files begins with
$FL2@(#)
or$FL3@(#)
. -
Before encryption, a syntax file is prefixed with a line at the beginning of the form
* Encoding: ENCODING.
, where ENCODING is the encoding used for the rest of the file, e.g.windows-1252
. Thus,* Encoding
may be used as a magic number for system files. -
The plaintext of viewer files begins with
50 4b 03 04 14 00 08
(50 4b
isPK
).
Password Encoding
SPSS also supports what it calls "encrypted passwords."
⚠️ Warning: SPSS "encrypted passwords" are not encrypted. They are encoded with a simple, fixed scheme and can be decoded to the original password using the rules described below.
An encoded password is always a multiple of 2 characters long, and never longer than 20 characters. The characters in an encoded password are always in the graphic ASCII range 33 through 126. Each successive pair of characters in the password encodes a single byte in the plaintext password.
Use the following algorithm to decode a pair of characters:
-
Let
a
be the ASCII code of the first character, andb
be the ASCII code of the second character. -
Let
ah
be the most significant 4 bits ofa
. Find the line in the table below that hasah
on the left side. The right side of the line is a set of possible values for the most significant 4 bits of the decoded byte.2 ⇒ 2367 3 ⇒ 0145 47 ⇒ 89cd 56 ⇒ abef
-
Let
bh
be the most significant 4 bits ofb
. Find the line in the second table below that hasbh
on the left side. The right side of the line is a set of possible values for the most significant 4 bits of the decoded byte. Together with the results of the previous step, only a single possibility is left.2 ⇒ 139b 3 ⇒ 028a 47 ⇒ 46ce 56 ⇒ 57df
-
Let
al
be the least significant 4 bits ofa
. Find the line in the table below that hasal
on the left side. The right side of the line is a set of possible values for the least significant 4 bits of the decoded byte.03cf ⇒ 0145 12de ⇒ 2367 478b ⇒ 89cd 569a ⇒ abef
-
Let
bl
be the least significant 4 bits ofb
. Find the line in the table below that hasbl
on the left side. The right side of the line is a set of possible values for the least significant 4 bits of the decoded byte. Together with the results of the previous step, only a single possibility is left.03cf ⇒ 028a 12de ⇒ 139b 478b ⇒ 46ce 569a ⇒ 57df
Example
Consider the encoded character pair -|
. a
is 0x2d and b
is
0x7c, so ah
is 2, bh
is 7, al
is 0xd, and bl
is 0xc. ah
means that the most significant four bits of the decoded character is
2, 3, 6, or 7, and bh
means that they are 4, 6, 0xc, or 0xe. The
single possibility in common is 6, so the most significant four bits
are 6. Similarly, al
means that the least significant four bits are
2, 3, 6, or 7, and bl
means they are 0, 2, 8, or 0xa, so the least
significant four bits are 2. The decoded character is therefore 0x62,
the letter b
.