IBM SPSS 21 and later support a form of encrypted data files. The cryptographic scheme used, which is not publicly documented, has recently come to my attention. It is flawed enough that I feel I must present it publicly.
The encrypted data file format is identical to the pre-existing plaintext file format (whose details aren't important here), except that each 16-byte block is encrypted with AES-256 in ECB mode. The AES-256 key is derived from a password by a single AES-256 CMAC operation, as:

    CMAC-AES-256(password, constant)
where password is the literal password typed by the user, padded on the right with zeros to fill out a 32-byte AES-256 key (CMAC, unlike HMAC, needs a key of the cipher's exact key length rather than an arbitrary string of bytes), and constant is a particular 73-byte constant. CMAC with AES produces only a 16-byte result, but AES-256 needs a 32-byte key, so the 16-byte result is concatenated with itself to expand it to 32 bytes.
(I think that the authors of the implementation must have thought they were doing something smart, because the 73-byte constant is in the right form for the NIST SP 800-108 key derivation function in counter mode. But that KDF is meant for deriving one cryptographic key from another, not from a password.)
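The derivation described above can be sketched roughly as follows. This is only an illustration of the steps as I have described them, not the SPSS code: the names pad_password and derive_file_key are made up here, the 10-byte truncation is the one noted in the list of problems, and the CMAC itself is passed in as a caller-supplied function (for example, one built on a library such as OpenSSL), because Python's standard library has no AES.

```python
from typing import Callable

MAX_PASSWORD_BYTES = 10  # silent truncation noted in the list of problems
AES256_KEY_BYTES = 32
CMAC_TAG_BYTES = 16


def pad_password(password: bytes) -> bytes:
    """Truncate to 10 bytes, then zero-pad on the right to 32 bytes."""
    truncated = password[:MAX_PASSWORD_BYTES]
    return truncated.ljust(AES256_KEY_BYTES, b"\x00")


def derive_file_key(
    password: bytes,
    constant: bytes,
    cmac_aes256: Callable[[bytes, bytes], bytes],
) -> bytes:
    """Derive the 32-byte file key.

    cmac_aes256(key, message) is a hypothetical caller-supplied function
    that returns the 16-byte CMAC-AES-256 tag of message under key.
    """
    tag = cmac_aes256(pad_password(password), constant)
    if len(tag) != CMAC_TAG_BYTES:
        raise ValueError("CMAC-AES-256 must return a 16-byte tag")
    # The 16-byte tag is repeated to fill out a 32-byte AES-256 key.
    return tag + tag
```

Note that the two halves of the resulting key are always identical, and that any two passwords sharing their first 10 bytes give the same key.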
The problems I see:
Cheap password derivation function (single round of CMAC) instead of an intentionally expensive function like PBKDF2 with thousands of iterations.
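No salt, and the first 16 bytes of plaintext are essentially constant (as a magic number). Because the same password always yields the same key and therefore the same first ciphertext block, I believe that an attacker could precompute a table once (a rainbow table, for instance) and reuse it against every encrypted file.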
The password is silently truncated after 10 bytes, limiting the actual entropy in the key to 80 bits at the very most, and realistically more like 40 to 60 bits. (AES-256 is obviously overkill.)
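To put rough numbers on the points above, here is a small sketch using only Python's standard library. The function names are made up for illustration, the iteration count is an arbitrary example rather than a recommendation, and the entropy figures assume every character is chosen uniformly at random, which is generous for human-chosen passwords.

```python
import hashlib
import math
import os


def password_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on entropy, counting only the first 10 bytes."""
    effective_length = min(length, 10)
    return effective_length * math.log2(alphabet_size)


def slow_salted_key(password: bytes, salt: bytes, iterations: int = 10_000) -> bytes:
    """What a salted, deliberately expensive derivation might look like."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


if __name__ == "__main__":
    print(password_bits(256, 20))           # arbitrary bytes: 80.0
    print(round(password_bits(26, 10), 1))  # lowercase letters: 47.0
    print(round(password_bits(62, 10), 1))  # letters and digits: 59.5

    # With a random per-file salt, the same password gives unrelated keys,
    # so a table precomputed for one file is useless against another.
    key_a = slow_salted_key(b"hunter2", os.urandom(16))
    key_b = slow_salted_key(b"hunter2", os.urandom(16))
    print(key_a != key_b)
```

The salt would have to be stored alongside the file, which the format as described has no provision for.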
Governments, universities, and companies use SPSS to analyze survey data that sometimes contain people's personal information that must not become public, so confidentiality is actually important here. Mitigating that a little, this encrypted format is new in the last year or two, and a lot of organizations don't upgrade SPSS frequently because it is very expensive, so the encrypted format may not be widely used yet.
SPSS documentation talks about “encrypted passwords” that can be used in SPSS program syntax in place of plaintext passwords. However, calling these passwords “encrypted” is a misnomer, because the encoding algorithm is simple, fixed, and unkeyed.
I have written a program that decrypts such a data file, given the plaintext or “encrypted” password. Compile the source code and link against libcrypto (from OpenSSL).