

F.3 Grammar Rules

The grammar rules describe the format of the syntax that the parser generated by q2c will understand. The way that the grammar rules are included in a q2c input file is described above.

The grammar rules are divided into tokens of the following types:

Identifier (ID)

An identifier token is a sequence of letters, digits, and underscores (‘_’). Identifiers are not case-sensitive.

String (STRING)

A string token begins with a double-quote character (‘"’) and consists of all the characters between that double quote and the next double quote, which must be on the same line as the first. Within a string, a backslash can be used as a “literal escape”, so that the character following it is taken literally. The only reasons to use a literal escape are to include a double quote or a backslash within a string.

Special character

Any other character, apart from white space, constitutes a token in itself.
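The three token types above can be illustrated with a small tokenizer. This is only a sketch in Python, not q2c's own lexer: the function name `tokenize`, the `(kind, value)` tuple representation, the restriction to ASCII letters, and the choice to fold identifiers to upper case are all assumptions made for the illustration.

```python
import string

# Assumption: "letters" means ASCII letters.
ID_CHARS = set(string.ascii_letters + string.digits + "_")


def tokenize(text):
    """Split grammar-rule text into (kind, value) pairs (hypothetical helper)."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1  # white space separates tokens but is not a token itself
        elif ch in ID_CHARS:
            start = i
            while i < len(text) and text[i] in ID_CHARS:
                i += 1
            # Identifiers are not case-sensitive; fold to upper case here.
            tokens.append(("ID", text[start:i].upper()))
        elif ch == '"':
            i += 1
            chars = []
            while True:
                if i >= len(text) or text[i] == "\n":
                    raise ValueError("string must end on the line where it starts")
                if text[i] == '"':
                    i += 1
                    break
                if text[i] == "\\":
                    # Literal escape: take the following character as-is.
                    i += 1
                    if i >= len(text) or text[i] == "\n":
                        raise ValueError("string must end on the line where it starts")
                chars.append(text[i])
                i += 1
            tokens.append(("STRING", "".join(chars)))
        else:
            # Every other non-space character is a token in itself.
            tokens.append(("CHAR", ch))
            i += 1
    return tokens
```

For instance, `tokenize('"FILE HANDLE" (fh_):')` yields one STRING token, followed by the special characters and the ID that make up the parenthesized prefix and the colon.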

The syntax of the grammar rules is as follows:

grammar-rules ::= command-name opt-prefix : subcommands .
command-name ::= ID
             ::= STRING
opt-prefix ::=
           ::= ( ID )
subcommands ::= subcommand
            ::= subcommands ; subcommand

The syntax begins with an ID token that gives the name of the procedure to be parsed. For command names that contain multiple words, a STRING token may be used instead, e.g. ‘"FILE HANDLE"’. Optionally, an ID in parentheses specifies a prefix used for all file-scope identifiers declared by the emitted code.

The rest of the syntax consists of subcommands separated by semicolons (‘;’) and terminated with a full stop (‘.’).
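The top-level structure described above (command name, optional prefix, colon, then subcommands separated by semicolons and ended by a full stop) can be sketched as follows. The sketch is in Python and is not q2c's implementation; the helper names `parse_header` and `split_subcommands` are hypothetical, and tokens are assumed to arrive as `(kind, value)` tuples with kinds `"ID"`, `"STRING"`, and `"CHAR"`.

```python
def parse_header(tokens):
    """Parse 'command-name opt-prefix :' and return (name, prefix, rest)."""

    def at(i):
        # Return a sentinel past the end so comparisons fail cleanly.
        return tokens[i] if i < len(tokens) else (None, None)

    kind, name = at(0)
    if kind not in ("ID", "STRING"):
        raise SyntaxError("command name must be an ID or a STRING")
    pos = 1
    prefix = None
    if at(pos) == ("CHAR", "("):
        if at(pos + 1)[0] != "ID" or at(pos + 2) != ("CHAR", ")"):
            raise SyntaxError("expected ( ID ) after the command name")
        prefix = at(pos + 1)[1]
        pos += 3
    if at(pos) != ("CHAR", ":"):
        raise SyntaxError("expected ':' before the subcommands")
    return name, prefix, tokens[pos + 1:]


def split_subcommands(tokens):
    """Split the subcommand tokens on ';' and require a terminating '.'."""
    if not tokens or tokens[-1] != ("CHAR", "."):
        raise SyntaxError("grammar rules must end with '.'")
    groups, current = [], []
    for tok in tokens[:-1]:
        if tok == ("CHAR", ";"):
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)
    if any(not group for group in groups):
        raise SyntaxError("empty subcommand")
    return groups
```

Calling `parse_header` first and passing its third result to `split_subcommands` gives one token list per subcommand, ready for further parsing.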

subcommand ::= default-opt arity-opt ID sbc-defn
default-opt ::=
            ::= *
arity-opt ::=
          ::= +
          ::= ^
sbc-defn ::= opt-prefix = specifiers
         ::= [ ID ] = array-sbc
         ::= opt-prefix = sbc-special-form

A subcommand that begins with an asterisk (‘*’) is the default subcommand. The keyword used for the default subcommand can be omitted in the PSPP syntax file.

A plus sign (‘+’) indicates that a subcommand can appear more than once. A caret (‘^’) indicates that a subcommand must appear exactly once. A subcommand marked with neither character may appear once or not at all, but not more than once.

The subcommand name appears after the leading option characters.
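The leading option characters and the occurrence rules they imply can be sketched like this. It is an illustration in Python rather than q2c's code; the names `parse_subcommand_head` and `occurrences_ok` are hypothetical, tokens are assumed to be plain strings, and the sketch assumes that a ‘+’ subcommand may also be omitted entirely, which the text above does not state explicitly.

```python
import re

ID_RE = re.compile(r"[A-Za-z0-9_]+")


def parse_subcommand_head(tokens):
    """Return (is_default, arity, name, rest) for one subcommand's tokens."""
    pos = 0
    is_default = pos < len(tokens) and tokens[pos] == "*"
    if is_default:
        pos += 1
    arity = ""
    if pos < len(tokens) and tokens[pos] in ("+", "^"):
        arity = tokens[pos]
        pos += 1
    if pos >= len(tokens) or not ID_RE.fullmatch(tokens[pos]):
        raise SyntaxError("expected the subcommand name")
    return is_default, arity, tokens[pos], tokens[pos + 1:]


def occurrences_ok(arity, count):
    """Check how often a subcommand appeared against its arity marker."""
    if arity == "+":
        return count >= 0   # assumption: zero occurrences are acceptable too
    if arity == "^":
        return count == 1   # must appear exactly once
    return count <= 1       # unmarked: once or not at all
```

With these helpers, `occurrences_ok("^", 0)` is false because a caret subcommand is mandatory, while `occurrences_ok("", 0)` is true.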

There are three forms of subcommands. The first and most common form simply gives an equals sign (‘=’) and a list of specifiers, which can each be set to a single setting. The second form declares an array, which is a set of flags that can be individually turned on by the user. There are also several special forms that do not take a list of specifiers.

Arrays require an additional ID argument. This is used as a prefix, prepended to the variable names constructed from the specifiers. The other forms also allow an optional prefix to be specified.

array-sbc ::= alternatives
          ::= array-sbc , alternatives
alternatives ::= ID
             ::= alternatives | ID

An array subcommand is a set of Boolean values that can independently be turned on by the user, listed separated by commas (‘,’). If a value has more than one name then these names are separated by pipes (‘|’).
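The array-sbc production above can be read as "comma-separated flags, each with one or more pipe-separated names". A minimal Python sketch of that reading follows; the function name `parse_array_sbc` and the flag names in the usage note are hypothetical, and tokens are assumed to be plain strings.

```python
import re

ID_RE = re.compile(r"[A-Za-z0-9_]+")


def parse_array_sbc(tokens):
    """Return one list of alternative names per Boolean flag."""
    flags = [[]]
    expect_name = True
    for tok in tokens:
        if expect_name:
            if not ID_RE.fullmatch(tok):
                raise SyntaxError("expected a name, got %r" % tok)
            flags[-1].append(tok)
            expect_name = False
        elif tok == "|":
            expect_name = True      # another name for the same flag
        elif tok == ",":
            flags.append([])        # start the next flag
            expect_name = True
        else:
            raise SyntaxError("expected ',' or '|', got %r" % tok)
    if expect_name:
        raise SyntaxError("array ends where a name was expected")
    return flags
```

For example, the token list `["DEFAULT", "|", "DFLT", ",", "ALL"]` describes two flags, the first of which has two names.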

specifiers ::= specifier
           ::= specifiers , specifier
specifier ::= opt-id : settings
opt-id ::=
       ::= ID

Ordinary subcommands (other than arrays and special forms) require a list of specifiers. Each specifier has an optional name and a list of settings. If the name is given then a correspondingly named variable will be used to store the user’s choice of setting. If no name is given then there is no way to tell which setting the user picked; in this case the settings should probably have values attached.

settings ::= setting
         ::= settings / setting
setting ::= setting-options ID setting-value
setting-options ::=
                ::= *
                ::= !
                ::= * !

Individual settings are separated by forward slashes (‘/’). Each setting can be as little as an ID token, but options and values can optionally be included. The ‘*’ option means that, for this setting, the ID can be omitted. The ‘!’ option means that this setting is the default for its specifier.
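The settings production can be sketched as a split on ‘/’ followed by recognition of the ‘*’ and ‘!’ options, which the grammar allows only in that order. The following Python is an illustration, not q2c's implementation; `parse_setting`, `parse_settings`, and the dictionary keys are hypothetical names, and tokens are assumed to be plain strings. Any tokens after the ID are returned unparsed as the setting's value declaration.

```python
import re

ID_RE = re.compile(r"[A-Za-z0-9_]+")


def parse_setting(tokens):
    """Parse 'setting-options ID setting-value' for a single setting."""
    pos = 0
    id_optional = False
    is_default = False
    if pos < len(tokens) and tokens[pos] == "*":
        id_optional = True      # the ID can be omitted for this setting
        pos += 1
    if pos < len(tokens) and tokens[pos] == "!":
        is_default = True       # this setting is the default for its specifier
        pos += 1
    if pos >= len(tokens) or not ID_RE.fullmatch(tokens[pos]):
        raise SyntaxError("expected the setting's ID")
    return {
        "name": tokens[pos],
        "id_optional": id_optional,
        "is_default": is_default,
        "value_tokens": tokens[pos + 1:],
    }


def parse_settings(tokens):
    """Split on '/' and parse each setting in turn."""
    groups, current = [], []
    for tok in tokens:
        if tok == "/":
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)
    return [parse_setting(group) for group in groups]
```

Because the options are checked in a fixed order, a setting written as ‘! *’ is rejected, matching the four alternatives listed for setting-options.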

setting-value ::=
              ::= ( setting-value-2 )
              ::= setting-value-2
setting-value-2 ::= setting-value-options setting-value-type : ID
setting-value-options ::=
                      ::= *
setting-value-type ::= N
                   ::= D
                   ::= S

Settings may have values. If the value must be enclosed in parentheses in the syntax being parsed, then enclose the value declaration in parentheses. Declare the value type as ‘n’, ‘d’, or ‘s’ (written ‘N’, ‘D’, and ‘S’ in the grammar above; identifiers are not case-sensitive) for integer, floating-point, or string type, respectively. The given ID is used to construct a variable name. If option ‘*’ is given, then the value is optional; otherwise it must be specified whenever the corresponding setting is specified.
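A value declaration as described above can be decoded with a few checks. The sketch below is in Python and is not taken from q2c; `parse_setting_value` and the keys of the returned dictionary are hypothetical, and tokens are assumed to be plain strings.

```python
import re

ID_RE = re.compile(r"[A-Za-z0-9_]+")
VALUE_TYPES = {"N": "integer", "D": "floating-point", "S": "string"}


def parse_setting_value(tokens):
    """Parse a setting-value; return None when the setting has no value."""
    if not tokens:
        return None
    parenthesized = False
    if tokens[0] == "(":
        if tokens[-1] != ")":
            raise SyntaxError("missing ')' in value declaration")
        tokens = tokens[1:-1]
        parenthesized = True
    optional = False
    if tokens and tokens[0] == "*":
        optional = True         # the value may be left out
        tokens = tokens[1:]
    if (len(tokens) != 3
            or tokens[0].upper() not in VALUE_TYPES
            or tokens[1] != ":"
            or not ID_RE.fullmatch(tokens[2])):
        raise SyntaxError("expected: type ':' ID")
    return {
        "parenthesized": parenthesized,
        "optional": optional,
        "type": VALUE_TYPES[tokens[0].upper()],
        "name": tokens[2],
    }
```

For example, the tokens `( * n : count )` describe an optional integer value that must be written in parentheses; `count` is a made-up name used to build the variable name.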

sbc-special-form ::= VAR
                 ::= VARLIST varlist-options
                 ::= INTEGER opt-list
                 ::= DOUBLE opt-list
                 ::= PINT
                 ::= STRING (the literal word STRING)
                 ::= CUSTOM
varlist-options ::=
                ::= ( STRING )
opt-list ::=
         ::= LIST

The special forms are of the following types:

VAR

A single variable name.

VARLIST

A list of variables. If given, the string can be used to provide PV_* options to the call to parse_variables.

INTEGER

A single integer value.

INTEGER LIST

A list of integers separated by spaces or commas.

DOUBLE

A single floating-point value.

DOUBLE LIST

A list of floating-point values.

PINT

A single positive integer value.

STRING

A string value.

CUSTOM

A custom function is used to parse this subcommand. The function must have prototype int custom_name (void). It should return 0 on failure (when it has already issued an appropriate diagnostic), 1 on success, or 2 if it fails and the calling function should issue a syntax error on behalf of the custom handler.
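The three return codes of a CUSTOM handler imply a small dispatch in the calling code. The real handler is a C function; the following merely models the convention in Python so that the three cases are easy to see. The names `run_custom`, `handler`, and `report_syntax_error` are hypothetical and do not come from q2c or PSPP.

```python
def run_custom(handler, report_syntax_error):
    """Call a custom handler and act on its status; return True on success."""
    status = handler()
    if status == 1:
        return True             # success
    if status == 0:
        return False            # failure; the handler already issued a diagnostic
    if status == 2:
        report_syntax_error()   # failure; the caller reports on the handler's behalf
        return False
    raise ValueError("unexpected status %r from custom handler" % (status,))
```

The point of distinguishing 0 from 2 is to avoid duplicate messages: the caller reports a syntax error only when the handler has not already reported one.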

