Data files store the input/output vector pairs that make up the data set used for classification or approximation. A data file is written in one of several formats, of which two are currently supported: SEQUENCE and STATIC. The STATIC format is for simple input and output vector pairs containing no temporal information. The SEQUENCE format is for time sequences, in which each group of time-ordered input vectors has a single output vector.
Several rules are common to both types. All vector component values
are read as floating point numbers and may not contain extraneous
characters. However, everything after a % character on a line is
treated as a comment. By convention each vector pair is placed on
its own line, but this is not required. The result of reading an
incorrectly formatted file is undefined: it may or may not be parsed
correctly, so do not depend on any particular behavior for errors in
the file.
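
The common rules above can be sketched in a few lines of Python. This
is a minimal illustration, not the parser used by the software. The
names iter_values and group_pairs are hypothetical, and the sketch
assumes that the header has already been removed, that the input and
output dimensions are known, and that each pair is written as its
input components followed by its output components.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_values(lines: Iterable[str]) -> Iterator[float]:
    """Yield every numeric value in order, ignoring % comments and line breaks."""
    for line in lines:
        # Everything from the first % to the end of the line is a comment.
        data = line.split("%", 1)[0]
        for token in data.split():
            # float() raises ValueError for a token that is not a valid number.
            yield float(token)


def group_pairs(
    values: Iterable[float], n_in: int, n_out: int
) -> List[Tuple[List[float], List[float]]]:
    """Group a flat stream of values into (input, output) vector pairs.

    Assumption: each pair is n_in input components followed by n_out
    output components. The real layout depends on the file type.
    """
    flat = list(values)
    width = n_in + n_out
    if len(flat) % width != 0:
        raise ValueError("value count is not a whole number of vector pairs")
    pairs = []
    for start in range(0, len(flat), width):
        chunk = flat[start:start + width]
        pairs.append((chunk[:n_in], chunk[n_in:]))
    return pairs
```

Because values are read as one stream, a pair that is split across two
lines is grouped the same way as a pair written on a single line.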
Each file begins with a header describing the format parameters of
that file. Parameters are of the form
PARAMETERNAME:PARAMETERVALUE. The order and capitalization
of the parameters are fixed. After the parameters, the vectors
follow, according to the format of the specific file type.
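
As an illustration of the PARAMETERNAME:PARAMETERVALUE form, the
following Python sketch separates a header from the vector data. It
rests on assumptions that this section does not state: one parameter
per line, % comments allowed in the header, and a header that ends at
the first non-blank line without a colon. The function name
split_header and the parameter names in the usage example are
hypothetical.

```python
from typing import Dict, List, Tuple


def split_header(lines: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (parameters, remaining lines) for a data file.

    Parameter names are kept exactly as written, because capitalization
    is fixed. The dict preserves the order in which parameters appear.
    """
    params: Dict[str, str] = {}
    for index, raw in enumerate(lines):
        # Drop any % comment, then surrounding whitespace.
        line = raw.split("%", 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        name, sep, value = line.partition(":")
        if not sep:
            # Assumption: the first line without a colon starts the vectors.
            return params, lines[index:]
        params[name.strip()] = value.strip()
    return params, []
```

For example, split_header(["TYPE:STATIC", "INPUTS:2", "0.1 0.2 1"])
returns the two made-up parameters as strings, together with the one
remaining data line.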