This file describes the EEP 3.1 compressed file format.

MPI/ANT (eeprobe@ant-software.nl)
Max-Planck-Institute of Cognitive Neuroscience, Leipzig

$Id$


1. Overall Structure
--------------------

The cnt file format conforms to the RIFF (Resource Interchange File Format)
specification. The (unregistered) form type is .

RIFF means that the file is composed of data blocks (chunks) which are
accessible via an ID (a four-character code, FOURCC) in a hierarchy (chunk
tree). Data are stored low-byte first ("Intel-style") unless mentioned
otherwise. Refer to a RIFF documentation for details; I have used
Josef Poepsel, "Multimediale Klippen", c't 11/94, p. 327 ff., Heinz Heise
Verlag.


2. Chunk Tree
-------------

  archived NeuroScan header, optional
  archived REFA configuration, optional
  ASCII header (labels, scalings ...)
  LIST  compressed data
    channel sequence
    compressed data
    epoch offsets in data chunk
  session events, optional

It is safe to add more chunks at the top level. Existing EEP modules copy
unknown chunks from input files to output files. There is no guaranteed
top-level chunk sequence; one has to look for the interesting chunks.


3. Chunk Contents
-----------------

3.1 Just a copy of the binary NeuroScan header
    (only present if one was available at file creation time).

3.2 Just a copy of the REFA acquisition configuration
    (only present if the file was generated by refa2cnt).

3.3 Global file information in plain ASCII text. See the example:

  [Sampling Rate]
  249.9999881256
  [Samples]
  36350
  [Channels]
  4
  [Basic Channel Data]
  ;label calibration       factor
  EOGV   1.7187500000e+00  4.8828125000e-02  uV
  EOGH   1.7187500000e+00  4.8828125000e-02  uV
  E1     1.7187500000e+00  4.8828125000e-02  uV
  E2     1.7187500000e+00  4.8828125000e-02  uV
  [History]
  ns2riff 3.7 (OSF1 V4.0 alpha) Wed Nov 26 12:27:41 1997
  EOH

The first column stores the unique(!) channel label. It can consist of at
most 10 alphanumeric characters. Channel labels are case insensitive; it is
an error to have both an "a1" and an "A1" channel in one file.
Columns 2 and 3 contain two scaling factors for each channel. Most
amplifiers use separate calibration/amplification/definition factors for
real-world conversion, and I did not want to merge them. Column 4 is for a
unit string of at most 10 characters. Each 16-bit sample value must be
multiplied by both factors to convert it to a value in the listed unit.
Note that the Latin "u" is used for "micro" instead of the Greek "mu".
The [History] block is optional. Each entry is one free-form line of text.
The final EOH means "End Of History" and is required if a [History] block
is present.
The comment line ";label calibration factor" is required.

3.4 A list of sample/code pairs; the number of pairs can be calculated as
    chunksize / 12.

  bytes: |     4     |    8    |     4     |    8    | ...
  value: | sample[0] | code[0] | sample[1] | code[1] | ...

  sample   time point of the event as a 0-based sample index
  code     alphanumeric event code, 0-terminated if shorter than
           8 characters
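As an illustration of this layout, here is a minimal C sketch of reading
such an event table from an already-loaded chunk body. The type and
function names (eep_event, read_event_chunk) are hypothetical and not part
of the format or the EEP sources; error handling is omitted.

  #include <stdint.h>
  #include <string.h>

  /* hypothetical in-memory representation of one event */
  typedef struct {
      uint32_t sample;    /* 0-based sample index                 */
      char     code[9];   /* up to 8 characters, plus forced '\0' */
  } eep_event;

  /* parse a chunk body of chunk_size bytes into events[];
     returns the number of events, i.e. chunk_size / 12 */
  static size_t read_event_chunk(const uint8_t *body, size_t chunk_size,
                                 eep_event *events, size_t max_events)
  {
      size_t n = chunk_size / 12, i;
      if (n > max_events) n = max_events;
      for (i = 0; i < n; i++) {
          const uint8_t *p = body + i * 12;
          /* the sample index is stored low-byte first ("Intel-style") */
          events[i].sample = (uint32_t)p[0]         | ((uint32_t)p[1] << 8) |
                             ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
          /* the code is 0-terminated only if shorter than 8 characters,
             so force a terminator after copying all 8 bytes */
          memcpy(events[i].code, p + 4, 8);
          events[i].code[8] = '\0';
      }
      return n;
  }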
3.5 A LIST chunk which contains the compressed data matrix and the
    information needed for decompression, in three subchunks ("raw3" is the
    name I gave the compression algorithm).

3.5.1 The data channels are rearranged to improve the compression ratio.
      The chunk stores the channel indices in the original record - one
      signed 16-bit, 0-based value for each channel in the file.

3.5.2 The huge compressed data chunk. The record is stored channelwise in
      epochs. Each of these signal pieces contains the data of a few
      hundred sample points (typically 1 second). The actual epoch_length
      is stored in the epoch offsets chunk (see 3.5.3 below).

  |chan_0_epoch_0|chan_1_epoch_0| ... |chan_n-1_epoch_0|chan_0_epoch_1| ...

Each of the blocks above stores residuals (the signal prediction errors of
the compression algorithm) in a compressed form, together with the rules
how to read the residuals and how to rebuild the original data.
Block starts are aligned to byte boundaries. The block data themselves are
counted in bits, and there may be unused bits in the last byte of a block.
All values are stored with the MSB first.

The general form of a block is:

  bits:  |   4    |   ?    |  ?   |
  value: | method | header | data |

The actual header and data layout depends on the prediction method and the
required bit width (16 or 32). Possible methods are:

   0   no residuals, original values stored
   1   residuals from the first difference
   2   residuals from the second difference
   3   residuals from the difference between the first difference and the
       first difference of the neighbor channel
   8
   9
  10   same as 0, 1, 2, 3 but for 32-bit data
  11

For methods 1, 2 and 3 the block layout is (n is the number of samples in
the current epoch):

  bits:  |   4    |   4   |    4     |  16  | nbits or (nbits + nexcbits) |
  value: | method | nbits | nexcbits | y[0] | r[1]       ..        r[n-1] |

For methods 9, 10 and 11:

  bits:  |   4    |   6   |    6     |  32  | nbits or (nbits + nexcbits) |
  value: | method | nbits | nexcbits | y[0] | r[1]       ..        r[n-1] |

  nbits      number of bits needed to store a "regular" residual
  nexcbits   number of bits needed to store an "exceptional" residual
             (0 means 16 in methods 1, 2 and 3)
  y[0]       first sample value
  r[1] .. r[n-1]
             residuals; read nbits bits for each residual value; if this
             value equals -(2^(nbits-1)), read the next nexcbits bits to
             get the residual

The signal values y[1] .. y[n-1] are computed from the residuals as follows
(a decoding sketch for the 16-bit methods is given after 3.5.3):

  method 1 or 9:
    y[i] = y[i-1] + r[i]                              i = 1 .. n-1

  method 2 or 10:
    y[1] = y[0] + r[1]
    y[i] = y[i-1] + (y[i-1] - y[i-2]) + r[i]          i = 2 .. n-1

  method 3 or 11:
    y[i] = y[i-1] + (Y[i] - Y[i-1]) + r[i]            i = 1 .. n-1

    where Y denotes the previously decompressed channel, or a vector filled
    with zeroes if y is the first channel.

Method 0 is simple (there is no compression):

  bits:  |   4    |   4   |  16   ..   16  |
  value: | method | dummy | y[0] .. y[n-1] |

  method           always 0
  dummy            unused, undefined
  y[0] .. y[n-1]   signal values

Method 8 is equivalent, with 32-bit values:

  bits:  |   4    |   4   |  32   ..   32  |
  value: | method | dummy | y[0] .. y[n-1] |

3.5.3 The total number of epochs (ne) can be calculated as
      (chunksize - 4) / 4.

  bytes: |      4       |       4        ..         4         |
  value: | epoch_length | epoch_start[0] .. epoch_start[ne-1] |

  epoch_length   length of compressed epochs in samples; the last epoch can
                 be shorter (Samples % epoch_length)
  epoch_start    start of the epoch as a byte index into the data chunk
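As an illustration of the block decoding rules in 3.5.2, here is a minimal
C sketch that reads one 16-bit block (method 1, 2 or 3) and rebuilds the
signal values. It assumes the caller already knows the number of samples n
in the current epoch and, for method 3, has the previously decompressed
neighbor channel Y available (all zeroes for the first channel). The helper
names (bitreader, get_bits, sign_extend, decode_block16) are hypothetical
and not taken from the EEP sources; methods 0/8 and the 32-bit methods
9-11 (6-bit nbits/nexcbits fields, 32-bit y[0]) are not handled here.

  #include <stddef.h>
  #include <stdint.h>

  /* MSB-first bit reader over the block bytes */
  typedef struct { const uint8_t *buf; size_t bitpos; } bitreader;

  static uint32_t get_bits(bitreader *br, unsigned nbits)
  {
      uint32_t v = 0;
      while (nbits--) {
          uint8_t bit = (br->buf[br->bitpos >> 3] >> (7 - (br->bitpos & 7))) & 1;
          v = (v << 1) | bit;
          br->bitpos++;
      }
      return v;
  }

  /* interpret an nbits-wide value as two's complement */
  static int32_t sign_extend(uint32_t v, unsigned nbits)
  {
      uint32_t m = 1u << (nbits - 1);
      return (int32_t)((v ^ m) - m);
  }

  /* decode one 16-bit block (method 1, 2 or 3) into y[0..n-1] */
  static void decode_block16(bitreader *br, int32_t *y, const int32_t *Y,
                             size_t n)
  {
      unsigned method   = get_bits(br, 4);
      unsigned nbits    = get_bits(br, 4);
      unsigned nexcbits = get_bits(br, 4);
      size_t i;

      if (nexcbits == 0) nexcbits = 16;          /* 0 means 16 for methods 1..3 */
      y[0] = sign_extend(get_bits(br, 16), 16);  /* first sample, stored as-is  */

      for (i = 1; i < n; i++) {
          int32_t r = sign_extend(get_bits(br, nbits), nbits);
          if (r == -(int32_t)(1u << (nbits - 1)))    /* "exceptional" residual   */
              r = sign_extend(get_bits(br, nexcbits), nexcbits);

          switch (method) {
          case 1: y[i] = y[i-1] + r; break;                         /* 1st diff */
          case 2: y[i] = (i == 1) ? y[0] + r                        /* 2nd diff */
                                  : y[i-1] + (y[i-1] - y[i-2]) + r; break;
          case 3: y[i] = y[i-1] + (Y[i] - Y[i-1]) + r; break;  /* neighbor diff */
          }
      }
  }

A full reader would loop over the epoch_start offsets from 3.5.3, decode
the channels of each epoch in the order given by the channel sequence
chunk, and finally apply the two scaling factors from the ASCII header to
obtain values in physical units.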