CIAO: DATA STREAMING FORMAT FOR COSA

Author:	       Mikael Patel
Date:	       2013-05-15
Rev.:	       1.0.5

This is a desciption of the Cosa data streaming format; Ciao. It
is basically a tagged data format that supports the C/C++ language
data types, descriptors (struct) and sequences of these. Pointers are
not supported in the streaming format.

Each data object is tagged with a prefix byte. It contains a four bit
type tag and a four bit attribute. Additional prefix attributes are
used for sequence length and user defined data types. 

The prefix information (type tag and length) is stored in MSB first
(big-endian) while data is stored according to endian setting in the
Ciao header. On the Arduino/AVR/Amtel processors data is stored in LSB
first (little-endian). Small embedded devices such as the Arduino will
stream data in native order to avoid byte swapping.

Typically usage of this format are; data streaming between systems
(Arduino or others), remote procedure call (as data type descriptor
and values may be interpreted as closures), tracing, file format,
etc. 

1. TYPE TAGS

3210 BIT POSITION, UNSIGNED INTEGER NUMBERS
0000 uint8_t, unsigned char
0001 uint16_t, unsigned short 
0010 uint32_t, unsigned long
0011 uint64_t, unsigned long long/*

3210 BIT POSITION, USER DEFINED TYPES
0100 type_t, system data type descriptor, 8-bit identity
0101 type_t, user data type descriptor, 16-bit identity
0110 type_t, system data type data, 8-bit identity
0111 type_t, user data type data, 16-bit identity

3210 BIT POSITION, SIGNED INTEGER NUMBERS
1000 int8_t, signed char
1001 int16_t, signed short
1010 int32_t, signed long
1011 int64_t, signed long long/*

3210 BIT POSITION, FLOATING POINT NUMBERS (IEEE 754)
1100 binary16_t, half-precision floating-point format/*
1101 binary32_t, single, float
1110 binary64_t, double/*
1111 binary80_t, long double/*

Bit(3) is set for signed data types. Cleared bit(2) indicates for
integer and set for floating point numbers. For integers bit(1..0) is
log2 number of bytes, values 0..3 for 1..8 bytes which is 8..64 bits.

A maximum of 64 K user defined data types are allowed. See section 5
for more details on the system data types.

*/ are optional in small embedded systems.

2. DATA SEQUENCE

Data values and sequences are prefixed with the type nibble and
primary attribute nibble and possible secondary attributes. 

TYPE    ATTRIBUTES           	 DESCRIPTION
dddd	0000      	 V*0     null terminated sequence 
dddd    nnnn             V*      data value(N=0..7)
dddd    1000   N         V*      data value(N=8..255)
dddd    1001   N:N       V*      data value(N=256..64K)
dddd    nnnn   		 	 reserved(N=10..15)

The attribute is the length of the data sequence [0..64K] using 4, 12
or 20 bits. This is symmetrical for all types except user data type
descriptor. 

Null terminated sequences are of most interest for representation of C
strings (i.e. uint8_t[N]).

3. DATA TYPE DESCRIPTOR

The type tag may be extended with user defined types. These are given
an 8 or 16-bit unsigned integer number allowing at most 256 + 64K
additional data types. Sequences of user defined data types are
included.  

Note that the elementary data tags are reinterpreted for a data type
descriptor. Instead of data values the type prefix is followed by a
possible name string (null terminated). Union data types are not
supported in this release. 

TYPE    ATTRIBUTES           	 DESCRIPTION
0100    0000        T    V*0     start descriptor(type, name), 8-bit id
0100	nnnn	    	 	 reserved (N=1..14)
0100	1111	    	 	 end of descriptor

TYPE    ATTRIBUTES           	 DESCRIPTION
0101    0000        T:T  V*0     start descriptor(type, name), 16-bit id
0100	nnnn	    	 	 reserved (N=1..14)
0101	1111			 end of descriptor

TYPE    ATTRIBUTES           	 DESCRIPTION
dddd	0000      	 V*0     data<D> null terminated sequence(name)
dddd    nnnn             V*0     data<D> type(N=1..7, name)
dddd    1000   N         V*0     data<D> type(N=8..255, name)
dddd    1001   N:N       V*0     data<D> type(N=256..64K, name)
dddd    nnnn   		 	 reserved (N=10..15)

TYPE    ATTRIBUTES           	 DESCRIPTION
0110	0000        T  	 V*0     data<T> null terminated sequence(name)
0110    nnnn        T    V*0     data<T> type(N=1..7, name), 8-bit id
0110    1000   N    T    V*0     data<T> type(N=8..255, name)
0110    1001   N:N  T    V*0     data<T> type(N=256..64K, name)
0110    nnnn		 	 reserved (N=10..15)

TYPE    ATTRIBUTES           	 DESCRIPTION
0111	0000        T:T	 V*0     data<T> null terminated sequence(name)
0111    nnnn        T:T  V*0     data<T> type(N=1..7, name), 16-bit id
0111    1000   N    T:T  V*0     data<T> type(N=8..255, name)
0111    1001   N:N  T:T  V*0     data<T> type(N=256..64K, name)
0111    nnnn		 	 reserved (N=10..15)

A user defined data type may be viewed as a schema and may be used for
both data tables and closures. The meta information allows human
readable printout of data streams.

4. TYPED DATA SEQUENCE

TYPE    ATTRIBUTES           	 DESCRIPTION
0110	0000        T	 V*      data<T> value null terminated sequence(name)
0110    nnnn        T    V*      data<T> value(N=1..7, name), 8-bit id
0110    1000   N    T    V*      data<T> value(N=8..255, name)
0110    1001   N:N  T    V*      data<T> value(N=256..64K, name)
0110    nnnn		 	 reserved (N=10..15)

TYPE    ATTRIBUTES           	 DESCRIPTION
0111	0000        T:T	 V*      data<T> value null terminated sequence(name)
0111    nnnn        T:T  V*      data<T> value(N=1..7, name), 16-bit id
0111    1000   N    T:T  V*      data<T> value(N=8..255, name)
0111    1001   N:N  T:T  V*      data<T> value(N=256..64K, name)
0111    nnnn		 	 reserved (N=10..15)

A typed data sequence is prefixed with the type identity.

5. SYSTEM AND COSA FAI DATA TYPES

The short 8-bit data type identities are reserved for system data
types. The below identities are currently defined. Note that the
identities 0..127 are used for up-stream (status/response) and
128..255 for down-stream (control/requests). See Ciao.h and Fai.h
for more details on the struct member description. 

ID     NAME		     DESCRIPTION
0x00   HEADER_ID	     Ciao stream header (magic/version/endian)
0x10   COSA_FAI_ID	     Cosa fai types
0x10   ANALOG_PIN_ID	     Analog pin value
0x11   DIGITAL_PIN_ID	     Digital pin value 
0x12   DIGITAL_PINS_ID	     Digital pin values 
0x13   EVENT_ID	     	     Event trace
0x20   SAMPLE_REQUEST_ID     Digital/analog pin set sample request
0x21   SET_MODE_ID     	     Set digital/analog pin mode

The descriptor data structure is defined in Cosa/Ciao.h and Cosa/Fai.h. 
The system data type descriptors are found in Cosa/Ciao.cpp and 
Cosa/Fai-directory.

6. EXAMPLES

EXPRESSION			ENCODING
uint8_t x = 15;			[0x01] [0x0f]
int32_t y = -2;			[0xa1] [0xfe][0xff][0xff][0xff]
char* s = "Ciao!"		[0x00] ['C']['i']['a']['o']['!'][0]
struct Point {			[0x50] [0x10][0x42] ['P']['o']['i']['n']['t'][0]
  int16_t x;   			[0x91] ['x'][0]
  int16_t y;			[0x91] ['y'][0]
};	  			[0x5f]
Point p = { 1, -1 };		[0x71] [0x10][0x42] [0x01][0x00][0xff][0xff]
Point q[100];  	  		[0x78] [100] [0x10][0x42] [..][..]...[..][..]

In the example above the struct Point is given the identity (0x1042).

7. REFERENCES

[1] Sun Microsystems (1987). "XDR: External Data Representation
    Standard". RFC 1014. Network Working Group. Retrieved July 11,
    2011. http://tools.ietf.org/html/rfc1014
[2] Boost Serialization, 
    http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/index.html
[3] Java Stream Format,
    http://docs.oracle.com/javase/7/docs/platform/serialization/spec/protocol.html#10258
[4] Arduino/Firmata, http://www.firmata.org



