Document	Status
Status	Draft
Version	0.4
Date	2017-07-05

Abstract: This document defines a file format for UA Information Models which replaces the XML UA-Nodeset Schema from the OPC Foundation. The binary file format requires only a fraction of the memory that a comparable XML file needs, and it is easy and fast to load, even on embedded systems, without the need for an XML parser. This document also defines a new binary encoding scheme which is more efficient in terms of memory usage than the default UA Binary Encoding. This encoding would also be a good alternative for a new binary protocol for embedded systems. For now we use it only to store UA information models in binary files.

Terms, definitions and conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

UA Compact Binary Encoding

General Notes

The UA Compact Binary Encoding does not support a distinction between Null-Elements and Empty-Elements as it is done in the default UA Binary Encoding for String, ByteString, Arrays, etc.

Rationale:

The UA specification already allows decoders to map both Null and Empty elements to the same value. E.g. a Null-String and an Empty-String can both be mapped to a NULL pointer in C when the data gets decoded.
There is no advantage in such distinctions on protocol level. It's only a difference that occurs in some programming languages.
The usage of signed integers for length fields is considered bad practice, because it often results in programming errors.
This encoding uses unsigned integers for all length fields, where the error scenario of negative values can be completely avoided, thus the implementation is easier and more secure.
Also variable length encoding of unsigned integers is more efficient.

Fixed Size Encodings

When fixed size integer values are encoded, this document uses the corresponding C type as the encoding name. E.g. uint32_t in an "_Encoding Type_" column means the value is encoded as a 4-byte integer value. The least significant byte is always encoded first (little endian).

Boolean Encoding

According to the C Standard the size of the bool datatype is implementation-defined. However this encoding defines a bool to be encoded as 1 byte. Furthermore we limit the number of valid values to 0 and 1. All other values are invalid and MUST result in a decoding error.

Boolean value	Description
0	false
1	true
2-255	invalid

When the type bool is used in an "_Encoding Type_" column this refers to this Boolean encoding.
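The strict value check described above can be sketched in a few lines of C. The function name `bool_decode` and its return convention (0 on success, -1 on error) are assumptions made for this illustration and are not defined by this document.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decodes one Boolean byte. Returns 0 on success and -1 for the invalid
 * values 2-255, which MUST result in a decoding error.
 * Name and return convention are hypothetical. */
int bool_decode(uint8_t byte, bool *out)
{
    if (byte > 1) return -1;
    *out = (byte == 1);
    return 0;
}
```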

VarInt Encoding

A VarInt is an unsigned integer with variable length encoding. Each byte carries 7 bits of the value. The MSB of each byte is used to indicate that there is another byte following. The last byte has an MSB of zero. The 7-bit groups are encoded with the least significant group first.

This enables the following numeric ranges:

Bytes	Range	Example
1	0 – 127 (0 – 2^7-1)	17 = 00010001 (bin)
2	0 – 16.383 (0 – 2^14-1)	300 = 10101100 00000010 (bin)
3	0 – 2.097.151 (0 – 2^21-1)	1000000 = 11000000 10000100 00111101 (bin)
4	0 – 268.435.455 (0 – 2^28-1)
n	0 – 2^(n*7)-1

To store a UINT32_MAX in a VarInt 5 bytes are required, UINT64_MAX requires 10 bytes.
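A minimal sketch of a VarInt encoder and decoder, assuming a caller-provided buffer, could look as follows. The function names and the convention of returning 0 on malformed input are illustrative assumptions, not part of this document.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes value as VarInt into out (up to 10 bytes for a uint64_t).
 * Returns the number of bytes written. Hypothetical helper. */
size_t varint_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;

    while (value >= 0x80) {
        out[n++] = (uint8_t)((value & 0x7F) | 0x80); /* more bytes follow */
        value >>= 7;
    }
    out[n++] = (uint8_t)value; /* last byte, MSB is zero */
    return n;
}

/* Decodes a VarInt from in (len bytes available) into *value.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or does not fit into a uint64_t. Hypothetical helper. */
size_t varint_decode(const uint8_t *in, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    size_t n;

    for (n = 0; n < len && n < 10; n++) {
        /* the 10th byte may only carry the topmost bit of a uint64_t */
        if (n == 9 && in[n] > 0x01) return 0;
        result |= (uint64_t)(in[n] & 0x7F) << (7 * n);
        if ((in[n] & 0x80) == 0) {
            *value = result;
            return n + 1;
        }
    }
    return 0;
}
```

Because the length is unsigned, the decoder has no negative-length case to handle. The only error paths are truncated input and overflow.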

SVarInt Encoding

An SVarInt is a VarInt encoding for signed integer values. The normal VarInt encoding would not work efficiently, because negative numbers would be interpreted as large unsigned integers and so would always result in the longest encoding. For this reason we use the so-called ZigZag encoding, which maps signed integers to unsigned integers by using the distance from zero. This always results in small unsigned values for signed values of small magnitude, which can then be encoded in the same way as VarInts.

Signed Original	Unsigned value
0	0
-1	1
1	2
-2	3
2	4
2147483647	4294967294
-2147483648	4294967295

This mapping can be done by using

(n << 1) ^ (n >> 15)

for int16_t, and by using

(n << 1) ^ (n >> 31)

for int32_t, and by using

(n << 1) ^ (n >> 63)

for int64_t.

Note that the second shift, e.g. the (n >> 31) part for int32_t, is an arithmetic shift. In other words, the result of the shift is either a number that is all zero bits (if n is non-negative) or all one bits (if n is negative).
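In C, right-shifting a negative signed value is implementation-defined and left-shifting it is undefined, so a portable implementation may prefer to build the sign mask explicitly. The following sketch for int32_t assumes nothing beyond standard C. The function names are hypothetical.

```c
#include <stdint.h>

/* Maps a signed value to its ZigZag representation. The arithmetic shift
 * (n >> 31) is replaced by an explicit all-ones or all-zeros mask. */
uint32_t zigzag_encode32(int32_t n)
{
    uint32_t sign_mask = (n < 0) ? UINT32_MAX : 0;
    return ((uint32_t)n << 1) ^ sign_mask;
}

/* Inverse mapping: odd values are negative, even values are non-negative. */
int32_t zigzag_decode32(uint32_t z)
{
    if (z & 1)
        return -(int32_t)(z >> 1) - 1;
    return (int32_t)(z >> 1);
}
```

The resulting unsigned value is then written with the normal VarInt encoding.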

IEEE754 Encoding

In OPC UA we use 32bit and 64bit floating point types according to IEEE754 (ISO/IEC/IEEE 60559:2011), which is the same encoding as used by most CPUs and programming languages. Thus no conversion is needed, only the endianness must be normalized. When encoding IEEE754 values the least significant byte is encoded first.
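One way to normalize the byte order independently of the host CPU is to copy the float into an integer of the same size and emit it byte by byte. The sketch below assumes that `float` is an IEEE754 binary32 value and that floats and integers share the same byte order on the host, which holds on common platforms. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Writes a 32bit float with the least significant byte first. */
void encode_float_le(float value, uint8_t out[4])
{
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits); /* reinterpret without aliasing issues */
    out[0] = (uint8_t)(bits);
    out[1] = (uint8_t)(bits >> 8);
    out[2] = (uint8_t)(bits >> 16);
    out[3] = (uint8_t)(bits >> 24);
}

/* Reads a 32bit float that was stored with the least significant byte first. */
float decode_float_le(const uint8_t in[4])
{
    uint32_t bits = (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
                    ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
    float value;

    memcpy(&value, &bits, sizeof value);
    return value;
}
```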

String Encoding

A String is encoded as the string length followed by the string data. The length field is encoded as VarInt. The string content is a UTF-8 encoded Unicode string. The string data is encoded as a sequence of UTF-8 encoded bytes without a null terminator.

Example: "Hello World" (11) is encoded as

Hex: 0b  48  65  6c  6c  6f  20  57  6f  72  6c  64
     11 'H' 'e' 'l' 'l' 'o' ' ' 'W' 'o' 'r' 'l' 'd'

In OPC UA Strings are technically limited to a max. length of 2.147.483.647 bytes (Int32), but most SDKs will use much shorter encoding limits. So the max. length is implementation specific. This file format does not enforce any limitations.
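As an illustration, a String encoder only needs to write the VarInt length prefix and then copy the bytes. The sketch below is self-contained and therefore repeats a small VarInt helper. All function names and the bounds-check convention (return 0 if the output buffer is too small) are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical VarInt helper, returns the number of bytes written. */
static size_t varint_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;

    while (value >= 0x80) {
        out[n++] = (uint8_t)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Encodes len bytes of UTF-8 data as String into out (cap bytes available).
 * Returns the number of bytes written, or 0 if out is too small. */
size_t string_encode(const char *utf8, size_t len, uint8_t *out, size_t cap)
{
    uint8_t prefix[10];
    size_t n = varint_encode(len, prefix);

    if (n > cap || len > cap - n) return 0;
    memcpy(out, prefix, n);
    if (len > 0) memcpy(out + n, utf8, len); /* no null terminator is written */
    return n + len;
}
```

An empty string is encoded as the single byte 00, so a return value of 0 is unambiguous as an error indicator.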

ByteString Encoding

A ByteString is encoded in the same way as a String, but the data may contain arbitrary values, e.g. PNG image data.

Example: A PNG file with 4158 Byte.

$> hexdump -C -unified_128x128.png
00000000  89 50 4e 47 0d 0a 1a 0a  00 00 00 0d 49 48 44 52  |.PNG........IHDR|
...

VarInt(4158) results in 10111110 00100000 = 0xBE 0x20, thus the resulting encoded ByteString looks as follows.

hex: BE 20 89 50 4E 47 0D 0A 1A 0A  00 00 00 0D 49 48 44 52 ...

In OPC UA ByteStrings are technically limited to a max. length of 2.147.483.647 bytes (Int32), but most SDKs will use much shorter encoding limits. So the max. length is implementation specific. This file format does not enforce any limitations.

Guid Encoding

A Guid (Globally Unique Identifier) is defined using the following C struct:

struct ua_guid {
    uint32_t data1;    /**< Data1 field. */
    uint16_t data2;    /**< Data2 field. */
    uint16_t data3;    /**< Data3 field. */
    uint8_t  data4[8]; /**< Data4 array. */
};

Such a Guid is encoded using the fixed size types of the individual fields.

Pseudo-Code:

int ua_encode_guid(const struct ua_guid *guid, ua_buffer *buf)
{
    int ret;

    ret = ua_encode_uint32(&guid->data1, buf);
    if (ret != 0) goto error;

    ret = ua_encode_uint16(&guid->data2, buf);
    if (ret != 0) goto error;

    ret = ua_encode_uint16(&guid->data3, buf);
    if (ret != 0) goto error;

    ret = ua_encode_bytearray(&guid->data4[0], 8, buf);
    if (ret != 0) goto error;

    return 0;
error:
    return -1;
}

Example:

// 12345678-1122-3344-0001-020304050607
struct ua_guid demo = { 0x12345678, 0x1122, 0x3344, {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 }};
// encoded data stream: 78563412221144330001020304050607 (hex)

NodeId Encoding

The NodeId is encoded as a sequence of type, nsidx and identifier. Type and nsidx are packed into a single VarInt, which usually fits into one byte. The identifier is a choice of different types (union in C) which is encoded depending on the type field.

Encoding Type	Name	Description
VarInt	type_nsidx	The type (2 bits) and nsidx (16 bits) fields are packed into one uint32_t which is then encoded as VarInt.
		Therefore the nsidx is shifted 2 bits left and bitwise ORed with type. This way both can usually be encoded in one byte,
		as long as nsidx < 32. If nsidx is larger, a multi-byte sequence is created according to the VarInt encoding.
VarInt	numeric	Numeric value (only present when type==0)
String	string	String value (only present when type==1)
Guid	guid	Guid value (only present when type==2)
ByteString	opaque	Opaque value (only present when type==3)

Note: In OPC UA string and opaque NodeIds are limited to 4096 bytes. Implementations may choose even shorter encoding limit. This file format does not enforce any limitations.

Examples:

NodeId XML      type_nsidx  Identifier (Data is binary if not prefixed with 0x)
-----------------------------------------------------------------------------
nsidx=0;i=17:   00000000    00010001 (two byte nodeid)
nsidx=1;i=300:  00000100    10101100 00000010 (three byte nodeid)
nsidx=2;s=abc:  00001001    00000011 0x616263 (5 bytes)
nsidx=3;g=936DA01F-9ABD-4D9D-80C7-02AF85C822A8:
                00001110    0x1FA06D93BD9A9D4D80C702AF85C822A8 (17 bytes)
nsidx=4;b=YWJj: 00010011    00000011 0x616263 (5 bytes)
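The packing of type and nsidx can be sketched as follows. The enum and function names are hypothetical. Only the bit layout (type in the two least significant bits, nsidx shifted left by two) is taken from the table above.

```c
#include <stdint.h>

/* Identifier types as used by the type field (values from the table above). */
enum nodeid_type {
    NODEID_NUMERIC = 0,
    NODEID_STRING  = 1,
    NODEID_GUID    = 2,
    NODEID_OPAQUE  = 3
};

/* Packs type and nsidx into the value that is then written as VarInt. */
uint32_t nodeid_pack_type_nsidx(enum nodeid_type type, uint16_t nsidx)
{
    return ((uint32_t)nsidx << 2) | ((uint32_t)type & 0x3u);
}

/* Splits a decoded type_nsidx value back into its two fields. */
void nodeid_unpack_type_nsidx(uint32_t packed, enum nodeid_type *type,
                              uint16_t *nsidx)
{
    *type = (enum nodeid_type)(packed & 0x3u);
    *nsidx = (uint16_t)(packed >> 2);
}
```

With nsidx = 31 the packed value is 0x7C and still fits into a one-byte VarInt. With nsidx = 32 it becomes 0x80 and needs two bytes.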

ExpandedNodeId Encoding

Not required by this file format. An ExpandedNodeId is only needed for referencing nodes outside of the local server.

The encoding could be done as follows:

Encoding Type	Name	Description
NodeId	id	NodeId of foreign node.
String	ns_uri	Namespace URI for foreign node.
VarInt	server_index	Server index. Index of the foreign server in the server table.

QualifiedName Encoding

Encoding Type	Name	Description
VarInt	nsidx	The namespace index.
String	name	The name field.

Examples:

0:""     : 00 00
1:Hello  : 01 05 48 65 6C 6C 6F

LocalizedText Encoding

Encoding Type	Name	Description
String	locale	The locale, e.g. "en-US". This can also be empty.
String	name	The text in the specified locale.

Examples:

<Empty>     : 00 00
Hello       : 00 05 48 65 6C 6C 6F
en-US:Hello : 05 65 6e 2d 55 53 05 48 65 6C 6C 6F

ExtensionObject Encoding

In contrast to the default UA Binary Encoding this encoding supports only a binary body. It makes simply no sense to use XML encoding in small binary protocols. So we can get rid of the encoding byte of Part 6 - Table 13 and can simplify ExtensionObjects to the following.

Encoding Type	Name	Description
NodeId	typeid	The identifier for the DataTypeEncoding node.
ByteString	body	The serialized binary data of this datatype.

Note that the body is encoded with default OPC UA binary encoding. This way the application loading this file simply can use the value as-is for the protocol.

2nd note: it would be possible to use the optimized encoding of this document also for the body. This would make sense when this encoding would also be used for the protocol, which means a different typeid would be used.

Variant Encoding

C Type	Encoding Type	Field Name	Description
uint8_t	uint8_t	encoding	The type of data encoded in the stream.
			The mask has the following bits assigned:
			Bit 0:5 Built-in Type Id
			Bit 6 (HasDimensions): True if the array_dimensions field is encoded.
			Bit 7 (IsArray): True if an array of values is encoded.
uint32_t	VarInt	array_length	The number of elements in the array. This field is only present if
			the IsArray bit is set in the encoding byte. For multi-dimensional
			arrays this field contains the total number of elements.
union	-	-	Variant value. May be one of the following fields.
bool	uint8_t	boolean	Boolean value encoded as one byte (0=false, 1=true).
uint8_t	uint8_t	ui8	Unsigned byte value.
int8_t	int8_t	i8	Signed byte value.
uint16_t	VarInt	ui16	Unsigned word value.
int16_t	SVarInt	i16	Signed word value.
uint32_t	VarInt	u32	Unsigned double word value.
int32_t	SVarInt	i32	Signed double word value.
uint64_t	VarInt	u64	Unsigned quad word value.
int64_t	SVarInt	i64	Signed quad word value.
float	IEEE754 32bit	flt	32bit floating point value.
double	IEEE754 64bit	dbl	64bit floating point value.
ua_string	String	str	String value.
ua_datetime	uint64_t	dt	DateTime is a 64bit Windows FILETIME value.
ua_guid	uint8_t[16]	guid	Guid value. Technically an array of 16 byte.
ua_bytestring	String	bs	An arbitrary array of bytes. Encoded in the same way as a string.
ua_bytestring	String	xml	XML element, which is represented as ua_bytestring.
ua_nodeid	NodeId	nodeid	UA NodeId value.
ua_expandednodeid	ExpandedNodeId	enodeid	UA ExpandedNodeId value.
ua_statuscode	uint32_t	status	UA StatusCode value. (maybe SVarInt would be good also for this)
ua_qualifiedname	QualifiedName	qn	UA QualifiedName value.
ua_localizedtext	LocalizedText	lt	UA LocalizedText value.
ua_extensionobject	ExtensionObject	eo	UA ExtensionObject value.
unionend	-	-	End of union fields.
uint32_t[]	VarInt[]	array_dimensions	The length of each dimension encoded as VarInt. This field is only
			present when HasDimensions is true.

Examples:

Variant Type	Variant Value	Encoded Data (hex)	Standard UA Binary Encoding (for comparison)
Empty	-	00	00
Bool	true	01 01	01 01
SByte	-17	02 EF	02 EF
Byte	17	03 11	03 11
Int16	-17	04 21	04 EF FF
UInt16	17	05 11	05 11 00
Int32	-17	06 21	06 EF FF FF FF
UInt32	17	07 11	07 11 00 00 00
Int64	-17	08 21	08 EF FF FF FF FF FF FF FF
UInt64	17	09 11	09 11 00 00 00 00 00 00 00
Float	1.23	0A A4 70 9D 3F	0A A4 70 9D 3F
Double	1.23	0B AE 47 E1 7A 14 AE F3 3F	0B AE 47 E1 7A 14 AE F3 3F
Bool[3]	{ true, false, true }	81 03 01 00 01	81 03 00 00 00 01 00 01
Int32[2]	{ 2, -2 }	86 02 04 03	86 02 00 00 00 02 00 00 00 FE FF FF FF
UInt32[3][3]	{ { 1, 2, 3 },	C7 09 01 02 03 04 05 06	C7 09 00 00 00 01 00 00
	{ 4, 5, 6 },	07 08 09 02 03 03	00 02 00 00 00 03 00 00
	{ 7, 8, 9 } }	(14 byte)	00 04 00 00 00 05 00 00
			00 06 00 00 00 07 00 00
			00 08 00 00 00 09 00 00
			00 02 00 00 00 03 00 00
			00 03 00 00 00 (53 byte)
NodeId	ns=0;i=17	11 00 11	11 00 11 (two byte numeric nodeid)
NodeId	ns=1;i=256	11 04 80 02	11 01 01 00 01 (four byte numeric nodeid)
NodeId	ns=1;i=65536	11 04 80 80 04	11 02 01 00 00 00 01 00 (7 byte numeric nodeid)
NodeId	ns=3;s=Hello	11 0D 05 48 65 6C 6C 6F	11 03 03 00 05 00 00 00 48 65 6C 6C 6F
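The first byte of every encoded Variant can be split into its three parts with simple bit masks. The following sketch uses a hypothetical struct and function name. The bit positions are the ones given in the Variant table above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the Variant encoding byte. */
struct variant_header {
    uint8_t type_id;        /* bits 0-5: built-in type id */
    bool    has_dimensions; /* bit 6: array_dimensions field is encoded */
    bool    is_array;       /* bit 7: array_length and array values follow */
};

struct variant_header variant_header_parse(uint8_t encoding)
{
    struct variant_header h;

    h.type_id = encoding & 0x3F;
    h.has_dimensions = (encoding & 0x40) != 0;
    h.is_array = (encoding & 0x80) != 0;
    return h;
}
```

For the UInt32[3][3] example the encoding byte C7 yields type id 7 with both flags set. For the Int32[2] example the byte 86 yields type id 6 with only the array flag set.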

File Format

Encoding Type	Name	Description
char[4]	signature	File signature: { 'U', 'A', 'A', 'D' }
char[2]	version	File format version: major.minor. For this version of the document, version={ 1, 3 }
uint64_t	last_modified	Last modified timestamp as 64 bit time_t (UNIX epoch)
VarInt	num_xmlnamespaces	Number of entries in `xmlnamespacetable`
VarInt	num_stringtables	Number of entries in `stringtables`
VarInt	num_namespaces	Number of entries in `namespacetable`
VarInt	num_datatypes	Number of entries in `datatypetable`
VarInt	num_referencetypes	Number of entries in `referencetypetable`
VarInt	num_variabletypes	Number of entries in `variabletypetable`
VarInt	num_objecttypes	Number of entries in `objecttypetable`
VarInt	num_variables	Number of entries in `variabletable`
VarInt	num_objects	Number of entries in `objecttable`
VarInt	num_methods	Number of entries in `methodtable`
VarInt	num_views	Number of entries in `viewtable`
VarInt	num_references	Number of entries in `referencetable`
String[]	xmlnamespacetable	XML namespace table. List of XML namespaces used by extensions.
Extensions	extensions	Global file extensions
StringTable[]	stringtables	String tables
Namespace[]	reqnamespacetable	Required namespace table. List of namespaces that are required for this file.
Namespace[]	namespacetable	Namespace table. List of namespaces provided by this file.
DataType[]	datatypetable	DataType table
ReferenceType[]	referencetypetable	ReferenceType table
VariableType[]	variabletypetable	VariableType table
ObjectType[]	objecttypetable	ObjectType table
Variable[]	variabletable	Variable table
Object[]	objecttable	Object table
Method[]	methodtable	Method table
View[]	viewtable	View table
Reference[]	referencetable	Reference table
uint8_t[4]	checksum	Adler32 checksum

All nodes are stored in node class specific node tables, this means the node class itself is implicitly known and does not need to be stored explicitly in the file.

Checksum

The checksum is computed using the Adler32 algorithm, which results in a 4 byte checksum, which gets appended at the end of the file.

For validating the checksum, the checksum must be calculated again over the complete file, excluding the last 4 bytes of the file. Then the computed checksum must be compared with the one at the end of the file to see if the file is corrupted or not.

Adler32 is best known for its use in the zlib compression library. The algorithm is explained in detail in RFC 1950. An example implementation of Adler32 can be found on Wikipedia: https://de.wikipedia.org/wiki/Adler-32
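The validation procedure can be sketched as follows for a file that has been loaded completely into memory. The function names are hypothetical. The byte order of the stored checksum is not stated in this document, so the sketch assumes least significant byte first, in line with the fixed size encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward (unoptimized) Adler32 as described in RFC 1950. */
uint32_t adler32(const uint8_t *data, size_t len)
{
    uint32_t a = 1, b = 0;

    for (size_t i = 0; i < len; i++) {
        a = (a + data[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Returns 1 if the last 4 bytes match the checksum of everything before
 * them, 0 otherwise. ASSUMPTION: the checksum is stored little endian. */
int file_checksum_ok(const uint8_t *file, size_t len)
{
    uint32_t stored;

    if (len < 4) return 0;
    stored = (uint32_t)file[len - 4] |
             ((uint32_t)file[len - 3] << 8) |
             ((uint32_t)file[len - 2] << 16) |
             ((uint32_t)file[len - 1] << 24);
    return adler32(file, len - 4) == stored;
}
```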

String Table

To eliminate redundant strings we create a string table which contains only unique strings. The node entries use special variants of LocalizedText and QualifiedName which use a string reference instead of the string itself. A string reference is simply the string index in the string table encoded as VarInt.

QualifiedNameRef Encoding

Encoding Type	Name	Description
VarInt	nsidx	The namespace index.
VarInt	name	The name field as string table index.

LocalizedTextRef Encoding

The locale can be eliminated, because it is implicitly known from the string table in use. The String name field is replaced by the string table index.

Encoding Type	Name	Description
VarInt	name	The text as string table index.

StringTable Entry

A StringTable entry defines one string table (not one string). This is because a file can contain multiple string tables.

Encoding Type	Name	Description
String	locale	The locale of the string table (de-DE, fr-FR, etc.)
VarInt	size	Number of strings in string table.
String[]	table	Array of strings.

Internationalization (I18n)

The first string table contains the default locale and should always be complete. Other tables used for translations can contain empty strings for strings that have not been translated. Note that also the first stringtable will contain one empty string. Empty strings are used a lot in UA and these will reference the empty string entry in the stringtable. It is recommended to add the empty string as the first element in the string table, so the index 0 always references an empty string.

String lookup procedure (pseudo code):

/** Lookup string translation by \c index and \c locale.
 * @param index The string index in the table.
 * @param locale The locale of the string to return.
 * @return The found string or NULL if not found.
 */
string lookup_string(string_index index, string locale) {
    // lookup translation table
    stringtable table = stringtable_lookup(locale);
    string ret;

    if (table) {
        // lookup string
        ret = table->lookup(index);
        if (ret) return ret;
    }

    // fallback to default locale
    table = stringtable_lookup("en-US");
    return table->lookup(index);
}

Note: All string tables need to have the same size and the contained strings must have the same order. A difference in the table sizes MUST result in a decoding error.

Example:

String Index	en-US (default)	de-DE	fr-FR
0	""	""	""
1	"Hello World"	"Hallo Welt"	"Bonjour monde"
2	"data type"	"Datentyp"	"type de données"
3	"apple"	"Apfel"	"pomme"

Namespace Entry

The namespace entries are divided into two separate tables. The first table contains the required namespaces which are used by this file, the second table contains the namespaces which are provided by this file.

If an application loads a file it needs to verify that it "knows" all required namespaces, either because they are built into the application (e.g. NS0), or by loading other files in advance which provide those required namespaces.

Both tables together form the complete namespace table of the server, thus the namespace indices must be unique as shown in the example below.

Namespace Encoding:

Encoding Type	Name	Description
VarInt	nsidx	The Namespace index.
String	nsuri	The Namespace uri.
Extensions	extensions	Extensions. Typically used to add Permission info.

Example:

Number of required namespaces (file header): 2
Number of namespaces (file header): 2
Required Namespaces:
Namespace entry 0: nsidx=0, uri=http://opcfoundation.org/UA/, num_extensions=0
Namespace entry 1: nsidx=2, uri=http://opcfoundation.org/UA/DI/, num_extensions=0
Provided Namespaces:
Namespace entry 0: nsidx=1, uri=urn:demo.unifiedautomation.com:UnifiedAutomation:UaDemoServerAnsiC, num_extensions=0
Namespace entry 1: nsidx=3, uri=http://baluff/rfid, num_extensions=1
  Extension 0: Permission(type=5, len=6): { uid: 0, gid: 5, mode: 0x00000007 }

Required Namespaces:
Namespace entry 0: 00 1C 68 74 74 70 3a 2f 2f 6f  70 63 66 6f 75 6e 64 61 ...  00
Namespace entry 1: 02 1F 68 74 74 70 3a 2f 2f 6f  70 63 66 6f 75 6e 64 61 ...  00
Provided Namespaces:
Namespace entry 0: 01 42 75 72 6e 3a 64 65 6d 6f  2e 75 6e 69 66 69 65 64 ...  00
Namespace entry 1: 03 1B 68 74 74 70 3a 2f 2f 77  77 77 2e 62 61 6c 6c 75 ...  01 05 06 00 05 07 00 00 00

Abstract Node Entry

All "node" entries of different nodeclasses contain the same base attributes, where some of them are optional and some are mandatory. But not all attributes that are defined by OPC UA need to be encoded at all, as explained in the following table.

Attribute	Encoded	Description
NodeId	yes	This one is essential.
NodeClass	no	The NodeClass is implicitly known. E.g. a node entry in `variabletable` must be a variable.
BrowseName	yes	This one is mandatory.
DisplayName	optional	If missing the BrowseName is used also for DisplayName.
Description	optional	If missing this is empty (optional in UA information model)
WriteMask	optional	Optional in UA information model.
UserWriteMask	no	Optional in UA information model. User specific values are never part of the file, but derived from other information.

A node entry always starts with an Encoding byte followed by mandatory attributes, optional attributes and optional extensions.

The four least significant bits of the Encoding byte are identical for all node classes. The four most significant bits of the Encoding byte are node class specific and are defined in the following sections.

Encoding Byte:

Bit	Name	Description
0	HasDisplayName	The display name attribute is encoded.
1	HasDescription	The description attribute is encoded.
2	HasWriteMask	The WriteMask attribute is encoded.
3	HasExtension	This node contains extensions.
4-7	-	Node class specific attributes.

Node Entry Encoding Format:

The following table contains the order and type information of the base attributes that are used by all node classes.

Encoding Type	Name	Optional	Default Value
uint8_t	encoding	no	-
NodeId	nodeid	no	-
QualifiedNameRef	browsename	no	-
LocalizedTextRef	displayname	yes	BrowseName
LocalizedTextRef	description	yes	<empty>
uint32_t	writemask	yes	0
Extensions	extensions	yes	<empty>

Pseudo code of a node entry decoder:

/// Decodes extensions.
int ua_base_decode_extensions(struct buffer *buf, ua_node_t n)
{
    int ret = 0;

    // TODO: parse all extensions and call user-defined callback for each.

    return ret;
}

/// Decodes UA base attributes.
int ua_base_decode(struct buffer *buf, ua_node_t n, uint8_t *enc)
{
    int ret = 0;
    uint8_t encoding;
    // base attributes
    struct ua_nodeid nodeid;
    struct ua_qualifiedname browsename;
    struct ua_localizedtext displayname;
    struct ua_localizedtext description;
    uint32_t writemask = 0;

    // read encoding byte
    if (ua_buffer_remaining_byte(buf) < 1) return -1;
    encoding = ua_buffer_getbyte(buf);

    // decode nodeid: mandatory
    ret = ua_base_decode_nodeid(buf, &nodeid);
    if (ret < 0) return ret;

    // decode browsename: mandatory
    ret = ua_base_decode_qualifiedname(buf, &browsename);
    if (ret < 0) return ret;

    // decode displayname: optional
    if (encoding & BIT_DISPLAYNAME)
        ua_base_decode_localizedtext(buf, &displayname);
    else
        ua_localized_text_attach_const("", browsename.text);

    // decode description: optional
    if (encoding & BIT_DESCRIPTION)
        ua_base_decode_localizedtext(buf, &description);
    else
        ua_localized_init(&description);

    // decode writemask: optional
    if (encoding & BIT_WRITEMASK)
        ua_base_decode_uint32_t(buf, &writemask);

    // decode extensions: optional
    if (encoding & BIT_EXTENSIONS)
        ua_base_decode_extensions(buf, n);

    // assign attributes
    ret = ua_node_set_id(n, &nodeid);
    if (ret < 0) return ret;
    ret = ua_node_set_browsename(n, &browsename);
    if (ret < 0) return ret;
    ret = ua_node_set_displayname(n, &displayname);
    if (ret < 0) return ret;
    ret = ua_node_set_description(n, &description);
    if (ret < 0) return ret;
    ret = ua_node_set_writemask(n, &writemask);

    // return encoding byte
    if (enc) *enc = encoding;

    return ret;
}

/// Decodes a variable entry.
int ua_variable_decode(struct buffer *buf, ua_node_t *node)
{
    int ret = 0;
    ua_node_t n = ua_node_create(NODECLASS_VARIABLE);
    uint8_t encoding;
    // variable attributes
    struct ua_variant value;
    struct ua_nodeid datatype;
    ...

    // decode common base attributes and extensions
    ret = ua_base_decode(buf, n, &encoding);
    if (ret < 0) return ret;

    // decode value
    if (encoding & BIT_VALUE)
        ua_variant_decode(buf, &value);
    else
        ua_variant_init(&value);

    // decode datatype
    if (encoding & BIT_DATATYPE)
        ua_nodeid_decode(buf, &datatype);
    else
        ua_nodeid_init(&datatype);

    ...

    // assign attributes

    ua_variable_set_value(n, &value);
    ...

    return ret;
}

XML Namespace Entry

XML Namespace entries are used to uniquely identify proprietary extensions which can be part of this file. The XML namespace index is the index in the XML namespace table, starting with 0.

Each entry is a string containing an XML Namespace URI.

XML Namespace Encoding:

Encoding Type	Name	Description
String	uri	The Namespace uri.

Example:

Number of XML namespaces (file header): 2
Namespace entry 0: uri=http://unifiedautomation.com/Configuration/NodeSet.xsd
Namespace entry 1: uri=http://www.siemens.com/OPCUA/2017/SimaticNodeSetExtensions

Extensions

Extensions can be added to individual nodes as well as to the file as global extensions. Extensions are a way to add vendor-specific information in a generic way, e.g. a runtime address that a server uses to connect the UA variable to the underlying system. What this information looks like is out of scope of UA, thus it is simply represented by a ByteString, which needs to be interpreted by the application. The design of extensions allows unknown extensions to be skipped. Implementations MUST ignore unknown extensions and continue decoding without error.

Extensions Encoding:

Encoding Type	Name	Description
VarInt	num_extensions	Number of extensions in this node
Extension[]	extensiontable	Array of extensions.

Extension Encoding:

Encoding Type	Name	Description
VarInt	nxidx	Index of the XML namespace within the xmlnamespacetable. The XML namespace uniquely identifies the organisation / technology which defined this extension.
VarInt	type	Type number of the extension. Within the XML namespace, this number shall uniquely identify the content and semantics of this extension.
ByteString	body	Binary data of extension. The ByteString contains the length of the data, so this can be skipped if `type` is unknown.
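Because every extension body carries its own length, a decoder can step over extensions it does not understand. The following sketch walks an encoded Extensions block and returns the body of one wanted extension while skipping all others. It follows the field order of the two tables above (nxidx, type, body). The function names and return codes are assumptions for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VarInt helper: returns bytes consumed, 0 on malformed input. */
static size_t varint_decode(const uint8_t *in, size_t len, uint64_t *value)
{
    uint64_t result = 0;

    for (size_t n = 0; n < len && n < 10; n++) {
        if (n == 9 && in[n] > 0x01) return 0;
        result |= (uint64_t)(in[n] & 0x7F) << (7 * n);
        if ((in[n] & 0x80) == 0) { *value = result; return n + 1; }
    }
    return 0;
}

/* Searches an Extensions block for the extension (want_nxidx, want_type).
 * Returns 1 if found (*body and *body_len are set), 0 if not present,
 * -1 if the data is malformed. Unknown extensions are skipped. */
int extensions_find(const uint8_t *in, size_t len,
                    uint64_t want_nxidx, uint64_t want_type,
                    const uint8_t **body, size_t *body_len)
{
    uint64_t count, nxidx, type, blen;
    size_t pos = 0, used;

    used = varint_decode(in, len, &count);
    if (used == 0) return -1;
    pos += used;

    for (uint64_t i = 0; i < count; i++) {
        used = varint_decode(in + pos, len - pos, &nxidx);
        if (used == 0) return -1;
        pos += used;
        used = varint_decode(in + pos, len - pos, &type);
        if (used == 0) return -1;
        pos += used;
        used = varint_decode(in + pos, len - pos, &blen);
        if (used == 0) return -1;
        pos += used;
        if (blen > len - pos) return -1; /* body exceeds remaining data */

        if (nxidx == want_nxidx && type == want_type) {
            *body = in + pos;
            *body_len = (size_t)blen;
            return 1;
        }
        pos += (size_t)blen; /* skip the body without interpreting it */
    }
    return 0;
}
```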

List of Unified Automation extensions /TBD/:

Extension Type	Name	DataType	Description	Scope
0	userdb	UserDatabase	Contains UserId / Username mappings.	Global
1	groupdb	GroupDatabase	Contains GroupId / Groupname mappings as well as user lists.	Global
2	passdb	PasswordDatabase	Contains Password hashes for each user.	Global
3	generator	GeneratorInfo	Contains information about the tool that has generated this file.	Global
4	enginfo	EngineeringInfo	Contains additional information for engineering tools like UaModeler.	Node
5	permission	Permission	Contains node permission information.	Node,Namespace
15	eo	ExtensionObject	Contains a typified structure in UA Binary Encoding.*	Node

TODO: * is for Mats. Do we really need this?

Global Extensions

UserDatabase Extension

This user database is necessary to implement authorization. If authentication uses an internal implementation you can add also the PasswordDatabase extension, but this is not required if you are using different authentication schemes.

UserDatabase Encoding:

Users Encoding:

Encoding Type	Name	Description
VarInt	num_users	Number of users in user table.
User[]	users	User table.

User Encoding:

Encoding Type	Name	Description
VarInt	uid	UserId (0 is reserved for root).
String	username	The username.

GroupDatabase Extension

This group database is necessary to implement authorization. If authentication uses an internal implementation you can add also the PasswordDatabase extension, but this is not required if you are using different authentication schemes.

Groups Encoding:

Encoding Type	Name	Description
VarInt	num_groups	Number of groups in group table.
Group[]	groups	Group table.

Group Encoding:

Encoding Type	Name	Description
VarInt	gid	GroupId (0 is reserved for root).
String	groupname	The groupname.
VarInt	num_users	Number of users in group.
VarInt[]	uids	Array of user ids.

PasswordDatabase Extension

The password database stores SHA256 password hashes, not the plain-text password itself. This extension requires the UserDatabase extension.

PasswordDatabase Encoding:

Encoding Type	Name	Description
VarInt	num_passwords	Number of passwords in password table.
Password[]	passwords	Password table.

Password Encoding:

Encoding Type	Name	Description
VarInt	uid	The user id for this password entry.
String	salt	A random salt value for the hash.
uint8_t[32]	hash	The SHA256 hash value.

Generator Extension

This extension is purely informative and has no influence on the server. This extension MAY be stripped by "Downloaders" to save space.

Encoding Type	Name	Description
String	name	The name of the generator tool.
String	version	The version number of the tool.

Node Extensions

Node extensions are extensions that can be added to individual nodes.

Extensions are only encoded if the HasExtension bit is set in the Encoding byte.

Permission Extension

Each node is assigned to a user (the owner of the node) and a group. All remaining users count as others. This is similar to UNIX file system permissions. The permission flags (mode) are a bitmask as shown in the table below, and exist three times: for owner, group and others. An additional flag in the mode can be used to require an encrypted connection (e.g. for Audit events).

Name	Encoding Type	Description
uid	VarInt	The user id of the owner of this node.
gid	VarInt	The group id of this node.
mode	uint32_t	The access permission bits of this node. Encoded in LE format.
		Bit 0: AttributeReadable (=Browseable*)
		Bit 1: Readable (value attribute)
		Bit 2: Writable (value attribute)
		Bit 3: Executable
		Bit 4: HistoryReadable
		Bit 5: HistoryInsertable
		Bit 6: HistoryModifiable
		Bit 7: HistoryDeletable
		Bit 8: EventReadable
		Bit 9: AttributeWritable
		Bits 0-9 contain permissions for others, bits 10-19 contain
		the same information for group, bits 20-29 for the owner.
		Bit 30: not used
		Bit 31: requires encryption

* Without the AttributeReadable permission browsing makes no sense, so a separate Browseable permission is not necessary.
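A permission check against the mode field can be sketched as follows. The macro, enum and function names are hypothetical. The bit layout (10 bits each for others, group and owner, bit 31 for encryption) is taken from the table above.

```c
#include <stdint.h>

/* Permission bits within one 10-bit group. */
#define PERM_ATTR_READABLE      (1u << 0)
#define PERM_READABLE           (1u << 1)
#define PERM_WRITABLE           (1u << 2)
#define PERM_EXECUTABLE         (1u << 3)
#define PERM_HISTORY_READABLE   (1u << 4)
#define PERM_HISTORY_INSERTABLE (1u << 5)
#define PERM_HISTORY_MODIFIABLE (1u << 6)
#define PERM_HISTORY_DELETABLE  (1u << 7)
#define PERM_EVENT_READABLE     (1u << 8)
#define PERM_ATTR_WRITABLE      (1u << 9)

/* Flag that applies to the whole mode value. */
#define PERM_REQUIRES_ENCRYPTION (1u << 31)

/* The three groups, ordered by their position in the mode field. */
enum perm_class { PERM_OTHERS = 0, PERM_GROUP = 1, PERM_OWNER = 2 };

/* Extracts the 10 permission bits for the given class. */
uint32_t perm_bits(uint32_t mode, enum perm_class cls)
{
    return (mode >> (10u * (unsigned)cls)) & 0x3FFu;
}

/* Returns non-zero if all requested permission bits are granted. */
int perm_allows(uint32_t mode, enum perm_class cls, uint32_t perm)
{
    return (perm_bits(mode, cls) & perm) == perm;
}
```

For the mode value 0x00000007 used in the examples of this document, others may read attributes and read and write the value, while group and owner have no bits set.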

Engineering Info

This information is only useful for engineering tools not for servers. This extension MAY be stripped by "Downloaders" to save space.

Name	Encoding Type	Description
symbolicname	String	A symbolic id for code generators.
category	String	A category used to filter generated output.
documentation	String	Additional documentation that could be used by code generators.

Extension Examples

Fictitious example of vendor-specific extensions:

Extension Type	Name	DataType	Description
30	S7ANYPOINTER	uint8_t[10]	Runtime address of S7 PLCs.
40	ModbusAddr	VarInt	Modbus register number.

Extension Encoding Example:

Extension	Value
Permission (5)	{ uid: 1, gid: 5, mode: 0x00000007 }
S7ANYPOINTER (30)	0xDEADBEEFDEADBEEFDEAD

Resulting byte stream of encoded extensions:

Hex: 02 05 06 01 05 07 00 00 00 1E 0A DE AD BE EF DE AD BE EF DE AD

Dissected Encoding for Clarification:

Encoded Data	Meaning
02	Number of extensions: 2
05 06 01 05 07 00 00 00	1. Extension: Type=5, Len=6, Value=0x010507000000
1E 0A DE AD BE EF DE AD BE EF DE AD	2. Extension: Type=30, Len=10, Value=0xDEADBEEFDEADBEEFDEAD (uint8_t[10])

Variable Entry

Variable specific attributes:

Attribute	Encoded	Description
Value	optional	The process value.
DataType	optional	The data type of value.
ValueRank	optional	ValueRank according to Part 3, OPC Specification.
ArrayDimensions	optional	Length of each array dimension.
AccessLevel	optional	AccessLevel bitmask according to Part 3, OPC Specification.
UserAccessLevel	no	User specific data is not stored in the file.
Historizing	yes	Encoded as part of the 2nd encoding byte*.
MinimumSamplingInterval	optional	Minimum sampling interval of the variable in µs**.

* Making a boolean value optional makes no sense: one bit would be needed to indicate whether the boolean is encoded, and the boolean itself would occupy a whole byte when encoded. Thus it makes more sense to store the boolean value directly in the encoding byte.

** In the UA Information Model MinimumSamplingInterval is defined as Duration (=_Double_). An 8-byte floating point type is overkill for this purpose, so we use a VarInt with µs resolution, which should be fine-grained enough for most use cases and can still be encoded in far fewer bytes than a double.

Examples:

MinSamplingInterval  Encoded value (hex)
1 µs                 = 01
1 ms = 1,000 µs      = E8 07
1 s  = 1,000,000 µs  = C0 84 3D
10 s = 10,000,000 µs = 80 AD E2 04
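
The encoded values in this table are consistent with a VarInt that stores 7 data bits per byte, least significant group first, with the top bit of each byte set when more bytes follow. The Python sketch below assumes exactly that scheme and reproduces the table; encode_varint and decode_varint are hypothetical helper names.

```python
def encode_varint(value):
    """Encode an unsigned integer, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("VarInt is unsigned")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, offset=0):
    """Return (value, next_offset) for the VarInt starting at data[offset]."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


# MinimumSamplingInterval values in microseconds.
for microseconds in (1, 1_000, 1_000_000, 10_000_000):
    print(microseconds, encode_varint(microseconds).hex(" ").upper())
```

Even a 10 s interval needs only four bytes, half the size of the 8-byte double used for Duration.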

Encoding Byte:

Variables have more than four additional optional attributes, so a second, optional encoding byte is added after the base attributes. One bit of the first encoding byte indicates whether the second encoding byte is encoded. If it is not, all bits of the second byte are treated as zero.

1st Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	HasValue	The Value attribute is encoded.
5	HasDataType	The DataType attribute is encoded.
6	HasValueRank	The ValueRank attribute is encoded. MUST be set if HasArrayDimensions is set.
7	Has2ndEncodingByte	There is a 2nd encoding byte.

2nd Encoding Byte:

Bit	Name	Description
0	HasArrayDimensions	The ArrayDimensions attribute is encoded.
1	HasAccessLevel	The AccessLevel attribute is encoded.
2	HasMinimumSamplingInterval	The MinimumSamplingInterval attribute is encoded.
3	Historizing	Historical data gets recorded for this variable.
4-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description	Default Value
base attributes	-	Base attributes from section Abstract Node Entry.
Byte	encoding2	2nd encoding byte. Encoded only if Has2ndEncodingByte=1.	0
Variant	value	Encoded only if HasValue=1.	<empty>
NodeId	datatype	Encoded only if HasDataType=1.	<empty>
SVarInt	valuerank	Encoded only if HasValueRank=1.	-1 (Scalar)
Byte	arraydimension_length	Encoded only if HasArrayDimensions=1.	0
VarInt[]	arraydimensions	Encoded only if HasArrayDimensions=1.	<empty>
Byte	accesslevel	Encoded only if HasAccessLevel=1.	1 (Read only)
VarInt	minimumsamplinginterval	Encoded only if HasMinimumSamplingInterval=1.	0
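
How a reader might interpret the two encoding bytes of a Variable entry is sketched below in Python. The function name variable_flags and the choice to raise ValueError on invalid input are assumptions made for illustration; bits 0-3 of the first byte belong to the Abstract Node Entry and are ignored here.

```python
def variable_flags(encoding1, encoding2=0):
    """Decode the Variable-specific bits of the 1st and 2nd encoding byte."""
    flags = {
        "HasValue": bool(encoding1 & 0x10),            # bit 4
        "HasDataType": bool(encoding1 & 0x20),         # bit 5
        "HasValueRank": bool(encoding1 & 0x40),        # bit 6
        "Has2ndEncodingByte": bool(encoding1 & 0x80),  # bit 7
    }
    if not flags["Has2ndEncodingByte"]:
        encoding2 = 0  # an absent 2nd byte is treated as all zero
    if encoding2 & 0xF0:
        raise ValueError("reserved bits 4-7 of the 2nd encoding byte must be zero")
    flags.update({
        "HasArrayDimensions": bool(encoding2 & 0x01),
        "HasAccessLevel": bool(encoding2 & 0x02),
        "HasMinimumSamplingInterval": bool(encoding2 & 0x04),
        "Historizing": bool(encoding2 & 0x08),
    })
    if flags["HasArrayDimensions"] and not flags["HasValueRank"]:
        raise ValueError("HasValueRank MUST be set if HasArrayDimensions is set")
    return flags


# Value, DataType and ValueRank present, plus a 2nd byte with ArrayDimensions.
print(variable_flags(0xF0, 0x01))
```

A decoder would then read the fields in the order of the encoding table, skipping each field whose flag is not set and using the default value instead.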

Object Entry

Object specific attributes:

Attribute	Encoded	Description
EventNotifier	optional	Can be used to subscribe for events.

Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	HasEventNotifier	The EventNotifier attribute is encoded.
5-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description	Default Value
base attributes	-	Base attributes from section Abstract Node Entry.
Byte	eventnotifier	EventNotifier bitmask according to Part 3, OPC Specification. Encoded only if HasEventNotifier=1.	0 (None)

Method Entry

Method specific attributes:

Attribute	Encoded	Description
Executable	yes	The method is executable.
UserExecutable	no	User specific data is not stored in the file.

Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	Executable	If this bit is set the method is executable.
5-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description
base attributes	-	Base attributes from section Abstract Node Entry.

View Entry

View specific attributes:

Attribute	Encoded	Description
EventNotifier	optional	Can be used to subscribe for events.
ContainsNoLoop	yes	If this bit is set, following the References in the context of the View does not lead to any loops.

Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	HasEventNotifier	The EventNotifier attribute is encoded.
5	ContainsNoLoop	If this bit is set, the View contains no loops.
6-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description	Default Value
base attributes	-	Base attributes from section Abstract Node Entry.
Byte	eventnotifier	EventNotifier bitmask according to Part 3, OPC Specification. Encoded only if HasEventNotifier=1.	0 (None)

VariableType Entry

VariableType specific attributes:

Attribute	Encoded	Description
Value	optional	A default process value.
DataType	optional	The data type of Value.
ValueRank	optional	ValueRank according to Part 3, OPC Specification.
ArrayDimensions	optional	Length of each array dimension.
IsAbstract	yes	If "true" the VariableType is abstract.

Encoding Byte:

1st Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	HasValue	The Value attribute is encoded.
5	HasDataType	The DataType attribute is encoded.
6	HasValueRank	The ValueRank attribute is encoded. MUST be set if HasArrayDimensions is set.
7	Has2ndEncodingByte	There is a 2nd encoding byte.

2nd Encoding Byte:

Bit	Name	Description
0	HasArrayDimensions	The ArrayDimensions attribute is encoded.
1	IsAbstract	The IsAbstract attribute.
2-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description	Default Value
base attributes	-	Base attributes from section Abstract Node Entry.
Byte	encoding2	2nd encoding byte. Encoded only if Has2ndEncodingByte=1.	0
Variant	value	Encoded only if HasValue=1.	<empty>
NodeId	datatype	Encoded only if HasDataType=1.	<empty>
SVarInt	valuerank	Encoded only if HasValueRank=1.	-1 (Scalar)
Byte	arraydimension_length	Encoded only if HasArrayDimensions=1.	0
VarInt[]	arraydimensions	Encoded only if HasArrayDimensions=1.	<empty>

ObjectType Entry

ObjectType specific attributes:

Attribute	Encoded	Description
IsAbstract	yes	If "true" the ObjectType is abstract.

Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	IsAbstract	The IsAbstract attribute.
5-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description
base attributes	-	Base attributes from section Abstract Node Entry.

DataTypeDefinition

In the OPC UA information model the DataTypeDefinition is an abstract base type for StructureDefinition and EnumDefinition. When reading the attribute, the extension object's encodingId tells the client which subtype is contained in the body.

In this file format we don't have this extension object meta information, so the DataTypeDefinition is used as a header structure defining the type of the following body.

Name	Type	Description
DataTypeDefinitionType	uint8_t	0: StructureDefinition, 1: EnumDefinition
structure	StructureDefinition	encoded only when DataTypeDefinitionType=0
enum	EnumDefinition	encoded only when DataTypeDefinitionType=1

StructureDefinition

This type is based on the StructureDefinition structure in OPC UA Part 3 V1.04, section 8.49. The main difference is that we don't repeat inherited structure fields from base types in all sub types as this would create a lot of redundancies. Also the UA Binary File Format encoding is used instead of UA Binary encoding to save even more space.

Name	Type	Description
defaultEncodingId	NodeId	The NodeId of the default DataTypeEncoding for the DataType. The default
		depends on the message encoding, Default Binary for UA Binary encoding
		and Default XML for XML encoding.
		If the DataType is only used inside nested Structures and is not directly
		contained in an ExtensionObject, the encoding NodeId is null.
baseDataType	NodeId	The NodeId of the direct supertype of the DataType. This might be the
		abstract Structure or the Union DataType.
structureType	uint8_t	0: Plain structure, 1: Structure with optional fields, 2: Union
fields	StructureField[]	The list of fields that make up the data type. This list does not repeat
		fields inherited from the base type.

Note that a server needs to create a full list of fields, including the inherited ones, when sending StructureField information to the client. The server may create this information at startup when loading this file, or on the fly when the client is reading the information. This is an implementation detail of the server. The latter option saves memory by omitting the redundancies, but needs more time to compute the information at runtime.
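
A minimal Python sketch of how a server could rebuild the full field list by walking the baseDataType chain is given below. It assumes that inherited fields are listed before the fields of the subtype, and it uses plain strings in place of NodeIds and field structures; the function name all_fields and the example type names are hypothetical.

```python
def all_fields(type_id, definitions):
    """Return inherited fields followed by the type's own fields.

    definitions maps a data type id to (baseDataType id, list of own fields).
    Types without an entry (e.g. the abstract Structure) end the walk.
    """
    chain = []
    current = type_id
    while current in definitions:
        base, own_fields = definitions[current]
        chain.append(own_fields)
        current = base
    fields = []
    for own_fields in reversed(chain):  # base type first (assumed order)
        fields.extend(own_fields)
    return fields


# Hypothetical model: Motor derives from Device, Device derives from Structure.
definitions = {
    "Device": ("Structure", ["serialNumber", "vendor"]),
    "Motor": ("Device", ["ratedSpeed"]),
}
print(all_fields("Motor", definitions))
```

Calling this once per type at startup trades memory for speed, while calling it on each read keeps only the compact representation in memory.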

Note on defaultEncodingId: In UA Binary Encoding, structures encoded in extension objects contain the encodingId of the data type, not the dataTypeId. For this reason an application that decodes an extension object needs to look up the type by encodingId instead of dataTypeId. When encoding data, normally the dataTypeId is given, but the encodingId needs to be written into the byte stream. For this reason the StructureDefinition contains the defaultEncodingId. An application needs to index the StructureDefinition by both encodingId and dataTypeId.
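
The two lookups described in this note could be kept in a small registry such as the following Python sketch. The class name StructureRegistry, its method names and the string ids are hypothetical; a null encoding NodeId is represented by None.

```python
class StructureRegistry:
    """Index StructureDefinitions by dataTypeId and by defaultEncodingId."""

    def __init__(self):
        self._by_data_type = {}
        self._by_encoding = {}

    def add(self, data_type_id, definition):
        self._by_data_type[data_type_id] = definition
        encoding_id = definition["defaultEncodingId"]
        if encoding_id is not None:  # null for types only used in nested structures
            self._by_encoding[encoding_id] = (data_type_id, definition)

    def encoding_id_for(self, data_type_id):
        """Encoding path: dataTypeId is known, encodingId goes into the stream."""
        return self._by_data_type[data_type_id]["defaultEncodingId"]

    def lookup_encoding(self, encoding_id):
        """Decoding path: encodingId is read from the stream."""
        return self._by_encoding[encoding_id]


registry = StructureRegistry()
registry.add("ns=1;i=100", {"defaultEncodingId": "ns=1;i=101", "fields": ["a", "b"]})
registry.add("ns=1;i=200", {"defaultEncodingId": None, "fields": ["x"]})
print(registry.encoding_id_for("ns=1;i=100"))
print(registry.lookup_encoding("ns=1;i=101")[0])
```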

StructureField

Name	Type	Description
name	VarInt	The name of the field, unique within the StructureDefinition, encoded as an index into the string table.
description	LocalizedTextRef	A localized description of the field.
dataType	NodeId	The NodeId of the DataType for the field.
valueRank	int32_t	The value rank for the field. It shall be Scalar (-1) or a fixed rank Array (>=1).
isOptional	bool	The field indicates if a data type field in a Structure is optional. If the
		structureType is 2 (union) this field shall be ignored. If the structureType is
		0 (plain structure) this field shall be false.

EnumDefinition

This Structured DataType is used to provide the metadata for a custom Enumeration or OptionSet DataType. It is derived from the DataType DataTypeDefinition.

Name	Type	Description
fields	EnumField[]	The list of fields that make up the data type.

EnumField

The EnumField as defined in the OPC UA Spec. including the inherited fields, but using Binary File Format encoding and string tables.

Name	Type	Description
name	VarInt	A non-localized name of the field, encoded as an index into the string table. This is used as symbolic name in UaModeler
		for code generation and required for XML encoding and JSON non-reversible encoding.
		See Part 6 for information about Enumeration encoding.
value	SVarInt	The integer representation of an Enumeration.
displayName	LocalizedTextRef	A human-readable representation of the Value of the Enumeration.
description	LocalizedTextRef	A localized description of the enumeration value. This field can contain an
		empty string if no description is available.

TODO: name could also be a simple String. There is no translation, but there could be redundant strings with other types.

Hints: EnumField is defined in Part 3, section 8.52 (contains only name field), but inherits from EnumValueType which is defined in Part 3, section 8.40. So EnumField contains name + all inherited fields.

DataType Entry

DataType specific attributes:

Attribute	Encoded	Description
IsAbstract	yes	If "true" the DataType is abstract.
DataTypeDefinition	optional	Describes the datatype details of structures, unions and enums.

Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	IsAbstract	The IsAbstract attribute.
5	HasDataTypeDefinition	The DataTypeDefinition attribute is encoded.
6-7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description
base attributes	-	Base attributes from section Abstract Node Entry.
DataTypeDefinition	datatypedefinition	Encoded only if HasDataTypeDefinition=1.

ReferenceType Entry

ReferenceType specific attributes:

Attribute	Encoded	Description
IsAbstract	yes	If "true" the ReferenceType is abstract.
Symmetric	yes	If "true" the ReferenceType is symmetric.
InverseName	optional	The inverse name attribute as defined in Part 3, OPC Specification.

Encoding Byte:

Bit	Name	Description
0-3		See Abstract Node Entry.
4	IsAbstract	The IsAbstract attribute.
5	Symmetric	The Symmetric attribute.
6	HasInverseName	The InverseName attribute is encoded.
7		Reserved, must be zero.

Encoding Table:

Encoding Type	Name	Description	Default Value
base attributes	-	Base attributes from section Abstract Node Entry.
LocalizedTextRef	inversename	Encoded only if HasInverseName=1.	<empty>

Reference Entry

A Reference entry consists simply of three NodeIds.

Type	Name	Description
NodeId	src	NodeId of source node
NodeId	dst	NodeId of destination node
NodeId	type	NodeId of reference type node

Future Extensions

String table compression