Embedded OPC UA Stack Documentation


This manual is an API reference and also contains some general information on how to use the SDK. Please read the Introduction first to understand the principles of this SDK before you start implementing.

This manual is divided into the following sections:



As the leading supplier of OPC UA technology, we have analyzed the issues and bottlenecks of today’s OPC UA implementations. We came to the conclusion that only a complete redesign can solve these issues, improving the performance and scalability of OPC UA and increasing its security. This makes OPC UA usable on the smallest devices and thus “IoT Ready”, and it also brings improved performance to high-end servers that must handle thousands of connections in parallel.

With a new software architecture and a new implementation written from scratch, we have achieved all these goals. The new implementation remains wire-compatible with the original OPC Foundation stacks.

Parallelism Revisited

One problem of many network applications is poor multi-threaded design. Too many threads are created without a clear concept, which leads to an enormous waste of resources, poor performance due to lock contention, and thrashed CPU caches. Some implementations even create one thread per connection, which is the worst design in terms of scalability.

With the new SDK, we have designed a set of OPC UA components that can work in parallel, independently of each other, and thus achieve superior performance on multi-core CPUs without interference. In addition, this architecture allows the components to be driven from a single-threaded main loop on the smallest microcontrollers.
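The idea of driving independent components from one single-threaded main loop can be sketched as follows. This is a minimal illustration, not the SDK's API: the names component_t, demo_step, run_once and run_until_idle are hypothetical, and it assumes each component exposes a non-blocking step function that does a bounded amount of work and returns.

```c
#include <stddef.h>

/* Hypothetical component interface (not the SDK's API): every component
 * offers a non-blocking step function that processes at most one work item. */
typedef struct component {
    const char *name;
    int (*step)(struct component *self); /* returns number of items processed */
    int pending;                          /* demo state: work items still queued */
    int processed;                        /* demo state: work items completed */
} component_t;

/* Demo step function: consume one queued work item, if there is one. */
int demo_step(component_t *self)
{
    if (self->pending == 0)
        return 0;
    self->pending--;
    self->processed++;
    return 1;
}

/* One pass of the main loop: give every component a chance to run.
 * Returns the total amount of work done in this pass. */
int run_once(component_t *components, size_t count)
{
    int work = 0;
    for (size_t i = 0; i < count; i++)
        work += components[i].step(&components[i]);
    return work;
}

/* Keep making passes until no component has work left.
 * Returns the number of passes that did some work. */
int run_until_idle(component_t *components, size_t count)
{
    int passes = 0;
    while (run_once(components, count) > 0)
        passes++;
    return passes;
}
```

Because no step function blocks, the same components could in principle be distributed over several threads on a multi-core CPU or polled in turn on a microcontroller without an operating system.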


The component design allows components such as the network encoder/decoder to run in a separate process. This not only can improve performance, it also makes it possible to benefit from sandboxing mechanisms like Linux Secure Computing Mode (seccomp), which can disable nearly all operating system calls for this process. If a bug could lead to an exploit, the OS terminates the process as soon as an attacker tries to access a forbidden operating system function. The master process detects this and can restart the terminated process.
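The sandboxing behaviour described above can be sketched with the Linux prctl() interface. This is an illustration under assumptions, not the SDK's implementation: the function and enum names are hypothetical, and it uses seccomp strict mode, in which only read, write, exit and sigreturn are permitted. Entering strict mode can fail, for example when the process already runs under a seccomp filter, so the sketch reports that case separately.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes reported by the supervising (master) process. */
enum worker_result {
    WORKER_EXITED = 0,     /* worker finished normally */
    WORKER_KILLED = 1,     /* kernel killed the worker with SIGKILL */
    WORKER_NO_SANDBOX = 2, /* strict mode could not be enabled */
    WORKER_ERROR = 3       /* fork/waitpid failed or unexpected status */
};

/* Runs in the child process. After entering strict mode, any system call
 * other than read, write, exit and sigreturn gets the process killed. */
static void sandboxed_worker(int misbehave)
{
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
        _exit(WORKER_NO_SANDBOX);

    if (misbehave)
        syscall(SYS_getpid); /* forbidden in strict mode: SIGKILL */

    /* _exit() uses exit_group, which strict mode forbids, so call the
     * plain exit system call directly. */
    syscall(SYS_exit, 0);
}

/* Runs in the master process: start one worker and classify how it ended. */
int run_worker_once(int misbehave)
{
    pid_t pid = fork();
    if (pid < 0)
        return WORKER_ERROR;
    if (pid == 0) {
        sandboxed_worker(misbehave);
        _exit(WORKER_ERROR); /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return WORKER_ERROR;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return WORKER_KILLED;
    if (WIFEXITED(status) && WEXITSTATUS(status) == WORKER_NO_SANDBOX)
        return WORKER_NO_SANDBOX;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return WORKER_EXITED;
    return WORKER_ERROR;
}
```

A master process could call run_worker_once() again whenever it returns WORKER_KILLED, which corresponds to restarting the terminated component.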

Asynchronous Network API

The new OPC UA implementation is based on a completely asynchronous network API that serves as an OS abstraction layer. The different network backends make it possible to benefit from modern OS-specific APIs like POSIX AIO, Linux epoll, BSD kqueue, or Windows I/O Completion Ports. These APIs don’t suffer from the scalability problems of the ancient Berkeley Socket APIs and are the enabler for high-performance server applications. Using these APIs reduces the number of context switches and copy operations, which improves performance when scaling to thousands of connections.
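To illustrate the style of programming that such a backend enables, here is a minimal readiness-based event loop on top of Linux epoll. It is a sketch only: watcher_t, loop_create, loop_watch, loop_run_once and count_bytes are hypothetical names and do not reflect the SDK's network API.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: invoked when a descriptor becomes readable. */
typedef void (*readable_cb)(int fd, void *user);

typedef struct watcher {
    int fd;
    readable_cb on_readable;
    void *user;
} watcher_t;

/* Create the epoll instance that backs the loop. Returns -1 on error. */
int loop_create(void)
{
    return epoll_create1(0);
}

/* Register a watcher; the watcher must outlive its registration. */
int loop_watch(int epfd, watcher_t *w)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = w;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, w->fd, &ev);
}

/* Wait up to timeout_ms for events and dispatch their callbacks.
 * Returns the number of ready descriptors, or -1 on error. */
int loop_run_once(int epfd, int timeout_ms)
{
    struct epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, timeout_ms);
    for (int i = 0; i < n; i++) {
        watcher_t *w = events[i].data.ptr;
        w->on_readable(w->fd, w->user);
    }
    return n;
}

/* Demo callback: read what is available and add its size to a counter. */
void count_bytes(int fd, void *user)
{
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        *(size_t *)user += (size_t)n;
}
```

A single call to epoll_wait() reports all ready descriptors at once, so one thread can serve many connections without dedicating a thread to each of them.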

With this new API we have also introduced non-blocking domain name resolution. We have identified blocking name resolution as a major design problem in today’s implementations.

Asynchronous Crypto and PKI APIs

As with network APIs, today’s crypto implementations suffer from synchronous, blocking designs. Our new OPC UA implementation is designed to be completely asynchronous to solve this issue. Two different backends are supported out of the box: OpenSSL and mbedTLS. More backends can be added over time. This concept also makes it possible to add hardware-accelerated cryptography. With the asynchronous design, an encryption job can be delegated to a hardware chip while OPC UA communication continues, and the result of the hardware encryption can be processed later, even in a single-threaded environment.
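The submit, continue, and process-later pattern described above might look like the following sketch. Everything in it is an assumption made for illustration: crypto_job_t, job_submit, job_poll and mark_done are hypothetical names, the "hardware" is simulated by a poll counter, and the XOR transform is only a placeholder and not real encryption.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { JOB_IDLE, JOB_PENDING, JOB_DONE } job_state_t;

/* Hypothetical asynchronous crypto job (not the SDK's API). */
typedef struct crypto_job {
    job_state_t state;
    const uint8_t *in;
    uint8_t *out;
    size_t len;
    int polls_left; /* simulated hardware latency, counted in polls */
    void (*on_done)(struct crypto_job *job, void *user);
    void *user;
} crypto_job_t;

/* Hand the job to the (simulated) accelerator and return immediately. */
void job_submit(crypto_job_t *job, const uint8_t *in, uint8_t *out,
                size_t len, int latency,
                void (*on_done)(crypto_job_t *, void *), void *user)
{
    job->state = JOB_PENDING;
    job->in = in;
    job->out = out;
    job->len = len;
    job->polls_left = latency;
    job->on_done = on_done;
    job->user = user;
}

/* Called from the main loop. Returns 1 if the job completed during this
 * call (and the callback ran), 0 otherwise. Never blocks. */
int job_poll(crypto_job_t *job)
{
    if (job->state != JOB_PENDING)
        return 0;
    if (--job->polls_left > 0)
        return 0; /* "hardware" still busy; caller does other work */

    /* Placeholder transform standing in for the hardware result. */
    for (size_t i = 0; i < job->len; i++)
        job->out[i] = job->in[i] ^ 0x5A;

    job->state = JOB_DONE;
    if (job->on_done)
        job->on_done(job, job->user);
    return 1;
}

/* Demo completion callback: set a flag supplied by the caller. */
void mark_done(crypto_job_t *job, void *user)
{
    (void)job;
    *(int *)user = 1;
}
```

Between two calls to job_poll() the main loop is free to handle network traffic, which is what makes this pattern usable in a single-threaded environment.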

Improved Performance

We have identified the encoder/decoder component of today’s ANSI C based OPC UA implementation as one of the biggest performance bottlenecks. Even though it is faster than Java and C# based stacks, its full potential was not reached. With a complete redesign of the encoding procedure we gained a performance boost by a factor of 10 for the encoding process. This can lead to an overall performance boost of the OPC UA protocol of up to a factor of 4, depending on the type of data transferred.

Small Footprint

During the whole design process we kept a focus on a small footprint to make the software usable for embedded devices. A modular concept, configurable memory pools, and an efficient implementation make it well suited for the smallest devices and for the Internet of Things (IoT). On an ARM-based demonstration device running the Euros real-time operating system, we were able to integrate an OPC UA server in 300K of code size, including the operating system. A new table-based address space concept makes it possible to integrate huge address spaces with a fraction of the memory required in other SDKs. It also supports read-only address space models that reside completely in ROM.
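The source does not describe the table layout, so the following is only a sketch of how a read-only, table-based address space could be represented. The node_t structure, its fields, the numeric ids and find_node are illustrative assumptions. The table is declared const so that a linker can typically place it in read-only memory, and it is sorted by id so that lookups need no additional index in RAM.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node record (not the SDK's data model). */
typedef struct {
    uint32_t id;             /* numeric node id */
    uint32_t parent;         /* id of the parent node, 0 for none */
    const char *browse_name; /* string literal, also read-only */
} node_t;

/* Constant table, sorted by id. Being const, it can live in ROM or flash. */
static const node_t address_space[] = {
    { 84, 0,  "Root" },
    { 85, 84, "Objects" },
    { 86, 84, "Types" },
    { 87, 84, "Views" },
};

#define NODE_COUNT (sizeof address_space / sizeof address_space[0])

/* Comparison function for bsearch: key is a uint32_t id. */
static int compare_id(const void *key, const void *elem)
{
    uint32_t id = *(const uint32_t *)key;
    const node_t *node = elem;
    if (id < node->id)
        return -1;
    if (id > node->id)
        return 1;
    return 0;
}

/* Binary search over the constant table; returns NULL if id is unknown. */
const node_t *find_node(uint32_t id)
{
    return bsearch(&id, address_space, NODE_COUNT,
                   sizeof address_space[0], compare_id);
}
```

Compared with a heap-allocated node graph, a sorted constant table needs no per-node allocation at startup, which is one plausible way to keep RAM usage low on small devices.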

Software Quality

To ensure the best quality from the beginning, we developed a comprehensive test environment. Using this toolset, we are already able to achieve 98% line coverage and 95% branch coverage.