C++ Based OPC UA Client/Server SDK  1.5.5.355
Redundancy

OPC UA defines two main modes for server redundancy: non-transparent and transparent. The non-transparent mode has the submodes Cold, Warm, Hot, and HotPlusMirrored.

Introduction

Non-Transparent Redundancy

In the non-transparent case, all servers in the redundant set have their own Server URI and Endpoint URLs. Every server in the redundant set provides a list of the other redundant servers in the set (Server URI) and the failover mode (Cold, Warm, Hot, or HotPlusMirrored, see below). With this feature, a client only needs to know one of the servers and can find the other available servers using the information in the server object (Objects → Server → ServerRedundancy). The advantage of non-transparent redundancy is that it is easy to support on the server side. The disadvantage is that the client must handle the failover itself: it has to monitor the servers and switch to another one when needed. However, generic failover support can be implemented in a client without much effort using the information provided by the server.

Cold failover mode is where only one server can be active at a time. This may mean that redundant servers are unavailable (not powered up) or are available but not running (PC is running, but application is not started).

Warm failover mode is where the backup server(s) can be active, but cannot connect to actual data points (typically, a system where the underlying devices are limited to a single connection). Underlying devices, such as PLCs, may have limited resources that permit a single Server connection. Therefore, only a single server will be able to consume data. The ServiceLevel variable indicates the ability of the server to provide its data to the client.

Hot failover mode is where all servers are powered-on, and are up and running. In scenarios where servers acquire data from a downstream device, such as a PLC, one or more servers are actively connected to the downstream device(s) in parallel. These servers have minimal knowledge of the other servers in their group and are independently functioning. When a server fails or encounters a serious problem, its ServiceLevel drops. On recovery, the server returns to the redundant server set with an appropriate ServiceLevel to indicate that it is available.

HotPlusMirrored failover mode is for servers that mirror their internal state to all servers in the redundant server set, so that more than one server can be active and fully operational. The mirrored state minimally includes sessions, subscriptions, registered nodes, continuation points, sequence numbers, and sent notifications. The client should use the ServiceLevel variable to find the servers with the highest ServiceLevel in order to achieve load balancing.
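The selection step a client performs can be sketched as follows. This is a minimal, SDK-independent illustration: the `RedundantServer` struct and `selectBestServer` function are hypothetical names, and the sketch assumes the client has already read the ServiceLevel of every server in the redundant set. Servers reporting 0 or 1 (Maintenance or NoData, see the ServiceLevel section) are treated as not usable.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical client-side record, one entry per server in the redundant set.
struct RedundantServer {
    std::string serverUri;      // e.g. "urn:PC1:MyCompany:MyUaServer"
    std::uint8_t serviceLevel;  // last value read from the ServiceLevel variable
};

// Returns the server with the highest ServiceLevel, or nullptr if the list is
// empty or no server is at least in the Degraded subrange (ServiceLevel >= 2).
// If several servers share the highest value, the first one in the list wins.
const RedundantServer* selectBestServer(const std::vector<RedundantServer>& servers) {
    auto best = std::max_element(
        servers.begin(), servers.end(),
        [](const RedundantServer& a, const RedundantServer& b) {
            return a.serviceLevel < b.serviceLevel;
        });
    if (best == servers.end() || best->serviceLevel < 2) {
        return nullptr;
    }
    return &*best;
}
```

A real client would re-evaluate this selection whenever a monitored ServiceLevel value changes or the connection to the current server is lost.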

A server configuration example can be found below.

Transparent Redundancy

In the transparent case, all servers in the redundant set have the same Server URI and Endpoint URL. The servers are running in a cluster with the same IP address. A client cannot connect to a specific server, but it can request the list of available servers and their statuses in Objects → Server → ServerRedundancy. The advantage of transparent redundancy is that the client doesn’t need to know anything about redundancy to benefit from the redundant servers. The disadvantage is the higher effort on the server side.

The C++ based Server SDK supports both main modes for server redundancy, non-transparent and transparent. For transparent redundancy, an additional module is needed that is not part of the standard SDK delivery but available on request. For non-transparent redundancy, the submodes Cold, Warm, and Hot are supported out-of-the-box; HotPlusMirrored requires an additional module available on request. Please contact sales@unifiedautomation.com for more information.

Configuration for Non-Transparent Redundancy

This section describes the configuration steps to achieve non-transparent redundancy. It is shown only for an XML based configuration; the same settings can be made using an INI file instead.

Let’s assume MyUaServer is to be replaced with two redundant servers on two different computers and that the following ProductUri, ServerUri and ServerName were configured for the original server:

<ProductUri>urn:MyCompany:MyUaServer</ProductUri>
<ServerUri>urn:MyPC:MyCompany:MyUaServer</ServerUri>
<ServerName>MyUaServer@MyPC</ServerName>

Then the configuration for the server running on PC1 would be

<ProductUri>urn:MyCompany:MyUaServer</ProductUri>
<ServerUri>urn:PC1:MyCompany:MyUaServer</ServerUri>
<ServerName>MyUaServer@PC1</ServerName>

and for the server on PC2:

<ProductUri>urn:MyCompany:MyUaServer</ProductUri>
<ServerUri>urn:PC2:MyCompany:MyUaServer</ServerUri>
<ServerName>MyUaServer@PC2</ServerName>

We need to configure endpoints for each server:

Server on PC1:

<Url>opc.tcp://PC1:48011</Url>

Server on PC2:

<Url>opc.tcp://PC2:48011</Url>

Finally it is necessary to configure the redundancy information that each server provides in the Server object (see above). The RedundancySettings are identical for both servers in the redundant set:

<RedundancySettings>
<RedundancySupport>Hot</RedundancySupport>
<ServerUri>urn:PC1:MyCompany:MyUaServer</ServerUri>
<ServerUri>urn:PC2:MyCompany:MyUaServer</ServerUri>
</RedundancySettings>

For each server, the other one has to be listed at AdditionalServerEntries, i.e. the server on PC2 has to be listed in the configuration of the server on PC1:

<AdditionalServerEntries>
<ApplicationDescription>
<ApplicationUri>urn:PC2:MyCompany:MyUaServer</ApplicationUri>
<ProductUri>urn:MyCompany:MyUaServer</ProductUri>
<ApplicationName>MyUaServer@PC2</ApplicationName>
<ApplicationType>Server</ApplicationType>
<DiscoveryUrl>opc.tcp://PC2:48011</DiscoveryUrl>
</ApplicationDescription>
</AdditionalServerEntries>

and the server on PC1 has to be listed in the configuration of the server on PC2:

<AdditionalServerEntries>
<ApplicationDescription>
<ApplicationUri>urn:PC1:MyCompany:MyUaServer</ApplicationUri>
<ProductUri>urn:MyCompany:MyUaServer</ProductUri>
<ApplicationName>MyUaServer@PC1</ApplicationName>
<ApplicationType>Server</ApplicationType>
<DiscoveryUrl>opc.tcp://PC1:48011</DiscoveryUrl>
</ApplicationDescription>
</AdditionalServerEntries>
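The naming pattern used in this example configuration can be summarized in a few lines of code. The following sketch is not part of the SDK; `ServerIdentity`, `makeIdentity` and `additionalServerEntriesFor` are hypothetical helper names. It only illustrates how the per-host values above are derived from the host name and which peers each server has to list under AdditionalServerEntries.

```cpp
#include <string>
#include <vector>

// Hypothetical holder for the per-host values used in the example above.
struct ServerIdentity {
    std::string serverUri;    // <ServerUri> / <ApplicationUri>
    std::string serverName;   // <ServerName> / <ApplicationName>
    std::string endpointUrl;  // <Url> / <DiscoveryUrl>
};

// Builds the values for one host following the pattern of the example
// (company name, product name and port 48011 are taken from the example).
ServerIdentity makeIdentity(const std::string& host) {
    return ServerIdentity{
        "urn:" + host + ":MyCompany:MyUaServer",
        "MyUaServer@" + host,
        "opc.tcp://" + host + ":48011"};
}

// Returns the entries a server on host `self` has to list as
// AdditionalServerEntries: every server in the set except itself.
std::vector<ServerIdentity> additionalServerEntriesFor(
        const std::string& self, const std::vector<std::string>& hosts) {
    std::vector<ServerIdentity> peers;
    for (const std::string& host : hosts) {
        if (host != self) {
            peers.push_back(makeIdentity(host));
        }
    }
    return peers;
}
```

For the two hosts PC1 and PC2, the server on PC1 gets exactly one entry describing the server on PC2, and vice versa.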

ServiceLevel

The service level of the server defines the redundancy switch over behavior of the clients. Therefore you must set the service level (or server state) of the server depending on the data and service quality of the server.

The ServiceLevel provides information to a client regarding the health of a server and its ability to provide data. It is a byte with a range of 0 to 255, where the values fall into the subranges shown in the table below.

The algorithm used by a server to determine its ServiceLevel within each subrange is server specific. However, all servers in a redundant server set shall use the same algorithm to determine the ServiceLevel. All servers, regardless of redundant server set membership, shall adhere to the subranges below.

Subrange | Name | Description
0-0 | Maintenance | The failed server is in maintenance subrange. Therefore, new clients shall not connect and currently connected clients shall disconnect. The server should expose a target time at which the clients are able to reconnect. See Variable EstimatedReturnTime for additional information.
1-1 | NoData | The failed server is not operational. Therefore, a client is not able to exchange any information with it. The server most likely has no data other than ServiceLevel, ServerStatus and diagnostic information available.
2-199 | Degraded | The server is partially operational, but is experiencing problems such that portions of the address space are out of service or unavailable. An example usage of this ServiceLevel subrange would be if three of ten devices connected to a server are unavailable.
200-255 | Healthy | The server is fully operational. Therefore, a client can obtain all information from this server. The subrange allows a server to provide information that can be used by clients to load balance. An example usage of this ServiceLevel subrange would be to reflect the server’s CPU load where data is delivered as expected.

Servers in the Healthy ServiceLevel subrange are able to deliver information in a timely manner. The ServiceLevel may change for internal server reasons, or it may be used for load balancing.
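The subranges from the table above can be expressed as a small classification function, for example on the client side when deciding whether to fail over. This is an SDK-independent sketch; the enum `ServiceLevelRange` and the function `classifyServiceLevel` are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// Subranges of the ServiceLevel byte as listed in the table above.
enum class ServiceLevelRange { Maintenance, NoData, Degraded, Healthy };

// Maps a ServiceLevel value (0 to 255) to its subrange.
ServiceLevelRange classifyServiceLevel(std::uint8_t level) {
    if (level == 0) {
        return ServiceLevelRange::Maintenance;  // 0-0
    }
    if (level == 1) {
        return ServiceLevelRange::NoData;       // 1-1
    }
    if (level <= 199) {
        return ServiceLevelRange::Degraded;     // 2-199
    }
    return ServiceLevelRange::Healthy;          // 200-255
}
```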

The following fragment shows the first step for setting the service level or server state, obtaining the root node manager from the server manager:

NodeManagerRoot* pNMRoot = (NodeManagerRoot*)m_pServerManager->getNodeManagerRoot();

The service level can be calculated either by a local algorithm that relies only on information available within the server, or by an algorithm that requires information exchange between the redundant servers.
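A local algorithm of the first kind could look like the following sketch. It is only one possible mapping and an assumption of this example, not the SDK's algorithm: the function name `computeServiceLevel`, its parameters, and the linear scaling inside each subrange are hypothetical. It uses the two examples from the table, the share of available devices for the Degraded subrange and the CPU load for the Healthy subrange. The result would then be passed to the server through the root node manager.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical local algorithm: derives a ServiceLevel from the number of
// reachable devices and the current CPU load (0 to 100 percent).
std::uint8_t computeServiceLevel(bool inMaintenance,
                                 unsigned connectedDevices,
                                 unsigned totalDevices,
                                 unsigned cpuLoadPercent) {
    if (inMaintenance) {
        return 0;  // Maintenance
    }
    if (totalDevices == 0 || connectedDevices == 0) {
        return 1;  // NoData: nothing to deliver
    }
    if (connectedDevices < totalDevices) {
        // Degraded: scale the share of reachable devices into 2..198.
        return static_cast<std::uint8_t>(2 + (197u * connectedDevices) / totalDevices);
    }
    // Healthy: 255 at no load, down to 200 at full load.
    const unsigned load = std::min(cpuLoadPercent, 100u);
    return static_cast<std::uint8_t>(255u - (55u * load) / 100u);
}
```

Because all servers in a redundant set have to use the same algorithm, such a function would be shared by every server in the set.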