PRTG Manual: Failover Cluster Configuration
PRTG offers single failover clustering in all licenses—even in the freeware edition. A single failover cluster consists of two servers (Current Master node and Failover node), each of them running one installation of PRTG. They are connected to each other and exchange configuration and monitoring data.
This feature is not available in PRTG hosted by Paessler.
To set up a cluster, you need two or more servers. Each of them requires one PRTG core server installation, with different settings configured for each type of node. In return, you benefit from seamless, highly available monitoring with automatic failover and multi-location monitoring.
In a cluster, you can run:
- 1 Master Node
On the master node, you set up your devices and configuration. Notifications, reporting, and many other things are also handled by the master node.
- Up to 4 Failover Nodes
You can install one, two, three, or four additional nodes for fail-safe, gapless monitoring. For more than one failover node, you need additional licenses. Each of these nodes can monitor the devices in your network independently, collecting their own monitoring data. You can review the data in a summarized way that enables you to compare monitoring data from different nodes.
During an outage of one node, you will see data gaps for the time of the outage on that node. However, data for that time span will still be available on all other cluster nodes.
Configuring a cluster with one failover node is the most common way to set up seamless network monitoring with PRTG. You need two servers that run Windows 7 or later. Your servers can be real hardware (strongly recommended) or virtual machines.
For details, see section Detailed System Requirements.
Please consider the following notes about PRTG clustering.
- Your servers must be up and running.
- Your servers must be similar in system performance and speed (for example, CPU and RAM).
- In a cluster setup, each of the cluster nodes will individually monitor the devices added to the Cluster Probe. This means that monitoring load will increase with every cluster node. Make sure your devices and network can handle these additional requests. Often, a larger scanning interval for your entire monitoring setup is a good idea. For example, set a scanning interval of 5 minutes in the Root Group Settings.
- We recommend that you install PRTG on dedicated, real hardware systems for best performance.
- Please bear in mind that a server running a cluster node may in rare cases be rebooted automatically without notice (for example, because of special software updates).
- Both servers must be visible to each other over the network.
- Communication between the two servers must be possible in both directions. Make sure that no software or hardware firewall blocks communication. All communication between nodes in the cluster is directed through one specific TCP port. You will define it during cluster setup (by default, it is TCP port 23570).
- Email notifications for failover: The Failover Master will send notifications if the Primary Master is not connected to the cluster. To ensure that PRTG can deliver emails in this case, configure the Notification Delivery settings so that PRTG can use them to deliver emails from your failover node as well. For example, use the option to set up a secondary Simple Mail Transfer Protocol (SMTP) email server. This fallback server must be available for the failover master so that it can send emails over it independently from the first email server.
- Make your servers safe! From every cluster node, there is full access to all stored credentials as well as other configuration data and the monitoring results of the cluster. Also, PRTG software updates can be deployed through every node. So, make sure you take precautions against attacks, for example by intruders or Trojans. Secure every failover node server as carefully as the master node server.
- Run all nodes in your cluster either on 32-bit Windows versions only or on 64-bit Windows versions only. Avoid mixing 32-bit and 64-bit versions in the same cluster, as this configuration is not supported and may result in an unstable system. In a mixed cluster, ZIP compression for the cluster communication is also disabled, so you may encounter higher network traffic between your cluster nodes.
- If you run cluster nodes on Windows systems with different timezone settings and use Schedules to pause monitoring of defined sensors, schedules will apply at the local time of each node. Because of this, the overall status of a particular sensor will be shown as Paused every time the schedule matches a node's local system time. Use the same timezone setting on each Windows system with a cluster node to avoid this behavior.
- The password for the PRTG System Administrator login to PRTG is not automatically synchronized on cluster nodes. You need to manually change it on each node. On a failover node, open the PRTG Administration Tool and change the password on the Administrator tab. Click Save & Close to save your new password.
- Stay below 2,500 sensors per cluster for best performance in a single failover cluster. Clusters with more than 5,000 sensors are not officially supported. For each additional failover node, divide the number of sensors by two.
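Two of the notes above lend themselves to a quick check before you set up a cluster: the cluster port must be reachable in both directions, and the sensor guideline halves with each additional failover node. The following Python sketch illustrates both. It is not part of PRTG. The function names are hypothetical, and the sensor calculation assumes that the halving is cumulative, which is one reading of the rule above.

```python
import socket

# Figures taken from the notes above.
DEFAULT_CLUSTER_PORT = 23570
SINGLE_FAILOVER_SENSOR_GUIDELINE = 2500


def port_reachable(host: str, port: int = DEFAULT_CLUSTER_PORT,
                   timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def recommended_max_sensors(failover_nodes: int) -> int:
    """Halve the single-failover guideline once per additional failover node.

    Assumption: the halving is cumulative (1 node: 2500, 2 nodes: 1250, ...).
    """
    if not 1 <= failover_nodes <= 4:
        raise ValueError("a cluster has between 1 and 4 failover nodes")
    return SINGLE_FAILOVER_SENSOR_GUIDELINE // 2 ** (failover_nodes - 1)
```

Because communication must be possible in both directions, you would run `port_reachable` once on each server, pointing at the other server. The check only succeeds while the other node is already listening on the cluster port.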
In cluster mode, you cannot use sensors that wait for data to be received. Because of this, you can use the following sensor types only on a local or remote probe:
- HTTP Push Count
- HTTP Push Data
- HTTP Push Data Advanced
- IPFIX and IPFIX (Custom)
- jFlow V5 and jFlow V5 (Custom)
- NetFlow V5 and NetFlow V5 (Custom)
- NetFlow V9 and NetFlow V9 (Custom)
- Packet Sniffer and Packet Sniffer (Custom)
- sFlow and sFlow (Custom)
- SNMP Trap Receiver
- Syslog Receiver
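If you plan a larger setup, it can help to sort the intended sensors by where they can run. The following Python sketch is only an illustration and does not use any PRTG interface. It assumes that sensor types are given as plain strings spelled exactly as in the list above, and `Ping` merely stands in for any sensor type that is not on the list.

```python
# Sensor types from the list above that have a "(Custom)" variant.
_WITH_CUSTOM_VARIANT = [
    "IPFIX", "jFlow V5", "NetFlow V5", "NetFlow V9", "Packet Sniffer", "sFlow",
]
# Sensor types from the list above without a "(Custom)" variant.
_WITHOUT_CUSTOM_VARIANT = [
    "HTTP Push Count", "HTTP Push Data", "HTTP Push Data Advanced",
    "SNMP Trap Receiver", "Syslog Receiver",
]

NOT_ON_CLUSTER_PROBE = frozenset(
    _WITHOUT_CUSTOM_VARIANT
    + _WITH_CUSTOM_VARIANT
    + [f"{name} (Custom)" for name in _WITH_CUSTOM_VARIANT]
)


def allowed_on_cluster_probe(sensor_type: str) -> bool:
    """Return False for sensor types that wait for data to be received."""
    return sensor_type.strip() not in NOT_ON_CLUSTER_PROBE


def split_plan(planned: list[str]) -> tuple[list[str], list[str]]:
    """Split planned sensor types into (cluster probe, local or remote probe)."""
    cluster = [s for s in planned if allowed_on_cluster_probe(s)]
    other = [s for s in planned if not allowed_on_cluster_probe(s)]
    return cluster, other


if __name__ == "__main__":
    print(split_plan(["Ping", "Syslog Receiver", "sFlow (Custom)"]))
```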
PRTG provides cluster support for remote probes. This means that all your probes can connect to all your cluster nodes, the primary master node as well as the failover node. Because of this, you can still see the monitoring data of remote probes, including sensor warnings and errors, even when your master node fails.
Please consider the following notes about PRTG clustering with remote probes.
- You have to allow remote probe connections to your failover nodes. To do so, log in to each server in your cluster and open the PRTG Administration Tool. On the Core Server tab, configure each cluster node to accept connections from remote probes.
- If you use remote probes outside your local network: You have to use IP addresses or DNS names for your cluster nodes that are valid both for the cluster nodes to reach each other and for remote probes to reach all cluster nodes individually. Open the Cluster settings under System Administration and adjust the entries for cluster nodes accordingly so that these addresses are reachable from the outside. New remote probes try to connect to these addresses but cannot reach cluster nodes that use private addresses.
- If you use Network Address Translation (NAT) with remote probes outside this NAT: You have to use IP addresses or DNS names for your cluster nodes that are reachable from the outside. If your cluster nodes are inside the NAT and the cluster configuration only contains internal addresses, your remote probes from outside the NAT will not be able to connect. The PRTG core server must be reachable under the same address for both other cluster nodes and remote probes.
- When starting, a remote probe connects only to the PRTG core server with the defined IP address. This PRTG server must be the Primary Master!
- Initially, existing remote probes are not visible on failover nodes. You need to set their Cluster Connectivity first in the Administrative Probe Settings to be visible and working with all cluster nodes. Choose the option Probe sends data to all cluster nodes for each remote probe that you want to connect to all cluster nodes.
- Newly connected remote probes are visible and working with all cluster nodes immediately after you have acknowledged the probe connection. The connectivity setting Probe sends data to all cluster nodes is default for new probes.
- As soon as a probe is activated for all cluster nodes, it connects automatically to the correct IP addresses and ports of all cluster nodes.
- Once a remote probe has connection data from the Primary Master, it can connect to all remaining cluster nodes even when the Primary Master fails.
- Changes to connection settings of cluster nodes are automatically sent to your remote probes.
- If a PRTG server (which is a cluster node) in your cluster is currently not running, your probes will deliver monitoring data after the restart of this server. This happens individually for each PRTG server in your cluster.
- If you enable cluster connectivity for a probe, it will not deliver monitoring data from the past when cluster connectivity was disabled. For sensors using difference values, the difference between the current value and the last value is shown with the first new measurement (if the respective sensor previously sent values to the PRTG server).
- Except for this special case, all PRTG servers show the same values of sensors on devices you add to the Cluster Probe.
- The responsible PRTG server for the configuration and management of a remote probe is always the master that is currently active. This means that all tasks of the PRTG core server are only executed by the current master. If you use a split cluster with several master nodes, only the master that appears first in the cluster configuration is responsible.
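The notes above on remote probes outside your local network or outside a NAT come down to one check: no cluster node entry may be a private address. The following Python sketch shows one way to spot such entries. It is an illustration only and does not read the PRTG configuration. The function name is hypothetical, and the entries are assumed to be typed in by hand as strings. DNS names are skipped because the sketch cannot tell offline how they resolve from the outside.

```python
import ipaddress


def private_node_entries(entries: list[str]) -> list[str]:
    """Return the entries that are IP addresses from private ranges."""
    flagged = []
    for entry in entries:
        try:
            address = ipaddress.ip_address(entry.strip())
        except ValueError:
            # Not an IP literal, so presumably a DNS name: cannot be judged here.
            continue
        if address.is_private:
            flagged.append(entry)
    return flagged


if __name__ == "__main__":
    # Example values only; replace them with your own cluster node entries.
    nodes = ["10.0.0.5", "8.8.8.8", "prtg.example.com", "192.168.1.20"]
    print(private_node_entries(nodes))
```

Every entry that this check returns would have to be replaced with an address or DNS name that remote probes can reach from the outside and that the other cluster nodes can reach as well.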
You can use remote probes in a cluster as described above, that is, you can show the monitoring data of all your probes on all nodes in your cluster. However, you cannot cluster a remote probe itself. To ensure gapless monitoring for a specific remote probe, install a second remote probe on a machine in your network next to the existing probe, and create all devices and sensors of the original probe on it. For example, you can clone the devices from the original probe. The second probe is then a copy of the first probe, and you can still monitor the desired devices if the original probe fails.
Probes that send data to all cluster nodes result in increased bandwidth usage. Choose the option Probe sends data only to primary master node in the Administrative Probe Settings for one or more remote probes to lower bandwidth usage if necessary.
Please explicitly check on each cluster node if a remote probe is connected. PRTG does not notify you if a remote probe is disconnected from a node in the cluster. For example, log in to the PRTG web interface on a cluster node and check in the device tree if your remote probes are connected.
Ready to get started? Go to section Failover Cluster Step by Step!
Knowledge Base: What's the Clustering Feature in PRTG?
Knowledge Base: What are the bandwidth requirements for running a PRTG Cluster?
Knowledge Base: What is a Failover Master and how does it behave?
Knowledge Base: I need help with my PRTG cluster configuration. Where do I find step-by-step instructions?
Knowledge Base: PRTG Cluster: How do I convert a (temporary) Failover Master node to be the Primary Master node?
Paessler Blog: Cluster Support for Remote Probes: Failover Nodes Show Remote Probe Data