PRTG Manual: Clustering
A PRTG Cluster consists of two or more installations of PRTG that work together to form a high-availability monitoring system. The objective is to reach true 100% uptime for the monitoring tool. With clustering, uptime is no longer degraded by an internet outage at a PRTG server's location, by failing hardware, or by downtime caused by a software update for the operating system or for PRTG itself.
How a PRTG Cluster Works
A PRTG cluster consists of one Primary Master Node and one or more Failover Nodes. Each node is a full installation of PRTG that could perform all monitoring and alerting on its own. Nodes are connected to each other using two TCP/IP connections. They communicate in both directions, and a single node only needs to connect to one other node to integrate into the cluster.
During normal operation, you use the Primary Master to configure devices and sensors (using the web interface or Enterprise Console). The master automatically distributes the configuration to all other nodes in real time. All nodes permanently monitor the network according to this common configuration, and each node stores its results in its own database. This way, the storage of monitoring results is also distributed across the cluster. The downside of this concept is that monitoring traffic and load on the network are multiplied by the number of cluster nodes, but this is not a problem for most usage scenarios. You can review the monitoring results by logging in to the web interface of any cluster node in read-only mode. Because the monitoring configuration is managed centrally, however, it can only be changed on the master node.
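The load multiplication mentioned above is simple arithmetic. The following minimal sketch is illustrative only and is not part of PRTG: the function name is hypothetical, and it assumes the simplification that every sensor causes exactly one monitoring request per scanning interval on each node.

```python
def monitoring_requests_per_interval(sensor_count: int, node_count: int) -> int:
    """Estimate how many monitoring requests hit the network per scanning interval.

    Assumption (not from the manual): one request per sensor per node.
    Because every node monitors every sensor, the total scales linearly
    with the number of cluster nodes.
    """
    if sensor_count < 0:
        raise ValueError("sensor_count must not be negative")
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return sensor_count * node_count


if __name__ == "__main__":
    # Compare a single installation with two- and three-node clusters.
    for nodes in (1, 2, 3):
        total = monitoring_requests_per_interval(1000, nodes)
        print(f"{nodes} node(s): {total} requests per interval")
```

Under this assumption, a setup with 1,000 sensors produces twice the monitoring traffic on a two-node cluster as on a single installation.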
By default, all devices created on the Cluster Probe are monitored by all nodes in the cluster, so data from different perspectives is available and monitoring for these devices always continues, even if one of the nodes fails. If the Primary Master fails, one of the Failover Nodes takes over the master role and controls the cluster until the master node is back. This ensures fail-safe monitoring with gapless data across the cluster.
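The takeover behavior can be modeled in a few lines. This is a conceptual sketch, not PRTG code: the `Node` class and `current_master` function are hypothetical, and the rule for choosing which Failover Node takes over (the first reachable one in list order) is an assumption, because the text only states that one of them takes over.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    name: str
    is_primary: bool = False
    is_up: bool = True


def current_master(nodes: Sequence[Node]) -> Optional[Node]:
    """Return the node that currently controls the cluster.

    The Primary Master is in control whenever it is up. Otherwise the
    first reachable Failover Node takes over (ordering is an assumption).
    Returns None if no node is up at all.
    """
    primary = next((n for n in nodes if n.is_primary), None)
    if primary is not None and primary.is_up:
        return primary
    return next((n for n in nodes if not n.is_primary and n.is_up), None)


if __name__ == "__main__":
    cluster = [Node("primary", is_primary=True), Node("failover-1"), Node("failover-2")]
    print(current_master(cluster).name)   # primary is up and in control
    cluster[0].is_up = False              # simulate an outage of the Primary Master
    print(current_master(cluster).name)   # a Failover Node has taken over
    cluster[0].is_up = True               # the Primary Master is back
    print(current_master(cluster).name)
```

The model mirrors the description above: the master role returns to the Primary Master as soon as it is available again.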
If you use remote probes in a cluster, each probe connects to every node of your cluster and sends its data to all of them: the current primary master as well as the failover nodes. You can define the Cluster Connectivity of each probe in the Probe Administrative Settings.
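To make the resulting connection pattern concrete, the sketch below enumerates which probe delivers data to which node. It is illustrative only: the function and the probe and node names are hypothetical, and it assumes every probe is configured to connect to all nodes.

```python
from typing import List, Sequence, Tuple


def probe_deliveries(probes: Sequence[str], nodes: Sequence[str]) -> List[Tuple[str, str]]:
    """List every (probe, node) pair over which monitoring data is sent.

    Each remote probe sends its data to every cluster node, so the number
    of connections is len(probes) * len(nodes).
    """
    return [(probe, node) for probe in probes for node in nodes]


if __name__ == "__main__":
    pairs = probe_deliveries(["probe-a", "probe-b"], ["primary", "failover-1"])
    for probe, node in pairs:
        print(f"{probe} -> {node}")
```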
While a node is down, it cannot collect monitoring data, so the data of this single node will show gaps. However, monitoring data for this time span is still available on the other node(s). There is no functionality to fill these gaps with data from other nodes.
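The following sketch models per-node histories to show what such a gap looks like and where the missing time span can still be reviewed. It is a conceptual model with hypothetical names and made-up sample values, not PRTG's storage format. It assumes all histories cover the same time slots, and it deliberately only reports which other nodes hold data without merging anything.

```python
from typing import Dict, List, Optional

History = List[Optional[float]]  # one value per time slot; None marks a gap


def gap_slots(history: History) -> List[int]:
    """Return the indexes of the time slots in which a node recorded no data."""
    return [slot for slot, value in enumerate(history) if value is None]


def nodes_covering(slot: int, histories: Dict[str, History], exclude: str) -> List[str]:
    """Return the names of the other nodes that hold data for the given slot."""
    return sorted(
        name
        for name, history in histories.items()
        if name != exclude and history[slot] is not None
    )


if __name__ == "__main__":
    # Hypothetical response times; "failover-1" was down during slots 1 and 2.
    histories = {
        "primary": [12.0, 11.5, 13.1, 12.4],
        "failover-1": [14.2, None, None, 13.9],
    }
    for slot in gap_slots(histories["failover-1"]):
        print(slot, nodes_covering(slot, histories, exclude="failover-1"))
```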
If one or more nodes discover downtimes or threshold breaches, only one installation, either the Primary Master or the Failover Master, sends out notifications (via email, SMS text message, etc.). Thus, the administrator is not flooded with notifications from all cluster nodes when failures occur.
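This rule can be expressed as a small decision function. The sketch is illustrative and does not reflect PRTG internals: the function name and node names are hypothetical, and it assumes the cluster already knows which node currently holds the master role.

```python
from typing import Dict, List


def notification_senders(detections: Dict[str, bool], master: str) -> List[str]:
    """Return the nodes that send a notification for one failure.

    `detections` maps each node name to whether that node discovered the
    downtime or threshold breach. No matter how many nodes discovered it,
    only the node currently acting as master sends the notification.
    """
    if any(detections.values()):
        return [master]
    return []


if __name__ == "__main__":
    # All three nodes see the failure, but only the master notifies.
    seen_by = {"primary": True, "failover-1": True, "failover-2": True}
    print(notification_senders(seen_by, master="primary"))
```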
Clusters with more than 5,000 sensors are not supported.
For detailed information, see Failover Cluster Configuration.
Knowledge Base: What's the Clustering Feature in PRTG?
Video Tutorial: Cluster in PRTG - This is how it works
Video Tutorial: How to set up a PRTG cluster