Visibility provided by Paessler PRTG drops customer support tickets to near zero at Total Uptime
About Total Uptime
North Carolina-based Total Uptime provides high availability cloud services for a worldwide roster of corporate clients. Its Network-as-a-Service offerings spanning Managed Cloud DNS, Cloud Load Balancing and Web Application Firewall services were built from the ground up with multi-datacenter at the core.
Forever is a very, very long time. So, when Total Uptime Technologies pledges 100% availability for its global cloud platform with a “24xForever” promise, customers take the commitment seriously. And Total Uptime delivers, using a multi-datacenter, multi-country, even multi-continent strategy.
"Our network presence covers 32 cities in 17 countries," notes Jonathan Hoppe, Co-Founder. "As a result, we have a substantial number of network devices, links and compute assets we need to keep an eye on."
"Our infrastructure cannot go down. Period. And if it does go down, it must be detected and resolved immediately. At this point we couldn't live without PRTG."
Jonathan Hoppe, Co-Founder, Total Uptime Technologies
‘Single pane of glass’
To ensure not only availability but also peak performance for its global cloud infrastructure, Total Uptime places the highest priority on its monitoring services. At the core of its monitoring is Paessler PRTG Network Monitor.
"Many years ago we researched several options for monitoring, including Orion and Nimsoft," recalled Hoppe. "We wanted one monitoring tool that could do it all. Multiple tools are a mess, so we were looking for something with the ability to accurately probe Windows and BSD machines as well as network devices and more. The goal was to gather and display data in a 'single pane of glass'."
Total Uptime uses two external service providers for third-party information, but it knew that wasn't sufficient for the level of awareness it needed. "Very few external providers offer probes that can be installed on servers, nor do they offer SNMP, which can be used to gather core info essential to trending and anticipating future issues," Hoppe said. "They provide us alerts when a device is not reachable externally, which is good to know, but there's not enough detail."
Most of all, however, Total Uptime sought real-time insight about its cloud infrastructure. "We wanted to be able to trend and anticipate outages or performance bottlenecks before they escalated to actual outages or downtime," explained Hoppe. "Even more importantly, we wanted a new vantage point that would let us mitigate before external systems detect and customers are impacted."
A multi-dimensional solution
After conducting a trial of alternatives, Total Uptime selected PRTG based on its flexibility, its presentation features and powerful alert capabilities, and its competitive price. It soon began using the software to oversee systems status and performance across its networks on five continents.
"We've installed PRTG on Dell R710s with RAID 1 250 GB SSD drives, dual quad-core Xeon 5620 2.40 GHz CPUs, and 16 GB of RAM. We have one PRTG server as a primary at our Toronto datacenter, and the secondary/backup PRTG server at our San Jose, CA datacenter," noted Hoppe. "Primarily we're monitoring Windows and BSD servers, although we also monitor switches, routers, firewalls, edge appliances and other networking hardware."
Because global communications are so important to Total Uptime's performance, data line availability is a constant focus.
The company monitors GigE and 10Gig Ethernet connections on all of its switches, routers and external-facing devices. It graphs the bandwidth from its Cisco routers and firewalls, Force10 switches, and open source routers into its NOC in order to predict usage trends and detect attack traffic. It also monitors mail servers, and uses the SNMP feature to monitor bandwidth and traffic for a variety of virtual interfaces on routers and switches.
Another high priority, according to Hoppe, is the company's DNS infrastructure. "We aggressively use PRTG's DNS test feature to make sure our servers are resolving at all times from all test points. We created a custom script on our BSD DNS servers that we poll with PRTG every 60 seconds. The custom script provides a DNS traffic count with various parameters including DNS record types, which we also graph."
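The case study does not publish the custom script itself. As a rough illustration only, the sketch below shows one way a per-record-type DNS query count could be produced for a monitoring tool to poll. The BIND-style log line shape, the function names and the "total=N TYPE=n" output format are all assumptions made for this example, not details from Total Uptime or PRTG.

```python
import re
from collections import Counter

# Assumption: log lines look like a BIND-style query log entry.
# The source does not say which DNS server software is in use.
QUERY_RE = re.compile(r"query: \S+ IN (\w+)")

def count_record_types(lines):
    """Tally DNS queries by record type from an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:  # lines that are not query entries are ignored
            counts[match.group(1)] += 1
    return counts

def format_counts(counts):
    """Render counts as 'total=N TYPE=n ...' (hypothetical output format)."""
    parts = [f"total={sum(counts.values())}"]
    parts += [f"{rtype}={n}" for rtype, n in sorted(counts.items())]
    return " ".join(parts)

# Made-up sample lines standing in for one polling interval of traffic.
sample = [
    "client 192.0.2.10#53211 (example.com): query: example.com IN A +",
    "client 192.0.2.11#40112 (example.com): query: example.com IN AAAA +",
    "client 192.0.2.10#53212 (mail.example.com): query: mail.example.com IN MX +",
    "client 192.0.2.12#61000 (example.com): query: example.com IN A +",
    "unrelated log line",
]
summary = format_counts(count_record_types(sample))
print(summary)
```

In practice such a script would read only the log entries written since the previous poll, so that each 60-second sample reflects traffic for that interval rather than a running total.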
Although the company uses PRTG to monitor internal switches, its main purpose is to watch its external networks and related components. It has nearly 500 sensors set up currently, with plans to expand to 750 in the near future. It also uses the PRTG cluster at its Toronto and San Jose locations, and hopes to expand its use to a Windows virtual machine in either Amsterdam or London.
This infrastructure, as well as information from remote probes at all 32 of Total Uptime's datacenters (Sydney, Auckland, Hong Kong, Tokyo, Singapore, Philippines, Mumbai, Johannesburg, London, Portsmouth UK, Frankfurt, Amsterdam, Milan, Madrid, and Sao Paulo, as well as 17 in North America) is fed to the company's North Carolina NOC.
"We use the PRTG web interface to display maps in full screen on our NOC display wall," stated Hoppe. "Many monitoring products are deficient in their dashboard and mapping features, but with PRTG we've been able to display a lot of information that others can't."
Customer tickets eliminated
According to Hoppe, PRTG performs on four important levels. "First, PRTG gives us the insight we need, telling us what is happening across our global network at any given moment. Second, the improved insight has increased our availability. We can now spot trends and detect issues, incidents, and events before they happen, and for those that do, we receive rapid alerts so our NOC team can respond quickly to mitigate customer impact. Third, our response time has improved, since the constantly updated information allows us to detect issues sooner. And fourth is our SLA performance. We provide aggressive service level agreements to our customers; in fact, after as little as five minutes of cumulative downtime, our agreements require us to begin compensating customers with refunds of monthly fees. With PRTG we have a way to not only deliver on our promises, but also prove our performance."
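To put the five-minute figure in perspective, the short sketch below works through the arithmetic. It assumes the downtime is counted per 30-day month and that the refund obligation starts at exactly five cumulative minutes. Neither detail is spelled out in the case study, and the function names are illustrative only.

```python
def availability_percent(downtime_minutes, days_in_month=30):
    """Availability over one month, as a percentage."""
    total_minutes = days_in_month * 24 * 60  # 43,200 minutes in 30 days
    return 100.0 * (1 - downtime_minutes / total_minutes)

def sla_breached(downtime_minutes, threshold_minutes=5):
    """True once cumulative downtime reaches the assumed threshold."""
    return downtime_minutes >= threshold_minutes

# Three hypothetical short outages in one month, in minutes.
outages = [2, 1.5, 2]
cumulative = sum(outages)

print(round(availability_percent(5), 4))  # availability at the threshold
print(sla_breached(cumulative))           # 5.5 minutes of cumulative downtime
```

Under these assumptions, five minutes of downtime in a 30-day month still corresponds to roughly 99.988% availability, which shows how little margin such an agreement leaves.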
Since its installation, PRTG has become an indispensable part of Total Uptime Technologies' "24xForever Promise." Hoppe notes that while visibility and insight are critical parts of the PRTG value proposition, the platform has had a positive effect on customer service as well.
"PRTG has completely eliminated support calls and tickets due to previously undetected bottlenecks or outages, and has reduced support calls and tickets for network performance issues by more than 90%," he reports. "PRTG also lets us trend network growth requirements far more accurately than through manual check methods. Now we can add infrastructure in the right location at the right time in order to provide sufficient headroom on our platform."
Total Uptime's goal is not just to maintain, but to increase, its customers' cloud network availability, and PRTG is doing the job. "Our infrastructure cannot go down. Period. And if it does go down, it must be detected and resolved immediately," says Hoppe. "At this point we couldn't live without PRTG."