One of the most common (and frustrating!) questions a sysadmin needs to answer is: who is hogging all my bandwidth?

The network is slow, users are complaining, and your internet connection is at 100% (again...).  You need to figure out who or what is hogging all the bandwidth, and you need to do it fast. In this article, I'll explain the different methods that are available in different situations, and how to use SNMP, RMON, flow protocols and packet sniffing to track down the culprits.

The options that are available will depend very much on the hardware you're using, and how much management access you have to that hardware:

1) What type of hardware do you have?

Enterprise-grade hardware offers many more possibilities than SOHO or consumer-grade hardware.  Port mirroring, for example, is rarely supported by consumer-grade equipment.

2) What hardware vendor(s) do you have?    

Some of the most useful protocols, such as NetFlow, aren't supported by all vendors.  So, your ability to monitor bandwidth based on flows is limited to those vendors and models that support flow protocols.

3) How much management access do you have?

You need administrative access to the router or switch to enable SNMP or to mirror traffic to an additional port.  In a corporate environment, the network administrators will have full access to their own equipment, but only limited access to provider equipment.  In a home environment, most customers will have no management access to their ISP router. If you don't have management access to the equipment, your options for monitoring are limited, and you may need to rely on reporting from the ISP.

4) How many free ports do you have?

Port mirroring (for sniffing) requires an unused port on your switch. If you're already actively using all the ports, you'll need to disconnect something first (not an ideal plan), or you won't be able to mirror traffic to a sniffer.

So, let's look at the steps in detail:

1) Look at statistics on your router, switch or firewall

If your hardware supports it, one of the first places to look is at the device itself.  Many devices include detailed traffic statistics as part of their user interface. If you're lucky, your device will report which ports have the most traffic on them, and what IP addresses or protocols are causing this traffic.

This requires that you have enough management access to the router to be able to view the statistics, and that the router provides these statistics. If you don't have management access, you can try asking your ISP to generate a report for you.

2) SNMP

The next line of attack is SNMP, the Simple Network Management Protocol.  There are standard SNMP metrics to measure the amount of traffic in/out on each port.  These traffic details are included in the "IF-MIB" (Interfaces MIB), which is supported by all major hardware vendors and operating systems.

To use SNMP, you must first enable SNMP on your router/switch. The steps to do this vary from vendor to vendor, so please check the documentation from your vendor.  Pay attention to two important factors as you're configuring SNMP: what version of SNMP the device supports (v1, v2c or v3), and the read community string, which acts like a password in SNMP v1 and v2c (v3 uses user-based authentication instead).

To test that your device is responding to SNMP, you can use Paessler's free SNMP Tester.

Once the switch is responding to SNMP, you need a monitoring tool to query your device using SNMP.  There are SNMP-based monitoring tools available at all price levels, from freeware to large enterprise platforms.  PRTG, for example, is a unified monitoring tool, including SNMP monitoring, which can be run as freeware in a SOHO environment or with a commercial license for corporate environments.

Bandwidth monitoring with SNMP will tell you the amount of traffic, over time, on each port. If certain ports have spikes of traffic, you know that the devices connected to those ports are generating a lot of traffic.
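To make the arithmetic concrete, here is a minimal Python sketch of how a monitoring tool can turn two readings of an IF-MIB octet counter into a traffic rate. The OIDs are the standard IF-MIB ones; the function names, the 60-second polling interval and the sample values are my own assumptions for illustration, and the sketch leaves out the SNMP query itself.

```python
# Standard IF-MIB counter OIDs (append ".<ifIndex>" to address one port).
IF_IN_OCTETS = "1.3.6.1.2.1.2.2.1.10"         # ifInOctets, 32-bit counter
IF_OUT_OCTETS = "1.3.6.1.2.1.2.2.1.16"        # ifOutOctets, 32-bit counter
IF_HC_IN_OCTETS = "1.3.6.1.2.1.31.1.1.1.6"    # ifHCInOctets, 64-bit counter
IF_HC_OUT_OCTETS = "1.3.6.1.2.1.31.1.1.1.10"  # ifHCOutOctets, 64-bit counter


def bits_per_second(prev_octets, curr_octets, interval_seconds, counter_bits=32):
    """Average throughput between two readings of an octet counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_octets - prev_octets
    if delta < 0:
        # The counter wrapped past its maximum between the two polls.
        # Assumption: it wrapped only once and the device did not reboot.
        delta += 2 ** counter_bits
    return delta * 8 / interval_seconds


def utilization_percent(bps, link_speed_bps):
    """Share of the link capacity that the measured rate represents."""
    return 100.0 * bps / link_speed_bps


# Hypothetical readings taken 60 seconds apart on one port.
rate = bits_per_second(1_000_000, 76_000_000, 60)   # 10000000.0 bit/s
print(rate, utilization_percent(rate, 100_000_000))
```

On fast links the 32-bit counters can wrap within a single polling interval, which is why the 64-bit "HC" counters are usually the better choice where the device offers them.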

As an example, here's a screenshot of an SNMP traffic sensor from PRTG, showing the amount of traffic in/out, and some additional details about the type of traffic, such as unicast versus broadcasts.

The additional information, such as the number of broadcasts, can be very useful when debugging network problems.  A high number of broadcasts, for example, can indicate spanning tree problems.  If your spanning tree is constantly recalculating, you will have recurring network problems. What you think is somebody hogging the network could actually be underlying protocol problems, so don't ignore these additional counters.

3) RMON

It's gone a bit out of fashion, but RMON (Remote MONitoring) is a useful extension to SNMP that you can also consider.  If your vendor supports it, RMON adds additional details about the type of traffic you've got.  It was originally developed for monitoring remote sites (hence the name), but can monitor LAN and WAN equipment as well. Since RMON is an extension to SNMP, you need to have SNMP enabled, and your device needs to support the RMON MIB files.

In addition to the SNMP traffic statistics shown above, RMON includes the number of drops, collisions, CRC errors, oversized packets, and much more. This doesn't tell you who's hogging your bandwidth, at least not directly.  However, problems here (e.g., a lot of CRC errors) tell you that you have underlying network problems, so the issue you're trying to track down might be the network rather than a user.
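As a rough illustration of how these counters can be used, the Python sketch below compares two polls of the RMON etherStats counters and flags ports with a high share of CRC errors. The counter names come from the RMON MIB; the dictionary layout, the function names and the 0.1% threshold are assumptions I've made for the example, not recommended values.

```python
def error_ratio(prev, curr):
    """Fraction of packets with CRC/alignment errors between two polls.

    prev and curr are dicts holding the RMON counters etherStatsPkts
    (all packets seen, good and bad) and etherStatsCRCAlignErrors.
    """
    packets = curr["etherStatsPkts"] - prev["etherStatsPkts"]
    errors = curr["etherStatsCRCAlignErrors"] - prev["etherStatsCRCAlignErrors"]
    if packets <= 0:
        return 0.0  # no traffic (or a counter reset): nothing to judge
    return errors / packets


def suspect_ports(samples, threshold=0.001):
    """Return the ports whose error ratio exceeds the (assumed) threshold.

    samples maps a port name to a (prev, curr) pair of counter dicts.
    """
    flagged = []
    for port, (prev, curr) in samples.items():
        if error_ratio(prev, curr) > threshold:
            flagged.append(port)
    return sorted(flagged)


# Hypothetical counter readings for two ports.
samples = {
    "port1": ({"etherStatsPkts": 0, "etherStatsCRCAlignErrors": 0},
              {"etherStatsPkts": 100_000, "etherStatsCRCAlignErrors": 2}),
    "port2": ({"etherStatsPkts": 0, "etherStatsCRCAlignErrors": 0},
              {"etherStatsPkts": 100_000, "etherStatsCRCAlignErrors": 5_000}),
}
print(suspect_ports(samples))
```

A port that shows up here points to cabling, duplex or hardware problems rather than to a bandwidth hog.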

But let's go back to looking for the cause of a bandwidth spike...

At this point, we know from SNMP how much traffic is flowing through a port, and we can see which ports have a lot of traffic on them.  If we're lucky, there is only one device attached to a port, and then we know which device is causing all the traffic.

However, there could easily be multiple devices behind that port, and knowing the total traffic from all those devices doesn't tell us which one device is the culprit.  To see that, we need to dig deeper into the content of the traffic, and we do that using "flows".

4) Flow Protocols

The flow protocols are a family of protocols that have one thing in common:  they keep track of traffic flowing through the switch and they analyze the data to record things like source/destination IP addresses, source/destination MAC addresses, class of service, IP protocol used, etc.

The flow protocols include:

  • NetFlow (Cisco proprietary)
  • sFlow ("Sampled Flow", an industry standard for flow, supported by multiple hardware vendors)
  • jFlow (Juniper Flow)
  • IPFIX (Internet Protocol Flow Information Export - a standardized version of flow from the IETF)
  • Flexible NetFlow (Cisco proprietary)
  • NetFlow Lite (Cisco proprietary)

A "flow" is like a conversation between two devices.  Flow-enabled routers keep track of each packet they see, and create a flow record for each flow that they see.  The flows are identified by the source IP, destination IP, source port, destination port, and IP protocol.  So, all packets flowing between, say, 10.10.10.10:80 and 10.200.200.200:51072, make up the one flow between those two machines.
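Here is a small Python sketch of that idea: packets are grouped into flows using the five-field key described above. The tuple layout of the input packets and the names FlowKey and aggregate_flows are assumptions for illustration. Note that the sketch, like NetFlow, treats each direction of a conversation as its own flow, so the replies from the second machine end up in a separate record.

```python
from collections import defaultdict, namedtuple

# The five fields that identify a flow.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port protocol")


def aggregate_flows(packets):
    """Group packets into unidirectional flows keyed by the 5-tuple.

    Each packet is a tuple:
    (src_ip, dst_ip, src_port, dst_port, protocol, size_in_bytes)
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src_ip, dst_ip, src_port, dst_port, protocol, size in packets:
        key = FlowKey(src_ip, dst_ip, src_port, dst_port, protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)


# Hypothetical packets: two from the web server, one reply from the client.
packets = [
    ("10.10.10.10", "10.200.200.200", 80, 51072, "tcp", 1500),
    ("10.10.10.10", "10.200.200.200", 80, 51072, "tcp", 1500),
    ("10.200.200.200", "10.10.10.10", 51072, 80, "tcp", 60),
]
for key, stats in aggregate_flows(packets).items():
    print(key, stats)
```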

Flow-enabled routers can send information about the flows they see to a flow collector device.  The flow collector receives information about the flows from multiple devices, and can then create reports about the flows.
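To give a feel for what a collector receives, the following Python sketch decodes a NetFlow v5 export datagram using the fixed v5 layout (a 24-byte header followed by 48-byte flow records). It is a simplified illustration, not how PRTG or any other collector is actually implemented; the function name and the selection of fields are my own, and it leaves out the UDP listener, sampling and the other flow formats.

```python
import socket
import struct

# NetFlow v5 layout, all fields in network byte order.
HEADER = struct.Struct("!HHIIIIBBH")                # 24-byte packet header
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")  # 48-byte flow record


def parse_netflow_v5(datagram):
    """Decode one NetFlow v5 export datagram into a list of flow dicts."""
    version, count, *_ = HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram (version %d)" % version)
    records = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        records.append({
            "src_ip": socket.inet_ntoa(f[0]),   # srcaddr
            "dst_ip": socket.inet_ntoa(f[1]),   # dstaddr
            "packets": f[5],                    # dPkts
            "bytes": f[6],                      # dOctets
            "src_port": f[9],                   # srcport
            "dst_port": f[10],                  # dstport
            "protocol": f[13],                  # prot (6 = TCP, 17 = UDP)
        })
    return records
```

In practice the datagrams would come from a UDP socket bound to whatever port the router is configured to export to, and the decoded records would be fed into the reporting step.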

PRTG, for example, includes flow collectors for NetFlow v5, NetFlow v9, sFlow, jFlow and IPFIX.  It then determines "top lists" from the flows:

  • Top Talkers - The servers or PCs that are generating the most traffic in your network
  • Top Connections - The top connections that are using the most bandwidth in your network
  • Top Protocols - The top TCP and UDP protocols that are using the most bandwidth in your network
  • And custom top lists

And now you can see the real power of flow monitoring: the top lists tell you exactly who or what is using the most bandwidth.  You've found the culprit!
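The top lists themselves are simple to compute once the flow records are in hand. The Python sketch below shows one plausible way to do it; the record layout (dicts with src_ip, dst_ip and bytes) and the function names are assumptions for illustration and say nothing about how PRTG calculates its lists internally.

```python
from collections import Counter


def top_talkers(flow_records, n=10):
    """Hosts that sent the most bytes, largest first."""
    totals = Counter()
    for record in flow_records:
        totals[record["src_ip"]] += record["bytes"]
    return totals.most_common(n)


def top_connections(flow_records, n=10):
    """Source/destination pairs that moved the most bytes, largest first."""
    totals = Counter()
    for record in flow_records:
        totals[(record["src_ip"], record["dst_ip"])] += record["bytes"]
    return totals.most_common(n)


# Hypothetical flow records as a collector might store them.
flow_records = [
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.9", "bytes": 900_000},
    {"src_ip": "10.0.0.7", "dst_ip": "203.0.113.9", "bytes": 20_000},
    {"src_ip": "10.0.0.5", "dst_ip": "198.51.100.3", "bytes": 400_000},
]
print(top_talkers(flow_records, n=2))
print(top_connections(flow_records, n=1))
```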

Um, but why is there still more writing below?  We should be done now, shouldn't we?

Well, that depends...

Unfortunately, lots of devices don't support flow, especially lower-end equipment.  Or, your device might support it, but you don't have management access to the device to be able to enable flow monitoring.  What then?

5) Packet Sniffing

At this point, the only option left is traffic sniffing.  That means using some additional device, such as your laptop, to sniff packets and analyze the results. 

The best way to sniff traffic is to configure your router to "mirror" or "span" all of the traffic it sees to an unused port, and then attach your sniffer device (e.g., your laptop) to that mirror/span port. However, this requires administrative access to the router to configure it to start mirroring/spanning.

(An aside: what's the difference between "mirroring" and "spanning"?  None. Cisco calls their mirroring function "SPAN (Switched Port ANalyzer)", which is why the two terms have become interchangeable.)

If you're able to configure the router to mirror traffic, then you can attach a laptop to that port, and then use sniffing software to analyze the traffic.  If you can't configure the router to mirror, then look for some other device where you *do* have access, that's close to the target router (from a network point of view), and sniff on it instead.  The results won't be perfect, but might still be enough to show you what's going on in the network.

You now need some kind of sniffer software.  If you'd like to see top lists, then you can use the PRTG "packet sniffer" sensor to analyze the traffic and produce top lists similar to those you get from NetFlow.

If you need more than just top lists, then Wireshark is THE gold standard for traffic sniffing.  It's not the easiest to learn, but it's extremely powerful once you've got the hang of it. Wireshark offers multiple ways to track down bandwidth hogs. For example, open Statistics | Endpoints | IP and sort the columns by bytes to identify the top talkers.
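If you'd rather script this kind of per-host summary, the Python sketch below does something similar in spirit for a saved capture: it reads a classic pcap file and adds up the IPv4 bytes sent by each source address. It is a simplified illustration with several assumptions: it only handles the classic pcap format (not pcapng), only untagged Ethernet frames, and only IPv4. The function name bytes_per_ip and the file name in the usage comment are hypothetical.

```python
import socket
import struct
from collections import Counter


def bytes_per_ip(pcap_bytes):
    """Total IPv4 bytes sent per source address in a classic pcap capture."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # file written on a big-endian machine
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")
    # The link type is the last field of the 24-byte global header.
    linktype = struct.unpack_from(endian + "I", pcap_bytes, 20)[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")
    totals = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        # 14-byte Ethernet header; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        total_length = struct.unpack_from("!H", frame, 16)[0]  # IPv4 total length
        src_ip = socket.inet_ntoa(frame[26:30])                # IPv4 source address
        totals[src_ip] += total_length
    return totals


# Usage sketch (hypothetical file name):
# with open("capture.pcap", "rb") as fh:
#     for ip, count in bytes_per_ip(fh.read()).most_common(10):
#         print(ip, count)
```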

6) Taps and Packet Brokers 

If none of the above has helped, your last line of defense is taps in combination with a packet broker.  Taps are physical devices that are installed in-line in your network.  Because they're in-line, they see all of your traffic and send copies of it to a central monitoring device. The monitoring device, called a packet broker, collects the traffic from all of your taps and forwards it to network monitoring tools for analysis.

Figure: How Network Taps Work

Taps and packet brokers are usually too expensive for an SMB to consider.  However, consulting companies often offer network analysis based on taps/brokers as a service. So, if you really, really need to track down a problem, and the steps above haven't helped, you can hire someone to temporarily tap the network for you. Installing taps involves temporary interruptions in the network, so this isn't something you want to do often.

Summary:

We've now seen all the steps, from easiest to most difficult, that you can use to track down bandwidth hogs in your network.

Figure: Steps for Tracking Down Bandwidth Hogs

Copyright © 1998 - 2017 Paessler AG