What are the most common errors when monitoring WMI?

What are the most common errors when monitoring WMI and what can I do about them?

common-errors prtg troubleshooting wmi

Created on Feb 19, 2010 11:43:54 AM by  Volker Uffelmann [Paessler Support] (1,487) 2 3



1 Reply

Accepted Answer

The most common WMI errors

This is only a small overview, and we cannot guarantee that it covers the solution to your specific problem, but it is a starting point and we will keep expanding this article.


WMI Overload

Probe Health sensor showing a WMI delay

The delay value shows how many WMI requests had to be postponed globally from their intended scanning times, which indicates an overload problem. A delay of 0% is the most favorable value. If you keep seeing a higher number over a significant amount of time, reduce the total number of WMI requests on this probe by increasing the scanning intervals of the sensors. Alternatively, you can distribute the sensors over one or more additional remote probes.

Note: On Windows XP/Windows 2003/Windows 7/Windows 2008 R2 you can run about 10,000 WMI sensors with a one-minute interval under optimal conditions (such as running the core and the target systems exclusively under Windows 2003 and being located within the same LAN segment). Actual performance can be significantly lower depending on network topology and the WMI health of the target systems: we have seen configurations that could not go beyond 500 sensors, or even fewer.
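To get a rough feeling for how scanning intervals translate into load, the relationship can be sketched as below. This is only a back-of-the-envelope model: it assumes one WMI request per sensor per scan, and the function names and the "budget" figure are hypothetical, not PRTG settings.

```python
def wmi_requests_per_minute(intervals_seconds):
    """Average WMI requests per minute, assuming one request per sensor scan."""
    return sum(60.0 / seconds for seconds in intervals_seconds)


def interval_for_budget(sensor_count, budget_per_minute):
    """Uniform scanning interval (seconds) that keeps sensor_count sensors
    within a request budget the probe is assumed to sustain."""
    return 60.0 * sensor_count / budget_per_minute


# 10,000 sensors at a one-minute interval produce 10,000 requests per minute.
print(wmi_requests_per_minute([60] * 10000))

# If a probe only sustains about 500 requests per minute (hypothetical budget),
# 2,000 sensors would need a 240-second interval.
print(interval_for_budget(2000, 500))
```

Doubling the scanning interval halves the request rate, which is why increasing intervals is the first thing to try before adding remote probes.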

Tip: The bottlenecks for WMI monitoring are these two processes:

  • WmiPrvSE.exe
  • lsass.exe

Neither of them makes use of multiple processors. So if you encounter WMI delays and one or both of these processes are running at maximum load (100% divided by the number of processors) on the PRTG probe and/or one of the target computers, this gives you a hint about where to reduce the number of WMI monitoring requests.
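The "100% / number of processors" rule can be written down as a small check. This is a sketch that assumes the CPU percentage you read for the process is normalized to the whole machine (as Task Manager shows it); the function names and the tolerance value are made up for illustration.

```python
import os


def single_core_ceiling(num_processors=None):
    """Highest CPU percentage a process limited to one processor can reach."""
    count = num_processors or os.cpu_count() or 1
    return 100.0 / count


def is_saturated(process_cpu_percent, num_processors=None, tolerance=2.0):
    """True when the process sits at (or within `tolerance` points of) its ceiling."""
    return process_cpu_percent >= single_core_ceiling(num_processors) - tolerance


# On a 4-processor machine the ceiling is 25%, so 24% already means "maxed out".
print(single_core_ceiling(4), is_saturated(24.0, 4), is_saturated(10.0, 4))
```

In other words, a WmiPrvSE.exe that shows "only" 25% on a quad-core system may already be the bottleneck.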


WMI Timeouts

WMI timeouts can have several causes. You can find an overview under PRTG WMI error messages.


WMI Connection based errors - "Connection could not be established"

Very often, PRTG is being obstructed from monitoring WMI counters. As these errors are on a very low communication level, no WMI sensor in the device will run and all of them will show one of the following errors:

Port Error 135: RPC Server Not Accessible

If you see this precise error message, the port over which all DCOM communication is routed is blocked. Most likely, this is because the RPC server on the monitored machine is:

  • blocked by a local firewall
  • blocked by domain policies
  • not running
  • running on a different port than specified in PRTG's setting for this computer

A possible solution is to add the device name and IP address to the hosts file on the probe system and to use the IP address in the settings.
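Before changing any settings, it can help to confirm from the probe machine whether TCP port 135 is reachable at all. The following is a minimal sketch using a plain TCP connect; the function name is hypothetical and this is not how PRTG itself tests the connection. Note that an open port 135 is necessary but not sufficient, because DCOM subsequently negotiates further ports.

```python
import socket


def tcp_port_open(host, port=135, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # "server01" is a placeholder for the monitored machine.
    print(tcp_port_open("server01"))
```

If this returns False for the target but True for other machines in the same segment, a local firewall or a stopped RPC service on the target is the likely suspect.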

800706BA - RPC Server Is Unavailable

This is quite an ambiguous message, as there can be different causes to it, amongst which are:

  • The RPC service is not running on the monitored machine.
  • Sometimes, this error occurs when using an IP address to connect to a device. Try using the hostname or FQDN (Fully Qualified Domain Name) instead. In PRTG, please enter this information in the device's settings (section "Credentials for Windows Systems").
  • The monitored machine cannot connect to the Primary Domain Controller and is therefore unable to verify the Windows credentials provided for the WMI sensor. Check your Domain Controller settings.
  • Either the computer running the PRTG probe or the monitored machine has wrong DNS entries. This can happen when the machine has opened one or more VPN connections in addition to the normal network connection. You can test this with a simple ping to the machine in question: do you see the correct IP address?
  • Allow Remote Administration Exception: enabling this setting is a fix that helps in some cases (even if the Windows Firewall is turned off anyway). However, the WMI connection works with the target's name only, not with its IP address. Read Microsoft's TechNet article and Microsoft's KB article for instructions.
  • ISA Server is blocking all RPC traffic by default, so you have to configure the server explicitly in order to use WMI sensors. Please read the following articles on external sites:
  • UAC blocks root access to disk drives, as one of our customers found out in this discussion. You can add the following registry key to disable this feature of UAC.
    Path: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System
    Add a new DWORD value:
    Name: LocalAccountTokenFilterPolicy
    Value: 1
    • Note: This disables some of the protection provided by UAC. Specifically, any remote access to the server using an administrator security token is automatically elevated with full administrator rights, including access to the root folder. More information can be found here: http://support.microsoft.com/kb/951016
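The DNS check mentioned in the list above (does the name resolve to the IP address you expect?) can also be scripted instead of pinging by hand. This is a small sketch, not part of PRTG; the function names are hypothetical, and the `resolver` parameter only exists so the logic can be tried without a real DNS lookup.

```python
import socket


def resolved_addresses(hostname, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the local resolver reports for hostname."""
    try:
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return set()
    return set(addresses)


def dns_matches(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """True if hostname resolves to expected_ip from this machine's point of view."""
    return expected_ip in resolved_addresses(hostname, resolver)


if __name__ == "__main__":
    # Placeholder name and address; run this on the probe AND on the target.
    print(dns_matches("server01", "10.0.0.5"))
```

Running the same check on both the probe and the monitored machine shows quickly whether a VPN adapter or a stale DNS/DHCP entry is handing out a different address on one side.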

80070005 - Access is denied and 80041003 - Access Denied

There is a fine distinction between those two errors.

  • 80070005 means that the Domain Controller / local Windows could not verify the credentials for the target computer.
    • Perhaps wrong credentials were provided, so you might want to check your entries in the “Credentials for Windows Systems” section of your device, group, probe, or even root group.
    • DCOM needs to be enabled on probe and target computer. Check the respective registry entry.
    • If the server on which PRTG is installed is part of a domain, whereas you are trying to monitor a target machine that is not part of the domain, please see How can I monitor WMI sensors if the target machine is not part of a domain? for more information.
    • We have also seen trouble with DNS/DHCP entries, directing the host name to the wrong IP address, resulting in this error - try using the explicit IP address as host setting when possible.
    • If the target host(s) are accessible with our WMI Tester tool, but PRTG still insists on showing the 80070005 error, try using "localhost" as "domain or computer name" in the Windows credentials section of the device's settings.
    • If this error shows up and vanishes again sporadically, we have no explanation at the moment, as PRTG only reports the error that the Windows system has encountered.
  • 80041003 means that the user does not have sufficient rights to use WMI, so you might want to check your access rights or the respective policies.

80041002: The object could not be found

This means the probe is able to connect to the host's WMI system, but for some reason it isn't able to see the objects that are needed for the sensors' functionality. Most likely it's due to configuration problems regarding the access rights. One of our customers was able to avoid this error by moving the erroneous device to a different probe.

80004002: No such interface supported

This is unfortunately a very vague error. One customer reported that it appeared when the Primary Domain Controller was offline and PRTG's attempts to monitor remote Windows computers failed because the credentials could not be verified.

Note:

If any one of these errors occurs, please make sure your systems meet all of the basic requirements, as listed in the respective section of our main WMI article.
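If you collect sensor messages in your own scripts, the connection-based codes discussed above can be kept in a small lookup table. This is just an illustrative sketch: the descriptions are shortened from this article, and the names `CONNECTION_ERRORS`, `normalize_code` and `describe` are made up for the example.

```python
# Connection-based WMI error codes as described in this article.
CONNECTION_ERRORS = {
    "800706BA": "RPC server is unavailable",
    "80070005": "Access is denied (credentials could not be verified)",
    "80041003": "Access denied (user lacks sufficient WMI rights)",
    "80041002": "The object could not be found",
    "80004002": "No such interface supported",
}


def normalize_code(raw):
    """Strip whitespace and an optional 0x prefix, return 8 upper-case hex digits."""
    code = raw.strip().upper()
    if code.startswith("0X"):
        code = code[2:]
    return code.zfill(8)


def describe(raw):
    """Return the short description for a code, or a fallback text."""
    return CONNECTION_ERRORS.get(normalize_code(raw), "unknown WMI error code")


print(describe("0x800706ba"))
print(describe("80041003"))
```

Keeping 80070005 and 80041003 as separate entries preserves the distinction made above between credentials that cannot be verified and a user that lacks WMI rights.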


Sensor based errors

The following errors might not affect all sensors in a device but are widespread nonetheless:

80041010 – The specified class is not valid

This is quite an ambiguous message, as there can be different causes to it, among which are:

  • If certain services (e.g. Exchange) are not started when the WMI AutoDiscovery/AutoPurge (ADAP) process is started, the performance counters are not transferred to WMI because WMI uses ADAP to build its internal performance counter table. To fix this, open a command console and execute wmiadap.exe /f (http://support.microsoft.com/?scid=kb;en-us;820847&x=12&y=10)
  • Performance Counters are disabled for a specific service. There’s a registry entry for WMI counters for each service. In order to check and fix this you have to edit the registry. Please read here how to do that: http://technet.microsoft.com/en-us/library/cc784382%28WS.10%29.aspx

Nonsensical or wrong results

Although the WMI counters and their results are well defined, this does not necessarily mean that Windows always adheres to these definitions.

Depending on the Windows version and the current patch level, Windows may limit 64-bit counters to 32-bit values.

This manifests itself in wrong results: for example, the system memory and/or 64-bit processes never show more than 4 GB in PRTG. Or you see strange errors as referred to here in the "WMI counter value related errors" section.

Note:

We have found the reason for the 4 GB limitation with 64-bit systems/processes: it seems that Windows' own WoW64 emulation layer for 32-bit applications (PRTG is one of them) somehow caps these values at 4 GB.

The only solution we can recommend at the moment is to use a (remote) probe running on a 32-bit Windows for these sensors.

Yes, you have read that right: for the correct monitoring of 64-bit processes you have to run the PRTG Probe on a 32-bit machine... until Microsoft fixes that bug in the WoW64 layer.

For the other cases there is nothing we can do to fix these errors as they are caused by Windows, unfortunately. You can try to use the alternative query of a sensor if applicable.
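To spot the 4 GB symptom in exported sensor data, a simple heuristic is to compare the highest reading with a value you know the counter should be able to reach (for example, the installed RAM). This is a sketch under that assumption only; the function name and the `expected_max_bytes` parameter are hypothetical and not something PRTG provides.

```python
FOUR_GIB = 2 ** 32  # largest range a 32-bit counter can express, in bytes


def looks_capped_at_32_bits(readings_bytes, expected_max_bytes):
    """True when no reading exceeds 4 GiB although larger values are expected."""
    if not readings_bytes or expected_max_bytes <= FOUR_GIB:
        return False  # nothing to judge, or the counter legitimately stays below 4 GiB
    return max(readings_bytes) <= FOUR_GIB


# A 16 GiB machine whose memory sensor never reports more than 2**32 - 1 bytes.
print(looks_capped_at_32_bits([4294967295, 4294967295], 16 * 2 ** 30))
```

A positive result is only a hint: a process that genuinely stays below 4 GB looks the same, so the check is most useful for counters such as total system memory.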


0% Free Disk Space

One of our customers found the following solution for this problem on Windows Server 2008 (W2k8) servers:

"A while back I started closely monitoring the WMI service on the various servers that often false alarm during WMI overload. Most often, the free space check would fail returning a null or zero value instead of the correct value for free disk space. To troubleshoot, I added the PRTG WMI service monitor for the WMI Service itself with a 60 second interval. I also monitored the system and application event logs with 15 minute intervals. It helped to get more specific errors about what was going on with WMI. Most often I'd get one of these errors in succession:

  1. WMI Free Disk Space [Multi Drive]: 0 % (Free Space C:) is below the error limit of 10 %
  2. Windows Management Instrumentation (WMI Service) Warning: 80041006: There was not enough memory for the operation.
  3. Event Log (Windows API) Value changed: Faulting application wmiprvse.exe, version 6.1.7600.16385, faulting module 4a5bc794, version ole32.dll, fault address 0x6.1.7600.16624

Armed with this new information, I was able to find two Microsoft KB articles that seemed to match the symptoms:

  • 958124 A wmiprvse.exe process may leak memory when a WMI notification query is used heavily on a Windows Server 2008-based or Windows Vista-based computer
  • 954563 Memory corruption may occur with the Windows Management Instrumentation (WMI) service on a computer that is running Windows Server 2008 or Windows Vista Service Pack 1

Applying these two hotfixes stopped the false alarms on my Windows 2008 and 2008 R2 servers."
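Until the hotfixes are applied, readings that are null or suddenly zero can be treated as suspect rather than as real alarms in your own post-processing. The following is purely an illustrative sketch of that idea and not a PRTG feature; the function name, the `max_plausible_drop` threshold and the returned labels are all assumptions.

```python
def classify_free_space(current_percent, previous_percent,
                        error_limit=10.0, max_plausible_drop=50.0):
    """Classify a free-disk-space reading as 'ok', 'alarm' or 'suspect'.

    A missing value, or a drop to exactly 0 % from a much higher previous
    reading, is flagged as 'suspect' because it matches the WMI failure
    pattern described in the quote above.
    """
    if current_percent is None:
        return "suspect"
    if (current_percent == 0 and previous_percent is not None
            and previous_percent - current_percent > max_plausible_drop):
        return "suspect"
    if current_percent < error_limit:
        return "alarm"
    return "ok"


print(classify_free_space(0.0, 62.0))   # sudden drop from 62 % to 0 %
print(classify_free_space(8.0, 12.0))   # gradual decline below the limit
```

A "suspect" result is a cue to check the WMI service and the event log on the target, as the customer did, before trusting the disk space value.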


More information about WMI and PRTG

General introduction to WMI and PRTG

Created on Feb 19, 2010 11:58:52 AM by  Volker Uffelmann [Paessler Support] (1,487) 2 3

Last change on Dec 20, 2011 12:26:05 PM by  Volker Uffelmann [Paessler Support] (1,487) 2 3





Disclaimer: The information in the Paessler Knowledge Base comes without warranty of any kind. Use at your own risk. Before applying any instructions please exercise proper system administrator housekeeping. You must make sure that a proper backup of all your data is available.
