Time is of the essence when a customer is down. If both data and voice services are out, restoring service can take anywhere from minutes to painstaking hours!

We can typically isolate the cause of an outage in minutes, partly because we have good visibility into the inner workings of the system.

We use Nagios for quick and easy monitoring. Nagios is great and provides a simple way to get early warning of system trouble. When a customer is down, we first seek to determine the scope of the outage.

For instance, if phones are up but computers are down, there are a few things that can quickly narrow it down further. Since we need to determine the scope as rapidly as possible, one quick check is dialing extension to extension. If that works, the internal network and phone system are passing traffic, so the problem is most likely with the edge or the carrier.
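The extension-to-extension check can be written down as a tiny decision helper. This is only a sketch of the reasoning above, not a tool we run: the function name, the labels, and the "internal" branch are assumptions added for illustration, since the only rule stated here is that a working internal call points to the edge or the carrier.

```python
def narrow_scope(ext_to_ext_works: bool) -> str:
    """Return a rough guess at where an outage lives.

    Hypothetical helper: the names and labels are illustrative only.
    """
    if ext_to_ext_works:
        # Internal calls complete, so the fault is likely beyond the LAN.
        return "edge or carrier"
    # Assumption: if internal calls fail, start looking inside the site
    # (switches, phone system, DHCP) before blaming the carrier.
    return "internal network or phone system"


if __name__ == "__main__":
    print(narrow_scope(ext_to_ext_works=True))
```

A real triage would weigh more signals than one test call, so treat the result as a starting point for where to look first.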

DHCP addressing can also be problematic. If your phones aren't getting addresses, they won't work! A device showing an address in the 169.254.x.x range has assigned itself a link-local address, which usually indicates that it could not reach a DHCP server.
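If you are collecting addresses from a batch of phones or PCs, the self-assigned-address check is easy to automate. Here is a minimal sketch using Python's standard ipaddress module; the function name is made up for this example, and it flags only IPv4 link-local addresses (169.254.0.0/16), on the assumption that those are the ones a device falls back to when DHCP fails.

```python
import ipaddress


def looks_like_dhcp_failure(address: str) -> bool:
    """Return True if the address is an IPv4 link-local (169.254.0.0/16) address.

    Hypothetical helper name; raises ValueError if the string is not an IP address.
    """
    ip = ipaddress.ip_address(address)
    # Restrict to IPv4: IPv6 link-local (fe80::/10) addresses are normal
    # and say nothing about DHCP.
    return ip.version == 4 and ip.is_link_local


if __name__ == "__main__":
    for addr in ["169.254.10.5", "192.168.1.20"]:
        print(addr, looks_like_dhcp_failure(addr))
```

Note that an address such as 169.1.2.3 is not link-local, so it would not be flagged.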

A checklist is great, but it's not a replacement for good old-fashioned skill!