When C-level IT execs think of business continuity, their minds usually wander to backing up and recovering data. They worry about how to deploy seamless, ultrafast access to backup data when the primary systems and storage have been compromised.
But a complete business continuity plan needs to address an even more pressing issue. Despite all the attention paid to backup and recovery, the network generally goes down more often than storage crashes. And without the network (and all the infrastructure and data centers dependent on that network) running, key business applications and services stop in their tracks.
Consequently, keeping increasingly complex networks up and running, and performing efficiently, should be a key part of any CIO’s business continuity game plan.
Complex Networks Are the New Normal
Keeping the network up is more complicated and error-prone than ever. The old days when the network comprised a LAN, perhaps a WAN to connect to other offices, and a pure pipe to the internet are gone. Today’s internal networks are more elaborate due to the prevalence of virtualization and wireless technologies.
Here are eight key steps to develop a complete business continuity game plan.
Step One: Understand the Cost of Downtime
Downtime has a serious, multifaceted impact on companies. It kills employee productivity, of course. Furthermore, it brings sales and other revenue-generating activities to a halt. Downtime can also deliver a hit to a company’s reputation among its clients and partners.
Gartner analyst Andrew Lerner has looked at the math, and the costs of downtime are staggering. “Based on industry surveys, the number we typically cite is $5,600 p/minute, which extrapolates to well over $300K p/hour,” Lerner blogged.
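To make that arithmetic concrete, here is a minimal sketch that turns the per-minute figure quoted above into a cost for an outage of any length. The function name `downtime_cost` is hypothetical, and the per-minute rate is only the survey-based average Lerner cites; an individual company's real figure could be very different.

```python
# Survey-based average quoted by Lerner above; substitute your own estimate.
COST_PER_MINUTE_USD = 5_600


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Return the estimated cost in USD of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # One hour of downtime at the quoted rate: 60 * 5,600 = 336,000 USD,
    # which matches the "well over $300K p/hour" extrapolation.
    print(downtime_cost(60))
```

Plugging in the length of a recent outage, and a rate based on your own revenue and payroll, gives a rough figure to weigh against the cost of preventing the next one.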
Step Two: Choose the Right Cloud Provider
Cloud services from outside cloud vendors are virtually a given in today’s world. However, not all cloud providers are created equal.
The good news is cloud provider services are far more reliable than they were just five years ago when major outages made the news regularly. Today the reverse is true. In fact, Amazon Web Services, arguably the leading cloud provider today, was out for just two and a half hours in 2015.
Here are some tips on choosing a cloud provider that can help keep critical business applications running:
- SLAs: Service Level Agreements (SLAs) are always required, but especially critical when running serious business applications. These SLAs should have real teeth so that there is actual compensation for downtime or sub-par performance.
- References: References are important. Companies need to get references from organizations whose size and needs match their own. At the same time, companies need to do their own research on how the provider’s network is architected, how applications are hosted, how security is handled, and what measures the provider takes to ensure performance and uptime.
- Visibility: What tools does the provider offer that will allow internal IT groups to see how their applications are operating and how their data is moving?
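When comparing SLAs, it can help to translate an uptime percentage into the downtime it actually permits. The sketch below assumes the SLA is expressed as a simple uptime percentage over a billing period, which is common but not universal; the function name `allowed_downtime_minutes` and the 30-day default are illustrative assumptions, not terms from any particular provider's contract.

```python
def allowed_downtime_minutes(uptime_percent, period_days=30):
    """Minutes of downtime permitted by an uptime SLA over `period_days`.

    Assumes the SLA is a plain percentage of total time in the period.
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Compare a few common SLA tiers over a 30-day period.
    for tier in (99.0, 99.9, 99.99):
        print(tier, round(allowed_downtime_minutes(tier), 1))
```

Under these assumptions, a 99.9 percent SLA still allows roughly 43 minutes of downtime in a 30-day month, which is worth setting against the per-minute cost of an outage before signing.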
Step Three: Create the Right Cloud Infrastructure
Keep in mind that while these tips are useful in choosing the most reliable cloud vendors (and there are still provider problems), the vast majority of cloud downtime has to do with the “last mile” networks that connect cloud users to these cloud services. For most companies, this last mile is their own network.
Providing the right supporting infrastructure starts with providing a WAN that is fast enough to deliver adequate performance and offer resilience. The good news is that you can get the right speed and gain a backup connection at the same time. The trick is to bond two WAN connections so they act as one. If one connection goes down, the other takes full control.
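The bonding idea can be illustrated with a small, idealized model. This is only a sketch: the `WanLink` class and both helper functions are hypothetical, and real bonded links rarely deliver the exact sum of their individual speeds.

```python
from dataclasses import dataclass


@dataclass
class WanLink:
    """One WAN connection in a bonded pair (hypothetical model)."""
    name: str
    mbps: float
    up: bool = True


def bonded_capacity(links):
    """Idealized combined bandwidth: the sum of all links that are up."""
    return sum(link.mbps for link in links if link.up)


def is_connected(links):
    """The bonded connection stays usable while at least one link is up."""
    return any(link.up for link in links)


if __name__ == "__main__":
    links = [WanLink("isp_a", 100.0), WanLink("isp_b", 50.0)]
    print(bonded_capacity(links), is_connected(links))  # both links carrying traffic

    links[0].up = False  # simulate the first provider failing
    print(bonded_capacity(links), is_connected(links))  # second link carries everything
```

The point of the model is that a failure costs capacity rather than connectivity: the site slows down instead of going dark.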
Step Four: Realize that Cloud and Network Monitoring and Management Are the Key to Uptime
Problems that bring a business service to a halt can come from a number of sources, from a failing router or NIC to any other component of the network infrastructure.
Unfortunately, traditional monitoring tools tend to operate in silos. They are often focused on discrete areas such as on-premises networks, specific applications (including cloud apps), OSes, virtual servers, or bits of network gear such as routers and switches. This makes it tough just to pinpoint the cause of any cloud networking problems, never mind having sysadmins fix them quickly.
Fortunately, a unified cloud and network monitoring and management solution provides deep visibility into the overall network, and the management aspect enables fast remediation. By spotting problems in the network that compromise cloud or network uptime, IT can take quick action, keeping the network healthy so end users don’t even know there was an issue.
Step Five: Monitor and Fix Your Cloud Network
The cloud presents special network management challenges for IT staff because internal IT doesn’t have full control of the provider’s cloud infrastructure or a full view of all the network pieces that support these cloud applications and services.
And while IT teams struggle to monitor and manage the public cloud services, they still need to take care of internal networks and even hybrid cloud configurations. Fortunately, full-featured network monitoring solutions, such as Kaseya Traverse, are designed to holistically monitor performance across on-premises, cloud and hybrid infrastructure.
Step Six: Protect Business Services, Not Just the Network
Traditional network monitoring and remediation tools focus on network components. That may have worked well in the past, but organizations today are laser-focused on business uptime and that means keeping services running and operating with proper performance.
The right solution lets IT focus on spotting service problems—not just infrastructure problems—then diagnosing and ultimately repairing the issue. So instead of just asking if the routing table needs work, IT looks to see if business services such as ERP or CRM are working properly, and if not, why not.
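A service-level check of this kind can be sketched as follows. It is a minimal illustration, not how Traverse or any other product works: `check_service` is a hypothetical helper, and the `probe` callable stands in for whatever real test (a login, a test transaction, an API call) proves that a business service such as ERP or CRM is usable.

```python
import time


def check_service(name, probe, max_seconds):
    """Classify a business service as healthy, degraded, or down.

    `probe` is any zero-argument callable that returns True when the
    service works from a user's point of view (hypothetical interface).
    """
    start = time.perf_counter()
    try:
        ok = bool(probe())
    except Exception:
        # A probe that raises is treated the same as a failed probe.
        ok = False
    elapsed = time.perf_counter() - start

    if not ok:
        status = "down"
    elif elapsed > max_seconds:
        status = "degraded"  # works, but too slowly to be acceptable
    else:
        status = "healthy"
    return {"service": name, "status": status, "seconds": elapsed}


if __name__ == "__main__":
    # Stand-in probes; a real one would exercise the actual application.
    print(check_service("crm", lambda: True, max_seconds=2.0))
    print(check_service("erp", lambda: False, max_seconds=2.0))
```

The "degraded" state matters as much as "down": a service that responds too slowly can look fine to component-level monitoring while end users are already affected.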
Step Seven: Recognize the Problem—Your Admins Need to Predict the Future
Root cause analysis has never been exactly easy, but when networks and IT infrastructure had a simpler architecture, finding a root cause was not overly complex. It was largely either the LAN or the WAN. And if an application was to blame, it was installed on a non-virtualized on-premises server so hunting it down was pretty straightforward.
Today, that relative simplicity has been replaced with ever increasing complexity, as this paper has discussed.
If your IT staff is tasked with keeping business services up and running with a goal of zero or near-zero downtime, then they need a way not just to manage servers, applications and network devices, but also to address problems preemptively before they turn into downtime.
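One simple way to picture preemptive detection is to fit a trend line to recent measurements and estimate when it will cross a limit. The sketch below uses a plain least-squares line over equally spaced samples; the function name `samples_until_threshold` is hypothetical, and commercial monitoring tools may use far more sophisticated methods.

```python
def samples_until_threshold(values, threshold):
    """Estimate how many more samples until `values` reaches `threshold`.

    Fits a least-squares line to equally spaced samples. Returns 0.0 if the
    latest sample is already at or above the threshold, and None if the
    trend is flat or falling (no crossing predicted).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    if values[-1] >= threshold:
        return 0.0

    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    slope = covariance / variance
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing_index = (threshold - intercept) / slope
    # Measure the distance from the most recent sample, never negative.
    return max(0.0, crossing_index - (n - 1))


if __name__ == "__main__":
    # Example: link utilization (percent) sampled at regular intervals.
    print(samples_until_threshold([50, 60, 70, 80], threshold=100))
```

If samples are taken hourly, the result is an estimate in hours of how long the team has to act before the limit is hit, assuming the current trend continues.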
Step Eight: End the “Swivel Chair” Approach to Network Administration and Troubleshooting
We argued earlier that traditional monitoring tools operate in silos. They are often focused on on-premises networks, specific applications (including cloud apps), OSes, virtual servers, or bits of network gear such as routers and switches.
This morass of tools means that IT admins are staring at a bank of screens, with each screen displaying a different console. As they look for problems, or try to find a solution, they shift from console to console, an approach that leads to a “swivel chair mentality.” This approach is neither efficient nor holistic: it makes it tough to pinpoint the cause of networking problems, never mind identify issues before they impact end users.
A True Partner in Business Continuity Processes
Traverse, a next-generation monitoring solution from Kaseya, is integral to any company’s business continuity processes. Traverse allows enterprises to monitor, manage and optimize their entire IT infrastructure and distributed data centers—including hybrid and virtualized infrastructure—in a single unified console.