Like their larger enterprise counterparts, mid-market organizations have taken extensive advantage of virtualization and server consolidation. Yet despite their increasing investment in virtual server, storage and networking capabilities, they frequently fail to invest in the tools needed to truly optimize their virtualized environments for performance and ROI.
Many mid-market IT operations groups find that optimizing their infrastructure to get the best returns on their investments, while simultaneously maximizing availability, is a significant challenge. Most have implemented virtualization over the past few years to reduce the number of physical servers they need, along with the associated office space, energy usage and IT staff resources. However, they frequently underutilize the virtual machines (VMs) they create in order to avoid overloading the hosts.
Tool sophistication and coverage
The problem doesn’t seem to be a lack of tools but rather a lack of tool sophistication and coverage. Each hypervisor, storage and network vendor offers tools for managing and optimizing the capabilities of its own technologies. While these tools provide real-time monitoring, they are not usually able to correlate information across different domains, cannot filter derivative conditions effectively and provide little information about expected norms and predictable variations. This leaves manpower-strapped IT organizations with the task of manually reviewing and evaluating monitoring results in order to design configurations, plan capacity or determine the root cause of performance issues.
The complexity of today’s hybrid-cloud IT environments and the ever-increasing demands placed on IT make it difficult for small IT teams to dedicate sufficient time to monitoring and management activities. So despite the underutilization of server capacity, agreed-to service levels are hard to maintain and IT, in fact, relies on end users to report poor performance. The net result for many groups is a lower virtualization ROI than anticipated, lower IT service availability and, sometimes, a less-than-stellar IT reputation.
Advanced application monitoring
One approach to dealing with this issue is to adopt a more advanced service level monitoring solution. By aggregating individual managed elements into collections of applications, VMs, storage, networking devices and rules that represent complete IT services, it becomes possible to take a more holistic approach to performance management and ROI improvement. Such monitoring solutions not only monitor the individual components and their associated parameters, they also correlate data from all of the service components as a whole and are able to undertake trending and baselining to help proactively identify forthcoming issues as well as to eliminate predictable parameter variations as causes of concern.
By monitoring applications through virtualized servers or from cloud services while keeping track of network, storage and other infrastructure components, advanced service level monitoring solutions are also far better at preventing those complex performance issues where nothing seems to be broken, no alerts have been sent, yet users are complaining. The wide and deep purview of such solutions also allows a more comprehensive approach to root-cause analysis. Here are five areas where advanced service level monitoring tools can take the hard work out of monitoring virtualized environments and help improve both performance and ROI.
- Server overutilization and/or underutilization. Time constraints often limit the ability of mid-market IT services groups to monitor virtual and physical server utilization and the associated storage and networking resources. Examining utilization even on a weekly basis can be totally inadequate. What’s needed is a continuous monitoring capability that correlates results between different VMs running on the same server so that CPU capacity-related performance issues can be diagnosed. Application performance can also be affected by networking and storage constraints, which in turn may be caused by applications running on adjacent VMs. Server and performance optimization requires understanding not simply the peak load requirements of individual applications but also the workload patterns and system demands created by multiple applications. Reports can be viewed on a weekly basis, but data should be collected continuously and saved for later analysis and review.
- Server versus infrastructure optimization. Monitoring server compute and storage capacity is very important, but performance issues are frequently associated with the volume of network traffic or of data to be processed. Typically there are trends and patterns around these that, if identified proactively, can be used to overcome performance issues before they have impact. Identifying such trends can signal the need for additional network bandwidth, improved internet connectivity, greater or faster storage, more processing power, etc. – investments that are far easier to justify when related to their impact on service level agreements.
- Static versus dynamic workloads. Another challenge is to track business application performance across dynamic server environments. When system applications such as VMware’s vMotion or Storage vMotion are used, VMs can migrate dynamically from one physical server to another without service interruption, for example when DRS or maintenance modes are enabled. In simple environments it may be easy to determine where VMs (and hence applications) have migrated, but in more complex environments this becomes problematic. The advantage of vMotion is that when activated it automatically preserves virtual machine network identities and network connections, updating routers to ensure they are aware of new VM locations. The challenge from the perspective of end-to-end application performance is to know which physical server is now hosting the application – particularly as the VM’s network address hasn’t changed. Advanced monitoring solutions follow these migrations and, by containerizing all the infrastructure elements that make up a particular IT service, can take account of the dynamic changes occurring in hosting, storage and networking components.
- Cyclical, erratic and variable workloads and traffic patterns. Optimizing server consolidation is relatively straightforward when application workloads are consistent over time. However, many applications place highly variable, cyclical or erratic demands on server, storage and networking components, making it more likely that resources are sub-optimized in favor of simplicity and time. Advanced service level monitoring solutions are able to analyze the patterns of usage and baseline the results to provide a more granular view, which can be used to better take advantage of available resources and avoid unnecessary alerts. For example, a payroll application that requires significant resources prior to the end of each pay period might be paired with a finance application that needs to run after orders have been taken at the end of each month. Similarly, it may make sense to pair development-related activities with test activities, assuming that development and testing are done in series, not in parallel. Advanced monitoring can help identify not only the processing capacity requirements and patterns but also those of storage and bandwidth, so that all factors can be taken into account when optimizing resource allocation and setting thresholds.
- Root-cause analytics and meeting/reporting on SLAs. Optimization is an important goal for maximizing virtualization ROI, but what most users care about is IT service availability and performance. As with all things complex, problems will occur. The challenge is to be able to resolve them as quickly as possible. Advanced service level monitoring solutions help because they are able to pinpoint problem areas and then drill down, through dashboard screens, to rapidly identify root causes. Because they are able to look across every element of the infrastructure, they can identify interactions between different components to determine cause in ways that discrete management systems cannot. In addition, the ability to track and trend parameters of the components that make up each IT service provides a proactive mechanism able to predict likely performance issues or SLA violations in advance. This provides IT Ops with reports that can be shared with management and users to justify any changes or additional investments needed.
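The idea of correlating utilization between VMs sharing a host, described above, can be sketched in a few lines. This is a minimal illustration only, with hypothetical host names, VM names and sample data; in a real deployment the readings would come from the hypervisor’s monitoring API and the 80% threshold would be tuned to the environment.

```python
from collections import defaultdict

# Hypothetical per-VM CPU samples (percent of host capacity), keyed by host.
# In practice these would be collected continuously from the hypervisor.
samples = [
    {"host": "esx-01", "vm": "web-01", "cpu_pct": 35.0},
    {"host": "esx-01", "vm": "db-01", "cpu_pct": 55.0},
    {"host": "esx-02", "vm": "app-01", "cpu_pct": 20.0},
]

def host_cpu_load(samples):
    """Aggregate per-VM CPU usage into a per-host total so that
    saturation caused by neighboring VMs can be spotted."""
    totals = defaultdict(float)
    for s in samples:
        totals[s["host"]] += s["cpu_pct"]
    return dict(totals)

def flag_hot_hosts(totals, threshold_pct=80.0):
    """Return hosts whose combined VM demand exceeds the threshold."""
    return [h for h, t in totals.items() if t >= threshold_pct]

totals = host_cpu_load(samples)
print(flag_hot_hosts(totals))  # esx-01: the two co-resident VMs sum to 90%
```

The point of the sketch is that neither VM on esx-01 looks problematic on its own; only the correlated, per-host view reveals the contention.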
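The baselining approach mentioned for cyclical workloads can likewise be sketched simply: record readings per hour of day, compute a norm, and alert only on deviations from that norm. The history values and the three-sigma threshold below are illustrative assumptions, not a description of any particular product’s algorithm.

```python
import statistics

# Hypothetical history: CPU readings (%) observed at hour 14 over past weeks.
history = {14: [40.0, 42.0, 38.0, 41.0, 39.0]}

def baseline(history, hour):
    """Mean and standard deviation of past readings for this hour of day,
    so predictable cyclical peaks establish their own expected norm."""
    vals = history[hour]
    return statistics.mean(vals), statistics.pstdev(vals)

def is_anomaly(value, history, hour, n_sigma=3.0):
    """Flag only readings that deviate from the hourly baseline,
    suppressing alerts for predictable parameter variations."""
    mean, sd = baseline(history, hour)
    return abs(value - mean) > n_sigma * sd

# 41% at 14:00 matches the established pattern; 75% does not.
print(is_anomaly(41.0, history, 14))  # False
print(is_anomaly(75.0, history, 14))  # True
```

A fixed threshold (say, 60%) would either fire on every payroll-period spike or miss genuine deviations; a per-hour baseline distinguishes the two.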
Advanced service level performance management tools have affordable starting prices and offer significant ROI themselves by increasing the return from virtualization and allowing SLAs to be met and maintained. Add a faster mean time to problem resolution and IT resources freed up for more productive activities, and their value is very significant.
By helping the IT departments of mid-sized companies optimize their virtualized environments, Kaseya’s advanced monitoring solution, Traverse, supports SLA mandates and frees in-house IT staff to better respond to business requests. It also provides detailed intelligence that IT can use to add strong value in conversations regarding business innovation.
Learn more about how Kaseya technology can help. Read our whitepaper, Solving the Virtualized Infrastructure and Private Cloud Monitoring Challenge.
Author: Ray Wright