The quest to improve the productivity and efficiency of IT organizations is an ongoing one. A number of technologies and processes have been adopted over the decades to make IT operations leaner and more effective. With the arrival and rapid adoption of virtualization technology and cloud infrastructures in the past few years, IT organizations worldwide are starting to realize significant economy-of-scale benefits. Lower costs for ‘incremental units’ of computing power, the ability to flex up and down more easily as needed, and freedom from the restrictions of traditional models will all drive a dramatic increase in the consumption of computing and application resources, as organizations are freed up to do more. On the flip side, steps will need to be taken to deal with the resulting increase in the administration burden. Otherwise, the efficiency gains realized from shared, flexible IT infrastructure will be outstripped by the high cost of managing a more dynamic and complex environment.
Terms like “virtualization sprawl” have been coined to refer to the increase in the number of discrete virtual servers and related application components within the overall IT environment. Sprawl is no longer a hypothetical scenario: organizations are already experiencing administration challenges because of the fundamental IT transformation driven by virtualization and cloud technologies. Consider the case of a leading educational institution in the Northeastern United States. Prior to embarking on an aggressive virtualization initiative, the operations team was responsible for ensuring the performance of approximately 1,000 distinct physical servers. By the time the first phase of the server consolidation and virtualization initiative was completed, the team was tracking and managing the performance of over 7,000 virtual servers!
As the number of discrete virtual servers, components and resident applications explodes, the performance monitoring and root-cause-analysis demands on IT administrators will grow just as sharply. Labor-intensive legacy tools and point monitoring tools will not be able to keep up, and organizations will face significant challenges in detecting and resolving issues in a timely manner. In one recent case of an organization being overwhelmed, the IT team resorted to forced daily ‘proactive reboots’ of a large number of its servers. The team claimed that this workaround was the only way to keep the infrastructure performing, given the absence of a comprehensive monitoring and management solution to identify real issues and isolate problem sources. The IT team acknowledged that the organization’s users and business operations were being impacted by this daily reset cycle, but viewed this approach as the lesser evil compared to blind, reactive fire-fighting!
Of course, the better approach would be to take a more strategic stance and implement the right systems and processes to assure the performance of the IT infrastructure. Today’s cloud monitoring software solutions have to be capable of automating many of the routine administration tasks. More importantly, these systems need built-in intelligence to infer what is going on in the IT infrastructure and to automate decision-making. The increased demands on the IT team will be partially offset by the automation capabilities of the monitoring solution, allowing IT personnel to focus on the deeper and more complex administration tasks. Furthermore, the overall efficiency and utilization of IT resources will be higher with the right capabilities in the IT monitoring software (see http://tiny.cc/cwytn to learn how).