
TITLE PAGE

Data Center Consolidation: Using Performance Metrics to Achieve Success

By Omar Zaidi, International Network Services
Ronnie Ray, InfoVista Corporation

INTRODUCTION

In the modern business climate, consolidation of IT operations is a common route to cost reduction, efficiency improvement, technology consolidation, service level management, and quality-driven best practices. The following are common problems that lead to the need for consolidation projects:

- Acquisitions have resulted in multiple data centers supported by a single, shrinking IT budget;
- Servers have multiplied beyond the ability to manage them effectively and economically;
- Applications have become more complex, often resulting in multiple layers of technology, each of which is managed by a separate functional group;
- Organizational boundaries prevent smooth communication or cooperation, diminishing the quality of service to the end user;
- IT spending is spread across dozens of budgets, which are difficult to administer and rationalize.

However, despite the obvious potential gains, consolidation projects are far from simple, either technically or organizationally. The economies of scale to be gained by consolidating servers and data centers are an attractive economic incentive, but inadequate planning can result in problems that quickly eradicate the potential savings. The more complex the project, the more essential is a risk mitigation strategy to minimize network disruption and protect application availability.

This paper presents a framework for planning and executing data center consolidation efforts to ensure success with minimal risk to the business and, in that context, discusses the critical role of performance monitoring and reporting throughout the life cycle of a consolidation project.

TYPES OF DATA CENTER CONSOLIDATION EFFORTS

Data center consolidation initiatives can be driven by a range of objectives. Most apparent are the needs of IT operations to reduce complexity, simplify support, improve processes and procedures, and gain better control over sprawling IT resources. Additional business concerns may include the need to comply with government regulations, transition to an IT outsourcing model, gain control over security vulnerabilities, improve business continuity capability, and establish the ability to extract business-related intelligence from IT operations. Driving all of this is the need to realize both immediate and ongoing cost savings, sometimes with the intent of freeing funds for investment in new technologies to further improve business operations. The following table identifies various types of consolidation initiatives and associated areas of potential cost savings.

Consolidation Type: Physical Consolidation
Objectives/Description:
- Reduction in the number of data center sites through co-location
- Migration of equipment to site locations with lower running costs
Cost Savings:
- Facility costs
- Hardware & software maintenance
- Non-application specific admin staff

Consolidation Type: Equipment Standardization
Objectives/Description:
- Replacement of existing hardware platforms with a selected standard (can include server, storage and/or network elements)
Cost Savings:
- Hardware costs
- Hardware maintenance
- Non-application specific administration staff costs

Consolidation Type: Server & Storage Consolidation
Objectives/Description:
- Optimize the number of applications per service platform
- Replace many low-end servers with fewer high-end servers
- Share backup, replication and data recovery platforms
Cost Savings:
- Number of servers
- Number of storage/backup devices
- Hardware maintenance
- Storage and backup staff

Consolidation Type: Application Consolidation
Objectives/Description:
- Standardization of applications used across the business
Cost Savings:
- Number of servers
- Software maintenance
- Application specific admin staff

As one moves down the rows of the table, potential cost savings increase dramatically along with the complexity, project cost and associated timeline. The more complex the effort, the more important it is to incorporate the use of state-of-the-art performance tools and methodologies to mitigate risk to business operations and realize projected cost savings.

A RECOMMENDED FRAMEWORK FOR PLANNING AND EXECUTING DATA CENTER CONSOLIDATIONS

Before discussing performance elements in more detail, this section provides an overview of a best practices consolidation project framework. Because data center consolidation projects can dramatically change the way IT assets are deployed and used to serve the core business processes, a well-founded framework is critical. Too often, enterprises begin a data center consolidation (DCC) effort and later find that the project is under-funded, or that the IT department has not agreed with the business verticals (lines of business) on how the DCC should be undertaken. A well-organized framework will avoid many of these pitfalls and allow the full benefits to be derived from the effort.

The framework below identifies four phases: Plan, Design, Implement and Operate. Each phase must address several layers: business, application and network. At the intersection of each phase and layer are key project functions and deliverables. This section describes the deliverables and the key considerations in developing them.

The arrows on the diagram indicate the primary flow within each phase. The downward arrows in the planning and design phase indicate that during these phases, the business layer must drive application and infrastructure requirements and deliverables. For the implementation and operations phases, the flow is reversed, with infrastructure feeding the application layer, which feeds into the business layer. The implementation phase at the business layer is blank, and this is intentional; if planning and design are done correctly, there should be little impact to the business layer during the implementation. Most data center consolidation efforts are considered failures by the lines of business (business verticals) in a company when there is significant perceived negative impact to ongoing business processes. The goal of using this framework should be to ensure that no negative impact occurs. The following table(s) provide a brief description of the tasks within the framework and the risks associated with omission or poor implementation.

PLAN PHASE

Business Layer

Key Task: Justify business case and complete financial planning to gain project funding.
Associated Risks: Poor formulation could result in partial funding or lack of management commitment to see the consolidation project through to completion. If executive sponsorship/funding is lacking, business verticals will be reluctant to put forth resources to assist in the design and implementation stages.

Key Task: Ensure vertical business units have buy-in for project and ownership transition of assets/software.
Associated Risks: Lack of an ownership agreement can place the entire project in jeopardy due to lack of communication and follow-through on key actions. Rank and file members of IT and business verticals may not embrace the coming changes if there is a perception that upper management is not supportive and unified.

Application Layer

Key Task: Inventory applications, including current architecture and traffic flows.
Associated Risks: Lack of visibility into application architecture can significantly increase the chances of post-migration performance failures. Proper instrumentation of applications to monitor service level compliance requires a full grasp of the application's role in the business, the transactions it supports, and the architecture of the application (physical, logical, data flows, etc.).

Key Task: Performance assessment of applications to determine risks during/after migration.
Associated Risks: If applications are not assessed to determine which ones could suffer performance hits due to a consolidation, the likelihood increases that end users will be dissatisfied with application performance after migration. This step is critical to ensuring that analysis and optimization efforts are focused on the applications at risk, so that SLA compliance is maintained after the migration.

Network Layer

Key Task: Baseline traffic patterns and system-wide performance of components and applications.
Associated Risks: Missing or incomplete baselining information makes it difficult to fully determine which migration and consolidation strategies will work best from a performance and scalability perspective. Ensuring that the investment in a consolidation is recovered quickly is key to project success. A new environment that does not scale or grow over its expected lifespan is not a desirable end result. A good baseline of the current environment shows the design team where economies of scale can be achieved, while also pointing out bottlenecks that could be alleviated to ensure the new environment grows and lives as expected.

DESIGN PHASE

Business Layer

Key Task: Review business processes and map to applications.
Associated Risks: Lack of information about how business processes map to applications seriously impairs the ability of the design team to plan migration/consolidation activities and design the migration plan for a given application. Ensuring minimal impact to the business from consolidation activities depends on fully understanding the linkages between the business and applications.

Key Task: Review disaster recovery plans.
Associated Risks: Missing or incomplete information about disaster recovery plans creates a serious risk of application and service outages across the business in the event of a disaster.

Key Task: Gap analysis of operations.
Associated Risks: Not reviewing operations and ensuring that a good platform exists (tools, people, processes) for managing the infrastructure after the consolidation can put the business at serious risk of having service levels and KPIs violated. It is far better to have visibility into these metrics and know that corrective measures are in place in the operations space to deal with event management, trouble ticketing, break/fix management, vendor management and other issues that will be impacted by the migration.

Application Layer

Key Task: Design application migration plan and conduct engineering review.
Associated Risks: Applications and servers need to be considered as a whole constellation of components that work together to bring data to the user and process user requests. All the previously discussed tasks/functions/deliverables need to be completed, as they all provide inputs to the design process and ensure that a scalable design emerges that can be implemented easily.

Key Task: Conduct service level impact analysis to determine which business processes will have performance changes.
Associated Risks: Not studying the service level impacts of the proposed migration strategies can be a fatal mistake, because if performance problems are found after the migration it may be too late to change the design.

Infrastructure Layer

Key Task: Design new infrastructure and validate.
Associated Risks: If the infrastructure design is not completed and validated with some knowledge of application and business layer inputs, scalability and reliability of the infrastructure could be much lower than expected. Given the large investments being made in data center consolidation strategies, it makes sense to do some testing/validation of the infrastructure so that the operating bottlenecks and thresholds are known.

Key Task: Construct provisioning plan.
Associated Risks: Lack of a provisioning plan will dramatically increase the likelihood of scalability, reliability and availability problems. It can also increase the turn-up time required at a data center to get an application migrated and put into service.

Key Task: Instrumentation plan for all components and applications during/after migration.
Associated Risks: Gaps in the monitoring of infrastructure/application components constitute a serious risk to service level compliance. As the old saying goes, "If you cannot measure it, you cannot optimize it," and you will not know whether you are in compliance or out of compliance.

IMPLEMENT PHASE

Business Layer

Key Task: If proper planning/design is done, the business layer should not feel impact from the migration.

Infrastructure Layer

Key Task: Build out infrastructure to scalability and growth requirements.

Key Task: Provision transport services.
Associated Risks: Gaps in transport services will render application components unreachable or seriously impact scalability and reliability of applications which support key business processes.

Key Task: Instrument infrastructure for performance, security, network management.
Associated Risks: If components of infrastructure are not instrumented to collect/report performance indicators, the result will be poor visibility into service level compliance.

Application Layer

Key Task: Provision applications, instrument them and optimize them based on performance measurements.

Key Task: Implementation is usually done as a pilot case, followed by scaling and growth of the design to handle increasing loads and more applications being deployed.
Associated Risks: Piloting the design on a small scale is the only effective way to debug problems in provisioning, implementation, scaling or any design flaws that might exist. Not successfully completing a pilot can put the reliability and performance of the design at risk of failure.

Key Task: Allow for re-working of designs; contingencies will arise.
Associated Risks: If allowances are not made for reworking the design to remove flaws found during the pilot, then project schedule and budget assumptions may become invalid.

OPERATE PHASE

Infrastructure Layer

Key Task: Implement operations architecture and tools for new environment.
Associated Risks: Not implementing an operations framework (people, processes, tools) which is appropriate to the new data center environment can result in poor visibility into performance, availability and scalability of infrastructure. Additionally, service levels for outages, break/fix repair and problem resolution could be seriously impacted.

Key Task: Start ongoing baselining and KPI measurements.
Associated Risks: Not baselining infrastructure regularly and reporting on KPIs will result in poor data on service level compliance.

Application Layer

Key Task: Collect service level data and report on SLA compliance.
Associated Risks: Service level compliance is a key yardstick for the business units to use in determining how effectively applications are serving the business after the move. Failure to measure and report on it risks losing the credibility of the IT department among the vertical business units in the enterprise.

Key Task: Gather data on traffic patterns and transaction volumes to manage capacity.
Associated Risks: Proactive planning for infrastructure upgrades cannot be done unless utilization is trended.

Business Layer

Key Task: Provide reports for Quality of Experience (application performance across a user community such as sales, marketing, engineering, etc.) and also provide reports for business/executive levels.
Associated Risks: Providing high level summarization of performance across multiple applications is critical to giving executives and application owners quick visibility into how their applications are serving end users. Key decision makers and stakeholders need to get to rolled-up data quickly and have the capability to drill down from there into service level compliance data and application performance data. Reporting only application layer and service level layer data risks confusing the consumers of the data, who may not be experts in all the different applications which serve the business unit.

THE IMPORTANCE OF PERFORMANCE METRICS IN DATA CENTER CONSOLIDATION

Encapsulating key points from the above outline, effective and accurate performance monitoring and management is critical in every phase of the data center consolidation process, as follows:

- In the pre-consolidation planning and design phases, the emphasis is on establishing historical performance baselines for the networks and systems architecture. This provides valuable intelligence in identifying resources that are under stress or those that are inefficiently used, and their relationship with end-to-end services, lines of business and supported applications. This knowledge directly contributes to the planning of the new architecture of the data center. Without solid network modeling and performance impact analysis, slowed application response times and network failures are unavoidable.
- During the implementation phase, the actual migration of infrastructure, servers and applications will cause dramatic changes in network traffic patterns. Real time monitoring proactively assesses how migration affects systems and network segments and highlights abnormalities in configuration and design. Comparative analysis of the shifting utilization loads and flows of network and systems against planned design goals provides an effective gauge to benchmark success and make adjustments to the plan as necessary.
- In the post-consolidation production phase, comprehensive performance management establishes performance baselines for the new environment and provides ongoing intelligence around problem resolution, capacity planning and service management of the consolidated architecture.

Through all the phases outlined above, the key contributions of an effective performance management system are the following:

- Decision support
- Risk mitigation
- Service assurance

In the following sections we look at each of these aspects more closely.

Decision Support

Consolidation decisions need to be backed by accurate historical data that reveals the impact of application and business growth on data center infrastructure performance. Such data not only covers individual network and systems resources, but also rolls up performance degradation, utilization and downtime history to key business or service groups. Embedded intelligence and analytics within the performance management system clearly identify business and service groups that are close to running out of capacity, those that are in the normal usage range and those that have resources that are largely underutilized. These aggregated views easily link to underlying components and resources at risk or those with spare capacity, providing the basis for crucial performance and capacity decisions around the new architecture.

Risk Mitigation

While consolidation projects offer sufficient incentives in cost savings, service quality and improved business agility, they also carry the risk of adversely affecting business operations if improperly managed. The key to risk mitigation is accurate and effective visibility of performance across all phases of the project. In the pre-consolidation phase, intelligent trend analytics provide visibility and insight into current and past utilization as well as trends on future capacity needs based on business growth. During the implementation phase, real time analytics and historical comparatives provide actionable intelligence for reducing errors and downtime, preventing performance lags and enabling on-the-fly adjustment of implementation plans. In the post-consolidation phase, proactive performance management assures the health of supported business services from performance, capacity and service management perspectives.
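
The capacity grouping described under Decision Support and the trend-based view of future capacity described under Risk Mitigation can be illustrated with a minimal Python sketch. The paper gives no thresholds or forecasting method, so the 80% and 30% cut-offs, the straight-line fit, and the function names `classify_group` and `periods_until_full` are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical thresholds; the paper does not specify any values.
NEAR_CAPACITY = 0.80
UNDERUTILIZED = 0.30


def classify_group(utilizations):
    """Label a business/service group from the mean utilization
    (0.0 to 1.0) of its resources."""
    avg = mean(utilizations)
    if avg >= NEAR_CAPACITY:
        return "near capacity"
    if avg <= UNDERUTILIZED:
        return "underutilized"
    return "normal"


def periods_until_full(history, capacity=1.0):
    """Fit a least-squares line to evenly spaced utilization samples and
    estimate how many more periods pass before the trend reaches
    `capacity`. Returns None if there are fewer than two samples or the
    trend is flat or falling."""
    n = len(history)
    if n < 2:
        return None
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    if slope <= 0:
        return None
    intercept = y_bar - slope * x_bar
    # Solve intercept + slope * x = capacity, measured from the last sample.
    return (capacity - intercept) / slope - (n - 1)
```

A real deployment would likely use longer histories and a seasonality-aware model; the straight-line fit is only the simplest way to turn a utilization history into a capacity horizon.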

Service Assurance

The next generation architecture of the consolidated data center demands new levels of reporting for services and lines of business. With the increasing involvement of business units in IT decision making, visibility around service level delivery and business impact is increasingly important. This visibility helps establish the dynamic of a successful consolidation process with the stakeholders, ensuring trust and support for the future rounds of consolidation that are inevitable with business growth. Key capabilities of the performance management system that ensure the health and future growth of business systems include: predictive performance alerting prior to end users being affected; proactive capacity planning that incorporates trends of business and application growth; and role based service level reporting that provides the business and technology aggregations of performance to the respective audiences.
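
One simple form of the early-warning alerting mentioned above is to compare a current measurement against a historical baseline. The sketch below is an assumption-laden illustration rather than the method of any particular product: it flags a sample that exceeds the historical mean by more than `k` standard deviations, and both the function names and the default `k=3.0` are hypothetical.

```python
from statistics import mean, stdev


def baseline(samples):
    """Summarize historical samples as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def early_warning(current, history, k=3.0):
    """Return True when `current` exceeds the historical mean by more than
    k standard deviations. k=3.0 is an illustrative default, not a value
    taken from the paper."""
    mu, sigma = baseline(history)
    return current > mu + k * sigma
```

For example, with historical response times of 100, 110, 90, 105 and 95 ms, a reading of 130 ms would raise a warning while 115 ms would not.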

KEY CHARACTERISTICS OF THE PERFORMANCE MANAGEMENT PLATFORM

In a well-planned consolidation project, the performance tools instrumented for use during the planning, design and implementation stages are convertible to an ongoing performance management platform and integrated into the post-implementation service management architecture. Such tools should be selected, then, with the full life cycle of the project in mind, and implemented at every stage in a manner that incorporates thorough knowledge of the underlying business requirements. A well-instrumented performance management platform both ensures the success of the consolidation effort and establishes the basis for future planned growth and new technology initiatives.

In order to best deliver support on performance intelligence through the data center consolidation process, the following key characteristics were identified. While these characteristics apply equally to both system and network performance management systems, we will focus on the effect on server consolidation as an illustrative example case. Further, we will demonstrate some of these aspects by drilling down into the capability and reporting provided by InfoVista's VistaInsight for Servers solution. Note that similar capability is available in InfoVista's VistaInsight for Network solution to comprehensively manage network infrastructure elements in the data center through the consolidation phase.

Composite KPIs

The data center that is being consolidated can easily have several hundred servers, and often several thousand. This is more so with the proliferation of lower end server systems that typically run a single application with support for a limited number of concurrent users. The number of individual performance metrics required to provide comprehensive visibility into server performance can easily reach 20 to 30 or more per server, resulting in an overload of available information to analyze for effective decision support.

The need is paramount for aggregation of these individual metrics into composite Key Performance Indicators (KPIs) that provide a holistic view of server health while retaining the sensitivity to changes in the underlying data. Composite KPIs break down the information overflow into manageable streams of actionable management data and intelligence. For example, VistaInsight for Servers provides an assessment of the Workload of each individual server. The Workload KPI incorporates critical metrics related to CPU utilization, process queue length, memory utilization and swap space, disk and network I/O utilization and many other metrics. An at-a-glance view of server Workload can easily provide a holistic view of the health of the server and its potential for consolidation.

Combined real time and historical analytics

Data center consolidation is an ongoing process that needs risk management from both real time and historical perspectives. Ongoing historical trend analytics provide meaningful data on what and how to upgrade, right-size or consolidate in alignment with business needs. Real time early warning analytics, based on continuous assessment of current performance against historical baselines, mitigate risk during the migration phase and ensure service continuity in the post-consolidation phase. The ability to switch between real time performance issues and historical measures of key KPIs provides the right balance for both forward planning and operational support of service delivery. VistaInsight for Servers can provide easy drill downs from top-down service performance KPIs (e.g. availability and response time of an online reservation system) to real time reporting of the underlying raw metrics. Alternatively, a real time service level violation can be easily linked back to historical performance assessments of the problem resource, providing actionable intelligence into the pattern and scope of the problem.

Business and service group driven prioritization and navigation

One of the key factors behind decision support on data center consolidation projects is the impact of the planned changes on business operations. Business relevance drives the priorities and action plan for undertaking the consolidation effort. The performance management system of choice thus needs to provide top-down views of infrastructure performance and business impact rolled up to the business or service level through the use of composite KPIs. Portal based navigation can cover capacity and performance information across business or service groups with the ability to drill down into individual infrastructure performance. The top-down business and service centric visibility is also key to creating the high level dashboard views that can report into senior management and business lines of the organization and clearly communicate the value of the consolidation process. VistaInsight for Servers provides graphical charts like Workload Distribution that visually identify the load distribution across servers within a business group. The multi-tiered grouping capability can mirror the organizational structure and create an accurate picture of residual capacity and business impact at each level.

Heterogeneous vendor and systems support

Data centers today host multiple hardware and vendor platforms, driven by application and business requirements and IT policies dictating such acquisitions. Any performance management system that provides intelligence around the data center consolidation process must be able to collect, analyze and report on data from heterogeneous systems through a common KPI model. Each KPI is therefore vendor and technology independent and is available for dynamic aggregation and analytics across business and service groups. VistaInsight for Servers supports all the major server operating platforms including Windows, Linux, Solaris, HP-UX and IBM AIX. It also supports a variety of data collection agents from the leading enterprise system management vendors to protect current investments in performance reporting instrumentation.

Flexible and personalized portal based reporting

The many stakeholders around a data center consolidation process, including the lines of business, technology management and operations staff, need to have their specific view into the key metrics that drive their activity. Flexible and easily customizable portals provide just that while reporting from a common set of KPIs and analytics. The portal is also key to integrating external data sets from technology and business systems to effectively align business metrics with data center infrastructure. VistaInsight for Servers provides dynamic dashboards and portal views that are easily targeted to different roles within the organization. Historical service level views by business group can be easily correlated to real time status on infrastructure performance, capacity and exception alerts.

Unified Cross-silo Views

Data center infrastructure spans traditional silos like networks, systems and applications. The creation of service-centric cross-silo views (e.g. across services, applications, systems and networks) facilitates bi-directional drill down between infrastructure issues and service impact. Performance events prioritized by business impact can drill down to the problem resource regardless of silo of origin as part of a shared business service model. The relationship between business and service groups, end-to-end service management and supporting infrastructure provides the right visibility for continued support of business operations. VistaInsight for Servers easily integrates with other VistaInsight solutions to provide a consolidated view of the entire data center service infrastructure via a single pane of glass portal.
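
To make the composite KPI and business-group rollup ideas from this section concrete, here is a minimal Python sketch. It is not the formula used by VistaInsight for Servers, which the paper does not disclose. The weights, the reduced metric set, the assumption that every metric is already normalized to the range 0.0 to 1.0, and the names `workload` and `rollup` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical weights over metrics normalized to 0.0-1.0; they sum to 1.0.
WEIGHTS = {"cpu": 0.4, "memory": 0.3, "disk_io": 0.15, "network_io": 0.15}


def workload(metrics):
    """Composite score for one server: weighted average of its metrics."""
    return sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)


def rollup(servers):
    """Group servers by business group. For each group, return the mean
    workload (top-down view) plus per-server scores (drill-down view)."""
    groups = defaultdict(dict)
    for s in servers:
        groups[s["group"]][s["name"]] = workload(s["metrics"])
    return {
        g: {"mean_workload": sum(scores.values()) / len(scores), "servers": scores}
        for g, scores in groups.items()
    }
```

Each server is passed as a dictionary with `name`, `group` and `metrics` keys. A group with a low mean workload and several lightly loaded servers would be a natural candidate for consolidation, while the per-server scores preserve the ability to drill down to individual machines.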

Ease of implementation and management

Managing all the moving pieces in a data center consolidation environment is a complex effort. Auto discovery driven dynamic provisioning that adapts to infrastructure changes on a periodic basis can lessen some of that pain. Out of the box analytics that are tuned to providing the right KPI visibility supporting data center consolidation can further facilitate the process. Automation through out of the box implementation and adaptive performance management of the moving infrastructure is a highly desired characteristic of the targeted performance management system. VistaInsight for Servers delivers on this automation by providing packaged intelligence and analytics for out of the box implementation around the server consolidation process. Periodic rediscovery identifies changes in the infrastructure, automatically starts collection and reporting for new servers, and stops reporting for servers that have moved.
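
The periodic rediscovery step described above amounts to comparing the set of servers currently being monitored with the set found by the latest discovery run. The following Python sketch shows that comparison under simple assumptions: servers are identified by unique names, and the function name `reconcile` is hypothetical rather than part of any product API.

```python
def reconcile(monitored, discovered):
    """Compare the currently monitored servers with the latest discovery
    result and return (to_start, to_stop) as sorted lists of names."""
    monitored, discovered = set(monitored), set(discovered)
    to_start = sorted(discovered - monitored)  # new servers: begin collection
    to_stop = sorted(monitored - discovered)   # moved or retired: end reporting
    return to_start, to_stop
```

Running this on each discovery cycle keeps the monitoring inventory aligned with the infrastructure as servers are migrated or retired during consolidation.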

CONCLUSION

Realizing the goals of a complex data center consolidation project requires adherence to a well thought out plan that addresses business, organizational and technical considerations. The plan and framework need to address issues at the business layer, application layer and infrastructure layer. The framework should also break down the data center migration effort into phases that involve planning and information discovery, design creation/validation, implementation and scaling/growth into mature operations. Broad executive and business unit sponsorship, along with risk mitigating processes and tools, will help remove organizational barriers to success. The use of performance management and reporting solutions during the planning and implementation phases of the project will provide the metrics required to measure success as well as enable management of the IT infrastructure from a business service perspective. A well-planned data center consolidation effort will also lay the groundwork for converting the performance management and monitoring tools used during the early stages into an integral part of ongoing IT operations. This holistic approach to performance management and monitoring will not only ensure the success of the project, but will build new levels of business intelligence into IT operations that will enable the company to realize ongoing cost savings and service improvements, continuing to contribute to the company's bottom line.
