
Data Center Cooling Management and Analysis - A Model Based Approach

Rongliang Zhou, Zhikui Wang, Cullen E. Bash, Alan McReynolds

Abstract: As the hub of information aggregation, processing, and dissemination, today's data centers consume a significant amount of energy. Data center electricity consumption comes mainly from the IT equipment and the supporting cooling facility that manages the thermal status of the IT equipment. The traditional data center cooling facility usually consists of chilled-water-cooled computer room air conditioning (CRAC) units and chillers that provide chilled water to the CRAC units. Electricity used to power the cooling facility can take up to half of the total data center electricity consumption, and is a major contributor to the data center total cost of ownership. While the data center industry has established best practices to improve cooling efficiency, the majority of them are rules of thumb that provide only qualitative guidance. In order to provide on-demand cooling and achieve improved cooling efficiency, a model based description of the data center thermal environment is indispensable. In this paper, a computationally efficient multivariable model capturing the effects of CRAC unit blower speed and supply air temperature (SAT) on rack inlet temperatures is introduced, and model identification and reduction procedures are discussed. Using the model developed, data center cooling system design and analysis tasks such as thermal zone mapping, CRAC unit load balancing, and hot spot detection are investigated.

I. INTRODUCTION

Due to the ever-increasing computing density, and hence power density, of IT equipment, today's data centers require a tremendous amount of cooling power to maintain the desired thermal status. According to [1], [2], about a third to a half of total data center power consumption goes to the cooling system. Highly efficient cooling systems are thus indispensable for reducing the total cost of ownership and environmental footprint of data centers.

Figure 1 shows a typical raised-floor air-cooled data center with hot aisles and cold aisles separated by rows of IT equipment racks. The thermal requirements of IT equipment are usually specified in terms of the inlet air temperatures of the equipment [3]. The equipment temperature thresholds are not necessarily uniform across the entire data center, but depend on the functions, such as computing, storage, and networking, that the IT equipment serves. Service contracts of the IT workload hosted on the IT equipment can also affect the temperature threshold.

The blowers of the Computer Room Air Conditioner (CRAC) units pressurize the under-floor plenum with cool air, which in turn is drawn through the vent tiles located in front of the racks in the cold aisles. Hot air carrying the waste heat from the IT equipment is rejected into the hot aisles. Depending on its design, the CRAC unit internal control can regulate the chilled water valve opening to track the given reference of Supply Air Temperature (SAT) or Return Air Temperature (RAT). The flow rate of the cool air supply can also be tuned continuously if a Variable Frequency Drive (VFD) is installed in each CRAC unit to vary the speed of its blowers.

For the particular configuration shown in Fig. 1, neither the cold aisles nor the hot aisles are contained, and hence air streams are free to mix. Most of the hot air in the hot aisles returns to the CRAC units, but a small portion of it might escape into the cold aisles from the top, the sides, or even the bottom of the racks and cause recirculation. Recirculation can also be due to reverse flows in certain IT equipment (some network switches, for example) whose internal fans blow hot exhaust air from the hot aisle into the cold aisle. The inlet air flow of the IT equipment is thus a mixture of cool air from the vent tiles in its vicinity and recirculated hot air [4]. The recirculation of hot air into the cold aisle generates entropy and lowers the data center cooling efficiency.
Sustainable Ecosystems Research Group, HP Labs, Hewlett-Packard Company, 1501 Page Mill Road, Palo Alto, CA 94304-1126.

Fig. 1. Typical Raised Floor Data Center

{firstname.lastname}@hp.com

The challenge of data center cooling management is to coordinate the tuning of CRAC unit blower speeds and supply air temperatures (SAT) so as to minimize the power consumption of both the CRAC units and the chiller plants, while maintaining hundreds or even thousands of rack inlet temperatures below their respective thresholds. The major hurdle in meeting this challenge is the lack of simplified and computationally efficient models that are capable of capturing the complex energy and mass flows within data centers and of performing transient cooling analysis.

Computational Fluid Dynamics (CFD) has been used extensively for data center cooling system design, but most of the applications are built upon steady-state system analysis. The few CFD based transient cooling system performance analyses reported so far have focused on predicting system responses to cooling failures. Beitelmal and Patel [5], for example, use transient CFD simulation to investigate the change in data center temperature distribution caused by a malfunctioning CRAC unit, and show that acceptable rack inlet temperatures can still be maintained if the IT load and available cooling resources are appropriately reorganized. In another application, CFD simulation is used to analyze various failure scenarios of IBM's cooling infrastructure design for a water-cooled cluster of 11 racks [6]. While transient CFD analysis provides valuable insight into data center cooling system design, as well as measures to handle different failure modes, it is impractical for real-time data center cooling management. This is partly because data center IT load can change from minute to minute, whereas transient CFD analysis is normally time consuming and can take hours or even longer to finish. In addition, as pointed out in [7], [8], it is usually difficult to capture the true IT and cooling configurations in sufficient detail to predict the resulting environment with the desired accuracy.

Targeting higher computational efficiency, researchers have used or developed alternative approaches for transient data center cooling performance analysis. Khankari [9] uses a simple energy balance model to investigate the availability of data center thermal mass in various configurations during power shutdown, Kummert et al. [10] study the effects of chiller failure and cooling system thermal inertia on room temperature variations, and Zhang together with VanGilder develops data center transient thermal models. These alternative approaches, however, achieve the improved computational efficiency by sacrificing the spatial non-uniformity witnessed in most data centers, and hence are not suitable for real-time data center thermal management either.

In an attempt to bridge this gap, the authors' recent work [11], [12] develops physics based state-space models that describe air flow transport and distribution within data centers.
The parameters of the models are obtained from measurement data of system identification experiments, and hence are ensured to reflect the data center reality emphasized in [8]. The physics based data center cooling model is utilized in [11] to coordinate zonal cooling actuation (CRAC unit blower speeds and SAT) and local cooling actuation (adaptive vent tiles), and validation on a small portion of a research data center shows significant cooling power savings. Using the same but simplified physics based model, the authors present in [12] a decentralized model predictive control (MPC) design approach for CRAC unit SAT and blower speed regulation targeting large scale data centers. Compared with the commonly used CFD modeling, the model employed in this paper is computationally light without losing the data center spatial non-uniformity, and is suitable for both offline analysis and online dynamic control. In this paper, the computationally efficient model developed is used to perform critical cooling system design and analysis tasks such as thermal zone mapping, CRAC unit load balancing, and hot spot detection.

The remaining sections of this paper are organized as follows. Section II first briefly introduces the dynamic rack inlet temperature model we developed previously using energy and mass balance principles, followed by a discussion of the model identification and reduction procedures. In Section III, we demonstrate how model based grouping of rack inlet temperatures and CRAC units can be used for improved thermal zone mapping and coordination of decentralized cooling controllers. Section IV defines the Hot Spot Index (HSI) using the model parameters and shows that it is powerful in data center hot spot detection. Finally, Section V concludes the paper with a summary of the work presented.

II. DYNAMIC COOLING SYSTEM MODELING AND MODEL PARAMETER IDENTIFICATION

A. Dynamic Cooling System Modeling

In this subsection, we briefly introduce simplified models, derived from basic mass and energy balance principles, that characterize the complex mass and energy flows within raised-floor air-cooled data centers. Since only the modeling results are presented, interested readers can refer to [11], [12] for a more detailed description of how these models are derived.

In an open environment, the air flow entering the IT equipment inlet is a mixture of cool air from the CRAC units (through the vent tiles) and recirculated hot (exhaust) air that escapes into the cold aisle. In a hot aisle contained environment, recirculation, although significantly reduced, can still exist because of imperfect containment or reverse flows from some network switches that draw hot air from the hot aisle for cooling and reject even hotter air (up to 40 °C) into the cold aisle.

Fig. 2. Air Mixing at the Rack Inlet

Consider a small control volume in the proximity of the rack inlet with mass $m$ and temperature $T$, as shown in Fig. 2. Cool and recirculated hot air flows with mass and temperature $(m_c, T_c)$ and $(m_h, T_h)$ enter the control volume, mix well with the air $(m, T)$ already in the volume, and then leave the control volume altogether and enter the rack inlet with total mass $m'$ and temperature $T'$. It can be found that the temperature change $\Delta T$ of the air within the control volume before and after the mixing is:

$$\Delta T \triangleq T' - T = \frac{m_c (T_c - T)}{m + m_c + m_h} + \frac{m_h (T_h - T)}{m + m_c + m_h}, \tag{1}$$

which reveals that the influence of cool and recirculated hot air on rack inlet temperature can be mainly captured by $m_c (T_c - T)$ and $m_h (T_h - T)$, respectively.

In raised-floor data centers, all the CRAC units pressurize the under-floor plenum by blowing cool air into it. The cool air $m_c$ flowing into a rack inlet can come from all the CRAC units, and hence:

$$m_c = \sum_{j=1}^{N_{CRAC}} b_j \cdot VFD_j, \tag{2}$$

in which $N_{CRAC}$ is the number of CRAC units, $b_j$ quantifies the cooling air contribution from the $j$th CRAC unit to a specific rack inlet, and $VFD_j$ stands for the speed of the blower of the $j$th CRAC unit as a percentage of its maximum.

It can be seen from Eqn. (1) that both cool and recirculated hot air contribute to the rack inlet temperature change $\Delta T$. Since the recirculated hot air flow is beyond direct control, we can lump its effect into a time-varying term $C$ and simplify Eqn. (1) as:

$$\Delta T = T' - T = \frac{m_c \Delta t \, (T_c - T)}{m + m_c \Delta t + m_h} + C, \tag{3}$$

in which $\Delta t$ is the length of the sampling interval. The discrete form of Eqn. (3) is:

$$T(k+1) = T(k) + \left\{ \sum_{j=1}^{N_{CRAC}} g_j \, [SAT_j(k) - T(k)] \cdot VFD_j(k) \right\} + C(k), \tag{4}$$

in which $T(k+1)$ and $T(k)$ are the rack inlet temperatures at time steps $k+1$ and $k$, respectively. In Eqn. (4), $g_j$ quantifies the combined influence of VFD and SAT tuning of the $j$th CRAC unit, and also lumps the effects of the parameters $b_j$ and $\Delta t$ together with the nonlinearity associated with $m_c$. The vector form of Eqn. (4) for multiple rack inlet temperatures is:

$$T(k+1) = T(k) + F + C, \tag{5}$$

in which $T = [T_1, T_2, \cdots, T_{N_T}]^T$, $F = [F_1, F_2, \cdots, F_{N_T}]^T$,

$$F_i = \sum_{j=1}^{N_{CRAC}} g_{i,j} \, [SAT_j(k) - T_i(k)] \cdot VFD_j(k), \quad 1 \le i \le N_T,$$

$C = [C_1, C_2, \cdots, C_{N_T}]^T$, and $N_T$ is the number of rack inlet temperatures of interest.

B. Model Parameter Identification and Model Reduction

The parameters of the dynamic rack inlet temperature model described in Eqn. (5), including $g_{i,j}$ ($1 \le i \le N_T$, $1 \le j \le N_{CRAC}$) and $C$, can be obtained through model identification experiments. During the experiments, the available cooling actuation, such as CRAC unit blower speed and SAT, is perturbed through sequential or simultaneous step changes or other specifically designed identification sequences. Both the cooling actuation signals and the corresponding temperature responses at the rack inlets are collected. In order to find the model parameters $g_{i,j}$ ($1 \le i \le N_T$, $1 \le j \le N_{CRAC}$) and $C$, a nonlinear optimization can be performed on part of the experimental data collected such that the parameterized model minimizes the error between model prediction and rack inlet temperature measurements. The remaining part of the experimental data can then be used to validate the model identified. Note that after the model identification data is collected, the model parameter identification can be performed per rack inlet temperature, since different rack inlet temperatures are coupled only through the inputs. The associated parallelization can be exploited to speed up the model identification process and is extremely useful for large scale data centers with thousands of rack inlet temperatures.

From the physical perspective, every rack inlet temperature of interest is influenced by all the CRAC units, and correspondingly each CRAC unit affects all the rack inlet temperatures within the data center. As a result, the matrix $G = [g_{i,j}]$ ($1 \le i \le N_T$, $1 \le j \le N_{CRAC}$) obtained from the model identification process is fully populated. However, it is observed that most rack inlet temperatures are usually affected by a small number of CRAC units, and that this subset of CRAC units varies with the location of the specific rack inlet relative to the layout of the cooling facilities. In the dynamic rack inlet temperature model identified, this observation manifests itself in the fact that for any rack inlet temperature $T_i$ ($1 \le i \le N_T$), some of the $g_{i,j}$ ($1 \le j \le N_{CRAC}$) items are dominant, with noticeably larger values than the rest, suggesting that the non-dominant items may be ignored without significant effect on modeling accuracy.

In order to obtain a sparse $G$ matrix and hence take the associated advantages, such as system decoupling and parallel solution of sub-problems, we can perform model reduction for each rack inlet temperature. For rack inlet temperature $T_i$, define

$$g_{i,max} = \max(g_{i,j}), \quad 1 \le j \le N_{CRAC}.$$

In the reduced matrix $G^r$, item $g^r_{i,j}$ is set to the corresponding $g_{i,j}$ if and only if

$$g_{i,j} \ge \epsilon \, g_{i,max},$$

and otherwise $g^r_{i,j} = 0$. In the inequality above, $\epsilon$ is an adjustable threshold between 0 and 1. The physical interpretation of this model reduction process is that in order to regulate a specific rack inlet temperature, we need only consider the CRAC units that have significant influence over it, and can simply ignore the CRAC units that only marginally affect the rack inlet temperature of interest.
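As a rough illustration of the vector model in Eqn. (5) and of the model reduction rule described in this section, the following Python sketch implements one time step of the rack inlet temperature update and the row-wise thresholding of the gain matrix. The function names `step` and `reduce_gain_matrix`, the threshold name `eps`, and all numerical values are assumptions made for this sketch; they are not taken from the paper, and the parameter magnitudes are purely illustrative.

```python
from typing import List

Matrix = List[List[float]]


def step(T: List[float], G: Matrix, SAT: List[float],
         VFD: List[float], C: List[float]) -> List[float]:
    """One update of Eqn. (5): T_i(k+1) = T_i(k) + F_i + C_i, where
    F_i = sum_j g_ij * (SAT_j(k) - T_i(k)) * VFD_j(k)."""
    T_next = []
    for i, T_i in enumerate(T):
        F_i = sum(G[i][j] * (SAT[j] - T_i) * VFD[j] for j in range(len(SAT)))
        T_next.append(T_i + F_i + C[i])
    return T_next


def reduce_gain_matrix(G: Matrix, eps: float) -> Matrix:
    """Row-wise model reduction: keep g_ij only if g_ij >= eps * max_j g_ij,
    and set it to zero otherwise. `eps` is a hypothetical name for the
    adjustable threshold between 0 and 1."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie between 0 and 1")
    G_r = []
    for row in G:
        g_max = max(row)
        G_r.append([g if g >= eps * g_max else 0.0 for g in row])
    return G_r


if __name__ == "__main__":
    # Illustrative values only: 2 rack inlets, 2 CRAC units.
    G = [[0.004, 0.0005],
         [0.0003, 0.005]]
    G_r = reduce_gain_matrix(G, eps=0.2)
    print(G_r)
    # VFD is expressed as a percentage of maximum blower speed.
    print(step([25.0, 26.0], G_r, SAT=[15.0, 16.0], VFD=[50.0, 60.0], C=[0.2, 0.3]))
```

Because each row of the gain matrix is reduced independently, the reduction (like the per-rack identification discussed above) could be run in parallel across rack inlets.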

III. MODEL BASED GROUPING OF CRAC UNITS AND RACK INLET TEMPERATURES

The matrix $G^r$ obtained from the model reduction process retains only the strong relationships between CRAC units and rack inlet temperatures, with the weak ones trimmed. The sparse structure of $G^r$ implies a natural grouping of CRAC units and rack inlet temperatures, which can be used for both static system analysis and dynamic system control.

A. Thermal Zone Mapping

Previously, CRAC unit thermal zone mapping has been based on the absolute values of the thermal correlation index (TCI), defined in [13] as:

$$TCI_{i,j} = \frac{\Delta T_i}{\Delta SAT_{CRAC,j}}, \tag{6}$$

which quantifies the steady-state response of the $i$th rack inlet temperature to a step change in the SAT of the $j$th CRAC unit. This method, based on steady-state system information, has a drawback: the TCI values obtained through the system commissioning process depend on the blower speed settings of the CRAC units. The thermal zone of a particular CRAC unit established using TCI can expand or shrink when its blower speed increases or decreases, leading to a family of data center thermal zone mappings under different CRAC unit blower speed settings.

The drawback of the TCI based thermal zone mapping method outlined above comes from the fact that only steady-state system information is utilized, and the rich dynamic system information embedded in system parameters such as the $G^r$ matrix is left out. From the CRAC's perspective, CRAC #j effectively affects rack inlet temperature $T_i$ ($1 \le i \le N_T$) only if the corresponding item $g^r_{i,j}$ of matrix $G^r$ is nonzero. The nonzero items of the $j$th column of $G^r$ define the zone of influence of the $j$th CRAC unit, with the value of each item denoting the intensity of the influence of CRAC #j on a particular rack inlet temperature. Using this model based approach, Fig. 3 shows the thermal zones of a research data center with 8 CRAC units and 10 rows of racks, in which the thermal zone of each CRAC unit is approximated by a balloon. The significant overlap between the thermal zones of CRAC #3 and #4, and of CRAC #5 and #6, is clearly indicated in the figure. Since CRAC units with overlapping thermal zones all affect the rack inlet temperatures in the overlapping area, some coordination might be necessary between them so that their loads are balanced when managing the shared rack inlet temperatures.
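The column-wise reading of the reduced gain matrix described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the reduced matrix is given as a list of rows (one per rack inlet), the helper names `thermal_zones` and `overlapping_pairs` are hypothetical, and the small matrix in the example is invented for demonstration rather than taken from the research data center.

```python
from typing import List, Set, Tuple

Matrix = List[List[float]]


def thermal_zones(G_r: Matrix) -> List[Set[int]]:
    """Zone of influence of CRAC j: the rack inlet indices i whose
    entry G_r[i][j] in column j of the reduced matrix is nonzero."""
    n_crac = len(G_r[0])
    return [{i for i, row in enumerate(G_r) if row[j] != 0.0}
            for j in range(n_crac)]


def overlapping_pairs(zones: List[Set[int]]) -> List[Tuple[int, int, List[int]]]:
    """List CRAC pairs (a, b) whose zones share rack inlets, together with
    the shared inlet indices; such pairs are candidates for coordination."""
    pairs = []
    for a in range(len(zones)):
        for b in range(a + 1, len(zones)):
            shared = zones[a] & zones[b]
            if shared:
                pairs.append((a, b, sorted(shared)))
    return pairs


if __name__ == "__main__":
    # Illustrative reduced matrix: 4 rack inlets (rows), 3 CRAC units (columns).
    G_r = [[0.004, 0.0,   0.0],
           [0.003, 0.002, 0.0],
           [0.0,   0.005, 0.0],
           [0.0,   0.0,   0.006]]
    zones = thermal_zones(G_r)
    print(zones)
    print(overlapping_pairs(zones))
```

Since the zones depend only on which entries of the reduced matrix are nonzero, this mapping does not change with the current blower speed settings, which is the consistency property the model based method relies on.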

Fig. 3. Model Based Thermal Zone Mapping of a Research Data Center

Compared with the thermal correlation index (TCI) based method, the thermal zone identified using the model based method for each CRAC unit does not vary with its blower speed, since $G^r$ captures the combined effects of CRAC unit SAT and blower speed on rack inlet temperatures. Because of this consistency, the model based thermal zone mapping method is valuable for distributed controller design in large scale data centers. In the authors' previous work [12], a decentralized controller is designed for each CRAC unit to regulate the rack inlet temperatures within its established zone of influence.

B. Load Balancing Based on CRAC Units Grouping

In raised-floor air-cooled data centers, there might be strong interactions between neighboring CRAC units, causing load balancing problems such as load piggybacking and load swapping, which are not uncommon in decentralized or decoupled CRAC unit controller designs.

In load piggybacking, one of the CRAC units may keep increasing its cooling provisioning in order to drive a temporary rack inlet temperature violation below the specified threshold, and part of the cool air from this CRAC unit may also be routed to racks intended to be cooled by the neighboring CRAC units because of the shared under-floor plenum. Observing the rack temperature decrease in their thermal zones, the neighboring CRAC units may piggyback on the high load CRAC unit and decrease their own cooling provisioning, which in turn causes the high load CRAC unit to reach an even higher load. Load piggybacking often manifests itself as a high load CRAC unit with low SAT and high blower speed, with its neighbor(s) working at the opposite extreme with high SAT and low blower speed. The significant load imbalance between CRAC units caused by load piggybacking may shorten the CRAC units' life span, and the mixing of supply air streams with a large temperature difference also generates entropy and lowers the overall data center cooling efficiency.

Apart from load piggybacking, load swapping can also be observed for CRAC units that are individually controlled by different controllers. When stuck in load swapping, the high and low load status can switch back and forth between neighboring CRAC units, resulting in oscillation of the SAT and blower speeds. Load swapping can be triggered by temperature disturbances such as a sudden and temporary load increase from a server rack, the opening of a cold aisle vent tile for maintenance, or the introduction of free cooling from outside air, and may not stop without operator intervention.

The root cause of both load piggybacking and load swapping is the lack of coordination in decentralized or decoupled CRAC unit controllers, in which the physical input coupling between neighboring thermal zones is neglected. The local controller for each thermal zone tries to maintain the thermal status within the zone with the least effort (usually through minimization of the cooling power required), and uncoordinated local optimization can easily lead to a globally suboptimal solution (load piggybacking) or instability (load swapping). In order to address these problems, a simple yet effective load balancing mechanism needs to be established between neighboring thermal zones to coordinate the outputs of the CRAC units.

The first step toward CRAC load balancing is to identify, for each thermal zone, the neighboring thermal zones that it needs to coordinate with. For each rack inlet temperature $T_i$, the corresponding temperature violation $T_{v,i}$ over its reference temperature $T_{ref,i}$ is defined as:

$$T_{v,i} = T_i - T_{ref,i}.$$

In the $j$th thermal zone, the rack inlet temperature with the highest $T_{v,i}$ is called its master sensor, and is denoted as $T_{m,j}$. At each control interval, the local controller of each thermal zone broadcasts the index and reference temperature of its master sensor, together with its CRAC unit's current SAT and blower speed settings. Each local controller also receives broadcast messages from all other thermal zones, and compares its master sensor with those of its neighbors. After this information exchange, each local controller has a most recent snapshot of the settings of all the cooling controllers, and CRAC units sharing the same master sensor can be grouped for load balancing purposes.

In the case that several thermal zones share the same master sensor, it is desirable that all the CRAC units in this group coordinate to the same SAT, since mixing of supply air streams at different temperatures generates entropy and lowers cooling efficiency. In order to coordinate to the same SAT, the thermal zone with the highest SAT of the group adds to the SAT setting of its local controller an additional load balancing term:

$$\Delta SAT_c = k \cdot (SAT_{min} - SAT_{max}),$$

in which $SAT_{min}$ and $SAT_{max}$ are the lowest and highest CRAC unit SAT settings of the group, and $k$ is an appropriate feedback gain for SAT coordination. Note that in this coordination mechanism, load balancing is only performed on the low load CRAC unit to increase its provisioning, in order to minimize the chance of rack inlet temperature violation during the load balancing process.

Figure 4 shows the experimental results in the research data center shown in Fig. 3 with 8 CRAC units. Before load balancing is enabled shortly after time t = 1 hr, the data center has already entered a relatively steady state. CRAC #3/4 share the same master sensor, and CRAC #5/6 share another master sensor. Due to the lack of coordination between the decentralized controllers, each of which controls one CRAC unit, the supply air temperature difference is as large as 2.2 °C between CRAC #3 and #4, and 1.8 °C between CRAC #5 and #6. After load balancing is enforced at time t = 1 hr, CRAC #3 and #4 quickly converge to the same SAT, and the same is true for CRAC #5 and #6. The blower speeds of these four CRAC units, expressed as a percentage of the maximum variable frequency drive (VFD) output, also reach their new steady-state settings after load balancing is enforced, as shown in Fig. 4(b). The trajectories of the other CRAC units are not shown here since their settings do not change for the duration of the experiment, either because they do not share master sensors (CRAC #1 and #2) or because load balancing has already been achieved (CRAC #7 and #8).
Fig. 4. Load Balancing: (a) SAT, (b) VFD
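The grouping and coordination steps of the load balancing mechanism in Section III-B can be sketched as follows. This is a simplified, centralized illustration rather than the broadcast-based implementation described in the text: it assumes that every thermal zone is a non-empty set of rack inlet indices, that one CRAC unit serves each zone, and that the gain `k` is a positive constant chosen by the operator. The function names and all numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def master_sensor(zone: Set[int], T: List[float], T_ref: List[float]) -> int:
    """Index of the rack inlet in the zone with the largest violation
    T_v,i = T_i - T_ref,i (ties resolved toward the lowest index)."""
    return max(sorted(zone), key=lambda i: T[i] - T_ref[i])


def group_by_master(zones: List[Set[int]], T: List[float],
                    T_ref: List[float]) -> Dict[int, List[int]]:
    """Group CRAC indices whose zones share the same master sensor.
    Only groups with more than one CRAC unit need load balancing."""
    groups = defaultdict(list)
    for j, zone in enumerate(zones):
        groups[master_sensor(zone, T, T_ref)].append(j)
    return {m: cracs for m, cracs in groups.items() if len(cracs) > 1}


def sat_corrections(groups: Dict[int, List[int]], SAT: List[float],
                    k: float) -> Dict[int, float]:
    """For each group, the CRAC unit with the highest SAT (the low load
    unit) receives the term k * (SAT_min - SAT_max), which is non-positive
    for k > 0 and therefore pulls its SAT down toward the group minimum."""
    corrections = {}
    for cracs in groups.values():
        sat_min = min(SAT[j] for j in cracs)
        highest = max(cracs, key=lambda j: SAT[j])
        corrections[highest] = k * (sat_min - SAT[highest])
    return corrections


if __name__ == "__main__":
    zones = [{0, 1}, {1, 2}, {3}]          # zone of each of 3 CRAC units
    T = [24.0, 27.0, 25.0, 23.0]           # rack inlet temperatures
    T_ref = [25.0, 25.0, 25.0, 25.0]       # reference temperatures
    groups = group_by_master(zones, T, T_ref)
    print(groups)
    print(sat_corrections(groups, SAT=[18.0, 20.2, 19.0], k=0.5))
```

Applying the correction only to the unit with the highest SAT mirrors the choice in the text of adjusting the low load unit, so that the balancing step increases rather than reduces cooling provisioning.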

IV. MODEL BASED HOT SPOT DETECTION

In data center cooling management, the ability to identify hot spots is essential. While examining a snapshot or the temporal trends of the data center rack inlet temperature distribution is helpful, a systematic approach is needed to automate the process.

The detection of hot spots cannot be accomplished without the definition of an appropriate metric. While most people tend to believe that the highest rack inlet temperature within the entire data center or a thermal zone indicates a hot spot, and thus that temperature is the right measure, this is not always the case. First, the location where the highest rack inlet temperature is observed within the data center or a thermal zone can easily change as the settings of the CRAC units vary. A hot spot previously identified could disappear when the thermal zone it belongs to is over-provisioned while the neighboring thermal zones are insufficiently provisioned. Second, hot spot detection results based on temperature may be subject to various disturbances, and depend on whether the detection is performed when the data center has reached a relatively steady state. These drawbacks indicate that temperature lacks consistency as a measure for data center hot spot detection, and that a new and improved metric needs to be developed.

In order to address this problem, we define the model based Hot Spot Index (HSI) for data center hot spot detection. For a rack inlet temperature $T_i$ with reduced order model:

$$T_i(k+1) = T_i(k) + \left\{ \sum_{j=1}^{N_{CRAC}} g^r_{i,j} \, [SAT_j(k) - T_i(k)] \cdot VFD_j(k) \right\} + C_i,$$

HSI is defined as:

$$HSI \triangleq \frac{C_i}{\| g^r_i \|_1}, \tag{7}$$

in which vector $g^r_i = [g^r_{i,1} \; g^r_{i,2} \; \cdots \; g^r_{i,N_{CRAC}}]$, and $\| \cdot \|_1$ stands for the vector 1-norm. From a physical perspective, HSI measures the ratio between the effects of hot air recirculation and of CRAC unit tuning on a specific rack inlet temperature. A high HSI value means that hot air recirculation is severe while the cooling effects from all the CRAC units are weak at the location of interest, and hence indicates a potential hot spot.
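The HSI definition in Eqn. (7) lends itself to a short computational sketch: for each rack inlet, divide the identified recirculation term by the 1-norm of the corresponding row of the reduced gain matrix, then sort the inlets in descending order of HSI. The helper names `hot_spot_index` and `rank_hot_spots` and the numerical values below are assumptions for illustration only, and the sketch treats each recirculation term as a single identified constant per inlet.

```python
from typing import List, Tuple


def hot_spot_index(C_i: float, g_row: List[float]) -> float:
    """HSI of Eqn. (7): recirculation term C_i divided by the vector
    1-norm of the reduced gain row g_i^r."""
    norm_1 = sum(abs(g) for g in g_row)
    if norm_1 == 0.0:
        # No CRAC unit influences this inlet in the reduced model;
        # the ratio is undefined, so the caller must handle this case.
        raise ValueError("reduced gain row has zero 1-norm")
    return C_i / norm_1


def rank_hot_spots(C: List[float],
                   G_r: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (rack inlet index, HSI) pairs in descending order of HSI."""
    hsi = [(i, hot_spot_index(C[i], G_r[i])) for i in range(len(C))]
    return sorted(hsi, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative values only: 3 rack inlets, 2 CRAC units.
    C = [0.30, 0.10, 0.20]
    G_r = [[0.002, 0.0],
           [0.004, 0.001],
           [0.0,   0.005]]
    for index, value in rank_hot_spots(C, G_r):
        print(index, round(value, 2))
```

Because the ranking uses only identified model parameters and not the instantaneous temperatures, it is unaffected by the current SAT and blower speed settings of the CRAC units.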

Fig. 5. HSI Values for Rack Inlet Locations of a Research Data Center in Descending Order

Figure 5 shows the HSI values in descending order for the 192 rack inlet temperature locations in the research data center shown in Fig. 3. A closer look at the locations with the leading HSI values points to rack inlet temperatures at racks B6, D5, D6, and G9. Among these hot spots, racks B6, D5, and D6 are affected by severe reverse flow from the network switches mentioned earlier, while G9 is at the end of row G and is most affected by hot air escaping from the hot aisle. The locations of these hot spots agree with observations of the research data center. Furthermore, the severity of these hot spots is also indicated by their corresponding HSI values, which cannot be easily detected by simply observing the data center temperature distribution. Rack B6, for example, has higher HSI values than racks D5 and D6, which follow it in the HSI ranking, and its reverse flow is found to be much more severe than theirs.

Fig. 6. Rack Inlet Temperatures (in Descending Order of HSI): (a) Uniform CRAC Settings, (b) Nonuniform CRAC Settings

Compared with the HSI based method, relying solely on the data center temperature distribution for hot spot detection can lead to misleading results. Figure 6, for example, shows the temperature distribution of the aforementioned research data center in descending order of HSI. In Fig. 6(a), all 8 CRAC units are configured with the same SAT and blower speed. The trend of rack inlet temperatures roughly follows that of the HSI values shown in Fig. 5, meaning that in this particular data center cooling configuration the rack inlet temperatures can be used as a reference for hot spot detection. The rack inlet temperatures shown in Fig. 6(b), however, vary significantly in both amplitude and relative ranking under a nonuniform setting of the CRAC units. Although the five rack inlet locations with the highest HSI values still have the highest temperatures, other hot spots identified by the HSI based method, such as those in racks D5 and G9, now have relatively low temperatures among all the rack inlet locations and hence could be left out in hot spot detection.

V. CONCLUSIONS

In this paper, a data center management and analysis scheme based on a dynamic rack inlet temperature model is introduced. The model parameter identification and model reduction procedures are both discussed. An improved thermal zone mapping approach is introduced through model based rack inlet temperature grouping, and a load balancing mechanism is investigated through grouping of CRAC units. In order to detect hot spots within data centers, the Hot Spot Index (HSI) is defined using the model parameters and proves to be effective in data center hot spot detection.

REFERENCES

[1] Steve Greenberg, Evan Mills, Bill Tschudi, Peter Rumsey, and Bruce Myatt. Best practices for data centers: Results from benchmarking 22 data centers. In 2006 ACEEE Summer Study on Energy Efficiency in Buildings.
[2] Chandrakant D. Patel, Cullen E. Bash, Ratnesh K. Sharma, Monem H. Beitelmal, and Rich J. Friedrich. Smart cooling of data centers. In IPACK'03, The Pacific Rim/ASME International Electronic Packaging Technical Conference and Exhibitions.
[3] ASHRAE. Datacom equipment power trends and cooling applications. Atlanta, GA, 2005.
[4] Cullen E. Bash, Chandrakant D. Patel, and Ratnesh K. Sharma. Efficient thermal management of data centers - Immediate and long-term research needs. HVAC&R Research, volume 9, no. 2, Apr 2003.
[5] Monem H. Beitelmal and Chandrakant D. Patel. Thermo-fluids provisioning of a high performance high density data center. Distributed and Parallel Databases, 21(2):227-238, 2007.
[6] Roger Schmidt, Mike Ellsworth, Madhu Iyengar, and Gary New. IBM's power6 high performance water cooled cluster at NCAR: Infrastructure design. In ASME 2009 InterPACK Conference collocated with the ASME 2009 Summer Heat Transfer Conference and the ASME 2009 3rd International Conference on Energy Sustainability (InterPACK2009), San Francisco, California, USA, July 19-23, 2009.
[7] Jim VanGilder. Real-Time data center cooling analysis. Electronics Cooling, September 2011.
[8] Mark Seymour, Christopher Aldham, Matthew Warner, and Hassan Moezzi. The increasing challenge of data center design and management: Is CFD a must? Electronics Cooling, December 2011.
[9] Kishor Khankari. Thermal mass availability for cooling data centers during power shutdown. ASHRAE Transactions, 116(2):205-217, 2010.
[10] Michael Kummert, William Dempster, and Ken McLean. Thermal analysis of a data centre cooling system under fault conditions. In 11th International Building Performance Simulation Association Conference and Exhibition, Building Simulation 2009, Glasgow, Scotland, July 27-30, 2009.
[11] Rongliang Zhou, Zhikui Wang, Cullen E. Bash, Christopher Hoover, Rocky Shih, Alan McReynolds, Niru Kumari, and Ratnesh K. Sharma. A holistic and optimal approach for data center cooling management. In American Control Conference (ACC2011), pages 1346-1351. IEEE, 2011.
[12] Rongliang Zhou, Zhikui Wang, Cullen E. Bash, and Alan McReynolds. Modeling and control for cooling management of data centers with hot aisle containment. In ASME 2011 International Mechanical Engineering Congress & Exposition, Denver, USA, November 11-17, 2011.
[13] Cullen E. Bash, Chandrakant D. Patel, and Ratnesh K. Sharma. Dynamic thermal management of air cooled data centers. In Thermal and Thermomechanical Phenomena in Electronics Systems, 2006. ITHERM'06. The Tenth Intersociety Conference on, pages 445-452. IEEE, 2006.