Data center cooling strategies technology brief

Abstract  2
Introduction  2
Understanding the challenges of high-density data centers  2
  Rising computing demand  2
  Increasing power density  3
  Conventional thermal management techniques  4
  Optional cooling technologies  4
  Environmental impact  5
  Increasing energy and infrastructure costs  5
Characterizing cost, performance, and energy efficiency  6
  Total Cost of Ownership  7
  Coefficient of Performance of the ensemble  8
  Delivery factor and Coefficient of Energy Efficiency  9
Data center cooling strategies  10
  Adopt established best practices  10
  Use efficient components and systems to manage power  10
    Efficient components  11
    Efficient systems  11
  Use virtual machines to consolidate computing resources  12
  Optimize infrastructure efficiency  12
    Computational fluid dynamics  12
    Visualization tools  13
  HP Dynamic Smart Cooling  14
  HP Modular Cooling System  15
Summary  16
Call to action  16
For more information  17
To send comments  17

Abstract

The rise in demand for IT business support and the resulting increase in density of IT equipment are stretching data center power and cooling resources to the breaking point. Due to the complex interdependencies of data center resources, HP has taken a holistic view of heat removal, evaluating the energy flow of the entire information technology ensemble from the computer chip to the cooling tower. This paper describes strategies that combine technologies and services to help convert cooling resources from a fixed asset to a variable asset based on demand, distribute compute workloads to maximize cooling efficiency, and provide intelligent fault tolerance to eliminate unnecessary power and cooling infrastructure.

Introduction

Until about 2005, the major problem plaguing data centers (IT and facilities organizations) was having adequate floor space to accommodate rapid growth. With solutions for these concerns now available, the main problems have become energy costs and the inability of data center infrastructures to accommodate new high-density computing platforms. Higher rack densities have caused power and cooling costs to surpass the costs of the IT equipment and the facility space. 1 IT organizations tend to focus on equipment whereas facilities organizations focus on the infrastructure. Now, both must view the data center holistically and join together to focus on lowering the total cost of ownership (TCO).

To lower TCO, data centers must adopt a new paradigm that focuses on maximizing the energy efficiency of components, systems, and the infrastructure while controlling energy usage. For example, organizations need to use energy-efficient technologies for cooling newer, high-density server platforms. Another factor contributing to cooling inefficiency is the generation gap between prevalent operating practices and the recommended “best practices” for new computing platforms.

Because the demand for IT business support will continue to increase, an organization’s best strategy to manage growth is to improve power and cooling infrastructure efficiency, which leads to the “call to action” of this paper.

This paper calls for IT organizations to understand the challenges ahead; to abandon outdated technologies that do not focus on energy efficiency; to become familiar with new tools and technologies that improve performance and energy efficiency; and to adopt proven strategies and best practices for high-density IT equipment.

Understanding the challenges of high-density data centers

Rising computing demand, increasing power density, and increasing infrastructure and energy costs are major issues for data centers around the world. A key cause of rising infrastructure and energy costs is the inefficiency of conventional thermal management techniques in cooling high-density equipment. Some actions taken by organizations to manage high-density equipment can result in increased inefficiency and higher operating cost. Therefore, this section describes some common challenges that IT and facilities organizations must address to manage high-density data centers.

Rising computing demand

Data centers are vital to the world’s economy, providing mission-critical IT services—computing, networking, and data storage—for all major organizations. The demand for IT business support continues to grow with the need for increased connectivity and on-demand access to digital information and media.

1 Belady, C., “In the data center, power and cooling costs more than the IT equipment it supports,” Electronics Cooling, volume 13, no. 1, February 2007.

The good news is that server performance continues to grow in accordance with Moore’s Law and server performance per watt (benchmark performance divided by average power usage) has doubled every two years since 1999, as shown in Figure 1. However, business application demand is increasing faster than server performance, resulting in a continuous demand for more servers. This means that data centers must continue to scale their infrastructure and new data centers will be brought online to meet the computing demand.
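As a rough illustration of that trend, the compounding works out as follows (the 1999 baseline of 1.0 is an arbitrary normalization, not a figure from the brief):

```python
# Illustrative only: projects the "performance per watt doubles every
# two years" trend shown in Figure 1. The 1999 baseline of 1.0 is an
# arbitrary normalization, not a measured value.

def perf_per_watt_factor(year, base_year=1999, doubling_period=2.0):
    """Relative performance per watt versus the base year."""
    return 2 ** ((year - base_year) / doubling_period)

if __name__ == "__main__":
    for year in (1999, 2003, 2007):
        print(year, round(perf_per_watt_factor(year), 1))
```

By this trend, a 2007 server delivers roughly 16 times the performance per watt of a 1999 server, yet total demand still outpaces it.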


Figure 1. Server performance is increasing faster than predicted by Moore’s Law, and performance per watt is doubling every two years. 2


Increasing power density

The increase in server density is being driven by the need to maximize the use of data center floor space and to extend the life of data centers. As a result, rack power density (kW/rack) is up more than 5X in the past 10 years. 3 The growth in rack power density and the associated heat load are outpacing conventional thermal management techniques, which are typically designed for previous generations of IT equipment.

To achieve thermal management targets, some organizations partially populate the racks up to a maximum heat load of 5 - 6 kW, resulting in underutilization. Therefore, in order to attain high power density, new techniques are needed. For instance, rather than limit rack utilization, it can be more cost-effective to isolate high-density equipment and improve the efficiency of nearby cooling resources.

Equipment power density predictions, such as those published by ASHRAE, 4 have been helpful for understanding rack power requirements for high-density deployments in new and existing data centers. These power density projections are germane to facilities designed for Internet and financial services, and to high performance computing applications, such as graphics rendering farms, grid computing, transaction processing, etc.

2 Belady, C., “In the data center, power and cooling costs more than the IT equipment it supports,” Electronics Cooling, volume 13, no. 1, February 2007.

3 “Datacom Equipment Power Trends and Cooling Applications," Association of Heating, Refrigeration and Air- Conditioning Engineers, Atlanta, GA, 2005.

4 Ibid.

Conventional thermal management techniques

Conventional thermal management techniques are less efficient for cooling high-density heat loads in mixed-density environments. One reason is that traditional computer room air handlers/air conditioners (CRAH/CRACs) using a raised floor plenum are typically controlled by a return air inlet temperature sensor that provides an overall indication of the heat dissipated in the room (Figure 2). In reality, the air flow patterns in many data centers promote mixing of warm return air with cool supply air, which can cause cooler air to be circulated back to the CRAH/CRACs, lowering the sensible cooling capacity and overall system efficiency. Additionally, this leads to warmer air being recirculated through computer equipment, creating hot spots. These hot spots are not sensed by the CRAH/CRAC return air sensor, which is typically set at 68°F to 72°F (20°C to 22°C). Traditional computer room airflow management techniques are effective for power densities less than 5 kW/rack, while high-density equipment can push power density well above 15 kW/rack.


Figure 2. Conventional cooling solutions do not sense local hot spots caused by mixing of return and supply air.


Optional cooling technologies

There are several optional data center cooling technologies on the market that place additional burdens, such as increased power usage and inflexibility, on the infrastructure. These technologies include passive cooling, active cooling, and vertical cooling.

Passive cooling technologies include air flow containment devices that create a physical barrier between cold air supply and warm air return streams. Although these devices can be effective to some extent, they only treat half of the problem and can cause additional issues related to turbulence and unbalanced air flow. Some containment devices, such as fully enclosed rack systems, place operational limits on the servers inside them, and on the facilities, to stay within their designed capacity limits. In instances where air is ducted from the bottom to top of the rack, very high fan power may be required to move enough air through the narrow cross-section and long length of the rack. If these types of fully enclosed rack systems fail, the enclosed devices will not have adequate airflow to cool themselves.

Active cooling technologies include fan-powered devices that are designed to pull additional cool supply air into high-density areas or push warm return air to the air handlers. These devices also treat only half of the problem and should be used carefully as they can cause additional issues related to turbulence and unbalanced air flow. Similarly, if an active cooling device fails, the operation of associated IT equipment is put at risk. Another serious drawback is that active cooling devices significantly increase energy use and add parasitic heat to the overall system.

Other inflexible cooling systems are those that dramatically change airflow patterns, thus requiring the creation of special areas in the data center. Some of these systems create vertical airflow that forces air through an upright stack of heat-producing devices. These vertical cooling systems require careful planning to ensure that they do not disrupt the rest of the environment. For example, vertical cooling systems move extremely large volumes of air at very high velocities. Such high-velocity airflow can cause problems by robbing nearby systems of needed supply air. On the exhaust air side, high- velocity airflow can contribute to turbulence and recirculation in the surrounding area. In addition, the high-velocity fans generate additional heat that must be removed, thus increasing power use.

In contrast to these inflexible solutions, HP has adopted a data center cooling philosophy that “simplicity drives efficiency.” Server manufacturers invest heavily in research to ensure that their servers work properly when adequate inlet temperature is maintained. By creating a balanced data center environment that allows the IT equipment to operate as designed, organizations can increase efficiency without the addition of inflexible cooling technologies. HP promotes Dynamic Smart Cooling (DSC), a data center environment that dynamically provides sensing, communications, and flexibility. With DSC, critical enterprise hardware can operate in a flexible environment that adapts as IT requirements change. Through a pervasive sensing network and communications with the cooling infrastructure, the cooling system can operate in a balanced, non-turbulent manner that prevents hot spots and significantly reduces the energy required to operate the cooling system.

Environmental impact

Over 70 percent of the electricity consumed in the U.S. is generated by power plants that burn fossil fuels. Coal, the least environmentally friendly fossil fuel, is used to generate 50 percent of the electricity consumed in the U.S. This proportion is expected to increase to 57 percent by 2030. 5

Fossil fuel power plants account for almost 40 percent of the total CO2 produced in the U.S., and coal-fired power plants are responsible for nearly 80 percent of that amount. 6 The increased concentration of CO2 in the atmosphere enhances the Earth’s greenhouse effect, which accelerates global climate change.

HP is aware of the environmental impact of data center power consumption. Therefore, to cool future generations of IT equipment, HP’s focus is on solutions such as HP Dynamic Smart Cooling that improve overall system efficiency and reduce overall power consumption, rather than on cooling techniques that increase power use. HP is also a co-sponsor of The Green Grid initiative, which seeks to lower overall power consumption in data centers around the world. 7 The organization’s charter is to develop meaningful standards, measurement methods, processes, and new technologies to increase the energy efficiency of data centers.

Increasing energy and infrastructure costs

Due to the increase in equipment density and rising energy prices, energy cost has become a significant portion—as much as 40 percent—of the total data center operating costs. Worldwide, electricity used by servers doubled between 2000 and 2005. 8 Almost all of this increase was due to growth in the number of servers; a small percentage was due to increased power use per server.

5 For more information, see “Electricity Demand and Supply” from the U.S. Department of Energy at

6 For more information, see “Carbon Dioxide Emissions from the Generation of Electric Power in the United States” from the U.S. Department of Energy at

7 For more information, see http://www.thegreengrid.org

8 Koomey, J. “Estimating Total Power Consumption by Servers in the U.S. and the World,” Stanford University, February 2007.

The annual energy costs for a server can be estimated based on its rated power and the price of electricity ($0.11 per kW-hr in some parts of the U.S., for example) as follows:

Annual power cost = 8760 hrs/yr x $0.11/kW-hr x (Server power in kW)
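That estimate is easy to script. A minimal sketch, using the $0.11/kW-hr example rate from the text:

```python
# Sketch of the brief's annual energy cost estimate:
#   Annual power cost = 8760 hrs/yr x $/kW-hr x server power (kW)
# The $0.11/kW-hr rate is the example price quoted in the text.

HOURS_PER_YEAR = 8760

def annual_power_cost(server_power_kw, rate_per_kwh=0.11):
    """Annual electricity cost in dollars for a server at its rated power."""
    return HOURS_PER_YEAR * rate_per_kwh * server_power_kw

if __name__ == "__main__":
    # A fully configured 500 W (0.5 kW) 1U server, as in Figure 3:
    print(f"${annual_power_cost(0.5):,.2f}")  # $481.80 per year
```

Note this uses rated power; a server's measured average draw is usually lower, so the figure is a conservative upper bound.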

In addition to rising energy costs, infrastructure costs are growing significantly because data centers are becoming more mission-critical, requiring year-round monitoring and maintenance of redundant power and cooling equipment. To quantify this cost, the Uptime Institute introduced a simplified data center Infrastructure Cost (IC) equation that sums the cost of raw computer room space ($/ft2) and the cost of the power and cooling resources ($/kW). The $/kW component in the equation is obtained from one of four functionality ratings, Tier I to Tier IV. The highest rating, Tier IV, represents a fault-tolerant (mission-critical) data center. 9 At Tier IV, the IC equation is as follows:

IC = (Total Power × $22,000/kW) + (Area × $220/ft2, or $2,400/m2)

The result derived from this equation can be amortized by dividing it by the life span of the data center (typically 10 to 15 years) to estimate an annual infrastructure cost. The IC equation is useful, but it only provides a rough estimate of infrastructure costs.
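As a sketch, the Tier IV IC equation and the amortization step can be expressed as follows (the 12-year life span below is an assumed midpoint of the 10 to 15 year range, and the example facility size is made up):

```python
# Sketch of the Uptime Institute's Tier IV Infrastructure Cost (IC)
# equation as quoted in the text, plus the amortization step. The
# 12-year life span is an assumed midpoint of the 10-15 year range;
# the example facility figures are illustrative.

def infrastructure_cost(total_power_kw, area_sq_ft):
    """Tier IV IC = (Total Power x $22,000/kW) + (Area x $220/sq ft)."""
    return total_power_kw * 22_000 + area_sq_ft * 220

def annual_infrastructure_cost(total_power_kw, area_sq_ft, life_years=12):
    """Amortize IC over the data center's life span."""
    return infrastructure_cost(total_power_kw, area_sq_ft) / life_years

if __name__ == "__main__":
    # A hypothetical 1 MW, 10,000 sq ft Tier IV room:
    print(infrastructure_cost(1_000, 10_000))               # 24,200,000
    print(round(annual_infrastructure_cost(1_000, 10_000))) # ~2 million/yr
```

Even at this rough level, the power term dominates the space term, which is the Uptime Institute's core argument for a $/kW cost model.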

By using the two equations above, it can be shown that the combined infrastructure and energy costs can exceed the cost of the server itself. 10 For example, Figure 3 compares the cost of a fully-configured, 500 W, 1U server to its annualized energy costs and associated infrastructure costs. The chart reflects that the price for a 1U server has remained relatively stable. However, the server cost was exceeded by the combined infrastructure and energy costs in 2001 and by the infrastructure cost alone in 2004. Findings like this have caused a paradigm shift away from strategies that focus on driving down the cost of IT equipment as a primary means to control data center costs. Instead, energy and infrastructure costs have become the main concern.


Figure 3. Annual amortized cost of a fully-configured 1U server in a mission-critical (Tier IV) data center. The combined cost of energy and infrastructure surpassed the server cost in 2001.


Characterizing cost, performance, and energy efficiency

A common goal for all data centers is to become more energy efficient. The challenge for the industry is determining the best indicators, or metrics, to measure efficiency. Some metrics measure the overall efficiency of the data center infrastructure, while other metrics measure the efficiency of specific

9 Turner, P., Seader, J., “Dollars per kW plus Dollars per Square Foot Is a Better Data Center Cost Model than Dollars per Square Foot Alone,” The Uptime Institute, 2006,

10 Belady, C., “In the data center, power and cooling costs more than the IT equipment it supports,” Electronics Cooling, volume 13, no. 1, February 2007.

components or subsystems. IT organizations need a common set of benchmark metrics that allow them to compare cost, performance, and energy efficiency. This paper identifies four such metrics: Total Cost of Ownership (TCO), Coefficient of Performance of the ensemble, delivery factor, and Coefficient of Energy Efficiency.

Note

A frequently used term in the following discussion is CRAH/CRAC unit “provisioning.” Provisioning is a measure of the heat extracted by a CRAH/CRAC unit compared to its rated cooling capacity. The term “under-provisioned” refers to a CRAH/CRAC unit with a cooling load higher than the capacity of the unit. The term “over-provisioned” refers to a CRAH/CRAC unit that operates with a cooling load significantly below the capacity of the unit. An over-provisioned CRAH/CRAC unit wastes energy if operation of the unit cannot be adjusted to match the lower cooling load.

Total Cost of Ownership

Data center organizations are trying to lower TCO through consolidation and better utilization of resources. TCO includes

- the cost of the facility space
- the capital and maintenance costs for power and cooling resources
- the cost of all IT and non-IT equipment
- personnel costs
- software costs

The interdependence of these variables requires a holistic (end-to-end) cost model that follows the flow of energy from heat generation at the chip core to heat dissipation at the cooling tower (see Figure 4). HP Laboratories has developed a chip core-to-cooling tower TCO model 11 that captures all the costs listed above plus the power consumption and thermodynamic behaviors of all major components and systems. 12 The TCO model also goes a step further by factoring the amortization and preventative maintenance costs of power and cooling resources into the recurring energy expense, a technique referred to as “burdening.”
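A much-simplified sketch of that structure is below; the cost categories mirror the list above, but the field names, figures, and burdening arithmetic are illustrative assumptions, not HP Labs' actual model:

```python
# Simplified sketch of the holistic TCO structure described in the text.
# The cost categories mirror the bullet list above; the "burdening" step
# folds amortized capital and maintenance for power/cooling into the
# recurring energy expense. All field names and figures are illustrative
# assumptions, not HP Labs' published model.

from dataclasses import dataclass

@dataclass
class DataCenterCosts:
    space: float                  # facility space, $/yr
    power_cooling_capital: float  # capital cost of power/cooling resources, $
    power_cooling_maint: float    # preventative maintenance, $/yr
    it_equipment: float           # IT and non-IT equipment, $/yr
    personnel: float              # $/yr
    software: float               # $/yr
    energy: float                 # raw electricity, $/yr
    life_years: int = 12          # assumed amortization period

    def burdened_energy(self):
        """Energy expense with amortized capital and maintenance folded in."""
        amortized = self.power_cooling_capital / self.life_years
        return self.energy + amortized + self.power_cooling_maint

    def annual_tco(self):
        return (self.space + self.it_equipment + self.personnel
                + self.software + self.burdened_energy())
```

The burdening step is the interesting design choice: by charging infrastructure capital against the energy line, savings from a more efficient cooling strategy show up directly as a lower recurring expense.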

The comprehensive TCO model can be used in the design and optimization of data centers. After a data center is in operation, the TCO model can be used to create “set points” for a thermal management control system and to calculate real energy savings for various cooling technologies and strategies.

11 Patel, C., Shah, A., “Cost Model for Planning, Development and Operation of a Data Center,” HP Laboratories Technical Report, HPL-2005-107(R.1), June 9, 2005.

12 Patel, C., Sharma, R., Bash, C., Beitelmal, M., “Energy Flow in the Information Technology Stack: Coefficient of Performance of the Ensemble and its Impact on the Total Cost of Ownership,” HP Laboratories Technical Report, HPL-2006-55, March 21, 2006.


Figure 4. Holistic view of energy transfer in a typical air-cooled data center from chip to cooling tower


Coefficient of Performance of the ensemble

As exemplified by the chilled water cooling system in Figure 4, the chip-to-ambient model traces the energy flow path from chips, systems (servers and enclosures), racks, air distribution and chilled water systems, to the cooling tower. Work (energy) is introduced at each stage to transfer heat to a fluid stream (air or liquid). For example, at the chip level, a server’s fan blows air across the heat sink, and the warmer air is exhausted in the back of the rack.

Inefficiencies are introduced at each stage of the energy path due to flow (air mixing) and thermodynamic irreversibilities. For example, the exhaust airstreams from different servers undergo mixing and other thermodynamic processes in the rack before being ejected into the aisle where further mixing occurs. These airstreams (or a portion of them) make it back to the CRAH/CRAC units to transfer heat to the chilled water or refrigerant. Irreversibilities also arise due to the mechanical efficiency of air handling devices at each stage. The mechanical efficiency of air handling devices, which include the fans in the servers, the CRAH/CRAC unit blowers, and the cooling tower blowers, can be as low as 65 percent.

A dimensionless thermodynamic metric, known as the Coefficient of Performance (COP), can be applied at each stage of the energy flow path to track performance. COP is simply the heat extracted, or dissipated, divided by the work supplied to the device. Then, the COP of each device can be combined into an aggregate COP of the ensemble (COPG). 13 Data center COP is defined as the ratio of the total heat load dissipated to the power consumed by the cooling system:

COPG = Total Heat Dissipation / (Flow Work + Thermodynamic Work of Cooling System)
     = Heat Extracted by Air Conditioners / Net Work Input
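The bookkeeping behind the ensemble COP can be sketched as follows; the stage names and figures are illustrative placeholders, not measured data:

```python
# Sketch of the COPG bookkeeping: each stage along the energy flow path
# (server fans, CRAH/CRAC blowers, chiller, cooling tower) reports the
# flow and thermodynamic work it consumes, and COPG is the total heat
# load dissipated divided by the total cooling work. Stage names and
# numbers are illustrative placeholders, not measured data.

def cop_ensemble(heat_load_kw, stage_work_kw):
    """COPG = total heat dissipated / sum of per-stage cooling work."""
    return heat_load_kw / sum(stage_work_kw.values())

if __name__ == "__main__":
    work = {
        "server fans":          25,   # kW
        "CRAC blowers":         75,
        "chiller compressors": 150,
        "cooling tower":        50,
    }
    print(round(cop_ensemble(1_000, work), 2))  # 1000 kW / 300 kW
```

Instrumenting each stage this way is what lets the metric localize inefficiency: a falling COPG points at whichever stage's work term grew.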

The COPG formulation allows performance to be tracked using instrumentation for each component along the energy flow path. Such a monitoring network would allow the heat load and work done by any component in the cooling system to be monitored in real time. In this way, COPG can be used as a basis to operate the data center in a variety of modes, and it can even be used to compare data centers. The COPG metric can also be used in the TCO model described earlier to capture the recurring costs related to data center cooling.

13 Patel, C., Sharma, R., Bash, C., Beitelmal, M., “Energy Flow in the Information Technology Stack: Coefficient of Performance of the Ensemble and its Impact on the Total Cost of Ownership,” HP Laboratories Technical Report, HPL-2006-55, March 21, 2006.

Delivery factor and Coefficient of Energy Efficiency

While TCO and COPG are useful analytical metrics that provide a thorough understanding of the data center infrastructure, the “field” metrics introduced in this section are useful in understanding overall data center efficiency.

These field metrics, developed in 2003 by several experts attending a Rocky Mountain Institute (RMI) design workshop, are used to evaluate the energy efficiency of data centers. One recommended metric is commonly known as the delivery factor—the total power delivered to the facility divided by the net power that goes directly to the IT equipment. 14 Delivery factor can be used by an organization to benchmark its data center operations against the industry. It also offers a simple metric for determining compliance with governmental energy policies.

An alternative metric is the Coefficient of Energy Efficiency (CEE)—the inverse of delivery factor— which is represented as a percentage. For example, the CEE of a non-optimized data center may be 60 percent, while an optimized data center could have a CEE of 75 percent. 15 Figure 5 illustrates the concepts of delivery factor and CEE.


Figure 5. Data center delivery factor and Coefficient of Energy Efficiency


HP and other industry leaders are using two metrics called Power Use Efficiency (PUE) and Data Center Efficiency (DCE) that are equivalent to delivery factor and CEE, respectively. Research by the Uptime Institute 16 indicates that 85 percent of data centers consume 2 kW for each kW consumed by IT equipment (Figure 6), which results in a PUE of 3 or more. The figure shows that most of the non-IT power is consumed by the cooling resources, and only a small portion goes to power conditioning/conversion. Data centers with a PUE of 3 or greater typically have a grossly over-provisioned cooling system. As described previously, over-provisioning increases capital and recurring expenses and decreases utilization and efficiency, resulting in a higher TCO. A PUE of 3.0 yields a DCE of 33 percent, meaning that only one-third of facility power is consumed by IT equipment.
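These relationships are simple ratios. A minimal sketch using the Uptime Institute case above:

```python
# Sketch of the field metrics described in the text: delivery factor
# (equivalently PUE) is total facility power over IT power; CEE
# (equivalently DCE) is its inverse, expressed as a percentage.

def delivery_factor(total_facility_kw, it_kw):
    """PUE / delivery factor: total facility power per unit of IT power."""
    return total_facility_kw / it_kw

def cee_percent(total_facility_kw, it_kw):
    """CEE / DCE: share of facility power that reaches IT equipment."""
    return 100.0 * it_kw / total_facility_kw

if __name__ == "__main__":
    # The Uptime Institute case: 2 kW of overhead per 1 kW of IT load.
    total_kw, it_kw = 3.0, 1.0
    print(delivery_factor(total_kw, it_kw))        # PUE = 3.0
    print(round(cee_percent(total_kw, it_kw), 1))  # DCE = 33.3 percent
```

Both numbers describe the same measurement; the choice between them is only a matter of which direction of improvement reads more intuitively.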

14 RMI, 2003: Design Recommendations for High-Performance Data Centers. Report of the Integrated Design Charrette, 2-5 February 2003. Rocky Mountain Institute, Snowmass, CO. www.rmi.org/

15 Aebischer, B., Eubank, H., Tschudi, W., “Energy Efficiency Indicators for Data Centers,” International Conference on Improving Energy Efficiency in Commercial Buildings (IEECB’04), 21-22 April 2004, Frankfurt, Germany.

16 Belady, C., Malone, C., "Metrics to Characterize Data Center & IT Equipment Energy Use," Digital Power Forum, Richardson, TX (September, 2006).

Neither PUE nor DCE can identify the core inefficiencies or analyze their causes at any point in time or even over the lifetime of the datacenter. Therefore, alternate approaches and strategies (described later) are required to isolate root causes and provide remedial solutions.


Figure 6. In 85 percent of data centers, most of the non-IT power is used by the cooling resources.


Data center cooling strategies

In an environment of rising computing demand and unpredictable energy prices, HP promotes cooling strategies that improve efficiency without increasing overall power usage. These strategies range from adopting the latest best practices for high-density computing platforms to creating a management layer that enables consolidation and an infrastructure that can dynamically provision power and cooling resources as needed.

Adopt established best practices

The primary strategy for all IT organizations should be to optimize the data center infrastructure using best practices adapted to the requirements of new computing platforms. Industry user groups and analysts have found that implementing industry-proven best practices, such as those detailed in ASHRAE’s “Thermal Guidelines for Data Processing Environments,” 17 can be the single most effective and least costly improvement that data centers can make to increase efficiency. However, as of this publication date, less than 50 percent of data centers have implemented these practices.

For example, old practices, such as spreading heat loads throughout the room, may not apply to high- density IT equipment. It may be more cost-effective to isolate and cool high-density equipment rather than increase the burden on the entire cooling infrastructure. Read the technology brief “Optimizing facility operation in high density data center environments” at www.hp.com/servers/technology.

In addition, organizations can implement the following strategies to increase energy efficiency and improve data center TCO.

Use efficient components and systems to manage power

HP takes a holistic approach to improving data center energy efficiency starting at the component level, including the latest power saving processors from Intel and AMD and high-efficiency power supplies. At the system level, power management tools, such as HP Insight Power Manager and HP Power Regulator, help to accurately monitor server power use, improve server power efficiency, and provision power use for one or more ProLiant servers.

17 For information, visit www.ashrae.org.

Efficient components

Multi-core processors

The latest server processors from Intel and AMD have power state hardware registers that are available (exposed) to allow IT organizations to control the performance and power consumption of the processor. These capabilities are implemented through Intel’s Enhanced Intel SpeedStep® Technology and demand-based switching, and through AMD’s PowerNow with Optimized Power Management. With the appropriate ROM firmware or operating system interface, programmers can use the exposed hardware registers to switch a processor between different performance states, also called P-states, which have different power consumption levels. For more information, see “HP Power Regulator” in the following section.

High-efficiency power supplies

All ProLiant servers are equipped with high-efficiency switch-mode power supplies. When connected to a high-line voltage source, these power supplies enable ProLiant servers to operate with efficiencies of 85 percent or greater compared to white box servers that operate at efficiencies between 65 and 70 percent.

ProLiant-server power supplies operate at maximum efficiency when connected to high-line input power (200 to 240 VAC). As with typical power supplies in the industry, operating at low line power (100 to 120 VAC) causes the power supply to operate at a lower efficiency and to draw more current for the same power output. HP continues to develop new power supply solutions that deliver higher efficiency.
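The practical impact of supply efficiency is easy to quantify. A sketch comparing an 85 percent efficient supply with a 67 percent one (67 percent is assumed here as a point within the 65 to 70 percent white-box range stated above):

```python
# Sketch of why supply efficiency matters: for the same DC output, a
# less efficient supply draws more input power and dumps more heat into
# the data center. 85% matches the ProLiant figure in the text; 67% is
# an assumed point within the stated 65-70% white-box range.

def input_power_w(output_w, efficiency):
    """AC input power required for a given DC output."""
    return output_w / efficiency

def waste_heat_w(output_w, efficiency):
    """Conversion loss, which the cooling system must then remove."""
    return input_power_w(output_w, efficiency) - output_w

if __name__ == "__main__":
    for eff in (0.85, 0.67):
        print(f"{eff:.0%}: draws {input_power_w(500, eff):.0f} W, "
              f"wastes {waste_heat_w(500, eff):.0f} W as heat")
```

For a 500 W load, the lower-efficiency supply draws roughly 160 W more from the wall, and all of that difference becomes additional heat load on the cooling system.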

Efficient systems

HP Dynamic Power Saver

HP Dynamic Power Saver is a feature that improves power usage by using only the power supplies needed to match the requirements of the application load. HP Dynamic Power Saver runs continuously in the background, pooling power distribution to maintain system performance at higher application loads and providing power savings at lower application loads.

HP Power Regulator

HP developed a power management feature called HP Power Regulator that utilizes multi-core processor P-state registers to control processor power usage and performance. Using the P-states implemented in the CPUs, Power Regulator allows IT organizations to minimize power consumption, maintain desired performance levels, and maximize facility resources. 18 These capabilities have become increasingly important for power and heat management in high-density data centers. When combined with data-center management tools like HP Insight Power Manager (IPM), IT organizations have more control over the power consumption of all the servers in the data center. HP Power Regulator is included as a standard feature on HP ProLiant servers.

Power Cap

Using updated HP Integrated Lights-Out (iLO) 2 firmware (version 1.30) and an updated System ROM/BIOS (dated 5/1/2007), selected HP ProLiant servers now have the ability to limit the amount of power consumed. Customers may set a limit in watts or Btu/hr. The purpose of this limit is to constrain the amount of power consumed, which reduces the heat output into the data center. The iLO 2 firmware monitors the power consumption of the server, checks it against the power cap goal, and, if necessary, adjusts the server’s performance to maintain an average power consumption that is less than or equal to the power cap goal.
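In spirit, this capping behavior is a feedback loop. A toy sketch follows; the real iLO 2 control algorithm is firmware-internal and not described here, so the P-state step logic below is an assumption for illustration:

```python
# Toy sketch of power-cap behavior: monitor average power, compare it to
# the cap, and step performance down (or back up) to hold consumption at
# or below the cap. This mimics the described iLO 2 logic in spirit only;
# the actual firmware algorithm is not public. Higher P-state numbers
# mean lower power and performance (P0 is full speed).

def adjust_p_state(avg_power_w, cap_w, p_state, max_p_state=4):
    """Return the next P-state given the latest average power reading."""
    if avg_power_w > cap_w and p_state < max_p_state:
        return p_state + 1        # over cap: throttle down one step
    if avg_power_w < 0.9 * cap_w and p_state > 0:
        return p_state - 1        # comfortably under cap: restore speed
    return p_state                # near the cap: hold steady

if __name__ == "__main__":
    p = 0
    for reading_w in (520, 510, 470, 400, 400):   # cap is 450 W
        p = adjust_p_state(reading_w, 450, p)
        print(f"{reading_w} W -> P{p}")
```

The 10 percent hysteresis band is a deliberate (assumed) detail: without it, a load hovering near the cap would make the controller oscillate between P-states.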

18 For more information about HP Power Regulator, read the technology brief “HP Power Regulator for ProLiant servers” at http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00593374/c00593374.pdf.

Using the IPM v1.10 plug-in to Systems Insight Manager v5.1, customers may set power caps on groups of supported servers. The IPM software statically allocates the group power cap among the servers in the group, dividing the cap equitably based on a calculation using each server’s idle and maximum measured power consumption.
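The brief does not publish IPM's exact allocation formula. One plausible reading of "equitably ... based on each server's idle and maximum measured power" is to cover each server's idle power first and then split the remaining budget in proportion to each server's dynamic range. The sketch below implements that assumption; it is not the shipped algorithm.

```python
def allocate_group_cap(group_cap, servers):
    """Split group_cap (watts) among servers, where each server is a
    dict with measured 'idle' and 'max' power in watts. Assumed scheme:
    cover idle power, then share the remaining budget in proportion to
    each server's dynamic range (max - idle)."""
    total_idle = sum(s['idle'] for s in servers)
    total_range = sum(s['max'] - s['idle'] for s in servers)
    budget = group_cap - total_idle
    assert budget >= 0, "group cap must at least cover total idle power"
    return [s['idle'] + budget * (s['max'] - s['idle']) / total_range
            for s in servers]
```

By construction the per-server caps sum exactly to the group cap, and no server is capped below its idle draw.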

Use virtual machines to consolidate computing resources

Virtual machine technology allows computing resources to be pooled and shared, rather than dedicated to a particular user or application, so that workloads can be allocated dynamically. An IDC study of the data center environment has shown that typical x86 processor utilization rates (without virtualization) range from 10 to 20 percent. 19 Using virtual machine technology to consolidate computing increases processor utilization rates and reduces both capital equipment expenses and operating expenses (physical space requirements as well as power and cooling costs). These benefits are achieved through an automated control system that schedules compute workloads across racks of servers in a way that minimizes energy consumption and maximizes cooling efficiency. The computing resources not in use are put on standby to further improve operating efficiency. As application demand grows, the number of virtual machines can be expanded in a disciplined way that maximizes these benefits.
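The consolidation ratio follows directly from those utilization figures. As a back-of-envelope illustration (the numbers in the example are hypothetical, not taken from the IDC study):

```python
import math

def hosts_needed(n_servers, avg_util, target_util):
    """Estimate physical hosts required to consolidate n_servers
    running at avg_util (fraction of capacity) onto hosts driven to
    target_util, assuming comparable per-server capacity. This is an
    illustrative back-of-envelope estimate, not a sizing tool."""
    total_work = n_servers * avg_util        # aggregate demand
    return math.ceil(total_work / target_util)

# e.g. 100 servers at 15% utilization, consolidated onto hosts
# run at a 60% target, would need about 25 physical hosts.
```

Real capacity planning must also account for peak (not just average) demand and for the infrastructure limits noted below.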

The benefits of virtual machine technology are significant; however, virtualization is not a standalone solution. It requires a data center infrastructure that can quickly provision power and cooling resources as workloads are moved around the data center. Also, as application demand grows, the expanding pools of virtual machines will eventually approach the data center infrastructure limits.

For information about infrastructure considerations, HP management software, and virtual machine technology, read the “Server virtualization technologies for x86-based HP BladeSystem and HP ProLiant servers” technology brief at www.hp.com/servers/technology.

Optimize infrastructure efficiency

Computational fluid dynamics

Significant energy savings can be realized through the use of computational fluid dynamics (CFD) software. CFD is a branch of fluid mechanics in which numerical methods and algorithms are used to solve and analyze problems that involve fluid flows. Planners and designers can use specialized CFD software to build a computer model of a data center and to determine the impact on cooling resources when changes are made to equipment layout, CRAH/CRAC flow rate, floor tiles, heat load distribution, and other parameters.
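Commercial data-center CFD packages model three-dimensional airflow, turbulence, and heat sources. Purely to illustrate the style of numerical computation involved, the toy sketch below solves a steady-state temperature field on a 2D grid (Laplace's equation with fixed boundary temperatures) by Jacobi iteration; it is far simpler than any real CFD model.

```python
def jacobi_temperature(width, height, left, right, top, bottom, iters=500):
    """Toy steady-state temperature field on a width x height grid with
    fixed boundary temperatures, relaxed by Jacobi iteration: each
    interior cell is repeatedly replaced by the average of its four
    neighbors until the field stops changing."""
    t = [[0.0] * width for _ in range(height)]
    for x in range(width):                    # fix top/bottom boundaries
        t[0][x], t[-1][x] = top, bottom
    for y in range(height):                   # fix left/right boundaries
        t[y][0], t[y][-1] = left, right
    for _ in range(iters):
        new = [row[:] for row in t]
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                new[y][x] = 0.25 * (t[y-1][x] + t[y+1][x] +
                                    t[y][x-1] + t[y][x+1])
        t = new
    return t
```

By the maximum principle, every interior temperature converges to a value strictly between the coldest and hottest boundary, which is a useful sanity check on the solver.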

19 The Worldwide Server Power and Cooling Expense 2006-2010 Forecast, by Jed Scaramella, IDC Market Analysis, September 2006, IDC #203598, uses an estimate of 10 to 20 percent server utilization.

Visualization tools

Planners and designers can graphically view the CFD data by importing it into visualization software tools. HP has developed visualization tools to help customers evaluate the thermal performance of data centers as part of the HP Data Center Thermal Assessment service. These tools allow the visualization of parameters like temperature, air velocity, and pressure over a top view image of the room layout. For example, the visualization tool can generate images of each CRAH/CRAC unit’s “region of influence” over equipment in the room (Figure 7). The tool can combine the regions of influence for different CRAH/CRAC units in the same plot, allowing identification of areas where multiple regions overlap. These redundant regions can be ideal locations for mission-critical and high-density equipment, which typically require uninterrupted cooling. Visualization tools also allow designers to simulate how regions of influence are redistributed when one CRAH/CRAC unit fails.

In addition, customers can use CFD and visualization tools to:

• Verify cooling capacity before construction

• Statically optimize the rack layout and airflow distribution

• Manage varying power loads and equipment arrangements


Figure 7. The visualization tool can generate images of regions of influence for individual CRAH/CRAC units (left), as well as overlay images of multiple regions in the same plot (right), identifying areas where the regions overlap.

(In Figure 7, green areas show the overlapping of the CRAC 1 and CRAC 4 regions.)

HP Dynamic Smart Cooling

HP Dynamic Smart Cooling (DSC) offers a much higher level of automated facility management for air-cooled environments than traditional thermal management systems. 20 DSC functions as a management layer, residing above the existing infrastructure to provide real-time control settings that optimize efficiency. The DSC environmental control system uses a distributed network of sensors that are attached to racks to measure environmental conditions close to computing resources. The control system constantly analyzes data from the sensor network and manipulates supply air temperature and flow rate from individual CRAH/CRAC units by adjusting the amount of chilled water entering the CRAH/CRAC units and by changing the blower speed. DSC provides real-time CRAH/CRAC response to rack inlet temperature variations, allowing higher inlet temperatures with fewer CRAH/CRAC units for the same level of redundancy.

Figure 8 shows some of the basic features of DSC:

• A pervasive sensor network down to the rack level

- temperature sensors at the CRAH/CRAC supply

- thermal sensors on the racks measuring supply and exhaust air temperature of the systems

• Adaptive control of variable frequency drives (VFDs) in the CRAH/CRAC units to modulate flow work, along with variable cooling coil temperature to modulate thermodynamic work

• A data aggregation system that collects sensed data from all locations

• A system management controller that modulates the variable air conditioning resources through a control algorithm for a given distribution of workloads (heat loads)
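HP's DSC control algorithm is proprietary. Purely to illustrate the control idea described above, the sketch below applies one proportional-control step per CRAC unit, nudging each unit's supply setpoint so that the hottest rack inlet in its region of influence moves toward a target temperature. All names, gains, and limits here are hypothetical.

```python
def adjust_crac_setpoints(setpoints, inlet_temps, influence,
                          target=25.0, gain=0.5, lo=12.0, hi=22.0):
    """One step of a simplified DSC-style control loop (illustrative,
    not HP's algorithm). setpoints: CRAC id -> supply temperature (C);
    inlet_temps: sensor id -> rack inlet temperature (C); influence:
    CRAC id -> list of sensor ids in its region of influence."""
    new = {}
    for crac, sensors in influence.items():
        hottest = max(inlet_temps[s] for s in sensors)
        error = hottest - target            # positive: racks too warm
        adjusted = setpoints[crac] - gain * error
        new[crac] = min(hi, max(lo, adjusted))  # clamp to coil limits
    return new
```

A unit whose racks run hot lowers its supply temperature; a unit with cool racks raises it, saving chiller energy, which mirrors the efficiency behavior described above.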


Figure 8. Basic features of HP Dynamic Smart Cooling


The benefits of DSC include:

• Energy savings: up to 45 percent reduction in cooling costs, depending on the cooling plant

• Resiliency: automatic re-tuning in the event of IT load changes or unplanned disturbances that may threaten the effectiveness of cooling resources

• Higher COP: facilitates higher return air temperature and higher inlet temperature at the air handler coils, resulting in a higher Coefficient of Performance

• Increased efficiency: reduces wasteful overprovisioning of data center resources and enables higher chilled water temperature for energy savings, yet maintains additional cooling capacity for use when needed

• Flexibility: the data center becomes easy to retrofit; customers can choose to spend energy savings on IT scalability

• Investment protection: increased degrees of freedom for future growth

20 Bash, C.E., Patel, C.D., Sharma, R.K., “Dynamic Thermal Management of Air Cooled Data Centers,” IEEE ITherm, 2006.

HP Modular Cooling System

The drive to maximize the compute density of data centers has prompted alternative cooling methods that move the cooling solution closer to the cooling target. These methods, categorized as closely-coupled cooling solutions, include liquid-cooled racks such as the HP Modular Cooling System. The benefits of closely-coupled solutions include better efficiency (by applying cooling directly where it is needed), higher rack densities, and conservation of floor space. One MCS enclosure has enough cooling capacity to support the heat load generated by a 35-kW rack of equipment, more than the heat generated by three 10-kW racks, while occupying 40 percent less floor space.

The MCS evenly distributes cold supply air at the front of the rack of equipment (Figure 9). Each server receives adequate supply air, regardless of its position within the rack or the density of the rack. The servers expel warm exhaust air in the rear of the rack. The variable-speed fan modules channel the warm air from the rear of the rack into the heat exchanger (HEX) modules. The HEX modules cool the air and then re-circulate it to the front of the rack. Chilled water for the heat exchangers can be provided by the facility’s chilled water system or by a dedicated chilled water unit.


Figure 9. HP Modular Cooling System


The MCS is ideal for remote, unattended sites; for data centers that have reached the limit of their installed floor-level cooling capability; or for facilities that need to reduce the effect of high-density racks on their infrastructure. It can support fully populated high-density racks while eliminating the need to add more facility air conditioning capacity.

The MCS has a maximum rated capacity of 35 kW at greater than 90 percent cooling efficiency, thus presenting an almost neutral (10 percent) heat load in a data center. For example, the heat load of a 35-kW rack of equipment requires 9.95 tons of cooling capacity (35 kW x 0.2844 tons/kW = 9.95 tons). This means that at 90 percent efficiency, the MCS provides almost 9 tons of the cooling capacity required. To provide the same amount of cooling as the MCS, typical CRAH/CRAC units operating at 50 percent efficiency would need to deliver 18 tons of cooling capacity (9 tons ÷ 0.50). In other words, the MCS relieves the cooling infrastructure of 17 tons of cooling capacity, after subtracting 1.0 ton (10 percent of 9.95 tons) needed to cool the MCS.
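The arithmetic above can be reproduced without the intermediate rounding. (The text rounds 9.95 tons down to 9 before dividing, so its 18-ton and 17-ton figures differ slightly from the exact values computed here.)

```python
KW_TO_TONS = 0.2844          # 1 kW of heat load ~ 0.2844 tons of refrigeration

rack_kw = 35.0
load_tons = rack_kw * KW_TO_TONS             # ~9.95 tons of heat to remove
mcs_efficiency = 0.90
mcs_tons = mcs_efficiency * load_tons        # ~8.96 tons handled by the MCS
crac_efficiency = 0.50
crac_tons = mcs_tons / crac_efficiency       # ~17.9 tons a CRAC would need
mcs_overhead = 0.10 * load_tons              # ~1.0 ton to cool the MCS itself
relief = crac_tons - mcs_overhead            # ~16.9 tons of capacity freed
```

The exact relief works out to roughly 16.9 tons, consistent with the rounded 17-ton figure in the text.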

Summary

HP’s strategies include improving power and cooling efficiency at the component, system, and infrastructure levels within the data center. At the component level, HP uses a portfolio of technologies including power-saving multi-core processors from Intel and AMD, more-efficient power supplies, and other components not covered in this paper, such as low-power small form factor drives and high-efficiency HP ActiveCool fans.

The benefits of these components are multiplied at the system level in HP ProLiant servers through power management technologies such as Dynamic Power Saver, Power Regulator, and Insight Power Manager. Dynamic Power Saver increases power supply efficiency at the same time that HP Power Regulator reduces processor power consumption to save even more energy. The Insight Power Manager plug-in to Systems Insight Manager v5.1 allows a power cap to be set on groups of servers to limit their power consumption and heat generation.

At the infrastructure level, HP is leading the industry with energy-efficient technologies, such as the HP Modular Cooling System and Dynamic Smart Cooling, as well as the HP Data Center Thermal Assessment service. The MCS is ideal for sites that want to add high-density racks with minimal impact on their existing cooling infrastructure and floor space. DSC can reduce cooling costs by up to 45 percent, depending on the type of cooling plant. HP Data Center Thermal Assessment service delivers thermal analysis and site planning to allow IT and facilities managers to make informed decisions regarding current and future data center operations.

Call to action

This paper calls for IT organizations to understand the challenges ahead; to abandon outdated technologies that do not focus on energy efficiency; to become familiar with new tools and technologies that improve performance and energy efficiency; and to adopt proven strategies and best practices for high-density IT equipment.


For more information

For additional information, refer to the resources listed below.

• Patel, C., Shah, A., “Cost Model for Planning, Development and Operation of a Data Center,” HP Laboratories Technical Report, HPL-2005-107(R.1), June 9, 2005

• “Energy Flow in the Information Technology Stack: Coefficient of Performance of the Ensemble and its Impact on the Total Cost of Ownership”

• “Estimating Total Power Consumption by Servers in the U.S. and the World”

• “Server virtualization technologies for x86-based HP BladeSystem and HP ProLiant servers”

To send comments

Send comments about this paper to TechCom@HP.com.

© 2007 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

AMD, AMD Opteron, and PowerNow are trademarks of Advanced Micro Devices, Inc.

Intel, SpeedStep, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries and are used under license.

TC070805TB, August 2007
