Keeping Hot Chips Cool

Circuits
R-US

Ruchir Puri, Leon Stok, Subhrajit Bhattacharya


IBM T.J. Watson Research Center
Yorktown Heights, NY

So, What's Going On?


[Figure: Power Density (W/cm^2) vs. Gate Length (microns), both on log scales. Curves: Active Power and SubThreshold Power, with a Shrinking Margin between them.]

At 65nm node Static Power is equal to Active Power

Clock distribution accounts for half of active power

Why Can't We Keep Scaling Vt?

[Figure: Device Leakage vs Delay. Delay (ps) and device leakage (log scale) plotted against Threshold Voltage from 0 to 0.4.]
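The shape of the Device Leakage vs Delay plot can be reproduced with two standard first-order device models: subthreshold leakage grows exponentially as Vt is lowered, while gate delay improves only polynomially (alpha-power law). The sketch below is illustrative only. The 85 mV/decade subthreshold swing, the 1.2 V supply, and alpha = 1.3 are assumed values, not numbers taken from the slides.

```python
def relative_leakage(vt, vt_ref=0.4, swing=0.085):
    """Subthreshold leakage relative to a device with threshold vt_ref.

    First-order model: leakage rises 10x for every `swing` volts of Vt reduction.
    """
    return 10 ** ((vt_ref - vt) / swing)


def relative_delay(vt, vdd=1.2, vt_ref=0.4, alpha=1.3):
    """Gate delay relative to vt_ref, using delay ~ Vdd / (Vdd - Vt)^alpha."""
    delay = vdd / (vdd - vt) ** alpha
    delay_ref = vdd / (vdd - vt_ref) ** alpha
    return delay / delay_ref


# Sweep Vt downward: leakage explodes while delay improves only modestly.
for vt in (0.4, 0.3, 0.2, 0.1):
    print(f"Vt={vt:.1f} V  leakage x{relative_leakage(vt):9.1f}  "
          f"delay x{relative_delay(vt):.2f}")
```

Under these assumptions, each step down in Vt buys a shrinking delay improvement at an exponentially growing leakage cost.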

Low Power Opportunities

[Figure: Power4 Timing Histogram. Late Mode Timing Checks (Thousands) vs. Timing Slack (psec), from -40 to 280 psec. Annotation: Exploiting positive slacks.]

Most of the power reduction techniques exploit this positive slack.
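One way to picture how positive slack is exploited: a cell can be swapped for a slower, lower-power variant (for example a high-Vt cell) only where the slack absorbs the added delay. The sketch below is a simplification with hypothetical names and numbers. It treats each timing endpoint independently, whereas a real flow has to update slack incrementally because paths share gates.

```python
def select_for_low_power_swap(slacks_ps, delay_penalty_ps, guard_band_ps=0.0):
    """Return indices of timing endpoints whose slack can absorb a slower cell.

    An endpoint qualifies if its slack, after paying the delay penalty,
    still leaves at least `guard_band_ps` of margin.
    """
    return [
        i for i, slack in enumerate(slacks_ps)
        if slack - delay_penalty_ps >= guard_band_ps
    ]


# Hypothetical slack values in picoseconds.
slacks = [-20, 0, 15, 40, 120, 260]
chosen = select_for_low_power_swap(slacks, delay_penalty_ps=30, guard_band_ps=10)
print(chosen)  # [3, 4, 5]
```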

Low Power Levers


Structural Techniques

Voltage Islands
Multi-threshold devices
Multi-oxide devices
Minimize capacitance by custom design
Power efficient circuits
Parallelism in micro-architecture

Dynamic Techniques

Clock gating
Power gating
Variable frequency
Variable voltage supply
Variable device threshold

Outline

Voltage Islands: Active Power
Clock & Latch Optimization: Clock Power
Power Gating: Leakage Power

Minimizing Active Power:
Coarse Grained Voltage Islands

Trade off power for delay by running functional blocks at different voltages
Can use mix of Low and High Vt to balance performance and leakage
Switch off inactive blocks to reduce leakage power

[Figure: voltage island block diagram with supplies Vdd0, Vdd1, and Vdd2, two SWITCH blocks, High VT LOGIC and LOGIC blocks, IP 1, IP 2, and a Power Management Unit.]

E.g.: Telecom ASIC 1.0/1.2 V islands saved:

16 % active power
50 % standby power
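The active power benefit of a lower-voltage island follows from the standard dynamic power relation P = a * C * Vdd^2 * f. A minimal sketch of the per-block saving at fixed frequency, using the 1.0 V and 1.2 V supplies from the example above (the function names are illustrative):

```python
def dynamic_power(c_eff, vdd, freq, activity=1.0):
    """Switching power in watts: activity * C * Vdd^2 * f."""
    return activity * c_eff * vdd ** 2 * freq


def island_saving(v_high, v_low):
    """Fraction of dynamic power saved when a block moves from v_high to v_low.

    Frequency, capacitance, and activity are assumed unchanged.
    """
    return 1.0 - (v_low / v_high) ** 2


print(f"{island_saving(1.2, 1.0):.1%}")  # 30.6%
```

This is the saving for a block that actually moves to the lower supply. A chip-level figure would be smaller if only part of the design runs at the lower voltage, and this sketch ignores level shifter overhead.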

Fine-Grained Voltage Islands


PowerPC 405
Secondary power drop

Vddl = 1.2V

Vddh = 1.5V

No timing degradation and no area increase for the core!

Minimizing Clock Power:
Local Clock Buffer - Latch Clustering

Clocks consume a large amount of power in high-performance designs
A large portion of that power goes to the last stage of the clock tree
Minimize the capacitive loading on local clock buffers by clustering latches around them
Tradeoff between latch placement flexibility and clock power savings
Reduction in clock skew between capturing and launching latch compensates for loss in latch placement flexibility
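The clustering idea can be sketched with a toy placement model: assign each latch to its nearest local clock buffer (LCB), then pull latches toward that buffer and compare clock wire length, which is used here as a rough proxy for wire capacitance. Everything below is a hypothetical illustration (point-to-point Manhattan wiring, made-up coordinates), not the placement algorithm used by the authors.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def cluster_latches(buffers, latches):
    """Assign each latch to its nearest local clock buffer."""
    clusters = {b: [] for b in buffers}
    for latch in latches:
        nearest = min(buffers, key=lambda b: manhattan(b, latch))
        clusters[nearest].append(latch)
    return clusters


def clock_wire_length(clusters):
    """Total point-to-point wire length from each buffer to its latches."""
    return sum(manhattan(b, l) for b, ls in clusters.items() for l in ls)


def pull_toward_buffer(clusters, max_move):
    """Move each latch up to max_move units closer to its buffer (x first, then y)."""
    moved = {}
    for b, ls in clusters.items():
        moved[b] = []
        for x, y in ls:
            budget = max_move
            dx = max(-budget, min(budget, b[0] - x))
            budget -= abs(dx)
            dy = max(-budget, min(budget, b[1] - y))
            moved[b].append((x + dx, y + dy))
    return moved


buffers = [(0, 0), (10, 10)]
latches = [(3, 1), (1, 4), (8, 12), (12, 7)]
clusters = cluster_latches(buffers, latches)
before = clock_wire_length(clusters)
after = clock_wire_length(pull_toward_buffer(clusters, max_move=2))
print(before, after)  # 18 10
```

The `max_move` limit stands in for the placement flexibility that is given up: a larger allowed move saves more clock wire but disturbs the data-path placement more.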

Clock Power Savings

[Figure: bar chart of % Capacitance Savings (Wire and Total) per Clock Net, for nets c1_0 to c1_12 and c2_0 to c2_12.]

Reduces total capacitance on the local clock buffer by 25%


Direct savings in clock power in the Random Control Logic

Minimizing Leakage Power:
Power Supply Gating

[Figure: Logic Block connected to ground through a Footer Switch controlled by SLEEP.]

Leakage power is now more than switching power
Limits the performance of microprocessors
Power gating is one of the most effective ways of minimizing leakage power
Cut off power to inactive units/components
Dynamic/workload-based power gating
Reduces both gate and sub-threshold leakage
20x to 2000x reduction in leakage with little or no cycle time penalty

Power Gating Concept
Performance on Demand

[Figure: two chip floorplans with processors P1 to P4 and L2, one with the Dedicated Units off and one with them on.]

Dedicated Units off: More Power Available to Scalar Units, Higher SPEC Performance
Dedicated Units on: Dedicated Units Available for Higher Application Performance

Normal Operation Mode

[Figure: CORE between VDDL and VGND, with the footer between VGND and GNDL. IDS vs. VDS curves for VGS = VDD and VGS = 0 V, marking IDS,MAX, IACTIVE, and VDS,LINEAR.]

To reduce the performance degradation, the voltage drop across the SLEEP transistor should be minimized to reduce active leakage current. Requires sizing up of the footer device.

Sleep Mode

[Figure: CORE between VDDL and VGND, with the footer between VGND and GNDL. IDS vs. VDS curves for VGS = VDD and VGS = 0 V, marking IDS,MAX.]

During the sleep mode, all of the internal capacitive nodes and the VGND node are charged up to near VDD. Requires sizing down of the footer device to reduce standby leakage.

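Wake-Up Mode

[Figure: CORE between VDDL and VGND, with the footer between VGND and GNDL. IDS vs. VDS curves for VGS = VDD and VGS = 0 V, marking IDS,MAX, ITURN_ON, and Rs.]

When the SLEEP transistor is turned on, the maximum instantaneous current can flow. Requires sizing up of the footer device.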
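The normal and sleep modes pull footer sizing in opposite directions, which a first-order model makes concrete: a wider footer has lower on-resistance (smaller virtual ground drop) but leaks more when off. The sketch below assumes the footer acts as a linear resistor when on. The per-micron on-resistance, the off-current, and the 10 mA active current are hypothetical values chosen only for illustration, and the wake-up inrush current is not modeled.

```python
def footer_tradeoff(width_um, i_active_a, r_on_ohm_um=1000.0, i_off_a_per_um=1e-9):
    """Return (virtual ground drop in V, standby leakage in A) for a footer width.

    r_on_ohm_um: on-resistance of a 1 um wide footer (assumed value).
    i_off_a_per_um: off-state leakage per um of footer width (assumed value).
    """
    vgnd_drop = i_active_a * r_on_ohm_um / width_um  # normal mode: IR drop
    standby_leakage = i_off_a_per_um * width_um      # sleep mode: off-state leakage
    return vgnd_drop, standby_leakage


def min_width_for_drop(i_active_a, max_drop_v, r_on_ohm_um=1000.0):
    """Smallest footer width (um) that keeps the virtual ground drop within max_drop_v."""
    return i_active_a * r_on_ohm_um / max_drop_v


# Size for a 10 mV virtual ground drop at an assumed 10 mA active current.
width = min_width_for_drop(i_active_a=0.01, max_drop_v=0.010)
drop, leakage = footer_tradeoff(width, i_active_a=0.01)
print(f"width={width:.0f} um  drop={drop * 1e3:.1f} mV  leakage={leakage * 1e6:.2f} uA")
```

Halving the width in this model doubles the virtual ground drop and halves the standby leakage, which is the tension the three mode slides describe.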

Sleep / Wake / Run State Control

[Figure: state control timing diagram. Labels: Enter sleep state, Exit sleep state, run, run (idle), sleep, charge cycles, discharge cycle (wake), assert disable & run fence, assert discharge wake, deassert wake/run, enable fence.]
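The Sleep / Wake / Run sequencing can be pictured as a small state machine. The sketch below is one reading of the slide, not the authors' controller: the event names (`enter_sleep`, `exit_sleep`, `wake_done`) are hypothetical, and the assumption that the fence is asserted whenever the block is not running is an interpretation of the fence labels in the diagram.

```python
from enum import Enum


class State(Enum):
    RUN = "run"
    SLEEP = "sleep"
    WAKE = "wake"


# Allowed transitions: run -> sleep -> wake -> run (event names are hypothetical).
TRANSITIONS = {
    (State.RUN, "enter_sleep"): State.SLEEP,
    (State.SLEEP, "exit_sleep"): State.WAKE,
    (State.WAKE, "wake_done"): State.RUN,
}


class PowerGateController:
    """Tracks the power state of one gated block."""

    def __init__(self):
        self.state = State.RUN

    @property
    def fence(self):
        """Outputs are assumed fenced whenever the block is not running."""
        return self.state is not State.RUN

    def step(self, event):
        """Apply an event; events that are invalid in the current state are ignored."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


ctrl = PowerGateController()
for event in ("enter_sleep", "exit_sleep", "wake_done"):
    print(event, "->", ctrl.step(event).value, "fence =", ctrl.fence)
```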

Footer Selection and Sizing

[Figure: Leakage Reduction for different footer choices, with data labels 15.5x, 20x, 25x, 33x, 50x, and 100x.]

< 1% Frequency Loss
10x-20x Leakage Reduction

Power vs Performance Tradeoff

130nm Hardware
~8% Performance Degradation Due to Sleep Transistor with 1% area overhead
Target Specification: 250MHz at 0.9V ~ 500MHz at 1.4V
1% footer size is used for a 2-stage pipelined 40-bit ALU

Sleep Transistor Sizing and Performance

130nm Hardware
Less Than 2% Performance Degradation
More Than 8% Performance Degradation

Leakage Power Reduction

130nm Hardware
Leakage Suppression Using VDD Scaling: ~8.4x
Leakage Suppression Using Power Gating Structure with 1% area overhead: ~2000x

Physical Design:
External Footer Switch

[Figure: Macro/Core with a Global Grid (GND) and a Virtual Grid (VGND) on M1 and M2 metal, showing the Footer Switch Location.]

Physical Design:
Internal Footer Switch

[Figure: alternating GND and VDD rails with VGND lines on M1 and M2 metal, showing the Footer Locations.]

Internal fine-grained power gating is more efficient in addressing Electro-Migration and Current Delivery.

Ground Redistribution

[Figure: cross-section of the ground stack. Labels: Global ground, Virtual ground, M3, V2, M2, V1, M1, Contact, Logic Device, Footer Cell.]

This part of the redistribution is electrically similar to an unmodified distribution
The real chip-level ground distribution is M4 and above. It is unchanged by power gating

Physical Design: Footer Insertion

[Figure: placement Without Footers and With Footers, showing the Footer Rows.]

Power Gating in High-Performance

Gated and non-gated logic have identical width
5% total area overhead for power gating
20X leakage reduction
<1% performance degradation

[Figure: layouts of Non-gated Logic and Gated Logic.]

Power Gating: Footer area overhead

10.4%

5.7%

10mV Virtual Ground

Conclusions

Power is the limiting factor in traditional CMOS scaling and must be dealt with aggressively
Controlling leakage is crucial for future scaling
Power gating and voltage islands are effective techniques to minimize leakage and active power
Special consideration must be given to clock distribution in high-performance designs to minimize clock power
To keep hot chips cool, a holistic power minimization approach across the whole design stack is required, which must include:

Device level techniques
Circuit level techniques
System level power management
