
Clock Gating: A Power Optimizing Technique for VLSI Circuits
Jitesh Shinde (1), Dr. S.S. Salankar (2)
(1, 2) Department of Electronics & Telecommunication Engineering,
J.L.Chaturvedi College of Engineering, Nagpur
(1) victoria_jitesh@yahoo.com
Abstract: Clock gating is one of the power-saving techniques
used on the Pentium 4 processor and in next generation
processors. To save power, clock gating activates the
clocks in a logic block only when there is work to be done.
From the earliest days of the Pentium 4 processor design,
power consumption was a concern. The clock gating concept
isn't a new one; however, the Pentium 4 processor used this
technology to a large extent. Every unit on the chip has a
power reduction plan, and almost every Functional Unit Block
(FUB) contains clock gating logic.
The work in this paper investigates the various clock gating
techniques that can be used to optimise power in VLSI circuits
at RTL level, and the various issues involved in applying these
power optimization techniques at RTL level.
Keywords: Clock Gating (CG), latch-free clock gating,
latch-based clock gating, core dynamic power dissipation.

I. INTRODUCTION
With the advent of the consumer era and the popularity of
mobile applications, power optimization is the mantra of the
day. Designers go through several iterations to optimize
power in order to achieve their power budgets. Though
power should be optimized at all stages of the design flow,
optimizations in early design stages have the greatest impact
in reducing power [9, 10].
Clock power consumes 50-70 percent of total chip power
and is expected to significantly increase in the next
generation of designs at 45nm and below. This is because
dynamic power is directly proportional to the switched
capacitance, the square of the supply voltage, and the
frequency of the clock, as shown in the following equation:
Power = Capacitance * (Voltage)^2 * (Frequency)
Hence, reducing clock power is very important. Clock
gating is a key power reduction technique used by many
designers and is typically implemented by gate-level power
synthesis tools.
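The equation above can be evaluated numerically to see why stopping the clock matters. The following is a minimal sketch, not a power model from the paper: the capacitance, voltage and frequency values are hypothetical, and the `activity` factor (the fraction of cycles in which the clock actually toggles) is an added assumption that does not appear in the original equation.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """Switching power in watts: activity * C * V^2 * f.

    `activity` is an assumed scaling factor (1.0 = clock toggles every
    cycle); with activity=1.0 this is exactly the equation in the text.
    """
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical clock network: 100 pF switched at 1.2 V and 100 MHz.
base = dynamic_power(100e-12, 1.2, 100e6)

# Same network with the clock gated off 75% of the time.
gated = dynamic_power(100e-12, 1.2, 100e6, activity=0.25)

# Halving the supply voltage cuts power to a quarter (V is squared).
half_voltage = dynamic_power(100e-12, 0.6, 100e6)

print(f"ungated: {base:.4f} W, gated: {gated:.4f} W, half V: {half_voltage:.4f} W")
```

Under these assumed numbers, gating scales power linearly with the fraction of active cycles, while voltage scaling acts quadratically.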
RTL Clock Gating is the most commonly used
optimization technique for reducing dynamic power. The
challenge of optimizing power by adding clock gating is
knowing where and when to insert clock gating. The
traditional method of looking at the percentage of registers
that are clock gated is not indicative of the power savings
because it does not take into account switching activity. The
average Clock-Gating Efficiency for a design is a much
better indicator of dynamic power consumption because it is
a measure of both how many and how long registers are
gated.
II. CLOCK GATING
Clock gating, probably one of the most well-known
low-power techniques, is very effective in reducing the
power consumption of digital circuits, including VLSI
circuits. The goal of this technique is to disable or
suppress transitions from propagating to parts of the clock
path (i.e., flip-flops, clock network, and logic) under a
certain condition computed by clock-gating circuits. The
savings are mainly due to the reduction of the switched
capacitance in the clock network and of the switching
activity in the logic fed by the storage elements, because
unnecessary transitions are not loaded when the clock is not
active. CG is illustrated in figure 1: a block CG, which
inhibits the clock signal when the idle condition is true, is
associated with each sequential functional unit. The gated
clock signal is computed by function Fcg. CLK is the system
clock and CLKG the gated clock of the functional unit.

Fig 1. Clock gating principle

It is a good design idea to turn off the clock when it is not
needed. Automatic clock gating is supported by modern
EDA tools. They identify the circuits where clock gating
can be inserted.
The RTL stage is the best point in the design process to
optimize dynamic power. At this point, the system
architecture is defined, the design is clock cycle accurate,
and there is accurate power information available from
lower design stages. The only thing left is for hardware
designers to have an RTL metric to evaluate and identify
candidate logic within a design for optimization of clock
gating.

RTL clock gating works by identifying groups of
flip-flops which share a common enable control signal.
Traditional methodologies use this enable term to control
the select on a multiplexer connected to the D port of the
flip-flop or to control the clock enable pin on a flip-flop
with clock enable capabilities. RTL clock gating uses this
enable signal to control a clock gating circuit which is
connected to the clock ports of all of the flip-flops with the
common enable term. Therefore, if a bank of flip-flops
which share a common enable term has RTL clock gating
implemented, the flip-flops will consume virtually no
dynamic power as long as this enable signal is false.
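The difference between the traditional enable-multiplexer style and RTL clock gating can be illustrated with a small cycle-level model. This is only a behavioural sketch in Python, not RTL: the function names, the data values and the enable pattern are hypothetical, and the number of clock edges reaching the flip-flops is used as a rough stand-in for clock power.

```python
def run_enable_mux(data, enable):
    """Traditional style: the flip-flop is clocked every cycle and a
    multiplexer on the D input recirculates Q when enable is 0."""
    q, clock_edges, trace = 0, 0, []
    for d, en in zip(data, enable):
        clock_edges += 1          # clock pin toggles on every cycle
        q = d if en else q        # mux selects new data or the old value
        trace.append(q)
    return trace, clock_edges


def run_gated_clock(data, enable):
    """Clock-gated style: the flip-flop only sees a clock edge when the
    shared enable is 1, so it simply keeps its value otherwise."""
    q, clock_edges, trace = 0, 0, []
    for d, en in zip(data, enable):
        if en:                    # gating circuit passes the clock pulse
            clock_edges += 1
            q = d
        trace.append(q)
    return trace, clock_edges


# Hypothetical stimulus: the register is enabled in 3 of 8 cycles.
data = [5, 7, 7, 2, 9, 9, 9, 4]
enable = [1, 0, 0, 1, 0, 0, 0, 1]

mux_trace, mux_edges = run_enable_mux(data, enable)
cg_trace, cg_edges = run_gated_clock(data, enable)

print("same register contents:", mux_trace == cg_trace)
print("clock edges seen by flip-flops:", mux_edges, "vs", cg_edges)
```

In this model both styles hold identical register contents in every cycle, while the gated version delivers a clock edge only in the enabled cycles.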

III. HOW TO IMPLEMENT CLOCK GATING

There are many clock gating styles available to optimize
power in VLSI circuits. They can be:
1) Latch-free based design.
2) Latch-based design.
3) Flip-flop based design.
4) Intelligent clock gating optimization options available
in tools from vendors such as Xilinx, Altera and Cadence
(SOC Encounter).

LATCH-FREE BASED CLOCK GATING DESIGN
The latch-free clock gating style uses a simple AND or
OR gate (depending on the edge on which flip-flops are
triggered). If the enable signal goes inactive in the middle
of a clock pulse, or if it toggles multiple times, the gated
clock output can either terminate prematurely or generate
multiple clock pulses. This restriction makes the latch-free
clock gating style inappropriate for our single-clock
flip-flop based design (figure 2).

Fig 2. Latch-free clock gating

LATCH-BASED CLOCK GATING DESIGN
The latch-based clock gating style adds a level-sensitive
latch to the design to hold the enable signal from the active
edge of the clock until the inactive edge of the clock. Since
the latch captures the state of the enable signal and holds it
until the complete clock pulse has been generated, the
enable signal need only be stable around the rising edge of
the clock, just as in the traditional ungated design style
(figure 3).

Fig 3. Latch-based clock gating

In some applications, latch-based designs are preferred to
D flip-flop (DFF) based designs. The basic concept is that
a DFF can be split into two latches, and each one is clocked
with an independent clock signal. The two clocks are
non-overlapping clocks, as presented in figure 4. A
combinational network is usually inserted between the two
latches to build a pipelined datapath. The main advantage is
that this kind of design tolerates greater clock skew before
failing than a similar DFF-based design. The second
advantage is that time borrowing is achieved naturally in the
pipelined datapath.

Fig 4. Master-slave latch and non-overlapping clock concepts

Clock gating is easy to implement in such a design. A
simple AND gate is used to generate the gated clock. This
configuration (figure 5) is glitch-free because the control
signal, generated when Phi1 is high, is stable and remains
stable when Phi2 goes high.

Fig 5. Clock gating of latch-based design
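The glitch behaviour that separates the latch-free and latch-based styles can be reproduced with sampled waveforms. The sketch below is an illustrative Python model, not a circuit description: it assumes rising-edge flip-flops, an AND-type gate, a latch that is transparent while the clock is low, and a hypothetical enable waveform sampled eight times per clock period.

```python
def count_pulses(wave):
    """Count rising edges (0 -> 1 transitions) in a sampled waveform."""
    return sum(1 for prev, cur in zip([0] + wave[:-1], wave) if cur and not prev)


def latch_free(clk, en):
    """Latch-free style: gated clock is simply CLK AND enable."""
    return [c & e for c, e in zip(clk, en)]


def latch_based(clk, en):
    """Latch-based style: a level-sensitive latch follows enable while the
    clock is low and holds it while the clock is high; the held value is
    then ANDed with the clock."""
    held, out = 0, []
    for c, e in zip(clk, en):
        if not c:
            held = e              # latch transparent during the low phase
        out.append(c & held)
    return out


# Two clock periods, 8 samples each: low phase first, then high phase.
clk = [0, 0, 0, 0, 1, 1, 1, 1] * 2

# Hypothetical enable: drops briefly during the first high phase and
# pulses briefly during the second high phase.
en = [1, 1, 1, 1, 1, 0, 1, 1,
      0, 0, 0, 0, 0, 1, 1, 0]

lf = latch_free(clk, en)
lb = latch_based(clk, en)

print("latch-free pulses: ", count_pulses(lf))
print("latch-based pulses:", count_pulses(lb))
```

With this stimulus the latch-free output is chopped into extra, shortened pulses, whereas the latch-based output contains one full-width pulse for the one period in which enable was high before the rising clock edge.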

FLIP-FLOP BASED CLOCK GATING DESIGN


This technique is similar to the latch-based design, the only
difference being that D flip-flops are used instead of
latches. However, because of the advantages the latch-based
design offers, the flip-flop based design is generally not
preferred.
[Figure: Register A with input Datain, outputs Q1 and Q2, enable ENB, control signals CTRL and CTRLint, system clock CLK and gated clock CLKG]

Fig 4. Enabled (a) to gated clock transformation (b).

It is well known that this kind of flip-flop based design
is area- and power-consuming, but its advantage compared
with gated-clock-based design is that testability can be
easily implemented and clock skew is more manageable.
INTELLIGENT CLOCK GATING OPTIMIZING OPTION AVAILABLE IN SYNTHESIS TOOL
Recently, an intelligent clock gating option has been made
available in many industry tools, such as Cadence SOC
Encounter and the tools from Altera and Xilinx, to optimize
the power consumption of the design [6].
It is important to note that in such cases the designer
may not always get the power reduction to the desired level.
In such conditions the designer may have to incorporate the
clock gating methods discussed above at RTL level to further
reduce the dynamic power consumption of the circuit.
IV. ISSUES IN IMPLEMENTATION OF CLOCK GATING DESIGN TECHNIQUES

i.] The clock gate (i.e., AND or OR) must not alter
the waveform of the clock other than turning the clock on or
off.
ii.] Clock gating hold time violations and set-up
time violations can be fixed like other violations during the
physical design phase (timing closure phase of back-end
design).
iii.] Techniques that can be used to fix hold violations
are clock skewing and buffering in the data path near the
endpoint (timing closure phase of back-end design).
iv.] If clock gating is used to divide the clock, the
designer should take care of the phase of the clock gating
signal.
v.] Glitches may occur in the gated clock if clock
gating is not done properly.
vi.] Improper control of the gating signal could result
in serious functional problems.
vii.] Overhead in design, verification and silicon
area.
viii.] Clock-Gating Efficiency is defined as the
percentage of time a register is gated for a given
switching activity. When looking at an entire design, the
average Clock-Gating Efficiency can be computed as the
average of the Clock-Gating Efficiencies of all registers in
the design for a given simulation test bench.
Improving the Clock-Gating Efficiency in turn means
reduced switching, which can save dynamic power. A
designer's goal is to improve the average Clock-Gating
Efficiency as much as possible. Achieving 100% is not
practical, since it would mean the design is idle and
non-functional all the time.
Low Clock-Gating Efficiency is a good metric to identify
candidate areas of the design to add clock gating. It may not
always be possible to add clock gating to low efficiency
areas, and adding clock gating may not necessarily be
accompanied by reduced power, because dynamic power is
also a function of clock frequency, voltage, and capacitance.
While Clock-Gating Efficiency is not an absolute
indicator of power, it is a very good metric for hardware
designers to gain visibility into power at the RTL without
requiring time-consuming power analysis or synthesis.
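The Clock-Gating Efficiency definition above translates directly into a small calculation over per-register enable traces. The following is a minimal sketch under stated assumptions: the register names and traces are hypothetical, each trace holds one sample per clock cycle from a simulation, and a register is counted as gated in every cycle where its enable is 0.

```python
def clock_gating_efficiency(enable_trace):
    """Percentage of cycles in which the register clock is gated
    (enable == 0) for the given simulation trace."""
    gated_cycles = sum(1 for en in enable_trace if not en)
    return 100.0 * gated_cycles / len(enable_trace)


def average_cge(traces):
    """Average of the per-register efficiencies over the whole design."""
    values = [clock_gating_efficiency(t) for t in traces.values()]
    return sum(values) / len(values)


# Hypothetical enable traces for three registers over 8 cycles.
traces = {
    "acc":   [1, 0, 0, 0, 1, 0, 0, 0],  # gated in 6 of 8 cycles
    "flags": [1, 1, 0, 0, 1, 1, 0, 0],  # gated in 4 of 8 cycles
    "pc":    [1, 1, 1, 1, 1, 1, 1, 1],  # never gated
}

per_register = {name: clock_gating_efficiency(t) for name, t in traces.items()}
design_cge = average_cge(traces)

for name, cge in per_register.items():
    print(f"{name}: {cge:.1f}%")
print(f"average: {design_cge:.1f}%")
```

In this example the never-gated register pulls the average down, which is the kind of low-efficiency candidate the metric is meant to expose.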
V. CASE STUDY: IMPLEMENTATION OF CLOCK GATING IN 8-BIT ARITHMETIC LOGIC UNIT
For dynamic power management at RT and gate level, a
gated clock provides a way to selectively stop the clock and
thus force the original circuit to make no transition
whenever the computation that is to be carried out at the
next clock cycle is redundant. In other words, the clock
signal is disabled according to the idle conditions of the
logic network. For reactive circuits, the number of clock
cycles in which the design is idle in some wait states is
usually large. Therefore, avoiding the power waste
corresponding to such states may be significant.
In this case study, an 8-bit ALU (Arithmetic Logic
Unit) is first designed and implemented with the Xilinx ISE
Project Navigator 12.4 tool. This 8-bit ALU is planned to be
used later in the design of an 8-bit microprocessor, in which
it may be required to inhibit the activity of the ALU during
a certain number of cycles of an instruction in order to
reduce the dynamic power consumption of the microprocessor.
So, during the first phase of the study, an 8-bit ALU was
implemented. During this phase, the design was tested with
the various intelligent clock gating options and design
strategies available in Xilinx Project Navigator version
12.4 to study their effect on the net dynamic power
dissipated and on the area in terms of logic blocks used. The
results of this synthesis and implementation (FPGA family:
Spartan-6) are as follows:

Table 1: Results for 8-bit ALU [FPGA family: Spartan-6]

Design Strategy                   Power      Total Power  Dynamic Power  No. of Logic
                                  Reduction  (Watt)       (Watt)         slices used
Balanced                          OFF        0.210        0.197          57 / 2400
Balanced                          ON         0.209        0.196          57 / 2400
Power Minimization                ---        0.209        0.196          60 / 2400
Area Minimization (Strategy-I)    OFF        0.209        0.196          60 / 2400
Area Minimization (Strategy-II)   OFF        0.209        0.196          59 / 2400
Area Minimization (Strategy-I)    ON         0.208        0.195          60 / 2400
Area Minimization (Strategy-II)   ON         0.208        0.195          59 / 2400

From the above results, it was observed that using the
inherent tool capability to optimize dynamic power or area
may not achieve the desired optimization.
So in the next phase of the case study, a clock gating
concept (latch-based) was incorporated in the design without
affecting the functionality of the design. The results of this
synthesis and implementation (FPGA family: Spartan-6)
are as follows:
Table 2: Results for 8-bit ALU with CG [FPGA family: Spartan-6]

Design Strategy                   Power      Total Power  Dynamic Power  No. of Logic
                                  Reduction  (Watt)       (Watt)         slices used
Balanced                          OFF        0.034        0.025          22 / 2400
Balanced                          ON         0.032        0.024          22 / 2400
Power Minimization                ---        0.031        0.022          21 / 2400
Area Minimization (Strategy-I)    OFF        0.034        0.025          22 / 2400
Area Minimization (Strategy-II)   OFF        0.034        0.025          20 / 2400
Area Minimization (Strategy-I)    ON         0.032        0.024          22 / 2400
Area Minimization (Strategy-II)   ON         0.032        0.023          20 / 2400

On comparing the results of Table 1 and Table 2, the
following points are concluded:
i.] It is good practice to use the inherent intelligent clock
gating option, viz. Design Strategy: Power Minimization, or
Balanced with power reduction ON, to improve the chances
of minimizing the dynamic power consumption of the
circuit.
ii.] If possible, it is wiser to check whether clock gating
concepts can be incorporated in the circuit at RTL
level. This is evident from the results obtained in Table 1
and Table 2. In both cases, the test vectors applied
to the circuit were the same.
iii.] Estimating power depends on representative
switching activity. A simulator can generate a switching
activity file based on a given test-bench. This is only as
representative as the test-bench itself, so selection of a
representative test-bench is critical to good power
estimation.
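The size of the gain reported in Table 1 and Table 2 can be quantified with a short calculation. The sketch below uses the dynamic power values of the first three rows of each table (Balanced with power reduction OFF and ON, and Power Minimization); the pairing of values with row labels follows one reading of the flattened tables and should be treated as an assumption.

```python
# Dynamic power in watts, read from Table 1 (no RTL clock gating) and
# Table 2 (latch-based clock gating added at RTL). Row labels are an
# assumed reading of the original table layout.
dynamic_power_w = {
    "Balanced / OFF":     (0.197, 0.025),
    "Balanced / ON":      (0.196, 0.024),
    "Power Minimization": (0.196, 0.022),
}


def reduction_percent(before, after):
    """Relative saving in percent when going from `before` to `after`."""
    return 100.0 * (before - after) / before


reductions = {
    strategy: reduction_percent(before, after)
    for strategy, (before, after) in dynamic_power_w.items()
}

for strategy, pct in reductions.items():
    print(f"{strategy}: {pct:.1f}% lower dynamic power with RTL clock gating")
```

For comparison, the tool-only options in Table 1 differ from each other by about 0.001 W, which is consistent with conclusion ii.] above.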
VI. CONCLUSIONS
Power optimization, traditionally relegated to the
synthesis and placement-and-routing stages, has moved up
to the system level and RTL stages. Hardware designers use
clock gating to turn off inactive sections of the design and
reduce overall dynamic power consumption.
The RTL approach is important because designers
usually verify power only at the gate level and any change
to the RTL needs many design iterations to reduce power.
The RTL solution thus saves weeks of effort by fixing
potential power issues up-front.
The RTL coding step is not too early in the design flow
to address power consumption optimization. For each
source of consumption and each type of digital block,
appropriate solutions can be implemented. Although the
theory behind some of these techniques can be complex,
they are often easy to implement. RTL designers should be
aware of these techniques and use their knowledge of the
system not only to optimize the speed performance, but also
to reduce the unnecessary switching activity.
REFERENCES
[1] Massoud Pedram and Afshin Abdollahi, "Low Power RT-Level Synthesis Techniques: A Tutorial", Dept. of Electrical Engineering, University of Southern California.
[2] L. Benini, M. Favalli, and G. De Micheli, "Design for testability of gated-clock FSMs", in Proc. European Design and Test Conf., Paris, France, Mar. 1996, pp.
[3] Veena S Chakravarthi and K S Gurumurthy, "Low Power Design Methodology for Core based ASSP", Centillium Communications India Pvt. Ltd and U V College of Engineering, Bangalore, India.
[4] Pieter J. Schenmakers and J.Frans M. Theeuen, "Clock Gating on RTVHDL", Eindhoven University of Technology, Netherlands.
[5] Frank Emnett and Mark Biegel, "Power Reduction Through RTL Clock Gating", Automotive Integrated Electronics Corporation.
[6] Safeen Huda, Muntasir Mallick, Jason H. Anderson, "Clock Gating Architectures for FPGA Power Reduction", Dept. of ECE, Univ. of Toronto, Toronto, ON, Canada.
[7] Frederic Rivoallon, "Reducing Switching Power with Intelligent Clock Gating", WP370, Xilinx, March 1, 2011.
[8] Mitch Dale, "Power Optimization in a High Performance Microprocessor Design", Calypto Design Systems.
[9] Mitch Dale, "Utilizing Clock-Gating Efficiency to Reduce Power in RTL Designs", Calypto Design Systems.
[10] Mitch Dale, "Power Optimization in a High Performance Microprocessor Design", Calypto Design Systems.
