This unit is mainly concerned with the dynamic operation of MOS circuits. The focus will be on
MOS inverters and calculation of propagation delays, delay models for large interconnects, and
power dissipation estimates. These approaches are mainly used for hand analysis, leaving
more accurate predictions to simulation tools.
6.1
This load capacitance is an important parameter since it determines how fast a gate can
change its output, after a change in gate input. This, in turn, determines how frequently the
input can change. A CMOS inverter is shown in Fig. 6.1: in this figure, all capacitances associated with a MOS transistor are included. Some of them are not necessary, e.g., Csb,p, because for this PMOS VB = VS = VDD, so the voltage across it never changes. Cgs,p and Cgs,n are also not required since neither terminal of these capacitors is on a path between Vin, VDD, or GND and Vout.
0 From Kang Ch. 6. Note: Figure and equation numbers don't necessarily match.
Figure 6.2: First stage CMOS inverter with lumped output load capacitance
$$C_{load} = C_{gd,n} + C_{gd,p} + C_{db,n} + C_{db,p} + C_{int} + C_{g} \tag{6.1}$$
Cint represents the wiring capacitance for the internal and external connections and Cg is the
gate capacitance of the driven MOS transistors.
6.2
1 Similar reasoning can be used to derive the capacitances in Equation 7.14 (p 281) and Equation 7.28 (p 285)
and for any other MOS gate.
The propagation delay gives a measure of how long a gate takes to respond to changes
in its inputs. Since in MOS circuits charging and discharging take place through different
transistors, we have two different metrics. τPHL measures the time between the
50% transition level of the rising input signal and the 50% transition level of the falling output
signal, i.e., it is a measure of the time taken for the NMOS to discharge the output
node towards VOL. Similarly, τPLH measures the time between the 50% transition level of the
falling input and the 50% transition level of the rising output, e.g., in an inverter it is a measure
of the time taken by the PMOS transistor to charge the output node towards VOH. We will do a
detailed analysis of only an inverter, since most other CMOS circuits can be reduced to an
equivalent inverter by calculating (W/L)eq for the PUN and the PDN networks. The 50% transition
level for the output is defined as
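$$V_{50\%} = \frac{1}{2}\left(V_{OL} + V_{OH}\right)$$
and the average propagation delay of the gate is
$$\tau_p = \frac{\tau_{PHL} + \tau_{PLH}}{2}$$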
Fig. 6.4 shows a pictorial representation of rise and fall times of an output signal. The 10%
$$\tau_{fall} = t_B - t_A$$
$$\tau_{rise} = t_D - t_C$$
6.3
The delay times are calculated by applying Kirchhoff's current law at the output node in Fig.
6.5 and solving the resulting differential equation. Based on whether the node is charging to
VDD or discharging to GND, the PMOS or the NMOS transistor will be part of the differential
equation. Further, depending on the transistor operating mode (linear/saturation), the solution
to the differential equation will also change.
$$C_{load}\frac{dV_{out}}{dt} = -i_{D,n} \tag{6.2}$$
Next, we need an expression for iD,n in terms of Vout to allow us to solve (6.2). This can be
obtained by determining which mode the transistor is working in, shown qualitatively in Fig. 6.6.
At the start of the transition Vout = VOH, so the NMOS operates in saturation. It remains in
saturation until Vout falls to VGS,n - VT,n, and this point is time t1' in Fig. 6.6, when the NMOS
switches to linear operation.
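Hence, for the initial saturation period we can write the differential equation as
$$C_{load}\frac{dV_{out}}{dt} = -\frac{k_n}{2}\left(V_{in} - V_{T,n}\right)^2 \tag{6.3}$$
$$C_{load}\frac{dV_{out}}{dt} = -\frac{k_n}{2}\left(V_{OH} - V_{T,n}\right)^2 \tag{6.4}$$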
(6.4) is valid for the region between t0 and t1', which is the same as when VOH - VT,n < Vout ≤ VOH.
Also, in (6.3), Vout does not appear on the right-hand side, allowing us to solve the differential
equation easily after separation of variables. Hence, from (6.3), we get
$$\int_{t=t_0}^{t=t_1'} dt = -C_{load}\int_{V_{out}=V_{OH}}^{V_{out}=V_{OH}-V_{T,n}} \frac{1}{i_{D,n}}\,dV_{out} \tag{6.5}$$
$$\int_{t=t_0}^{t=t_1'} dt = -\frac{2C_{load}}{k_n\left(V_{OH}-V_{T,n}\right)^2}\int_{V_{out}=V_{OH}}^{V_{out}=V_{OH}-V_{T,n}} dV_{out} \tag{6.6}$$
Solving the integral in (6.6) and substituting the limits, we get
$$t_1' - t_0 = \frac{2C_{load}V_{T,n}}{k_n\left(V_{OH}-V_{T,n}\right)^2} \tag{6.7}$$
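Once Vout falls below VOH - VT,n, the NMOS operates in the linear region, where the drain current is
$$i_{D,n} = \frac{k_n}{2}\left[2\left(V_{OH}-V_{T,n}\right)V_{out} - V_{out}^2\right] \tag{6.8}$$
Substituting (6.8) into (6.2) and separating the variables gives
$$\int_{t=t_1'}^{t=t_1} dt = -2C_{load}\int_{V_{out}=V_{OH}-V_{T,n}}^{V_{out}=V_{50\%}} \frac{1}{k_n\left[2\left(V_{OH}-V_{T,n}\right)V_{out} - V_{out}^2\right]}\,dV_{out} \tag{6.9}$$
Using the standard integral (valid for 0 < x < k)
$$\int \frac{dx}{kx - x^2} = \frac{\ln(x) - \ln(k - x)}{k}$$
we get
$$t_1 - t_1' = -\frac{2C_{load}}{k_n}\cdot\frac{1}{2\left(V_{OH}-V_{T,n}\right)}\left[\ln\left(\frac{V_{out}}{2\left(V_{OH}-V_{T,n}\right) - V_{out}}\right)\right]_{V_{OH}-V_{T,n}}^{V_{50\%}} \tag{6.10}$$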
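Using the lower and upper limits of VOH - VT,n and V50% for Vout, we get
$$t_1 - t_1' = \frac{C_{load}}{k_n\left(V_{OH}-V_{T,n}\right)}\ln\left(\frac{2\left(V_{OH}-V_{T,n}\right) - V_{50\%}}{V_{50\%}}\right) \tag{6.11}$$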
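As a sanity check on the closed-form results (6.7) and (6.11), the discharge time can also be estimated numerically by integrating dt = Cload dVout / iD,n over the output voltage. The sketch below does this with a simple midpoint rule. The device values (1 pF load, kn = 100 µA/V², VT,n = 1 V, VOH = 5 V) and the function names are illustrative assumptions, not values from the text, and V50% = VOH/2 assumes VOL = 0.

```python
import math


def nmos_current(v_out, v_oh, k_n, v_tn):
    """NMOS drain current with V_GS = V_OH (saturation above V_OH - V_T,n, linear below)."""
    v_ov = v_oh - v_tn
    if v_out >= v_ov:
        return 0.5 * k_n * v_ov ** 2                      # saturation region
    return 0.5 * k_n * (2 * v_ov * v_out - v_out ** 2)    # linear region


def discharge_time(v_start, v_end, c_load, v_oh, k_n, v_tn, steps=20000):
    """Midpoint-rule estimate of the time for V_out to fall from v_start to v_end."""
    dv = (v_start - v_end) / steps
    total = 0.0
    for i in range(steps):
        v_mid = v_end + (i + 0.5) * dv
        total += c_load * dv / nmos_current(v_mid, v_oh, k_n, v_tn)
    return total


# Illustrative (assumed) values, not taken from the text.
C_LOAD, K_N, V_TN, V_OH = 1e-12, 1e-4, 1.0, 5.0
V_50 = V_OH / 2  # assumes V_OL = 0

t_sat_numeric = discharge_time(V_OH, V_OH - V_TN, C_LOAD, V_OH, K_N, V_TN)
t_lin_numeric = discharge_time(V_OH - V_TN, V_50, C_LOAD, V_OH, K_N, V_TN)

# Closed forms: saturation interval and linear interval.
t_sat_formula = 2 * C_LOAD * V_TN / (K_N * (V_OH - V_TN) ** 2)
t_lin_formula = C_LOAD / (K_N * (V_OH - V_TN)) * math.log(
    (2 * (V_OH - V_TN) - V_50) / V_50
)

print(t_sat_numeric, t_sat_formula)
print(t_lin_numeric, t_lin_formula)
```

If the two columns agree, the separation into a saturation interval and a linear interval has been carried through consistently.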
The overall propagation delay τPHL can now be obtained as the sum of (6.7) and (6.11).
Further, since for a CMOS inverter VOH = VDD and VOL = 0 (and consequently V50% = VDD/2), we get
$$\tau_{PHL} = \frac{C_{load}}{k_n\left(V_{DD}-V_{T,n}\right)}\left[\frac{2V_{T,n}}{V_{DD}-V_{T,n}} + \ln\left(\frac{4\left(V_{DD}-V_{T,n}\right)}{V_{DD}} - 1\right)\right] \tag{6.12}$$
6.3.2
τPLH Derivation
The low-to-high transition time of the inverter can be obtained similarly. The output node
charges to VDD through the PMOS transistor. The time taken to charge to the V50% level is
computed as the sum of the time spent by the PMOS first in saturation mode and then in linear
mode. The differential equation for charging the output is given as
$$i_{D,p} = C_{load}\frac{dV_{out}}{dt} \tag{6.13}$$
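Since the PMOS transistor has VGS = -VOH < VT,p, it is ON, and since VGS - VT,p > VDS (= VOL - VOH
at the start of the transition), it initially operates in saturation. Substituting the saturation
current into (6.13) gives
$$dV_{out} = \frac{k_p}{2C_{load}}\left(-V_{OH} - V_{T,p}\right)^2 dt$$
after separation of the variables. Applying the integration operator to this equation, we get
$$\int_{V_{out}=V_{OL}}^{V_{out}=V_{OL}-V_{T,p}} dV_{out} = \frac{k_p}{2C_{load}}\left(-V_{OH} - V_{T,p}\right)^2\int_{t=t_0}^{t=t_1'} dt$$
$$t_1' - t_0 = \frac{2C_{load}\left|V_{T,p}\right|}{k_p\left(V_{DD} - \left|V_{T,p}\right|\right)^2} \tag{6.14}$$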
where we have used VOH = VDD and VOL = 0. In this case, the PMOS goes into linear mode
when VDS reaches VGS - VT,p, i.e., when Vout = |VT,p|. Using the same (6.13) we get
$$C_{load}\frac{dV_{out}}{dt} = \frac{k_p}{2}\left[2\left(-V_{DD} - V_{T,p}\right)\left(V_{out} - V_{DD}\right) - \left(V_{out} - V_{DD}\right)^2\right]$$
where we have used appropriate CMOS values for VOL and VOH. Simplifying we get
$$C_{load}\frac{dV_{out}}{dt} = -\frac{k_p}{2}\left(V_{out} - V_{DD}\right)\left(V_{DD} + V_{out} + 2V_{T,p}\right) \tag{6.15}$$
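This equation is also integrated with appropriate time and voltage level limits:
$$\frac{k_p}{2C_{load}}\int_{t=t_1'}^{t=t_1} dt = -\int_{V_{out}=|V_{T,p}|}^{V_{out}=V_{DD}/2} \frac{dV_{out}}{\left(V_{out} - V_{DD}\right)\left(V_{DD} + V_{out} + 2V_{T,p}\right)}$$
Using the standard integral
$$\int \frac{dx}{(x - a)(x + b)} = \frac{\ln|x - a| - \ln|x + b|}{a + b}$$
with a = VDD and b = VDD + 2VT,p, we get
$$t_1 - t_1' = -\frac{C_{load}}{k_p\left(V_{DD} + V_{T,p}\right)}\left[\ln\left|\frac{V_{out} - V_{DD}}{V_{out} + V_{DD} + 2V_{T,p}}\right|\right]_{|V_{T,p}|}^{V_{DD}/2} \tag{6.16}$$
$$t_1 - t_1' = -\frac{C_{load}}{k_p\left(V_{DD} + V_{T,p}\right)}\left[\ln\left(\frac{V_{DD}/2}{\frac{3}{2}V_{DD} + 2V_{T,p}}\right) - \ln\left|\frac{|V_{T,p}| - V_{DD}}{V_{DD} + V_{T,p}}\right|\right] \tag{6.17}$$
The second term on the RHS of (6.17) is ln(1) = 0 and the RHS becomes
$$\frac{C_{load}}{k_p\left(V_{DD} + V_{T,p}\right)}\ln\left(\frac{3V_{DD} + 4V_{T,p}}{V_{DD}}\right)$$
$$t_1 - t_1' = \frac{C_{load}}{k_p\left(V_{DD} + V_{T,p}\right)}\ln\left(\frac{4\left(V_{DD} + V_{T,p}\right)}{V_{DD}} - 1\right) \tag{6.18}$$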
The overall low-to-high propagation delay is just the sum of (6.14) and (6.18). The book has
replaced VT,p with |VT,p|, but otherwise all formulas are exactly the same.
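Because the high-to-low and low-to-high results have the same form once VT,p is written as |VT,p|, a single helper can evaluate both delays for hand analysis. The following is a minimal sketch under that assumption. The function name `inverter_delay` and all numerical values are illustrative and not taken from the text. The check on the threshold reflects the derivation's assumption that the transistor enters the linear region before the output reaches VDD/2.

```python
import math


def inverter_delay(c_load, k, v_t, v_dd):
    """Propagation delay of a CMOS inverter edge with V_OH = V_DD and V_OL = 0.

    k   : transconductance parameter of the switching transistor (k_n or k_p)
    v_t : magnitude of its threshold voltage (V_T,n or |V_T,p|)
    """
    if not 0 < v_t < v_dd / 2:
        raise ValueError("derivation assumes 0 < |V_T| < V_DD / 2")
    v_ov = v_dd - v_t
    saturation_part = 2 * v_t / v_ov
    linear_part = math.log(4 * v_ov / v_dd - 1)
    return c_load / (k * v_ov) * (saturation_part + linear_part)


# Illustrative (assumed) values: 1 pF load, 5 V supply, 1 V thresholds.
tau_phl = inverter_delay(c_load=1e-12, k=1e-4, v_t=1.0, v_dd=5.0)    # NMOS discharges
tau_plh = inverter_delay(c_load=1e-12, k=0.5e-4, v_t=1.0, v_dd=5.0)  # PMOS charges
tau_p = (tau_phl + tau_plh) / 2

print(f"tau_PHL = {tau_phl:.3e} s, tau_PLH = {tau_plh:.3e} s, tau_p = {tau_p:.3e} s")
```

With these assumed values the PMOS has half the transconductance of the NMOS, so τPLH comes out twice as large as τPHL, which illustrates why the PMOS is usually made wider to balance the two edges.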
6.3.3
Using the definitions of propagation delay discussed in the previous section, we can characterize the frequency of oscillation of a ring oscillator circuit. A ring oscillator consists of an odd
number of CMOS inverter stages connected in a loop. For this reason, it does not have a stable operating point
- the falling voltage of one stage triggers a rise in the next stage and vice versa. If all inverters
are uniform in size, Vth is the only DC operating point, and any drift from this voltage will
cause the circuit to move away from this operating point.
Figure 6.7 shows the three-stage oscillator, assumed to consist of identical elements. Qualitative analysis of this circuit is easily done, and consists of alternating high and low voltage
levels at V1, V2, and V3. The waveforms are shown in Fig. 6.8, where the relation between
voltage levels and propagation delays is also marked. E.g., τPHL1 is the high-to-low transition
time of V1. V1 is in turn controlled by V3 being fed back, so τPHL1 must measure the time elapsed
between the V50% levels of V3 rising and V1 falling, which is marked in Fig. 6.8 with dotted
lines. All propagation delays are marked in this way, giving the overall period of oscillation as
$$T = 2\tau_p + 2\tau_p + 2\tau_p = 6\tau_p$$
For any odd number n of inverter stages, the relation can be written as
$$f = \frac{1}{T} = \frac{1}{2\,n\,\tau_p} \tag{6.19}$$
Many times, these oscillator circuits are fabricated as test structures - a long chain of inverters
connected as a ring oscillator is used to measure the frequency of operation in some technology node. Working backwards from this frequency value, other process-specific information such
as propagation delay, threshold voltage, etc. can be obtained, thereby allowing for an accurate characterization
of the fabrication process.
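The relation (6.19) can be used in both directions: to predict the oscillation frequency from a known average delay, or to extract the average stage delay from a measured frequency, as is done with test structures. A minimal sketch follows. The function names and the 101-stage, 50 MHz example values are hypothetical.

```python
def ring_oscillator_frequency(n_stages, tau_p):
    """Oscillation frequency f = 1 / (2 * n * tau_p) for an odd number of stages."""
    if n_stages < 1 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of stages")
    return 1.0 / (2 * n_stages * tau_p)


def stage_delay_from_frequency(n_stages, frequency):
    """Invert the same relation to recover the average propagation delay per stage."""
    if n_stages < 1 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of stages")
    return 1.0 / (2 * n_stages * frequency)


# Three identical stages with an assumed 100 ps average delay: T = 6 * tau_p.
f_three = ring_oscillator_frequency(3, 100e-12)

# Hypothetical measurement: a 101-stage test structure oscillating at 50 MHz.
tau_measured = stage_delay_from_frequency(101, 50e6)

print(f"f = {f_three:.3e} Hz, extracted tau_p = {tau_measured:.3e} s")
```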
6.4
As mentioned in the beginning of this chapter, the three main sources of circuit parasitics are
Figure 6.9: Inverter driving three other inverters through interconnect lines
While the first and third elements from above are purely capacitive and resistive in nature, the
interconnect elements can also have inductive properties. The interconnect elements can be
modeled as lumped or distributed elements, depending on the length of the interconnect. If it
is distributed, then a transmission line model is used for the interconnect to calculate power
and delay metrics. Along with isolated operation, interaction between neighboring components
also has to be taken into account in noise analysis. Fig. 6.9 shows a sample portion of a
circuit where an inverter with three fanout lines is shown, along with the lumped values of the
interconnect capacitance. Fig. 6.10 shows an interconnect network with inductive elements
also included. The waveforms show the node voltage along with delays (Fig. 6.11).
To determine whether an RC or an RLCG model is appropriate, a general rule is to compare the signal
rise time to the transmission time: if the transmission time (which depends on the interconnect
length) is much shorter than the rise time, then a lumped or distributed RC network can be used.
If the interconnect is long and the rise time is comparable to the transmission time, then
inductance also becomes important and the interconnect must be modeled as an RLCG transmission line. For example, if a wire is 2 cm long, the transmission time is approximately 130 ps;
if this is much smaller than the rise and fall times, a lumped RC model can be used. On the other
hand, a 10 cm multichip module interconnect may have a transmission time of 1 ns, thereby
requiring an inductive transmission line model for analysis of power dissipation and propagation
delay.
The effect of these interconnect capacitances is prominent in smaller technology nodes,
as is seen in Fig. 6.12. At these nodes a proper analysis of inter-module and intra-module
connections is necessary in order for designs to satisfy timing specifications. As is seen in
Fig. 6.13, there are peaks at approximately 0.12 and 0.42 units, which are average values for
intra-module and inter-module interconnect lengths. At the design phase, these lengths should
be kept at a minimum to improve speed and power metrics.
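The rise-time rule of thumb from this section can be written as a small helper. The text only says the transmission time must be "much shorter" than the rise time, so the factor `margin=5.0` below is an assumed, adjustable threshold rather than a value from the text, and the 1 ns rise time in the examples is likewise an assumption.

```python
def interconnect_model(rise_time, transmission_time, margin=5.0):
    """Suggest an interconnect model from the rise-time rule of thumb.

    Returns "RC" when the transmission time is much shorter than the rise time
    (rise_time > margin * transmission_time), otherwise "RLCG transmission line".
    `margin` is an assumed threshold for "much shorter".
    """
    if rise_time <= 0 or transmission_time <= 0:
        raise ValueError("times must be positive")
    if rise_time > margin * transmission_time:
        return "RC"
    return "RLCG transmission line"


# 2 cm on-chip wire, ~130 ps transmission time, assumed 1 ns rise time.
print(interconnect_model(rise_time=1e-9, transmission_time=130e-12))

# 10 cm multichip module interconnect, ~1 ns transmission time, same rise time.
print(interconnect_model(rise_time=1e-9, transmission_time=1e-9))
```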
6.4.1
To be able to reuse the area occupied by interconnects, they are usually stacked on top of each
other, with insulating material separating them, e.g., as in Fig. 6.14. For example, most
modern technologies use up to 8 vertical layers of metal interconnects. However, for accurate
estimation of delay or power, the capacitances between these layers and ground have to be
taken into account.
First, consider a single interconnect wire running parallel to the GND plane as shown in Fig.
6.15. If geometry allows, a parallel plate model can be used for the capacitance. If the
thickness of the wire, t in Fig. 6.15, is comparable to the distance h of the wire from the GND
plane, then the fringing electrical fields also contribute to increasing the capacitance. For
example, if the ratio of wire width w to h is 0.1, then the fringing field capacitance can be as high
as 10 to 20 times the parallel plate capacitance. These fields are shown in Fig. 6.16, and
physically they arise for narrow metal lines with considerable vertical thickness.
Next, due to multiple wiring layers, this type of analysis should also be done with neighboring interconnect
wires and with wires at different vertical layers. An example of the capacitances in these cases
is shown in Fig. 6.17. In these cases, the overall capacitance includes mutual capacitances
along with the fringing field capacitances. An important effect of these capacitances is a
phenomenon called crosstalk: signals changing on one interconnect can cause noise (electrical
noise in the form of a current, not acoustic noise) to be generated on lines which are capacitively
coupled. If this noise is high enough, then it can lead to erroneous operation of the circuit.
Fig. 6.18 shows the capacitances for a double metal CMOS structure, along with an active
area at the bottom. The three materials in this figure are metal1, metal2, and the polysilicon
gate, and the interwire capacitances are labeled as Cm1m2, Cm1p, and Cm2p. A special mention
must be made of the metal-to-active-area capacitances, shown as Cm1a and Cm2a in Fig. 6.18:
these capacitances have a larger value as compared to Cm2f, since the oxide thickness in
these areas is lower due to the active area window.
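For the single wire over a GND plane, the parallel plate estimate mentioned at the start of this section is C = ε w l / h. The sketch below evaluates it for an SiO2 dielectric. The relative permittivity of 3.9 and the wire dimensions are assumptions for illustration. Fringing fields are deliberately ignored, so for narrow wires this figure underestimates the real capacitance, as discussed above.

```python
EPS_0 = 8.854e-12  # permittivity of free space in F/m


def parallel_plate_capacitance(width, length, height, eps_r=3.9):
    """Parallel plate estimate C = eps_r * eps_0 * w * l / h (fringing ignored).

    width, length : wire dimensions in metres
    height        : distance h between wire and GND plane in metres
    eps_r         : relative permittivity of the insulator (3.9 assumed for SiO2)
    """
    if min(width, length, height) <= 0:
        raise ValueError("dimensions must be positive")
    return eps_r * EPS_0 * width * length / height


# Assumed geometry: 1 um wide, 100 um long wire, 1 um above the GND plane.
c_wire = parallel_plate_capacitance(width=1e-6, length=100e-6, height=1e-6)
print(f"C = {c_wire:.3e} F")
```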
Similar to ring oscillators being fabricated on a chip, separate test structures are also included
on the chip to allow for dedicated measurement of parasitic capacitance values for that particular process. These form the inputs to a CAD tool (e.g., microwind/MAGIC), which takes into
account geometry details from the layout to generate process-specific parasitic capacitance
values for that particular technology/process batch.
* Look through tables 6.1 and 6.2 (p. 251) for actual interlayer capacitance values.
6.4.2
The resistance of interconnect wires has a significant effect on propagation delays and also on
power (I^2 R) losses. The value of this resistance depends on the type of material used for the
interconnect (e.g., Al, Au, polysilicon, etc.), the physical dimensions of the interconnect, and the number of contacts made on the interconnect wire. Resistance in MOS materials is usually given
as a sheet resistance Rsheet, from which any wire resistance is computed as
$$R_{wire} = R_{sheet}\,\frac{l}{w}$$
where w and l are the width and length of the interconnect. For example, polysilicon has
Rsheet of 20-40 Ω/square, Al is approximately 0.1 Ω/square, and metal-diffusion contacts are 20-30 Ω/square, etc.
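The sheet resistance relation amounts to counting the number of squares l/w along the wire and multiplying by Rsheet. A minimal sketch follows, using sheet resistance values from the ranges quoted above. The wire dimensions and the function name are assumptions for illustration.

```python
def wire_resistance(r_sheet, length, width):
    """Wire resistance R = R_sheet * (l / w); l and w must share the same unit."""
    if length <= 0 or width <= 0:
        raise ValueError("length and width must be positive")
    return r_sheet * length / width


# Assumed geometry: 100 um long, 2 um wide, i.e. 50 squares.
r_poly = wire_resistance(r_sheet=30.0, length=100e-6, width=2e-6)  # polysilicon
r_al = wire_resistance(r_sheet=0.1, length=100e-6, width=2e-6)     # aluminium
print(f"poly: {r_poly:.1f} ohm, Al: {r_al:.1f} ohm")
```

For the same geometry the polysilicon wire is several hundred times more resistive than the aluminium one, which is why long signal routes are kept in metal.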
6.5
In most cases, an RC interconnect model suffices to estimate the propagation delays due to
interconnects. In the simplest of such cases, both resistance and capacitance are lumped - the resistance is between the input and output and the capacitance is lumped at the output
end - and a numerical solution is obtained for the output signal. In this case the output is given
as
6.5.1
Elmore Delay
The Elmore delay method is an approximate method that allows for quick computation of
delays in large RC networks. In particular, this approach is used for computing delays in large
interconnect networks, such as a clock distribution network where delay metrics are essential
for proper synchronous operation. Before we discuss the method, some definitions from graph
theory need to be mentioned.
1. Graph: A set of nodes {n1, n2, . . . } and a set of edges {e1, e2, . . . }. Each edge, in turn,
is a pair of nodes between which it is connected, e.g., e1 may be (n2, n5).
2. Node: A terminal or intersection point. It is an abstraction of a physical object like a city
or a bus stop.
3. Edge: A link between any two nodes. An edge exists only if the link exists in the physical
model. E.g., there may not be a link from Bengaluru to Colombo on a road network.
4. Path: A sequence of consecutive edges.
5. Connected graph: A graph in which a path exists between every pair of nodes.
6. Tree: A connected graph where there is exactly one path (hence unique) between every
pair of nodes, i.e., there are no cycles.
7. Root: A node for which every other node is at the extremity of a path.
Firstly, Fig. 6.21 is not a tree because multiple paths exist between the same pair of nodes. Fig.
6.20 is a tree - all nodes are connected and the paths are unique. If we temporarily assume
that we are only allowed to travel down from node 6, then we can list the following descendants:
The concepts mentioned above can be used to define the Elmore delay in an RC network. The
Elmore delay model is a mathematical approach for approximating a distributed RC network
with one single RC delay term, which can be used to calculate the delay as τ = 0.69RC.
Some (not all) RC interconnect networks can be represented as a tree, also called an RC
tree. This implies that all nodes in such RC networks are connected and all paths are unique.
Each R and C element forms an edge and each circuit node forms a node on the RC tree.
For example, an RC tree is shown in Fig. 6.19. In such RC trees, we usually want to find the
delay at some node i with respect to an input signal, which becomes the root node. With
the definitions from above, the Elmore delay model is very direct.
The Elmore delay of an RC tree at a node i from the root node (e.g., a node which is an input
node) is given by Td,i. This Td,i is determined by the unique path Pi from the root to node i and
is given by
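$$T_{d,i} = \sum_{j \in P_i} R_j \sum_{k \in \operatorname{desc}(j)} C_k \tag{6.20}$$
j ∈ Pi: These are all the nodes j that are encountered on the path Pi (the unique path
from the root to node i).
k ∈ desc(j): Each k is a descendant of node j on the path Pi. As in the example for node 2 that follows, node j's own capacitance Cj is included in this sum.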
Hence the Elmore delay may be obtained by traversing the path from the root to the node
where the delay is required, while at each node adding to the delay term the product of the node
resistance Rj and the sum of all descendant node capacitances.
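The traversal just described can be sketched in a few lines of code. The representation below is an assumption made for illustration: each node stores its parent (`None` for nodes fed directly by the input), the resistance of the branch entering it, and its capacitance to ground. The small three-node tree and its unit values are hypothetical and are not the network of any figure in the text.

```python
from collections import defaultdict


def elmore_delay(parent, resistance, capacitance, target):
    """Elmore delay at `target`: sum over path nodes j of R_j * (C of j and its descendants).

    parent[n]      : parent node of n, or None if n is fed directly by the input
    resistance[n]  : resistance of the branch entering node n
    capacitance[n] : capacitance from node n to ground
    """
    children = defaultdict(list)
    for node, par in parent.items():
        children[par].append(node)

    def downstream_capacitance(node):
        # Capacitance of the node itself plus everything below it in the tree.
        return capacitance[node] + sum(
            downstream_capacitance(child) for child in children[node]
        )

    delay = 0.0
    node = target
    while node is not None:  # walk the unique path from target back to the root
        delay += resistance[node] * downstream_capacitance(node)
        node = parent[node]
    return delay


# Hypothetical tree: input -> node 1, which branches to nodes 2 and 3.
parent = {1: None, 2: 1, 3: 1}
resistance = {1: 1.0, 2: 2.0, 3: 3.0}
capacitance = {1: 1.0, 2: 2.0, 3: 3.0}

print(elmore_delay(parent, resistance, capacitance, target=2))  # R1*(C1+C2+C3) + R2*C2
print(elmore_delay(parent, resistance, capacitance, target=3))  # R1*(C1+C2+C3) + R3*C3
```

Note that the capacitance on the side branch (node 3) still contributes to the delay at node 2, because it is a descendant of node 1, which lies on the path to node 2.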
As an example consider the circuit in Fig. 6.22. The Elmore delay can be used to find an
equivalent RC network for the delay at any of the nodes, with the input signal forming the root
node. Let us apply the model for a few cases:
Node 2: The unique path from Vin, which is also the root node, to node 2 is through
node 1. Hence the first term in the delay is
$$R_1 \sum_{k \in \operatorname{desc}(1)} C_k$$
for all k which are descendants of node 1 - in this case all capacitances. Using the same
reasoning, the second term is the product of R2 and the sum of all capacitances which
are descendants of node 2 - C2 + C3 + C4 + C5. Hence the overall delay is
$$T_{d,2} = R_1 \sum_{k \in \operatorname{desc}(1)} C_k + R_2\left(C_2 + C_3 + C_4 + C_5\right)$$
* Calculate the Elmore delay for nodes 4 and 8 for the RC network in Fig. 6.22.
* Verify if the Elmore delay model terms commute, i.e., can we use the same definitions above with the delay term as
$$T_{d,i} = \sum_{j \in P_i} C_j \sum_{k \in \operatorname{desc}(j)} R_k$$
A more direct case is when there is only one branch in the RC tree, as shown in Fig. 6.23. In
this case, for any node n the delay model can be used to write
$$T_{d,n} = \sum_{j=1}^{n} C_j \sum_{k=1}^{j} R_k$$
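For a single-branch ladder the double sum can be evaluated with one running total of the upstream resistance. The sketch below assumes the ladder ends at node n, so only the first n sections are considered. The function name and the unit element values are illustrative.

```python
def ladder_elmore_delay(resistances, capacitances, n):
    """T_d,n = sum_{j=1..n} C_j * sum_{k=1..j} R_k for a ladder that ends at node n."""
    if not 1 <= n <= min(len(resistances), len(capacitances)):
        raise ValueError("n must index an existing ladder section")
    delay = 0.0
    upstream_resistance = 0.0
    for r_j, c_j in zip(resistances[:n], capacitances[:n]):
        upstream_resistance += r_j           # R_1 + ... + R_j
        delay += c_j * upstream_resistance   # C_j sees all resistance upstream of it
    return delay


# Hypothetical three-section ladder with unequal elements.
print(ladder_elmore_delay([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], n=3))

# Uniform ladder: N identical sections give r * c * N * (N + 1) / 2.
print(ladder_elmore_delay([1.0] * 4, [1.0] * 4, n=4))
```

The uniform case shows the delay growing roughly with the square of the number of sections, which is why long wires are a concern for timing.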
Another approach to visualize the Elmore delay model is to compare current flow with water
flow. The water flow analogy also tells us that since the flow rates at all places along the water
channel are inter-related, the capacitance value at any point has an effect on the delay at all points.
All capacitances in a circuit are related, and the nodes downstream have maximum effect since
they are accounted for multiple times as descendants of multiple nodes when calculating the
Elmore delay. This also gives an idea as to which elements are crucial for reducing overall
delay in the circuit, as these capacitances should be reduced aggressively.