2009-10
Semester –VI
Digital VLSI Design
Subject Code: EC 64 Total No. of Hrs: 48
Credits: 4 Hours per week: 4
Scaling principles, Interconnect layer scaling, Scaling models and scaling factors, scaling
factors for device parameters, some discussion on scaling, and limitations of scaling.
Introduction, CMOS Logic Gate Design, Basic Physical Design of Simple Logic Gates,
CMOS Logic Structures Clocking Strategies, I/O Structures, Low power Design
Reference Books:
1. Principles of CMOS VLSI Design – Neil Weste & Kamran Eshraghian
2. CMOS Digital Integrated Circuits: Analysis and Design – Sung-Mo Kang & Yusuf
Leblebici
MOS Transistor Theory
Metal Oxide Semiconductor Field Effect Transistor (MOSFET)
Introduction
The Metal Oxide Semiconductor Field Effect Transistor (MOSFET) is the fundamental
building block of MOS and CMOS digital integrated circuits. Compared to the bipolar junction
transistor (BJT), the MOS transistor occupies a relatively smaller silicon area, and its fabrication
involves fewer processing steps. These technological advantages, together with the relative
simplicity of MOSFET operation, have helped make the MOS transistor the most widely used
switching device in LSI and VLSI circuits.
The MOSFET is a four terminal device. The voltage applied to the gate terminal
determines if and how much current flows between the source and the drain ports. The body
represents the fourth terminal of the transistor. Its function is secondary as it only serves to
modulate the device characteristics and parameters.
At the most superficial level, the transistor can be considered to be a switch. When a
voltage is applied to the gate that is larger than a given value called the threshold voltage VTh, a
conducting channel is formed between drain and source. In the presence of a voltage difference
between the latter two, current flows between them. The conductivity of the channel is
modulated by the gate voltage—the larger the voltage difference between gate and source, the
smaller the resistance of the conducting channel and the larger the current.
MOSFET diagram
2. DEPLETION-mode MOSFET :
If a conducting channel already exists at zero gate bias, on the other hand,
the device is called a depletion-type (or depletion-mode) MOSFET.
MOSFET Explanation
In a MOSFET with p-type substrate and with n+ source and drain regions, the
channel region to be formed on the surface is n-type. Thus, such a device with p-
type substrate is called an n-channel MOSFET.
In a MOSFET with n-type substrate and with p+ source and drain regions, on the
other hand, the channel is p-type and the device is called a p-channel MOSFET.
The device terminals are: G for the gate, D for the drain, S for the source,
and B for the substrate (or body).
In an n-channel MOSFET, the source is defined as the n region which has
a lower potential than the other n region, the drain.
By convention, all terminal voltages of the device are defined with respect to the
source potential. Thus, the gate-to-source voltage is denoted by VGS, the drain-to-
source voltage is denoted by VDS , and the substrate-to-source voltage is denoted
by VBS.
The operating principle of this device is to control the current conduction between the source
and the drain, using the electric field generated by the gate voltage as the control variable. Since
the current flow in the channel is also controlled by the drain-to-source voltage and by the
substrate voltage, the current can be considered a function of these external terminal voltages.
In order to start current flow between the source and the drain regions, however, we have to
form a conducting channel first.
Region of Operation
At a point x along the channel, the voltage is V(x), and the gate-to-channel voltage at that
point equals VGS – V(x). Under the assumption that this voltage exceeds the threshold voltage all
along the channel, the induced channel charge per unit area at point x can be computed.
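$Q_i(x) = -C_{ox}\,[V_{GS} - V(x) - V_{Th}]$ (1.1)
Cox stands for the capacitance per unit area presented by the gate oxide, and equals
$C_{ox} = \varepsilon_{ox} / t_{ox}$
where $\varepsilon_{ox}$ is the oxide permittivity and $t_{ox}$ is the oxide thickness.
The current is the product of the drift velocity of the carriers $v_n$ and the available charge,
with W the width of the channel perpendicular to the current flow:
$I_D = -v_n(x)\,Q_i(x)\,W$ (1.2)
The electron velocity is related to the electric field through a parameter called the mobility
$\mu_n$ (expressed in m²/V·s). The mobility is a complex function of crystal structure and local
electrical field. In general, an empirical value is used.
$v_n = -\mu_n\,\xi(x) = \mu_n\,\frac{dV}{dx}$ (1.3)
Substituting equations 1.1 and 1.3 into 1.2 yields
$I_D\,dx = \mu_n\,C_{ox}\,W\,(V_{GS} - V - V_{Th})\,dV$ (1.4)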
Integrating the equation over the length of the channel L, with boundary conditions x = 0 to L
and V = 0 to VDS along the channel, yields the voltage-current relation of the transistor:
$I_D = k'_n\,\frac{W}{L}\left[(V_{GS} - V_{Th})\,V_{DS} - \frac{V_{DS}^2}{2}\right]$ (1.5)
where $k'_n$ is called the process transconductance parameter and equals
$k'_n = \mu_n\,C_{ox} = \mu_n\,\varepsilon_{ox} / t_{ox}$ (1.6)
The W and L parameters in Equation 1.5 are the effective channel width and length of the
transistor, respectively.
Also, drain current measurements with constant VGS show that the current ID does not vary much
as a function of the drain voltage VDS beyond the saturation boundary, but rather remains
approximately constant around the peak value reached for VDS = VDSAT. This saturation drain
current level can be found simply by substituting VDS = VDSAT = VGS – VTh into Eq. 1.5:
$I_D(sat) = \frac{k'_n}{2}\,\frac{W}{L}\,(V_{GS} - V_{Th})^2$ (1.8)
The figure shows the typical drain current versus drain voltage characteristics of an n-channel
MOSFET, as described by the current equations (1.5) and (1.8). The parabolic boundary between
the linear and the saturation regions is indicated there. The current-voltage characteristics of
the MOS transistor can also be visualized by plotting the drain current as a function of the gate
voltage. This ID versus VGS transfer characteristic in saturation mode (VDS > VDSAT) provides a
simple view of the drain current increasing as a second-order function of the gate-to-source
voltage. The current is equal to zero for any gate voltage smaller than the threshold voltage VTh.
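As a rough numerical illustration of the regions of operation described above, the sketch below evaluates a long-channel n-channel drain current in cutoff, linear and saturation modes. It assumes the standard square-law forms ID = k'n (W/L) [(VGS – VTh) VDS – VDS²/2] in the linear region and ID = (k'n/2) (W/L) (VGS – VTh)² in saturation. The function name and the default parameter values (VTh = 0.5 V, k'n = 100 µA/V², W/L = 2) are hypothetical and are not taken from these notes.

```python
def drain_current(v_gs, v_ds, v_th=0.5, k_n_prime=100e-6, w_over_l=2.0):
    """Square-law nMOS drain current in amperes (illustrative defaults)."""
    v_ov = v_gs - v_th                      # overdrive voltage VGS - VTh
    if v_ov <= 0:
        return 0.0                          # cutoff: no conducting channel
    if v_ds < v_ov:
        # linear (triode) region, VDS below VDSAT = VGS - VTh
        return k_n_prime * w_over_l * (v_ov * v_ds - v_ds ** 2 / 2)
    # saturation region, current independent of VDS in this simple model
    return 0.5 * k_n_prime * w_over_l * v_ov ** 2


i_cutoff = drain_current(0.3, 1.0)
i_edge = drain_current(1.5, 1.0)            # exactly at VDS = VDSAT
i_sat = drain_current(1.5, 2.0)
```

In this model the linear and saturation expressions meet at VDS = VGS – VTh, which is why `i_edge` and `i_sat` come out equal.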
As the gate voltage increases, the potential at the silicon surface at some point reaches a
critical value, where the semiconductor surface inverts to n-type material. This point marks the
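onset of a phenomenon known as strong inversion and occurs at a voltage equal to twice the
Fermi potential:
$\phi_F = -\phi_T \ln\left(\frac{N_A}{n_i}\right)$ (1.9)
where $\phi_T = kT/q$ is the thermal voltage, $N_A$ is the substrate doping and $n_i$ is the
intrinsic carrier concentration.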
Further increases in the gate voltage produce no further changes in the depletion layer width, but
result in additional electrons in the thin inversion layer directly under the oxide. These are drawn
into the inversion layer from the heavily doped n+ source region. Hence, a continuous n-type
channel is formed between the source and drain regions, the conductivity of which is modulated
by the gate-source voltage. In the presence of an inversion layer, the charge stored in the
depletion region is fixed and equals
$Q_{B0} = \sqrt{2\,q\,N_A\,\varepsilon_{si}\,|-2\phi_F|}$ (1.10)
This picture changes somewhat in case a substrate bias voltage VSB is applied (VSB is
normally positive for n-channel devices). This causes the surface potential required for strong
inversion to increase and to become $|-2\phi_F + V_{SB}|$. The charge stored in the depletion
region now is expressed by
$Q_B = \sqrt{2\,q\,N_A\,\varepsilon_{si}\,|-2\phi_F + V_{SB}|}$ (1.11)
The value of the gate-to-source voltage VGS needed to cause strong surface inversion (to
create the conducting channel) is called the threshold voltage VTh.
VTh is a function of several components, most of which are material constants such as
the difference in work-function between gate and substrate material,
the oxide thickness,
the charge of impurities trapped at the surface between channel and gate oxide, and
The dosage of ions implanted for threshold adjustment.
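From the above arguments, it has become clear that the source-bulk voltage VSB has an impact
on the threshold voltage as well:
$V_{Th} = V_{Th0} + \gamma\left(\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}\right)$ (1.12)
where $V_{Th0}$ is the threshold voltage for VSB = 0. The parameter $\gamma$ (gamma) is called
the body-effect coefficient, and expresses the impact of changes in VSB.
$\gamma = \frac{\sqrt{2\,q\,\varepsilon_{si}\,N_A}}{C_{ox}}$ (1.13)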
We can simply replace the threshold voltage terms in linear-mode and saturation-mode current
equations with the more general VTh(VSB) term.
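To make the dependence of the threshold voltage on substrate bias concrete, the following sketch evaluates the commonly used body-effect relation VTh = VTh0 + γ(√|–2φF + VSB| – √|–2φF|). The function name and the default values (VTh0 = 0.45 V, γ = 0.4 V^0.5, φF = –0.3 V) are assumptions chosen only for illustration; they do not come from these notes.

```python
import math


def threshold_voltage(v_sb, v_th0=0.45, gamma=0.4, phi_f=-0.3):
    """Threshold voltage (V) of an nMOS device for a given source-bulk bias."""
    surface_zero_bias = abs(-2 * phi_f)             # |-2*phiF|
    surface_with_bias = abs(-2 * phi_f + v_sb)      # |-2*phiF + VSB|
    return v_th0 + gamma * (math.sqrt(surface_with_bias)
                            - math.sqrt(surface_zero_bias))


vth_no_bias = threshold_voltage(0.0)
vth_with_bias = threshold_voltage(1.0)
```

With VSB = 0 the square-root terms cancel and the function returns VTh0; a positive VSB raises the threshold, and that raised value is what would be substituted into the linear-mode and saturation-mode current equations.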
We will examine the mechanisms of channel pinch-off and current flow in saturation
mode. Consider the inversion layer charge Qi that represents the total mobile electron charge on
the surface. The inversion layer charge at the source end of the channel is
$Q_i(x = 0) = -C_{ox}\,(V_{GS} - V_{Th})$
and the inversion layer charge at the drain end of the channel is
$Q_i(x = L) = -C_{ox}\,(V_{GS} - V_{Th} - V_{DS})$ (1.14)
Note that at the edge of saturation, i.e., when the drain-to-source voltage reaches VDSAT,
$V_{DS} = V_{DSAT} = V_{GS} - V_{Th}$
The inversion layer charge at the drain end becomes zero according to Eq. 1.14. In reality, the
channel charge does not become exactly equal to zero, but it does become very small.
VIL is the maximum allowable voltage at the input of the second inverter, which is low
enough to ensure a logic "1" output
VIH is the minimum allowable voltage at the input of the third inverter which is high
enough to ensure a logic "0" output.
These observations lead us to the definition of noise tolerances for digital circuits, called noise
margins and denoted by NM. The noise immunity of the circuit increases with NM. Two noise
margins will be defined: the noise margin for low signal levels (NML) and the noise margin for
high signal levels (NMH). For a gate to be robust and insensitive to noise disturbances, it is
essential that the “0” and “1” intervals be as large as possible. A measure of the sensitivity of a
gate to noise is given by the noise margins
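A small sketch of how the two noise margins are usually computed follows. It assumes the conventional definitions NML = VIL – VOL and NMH = VOH – VIH, where VOL and VOH are the nominal low and high output levels of the driving gate. Those two symbols, the function name and the sample voltages are assumptions that do not appear in the text above.

```python
def noise_margins(v_ol, v_il, v_ih, v_oh):
    """Return (NML, NMH) in volts from the four characteristic voltages."""
    nm_low = v_il - v_ol      # tolerance for a noisy logic "0"
    nm_high = v_oh - v_ih     # tolerance for a noisy logic "1"
    return nm_low, nm_high


# Hypothetical levels for a 2.5 V gate
nm_l, nm_h = noise_margins(v_ol=0.1, v_il=0.9, v_ih=1.6, v_oh=2.4)
```

Both margins should be positive for the gate to regenerate signal levels; a larger value means more tolerance to noise on that logic level.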
The CMOS inverter has two important advantages over the other inverter configurations.
The first and perhaps the most important advantage is that the steady-state power
dissipation of the CMOS inverter circuit is virtually negligible, except for small power
dissipation due to leakage currents.
The other advantages of the CMOS configuration are that the voltage transfer
characteristic (VTC) exhibits a full output voltage swing between 0 V and VDD, and that
the VTC transition is usually very sharp. Thus, the VTC of the CMOS inverter resembles
that of an ideal inverter.
Region B:
EQUATION-D
Region E:
Note that the driver MOSFET is initially in saturation, since its drain-to-source voltage
(VDS = Vout = VDD) is larger than VGS – Vth = Vin – Vth, i.e. VDS > VGS – Vth.
The nMOS saturation current is given by
$I_R = \frac{k_n}{2}\,(V_{GS} - V_{th})^2$
Consider PMOS
Similarly, pMOS transistors pass 1s well but 0s poorly. If the pMOS source drops below |Vtp|,
the transistor cuts off. Hence, pMOS transistors only pull down to within a threshold above
GND, as shown in Figure
When EN is 0, both enable transistors are OFF, leaving the output floating.
When EN is 1, both enable transistors are ON. They are conceptually removed from the circuit,
leaving a simple inverter.
Tristates were once commonly used to allow multiple units to drive a common bus, as
long as exactly one unit is enabled at a time. If multiple units drive the bus, contention
occurs and power is wasted. If no units drive the bus, it can float to an invalid logic level
that causes the receivers to waste power. Moreover, it can be difficult to switch enable
signals at exactly the same time when they are distributed across a large chip. Delay
between different enables switching can cause contention. Given these problems,
multiplexers are now preferred over tristate busses.
Now consider the case where the same input voltage is applied to both sides, i.e.
Vleft = VRight = VA. Each transistor then has VGS = VA – VN, where VN is the voltage across the
constant current source. Thus IDS is the same for both MOSFETs, and Vout1 = Vout2.
Now increase Vleft and VRight equally. VN also rises to maintain the constant current
through the current source, and the output voltages Vout1 and Vout2 stay at the same value.
BIPOLAR DEVICES
1. DIODE
2. BJT
3. BiCMOS
Two bipolar transistors (T3 and T4), one nMOS and one pMOS transistor (both enhancement-
type devices)
The MOS switches perform the logic function & bipolar transistors drive output loads
Vin = 0 :
T1 is off. Therefore T3 is non-conducting
T2 ON - supplies current to base of T4
T4 base voltage set to Vdd.
T4 conducts & acts as current source to charge load CL towards Vdd.
Vout rises to VDD - VBE (of T4)
Note: VBE (of T4) is base-emitter voltage of T4.
(Pull-up bipolar transistor turns off as the output approaches 5V - VBE (of T4))
Vin = VDD :
T2 is off. Therefore T4 is non-conducting.
T1 is on and supplies current to the base of T3
T3 conducts & acts as a current sink to discharge load CL towards 0V.
Vout falls to 0V+ VCEsat (of T3)
Note: VCEsat (of T3) is the saturation voltage from T3 collector to emitter.
VCEsat is small and VBE ≈ 0.7 V. Therefore, the inverter has high noise margins.
Vin = 0 :
T1 is off. Therefore T3 is non-conducting
T2 ON - supplies current to base of T4
T4 base voltage set to Vdd.
T5 is turned on & clamps base of T3 to GND. T3 is turned off.
Vin = Vdd :
T2 is off
T1 is on and supplies current to the base of T3
T6 is turned on and clamps the base of T4 to GND. T4 is turned off.
T3 conducts & acts as a current sink to discharge load CL towards 0V
Vout falls to 0V+ VCEsat (of T3)
Again, this BiCMOS gate does not swing rail to rail. Hence some finite power is dissipated
when driving another CMOS or BiCMOS gate. The leakage component of power dissipation can
be reduced by varying the BiCMOS device parameters
Advantage:
BiCMOS devices offer many advantages where high load current sinking and sourcing
is required. The high current gain of the NPN transistor greatly improves the output
drive capability of a conventional CMOS device.
It follows that BiCMOS technology goes some way towards combining the virtues of
both CMOS and Bipolar technologies.
Main disadvantage:
Greater process complexity compared to CMOS
Results in a 1.25 to 1.4 times increase in die costs over conventional CMOS.
Taking into account packaging costs, the total manufacturing cost of supplying a
BiCMOS chip ranges from 1.1 to 1.3 times that of CMOS.
CMOS processing steps can be broadly divided into two parts. Transistors are formed in
the Front-End-of-Line (FEOL) phase, while wires are built in the Back-End-of-Line (BEOL)
phase. This section examines the steps used through both phases of the manufacturing process.
Wafer Formation
The basic raw material used in CMOS fabs is a wafer or disk of silicon, roughly 75 mm to 300
mm in diameter and less than 1 mm thick. Wafers are cut from boules, cylindrical ingots of
single-crystal silicon that have been pulled from a crucible of pure molten silicon. This is known
as the Czochralski method and is currently the most common method for producing single-
crystal material.
Photolithography
The regions of dopants, polysilicon, metal, and contacts are defined using masks. For
instance, in places covered by the mask, ion implantation might not occur or the dielectric or
metal layer might be left intact. In areas where the mask is absent, the implantation can occur, or
the dielectric or metal can be etched away.
The photomask has chrome where light should be blocked. The UV light floods the mask from
the backside and passes through the clear sections of the mask to expose the organic photoresist
(PR) that has been coated on the wafer. A developer solvent is then used to dissolve the soluble
photoresist, which is either the exposed or the unexposed part depending on the resist type.
Silicon Dioxide (SiO2)
Oxidation of silicon is achieved by heating silicon wafers in an oxidizing atmosphere.
The following are some common approaches:
Wet oxidation––when the oxidizing atmosphere contains water vapor. The temperature is
usually between 900 °C and 1000 °C. This is also called pyrogenic oxidation when a 2:1
mixture of hydrogen and oxygen is used. Wet oxidation is a rapid process.
Si + 2H2O → SiO2 + 2H2
Dry oxidation––when the oxidizing atmosphere is pure oxygen. Temperatures are in the
region of 1200 °C to achieve an acceptable growth rate. Dry oxidation forms a better
quality oxide than wet oxidation. It is used to form thin, highly controlled gate oxides,
while wet oxidation may be used to form thick field oxides.
P-well process
twin-tub process
Note: (Here I have not mentioned or drawn any Mask layer during this processing but in
exam you have to mention)
Step1:
Process starts with a moderately doped (10^15 cm^-3) p-type substrate (wafer)
An initial oxide layer is grown on the entire surface (barrier oxide)
[Figure label: field oxide]
Step2:
Step3:
• Active area mask - define the regions in which MOS devices will be created
• LOCOS process to isolate NMOS and PMOS transistors
• Grow gate oxide (dry oxidation) - only in the open area of active region
[Figure label: thin oxide of 500 Å]
Step4:
Step5:
Step6:
Step7:
Step9:
Tub formation
Thin oxide etching
Source and drain implantations
1) A thin film (7-8 μm) of very lightly doped n-type Si is grown over an insulator. Sapphire is a
commonly used insulator.
2) An anisotropic etch is used to etch away the Si except where a diffusion area will be needed.
3) The p-islands are formed next by masking the n-islands with a photoresist. A p-type dopant
(boron) is then implanted. It is blocked by the photoresist and enters only the unmasked
islands. The p-islands will become the n-channel devices.
4) The p-islands are then covered with a photoresist and an n-type dopant, phosphorus, is
implanted to form the n-islands. The n-islands will become the p-channel devices.
5) A thin gate oxide (500-600 Å) is grown over all of the Si structures. This is normally done by
thermal oxidation.
6) A polysilicon film is deposited over the oxide.
7) The polysilicon is then patterned by photomasking and is etched. This defines the polysilicon
layer in the structure.
8) The next step is to form the n-doped source and drain of the n-channel devices in the p-
islands. The n-island is covered with a photoresist and an n-type dopant (phosphorus) is
implanted.
9) The p-channel devices are formed next by masking the p-islands and implanting a p-type
dopant. The polysilicon over the gate of the n-islands will block the dopant from the gate,
thus forming the p-channel devices
10) A layer of phosphorus glass is deposited over the entire structure. The glass is etched at
contact cut locations. The metallization layer is formed. A final passivation layer of a
phosphorus glass is deposited and etched over bonding pad locations.
The drawbacks are as follows. Owing to the absence of substrate diodes, the inputs are difficult
to protect. As device gains are lower, I/O structures have to be larger. Single-crystal sapphire
substrates are more expensive than silicon, and processing techniques tend to be less developed
than bulk silicon techniques.
Interconnect has advanced rapidly. While two or three metal layers were once the
norm, CMP has enabled inexpensive processes to include seven or more layers.
Copper metal and low-k dielectrics are almost universal to reduce the resistance and
capacitance of these wires.
Digital VLSI Design, ECE Dept.SET,JU. Page 16
i. Copper Damascene Process. While aluminum was long the interconnect metal of
choice, copper has largely superseded it in nanometer processes. This is primarily due
to the higher conductivity of copper compared to aluminum. Copper, however, brings
several challenges:
Copper atoms diffuse into the silicon and dielectrics, destroying transistors.
The processing required to etch copper wires is tricky.
Copper oxide forms readily and interferes with good contacts.
Care has to be taken not to introduce copper into the environment as a pollutant.
Barrier layers have to be used to prevent the copper from entering the silicon
surface. A new metallization procedure called the damascene process was invented
to form this barrier.
ii. Low-k Dielectrics SiO2 has a dielectric constant of k = 3.9–4.2. Low-k dielectrics
between wires are attractive because they decrease the wire capacitance. This reduces
wire delay, noise, and power consumption.
i. POLY
Minimum Spacing 2λ
Minimum Width 2λ
ii. ACTIVE
Minimum Spacing 3λ
Minimum Width 3λ
iii. METAL1
Minimum Spacing 3λ
Minimum Width 3λ
iv. NWELL
Minimum Spacing 6λ
Poly
Metal1
Buried contact: The contact cut is made down each layer to be joined and it is shown
in figure
3.
4.
5. 2- INPUT NAND
6. 2- INPUT NOR
A byproduct of the bulk CMOS structure is a pair of parasitic bipolar transistors. The collector
of each BJT is connected to the base of the other transistor in a positive feedback structure. A
phenomenon called latch-up can occur when
(1) both BJTs conduct, creating a low-resistance path between VDD and GND, and
(2) the product of the gains of the two transistors in the feedback loop, β1 × β2, is greater
than one.
The result of latch up is at the minimum a circuit malfunction, and in the worst case, the
destruction of the device.
The most likely place for latch up to occur is in pad drivers, where large voltage transients and
large currents are present.
The mask database is the interface between the semiconductor manufacturer and the chip
designer. Two basic checks have to be completed to ensure that this description can be turned
into a working chip.
First, the specified geometric design rules must be obeyed.
Second, the interrelationship of the masks must, upon passing through the
manufacturing process, produce the correct interconnected set of circuit elements.
To check these two requirements, two basic CAD tools are required: a Design
Rule Check (DRC) program and a mask circuit extraction program.
2. Delay estimations
4. Power consumption
5. Charge sharing
6. Design margin
7. Reliability
8. Yield
RESISTANCE ESTIMATION
Integrated Circuit (IC) chips contain many types of materials such as polysilicon, oxide,
various diffusions of basic CMOS transistors, and metal. A popular resistor material is
polysilicon, also known as poly.
The concept of sheet resistance is being used to know the resistive behavior of the layers
that go into formation of the MOS device. Let us consider a uniform slab of conducting material
of the following characteristics.
W -width
CAPACITANCE ESTIMATION
Parasitic capacitances are associated with the MOS device because of the different layers that
go into its formation. Interconnection capacitance is also contributed by the metal, diffusion and
polysilicon layers, in addition to the transistor and conductor resistance. Together, these
capacitances determine the switching speed of the MOS device.
Understanding the sources of parasitics and their variation is an essential part of the design,
especially when system performance is measured in terms of speed. The various
capacitances that are associated with the CMOS device are
1) Gate capacitance - due to other inputs connected to output of the device
2) Diffusion capacitance - Drain regions connected to the output
3) Routing capacitance- due to connections between output and other inputs
The standard unit is denoted by Cg. It represents the capacitance between gate to channel with
W=L=min feature size. Here is a figure showing the different capacitances that add up to give
the total gate capacitance
Cgd, Cgs = gate to channel capacitance lumped at the source and drain
Csb, Cdb = source and drain diffusion capacitance to substrate
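$C = \frac{\varepsilon_0\,\varepsilon_{ins}\,A}{D}$
where A is the plate area and D is the thickness of the insulating layer.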
1. Calculate the area of the layer under consideration relative to that of the standard gate, i.e.
4λ². (The standard gate varies according to the technology.)
2. Multiply the obtained area by relative capacitance values tabulated.
3. This gives the value of the capacitance in the standard unit of capacitance Cg
Problems
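I. A particular layer of a MOS circuit has a resistivity of 1 ohm-cm. The section is 55 μm long, 5 μm wide and
1 μm thick. Calculate the resistance and also find Rs.
Solution:
$R = R_s \times L / W$, with $R_s = \rho / t$
$R_s = (1 \times 10^{-2}\ \Omega\,\mathrm{m}) / (1 \times 10^{-6}\ \mathrm{m}) = 10^{4}$ ohm per square
$R = 10^{4} \times (55 \times 10^{-6}) / (5 \times 10^{-6}) = 110\ \mathrm{k\Omega}$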
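The arithmetic of Problem I can be checked with a short script. This is only a sketch: the function names are hypothetical, and all quantities are first converted to SI units (1 ohm-cm = 1e-2 ohm-m).

```python
def sheet_resistance(resistivity_ohm_m, thickness_m):
    """Rs = rho / t, in ohms per square."""
    return resistivity_ohm_m / thickness_m


def layer_resistance(resistivity_ohm_m, length_m, width_m, thickness_m):
    """R = Rs * L / W for a uniform slab of conducting material."""
    return sheet_resistance(resistivity_ohm_m, thickness_m) * length_m / width_m


# Problem I: rho = 1 ohm-cm, L = 55 um, W = 5 um, t = 1 um
rs = sheet_resistance(1e-2, 1e-6)
r = layer_resistance(1e-2, 55e-6, 5e-6, 1e-6)
```

Because R depends only on the ratio L/W, the same sheet resistance can be reused for any rectangle drawn in that layer.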
II. For a 5 μm technology the area of the minimum-sized transistor is 5 μm × 5 μm = 25 μm², i.e. λ = 2.5 μm;
hence the area of the minimum-sized transistor in lambda is 2λ × 2λ = 4λ². Therefore, for 2 μm, 1.2 μm
or any other technology, the area of a minimum-sized transistor in lambda is 4λ².
Solution:
The figure above shows the dimensions and the interaction of the different layers used to evaluate the
resulting total capacitance.
Metal
Relative area = (area of layer) / (area of standard gate)
Relative area = 300λ² / 4λ² = 75
Metal capacitance = relative area of the selected layer (i.e. metal) × relative capacitance of metal
Cm = 75 × 0.075 = 5.625
Polysilicon
Area = 22λ²
Relative area = 22λ² / 4λ² = 5.5
Cp = 5.5 × 0.1 = 0.55
Gate
Cg = 1 × 1 = 1
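The three-step procedure (relative area, multiply by relative capacitance, result in standard units of Cg) can be sketched as below. The layer areas and relative capacitance values are the ones quoted in the worked solution; the function name, and the assumption that the three contributions load the same node and may therefore be summed, are illustrative additions.

```python
STANDARD_GATE_AREA = 4.0   # area of the standard gate, in lambda^2


def layer_capacitance(area_lambda2, relative_capacitance):
    """Capacitance of a layer in standard units of Cg."""
    relative_area = area_lambda2 / STANDARD_GATE_AREA
    return relative_area * relative_capacitance


c_metal = layer_capacitance(300.0, 0.075)   # metal: 300 lambda^2
c_poly = layer_capacitance(22.0, 0.1)       # polysilicon: 22 lambda^2
c_gate = layer_capacitance(4.0, 1.0)        # one standard gate
c_total = c_metal + c_poly + c_gate         # assumes all three share a node
```

Multiplying `c_total` by the absolute value of one standard gate capacitance for the chosen technology would give the capacitance in farads.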
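$I_{dsp} = \frac{\beta_p\,(V_{gs} - |V_{tp}|)^2}{2}$
This current charges CL and, since its magnitude is approximately constant, we have
$V_{out} = \frac{I_{dsp}\,t}{C_L}$
$t = \frac{C_L\,V_{out}}{I_{dsp}} = \frac{2\,C_L\,V_{out}}{\beta_p\,(V_{gs} - |V_{tp}|)^2}$
Taking t = tr when Vout = VDD, with Vgs = VDD and |Vtp| ≈ 0.2 VDD (an assumed typical value):
$t_r \approx \frac{3\,C_L}{\beta_p\,V_{DD}}$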
This result compares reasonably well with a more detailed analysis in which the charging of
CL is divided, more correctly, into two parts: (1) the saturation region and (2) the resistive region
of the transistor. Similar reasoning can be applied to the discharge of CL through the n-transistor.
The circuit model in this case is shown in the figure below.
$t_f \approx \frac{3\,C_L}{\beta_n\,V_{DD}}$
Delay Time:
In MOS circuits, the delay of a single Gate is dominated by the output rise and
fall time, the delay is approximately given by
tdr= tr/2
tdf= tf/2
The average gate delay for rising and falling transitions is given by
tav = (tdr + tdf)/2 = (tr + tf)/4
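The delay relations above can be exercised numerically. The sketch below assumes the first-order estimates tr ≈ 3CL/(βp VDD) and tf ≈ 3CL/(βn VDD) for the rise and fall times, which is an assumption about the form of the expressions left blank in the extracted text. The function names and the sample values of CL, βp, βn and VDD are hypothetical.

```python
def transition_time(c_load, beta, v_dd):
    """First-order rise or fall time, t = 3*CL / (beta * VDD), in seconds."""
    return 3.0 * c_load / (beta * v_dd)


def average_gate_delay(t_rise, t_fall):
    """tav = (tdr + tdf)/2 with tdr = tr/2 and tdf = tf/2."""
    t_dr = t_rise / 2
    t_df = t_fall / 2
    return (t_dr + t_df) / 2


# Hypothetical gate: CL = 10 fF, beta_p = 50 uA/V^2, beta_n = 125 uA/V^2, VDD = 5 V
t_r = transition_time(10e-15, 50e-6, 5.0)
t_f = transition_time(10e-15, 125e-6, 5.0)
t_av = average_gate_delay(t_r, t_f)
```

With these sample values the rise time is longer than the fall time because βp is smaller than βn, which is the usual reason for widening the p-transistor.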
$I_O = I_S\left(e^{qV/kT} - 1\right)$
where IS is the reverse saturation current
q is the electronic charge
V is the diode voltage
k is the Boltzmann constant
T is the temperature
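The static power dissipation is the product of the device leakage current and the supply voltage.
$P_s = \sum_{i=1}^{n} (\text{leakage current}_i \times \text{supply voltage})$, where n is the number of devices.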
During a transition from either '0' to '1' or from '1' to '0', both n- and p-transistors are on
for a short period of time. This results in a short current pulse from VDD to VSS. Current is
also required to charge and discharge the output capacitive load. This latter term is
generally the dominant term. The current pulse from VDD to VSS results in a "short-circuit"
dissipation which is dependent on the load capacitance and gate design. This is of relevance to
I/O buffer design.
The dynamic dissipation can be modeled by assuming the rise and fall time of the step input is
much less than the repetition period. The average dynamic power, Pd, dissipated during
switching for a square-wave input Vin having a repetition frequency of fp = 1/tp, as shown in the
figure, is given by
$P_d = C_L\,V_{DD}^2\,f_p$
Thus, for a repetitive step input, the average power dissipated is proportional to the energy
required to charge and discharge the circuit capacitance. The important factor to be noted here is
that this expression shows power to be proportional to switching frequency but independent of the
device parameters.
Total power dissipation can be obtained from the sum of the two dissipation components, so
Ptotal=Ps+Pd
When calculating the power dissipation a rule of thumb is to add all capacitance operating at
particular frequency and calculate the power. Then the power from other groups operating at
different frequencies may be summed, the dynamic power dissipation may be used to estimate
total power consumption of a circuit and also the size of VDD and VSS conductors to minimize
transient induced voltage drops.
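The rule of thumb above (group capacitance by operating frequency, compute each group's power, then sum) can be sketched as follows. It assumes the usual expressions Pd = CL · VDD² · fp for each group and Ps = Σ(leakage current × supply voltage); the function names and the sample capacitances, frequencies and leakage currents are made up for illustration.

```python
def dynamic_power(c_load, v_dd, freq):
    """Average dynamic power of one capacitance group, in watts."""
    return c_load * v_dd ** 2 * freq


def static_power(leakage_currents, v_dd):
    """Sum of leakage current times supply voltage over all devices."""
    return sum(i_leak * v_dd for i_leak in leakage_currents)


def total_power(groups, leakage_currents, v_dd):
    """groups is a list of (capacitance in F, frequency in Hz) pairs."""
    p_d = sum(dynamic_power(c, v_dd, f) for c, f in groups)
    return static_power(leakage_currents, v_dd) + p_d


groups = [(10e-12, 100e6), (2e-12, 200e6)]   # two hypothetical clock domains
leakage = [1e-9] * 1000                      # 1000 devices at 1 nA each
p_total = total_power(groups, leakage, v_dd=2.0)
```

In this example the dynamic term dominates the total by several orders of magnitude, consistent with the statement that charging and discharging the load is generally the dominant contribution.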
Charge sharing:
In many structures a bus can be modeled as a capacitor Cb, as shown in the figure. Sometimes the
voltage on this bus is sampled to determine the state of a given signal. Frequently, this sampling
can be modeled by the two capacitors Cs and Cb and a switch. In general, Cs is in some way
related to the switching element. The charge associated with each of the capacitances prior to
closing the switch can be described by Qb = CbVb and Qs = CsVs:
• At time t=0-, switch is open and each capacitor contains some initial charge
• At time t=0+, the switch is closed and the charge redistributes across both capacitors
• Conserve the total charge:
– Sum up initial charge Qt = Qb + Qs = CbVb + CsVs
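Applying charge conservation to the two-capacitor model gives the voltage that both capacitors settle to after the switch closes, VR = (CbVb + CsVs)/(Cb + Cs). The sketch below computes it; the symbol VR, the function name and the sample values are assumptions introduced for this example.

```python
def shared_voltage(c_b, v_b, c_s, v_s):
    """Final voltage after the switch closes, from charge conservation."""
    q_total = c_b * v_b + c_s * v_s      # Qt = Qb + Qs before closing
    return q_total / (c_b + c_s)         # same Qt spread over Cb + Cs


# Hypothetical case: bus precharged to 1.0 V, sampling node initially at 0 V
v_r = shared_voltage(c_b=100e-15, v_b=1.0, c_s=10e-15, v_s=0.0)
```

The result shows why Cs is normally kept much smaller than Cb: the larger the ratio Cb/Cs, the less the bus voltage is disturbed by sampling.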
Design Margin:
As semiconductor technology scales to the nanometer regime, the variation of process
parameters is a critical problem in VLSI design. Thus the need for variation-aware timing
analysis for the performance yield is increasing. However, the traditional worst-case corner-
based approach gives pessimistic results, and makes meeting given designs specifications
difficult. As an alternative to this approach, statistical analysis is proposed as a new and
promising variation-aware analysis technique. However, statistical design flow cannot be applied
easily to existing design flow, and not enough tools for statistical design exist. To overcome
these problems, a new design methodology based on traditional static timing analysis (STA) using
a relaxed corner is proposed.
Yield
An important issue in the manufacture of VLSI structures is yield. Although yield is not a
performance parameter, it is influenced by such factors as
Technology
Chip area
Yield is defined as
$Y = \frac{\text{number of good chips on wafer}}{\text{total number of chips}} \times 100\%$
and may be described as a function of chip area and defect density.
Two common equations used are:
1) Seeds' model, which is given by
$Y = e^{-\sqrt{AD}}$
2) Murphy's model, which is given by
$Y = \left(\frac{1 - e^{-AD}}{AD}\right)^2$
where A is the chip area and
D is the defect density.
The second model is used for small chips and for yields greater than 30%. From these relations it
is obvious that yield decreases dramatically as the area of the chip increases.
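To see how quickly yield falls with chip area, the sketch below evaluates two widely quoted yield expressions: Seeds' model Y = exp(–√(AD)) and Murphy's model Y = ((1 – exp(–AD))/(AD))². Treating these as the two equations intended in the text is an assumption, and the function names and the defect density used are hypothetical.

```python
import math


def yield_seeds(area_cm2, defects_per_cm2):
    """Seeds' model: Y = exp(-sqrt(A*D)), returned as a fraction."""
    return math.exp(-math.sqrt(area_cm2 * defects_per_cm2))


def yield_murphy(area_cm2, defects_per_cm2):
    """Murphy's model: Y = ((1 - exp(-A*D)) / (A*D))**2, as a fraction."""
    ad = area_cm2 * defects_per_cm2
    if ad == 0:
        return 1.0                      # no defects or zero area: full yield
    return ((1 - math.exp(-ad)) / ad) ** 2


# Hypothetical defect density of 1 defect per cm^2, increasing chip area
areas = [0.25, 0.5, 1.0, 2.0]
murphy = [yield_murphy(a, 1.0) for a in areas]
seeds = [yield_seeds(a, 1.0) for a in areas]
```

Both lists decrease monotonically as the area grows, which is the behaviour the text describes.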
Technology Scaling
Goals of scaling the dimensions by 30%:
Reduce gate delay by 30% (increase operating frequency by 43%)
Double transistor density
Reduce energy per transition by 65% (50% power savings @ 43% increase in frequency)
Die size used to increase by 14% per generation
Technology generation spans 2-3 years
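The percentages in the scaling goals follow from simple arithmetic on a 0.7× shrink. The sketch below reproduces them under the assumption that gate capacitance and supply voltage both scale by the same factor as the dimensions (an idealized constant-field assumption that the list does not state explicitly); the function name is hypothetical.

```python
def scale_generation(s=0.7):
    """Relative figures after scaling dimensions by s (1.0 means unchanged)."""
    delay = s                          # gate delay shrinks with dimensions
    frequency = 1 / s                  # operating frequency is 1 / delay
    density = 1 / s ** 2               # transistors per unit area
    energy = s * s ** 2                # C * V^2 with C and V both scaled by s
    power = energy * frequency         # energy per transition times frequency
    return {"delay": delay, "frequency": frequency, "density": density,
            "energy": energy, "power": power}


result = scale_generation()
```

For s = 0.7 this gives a frequency gain of about 43%, a density gain of about 2×, an energy reduction of about 65% and roughly half the power at the higher frequency.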
International Technology Roadmap for Semiconductors (ITRS)
Gate -Poly
1. Gate area Ag
$A_g = L \times W$
where
L is the channel length and
W is the channel width; both are scaled by 1/α.
Thus Ag is scaled by 1/α².
2. Gate capacitance per unit area Co
$C_o = C_{ox} = \varepsilon_{ox} / D$
where
εox is the permittivity of the gate oxide (thin-ox) = εins × εo, a constant, and
D is the gate oxide thickness, scaled by 1/β.
Thus Co is scaled by 1/(1/β) = β.
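Maximum operating frequency fo
$f_o = \frac{W}{L} \cdot \frac{\mu\,C_o\,V_{DD}}{C_g}$
fo is inversely proportional to the delay Td and is scaled by
$\frac{1}{\beta/\alpha^2} = \frac{\alpha^2}{\beta}$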
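9. Saturation current
$I_{DSAT} = \frac{C_o\,\mu}{2} \cdot \frac{W}{L}\,(V_{gs} - V_t)^2$
IDSAT is scaled by β × (1/β)² = 1/β.
Current density $J = I_{DSAT} / A$,
where A is the cross-sectional area of the channel (= W × L) in the "on" state, which is scaled
by 1/α².
Therefore, J is scaled by $\frac{1/\beta}{1/\alpha^2} = \frac{\alpha^2}{\beta}$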
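Switching energy per gate Eg
$E_g = \frac{1}{2}\,C_g\,V_{DD}^2$
VDD is scaled by 1/β
Cg is scaled by β/α²
Hence Eg is scaled by $\frac{\beta}{\alpha^2} \cdot \frac{1}{\beta^2} = \frac{1}{\alpha^2\beta}$
Pg comprises two components: a static component Pgs and a dynamic component Pgd.
The dynamic component is Pgd = Eg × fo, with Eg scaled by 1/(α²β) and fo by α²/β.
Therefore, Pg scales by $\frac{1}{\alpha^2\beta} \cdot \frac{\alpha^2}{\beta} = \frac{1}{\beta^2}$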
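The scaling factors in this section can be tabulated for any choice of α (dimension scaling) and β (voltage and oxide-thickness scaling). The sketch below assumes the combined voltage-and-dimension model in which lengths scale by 1/α and VDD and D scale by 1/β; the function name and dictionary keys are hypothetical, and the entries for fo, IDSAT, J, Eg and Pg are the values implied by that model rather than quotations from the notes.

```python
def scaling_factors(alpha, beta):
    """Scale factors for device parameters under combined (alpha, beta) scaling."""
    gate_area = 1 / alpha ** 2                  # Ag = L * W
    c_ox = beta                                 # Co = eps_ox / D, D scaled by 1/beta
    gate_capacitance = c_ox * gate_area         # Cg = Co * L * W -> beta / alpha^2
    max_frequency = 1 / gate_capacitance        # fo -> alpha^2 / beta
    saturation_current = beta * (1 / beta) ** 2 # Co * (Vgs - Vt)^2 -> 1 / beta
    current_density = saturation_current / gate_area
    switching_energy = gate_capacitance * (1 / beta) ** 2   # Cg * VDD^2
    power_per_gate = switching_energy * max_frequency       # Eg * fo
    return {
        "gate_area": gate_area,
        "c_ox": c_ox,
        "gate_capacitance": gate_capacitance,
        "max_frequency": max_frequency,
        "saturation_current": saturation_current,
        "current_density": current_density,
        "switching_energy": switching_energy,
        "power_per_gate": power_per_gate,
    }


constant_field = scaling_factors(alpha=2.0, beta=2.0)    # alpha = beta
constant_voltage = scaling_factors(alpha=2.0, beta=1.0)  # beta = 1
```

Setting β = α recovers constant-field scaling, while β = 1 recovers constant-voltage scaling, where current density grows as α² and power per gate does not drop.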
Interconnect Woes
Scaled transistors are steadily improving in delay, but scaled wires are holding
constant or getting worse.
For short wires, such as those inside a logic gate, the wire RC delay is negligible.
However, the long wires present a considerable challenge.
• Moore's law, first postulated by Intel co-founder Gordon Moore, says the number of
transistors -- the main component of a microchip -- that can fit on a chip doubles about
every 18-24 months.
• To keep pace with Moore's law, transistors would have to reach the atomic level by 2020.
• The smallest transistor ever built has been created using a single phosphorus atom by an
international team of researchers at the University of New South Wales, Purdue
University and the University of Melbourne. "To me, this is the physical limit of Moore's
Law," Klimeck says. "We can't make it smaller than this (atom)."
Limitations of Scaling
Substrate doping
Depletion width
Limits of miniaturization
Limits of interconnect and contact resistance
Limits due to sub threshold currents
Limits on logic levels and supply voltage due to noise
Limits due to current density
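$d = \sqrt{\frac{2\,\varepsilon_{si}\,\varepsilon_0\,V}{q\,N_B}}$
and, for an applied voltage of VDD,
$d = \sqrt{\frac{2\,\varepsilon_{si}\,\varepsilon_0\,V_{DD}}{q\,N_B}}$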
Depletion width
• NB is increased to reduce d, but this increases the threshold voltage Vth, which runs against the
trend for scaling down.
• There is a maximum value of NB (1.3 × 10^19 cm^-3); at higher values, the maximum electric field
applied to the gate is insufficient and no channel is formed.
Limits of miniaturization
$t = \frac{L}{\mu E} = \frac{2d}{\mu E}$
$E_{max} = \frac{2V}{d}$
Limits on supply voltage due to noise
Decreased inter-feature spacing and greater switching speed –result in noise problems.
CMOS structures require an n-block and a p-block to complete the logic. That is, an n-input
logic function requires 2n transistors. Each logic function is duplicated for both the pull-down
and the pull-up logic tree:
– pull-down tree gives the zero entries of the truth table, i.e. implements the
negative of the given function
– pull-up tree is the dual of the pull-down tree, i.e. implements the true logic with
each input negative-going
Disadvantage:
• This means that the PMOS is on all the time, and that an n-input logic function now needs
only n+1 transistors.
• This technology is equivalent to the depletion mode type and preceded the CMOS
technology and hence the name pseudo.
(DRIVER)
• The two sections of the device are now called the load and the driver.
• The ratio βn/βp (βdriver/βload) has to be selected such that sufficient gain is achieved to get
consistent pull-up and pull-down levels.
• This involves ratioed transistor sizes so that correct operation is obtained.
However, if minimum-size drivers are used, the gain of the load has to be
reduced to get an adequate noise margin.
2. GANGED LOGIC
The inputs are separately connected but the output is connected to a common terminal.
The logic depends on the pull up and pull down ratio. If PMOS is able to overcome
NMOS it behaves as NAND else NOR.
Here N = 5, for a total of N + 1 = 6 transistors.
1. The gate capacitance seen by each input is two unit gates for CMOS logic, but only one unit
gate for pseudo logic.
Since the PMOS is always on, static power is dissipated whenever the NMOS network
is on. Hence, to use pseudo logic, a trade-off has to be made between size and load on the
one hand and power dissipation on the other.
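As a quick check of the device counts quoted above, the sketch below tabulates the 2n transistors of a complementary CMOS gate against the n+1 transistors of a pseudo-NMOS gate. The function name and the style labels are illustrative only and do not come from the notes.

```python
def transistor_count(n_inputs: int, style: str) -> int:
    """Transistors needed for one n-input gate in the given logic style."""
    if n_inputs < 1:
        raise ValueError("a gate needs at least one input")
    if style == "cmos":
        return 2 * n_inputs      # n NMOS in the pull-down + n PMOS in the pull-up
    if style == "pseudo_nmos":
        return n_inputs + 1      # n NMOS drivers + one always-on PMOS load
    raise ValueError(f"unknown style: {style}")


for n in (2, 3, 5):
    print(n, transistor_count(n, "cmos"), transistor_count(n, "pseudo_nmos"))
```

For N = 5 this gives 10 transistors in CMOS and 6 in pseudo-NMOS, matching the count stated above.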
• Dynamic logic enhances the speed of the pull-up device by precharging the
output node to VDD.
• Hence the operation of the gate is split into a
precharge stage and an
evaluate stage, for which a clock is needed. Hence it is called dynamic
logic.
Example
Precharge (CLK = 0)
Evaluate (CLK = 1)
Conditions on Output
• Once the output of a dynamic gate is discharged, it cannot be charged again until the next
precharge operation.
• Inputs to the gate can make at most one transition during evaluation.
• Output can be in the high impedance state during and after evaluation (PDN off), state is
stored on CL
• Nonratioed - sizing of the devices is not important for proper functioning (only for
performance)
– consumes only dynamic power – no short circuit power consumption since the
pull-up path is not on when evaluating
• PDN starts to work as soon as the input signals exceed VTn, so set VM, VIH and VIL all
equal to VTn
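The precharge and evaluate behaviour listed above can be illustrated with a small behavioural model. It is only a sketch under idealised assumptions (no leakage, no charge sharing, instantaneous switching); the class name `DynamicGate` and the example pull-down function are hypothetical.

```python
class DynamicGate:
    """Behavioural model of an n-type dynamic gate (precharge high, evaluate low)."""

    def __init__(self, pull_down):
        self.pull_down = pull_down  # function: inputs -> True if the PDN conducts
        self.out = None             # value stored on CL (None = not yet precharged)

    def precharge(self):
        # CLK = 0: the precharge PMOS is on, the evaluate NMOS is off
        self.out = 1

    def evaluate(self, *inputs):
        # CLK = 1: the output can only fall; nothing pulls it up in this phase
        if self.out is None:
            raise RuntimeError("gate must be precharged before evaluation")
        if self.pull_down(*inputs):
            self.out = 0
        return self.out


# Assumed example: two series NMOS devices, so Out = NOT (A AND B)
nand = DynamicGate(lambda a, b: bool(a and b))
nand.precharge()
first = nand.evaluate(1, 1)   # PDN conducts, CL is discharged
second = nand.evaluate(0, 0)  # PDN is off, but the node stays discharged
print(first, second)
```

The second call shows the first condition on the output: once discharged, the node stays low until the next precharge.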
DRAWBACK
• Inputs may change only during the precharge stage and must be stable during evaluation.
If this condition is not met, charge redistribution corrupts the output node.
• Simple dynamic gates cannot be cascaded directly. During the evaluate phase the first
gate discharges conditionally, but only after a finite delay, and in the meantime the
second gate may already evaluate using the still-precharged output of the first gate.
• If several stages of the previous CMOS dynamic logic circuit are cascaded together using
the same clock CLK, a problem in evaluation involving a built-in “race condition” will
exist
– When CLK goes high to begin evaluate, all inputs at stage 1 require some finite
time to resolve, but during this time charge may erroneously be discharged from
output2
Now assume that the 1st-stage NMOS logic tree eventually conducts and fully discharges
output1. Since the inputs to the N-tree are not all resolved immediately, it takes some
time for the N-tree to finally discharge output1 to GND.
If, during this time delay, the 2nd stage has the input condition shown, with the gate of the
bottom NMOS transistor at logic 1, then output2 will start to fall and discharge its load
capacitance until output1 finally evaluates and turns off the top series NMOS transistor in
stage 2.
The result is an error in the 2nd-stage output, output2.
• The problem with faulty discharge of precharged nodes in CMOS dynamic logic circuits
can be solved by placing an inverter in series with the output of each gate
– All inputs to N logic blocks (which are derived from inverted outputs of previous
stages) therefore will be at zero volts during precharge and will remain at zero
until the evaluation stage has logic inputs to discharge the precharged node.
When CLK is low, dynamic node is precharged high and buffer inverter output is low. NFETs in
the next logic block will be off. When CLK goes high, dynamic node is conditionally discharged
and the buffer output will conditionally go high. Since discharge can only happen once, buffer
output can only make one low-to-high transition.
When domino gates are cascaded, as each gate “evaluates”, if its output rises, it will
trigger the evaluation of the next stage, and so on… like a line of dominos falling. Like dominos,
once the internal node in a gate “falls”, it stays “fallen” until it is “picked up” by the precharge
phase of the next cycle.
Thus many gates may evaluate in one eval cycle.
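The falling-dominoes behaviour can be sketched with a simple chain model. It assumes each stage is a domino buffer whose pull-down network is a single NMOS driven by the previous stage, and it ignores all delays except the stage order; the function name `domino_chain` is hypothetical.

```python
def domino_chain(first_input: int, stages: int):
    """Evaluate a chain of domino buffers and record the outputs after each stage."""
    dynamic_nodes = [1] * stages   # precharged high while CLK = 0
    outputs = [0] * stages         # so every buffer inverter outputs low
    history = [list(outputs)]
    drive = first_input
    for i in range(stages):        # CLK = 1: stages evaluate one after another
        if drive == 1:
            dynamic_nodes[i] = 0   # conditional discharge of the dynamic node
        outputs[i] = 1 - dynamic_nodes[i]
        history.append(list(outputs))
        drive = outputs[i]         # a rising output triggers the next stage
    return history


for step in domino_chain(1, 3):
    print(step)
```

Each output makes at most one low-to-high transition, and several stages resolve within a single evaluate phase.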
• In (a) the N evaluate transistor is placed nearest to the output C1 node (poor design)
– During precharge C1 is charged high to VDD, but C2-C7 do not get charged and
may be sitting at ground potential.
– When the clock goes high for the evaluate phase, some or all of capacitors C2-C7
will bleed charge from the larger node capacitor C1, thus reducing the voltage on
C1.
$V_{n1} = V_{DD} \cdot \dfrac{C_1}{C_1 + \sum_{i=2}^{7} C_i}$
– The solution is to put the discharge transistor N1 at the bottom of the logic tree
thus allowing the possibility of getting C2-C7 charged during the precharge phase
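The voltage drop caused by charge sharing follows from charge conservation. The sketch below assumes ideal switches and made-up capacitor values (50 fF for C1, 5 fF for each of C2-C7, VDD = 5 V); none of these numbers come from the notes.

```python
def shared_voltage(vdd, c_out, c_internal, v_internal=0.0):
    """Final voltage after the charge on c_out redistributes over the internal nodes."""
    total_charge = c_out * vdd + sum(c_internal) * v_internal
    return total_charge / (c_out + sum(c_internal))


VDD = 5.0                 # assumed supply voltage
C1 = 50e-15               # assumed output node capacitance
INTERNAL = [5e-15] * 6    # assumed values for C2..C7

worst_case = shared_voltage(VDD, C1, INTERNAL)           # internal nodes at ground
precharged = shared_voltage(VDD, C1, INTERNAL, VDD)      # internal nodes precharged
print(worst_case, precharged)
```

With the internal nodes left at ground the output falls to about 3.1 V; if they are precharged as well, no charge is lost, which is why the evaluate transistor belongs at the bottom of the tree.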
• An elegant solution to the dynamic CMOS logic “erroneous evaluation” problem is to use
NP Domino Logic (also called NORA logic) as shown below.
• During precharge clk is low (-clk is high) and the P-logic output precharges to ground
while N-logic outputs precharge to Vdd.
• During evaluate clk is high (-clk is low) and both type stages go through evaluation; N-
logic tree logically evaluates to ground while P-logic tree logically evaluates to Vdd.
If we turn a dynamic gate “upside down” and use PFETs to build the logic block, we get a logic
gate that “precharges” low and “discharges” high. By using these gates in an alternating
sequence with regular NFET dynamic gates we can eliminate the race problem we had with
NFET-only dynamic gate sequences and hence we don’t need the buffer inverter present in
domino gates.
Removing the buffer is a mixed blessing since we may need it for drive reasons and to
keep compatibility with other domino gates. It also makes NORA logic very susceptible to noise
since during the evaluate phase all information is stored dynamically.
• Clocked CMOS logic has been used for very low power CMOS and/or for minimizing
hot electron effect problems in N-FET devices
• Clocking transistors allow valid logic output only when CLK is high
• A differential gate requires that each input is provided in complementary format, and
produces complementary outputs in turn. The feedback mechanism ensures that the load
device is turned off when not needed.
CASE 2: Assume PDN1 is ON (1); PDN2 is then OFF (0), because PDN1 and
PDN2 are mutually exclusive. With PDN1 ON, Out discharges completely
(Out = 0), which turns M2 ON and charges Out’ to VDD (Out’ = 1).
Because PDN2 is OFF, Out’ stays at 1, and this turns M1 OFF.
Pass-Transistor Logic
A popular and widely-used alternative to complementary CMOS is pass-transistor logic
which attempts to reduce the number of transistors required to implement logic by
allowing the primary inputs to drive gate terminals as well as source/drain terminals
1. AND/NAND
2. OR/NOR
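Because the figures are not reproduced here, the sketch below shows one common textbook arrangement of pass-transistor AND and OR gates, in which input B controls the switches and input A (or B itself) is passed through. It treats the pass transistors as ideal switches and ignores the threshold-voltage drop; the function names are hypothetical.

```python
def pass_mux(select: int, when_one: int, when_zero: int) -> int:
    """Two pass transistors gated by select and its complement; exactly one conducts."""
    return when_one if select else when_zero


def pt_and(a: int, b: int) -> int:
    # B = 1 passes A to the output; B = 0 passes B itself, which is a logic 0
    return pass_mux(b, a, b)


def pt_or(a: int, b: int) -> int:
    # B = 1 passes B itself, which is a logic 1; B = 0 passes A
    return pass_mux(b, b, a)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, pt_and(a, b), pt_or(a, b))
```

Feeding the complementary inputs through the same networks yields the NAND and NOR outputs, which is how the inputs end up driving source/drain terminals as well as gates.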
Addition forms the basis for many processing operations, from ALUs to address generation to
multiplication to filtering. As a result, adder circuits that add two binary numbers are of great
interest to digital system designers. An extensive, almost endless, assortment of adder
architectures serves different speed/power/area requirements. This section begins with half
adders and full adders for single-bit addition
ADDER
HALF ADDER
The half adder adds two single-bit inputs, A and B. The result is 0, 1, or 2, so two bits are
required to represent the value; they are called the sum S and carry-out Cout. The carry-out is
equivalent to a carry-in to the next more significant column of a multibit adder.
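For a full adder with inputs A, B and carry-in Cin (’ denotes the complement):
The adder generates a carry when Cout is true independent of Cin, so G = A · B.
The adder kills a carry when Cout is false independent of Cin, so K = A’ · B’ = (A + B)’.
The adder propagates a carry, i.e. it produces a carry-out if and only if it receives a
carry-in, when exactly one input is true: P = A ⊕ B.
In terms of these signals, the full adder logic is
S = A ⊕ B ⊕ Cin = P ⊕ Cin
Cout = A · B + Cin · (A ⊕ B) = G + P · Cin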
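The generate, propagate and kill signals can be checked exhaustively with a few lines of code. This is a minimal sketch using the standard definitions G = A·B, P = A ⊕ B and K = (A + B)’; the function name `full_adder` is illustrative.

```python
def full_adder(a: int, b: int, cin: int):
    """Return (sum, carry-out, generate, propagate, kill) for one bit position."""
    g = a & b                # generate: carry-out is 1 regardless of carry-in
    p = a ^ b                # propagate: carry-out equals carry-in
    k = (1 - a) & (1 - b)    # kill: carry-out is 0 regardless of carry-in
    s = p ^ cin
    cout = g | (p & cin)
    return s, cout, g, p, k


for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, full_adder(a, b, cin))
```

Exactly one of G, P and K is true for any pair of input bits, which is what makes them convenient building blocks for carry-propagate adders.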
Subtraction involves inverting one operand to an N-bit carry-propagate adder and adding 1 via the
carry input, as shown in the figure.
ADDER/SUBTRACTOR
An adder/subtractor uses XOR gates to conditionally invert B, as shown in below Figure.
In prefix adders, the XOR gates on the B inputs are sometimes merged into the bitwise PG
circuitry.
While it is perfectly possible to design a custom circuit for the subtraction operation, it is
much more common to reuse an existing adder and to replace the subtraction by a
two's-complement addition.
When the SUB/ADD input is low (0), the XOR gates act as non-inverting buffers and the
carry input to the adder is 0. Therefore, the adder calculates a four-bit sum plus carry-out:
(Cout, S3, S2, S1, S0) = (A3, A2, A1, A0) + (B3, B2, B1, B0)
When the SUB/ADD input is high (1), the XOR gates invert the B inputs and the carry input
to the adder is 1. Therefore, the adder calculates the two's-complement difference:
(Cout, S3, S2, S1, S0) = (A3, A2, A1, A0) - (B3, B2, B1, B0)
1. ADD A+B
A 1110
B 1111
SUM = 1101, CARRY = 1
2. SUBTRACT A-B
A 1110
B 1111
S = A - B = A + B’ + 1 = A + (2’s complement of B)
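The two examples can be reproduced with a behavioural model of the adder/subtractor. The sketch assumes a single SUB control that both inverts B through the XOR gates and drives the carry input; the function name `add_sub` and its parameters are illustrative.

```python
def add_sub(a: int, b: int, sub: int, width: int = 4):
    """Return (sum bits, carry-out) of a width-bit adder/subtractor."""
    mask = (1 << width) - 1
    b_xor = (b ^ (mask if sub else 0)) & mask   # XOR gates on the B inputs
    total = (a & mask) + b_xor + sub            # SUB also feeds the carry input
    return total & mask, total >> width


add_result = add_sub(0b1110, 0b1111, 0)
sub_result = add_sub(0b1110, 0b1111, 1)
print(format(add_result[0], "04b"), add_result[1])
print(format(sub_result[0], "04b"), sub_result[1])
```

The addition gives 1101 with a carry of 1. The subtraction gives 1111 with a carry of 0, which is -1 when read as a 4-bit two's-complement number.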
MULTIPLICATION
Multiplication can be considered as a series of repeated additions. The number to be added is
the multiplicand, the number of times that it is added is the multiplier, and the result is the
product. Each step of the addition generates a partial product. In most computers, the operands
usually contain the same number of bits. When the operands are interpreted as integers, the
product is generally twice the length of the operands in order to preserve the information content.
The repeated-addition method suggested by the arithmetic definition is so slow that it is
almost always replaced by an algorithm that makes use of positional number representation. It is
possible to decompose multipliers into two parts. The first part is dedicated to the generation of partial
products, and the second one collects and adds them.
1. Parallel Multiplier
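We denote the multiplicand as Y = {y_{M–1}, y_{M–2}, …, y_1, y_0} and the multiplier as X = {x_{N–1}, x_{N–2}, …, x_1, x_0}.
For unsigned multiplication, the product is given by
$P = \left(\sum_{j=0}^{M-1} y_j 2^j\right)\left(\sum_{i=0}^{N-1} x_i 2^i\right) = \sum_{i=0}^{N-1}\sum_{j=0}^{M-1} x_i y_j 2^{i+j}$
For example, the multiplication of two positive 6-bit binary integers, $25_{10}$ and $39_{10}$, proceeds as
shown in the figure.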
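The partial-product view of the 25 × 39 example can be sketched as follows. Each 1 bit of the multiplier contributes one shifted copy of the multiplicand, and the copies are then summed; the function name `partial_products` is hypothetical.

```python
def partial_products(y: int, x: int, n_bits: int):
    """One shifted copy of the multiplicand y for every 1 bit of the multiplier x."""
    return [(y << i) if (x >> i) & 1 else 0 for i in range(n_bits)]


rows = partial_products(25, 39, 6)   # 39 = 100111 in binary
product = sum(rows)

for row in rows:
    print(format(row, "012b"))       # the product of two 6-bit operands fits in 12 bits
print(format(product, "012b"), product)
```

The first part of a hardware multiplier generates these rows with AND gates, and the second part adds them.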
1. Identity Comparator - an Identity Comparator is a digital comparator that has only one
output terminal for when A = B either "HIGH" A = B = 1 or "LOW" A = B = 0
2. Magnitude Comparator - a Magnitude Comparator is a type of digital comparator that
has three output terminals, one each for equality (A = B), greater than (A > B) and less
than (A < B)
This is useful if we want to compare two variables and want to produce an output when
any of the above three conditions are achieved. For example, produce an output from a counter
when a certain count number is reached. You may notice two distinct features about the
comparator from the above truth table. Firstly, the circuit does not distinguish between either two
"0" or two "1"'s as an output A = B is produced when they are both equal, either A = B = "0" or
A = B = "1". Secondly, the output condition for A = B resembles that of a commonly available
logic gate, the Exclusive-NOR or Ex-NOR function (equivalence) on each of the n-bits giving:
Q = (A ⊕ B)’
Digital comparators actually use Exclusive-NOR gates within their design for comparing their
respective pairs of bits. When we are comparing two binary or BCD values or variables against
1-bit Comparator
Then the operation of a 1-bit digital comparator is given in the following Truth Table.
Truth Table
XNOR
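Since the truth table is not reproduced here, the sketch below generates it for a 1-bit comparator and extends the equality output to words by ANDing the bitwise XNORs. The function names are illustrative, and the word comparison assumes both operands are given as bit lists of equal length.

```python
def xnor(a: int, b: int) -> int:
    """Equivalence of two bits: 1 when they are equal."""
    return 1 - (a ^ b)


def compare_1bit(a: int, b: int):
    """Return (A > B, A = B, A < B) for single bits."""
    return a & (1 - b), xnor(a, b), (1 - a) & b


def identity_compare(a_bits, b_bits) -> int:
    """A = B for whole words: AND of the XNOR of each bit pair."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    return int(all(xnor(a, b) for a, b in zip(a_bits, b_bits)))


print("A B | A>B A=B A<B")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|", *compare_1bit(a, b))
```

The A = B column is 1 for both 0,0 and 1,1, which is the XNOR behaviour described above.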
PARITY GENERATORS
A parity bit can be added to an N-bit word to indicate whether the number of 1s in the word is
even or odd. In even parity, the extra bit is the XOR of the other N bits, which ensures the (N +
1)-bit coded word has an even number of 1s:
$A_N = \mathrm{PARITY} = A_0 \oplus A_1 \oplus A_2 \oplus \cdots \oplus A_{N-1}$
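As a small illustration of even parity, the sketch below computes the parity bit as the XOR of all word bits and appends it to the word. The example word and the function name `even_parity_bit` are arbitrary choices.

```python
from functools import reduce
from operator import xor


def even_parity_bit(bits):
    """XOR of all bits: 1 when the word holds an odd number of 1s."""
    return reduce(xor, bits, 0)


word = [1, 0, 1, 1]
coded = word + [even_parity_bit(word)]
print(coded, sum(coded) % 2)
```

The coded word always contains an even number of 1s, so a single flipped bit can be detected by recomputing the XOR.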
Parity generator helps in indicating the parity of a binary number or a word. Let us consider
One/Zero Detectors
Detecting all ones or zeros on wide N-bit words requires large fan-in AND or NOR gates. Recall
that by DeMorgan’s law, AND, OR, NAND, and NOR are fundamentally the same operation
except for possible inversions of the inputs and/or outputs. You can build a tree of AND gates, as
shown in Figure. Here, alternate NAND and NOR gates have been used.
Counters
Two commonly used types of counters are binary counters and linear-feedback shift
registers. An N-bit binary counter sequences through 2^N outputs in binary order. Simple designs
have a minimum cycle time that increases with N, but faster designs operate in constant time. An
N-bit linear-feedback shift register sequences through up to 2^N – 1 outputs in pseudo-random
order. It has a short minimum cycle time independent of N, so it is useful for extremely fast
counters as well as pseudo-random number generation.
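The 2^N – 1 sequence length of a linear-feedback shift register can be seen on a 3-bit example. The tap positions used here (XOR of the two most significant bits) are an assumed choice that happens to give the maximal length; the notes do not specify a particular LFSR.

```python
def lfsr3_sequence(seed: int = 0b001):
    """States of a 3-bit LFSR whose feedback is the XOR of its two highest bits."""
    if not 1 <= seed <= 7:
        raise ValueError("seed must be a nonzero 3-bit value")
    states, state = [], seed
    while True:
        states.append(state)
        feedback = ((state >> 2) ^ (state >> 1)) & 1
        state = ((state << 1) | feedback) & 0b111   # shift left, insert feedback
        if state == seed:
            return states


sequence = lfsr3_sequence()
print([format(s, "03b") for s in sequence])
```

The register visits all seven nonzero states before repeating; the all-zero state is excluded because it would feed back a 0 forever.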
Some of the common features of counters include the following:
Resettable: counter value is reset to 0 when RESET is asserted (essential for testing)
Loadable: counter value is loaded with N-bit value when LOAD is asserted
Enabled: counter counts only on clock cycles when EN is asserted
Reversible: counter increments or decrements based on UP/DOWN input
Terminal Count: TC output asserted when counter overflows (when counting up)
or underflows (when counting down)
1. Binary Counters
The simplest binary counter is the asynchronous ripple-carry counter, as shown in Figure
It is composed of N registers connected in toggle configuration, where the falling transition of
each register clocks the subsequent register. Therefore, the delay can be quite long. It has no
reset signal, making it difficult to test. In general, asynchronous circuits introduce a whole
2. Ring Counters
A ring counter consists of an M-bit shift register with the output fed back to the input, as
shown in the figure. On reset, the first bit is initialized to 1 and the others are initialized to 0. TC
toggles once every M cycles. Ring counters are a convenient way to build extremely fast
prescalers because there is no logic between flip-flops, but they become costly for larger M.
3. Johnson Counters
A Johnson or Mobius counter is similar to a ring counter, but inverts the output before it is fed
back to the input, as shown in the figure. The flip-flops are reset to all zeros and count through 2M
states before repeating. The table shows the sequence for a 3-bit Johnson counter.
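The ring and Johnson sequences can be generated with a simple shift-register model. This is a behavioural sketch only; the function names and the left-to-right bit order in the printed strings are assumptions.

```python
def ring_counter(m: int, cycles: int):
    """Shift register with the last output fed straight back to the input."""
    state = [1] + [0] * (m - 1)               # reset: first bit 1, others 0
    seq = []
    for _ in range(cycles):
        seq.append("".join(map(str, state)))
        state = [state[-1]] + state[:-1]
    return seq


def johnson_counter(m: int, cycles: int):
    """Shift register with the inverted last output fed back to the input."""
    state = [0] * m                           # reset: all zeros
    seq = []
    for _ in range(cycles):
        seq.append("".join(map(str, state)))
        state = [1 - state[-1]] + state[:-1]
    return seq


print(ring_counter(3, 4))
print(johnson_counter(3, 7))
```

For M = 3 the ring counter repeats after 3 states, while the Johnson counter passes through 2M = 6 states before returning to 000.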
Control
The control unit is usually an FSM, since some operations that the chip performs take multiple clock
cycles and the controller must know where it is in the instruction. In pipelined microprocessors, each
instruction may take n cycles to complete, but the next instructions are started before previous
ones have finished. The controller must track which instruction is at each function in each cycle.
FSM
Finite State Machines (as a sequential network) hold the present state in memory and compute
the next state and output using combinational logic. The input to the next state computation is the
present state and any inputs. A modulus counter represents the simplest FSM since it has no
It is important to understand the role of the clock in coordinating the operation of the FSM. As
the following diagram illustrates, the Next State computation is (must be) completed when the
clock pulse occurs. The clock is active on the rising edge in this example meaning that all
changes to the memory occur on the rising clock edge. The Next State becomes the Present State
on the active clock edge, meaning the count changes.
FSM Design
• This presentation deals with front to end design of finite state machines, both Mealy and
Moore types.
FSM Implementation
• Converting a problem to equivalent state table and state diagram is just the first step in
the design process
• The next step is to design the system hardware that implements the state machine.
• This section deals with the process involved to design the digital logic to implement a
finite state machine.
• The first step is to assign a unique binary value to each state that the machine can
be in. The state must be encoded in binary.
• Next we design the hardware to go from the current state to the correct next state. This
logic converts the current state and the current input values to the next state values and
stores that value.
• The final stage is to generate the outputs of the state machine. This is done
using combinatorial logic.
• Any values can be assigned to the states, but some values are better than others (in
terms of minimizing the logic needed to create the output and the next state values).
• This is actually an iterative process: first the designer creates a preliminary design to
generate the outputs and the next states, then modifies the state values and repeats
the process. There is a rule of thumb that simplifies the process: whenever possible, the
state should be assigned the same value as the outputs associated with that state. In
this case, the same logic can be used to generate the next state and the output.
• The state value together with the machine inputs, are input to a logic block (CLC) that
generates the next state value and machine outputs
• The next state is loaded into the register on the rising edge of the clock signal
• A modulo-6 counter is a 3-bit counter that counts through the following sequence:
– 000->001->010->011->100->101->000->…
– When I=1 the counter increments its value on the rising edge of the clock
– When I=0 the counter retains its value on the rising edge of the clock
• There is an additional output O (Carry) that is 1 when going from 5 to 0 and 0 otherwise
(the O output remains 1 until the counter goes from 0 to 1)
Present state  I  Next state  O  C2C1C0
S0             0  S0          1  000
S0             1  S1          0  001
S1             0  S1          0  001
S1             1  S2          0  010
S2             0  S2          0  010
S2             1  S3          0  011
S3             0  S3          0  011
S3             1  S4          0  100
S4             0  S4          0  100
S4             1  S5          0  101
S5             0  S5          0  101
S5             1  S0          1  000
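The state table can be exercised with a behavioural model of the counter. The sketch follows the table as printed, with O = 1 on the transition from S5 to S0 and while the machine waits in S0 with I = 0; the function name `mod6_step` and the input sequence are illustrative.

```python
def mod6_step(state: int, i: int):
    """One rising clock edge of the modulo-6 counter: returns (next_state, O)."""
    if not 0 <= state <= 5:
        raise ValueError("state must be in the range 0..5")
    next_state = (state + 1) % 6 if i else state
    carry = 1 if (state == 5 and i == 1) or (state == 0 and i == 0) else 0
    return next_state, carry


state, trace = 0, []
for i in [1, 1, 0, 1, 1, 1, 1, 1]:
    state, carry = mod6_step(state, i)
    trace.append((state, carry))
print(trace)
```

The count holds for the cycle in which I = 0 and wraps from 5 to 0 with O = 1.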
• For each state examine what happens for all possible values of the inputs
– If I=0 the state machine remains in state S0 and outputs ‘O’=1 and C2C1C0=000
– If I=1 the state machine goes in state S1, outputs O=0 and C2C1C0=001
• In the same manner, each state goes to the next state if I=1 and remains in the same
state if I=0
• Since the Mealy and Moore machines must traverse the same states under the same
conditions, their next state logic is identical
• To begin with, we need to setup the truth table for the next state logic
P2P1P0  I  N2N1N0
000     0  000
000     1  001
001     0  001
001     1  010
010     0  010
010     1  011
011     0  011
011     1  100
100     0  100
100     1  101
101     0  101
101     1  000
• The system inputs and the present states are the inputs of the truth table
• We have to construct a Karnaugh map for each output bit and obtain its equation
• N2 = P2P0’ + P2I’ + P1P0I
• N1 = P1P0’ + P1I’ + P2’P1’P0I
• N0 = P0’I + P0I’
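The next-state equations can be checked against the counting behaviour by brute force. The sketch assumes the minimised sum-of-products forms N2 = P2P0’ + P2I’ + P1P0I, N1 = P1P0’ + P1I’ + P2’P1’P0I and N0 = P0’I + P0I’, which follow from the truth table when the unused states 110 and 111 are treated as don't-cares; the helper names are hypothetical.

```python
def inv(v: int) -> int:
    """Complement of a single bit."""
    return 1 - v


def next_state_bits(p2: int, p1: int, p0: int, i: int):
    """Sum-of-products next-state logic of the modulo-6 counter."""
    n2 = (p2 & inv(p0)) | (p2 & inv(i)) | (p1 & p0 & i)
    n1 = (p1 & inv(p0)) | (p1 & inv(i)) | (inv(p2) & inv(p1) & p0 & i)
    n0 = (inv(p0) & i) | (p0 & inv(i))
    return n2, n1, n0


mismatches = []
for state in range(6):                 # only the six valid states are checked
    for i in (0, 1):
        bits = ((state >> 2) & 1, (state >> 1) & 1, state & 1)
        expected = (state + 1) % 6 if i else state
        n2, n1, n0 = next_state_bits(*bits, i)
        if (n2 << 2 | n1 << 1 | n0) != expected:
            mismatches.append((state, i))
print(mismatches)
```

An empty list means the equations reproduce every row of the truth table.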
• Both for Mealy and Moore machines we follow the same design procedure to develop
their output logic
• There are two approaches to generate the output (similar to generate the next state
logic):
– For Mealy machine, the truth table inputs will be the present state and the
system inputs, and the table outputs are the system outputs
– For Moore machine, only the state bits are inputs of the truth table, since only
these bits are used to generate the system outputs; the table outputs are the
system outputs
P2P1P0  O  C2C1C0
000     1  000
001     0  001
010     0  010
011     0  011
100     0  100
101     0  101
• The outputs depend only on the present state and not on its inputs
– The system output depends only on the present state, so the implementation of
the output logic is done separately
– The next state is obtained from the input and the present state (same as for the
Mealy machine)
– C2 = P2
– C1 = P1
– C0 = P0
– O = P2’P1’P0’ = (P2+P1+P0)’
Mealy
Modulo 6 Counter - Mealy state diagram
P2P1P0  I  O  C2C1C0
000     0  1  000
000     1  0  001
001     0  0  001
001     1  0  010
010     0  0  010
010     1  0  011
011     0  0  011
011     1  0  100
100     0  0  100
100     1  0  101
101     0  0  101
101     1  1  000
• The logic block (CLC) is specific to every system and may consist of combinatorial logic
gates, multiplexers, lookup ROMs and other logic components
• The logic block can’t include any sequential components, since it must generate its value
in one clock cycle
• Mealy machine (note that the equations for C2, C1, C0 are exactly the same as those for
N2, N1, N0. This is the result of an optimal assignment of the state values: the same
combinatorial logic can be used to obtain the outputs):
– C2 = P2P0’+P2I’+P1P0I
– C1 = P1P0’+P1I’+P2’P1’P0I
– C0 = P0’I+P0I’
– O = P2’P1’P0’I’+P2P0I
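The difference between the two carry outputs can be made concrete with a short check over the six valid states. The sketch uses the Mealy expression O = P2’P1’P0’I’ + P2P0I and the Moore expression O = P2’P1’P0’ given above; the function names are illustrative.

```python
def inv(v: int) -> int:
    """Complement of a single bit."""
    return 1 - v


def mealy_carry(p2: int, p1: int, p0: int, i: int) -> int:
    """Mealy output: depends on the present state and the input I."""
    return (inv(p2) & inv(p1) & inv(p0) & inv(i)) | (p2 & p0 & i)


def moore_carry(p2: int, p1: int, p0: int) -> int:
    """Moore output: depends on the present state only."""
    return inv(p2) & inv(p1) & inv(p0)


mealy_ones = [(s, i) for s in range(6) for i in (0, 1)
              if mealy_carry((s >> 2) & 1, (s >> 1) & 1, s & 1, i)]
moore_ones = [s for s in range(6)
              if moore_carry((s >> 2) & 1, (s >> 1) & 1, s & 1)]
print(mealy_ones, moore_ones)
```

The Mealy carry is asserted for state 0 with I = 0 and for state 5 with I = 1, whereas the Moore carry is asserted whenever the machine is in state 0.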