
Module 1 : Introduction to VLSI Design

Lecture 1 : Motivation of the Course


Objectives
In this lecture you will learn the following:
Motivation of the course
Course Objectives
1.1 Motivation of the course
Why do some circuits work the first time while others take over a year and multiple
design iterations to work properly? Why can production quantities be ramped up easily
for some circuits, while for others both circuit and process optimisation is needed?
Why do some circuits run red-hot, requiring expensive cooling solutions, while other
circuits of similar performance run from small batteries in hand-held gadgets?
Why do some companies make money with successful innovations, while others lose
hundreds of millions of dollars of revenue simply because they did not get their
product to market in time?
The answer to these questions is (a lack of) system engineering: analysis and design of
a system's relevant electrical parameters. Deep submicron CMOS technologies have
moved the bottleneck from device- and gate-level issues to interconnect and
communication (metal wire) bottlenecks, for which we currently have no design
automation. This course aims to provide a working knowledge of chip-level system
electrical issues, so that these new bottlenecks can be removed or lived with, and
design disasters avoided through proper structures and performance budgeting.
1.2 Course Objectives
The course provides final-year undergraduates with a solid and fundamental engineering
view of digital system operation and of how to systematically design well-performing
digital VLSI systems that consistently exceed customer expectations and competitor
fears. The aim is to teach the critical methods and circuit structures needed to identify
the key 1% of on-chip circuitry that dominates the performance, reliability,
manufacturability, and cost of a VLSI circuit. With current deep submicron CMOS
technologies (0.25 micron and below design rules), the major design paradigm shift is
that the interconnections (metal Al or Cu wires connecting gates) and chip
communication in general, rather than the active transistors or logic gates, are the main
design object. The main design issues defining the make-or-break point in each project
are associated with power and signal distribution and with bit/symbol communication
between functional blocks on-chip and off-chip. The course provides a solid framework
for understanding:
- Scaling of technology and its impact on interconnects
- Interconnects as design objects
- Noise in digital systems and its impact on system operation
- Power distribution schemes for low noise
- Signal and signalling conventions for on-chip and off-chip communication
- Timing and synchronisation for fundamental operations and signalling
The course objective is to provide the student with a solid understanding of the
underlying mechanisms of, and solution techniques for, the above key design issues, so
that the student, when working as an industrial designer, is capable of identifying the
key problem points, focusing his creative attention and 90% of available resources on
the right issues for the critical 1% of the circuitry, and leaving the remaining 99% of the
circuitry to computer-automated tools or less specialised engineers.
Recap
In this lecture you have learnt the following
Motivation of the course
Course Objectives

Module 3 : Fabrication Process and Layout Design Rules


Lecture 10 : General Aspects of CMOS Technology
Objectives
In this lecture you will learn the following
Gate Material
Parasitic Capacitances
Self-aligned silicon gate technology
Channel Stopper
Polysilicon deposition
Oxide Growth
Active mask or Isolation mask (thin-ox)
10.1 Gate Material
Metals have several advantages when considered as gate electrodes. The use of metal
gates would certainly eliminate the problems of dopant penetration through the
dielectric and subsequent gate depletion. The use of metals with appropriate work
functions for NMOS and PMOS devices would lead to transistors with symmetrical and
tailored threshold voltages. Most refractory metals are good choices for this application,
primarily due to their high melting points, which allow them to be used at the high
temperatures necessary for source-drain implant activation. However, the
thermodynamic stability of metal-dielectric interfaces at processing temperatures is a
major concern which needs to be addressed, in addition to more subtle issues of
electrical properties, flat-band voltage (and ultimately threshold voltage) stability, and
charge trapping at the interface. The problem with using aluminium is that, once
deposited, it cannot be subjected to high-temperature processes. Copper causes a lot of
trap generation when used as a gate material.
10.2 Parasitic Capacitances

Figure 10.2: Parasitic capacitances in MOSFET


Although many parasitic capacitances exist in a MOSFET, as shown in figure 10.2, those
of prime concern to us are the gate-to-drain capacitance (Cgd) and the gate-to-source
capacitance (Cgs), because they couple the input and output nodes, and Cgd gets
multiplied by the gain during circuit operation. They therefore increase the input
capacitance drastically and decrease the charging rate.
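The input-capacitance penalty can be estimated with Miller's theorem: the feedback capacitance appears at the input multiplied by (1 + gain). A minimal sketch; all component values below are assumed purely for illustration:

```python
# Effective input capacitance of a MOSFET stage due to the Miller effect.
# Cgs, Cgd and the stage gain are illustrative assumptions, not process data.

def miller_input_capacitance(cgs, cgd, gain):
    """Return the effective input capacitance Cin = Cgs + (1 + gain) * Cgd."""
    return cgs + (1.0 + gain) * cgd

cgs = 2e-15   # gate-source capacitance, 2 fF (assumed)
cgd = 1e-15   # gate-drain (feedback) capacitance, 1 fF (assumed)
gain = 9.0    # magnitude of the stage voltage gain (assumed)

cin = miller_input_capacitance(cgs, cgd, gain)
print(f"Cin = {cin * 1e15:.1f} fF")   # 2 fF + 10 * 1 fF = 12 fF
```

Even a small Cgd thus dominates the input load once multiplied by the gain, which is why the overlap capacitances matter so much.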
10.3 Self-aligned Silicon Gate Technology

Figure 10.3: Cross-sectional view of a MOSFET during the self-aligning process


When metal is used as the gate material, the source and drain are formed before the
gate; mask aligners are then used to align the gate, and alignment errors occur. In the
polysilicon gate process, the exposed gate oxide (not covered by polysilicon) is etched
away and the wafer is subjected to a dopant source or ion implant, which forms the
source and drain. Since these are formed only in the regions not covered by polysilicon,
the source and drain do not extend under the gate. This is called the self-aligning
process.
10.4 Channel Stopper
A channel stopper is used to prevent channel formation in the substrate below the field
oxide. For example, for a p-substrate, the channel stopper implant would be p+, which
increases the magnitude of the threshold voltage in that region.
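The effect of such an implant can be sketched with the standard depletion-charge contribution to the threshold voltage, γ·√(2φF), evaluated under a thick field oxide. The oxide thickness and doping levels below are assumed illustrative values, not process data:

```python
import math

# How a channel-stop implant raises the parasitic field-oxide threshold.
# All numeric values (field-oxide thickness, dopings) are illustrative assumptions.

Q = 1.602e-19               # electron charge, C
K_T = 0.0259                # thermal voltage at 300 K, V
NI = 1.5e10                 # intrinsic carrier concentration of Si, cm^-3
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon, F/cm
EPS_OX = 3.9 * 8.854e-14    # permittivity of SiO2, F/cm
T_FOX = 0.5e-4              # field-oxide thickness, 0.5 um in cm (assumed)

def depletion_term(na):
    """Depletion-charge contribution to VT under the field oxide, in volts."""
    phi_f = K_T * math.log(na / NI)               # Fermi potential
    cox = EPS_OX / T_FOX                          # field-oxide capacitance per cm^2
    gamma = math.sqrt(2 * Q * EPS_SI * na) / cox  # body-effect coefficient
    return gamma * math.sqrt(2 * phi_f)

for na in (1e15, 1e17):   # plain substrate vs. channel-stop-implanted doping
    print(f"NA = {na:.0e} cm^-3 -> depletion term = {depletion_term(na):.1f} V")
```

Raising the doping under the field oxide by two orders of magnitude pushes this term from a couple of volts to tens of volts, which is exactly the margin the channel stopper provides.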
Irregular surfaces can cause "step coverage problems", in which a conductor thins and
can even break as it crosses a thick-to-thin oxide boundary. One of the methods used to
remove these irregularities is to pre-etch the silicon, in the areas where the field oxide is
to be grown, by around half the final required field oxide thickness. LOCOS oxidation
(explained shortly) done after this gives a planar field oxide/gate oxide interface.
10.5 Polysilicon Deposition
The sheet resistance of undoped polysilicon is about 10^8 ohms per square, and it can
be reduced to around 30 ohms per square by heavy doping. The advantage of using
polysilicon as the gate material is its further use as a mask, allowing precise definition
of the source and drain. The polysilicon resistance contributes to the input resistance of
the transistor and should therefore be small to improve the RC time constant; for this, a
high doping concentration is used.
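The resistance of a poly line follows directly from its sheet resistance; a brief sketch using the doped-poly figure above, where the line geometry and load capacitance are assumed for illustration:

```python
# Resistance of a doped polysilicon line from its sheet resistance,
# and the resulting RC product. Geometry and load values are assumptions.

def line_resistance(sheet_res, length, width):
    """R = Rs * (L / W): sheet resistance times the number of squares."""
    return sheet_res * (length / width)

rs = 30.0          # doped-poly sheet resistance, ohm/square (from the text)
length = 10e-6     # line length, 10 um (assumed)
width = 1e-6       # line width, 1 um (assumed)
c_load = 10e-15    # load capacitance, 10 fF (assumed)

r = line_resistance(rs, length, width)
tau = r * c_load
print(f"R = {r:.0f} ohm, RC = {tau * 1e12:.1f} ps")   # 300 ohm, 3.0 ps
```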

10.6 Oxide Growth

Figure 10.6: Formation of bird's beak in MOSFET


Oxide grown on silicon may result in an uneven surface, because unequal thicknesses of
oxide grow from the same thickness of silicon. Stress along the edge of an oxidized area
(where the silicon has been trenched prior to oxidation to produce a planar surface) may
produce severe damage in the silicon. To relieve this stress, the oxidation temperature
must be sufficiently high to allow the stress in the oxide to be relieved by viscous flow.
In the LOCOS process, the transistor area is masked by an SiO2/SiN sandwich and the
thick field oxide is then grown. The oxide grows vertically in both directions and also
laterally under the sandwich, resulting in an encroachment into the gate region called
the bird's beak.
This reduces the active area of the transistor, and especially its width. Some
improvements to the LOCOS process produce a bird's crest, which reduces the
encroachment, but it is non-uniform.

Figure 10.62: Comparison of the LOCOS process with and without sacrificial polysilicon
The goal is to oxidize Si only locally, wherever a field oxide is needed. This is necessary
for the following reasons:
-- The local oxide penetrates into the Si, so the Si-SiO2 interface lies lower than the
source-drain regions to be made later. This could not be achieved by oxidizing all of the
Si and then etching off the unwanted oxide.
-- For device performance reasons, this is highly beneficial, if not absolutely necessary.
10.7 Active Mask or Isolation Mask (thin-ox)

It describes the areas where thin oxides are needed to implement the transistor gates
and to allow implantations to form p/n-type diffusions. A thin layer of SiO2 is grown and
covered with SiN, and this is used as a mask. The bird's beak must be taken into
account while designing the thin-ox.
Recap
In this lecture you have learnt the following
Gate Material
Parasitic Capacitances
Self-aligned silicon gate technology
Channel Stopper
Polysilicon deposition
Oxide Growth
Active mask or Isolation mask (thin-ox)

Congratulations, you have finished Lecture 10.

Module 3 : Fabrication Process and Layout Design Rules


Lecture 11 : General Aspects of CMOS Technology (contd...)
Objectives
In this lecture you will learn the following
Why is polysilicon preferred over aluminium as a gate material?
Channel stopper Implant
Local Oxidation of silicon (LOCOS)
11.1 Why is polysilicon preferred over aluminium as a gate material?
There are two main reasons:

Figure 11.11: Self-alignment is not possible in the case of an Al gate, due to Cgd and Cgs

Figure 11.12: Self-alignment is possible in the case of polysilicon


1. Penetration of the silicon substrate: If aluminium metal is deposited as the gate, we
cannot raise the temperature beyond about 500 degrees Celsius, because aluminium
will then start penetrating the silicon substrate and act as a p-type impurity.
2. Problem of non-self-alignment: In the case of an aluminium gate, we have to create
the source and drain first and the gate afterwards. We cannot do the reverse, because
diffusion is a high-temperature process. This creates parasitic overlap capacitances
Cgd and Cgs at the input (figure 11.11). Cgd is more harmful because it is a feedback
capacitance and hence is reflected at the input magnified by (k+1) times (recall
Miller's theorem), where k is the gain. So if aluminium is used, the input capacitance
increases unnecessarily, which further increases the charging time of the input
capacitance.

Therefore the output does not appear immediately. If polysilicon is used instead, it is
possible to create the gate first and the source and drain implants afterwards, which
eliminates the problem of the overlap capacitances Cgd and Cgs.
The resistivity of undoped polysilicon is very high (about 10^8 ohm-cm), so we dope
the polysilicon so that it behaves more like a metal such as Al, reducing its resistance to
around 100 to 300 ohms (although this is still greater than that of Al).
The voltage on a charging capacitance rises as 1 - e^(-t/RC), where R and C are the
resistance and capacitance of the device. Since resistance is directly proportional to
length, the polysilicon length should be kept small so that the resistance is not large;
otherwise the whole purpose of decreasing C (and hence the time constant RC) would
be nullified.
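The charging behaviour above can be sketched numerically; the R and C values here are assumed for illustration:

```python
import math

# Exponential charging of a gate capacitance through a series resistance.
# The R and C values are illustrative assumptions.

def charge_fraction(t, r, c):
    """Fraction of the final voltage reached after time t: 1 - exp(-t/RC)."""
    return 1.0 - math.exp(-t / (r * c))

def time_to_fraction(f, r, c):
    """Time needed to charge to a fraction f of the final voltage."""
    return -r * c * math.log(1.0 - f)

r, c = 300.0, 10e-15            # 300 ohm poly line driving 10 fF (assumed)
tau = r * c
print(f"time to 90% of final voltage: {time_to_fraction(0.9, r, c) / tau:.2f} tau")
```

Reaching 90% of the final voltage takes about 2.3 time constants, so halving either R or C halves the charging time directly.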
11.2 Channel stopper Implant
As we know, millions of transistors are fabricated on a single chip. To separate
(insulate) them from each other, we grow thick oxides (called field oxides). Even so, at
very high voltages, inversion may set in below the field oxide, despite the large
thickness of these oxides.

Figure 11.21: Channel stopper implant before the field oxide region is grown (yellow region)
To avoid this problem, we perform an implant in this region before growing the field
oxide layer, so that the threshold voltage of this region is much greater than that of the
desired active transistor channel region. This implant layer is called the channel stopper
implant (as shown in figure 11.21).
11.3 Local Oxidation of Silicon (LOCOS)

Figure 11.31: Formation of LOCOS


During etching, anything irregular becomes more irregular. So we grow the field oxide
roughly 50% above and 50% below the original wafer surface. This is called LOCal
Oxidation of Silicon (LOCOS).

Figure 11.32: bird's beak


0.45 µm of silicon, when oxidized, becomes 1 µm of SiO2 because of the change in
density. When field oxides are grown, there is an encroachment of the oxide layer into
the active transistor region below the gate oxide, because of the affinity of the SiO2
gate oxide for oxygen. The resulting structure resembles a bird's beak (as shown in
figure 11.32). This affects the device performance.
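The 0.45 µm-to-1 µm figure lets us estimate how much silicon a grown field oxide consumes and how far it stands above the original surface; the target oxide thickness below is an assumed value:

```python
# Silicon consumed when growing a field oxide: roughly 0.45 um of Si
# is converted per 1 um of SiO2 grown (figure from the text).

SI_FRACTION = 0.45   # Si thickness consumed per unit of oxide thickness grown

def silicon_consumed(oxide_thickness):
    """Thickness of silicon converted into oxide."""
    return SI_FRACTION * oxide_thickness

def oxide_above_surface(oxide_thickness):
    """Portion of the grown oxide standing above the original Si surface."""
    return oxide_thickness - silicon_consumed(oxide_thickness)

t_ox = 1.0   # grown field-oxide thickness in um (assumed)
print(f"Si consumed: {silicon_consumed(t_ox):.2f} um, "
      f"oxide above surface: {oxide_above_surface(t_ox):.2f} um")
```

This is why the grown oxide sits roughly half below and half above the original surface, as described for LOCOS above.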

Figure 11.33: Bird's crest


If we use Si3N4 as the gate dielectric, it will not let oxygen pass through. But because
of the mismatch between the thermal coefficients of Si and Si3N4, the resulting stress
produces a non-planar structure called the bird's crest (as shown in figure 11.33).
The thermal coefficients of Si and SiO2 match. So when Si3N4 is used as the gate
dielectric, we first grow a thin oxide layer underneath. The stress which would otherwise
be generated on account of the difference in the thermal coefficients of Si and Si3N4 is
now reduced. However, since SiO2 is now present, a bird's beak will again be formed.
Recap
In this lecture you have learnt the following
Why is polysilicon preferred over aluminium as a gate material?
Channel stopper Implant
Local Oxidation of silicon (LOCOS)

Congratulations, you have finished Lecture 11.

Module 3 : Fabrication Process and Layout Design Rules


Lecture 12 : CMOS Fabrication Technologies
Objectives
In this lecture you will learn the following
Introduction
Twin Well/Tub Technology
Silicon on Insulator (SOI)
N-well/P-well Technology
12.1 Introduction
CMOS fabrication can be accomplished using any of three technologies:
N-well/P-well technology
Twin well technology
Silicon On Insulator (SOI)
In this discussion we will focus chiefly on N-well CMOS fabrication technology.
12.2 Twin Well Technology
Using twin well technology, we can optimise NMOS and PMOS transistors separately.
This means that transistor parameters such as the threshold voltage, body effect and
channel transconductance of both types of transistors can be tuned independently.
An n+ or p+ substrate, with a lightly doped epitaxial layer on top, forms the starting
material for this technology. The n-well and p-well are formed in this epitaxial layer,
which forms the actual substrate. The dopant concentrations can be carefully optimized
to produce the desired device characteristics, because two independent doping steps
are performed to create the well regions.
The conventional n-well CMOS process suffers from, among other effects, the problem
of unbalanced drain parasitics, since the doping density of the well region is typically
about one order of magnitude higher than that of the substrate. This problem is absent
in the twin-tub process.
12.3 Silicon on Insulator (SOI)
To improve process characteristics such as speed and latch-up susceptibility,
technologists have sought to use an insulating material instead of silicon as the
substrate.
Completely isolated NMOS and PMOS transistors can be created virtually side by side on
an insulating substrate (e.g. sapphire) using SOI CMOS technology.

This technology offers advantages in the form of higher integration density (because of
the absence of well regions), complete avoidance of the latch-up problem, and lower
parasitic capacitances compared to conventional n-well or twin-tub CMOS processes.
However, it comes at a higher cost than the standard n-well CMOS process. Still, the
improvements in device performance and the absence of latch-up problems can justify
its use, especially in deep submicron devices.
12.4 N-well Technology
In this discussion we will concentrate on the well-established n-well CMOS fabrication
technology, which requires that both n-channel and p-channel transistors be built on
the same chip substrate. To accommodate this, special regions are created with a
semiconductor type opposite to the substrate type. The regions thus formed are called
wells or tubs. In an n-type substrate we create a p-well; alternatively, an n-well is
created in a p-type substrate. We present here a simple n-well CMOS fabrication
technology, in which the NMOS transistor is created in the p-type substrate and the
PMOS transistor in the n-well, which is built into the p-type substrate.
Historically, fabrication started with p-well technology, but it has now shifted almost
completely to n-well technology. The main reason for this is that n-well sheet resistance
can be made lower than p-well sheet resistance (electrons are more mobile than holes).
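The mobility argument can be made quantitative with the sheet-resistance relation Rs = 1/(q·N·µ·t). The doping, well depth and mobility values below are assumed typical figures, not data for any particular process:

```python
# Why an n-well has lower sheet resistance than a p-well of the same doping
# and depth: sheet conductance scales with carrier mobility.
# Mobility, doping and depth values are assumed typical low-doping figures.

Q = 1.602e-19      # electron charge, C
MU_N = 1350.0      # electron mobility in Si, cm^2/(V*s) (assumed typical)
MU_P = 480.0       # hole mobility in Si, cm^2/(V*s) (assumed typical)

def sheet_resistance(doping, mobility, depth_cm):
    """Rs = 1 / (q * N * mu * t), in ohms per square."""
    return 1.0 / (Q * doping * mobility * depth_cm)

n_well = sheet_resistance(1e16, MU_N, 2e-4)   # 2 um deep well (assumed)
p_well = sheet_resistance(1e16, MU_P, 2e-4)
print(f"n-well: {n_well:.0f} ohm/sq, p-well: {p_well:.0f} ohm/sq")
```

For equal doping and depth, the ratio of sheet resistances is simply the mobility ratio µn/µp, roughly a factor of three in favour of the n-well.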
The simplified process sequence (shown in Figure 12.41) for the fabrication of CMOS
integrated circuits on a p-type silicon substrate is as follows:

N-well regions are created for the PMOS transistors by impurity implantation into the
substrate.
This is followed by the growth of a thick oxide in the regions surrounding the NMOS
and PMOS active regions.
The thin gate oxide is subsequently grown on the surface through thermal
oxidation.
After this, the n+ and p+ regions (source, drain and channel-stop implants) are
created.
The metallization step (creation of the metal interconnects) forms the final step of this
process.

Fig 12.41: Simplified Process Sequence For Fabrication Of CMOS ICs


The integrated circuit may be viewed as a set of patterned layers of doped silicon,
polysilicon, metal and insulating silicon dioxide, since each processing step requires that
certain areas be defined on the chip by appropriate masks. A layer is patterned before
the next layer of material is applied on the chip. A process called lithography is used to
transfer a pattern to a layer. This must be repeated for every layer, using a different
mask, since each layer has its own distinct requirements.
We illustrate the fabrication steps involved in patterning silicon dioxide through optical
lithography, using Figure 12.42 which shows the lithographic sequences.

Fig 12.42: Process steps required for patterning of silicon dioxide


First, an oxide layer is created on the substrate by thermal oxidation of the silicon
surface. This oxide surface is then covered with a layer of photoresist. Photoresist is a
light-sensitive, acid-resistant organic polymer which is initially insoluble in the
developing solution. On exposure to ultraviolet (UV) light, the exposed areas become
soluble and can be removed by the developing solvent. Some areas of the surface are
covered with a mask during exposure so as to expose the photoresist selectively: the
masked areas are shielded, whereas the areas that are not shielded become soluble.
There are two types of photoresist, positive and negative. Positive photoresist is initially
insoluble but becomes soluble after exposure to UV light, whereas negative photoresist
is initially soluble but becomes insoluble (hardened) after exposure to UV light. The
process sequence described here uses positive photoresist.

Negative photoresists are more sensitive to light, but their photolithographic resolution
is not as high as that of the positive photoresists. Hence, the use of negative
photoresists is less common in manufacturing high-density integrated circuits.
The exposed portions of the photoresist are removed by a solvent after the UV
exposure step. The silicon dioxide regions not covered by the hardened photoresist are
then etched away using a chemical solvent (HF acid) or a dry-etch (plasma etch)
process. On completion of this step, we are left with an oxide window that reaches down
to the silicon surface. Another solvent is used to strip away the remaining photoresist
from the silicon dioxide surface. The patterned silicon dioxide feature is shown in
Figure 12.43.

Fig 12.43: The result of a single photolithographic patterning sequence on silicon dioxide
The sequence of process steps illustrated in detail actually accomplishes a single pattern
transfer onto the silicon dioxide surface. The fabrication of semiconductor devices
requires several such pattern transfers to be performed on silicon dioxide, polysilicon,
and metal. The basic patterning process used in all fabrication steps, however, is quite
similar to the one described above. Also note that for the accurate generation of the
high-density patterns required in submicron devices, electron-beam (E-beam)
lithography is used instead of optical lithography.
In this section, we will examine the main processing steps involved in fabrication of an
n-channel MOS transistor on a p-type silicon substrate.

The first step of the process is the oxidation of the silicon substrate (Fig 12.44(a)),
which creates a relatively thick silicon dioxide layer on the surface. This oxide layer is
called field oxide (Fig. 12.44(b)). The field oxide is then selectively etched to expose the
silicon surface on which the transistor will be created (Fig. 12.44(c)). After this the
surface is covered with a thin, high-quality oxide layer. This oxide layer will form the
gate oxide of the MOS transistor (Fig. 12.44(d)). Then a polysilicon layer is deposited on
the thin oxide (Fig 12.44(e)). Polysilicon is used both as a gate electrode material for
MOS transistors and as an interconnect medium in silicon integrated circuits. The
resistivity of polysilicon, which is usually high, is reduced by doping it with impurity
atoms.
Deposition is followed by the patterning and etching of the polysilicon layer to form the
interconnects and the MOS transistor gates (Fig. 12.44(f)). The thin gate oxide not
masked by polysilicon is also etched away, exposing the bare silicon surface on which
the drain and source junctions are to be formed (Fig 12.44(g)). Diffusion or ion
implantation is used to dope the entire silicon surface with a high concentration of
impurities (in this case donor atoms, to produce n-type doping). Fig 12.44(h) shows two
n-type regions (source and drain junctions) in the p-type substrate, formed as the
doping penetrates the exposed areas of the silicon surface. The penetration of impurity
doping into the polysilicon reduces its resistivity. The polysilicon gate is patterned
before the doping, and it precisely defines the location of the channel region and hence
the locations of the source and drain regions. This process is therefore called a
self-aligning process.
The entire surface is again covered with an insulating layer of silicon dioxide after the
source and drain regions are completed (Fig 12.44(i)). Next, contact windows for the
source and drain are patterned into the oxide layer (Fig. 12.44(j)). Interconnects are
formed by evaporating aluminium onto the surface (Fig 12.44(k)), which is followed by
patterning and etching of the metal layer (Fig 12.44(l)). A second or third layer of metal
interconnect can be added by depositing another oxide layer, cutting via holes, and
depositing and patterning the metal.

Fig 12.44: Process flow for the fabrication of an n-type MOSFET on p-type silicon
We now return to the generalized fabrication sequence of n-well CMOS integrated
circuits. The following figures illustrate some of the important process steps of the
fabrication of a CMOS inverter, via a top view of the lithographic masks and a
cross-sectional view of the relevant areas.
The n-well CMOS process starts with a moderately doped (impurity concentration
typically less than 10^15 cm^-3) p-type silicon substrate. Then an initial oxide layer is
grown on the entire surface. The first lithographic mask defines the n-well region. Donor
atoms, usually phosphorus, are implanted through this window in the oxide. Once the
n-well is created, the active areas of the nMOS and pMOS transistors can be defined.

The creation of the n-well region is followed by the growth of a thick field oxide in the
areas surrounding the transistor active regions, and of a thin gate oxide on top of the
active regions. The two most critical fabrication parameters are the thickness and
quality of the gate oxide, which strongly affect the operational characteristics of the
MOS transistor as well as its long-term stability.
Chemical vapor deposition (CVD) is used to deposit the polysilicon layer, which is then
patterned by dry (plasma) etching. The resulting polysilicon lines function as the gate
electrodes of the nMOS and pMOS transistors and as their interconnects. The polysilicon
gates also act as self-aligned masks for the source and drain implantations.
The n+ and p+ regions are implanted into the substrate and into the n-well using a set
of two masks. Ohmic contacts to the substrate and to the n-well are also implanted in
this process step.

CVD is again used to deposit an insulating silicon dioxide layer over the entire wafer.
After this, the contacts are defined and etched away, exposing the silicon or polysilicon
contact windows. These contact windows are essential for completing the circuit
interconnections using the metal layer, which is patterned in the next step.

Metal (aluminium) is deposited over the entire chip surface by metal evaporation, and
the metal lines are patterned through etching. Since the wafer surface is non-planar, the
quality and integrity of the metal lines created in this step are critical to circuit
reliability.
The composite layout and the resulting cross-sectional view of the chip show one nMOS
and one pMOS transistor (built in the n-well), along with the polysilicon and metal
interconnections. The final step is to deposit the passivation layer (for protection) over
the chip, except for the wire-bonding pad areas.
This completes the fabrication of the CMOS inverter using n-well technology.

Recap
In this lecture you have learnt the following
Motivation
N-well / P-well Technologies
Silicon on Insulator (SOI)
Twin well Technology

Congratulations, you have finished Lecture 12.

Module 3 : Fabrication Process and Layout Design Rules


Lecture 13 : Layout Design Rules
Objectives
In this lecture you will learn the following
Motivation
Types of Design Rules
Layer Representations
Stick Diagrams
13.1 Motivation
In VLSI design, as processes become more and more complex, need for the designer to
understand the intricacies of the fabrication process and interpret the relations between
the different photo masks is really trouble some. Therefore, a set of layout rules, also
called design rules, has been defined. They act as an interface or communication link
between the circuit designer and the process engineer during the manufacturing phase.
The objective associated with layout rules is to obtain a circuit with optimum yield
(functional circuits versus non-functional circuits) in as small as area possible without
compromising reliability of the circuit. In addition, Design rules can be conservative or
aggressive, depending on whether yield or performance is desired. Generally, they are a
compromise between the two. Manufacturing processes have their inherent limitations in
accuracy. So the need of design rules arises due to manufacturing problems like

Photoresist shrinkage and tearing.
Variations in material deposition, temperature and oxide thickness.
Impurities.
Variations across a wafer.

These lead to various problems, such as:

Transistor problems:
Variations in threshold voltage: these may occur due to variations in oxide
thickness, ion implantation and the poly layer.
Changes in source/drain diffusion overlap.
Variations in the substrate.

Wiring problems:
Diffusion: variations in doping result in variations in resistance and capacitance.
Poly, metal: variations in height and width result in variations in resistance and
capacitance.
Shorts and opens.

Oxide problems:
Variations in height.
Lack of planarity.

Via problems:
A via may not be cut all the way through.
An undersized via has too much resistance.
A via may be too large and create a short.

To reduce these problems, the design rules specify certain geometric constraints on the
layout artwork, so that the patterns on the processed wafers preserve the topology and
geometry of the designs. These consist of minimum-width and minimum-spacing
constraints and requirements between objects on the same or different layers. Apart
from forming a definite set of rules, design rules also come from experience.
13.2 Types of Design Rules
The design rules primarily address two issues:
1. The geometrical reproduction of features that can be reproduced by the
mask-making and lithographic process, and
2. The interaction between different layers.
There are primarily two approaches to describing design rules.
1. Scalable Design Rules (e.g. SCMOS, λ-based design rules):
In this approach, all rules are defined in terms of a single parameter λ. The rules
are chosen so that a design can easily be ported across a cross-section of industrial
processes, making the layout portable. Scaling can be done simply by changing the
value of λ.
The key disadvantages of this approach are:
1. Linear scaling is possible only over a limited range of dimensions.
2. Scalable design rules are conservative, resulting in over-dimensioned and less
dense designs.
3. For these reasons, such rules are not used in real industrial processes.
2. Absolute Design Rules (e.g. µ-based design rules):
In this approach, the design rules are expressed in absolute dimensions (e.g.
0.75 µm) and can therefore exploit the features of a given process to a maximum
degree. Here, scaling and porting are more demanding and have to be performed
either manually or using CAD tools. These rules also tend to be more complex,
especially for deep submicron processes.
The fundamental unit in the definition of a set of design rules is the minimum line
width. It stands for the minimum mask dimension that can be safely transferred to
the semiconductor material. Even for the same minimum dimension, design rules
tend to differ from company to company and from process to process. CAD tools
now allow designs to migrate between compatible processes.
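The portability of λ-based rules can be sketched as follows: the whole rule set is instantiated for a new process by changing λ alone. The rule names and values below are a hypothetical SCMOS-style subset, not an actual rule deck:

```python
# A lambda-based rule set is ported between processes by changing lambda alone.
# The rule names and values below are a hypothetical SCMOS-style subset.

LAMBDA_RULES = {               # rule name -> size in units of lambda
    "poly_min_width": 2,
    "metal1_min_width": 3,
    "poly_to_poly_spacing": 2,
    "contact_size": 2,
}

def to_microns(rules, lam_um):
    """Instantiate a lambda rule set for a process with the given lambda (um)."""
    return {name: n_lambda * lam_um for name, n_lambda in rules.items()}

# The same rules, instantiated for two hypothetical processes:
for lam in (0.5, 0.125):       # lambda = 0.5 um and 0.125 um (assumed)
    print(f"lambda = {lam} um -> {to_microns(LAMBDA_RULES, lam)}")
```

The conservatism mentioned above is visible here: every dimension is forced to be an integer multiple of λ, so an absolute (µ-based) rule deck can always be at least as dense.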

13.3 Layer Representations


As the complexity of CMOS processes increases, visualizing all of the mask levels used
in the actual fabrication process becomes difficult. The layer concept translates these
masks into a set of conceptual layout levels that are easier for the circuit designer to
visualize. From the designer's viewpoint, all CMOS designs are based on the following
entities:

Two different substrates and/or wells: p-type for NMOS and n-type for PMOS.
Diffusion regions (p+ and n+): these define the areas where transistors can be
formed, and are also called active areas. Diffusions of the inverse type are needed
to implement contacts to the well or to the substrate; these are called select
regions.
Transistor gate electrodes: the polysilicon layer.
Metal interconnect layers.
Interlayer contacts and via layers.

The layers of typical CMOS processes are represented in the various figures in terms of:
A color scheme (Mead-Conway colors).
Other color schemes designed to differentiate CMOS structures.
Varying stipple patterns.
Varying line styles.

13.31 Mead-Conway color coding for layers

An example of the layer representation of a CMOS inverter using the above design rules
is shown below.

Figure 13.32: CMOS Inverter Layout


13.4 Stick Diagrams
Another popular method of symbolic design is "sticks" layout. In this, the designer
draws a freehand sketch of a layout, using colored lines to represent the various process
layers such as diffusion, metal and polysilicon. Where polysilicon crosses diffusion,
transistors are created, and where metal wires join diffusion or polysilicon, contacts are
formed.
This notation indicates only the relative positioning of the various design components.
The absolute coordinates of these elements are determined automatically by the editor
using a compactor. The compactor translates the design rules into a set of constraints
on the component positions, and solves a constrained optimization problem that attempts
to minimize the area or another cost function.
The advantage of this symbolic approach is that the designer does not have to worry
about design rules, because the compactor ensures that the final layout is physically
correct. The disadvantage of the symbolic approach is that the outcome of the
compaction phase is often unpredictable: the resulting layout can be less dense than
what is obtained with the manual approach. In addition, a stick diagram does not show exact
placement, transistor sizes, wire lengths, wire widths, or tub boundaries.
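The constraint-solving step performed by the compactor can be sketched in a few lines. The toy script below is an illustrative assumption, not any real tool's API: it treats 1-D compaction as a longest-path problem, where each design rule becomes a minimum-spacing constraint x[j] >= x[i] + d and the solver finds the smallest coordinates satisfying all of them.

```python
# Hypothetical sketch of 1-D "sticks" compaction: spacing rules become
# constraints x[j] >= x[i] + d, and the compactor finds the smallest
# coordinates satisfying all of them (a longest-path computation).

def compact(n, constraints):
    """n components; constraints = [(i, j, d)] meaning x[j] >= x[i] + d.
    Returns minimal non-negative coordinates (assumes no positive cycles)."""
    x = [0.0] * n
    changed = True
    while changed:                      # Bellman-Ford-style relaxation
        changed = False
        for i, j, d in constraints:
            if x[j] < x[i] + d:
                x[j] = x[i] + d
                changed = True
    return x

# Example: three components with illustrative minimum spacings (in lambda)
print(compact(3, [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 4.0)]))
```

Here the binding constraints place the components at 0, 2 and 5: the direct 0-to-2 spacing of 4 is dominated by the chained path 0-1-2 of length 5, which is exactly the longest-path behavior a compactor exploits.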

For example, stick diagram for CMOS Inverter is shown below.

Figure 13.41: Stick Diagram of a CMOS Inverter

Recap
In this lecture you have learnt the following
Motivation
Types of Design Rules
Layer Representations
Stick Diagrams

Congratulations, you have finished Lecture 13.

Module 3 : Fabrication Process and Layout Design Rules


Lecture 14 : λ-based Design Rules
Objectives
In this lecture you will learn the following
Background
λ-based Design Rules
14.1 Background
As we studied in the last lecture, layout rules are used to prepare the photomasks used
in the fabrication of integrated circuits. The rules provide the necessary communication
link between the circuit designer and the process engineer. Design rules represent the best
possible compromise between performance and yield.
The design rules primarily address two issues:
1. The geometrical reproduction of features that can be reproduced by the mask-making
and lithographic processes.
2. The interaction between different layers.
Design rules can be specified by two different approaches:
1. Micron-based design rules
2. λ-based design rules
λ-based layout design rules were originally devised to simplify the industry-standard
micron-based design rules and to allow scaling capability across processes. It must be
emphasized, however, that most submicron CMOS process design rules do not
lend themselves to straightforward linear scaling. The use of λ-based design rules must
therefore be handled with caution in sub-micron geometries.
In further sections of this lecture, we will present a detailed study of λ-based design
rules.
14.2 λ-based Design Rules
Features of λ-based Design Rules: λ-based design rules have the following features:
λ is half the size of a minimum feature.
All dimensions are specified as integer multiples of λ.
Specifying λ particularizes the scalable rules.
Parasitics are generally not specified in λ units.
These rules specify the geometry of masks that will provide reasonable yields.

Guidelines for using λ-based Design Rules:

The minimum line width of poly is 2λ and the minimum line width of diffusion is 2λ.

The minimum distance between two diffusion layers is 3λ.

It is necessary for the poly to completely cross the active region; otherwise the transistor
created at the crossing of diffusion and poly will be shorted by a diffused path between
source and drain.
Contact cut on metal

The contact window will be 2λ by 2λ, that is, the minimum feature size, while the metal
overlap is 4λ by 4λ for reliable contacts.
In Metal

Two metal wires keep a 3λ distance between them to overcome capacitive coupling and
high-frequency coupling. Metal wires can be made as wide as needed to decrease
resistance.
Butting contact

A butting contact is used to make a poly-to-diffusion contact. The window's original width is 4λ,
but on overlapping the width is 2λ.
So the actual contact area is 6λ by 4λ.
The distance between two wells depends on the well potentials, as shown above. The
reason for 8λ is that if both wells are at the same high potential, the depletion regions
between them may touch each other, causing punch-through. The reason for 6λ is that if
the wells are at different potentials, the depletion region of one well will be smaller, so
the two depletion regions will not touch each other; hence 6λ is good enough.

The active region has length 10λ, which is distributed as follows:
2λ for source diffusion
2λ for drain diffusion
2λ for channel length
2λ for source-side encroachment
2λ for drain-side encroachment
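The scalability of λ-based rules can be illustrated with a small script. The rule names and the example λ value below are assumptions for illustration only; the λ counts are the ones quoted in this lecture.

```python
# Hedged illustration of lambda-based scaling: all drawn dimensions are
# integer multiples of lambda, so one table of rules serves any process
# that supplies its own lambda value (valid only where linear scaling
# holds, i.e. not in deep submicron processes, as cautioned above).

RULES_IN_LAMBDA = {          # counts taken from the guidelines in the text
    "poly_width": 2,
    "diffusion_width": 2,
    "diffusion_spacing": 3,
    "contact_cut": 2,
    "metal_spacing": 3,
    "active_length": 10,     # 2 each: source, drain, channel, 2 encroachments
}

def physical_rules(lambda_um):
    """Convert lambda units to micrometres for a given process."""
    return {name: n * lambda_um for name, n in RULES_IN_LAMBDA.items()}

print(physical_rules(0.5))   # e.g. a process with lambda = 0.5 um
```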
Recap
In this lecture you have learnt the following
Background
λ-based Design Rules
Congratulations, you have finished Lecture 14.

Module 4 : Propagation Delays in MOS


Lecture 15 : CMOS Inverter Characteristics
Objectives
In this lecture you will learn the following
CMOS Inverter Characteristics
Noise Margins
Regions of operation
Beta-n by Beta-p ratio
15.1 CMOS Inverter Characteristics
The complementary CMOS inverter is realized by the series connection of a p- and an n-device as in fig 15.11.

Fig 15.11: CMOS Inverter

Fig 15.12: I-V characteristics of PMOS & NMOS

Fig 15.13: Transfer Characteristics of CMOS


Inverter characteristics:
In the graphical representation (fig 15.12), the I-V characteristics of the p-device are
reflected about the x-axis. This step is followed by taking the absolute value of the
p-device Vds and superimposing the two characteristics. Relating Vinn and Vinp and
setting Idsn = -Idsp gives the desired transfer characteristic of the CMOS inverter, as in
fig 15.13.
15.2 Noise Margins
Noise margin is a parameter closely related to the input-output voltage characteristics.
This parameter allows us to determine the allowable noise voltage on the input of a gate
so that the output will not be affected. The specification most commonly used to specify
noise margin (or noise immunity) is in terms of two parameters: the LOW noise margin,
NML, and the HIGH noise margin, NMH. With reference to fig 15.2, NML is defined as the
difference in magnitude between the maximum LOW output voltage of the driving gate
and the maximum LOW input voltage recognized by the driven gate. Thus,

NML = VILmax - VOLmax

The value of NMH is the difference in magnitude between the minimum HIGH output voltage
of the driving gate and the minimum HIGH input voltage recognized by the receiving
gate. Thus,

NMH = VOHmin - VIHmin

where
VIHmin = minimum HIGH input voltage
VILmax = maximum LOW input voltage
VOHmin = minimum HIGH output voltage
VOLmax = maximum LOW output voltage
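The two noise margins follow directly from the four voltage levels defined above. The numeric levels in the sketch below are illustrative assumptions, not values for a specific process.

```python
# Noise margins from the four defined voltage levels:
#   NML = VILmax - VOLmax,   NMH = VOHmin - VIHmin

def noise_margins(v_ol_max, v_il_max, v_ih_min, v_oh_min):
    nml = v_il_max - v_ol_max   # LOW noise margin
    nmh = v_oh_min - v_ih_min   # HIGH noise margin
    return nml, nmh

# Illustrative levels for a 5 V supply (assumed, not process-specific)
nml, nmh = noise_margins(v_ol_max=0.4, v_il_max=1.2,
                         v_ih_min=3.0, v_oh_min=4.4)
print(nml, nmh)   # approximately 0.8 V and 1.4 V for these example levels
```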

Fig 15.2: Noise Margin diagram


15.3: Regions of Operation
The operation of the CMOS inverter can be divided into five regions. The behavior of the
n- and p-devices in each region may be found using the basic device equations.

We will describe each region in detail.
Region A: This region is defined by 0 <= Vin < Vtn, in which the n-device is cut off
(Idsn = 0) and the p-device is in the linear region. Since Idsn = -Idsp, the drain-to-source
current Idsp for the p-device is also zero. Since Vdsp = Vout - VDD, with Idsp = 0 we get
Vdsp = 0, and hence the output voltage is Vout = VDD.
Region B: This region is characterized by Vtn <= Vin < VDD/2, in which the p-device
is in its nonsaturated region (Vds != 0) while the n-device is in saturation. The
equivalent circuit for the inverter in this region can be represented by a resistor for the
p-transistor and a current source for the n-transistor, as shown in fig 15.31. The saturation
current Idsn for the n-device is obtained by setting Vgs = Vin. This results in

Idsn = (βn/2)(Vin - Vtn)²

where Vtn = threshold voltage of the n-device, μn = mobility of electrons, Wn = channel
width of the n-device and Ln = channel length of the n-device (βn = μn·Cox·Wn/Ln).
The current for the p-device can be obtained by noting that Vgs = (Vin - VDD) and
Vds = (Vout - VDD), and therefore

Idsp = -βp[(Vin - VDD - Vtp)(Vout - VDD) - (Vout - VDD)²/2]

where Vtp = threshold voltage of the p-device, μp = mobility of holes, Wp = channel width
of the p-device and Lp = channel length of the p-device. Equating Idsn = -Idsp, the output
voltage Vout can be expressed in terms of Vin.

Fig 15.31: Equivalent circuit of MOSFET in region B


Region C: In this region both the n- and p-devices are in saturation. This is represented
by fig 15.32, which shows two current sources in series.

Fig 15.32: Equivalent circuit of MOSFET in region C


The saturation currents for the two devices are given by

Idsn = (βn/2)(Vin - Vtn)² and Idsp = -(βp/2)(Vin - VDD - Vtp)²

By setting Idsn = -Idsp, this yields

Vin = (VDD + Vtp + Vtn·√(βn/βp)) / (1 + √(βn/βp))

which implies that region C exists only for one value of Vin. We have assumed that a
MOS device in saturation behaves like an ideal current source, with the drain-to-source
current being independent of Vds. In reality, as Vds increases, Ids also increases
slightly; thus region C has a finite slope. The significant factor to be noted is that in
region C we have two current sources in series, which is an unstable condition. Thus
a small change in the input voltage has a large effect at the output. This makes the output
transition very steep, which contrasts with the equivalent nMOS inverter
characteristic. The above expression for Vin is particularly useful, since it provides the
basis for defining the gate threshold Vinv, which corresponds to the state where
Vout = Vin. This region also defines the gain of the CMOS inverter when used as a small-signal
amplifier.
Region D:

Fig 15.33: Equivalent circuit of MOSFET in region D


This region is described by VDD/2 < Vin <= VDD + Vtp. The p-device is in saturation
while the n-device operates in its nonsaturated region. This condition is represented
by the equivalent circuit shown in fig 15.33. The two currents may be written as

Idsn = βn[(Vin - Vtn)Vout - Vout²/2] and Idsp = -(βp/2)(Vin - VDD - Vtp)²

with Idsn = -Idsp.

The output voltage becomes

Vout = (Vin - Vtn) - √((Vin - Vtn)² - (βp/βn)(Vin - VDD - Vtp)²)
Region E: This region is defined by the input condition Vin >= VDD + Vtp, in which the
p-device is cut off (Idsp = 0) and the n-device is in the linear mode. Here,
Vgsp = Vin - VDD
which is more positive than Vtp. The output in this region is Vout = 0. From the transfer
curve, it may be seen that the transition between the two states is very steep. This
characteristic is very desirable because the noise immunity is maximized.

15.4 βn/βp ratio:

Figure 15.4: βn/βp graph


The gate-threshold voltage, Vinv, where Vin = Vout, is dependent on βn/βp. Thus, for a
given process, if we want to change βn/βp we need to change the channel dimensions,
i.e., channel length L and channel width W. It can be seen that as the ratio
βn/βp is decreased, the transition region shifts from left to right; however, the output
voltage transition remains sharp.
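The dependence of Vinv on βn/βp can be checked numerically with the region-C switching-threshold expression. The supply and threshold values below are illustrative assumptions; Vtp is passed as a magnitude.

```python
from math import sqrt

# Gate switching threshold Vinv (where Vout = Vin), from equating the two
# saturation currents in region C. vtp_mag is |Vtp|; all values assumed.

def v_inv(vdd, vtn, vtp_mag, beta_ratio):
    """beta_ratio = beta_n / beta_p."""
    r = sqrt(beta_ratio)
    return (vdd - vtp_mag + r * vtn) / (1 + r)

for ratio in (0.5, 1.0, 2.0):
    print(ratio, round(v_inv(5.0, 0.7, 0.7, ratio), 3))
```

For symmetric thresholds and βn = βp this gives Vinv = VDD/2, and decreasing βn/βp pushes Vinv to the right, exactly the shift the figure shows.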
Recap
In this lecture you have learnt the following
CMOS Inverter Characteristics
Noise Margins
Regions of operation
Beta-n by Beta-p ratio
Congratulations, you have finished Lecture 15.

Module 4 : Propagation Delays in MOS


Lecture 16 : Propagation Delay Calculation of CMOS Inverter
Objectives
In this lecture you will learn the following
Few Definitions
Quick Estimates
Rise and Fall times Calculation
16.1 Few Definitions
Before calculating the propagation delay of the CMOS inverter, we will define some basic
terms.
Switching speed: limited by the time taken to charge and discharge the load capacitance CL.
Rise time, tr: time for a waveform to rise from 10% to 90% of its steady-state value.
Fall time, tf: time for a waveform to fall from 90% to 10% of its steady-state value.
Delay time, td: time difference between the input transition (50%) and the 50% output
level.

Fig 16.1: Propagation delay graph


The propagation delay tp of a gate defines how quickly it responds to a change at its
inputs; it expresses the delay experienced by a signal when passing through a gate. It is
measured between the 50% transition points of the input and output waveforms, as
shown in figure 16.1 for an inverting gate. The tpLH defines the response time of the
gate for a low-to-high output transition, while tpHL refers to a high-to-low transition.
The propagation delay is taken as the average of the two:

tp = (tpLH + tpHL)/2

16.2 Quick Estimates:

We will give an example of how to calculate a quick estimate. From fig 16.22, we can write
the following equations.

Fig 16.21: Example CMOS Inverter Circuit

Fig. 16.22 : Propagation Delay of above MOS circuit


From figure 16.21, when Vin = 0 the capacitor CL charges through the PMOS, and when
Vin = 5 the capacitor discharges through the N-MOS. The capacitor current is

From this the delay times can be derived as

The expressions for the propagation delays as denoted in the figure (16.22) can be
easily seen to be
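The symbolic delay expressions did not survive in this copy, but the quick-estimate idea is simply to treat the (dis)charging current as approximately constant, so the time to move the output by dV is t = CL·dV/I. The element values below are illustrative assumptions.

```python
# Quick delay estimate: a constant current I swinging a load CL by dV
# takes t = CL * dV / I. Numbers are assumed for illustration only.

def delay_estimate(c_load, delta_v, i_avg):
    """Time for a constant current i_avg to swing c_load by delta_v."""
    return c_load * delta_v / i_avg

CL = 100e-15          # 100 fF load (assumed)
I = 200e-6            # 200 uA average (dis)charge current (assumed)
tp = delay_estimate(CL, 2.5, I)   # 50% swing of a 5 V supply
print(tp)             # about 1.25 ns
```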

16.3 Rise and Fall Times


Figure 16.21 shows the familiar CMOS inverter with a capacitive load CL that represents
the load capacitance (the inputs of the next gates, the output of this gate, and routing). Of
interest is the voltage waveform Vout(t) when the input is driven by a step waveform
Vin(t), as shown in figure 16.22.

Fig 16.31: Trajectory of the n-transistor operating point


Figure 16.31 shows the trajectory of the n-transistor operating point as the input
voltage, Vin(t), changes from 0 V to VDD. Initially, the n-device is cut off and the
load capacitor is charged to VDD. This is illustrated by X1 on the characteristic curve.
Application of a step voltage (VGS = VDD) at the input of the inverter changes the
operating point to X2. From there onwards the trajectory moves on the VGS = VDD
characteristic curve towards point X3 at the origin.
Thus it is evident that the fall time consists of two intervals:
1. tf1 = period during which the capacitor voltage, Vout, drops from 0.9VDD to
(VDD - Vtn)
2. tf2 = period during which the capacitor voltage, Vout, drops from (VDD - Vtn) to
0.1VDD.
The equivalent circuits that illustrate the above behavior are show in figure (16.32 &
16.33).

Figure 16.32: Equivalent circuit for showing behav. of tf1

Figure 16.33: Equivalent circuit for showing behav. of tf2


As we saw in the last section, the delay periods can be derived using the general equation
CL·dVout/dt = -Idsn. From figure (16.32), while in saturation,

Idsn = (βn/2)(VDD - Vtn)²

Integrating from t = t1, corresponding to Vout = 0.9VDD, to t = t2, corresponding to
Vout = (VDD - Vtn), results in

tf1 = 2CL(Vtn - 0.1VDD) / (βn(VDD - Vtn)²)

Fig 16.34: Rise and Fall time graph

When the n-device begins to operate in the linear region, the discharge current is no
longer constant. The time tf2 taken to discharge the capacitor voltage from (VDD - Vtn)
to 0.1VDD can be obtained as before. In the linear region,

Idsn = βn[(VDD - Vtn)Vout - Vout²/2]

Thus the complete term for the fall time is tf = tf1 + tf2.

The fall time tf can be approximated as

tf ≈ k·CL/(βn·VDD), with k ≈ 3 to 4 for typical threshold voltages

From this expression we can see that the delay is directly proportional to the load
capacitance. Thus, to achieve high-speed circuits one has to minimize the load
capacitance seen by a gate. Secondly, it is inversely proportional to the supply voltage, i.e.,
as the supply voltage is raised the delay time is reduced. Finally, the delay is
inversely proportional to the βn of the driving transistor, so increasing the width of a
transistor decreases the delay.
Due to the symmetry of the CMOS circuit, the rise time can be similarly obtained. For
equally sized n and p transistors (where βn = 2βp), tf = tr/2.
Thus the fall time is faster than the rise time, primarily due to the different carrier mobilities
associated with the p and n devices. Thus, if we want tf = tr we need to make βn/βp = 1.
This implies that the channel width of the p-device must be increased to approximately
2 to 3 times that of the n-device.
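The symmetry argument above can be sketched numerically. Since tf scales as CL/(βn·VDD) and tr as CL/(βp·VDD), the ratio tr/tf equals βn/βp; the mobility values below are assumed round numbers used only to illustrate the sizing rule.

```python
# tr/tf = beta_n/beta_p for equal channel lengths, since beta ~ mu*W/L.
# With an assumed mobility ratio mu_n/mu_p = 2.5, equal rise and fall
# times require widening the p-device by the same factor.

def rise_fall_ratio(mu_n, mu_p, w_n, w_p):
    """Ratio tr/tf for equal channel lengths."""
    beta_n = mu_n * w_n
    beta_p = mu_p * w_p
    return beta_n / beta_p

print(rise_fall_ratio(mu_n=500, mu_p=200, w_n=1.0, w_p=1.0))   # tr = 2.5*tf
print(rise_fall_ratio(mu_n=500, mu_p=200, w_n=1.0, w_p=2.5))   # tr = tf
```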
The propagation delays if calculated as indicated before turn out to be,

Figure 16.35: Rise and Fall time graph of Output w.r.t Input
If we consider the rise time and fall time of the input signal as well, as shown in fig
16.35, we have

These are the rms values for the propagation delays.


Recap
In this lecture you have learnt the following
Few Definitions
Quick Estimates
Rise and Fall times Calculation

Congratulations, you have finished Lecture 16.

Module 4 : Propagation Delays in MOS


Lecture 17 : Pseudo NMOS Inverter
Objectives
In this lecture you will learn the following
Introduction
Different Configurations with NMOS Inverter
Worries about Pseudo NMOS Inverter
Calculation of Capacitive Load
17.1 Introduction
The pseudo-NMOS inverter uses a p-device pull-up, or load, that has its gate permanently
grounded. An n-device pull-down, or driver, is driven with the input signal. This is roughly
equivalent to the use of a depletion load in NMOS technology and the style is thus called
pseudo-NMOS. The circuit is used in a variety of CMOS logic circuits. Here the PMOS is in
the linear region for most of the time, so its resistance is low and hence the RC time
constant is low. When the driver is turned on, a constant DC current flows in the circuit.

Fig 17.1: Pseudo-NMOS Inverter Circuit


17.2 Different Configurations with NMOS Inverter

17.3 CMOS Summary


Logic consumes no static power in the CMOS design style. However, signals have to be
routed to the n pull-down network as well as to the p pull-up network, so the load
presented to every driver is high. This is exacerbated by the fact that n- and p-channel
transistors cannot be placed close together, as they are in different wells which have to
be kept well separated in order to avoid latchup.

17.4 Pseudo nMOS Design Style


The CMOS pull-up network is replaced by a single pMOS transistor with its gate
grounded. Since the pMOS is not driven by signals, it is always 'on'. The effective gate
voltage seen by the pMOS transistor is Vdd. Thus the overvoltage on the p-channel gate
is always Vdd - VTp. When the nMOS is turned 'on', a direct path between supply and
ground exists and static power will be drawn. However, the dynamic power is reduced
due to lower capacitive loading.
17.5 Static Characteristics
As we sweep the input voltage from ground to Vdd, we encounter the following regimes of
operation:
nMOS off
nMOS saturated, pMOS linear
nMOS linear, pMOS linear
nMOS linear, pMOS saturated

17.6 Low Input

When the input voltage is less than VTn.


The output is high and no current is drawn from the supply.
As we raise the input just above VTn, the output starts falling.
In this region the nMOS is saturated, while the pMOS is linear.

17.7 nMOS saturated, pMOS linear


The input voltage is assumed to be sufficiently low so that the output voltage exceeds
the saturation voltage Vi - VTn. Normally, this voltage will be higher than VTp, so the p
channel transistor is in linear mode of operation. Equating currents through the n and p
channel transistors, we get

The solutions are:

substituting the values of V1 and V2 and choosing the sign which puts V0 in the correct
range, we get

As the input voltage is increased, the output voltage will decrease.


The output voltage will fall below Vi - VTn when

The nMOS is now in its linear mode of operation. The derived equation does not apply
beyond this input voltage.

Recap
In this lecture you have learnt the following
Introduction
Different Configurations with NMOS Inverter
Worries about Pseudo NMOS Inverter
Calculation of Capacitive Load

Congratulations, you have finished Lecture 17.

Module 4 : Propagation Delays in MOS


Lecture 18 : Dependence of Propagation delay on Fan-in and
Fan-out
Objectives
In this lecture you will learn the following
Motivation
Design Techniques for large Fan-in
18.1 Motivation
First we will show how fan-in and fan-out affect the propagation delay, and
then we will analyze how to handle large fan-in.
The propagation delay of a CMOS gate deteriorates rapidly as a function of the fan-in.
Firstly, the large number of transistors (2N) increases the overall capacitance of the gate.
Secondly, a series connection of transistors in either the PUN or the PDN slows the gate as
well, because the effective (dis)charging resistance is increased.
Fan-out has a larger impact on the gate delay in complementary CMOS than in some other
logic styles. In the complementary circuit style, each input connects to both an NMOS and a
PMOS device and presents a load to the driving gate equal to the sum of the gate
capacitances.

Fig 18.1: Dependence of Propagation delay on Fan-in


Thus we can approximate the influence of fan-in and fan-out on the propagation delay of a
complementary CMOS gate as:

tp = a1·FI + a2·FI² + a3·FO

where FI and FO are the fan-in and fan-out, and a1, a2 and a3 are weighting factors which
are a function of the technology.
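The approximation above is commonly written as tp = a1·FI + a2·FI² + a3·FO, with FI the fan-in and FO the fan-out. The coefficient values in the sketch below are arbitrary placeholders, since the real ones depend on the technology.

```python
# Fan-in/fan-out delay model with placeholder, technology-dependent
# coefficients (assumed values in ns for illustration only).

def gate_delay(fan_in, fan_out, a1=0.05, a2=0.01, a3=0.03):
    return a1 * fan_in + a2 * fan_in ** 2 + a3 * fan_out

for fi in (2, 4, 8):
    print(fi, gate_delay(fi, fan_out=2))
```

The quadratic fan-in term is what makes the delay deteriorate rapidly as gates grow wider, which motivates the design techniques of the next section.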
18.2 Design Techniques for large Fan-in

1. Transistor Sizing: Increasing the transistor sizes increases the available
(dis)charging current. But widening the transistors results in larger parasitic
capacitances. This not only affects the propagation delay of the gate but also
presents a larger load to the preceding gate.
2. Progressive Transistor Sizing: Usually we assume that all the intrinsic
capacitances, in a series-connected array of transistors, can be lumped into a
single load capacitance CL, and that no capacitance is present at the internal nodes
of the network.

Fig 18.21: Illustration of Progressive Transistor Sizing


Under these assumptions, making all transistors in a series chain equal in size
makes sense. This model is an over-simplification, however, and becomes more and more
inaccurate with increasing fan-in. Referring to the circuit, we can see that the
capacitance associated with the transistors increases as we go down the chain, and so
a transistor lower in the chain has to discharge an increasing total capacitance.
While transistor MN has to conduct the discharge current of only the load capacitance
CL, M1 has to carry the discharge current of the total capacitance Ctot = C1 +
C2 + ... + CL, which is substantially larger. Consequently a progressive scaling of
the transistors is beneficial: M1 > M2 > M3 > ... > MN. This technique has, for
instance, proven to be advantageous in the decoders of memories, where gates
with large fan-in are common. The effect of progressive sizing can be understood
from the circuit in fig 18.21.
SPICE simulation example:
Taking CL = 15 fF; N = 5; C1 = C2 = C3 = C4 = 10 fF.
When all transistors are of minimum size, SPICE predicts a propagation delay of
1.1 ns. The transistors M5 to M1 are then made progressively wider in such a
way that the width of each transistor is proportional to the total capacitance it has to
discharge: M5 is of minimum size, WM4 = WM5(CL + C4)/CL,
WM3 = WM5(CL + C3 + C4)/CL, and so on. The resulting circuit has a tpHL of
0.81 ns, a reduction of 26.5%.
3. Transistor Ordering: Some signals in complex combinational logic blocks may be
more critical than others: not all inputs of a gate arrive at the same time (for
instance, due to the propagation delays of the preceding blocks). An input signal to a gate is
called critical if it is the last of all inputs to assume a stable value. The path
through the logic which determines the ultimate speed of the structure is called
the critical path. Putting the critical-path transistor closer to the output of the gate
can result in a speed-up. Referring to the figure given below, signal In1 is assumed
to be the critical signal. Suppose signals In2 and In3 are high and In1
undergoes a 0-to-1 transition. Assume also that CL is initially charged high. In the first
case, no path to ground exists until M1 is turned on, so the delay between the arrival
of In1 and the output is determined by the time it takes to discharge CL
+ C1 + C2. In the second case, C1 and C2 are already discharged when In1 changes;
only CL has to be discharged, resulting in a faster response time. Using SPICE, the
tPHL for a 4-input NAND gate was calculated. With the critical input connected to
the bottommost transistor, tpd = 717 ns, and when connected to the uppermost
transistor tpd = 607 ns, an improvement of 15%.
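The progressive-sizing rule from the SPICE example in technique 2 can be sketched directly: M5 stays at minimum size and each lower transistor is widened in proportion to the total capacitance it must discharge, i.e. WM4 = WM5(CL + C4)/CL, WM3 = WM5(CL + C3 + C4)/CL, and so on. The function name and its width-normalization are illustrative assumptions.

```python
# Progressive transistor sizing: width grows with the total capacitance
# a device must discharge, normalized to the load capacitance CL.

def progressive_widths(w_min, c_load, internal_caps):
    """internal_caps ordered from the output side down the chain (C4..C1).
    Returns widths [W5, W4, ..., W1], topmost device first."""
    widths = [w_min]                 # topmost device, minimum size
    total = c_load
    for c in internal_caps:          # walk down the chain toward ground
        total += c
        widths.append(w_min * total / c_load)
    return widths

# Values from the text: CL = 15 fF, four internal nodes of 10 fF each
print(progressive_widths(w_min=1.0, c_load=15e-15,
                         internal_caps=[10e-15] * 4))
```

The widths grow monotonically down the chain, matching the M1 > M2 > ... > MN ordering described above.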

Fig 18.22: Two example circuits for the critical path


Recap
In this lecture you have learnt the following
Motivation
Design Techniques for large Fan-in

Congratulations, you have finished Lecture 18.

Module 4 : Propagation Delays in MOS


Lecture 19 : Analyzing Delay for various Logic Circuits
Objectives
In this lecture you will learn the following
Ratioed Logic
Pass Transistor Logic
Dynamic Logic Circuits
19.1 Ratioed Logic
Instead of a combination of active pull-down and pull-up networks, such a gate consists of
an NMOS pull-down network that realizes the logic function and a simple load device. For
an inverter, the PDN is a single NMOS transistor.

Fig 19.1: Ratioed Logic Circuit


The load can be a passive device, such as a resistor, or an active element such as a transistor.
Let us assume that both the PDN and the load can be represented as linearized resistors. The
operation is as follows: for a low input signal the pull-down network is off and the output
is pulled high by the load. When the input goes high, the driver transistor turns on, and the
resulting output voltage is determined by the resistive division between the impedances
of the pull-down and load networks:
VOL = RD·VDD/(RD + RL)
where RD = pull-down network resistance, RL = load resistance.
To keep the low noise margin high it is important to choose RL >> RD. This style of logic is
therefore called ratioed, because a careful scaling of the impedances (or transistor
sizes) is required to obtain a workable gate. This is in contrast to ratioless logic styles
such as complementary CMOS, where the low and high levels don't depend upon transistor
sizes. As a satisfactory level we keep RL >= 4RD. To achieve this, (W/L)D/(W/L)L > 4.
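The resistive divider above can be evaluated directly; the resistor values below are illustrative assumptions chosen to show the RL >= 4RD guideline, which bounds VOL at VDD/5.

```python
# Ratioed-logic low output level from the resistive divider:
#   VOL = RD * VDD / (RD + RL)

def v_ol(r_driver, r_load, vdd):
    return r_driver * vdd / (r_driver + r_load)

print(v_ol(r_driver=10e3, r_load=40e3, vdd=5.0))   # 1.0 V at the RL = 4*RD limit
print(v_ol(r_driver=10e3, r_load=100e3, vdd=5.0))  # weaker load: lower VOL
```

This also illustrates the ratioed trade-off: making RL larger improves VOL (and the low noise margin) but slows the low-to-high transition, since the load must then charge CL through a bigger resistance.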
19.2 Pass Transistor Logic
The fundamental building block of nMOS dynamic logic circuit, consisting of an nMOS
pass transistor is shown in figure 19.21.

Fig 19.21: Pass Transistor Logic Circuit


The pass transistor MP is driven by the periodic clock signal and acts as an access switch
to either charge up or charge down the parasitic capacitance Cx, depending on the input signal
Vin. Thus there are two possible operations when the clock signal is active: the logic '1'
transfer (charging the capacitance Cx up to a logic-high level) and the logic '0' transfer
(discharging the capacitance Cx to a logic-low level). In either case, the output of the
depletion-load nMOS inverter assumes a logic-low or logic-high level,
depending on the voltage Vx. The pass transistor MP provides the only current path to
the intermediate capacitive node X. When the clock signal becomes inactive (clk = 0), the
pass transistor ceases to conduct, and the charge stored in the parasitic capacitor Cx
continues to determine the output level of the inverter.
Logic 1 Transfer: Assume that Vx = 0 initially. A logic '1' level is applied to
the input terminal, which corresponds to Vin = VOH = VDD. Now the clock signal at the gate
of the pass transistor goes from 0 to VDD at t = 0. It can be seen that the pass transistor
starts to conduct and operates in saturation throughout this cycle, since VDS = VGS and
consequently VDS > VGS - Vtn.
Analysis: The pass transistor operating in the saturation region starts to charge up the
capacitor Cx; thus

Cx·dVx/dt = (βn/2)(VDD - Vx - Vtn)²

This equation for Vx(t) can be solved, and the variation of the node voltage Vx(t) is plotted
as a function of time in fig 19.22. The voltage rises from its initial value of 0 and
approaches Vmax = VDD - Vtn after a long time. The pass transistor turns off when
Vx = Vmax, since then Vgs = Vtn. Therefore Vx can never attain VDD during a logic '1'
transfer; buffering can be used to overcome this problem.
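The saturation of Vx at VDD - Vtn can be checked with a small numerical integration of the charging equation Cx·dVx/dt = (β/2)(VDD - Vx - Vtn)². The device and node values below are assumed for illustration; the point is only that Vx never reaches VDD.

```python
# Euler integration of the logic-'1' transfer through a pass transistor
# (VGS = VDS, so the device stays in saturation). All values assumed.

VDD, VTN = 5.0, 1.0
BETA, CX = 50e-6, 50e-15          # assumed transconductance and node cap
dt, vx = 1e-12, 0.0               # 1 ps time step, node initially at 0 V

for _ in range(200_000):          # simulate 200 ns
    i = 0.5 * BETA * max(VDD - vx - VTN, 0.0) ** 2
    vx += i * dt / CX             # current into Cx raises the node voltage

print(round(vx, 3))               # settles near VDD - VTN = 4.0, not 5.0
```

The drive current collapses quadratically as Vx approaches VDD - Vtn, which is why the final approach is so slow and why the threshold loss motivates the buffering mentioned above.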

Fig 19.22: Node Voltage Vx vs t


Logic 0 Transfer: Assume that Vx = 1 initially. A logic '0' level is applied to the input
terminal, which corresponds to Vin = 0. Now the clock signal at the gate of the pass
transistor goes from 0 to VDD at t = 0. It can be seen that the pass transistor starts to
conduct and operates in the linear mode throughout this cycle, and the drain current flows
in the opposite direction to that of the charge-up case.
Analysis: We can write

The above equation for Vx(t) can be solved as

Plot of Vx(t) is shown in figure 19.23.

Fig 19.23: Node Voltage Vx vs t


19.3 Dynamic Logic Circuits
In the case of static CMOS, for a fan-in of N, 2N transistors are required. In order to reduce
this, various other design styles are used, like pseudo-NMOS logic and pass-transistor
logic; however, the static power consumption in these cases increases. An alternative to
these design styles is dynamic logic, which reduces the number of transistors and at the
same time keeps a check on the static power consumption.
Principle: A block diagram of a dynamic logic circuit is shown in fig 19.31. It uses an
NMOS block to implement its logic.
The operation of this circuit can be explained in two modes:
1. Precharge
2. Evaluation

Fig 19.31: Dynamic CMOS Block Diagram


In the precharge mode, the CLK input is at logic 0. This forces the output to logic 1,
charging the load capacitance to VDD. Since the NMOS transistor M1 is off, the pull-down
path is disabled. There is no static consumption in this case, as there is no direct path
between supply and ground.
In the evaluation mode, the CLK input is at logic 1. Now the output depends on the PDN
block. If there exists a path through the PDN to ground (i.e. the PDN network is ON), the
capacitor CL will discharge; otherwise it remains at logic 1. As there exists only one path
between the output node and a supply rail, which can only be ground, the load capacitor
can discharge only once, and if this happens it cannot charge again until the next precharge
operation. Hence the inputs to the gate can make at most one transition during
evaluation.
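The precharge/evaluate discipline described above can be captured in a tiny behavioral model. This is a toy sketch, not a circuit simulation; the class name and the PDN-as-predicate interface are assumptions for illustration.

```python
# Behavioral model of a dynamic gate: precharge (clk = 0) forces the
# output high; during evaluation (clk = 1) the output can only be
# discharged, and once low it stays low until the next precharge.

class DynamicGate:
    def __init__(self, pdn):          # pdn: inputs -> True if path to ground
        self.pdn = pdn
        self.out = 1

    def tick(self, clk, inputs):
        if clk == 0:
            self.out = 1              # precharge CL to VDD
        elif self.pdn(inputs):
            self.out = 0              # one-way discharge during evaluation
        return self.out

nand2 = DynamicGate(lambda ab: ab[0] and ab[1])   # series PDN of a 2-input NAND
print(nand2.tick(0, (1, 1)))   # precharge: output 1
print(nand2.tick(1, (1, 1)))   # both inputs high, PDN conducts: output 0
print(nand2.tick(1, (0, 1)))   # cannot recharge until next precharge: still 0
```

The last call shows the at-most-one-transition property: even though the PDN no longer conducts, the output stays low until the gate is precharged again.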

Fig 19.32: DOMINO CMOS Block Diagram

Advantages of dynamic logic circuits:


1. As can be seen, the number of transistors required here is N + 2, as compared to
2N in static CMOS circuits.
2. This circuit is still a ratioless circuit, as in the static case; hence progressive sizing
and ordering of the transistors in the PDN block is important.
3. As can be seen, the static power loss is negligible.
Disadvantages of dynamic logic circuits:
1. The penalty paid in such circuits is that the clock must be routed everywhere, to
each such block, as shown in the diagram.
2. The major problem in such circuits is that the output node is at Vdd till the end of
the precharge mode. Now if the CLK in the next block arrives earlier than the
CLK in this block, or the PDN network in this block takes a longer time to
evaluate its output, then the next block will start to evaluate using this erroneous
value.
The second disadvantage can be eliminated by using DOMINO CMOS circuits, shown in
fig 19.32. As can be seen, the output at the end of precharge is inverted by the inverter
to logic 0. Thus the next block will not be evaluated till this output has been
evaluated. As an ending point, it must be noted that this also has a disadvantage: since at
each stage the output is inverted, the logic must be changed to accommodate this.
Recap
In this lecture you have learnt the following
Ratioed Logic
Pass Transistor Logic
Dynamic Logic Circuits

Congratulations, you have finished Lecture 19.

Module 1 : Introduction to VLSI Design


Lecture 2 : System approach to VLSI Design
Objectives
In this lecture you will learn the following:
What is System?
Design Abstraction Levels
Delay and Interconnect Issues in physical circuits
2.1 What is System?
2.1.1 Definition of System
A system is something which gives an output when it is provided with an
input (see figure 1).

Figure 1. A simple System


2.1.2 System-On-Chip (SoC)
As the name suggests, it basically means shrinking the whole system onto a
single chip. The most important feature of the chip is that its functionality
should be comparable to that of the original system. It improves quality,
productivity and performance.

Figure 2. An SoC example

2.2 Design Abstraction levels


Every system can be decomposed into three fundamental domains:
1. Behavioral Domain
2. Structural Domain
3. Physical Domain
In every domain, there are different layers or levels of hierarchy. The following onion
diagram gives a better understanding of this -

Figure 3. Onion Diagram

We can design the system at various layers, which are called design abstraction levels:
1. Architecture
2. Algorithm
3. Modules (or Functions)
4. Logic
5. Switch
6. Circuit
7. Device
In this course, we are only dealing with Logic, Switch and Circuit levels.

Representation examples
Behavioral
Representation

Structural Representation

2.3 Delay and Interconnect Issues in physical circuits


It must be noted that when the adder described (in the above structural representation)
is realized physically, the output may not arrive at the instant the input is given, i.e., if
the input is given at time t = 0, the output is obtained at time t = t1 > 0, where t1 may
range from picoseconds to milliseconds, but is never zero.

Figure 4: Delay in system output


These delays occur in the devices used to realize the system. However, today the
major concern of designers is the interconnecting wires which connect the various
devices. They are the major bottleneck in the speed of today's systems. The delays occur
due to parasitic resistances and capacitances present in the designed circuits.
A detailed discussion on Circuit Interconnects will be done in later lectures.
Recap
In this lecture you have learnt the following
What is System?
Design Abstraction Levels
Delay and Interconnect Issues in physical circuits
Congratulations, you have finished Lecture 2.

Module 4 : Propagation Delays in MOS


Lecture 20 : Analyzing Delay in few Sequential Circuits
Objectives
In this lecture you will learn the delays in following circuits
Motivation
Negative D-Latch
S-R Latch using NOR Gates
Simple Latch using two Inverters (Bistable Element)
Master Slave Flip-Flop
20.1 Motivation
We know that digital circuits are formed from two types of components: (1) combinational
circuits and (2) sequential circuits. Combinational circuit components are used only for
logic implementation and cannot store bits, i.e. they cannot work as memory. Sequential
circuit components can store bits and are hence used as memory elements. How fast a
circuit containing memory elements (i.e. sequential elements) can store or retrieve a
value depends upon the delays in each of such basic sequential elements, e.g. flip-flops.
In the coming sections, we will analyze the basic functionality and delays of such
sequential elements.
20.2 Negative D-Latch
Structure: This circuit consists of a multiplexer and an inverter. Data is fed to the i1
input of the mux, whereas the output is given to the inverter, which in turn is fed to the
i2 input of the mux. The clock is given to the select input.
Working:

Fig 20.2 Negative D-Latch Circuits


When clock = '0', the data is passed on to the output.
When clock = '1', the data gets latched.
This circuit can be converted into a positive-clock latch by giving an inverted clock at
the select input.
Note: a latch is level sensitive; a flip-flop is edge triggered.

20.3 S-R Latch using NOR Gates


Structure: This circuit is designed using two NOR gates. In the first NOR gate, one
input is R, the other input is the complement output Q', and the output of the gate is Q.
In the second NOR gate, the inputs are S and Q, and the output is Q'. Thus we see that
the two gates are connected in a feedback configuration.
Truth Table:

Fig 20.3: S-R Latch circuits using NOR


Note: Synchronous circuit: a circuit is said to be in synchronous mode if the output
data rate equals the clock rate.
Asynchronous circuit: a circuit is said to be in asynchronous mode if the output data
rate is less than the clock rate.
Setup time: the time for which valid data must be present before the clock
edge arrives.
Hold time: the time for which the data must be held after the arrival of the clock
edge. Sufficient setup and hold time must be provided to prevent contention of data.
20.4 Simple Latch using two Inverters (Bistable Element)
Structure: Here the output voltage of ig1 is equal to the input of ig2, and vice-versa.
Fig 20.41 shows the latch using two inverters.

Fig 20.41: Latch using two inverters

Working: Notice that the input and output voltages of ig2 correspond to the output and
input voltages of ig1, respectively. The two voltage transfer characteristics intersect
at three points. Two of them are stable, while the middle point is unstable. The loop
gain at the stable points is less than unity, so if the circuit sits at either of these
points, it remains there. The voltage gain at the third operating point is greater than
unity: if the input has a small perturbation, it is amplified and drives the circuit to one
of the two stable states. Hence this middle state is called the metastable state. Since
the circuit has two stable operating points, it is called bistable. The potential energy is
at a minimum at the two stable operating points, where the voltage gains of both
inverters are close to zero. By contrast, the energy attains its maximum value at the
operating point at which the voltage gains of both inverters are maximum. Thus the
circuit has two stable states corresponding to the two energy minima, and one unstable
state corresponding to the potential energy maximum.
Consider the above circuit at vg1 = vg2 = Vinv, the unstable operating point. Assume that
the input capacitance Cg of each inverter is much larger than its output capacitance Cd.
The drain current of each inverter is equal to the gate current of the other inverter:
id1 = gm*vg1, id2 = gm*vg2 --eq1
where gm represents the transconductance of each inverter. The gate charges q1 and q2 are
q1 = Cg*vg1, q2 = Cg*vg2 --eq2

Fig 20.42: Stability Graph


The small-signal gate current of each inverter can be written as
ig1 = dq1/dt, ig2 = dq2/dt --eq3
Using eq1 and eq3 (the drain current of each inverter charges or discharges the gate of
the other),
Cg*dvg1/dt = -gm*vg2, Cg*dvg2/dt = -gm*vg1 --eq4
In terms of q1 and q2 these equations are given as below:
dq1/dt = -(gm/Cg)*q2 --eq5
dq2/dt = -(gm/Cg)*q1 --eq6
Combining equations eq5 and eq6, we will get
d2q1/dt2 = (gm/Cg)^2 * q1 --eq7
This expression is simplified by using T0 = Cg/gm, the transient time constant.
The time domain solution is
dVo1(t) = A*e^(t/T0) + B*e^(-t/T0) --eq8
The initial conditions are the small perturbations
dVo1(t=0) = dVo1(0), dVo2(t=0) = dVo2(0) --eq9
By solving these, for the growing mode we get
dVo1(t) = dVo1(0)*e^(t/T0), dVo2(t) = dVo2(0)*e^(t/T0) --eq10
Note that the magnitude of both output voltages increases exponentially with time.
Depending on the polarity of the initial small perturbations dVo1(0) and dVo2(0), the
output voltages will diverge from their initial value of Vinv to either Vol or Voh.

Fig 20.43: Voltage Stability graph
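The exponential divergence can be checked numerically. The sketch below integrates the small-signal equations Cg*dv1/dt = -gm*v2 and Cg*dv2/dt = -gm*v1 (so T0 = Cg/gm) with forward Euler; the time constant of 1 ns and the 1 mV perturbations are illustrative assumptions, not process data.

```python
import math

# Small-signal model of the bistable element near the metastable point:
# dv1/dt = -v2/TAU0 and dv2/dt = -v1/TAU0, with TAU0 = Cg/gm.
TAU0 = 1e-9  # assumed time constant (illustrative)

def simulate(dv1, dv2, t_end, steps=100_000):
    """Forward-Euler integration of the coupled small-signal equations."""
    dt = t_end / steps
    for _ in range(steps):
        dv1, dv2 = dv1 - dv2 / TAU0 * dt, dv2 - dv1 / TAU0 * dt
    return dv1, dv2

# An antisymmetric 1 mV perturbation grows as exp(t/TAU0):
v1, v2 = simulate(1e-3, -1e-3, t_end=3 * TAU0)
print(v1 / 1e-3)  # close to e^3, about 20.1
```

After three time constants the perturbation has grown by roughly e^3, illustrating why the circuit cannot linger at the metastable point.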


Solutions to the problem:
1. The two inverters should not be identical.
2. The lines connecting the two inverters should be of different lengths.
Note:

1. This same circuit can be used as a static RAM cell: once the data fed in has
started circulating, the input can be removed, since the data keeps circulating.
2. This circuit is also called a transparent latch or level-sensitive latch.

20.5 Master-Slave Flip-Flop or 1-bit Shift Register (D Negative-Edge-


Triggered Flip-Flop)
1: When the clock is at logic low, pass gate no. 1 of the master allows the input
to pass to its output.
2: When the clock is at logic high, pass gate no. 2 of the master becomes transparent
and the input gets latched.
3: In the second part, when the clock is at logic high, the slave passes the output
of the master, which was initially inverted, to its output by inverting it again. Thus the
input data D reaches the output at the negative edge of the clock cycle.
4: If the clock is low, then the slave part latches the output.

Fig 20.41: Master Slave Flip-Flop circuit


Another circuit for a negative-edge-triggered D flip-flop:
The advantage of this circuit is that it needs just 8 transistors.

Fig 20.42: Alternative circuit of Master Slave Flip-Flop


Working: When the clock input is low, the output is in a high-impedance state, so no
data appears at the output, and the earlier data stored on the output capacitance is
latched. In the next clock phase, the output terminal is connected to the pull-down or
pull-up network, so the data at the input of the pull-down or pull-up network is stored
at the output in inverted form. In the second part of the circuit the same process takes
place. Thus the data is shifted by one bit, and the circuit acts as a 1-bit shift register.
Disadvantages: 1) There is a problem of charge sharing in this circuit.
2) It is used for slower circuits. Note: clock frequency
<= 1/(5 * propagation delay).

Recap
In this lecture you have learnt the following
Negative D-Latch
S-R Latch using NOR Gates
Simple Latch using two Inverters (Bistable Element)
Master Slave Flip-Flop

Congratulations, you have finished Lecture 20.

Module 4 : Propagation Delays in MOS


Lecture 21 : Logical Effort
Objectives
In this lecture you will learn the following
Motivation for Logical Effort
Definition of Logical Effort
Delay in a Logical Gate
21.1 Motivation for Logical Effort
Here, we are going to introduce the concepts on which logical effort is based. Logical
effort, as introduced by Sutherland et al., is just a formalized representation of these
concepts. The propagation delay of a MOS gate depends on the capacitance it drives.
So, as the width W is increased, the capacitance increases and so does the propagation
delay. Let us say that when Cin, the input capacitance of a gate (say an inverter), is
equal to CL, the load capacitance, the propagation delay of the gate is T. If CL is not
equal to Cin, then the delay is T*(CL/Cin).
Now let us see whether introducing a buffer can reduce the propagation delay of the
gate (see figure 21.11).

Fig 21.11: A circuit for propagation delay calculation


Here, the input capacitance of the 1st gate is Cin and the load capacitance is CL. The
input capacitance of the 2nd gate (the buffer), which is also the load on the 1st, is
u*Cin. Consider Y = CL/Cin.
Without the buffer, the delay is T*Y. With the buffer, the first stage drives u*Cin and
the second drives CL, so the total delay is
D = T*u + T*(Y/u)
Now let us find the optimum condition for which introducing a buffer will provide a
performance improvement. Setting dD/du = 0 gives u = sqrt(Y), so the minimum
two-stage delay is 2*T*sqrt(Y).
From the above result, we observe that a buffer will yield better results
(2*sqrt(Y) < Y) for Y > 4.
For a longer chain, the input capacitance of the first gate is Cin, that of the 2nd is
u*Cin, that of the next is u^2*Cin, and so on; for the last one the input capacitance is
u^(N-1)*Cin, with u^N = Y. We have
D = N*u*T = (ln Y / ln u)*u*T

Fig 21.12: A circuit of high order for propagation delay calculation


Minimizing D over u gives ln u = 1, i.e. u = e. Taking ln u = 1 and Y = 1000 gives
N = ln 1000, approximately 7. Therefore, with 7 stages we can drive a load capacitance
1000 times the input capacitance.
Such cases arise when we have to drive, let's say, a motor. Then Y > 1000 and large
currents have to be delivered. In such cases, where the load current is to be provided
off-chip, the buffer should be placed very close to the output pad to avoid adding line
capacitance.
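This staged trade-off is easy to reproduce numerically. The sketch below (with the process delay unit T normalized to 1) evaluates the chain delay N*Y^(1/N) and picks the best integer stage count; the 20-stage search bound is just an assumption for the example.

```python
def chain_delay(Y, N, tau=1.0):
    """Delay of an N-stage buffer chain driving Y = CL/Cin, each stage fanout Y**(1/N)."""
    u = Y ** (1.0 / N)
    return N * u * tau

Y = 1000.0
best_N = min(range(1, 21), key=lambda n: chain_delay(Y, n))
print(best_N, round(chain_delay(Y, best_N), 2))  # 7 stages, about 18.78 delay units
```

Note that at the break-even point Y = 4, one stage and two stages give the same delay (4 units), which is why buffering only helps for Y > 4.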
21.2 Definition of Logical Effort
The method of logical effort is an easy way to estimate the delay in an MOS circuit. The
method can be used to decide the number of logic stages on a path and also what
should be the size of the transistors. Using this method we can make simple estimations
in the early stages of design, which can be a starting point for more optimizations.
The logical effort of a gate tells how much worse it is at producing output current than
an inverter, given that each of its inputs presents the same input capacitance as the
inverter. Reduced output current means slower operation, and thus the logical effort
number for a logic gate tells how much more slowly it will drive a load than an inverter
would.
Equivalently, logical effort is how much more input capacitance a gate presents in order
to deliver the same output current as an inverter.
As the table of logical efforts presented in the next lecture shows, the logical effort
increases as the complexity of a gate increases. Also, for the same logic gate, as the
number of inputs increases, the logical effort increases. Thus, larger or more complex
logic gates will exhibit more delay.
Thus we can evaluate different choices of logical structure by considering their logical
effort. For example, designs that minimize the number of stages will require more inputs
for each logic gate and thus have larger logical effort. Similarly, designs with fewer
inputs and thus less logical effort per stage may require more stages of logic. These
tradeoffs should be evaluated for an optimum design.
21.3 Delay in a Logic Gate

Delays in a MOS gate are caused by the capacitive loads and due to the gate topology.
We will take an inverter as the unit gate and compare performance of other gates with
an inverter. A complex logic gate, which may have transistors connected in series, will
have more delay than an inverter with similar transistor sizes that drives the same load,
as they are poorer at driving current. The method of logical effort quantifies these
effects.
We will consider T as the delay unit that characterizes a given MOS process; T is about
50 ps for a typical 0.6 um process.
The absolute delay of the gate is
dABS = d*T
where d is the unitless delay of the gate. The delay incurred by a logic gate can be
expressed as d = f + p, where p is a fixed part called the parasitic delay and f,
proportional to the load on the gate's output, is called the effort delay or stage effort.
d is measured in units of T.
The effort delay f depends on the load and on the properties of the logic gate driving
that load and comprises of two components. f = gh ,Where g, logical effort, accounts
for the properties of the gate h, electrical effort, characterizes the load.
Combining above equations, we get - d = gh + p
Thus, we see that there are four components that basically contribute to delay, namely,
, g, h and p. The process parameter
represents the speed of the basic transistor.
The parasitic delay, p, represents the intrinsic delay of the gate due to its own internal
capacitance. The electrical effort, h, is Cout/Cin, where Cout is the capacitance due to
the load and Cin is the capacitance due to sizes of the transistors. The logical effort, g,
expresses the effect of circuit topology and is independent of load and transistor sizing.
Thus logical effort depends only on circuit topology.
Recap
In this lecture you have learnt the following
Motivation for Logical Effort
Definition of Logical Effort
Delay in a Logical Gate

Congratulations, you have finished Lecture 21.

Module 4 : Propagation Delays in MOS


Lecture 22 : Logical Effort Calculation of few Basic Logic Circuits
Objectives
In this lecture you will learn the following
Introduction
Logical Effort of an Inverter
Logical Effort of NAND Gate
Logical Effort of NOR Gate
Logical Effort of XOR Gate
Logic Effort Calculation of few Mixed Circuits
Delay Plot
22.1 Introduction
The method of logical effort is an easy way to estimate delay in a CMOS circuit. We can
select the fastest candidate by comparing delay estimates of different logic structures.
The method also specifies the proper number of logic stages on a path and the best
transistor sizes for the logic gates. Because the method is easy to use, it is ideal for
evaluating alternatives in the early stages of a design and provides a good starting point
for more intricate optimizations. It is founded on a simple model of the delay through a
single MOS logic gate. The model describes delays caused by the capacitive load that the
logic gate drives and by the topology of the logic gate. Clearly as the load increases, the
delay increases, but the delay also depends on the logic function of the gate. Inverters,
the simplest logic gates, drive loads best and are often used as amplifiers to drive large
capacitances. Logic gates that compute other functions require more transistors, some
of which are connected in series, making them poorer than inverters at driving current.
Thus a NAND gate has more delay than an inverter with similar transistor sizes that
drives the same load. The method of logical effort quantifies these effects to simplify
delay analysis for individual logic gates and multistage logic networks.

22.2 Logical Effort of an Inverter

Fig 22.21: Inverter Circuit


The logical effort of an Inverter is defined to be unity.
22.3 Logical Effort of a NAND Gate
A NAND gate contains two NMOS (pull-down) transistors in series and two PMOS (pull-up)
transistors in parallel, as shown in fig 22.3.

Fig 22.3:2-input NAND


We have to size the transistors such that the gate has the same drive characteristics as
an inverter with a pull-down of width 1 and a pull-up of width 2. Because the two pull-down
transistors are in series, each must have twice the conductance of the inverter
pull-down transistor so that the series connection has a conductance equal to that of the
inverter pull-down transistor. Hence these two transistors should have twice the width
of the inverter pull-down transistor. By contrast, each of the two pull-up
transistors in parallel need be only as large as the inverter pull-up transistor to achieve
the same drive as the reference inverter. So, the logical effort per input can be
calculated as
g = (2+2)/ (1+2) = 4/3.
For 3 input NAND gate, g = (3+2)/ (1+2) =5/3
For n input NAND gate, g = (n+2)/ 3
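The per-input formula can be wrapped in a tiny helper (a sketch; the function name is ours), using exact rationals so the classic values come out unrounded:

```python
from fractions import Fraction

def g_nand(n):
    """Logical effort per input of an n-input static CMOS NAND gate: (n + 2) / 3."""
    return Fraction(n + 2, 3)

print(g_nand(2), g_nand(3))  # 4/3 5/3
```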

22.4 Logical Effort of a NOR Gate


A NOR gate contains two pull-down transistors in parallel and two pull-up transistors in
series, as shown in figure 22.4.

Fig 22.4: 2-input NOR


Because the two pull-up transistors are in series, each must have twice the
conductance of the inverter pull-up transistor so that the series connection has a
conductance equal to that of the inverter pull-up transistor. Hence these two transistors
should have twice the width of the inverter pull-up transistor, i.e. width 4. By contrast,
each of the two pull-down transistors in parallel need be only as large as the inverter
pull-down transistor to achieve the same drive as the reference inverter. So, the logical
effort per input of the NOR gate is
g = (1+4)/ (1+2) = 5/3
For n input NOR gate, g = (2n+1)/3
22.5 Logical Effort of a XOR Gate
A two input XOR gate is shown in figure 22.5.

Fig 22.5: XOR Gate


Here we will calculate the logical effort for a bundle (A* or B*) instead of a single
input, since complementary inputs are applied.
Logical effort for a bundle A is g = (2+4+2+4)/ (1+2) = 4.
Logical effort for a bundle B is g = (2+4+2+4)/ (1+2) = 4.

22.6 Examples Circuits


Example Circuit 1:
Example 1

Fig 22.61: Example Circuit 1


Example Circuit 2: 4 BIT MUX
Example 2
Logical effort for input D is
gD = (2+4)/ (1+2) = 2
Logical effort for bundle S is
gs =(2+4)/ (1+2) = 2.
For one arm, g = 12/3 = 4
For N-way symmetrical MUX
g= 4N (this is for the static CMOS MUX only)
Fig 22.62: Example Circuit 2
22.7 Tabular View of Logical Efforts
Logical effort for different circuits is tabulated in the table below in fig. 22.71

Fig 22.71: Logical efforts of basic gates with different input configurations

The parasitic delays for different gates are tabulated in fig. 22.72.

Fig 22.72: parasitic delay of basic gates


Now delay, d = gh + p.
For example, dINV = (1*1) + 1 = 2.
If we assume
tau = 25 ps,
the absolute delay is
dABS = 2 * 25 ps = 50 ps.
22.8 Delay Plot
The delay of a simple logic gate as represented in equation d = gh + p is a simple linear
relationship.

Fig 22.8: Delay Plot


Fig 22.8 shows this relationship graphically: delay appears as a function of electrical
effort for an inverter and for a two-input NAND gate. The slope of each line is the logical
effort of the gate; its intercept is the parasitic delay. The graph shows that we can
adjust the total delay by adjusting the electrical effort or by choosing a logic gate with a
different logical effort.

Example 3: Fanout-of-4 (FO4) inverter circuit -

Fig22.82: FO4 circuit


Because each inverter is identical, Cout = 4Cin, so h = 4. The logical effort g = 1 for
an inverter. Thus the FO4 delay is d = gh + p = 1*4 + pINV = 4 + 1 = 5.
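The delay model d = gh + p can be checked with a few lines (a sketch; tau = 25 ps is the value assumed in Section 22.7):

```python
def gate_delay(g, h, p):
    """Unitless delay d = g*h + p of a single logic gate."""
    return g * h + p

TAU = 25e-12  # assumed process delay unit (25 ps)

d_inv = gate_delay(g=1, h=1, p=1)   # inverter driving one identical inverter
d_fo4 = gate_delay(g=1, h=4, p=1)   # fanout-of-4 inverter
print(d_inv, d_fo4, d_fo4 * TAU)    # 2, 5, and 125 ps absolute FO4 delay
```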
Recap
In this lecture you have learnt the following
Introduction
Logical Effort of an Inverter
Logical Effort of NAND Gate
Logical Effort of NOR Gate
Logical Effort of XOR Gate
Logic Effort Calculation of few Mixed Circuits
Delay Plot

Module 4 : Propagation Delays in MOS


Lecture 23 : Logical Effort of Multistage Logic Networks
Objectives
In this lecture you will learn the following
Logical Effort of Multistage Logic Networks
Minimizing Delay along a Path
Few Examples
23.1 Logical Effort of Multistage Logic Networks
The logical effort along a path compounds by multiplying the logical efforts of all the
logic gates along the path. We denote it by the letter 'G'. Hence,
G = g1 * g2 * ... * gN
The electrical effort along a path through the network is simply the ratio of the
capacitance that loads the last logic gate in the path to the input capacitance of the
first gate in the path. We denote it by the letter 'H':
H = Cout(path) / Cin(path)
When fanout occurs within a logic network, some of the available drive current is
directed along the path we are analyzing, and some is directed off that path. The
branching effort (b) at the output of a logic gate is defined as
b = (Con-path + Coff-path) / Con-path
where Con-path is the load capacitance along the path and Coff-path is the capacitance
of connections that lead off the path. If there is no branching in the path, the branching
effort is unity.
The branching effort along the entire path, 'B', is the product of the branching efforts
at each of the stages along the path:
B = b1 * b2 * ... * bN

Path effort (F) is defined as F = GBH. The path branching and electrical efforts are
related to the electrical efforts of the stages as
BH = h1 * h2 * ... * hN
The path delay D is the sum of the delays of each of the stages of logic in the path:
D = sum of di = DF + P
where DF = sum of gi*hi is the path effort delay and P = sum of pi is the path
parasitic delay.
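Putting these definitions together, the minimum achievable path delay is N*F^(1/N) + P with F = GBH. A short numerical sketch (function name is ours):

```python
from math import prod

def path_min_delay(g, b, H, p):
    """Minimum delay of an N-stage path: N * F**(1/N) + P, with F = G*B*H."""
    N = len(g)
    F = prod(g) * prod(b) * H          # path effort
    return N * F ** (1.0 / N) + sum(p)  # best delay: equal stage efforts

# Example: three 2-input NANDs (g = 4/3, p = 2 each), no branching, H = 1
print(round(path_min_delay([4/3] * 3, [1] * 3, 1.0, [2] * 3), 2))  # 10.0
```

With a single inverter (g = 1, p = 1) and H = 4, the same function reproduces the FO4 delay of 5 from the previous lecture.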

23.2 Minimizing Delay along a Path


Consider two path stages as in figure 23.21.

Fig 23.21: An Example Circuit equating


The total delay of the above circuit is given by
D = g1h1 + p1 + g2h2 + p2
Substituting h1 = Cmid/Cin and h2 = Cout/Cmid (where Cmid is the input capacitance of
the second stage) into the equation for D, we get D as a function of Cmid alone.
To minimize D, we take the partial derivative of D with respect to Cmid and set it to
zero, which gives
g1h1 = g2h2
i.e. the product of the logical effort and electrical effort (the stage effort) of each
stage should be equal to get minimum delay. This is independent of the scale of the
circuit and of the parasitic delay. The delays in the two stages will differ only if the
parasitic delays are different.
We can generalise this result for N stages: each stage should bear the same stage effort,
gi*hi = F^(1/N)
So the minimum path delay is
D = N*F^(1/N) + P

Example of minimizing delay: Consider the path from A to B involving three two-input
NAND gates as in fig 23.22. The input capacitance of the first gate is C and the load
capacitance is also C. Find the least delay in this path, and determine how the
transistors should be sized to achieve it.

Fig 23.22: Example Circuit


Solution:
The logical effort of a two-input NAND gate is g = 4/3,
so G = (4/3)^3 = 64/27 = 2.37.
B = 1 (as there is no branching), H = Cout / Cin = 1.
Path effort F = GBH = (64/27)*1*1 = 64/27.
The optimum stage effort is F^(1/3) = 4/3, and if each stage has the same parasitic
delay then P = p1 + p2 + p3 = 6 pinv (as all are two-input), so the least delay is
D = 3*(4/3) + 6 = 10 units.
As Ci = gi * Couti / F^(1/N) (working backward from the load),
Cz = g3 * C / (4/3) = C, Cy = g2 * C / (4/3) = C.
Now if Cout = 8C, then F = 512/27, the stage effort is 8/3, and
Cz = (4/3)*8C/(8/3) = 4C, Cy = (4/3)*4C/(8/3) = 2C.
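The backward sizing rule Ci = gi*Couti/f (with f the common stage effort) can be sketched as follows; capacitances are in units of C and the function name is ours:

```python
def size_stages(g, c_out, f):
    """Work backward from the load: each stage's input capacitance is g_i * Cout_i / f."""
    caps = []
    c = c_out
    for gi in reversed(g):
        c = gi * c / f
        caps.append(c)
    return caps[::-1]  # input capacitances from first stage to last

# Three 2-input NANDs (g = 4/3) driving Cout = 8C with stage effort f = 8/3:
print(size_stages([4/3, 4/3, 4/3], 8.0, 8/3))  # [1.0, 2.0, 4.0]
```

The first entry confirms that the required first-stage input capacitance comes out to the given C, with the internal nodes at 2C and 4C.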

23.3 Reduction of Delay


For the minimum delay of the circuit we optimize the number of stages. Let total
number of stages be N = n1 + n2

Fig 23.31: Example Circuit

But the number of stages for minimum delay may not be an integer, so it may not be
feasible to implement it directly. We then realise the circuit by taking the number of
stages to be either the greatest integer below the obtained value or the one above it,
whichever gives the minimum delay.

Fig 23.32:
We will study this in more detail in the next chapter.
Recap
In this lecture you have learnt the following
Logical Effort of Multistage Logic Networks
Minimizing Delay along a Path
Few Examples

Congratulations, you have finished Lecture 23.

Module 4 : Propagation Delays in MOS


Lecture 24 : Methods for Reduction of Delays in Mutlistage Logic
Networks
Objectives
In this lecture you will learn the following
Effect of Using Wrong Number of Stages
Dynamic Latch
Carry Propagation Gate
Dynamic Muller C-element
Fork
24.1 Using Wrong No. of Stages
Let us assume that the number of stages is wrong by a factor s, i.e. the number of
stages used is sN̂, where N̂ is the best number to use. The delay can be expressed as a
function of the number of stages N (assuming the parasitic delay of each stage is the
same, p) as:
D(N) = N*(F^(1/N) + p)
Let r be the ratio of the delay when using sN̂ stages to the delay when using the best
number of stages, N̂. Since N̂ is the best number, we know that the best stage effort
ρ̂ = F^(1/N̂) satisfies p + ρ̂*(1 - ln ρ̂) = 0. Solving for r we obtain
r = D(sN̂)/D(N̂) = s*(ρ̂^(1/s) + p)/(ρ̂ + p)
This relationship is plotted in the figure for p = 1, for which ρ̂ = 3.59.

Fig 24.11: The relative delay compared to the best possible,


as a function of the relative error in the number of stages used
A designer often faces the problem of deciding whether it would be beneficial to change
the number of stages in an existing circuit. This can easily be done by calculating the
stage effort. If the effort is between 2 and 8, the design is within 35% of best delay. If
the effort is between 2.4 and 6, the design is within 15% of best delay. Therefore, there
is little benefit in modifying a circuit unless the stage effort is grossly high or low.
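This sensitivity is easy to reproduce numerically. The sketch below solves p + ρ(1 - ln ρ) = 0 for the best stage effort by bisection and evaluates r = s(ρ^(1/s) + p)/(ρ + p); it is a numerical check of the formulas above, not a design tool.

```python
import math

def best_stage_effort(p=1.0):
    """Solve p + rho*(1 - ln(rho)) = 0 by bisection (left side decreases for rho > 1)."""
    lo, hi = 1.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if p + mid * (1.0 - math.log(mid)) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_delay(s, p=1.0):
    """Delay with s times the best number of stages, relative to the best delay."""
    rho = best_stage_effort(p)
    return s * (rho ** (1.0 / s) + p) / (rho + p)

print(round(best_stage_effort(1.0), 2))       # 3.59 for p = 1
print(round(relative_delay(0.5), 2), round(relative_delay(2.0), 2))
```

The flat minimum around s = 1 is what makes the 2-to-8 stage-effort window so forgiving.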
24.2 Dynamic Latch
Fig 24.21 shows a dynamic latch: when the clock signal φ is HIGH, and its complement
φ' is LOW, the gate output q is set to the complement of the input d. The total logical
effort of this gate is 4: the logical effort per input for d is 2, and the logical effort of
the clock bundle (φ, φ') is also 2.

Fig 24.21: A dynamic latch with input d and output q.
24.3 Carry Propagation Gate
Fig 24.31 shows one stage of a ripple-carry chain in an adder. The stage accepts a
carry input and delivers the carry out in inverted form. The inputs g and k' come from
the two bits to be summed at this stage. The signal g is HIGH if this stage generates a
new carry, forcing the (inverted) carry output LOW. Similarly, k' is LOW if this stage
kills incoming carries, forcing the (inverted) carry output HIGH. The logical effort per
input for the carry input is 2; the logical efforts of the g and k' inputs, and hence the
total logical effort of the gate, follow from the transistor widths shown in the figure.

Fig 24.31: A carry propagation gate

24.4 Dynamic Muller C-element


Fig 24.41 shows an inverting dynamic Muller C-element with two inputs. Although this
gate is rarely seen in designs for synchronous systems, it is a staple of asynchronous
system design. The behavior of the gate is as follows: When both inputs are HIGH, the
output goes LOW; when both inputs go LOW, the output goes HIGH. In other conditions,
the output retains its previous value - the C-element thus retains state. The total logical
effort of this gate is 4, divided between the two inputs.

Fig 24.41: A two input inverting dynamic Muller C-element


24.5 Fork
If we try to use a signal and an inverter for the complementary signal, we get
unequal delays between the two signals. So we use N stages and adjust the sizing such
that we get two complementary signals with equal delay.
Fig 24.51 shows a 2-1 fork and a 3-2 fork, both of which produce the same logic signals.
Fig 24.52 shows a general fork.

Fig 24.51: A 2-1 fork and 3-2 fork

Fig 24.52: A general fork


The design of a fork starts out with a known load on the output legs and a known total
input capacitance. As shown in Fig 24.52, we shall call the two output capacitances Ca
and Cb, the combined total load driven Cout = Ca + Cb, and the total input capacitance
of the fork Cin. We can thereby describe the electrical effort for the fork as a whole as
H = Cout/Cin. This electrical effort of the fork may differ from the electrical efforts of
the individual legs, Ha and Hb.

The input current to an optimized fork may divide unequally to drive its two legs. Even if
the load capacitances on the two legs of the fork are equal, it is not in general true that
the input capacitances of the two legs are equal. Because the legs have different
numbers of amplifiers but must operate with the same delay, their electrical efforts may
differ. The leg that can support the larger electrical effort, usually the leg with more
amplifiers, will require less input current than the other leg, and can therefore have a
smaller input capacitance. If we call the electrical efforts of the two legs Ha and Hb,
and their input capacitances Cina and Cinb, using the notation of Fig 24.52, then Ha
may not equal Hb, and Cina may not equal Cinb. Even if Ca and Cb are equal, Ha and
Hb may still differ.

The design of a fork is a balancing act. Either leg of the fork can be made faster by
reducing its electrical effort, which is done by giving it wider transistors for its amplifier.
Doing so, however, takes input current away from the other leg of the fork and will
inevitably make it slower. A fixed value of Cin provides, in effect, only a certain total
width of transistor material to distribute between the first stages of the two legs; putting
wider transistors in one leg requires putting narrower transistors in the other leg. The
task of designing a minimum-delay fork is really the task of allocating the available
transistor width, set by Cin, to the input stages of the two legs.

Recap
In this lecture you have learnt the following
Effect of Using Wrong Number of Stages
Dynamic Latch
Carry Propagation Gate
Dynamic Muller C-element
Fork

Congratulations, you have finished Lecture 24.

Module 4 : Propagation Delays in MOS


Lecture 25 : Designing Asymmetric Logic Gates
Objectives
In this lecture you will learn the following
Brief Introduction to Asymmetric Logic Gates
Application of Asymmetric Logic Gates
Analyzing Delays
Pseudo NMOS Circuits
25.1 Brief Introduction to Asymmetric Logic Gates
Logic gates sometimes have different logical efforts for different inputs. We call such
gates asymmetric. Asymmetric gates can speed up critical paths in a network by
reducing the logical effort along the critical paths. This attractive property has a price,
however: the total logical effort of the logic gate increases. This lecture discusses design
issues arising from biasing a gate to favor particular inputs.

Fig 25.11: An asymmetric NAND gate


Fig 25.11 shows a NAND gate designed so that the widths of the two pull-down
transistors can differ; input a has width 1/(1-s), while input b has width 1/s. The
parameter s, 0 < s < 1, called the symmetry factor, determines the amount by which the
logic gate is asymmetric. If s = 1/2, the gate is symmetric, the pull-down transistors
have equal sizes, and the logical effort is the same as calculated in the previous lectures.
Values of s between 0 and 1/2 favor the a input by making its pull-down transistor
smaller than the pull-down transistor for b. Values of s between 1/2 and 1 favor the b
input.

The logical efforts per input for inputs a and b, and the total logical effort, are:

ga = (1/(1-s) + 2)/3 (Eq 25.1)

gb = (1/s + 2)/3 (Eq 25.2)

gtot = ga + gb = (1/(1-s) + 1/s + 4)/3 (Eq 25.3)
Choosing the least value possible for s, such as 0.01, minimizes the logical effort of
input a. This design results in a pull-down transistor of width 1/(1-0.01) = 1.01 for
input a and a transistor of width 1/0.01 = 100 for input b. The logical effort of input a
is then (1.01+2)/3, almost exactly 1. The logical effort of input b becomes
(100+2)/3 = 34, and the total logical effort is about 35.

Extremely asymmetric designs, such as s = 0.01, are able to achieve a logical effort
for one input that almost matches that of an inverter, namely 1. The price of this
achievement is an enormous total logical effort, 35, as opposed to 8/3 for a symmetric
design. Moreover, the huge size of the pull-down transistor will certainly cause layout
problems, and the benefit of the reduced logical effort on input a may not be worth the
enormous area of this transistor.
Less extreme asymmetry is more practical. If s = 1/4, the pull-down transistors have
widths 4/3 and 4, and the logical effort of input a is (4/3 + 2)/3 = 10/9, which is about
1.1. The logical effort of input b is 2, and the total logical effort is 3.1, which is very
little more than 8/3, the total logical effort of the symmetric design. This design
achieves a logical effort for the favored input, a, that is only 10% greater than that of
an inverter, without a huge increase in total logical effort.
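Equations 25.1-25.3 are easy to tabulate across symmetry factors. A sketch (the function name is ours; pull-up widths fixed at 2 each, as in Fig 25.11):

```python
def asym_nand_efforts(s):
    """Per-input and total logical efforts of the asymmetric NAND of Fig 25.11."""
    w_a, w_b = 1 / (1 - s), 1 / s  # pull-down widths for inputs a and b
    g_a = (w_a + 2) / 3            # Eq 25.1
    g_b = (w_b + 2) / 3            # Eq 25.2
    return g_a, g_b, g_a + g_b     # Eq 25.3

for s in (0.5, 0.25, 0.01):
    g_a, g_b, g_tot = asym_nand_efforts(s)
    print(s, round(g_a, 2), round(g_b, 2), round(g_tot, 2))
# s = 0.5 recovers the symmetric 4/3, 4/3, 8/3; s = 0.25 gives 1.11, 2.0, 3.11
```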
25.2 Applications of Asymmetric Logic Gates
The principal application of asymmetric logic gates occurs when one path must be very
fast. For example, in a ripple carry adder or counter, the carry path must be fast. The
best design uses an asymmetric circuit that speeds the carry even though it retards the
sum output.

Paradoxically, another important use of asymmetric logic gates occurs when a signal
path may be unusually slow, as in a reset signal. Figure 25.21 shows a design for a
buffer amplifier whose output is forced LOW when the reset signal, reset', is LOW. The
buffer consists of two stages: a NAND gate and an inverter. During normal operation,
when reset' is HIGH, the first stage has an output drive equivalent to that of an inverter
with pull-down width 6 and pull-up width 12, but the capacitive load on the in input is
slightly larger than that of the corresponding inverter:

(Eq 25.4)

Fig 25.21: A buffer amplifier with a reset input. When reset' is LOW, the output will
always be LOW.

This circuit takes advantage of the slow response allowed to changes on reset' by using
the smallest pull-up transistor possible. This choice reduces the area required to lay out
the gate, partially compensating for the large-area pull-down transistor. Area can be
further reduced by sharing the reset pull-down among multiple gates that switch at
different times; this is known as the Virtual Ground technique.

25.3 Analyzing Delays

We can model the delay of an individual stage of logic with one of the following two
expressions:

d_u = g_u·h + p_u (Eq 25.5)
d_d = g_d·h + p_d (Eq 25.6)

where the delays are measured in units of τ. Notice that the logical efforts, parasitic
delays, and stage delays differ for rising transitions (u) and falling transitions (d).
In a path containing N logic gates, we use one of two equations for the path delay,
depending on whether the final output of the path rises or falls. In the equations, i is the
distance from the last stage, ranging from 0 for the final gate to (N-1) for the first
gate.

D_u = Σ[odd i] (g_d,i·h_i + p_d,i) + Σ[even i] (g_u,i·h_i + p_u,i) (Eq 25.7)
D_d = Σ[odd i] (g_u,i·h_i + p_u,i) + Σ[even i] (g_d,i·h_i + p_d,i) (Eq 25.8)

Equation 25.7 models the delay incurred when a network produces a rising output
transition. In this equation, the first sum tallies the delay of falling transitions at the
output of stages whose distance from the last stage is odd, and the second tallies the
delay of the rising transitions at the output of stages whose distance from the last stage
is even. Similarly, Equation 25.8 models the falling output transition.
A reasonable goal is to minimize the average of the two delays:

D_avg = (D_u + D_d)/2 (Eq 25.9)

Then we have for the average delay:

D_avg = Σ[i] ( ((g_u,i + g_d,i)/2)·h_i + (p_u,i + p_d,i)/2 ) (Eq 25.10)
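This bookkeeping is easy to mechanize. A sketch, assuming each stage is given as a tuple of its rising/falling logical efforts and parasitics plus its electrical effort h (the stage tuples below are hypothetical inputs, not values from the text):

```python
# Path delay with unequal rise/fall delays.
# stages[i] = (g_u, g_d, p_u, p_d, h), where i is the distance from the
# LAST stage: stages[0] is the final gate, stages[N-1] the first.

def path_delay_rising(stages):
    """D_u: delay when the final output rises."""
    total = 0.0
    for i, (g_u, g_d, p_u, p_d, h) in enumerate(stages):
        if i % 2 == 1:                 # odd distance -> falling transition
            total += g_d * h + p_d
        else:                          # even distance -> rising transition
            total += g_u * h + p_u
    return total

def path_delay_falling(stages):
    """D_d: delay when the final output falls."""
    total = 0.0
    for i, (g_u, g_d, p_u, p_d, h) in enumerate(stages):
        if i % 2 == 1:
            total += g_u * h + p_u
        else:
            total += g_d * h + p_d
    return total

def average_delay(stages):
    """(D_u + D_d)/2 -- the quantity a designer would minimize."""
    return 0.5 * (path_delay_rising(stages) + path_delay_falling(stages))
```

For symmetric gates (g_u = g_d, p_u = p_d) all three functions collapse to the ordinary logical-effort path delay, the sum of g·h + p over the stages.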

25.4 Pseudo NMOS Circuits

Fig 25.41: Pseudo-NMOS inverter, NAND and NOR gates, assuming a 4:1 pull-down to
pull-up strength ratio

The delay analysis presented above also applies to pseudo-NMOS designs. The PMOS
transistor produces 1/3 of the current of the reference inverter, and the NMOS
transistor stacks produce 4/3 of the current of the reference inverter. For falling
transitions, the output current is the pull-down current minus the pull-up current
fighting it, i.e., 4/3 - 1/3 = 1. For rising transitions, the output current is just the
pull-up current, 1/3. The inverter and NOR gate have an input capacitance of 4/3. The
falling logical effort is the input capacitance divided by that of an inverter with the same
output current: g_d = (4/3)/3 = 4/9. The rising logical effort is three times greater,
g_u = 4/3, because the current produced on a rising transition is only one third that of a
falling transition. The average logical effort is g_avg = (g_d + g_u)/2 = 8/9. This is
independent of the number of inputs, explaining why pseudo-NMOS is a good way to
build fast, wide NOR gates.
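The 4/9, 4/3 and 8/9 figures follow directly from the current ratios. A quick check in Python (assuming μ = 2, so the reference inverter of unit current has input capacitance 3):

```python
# Pseudo-NMOS logical efforts for the inverter/NOR input.
# The input drives only the NMOS gate (width 4/3); the always-on PMOS
# does not load the input.

c_in = 4 / 3                   # input capacitance (NMOS pull-down width)
i_fall = 4 / 3 - 1 / 3         # pull-down current minus the fighting pull-up
i_rise = 1 / 3                 # pull-up current alone

# A reference inverter delivering current I has input capacitance 3*I
# (widths 2 + 1 deliver unit current), so:
g_d = c_in / (3 * i_fall)      # falling logical effort -> 4/9
g_u = c_in / (3 * i_rise)      # rising logical effort  -> 4/3
g_avg = (g_d + g_u) / 2        # average logical effort -> 8/9
print(g_d, g_u, g_avg)
```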
Table 25.41 shows the rising, falling and average logical efforts of other pseudo-NMOS
gates, assuming μ = 2 and a 4:1 pull-down to pull-up strength ratio.

Recap

In this lecture you have learnt the following


Brief Introduction to Asymmetric Logic Gates
Application of Asymmetric Logic Gates
Analyzing Delays
Pseudo NMOS Circuits

Congratulations, you have finished Lecture 25.

Module 5 : Power Dissipation in CMOS Circuits


Lecture 26 : Power Dissipation in CMOS Circuits
Objectives
In this lecture you will learn the following
Motivation
Effects of Power Dissipation
How to Reduce Temperature
Components of Power Dissipation
Static Power Dissipation
Dynamic Power Dissipation
Methods to Reduce Power Dissipation
Short-Circuit Power Dissipation
26.1 Motivation
Why is power dissipation so important? Power dissipation considerations have become
important not only from the reliability point of view; they have assumed even greater
importance with the advent of portable battery-driven devices like laptops, cell phones,
PDAs, etc.
26.2 Effects of Power Dissipation
When power is dissipated, it invariably leads to rise in temperature of the chip. This rise
in temperature affects the device both when the device is off as well as when the device
is on.
When the device is off, the rise in temperature leads to an increase in the number of
intrinsic carriers, by the following relation:

n_i ∝ T^(3/2) · exp(-E_g / 2kT) (Eq 26.1)

From this relation it can be seen that as temperature increases, the number of intrinsic
carriers in the semiconductor increases. The majority carriers, contributed by the
impurity atoms, are less affected by the increase in temperature. Hence the device
becomes more intrinsic.
As temperature increases, the leakage current, which directly depends on the minority
carrier concentration, also increases, leading to a further increase in temperature.
Ultimately, the device might break down if this rise in temperature is not countered by
regular removal of the dissipated heat.
An ON device will not be affected much by the minority carrier increase, but it will be
affected by V_T and the carrier mobility, both of which decrease with increasing
temperature and lead to a change in I_D. Hence the device performance might not meet
the required specifications. Also, power dissipation is more critical in battery powered
applications: the greater the power dissipated, the shorter the battery life will be.
26.3 How to Reduce Temperature
The heat generated due to power dissipation can be taken away by the use of heat
sinks. A heat sink has lower thermal resistance than the package and hence draws heat
from it. For the heat to be effectively removed, the rate of heat transfer from the area of
heat generation to the ambient should be greater than the rate of heat generation. This
rate of heat transfer depends on the thermal resistance.
The thermal resistance, θ, is given by the following relation:

θ = l / (c·A) (Eq 26.2)

where
l = length, A = area and c = thermal conductivity of the heat sink.
From the above relation it can be seen that a large c implies a smaller θ. θ is also given
by the relation

θ = (T_j - T_a) / P_D (Eq 26.3)

where
T_j = junction temperature, and
T_a = ambient temperature.
Using this relation, we can see that for a given power dissipation P_D, the junction
temperature is

T_j = T_a + θ·P_D (Eq 26.4)

Heat sink materials are generally coated black to radiate more energy.
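These relations combine into a quick feasibility check for a cooling solution. A sketch with illustrative numbers (the geometry and conductivity values are hypothetical, and package/interface resistances are ignored):

```python
# Junction temperature from heat-sink thermal resistance:
#   theta = l / (c * A)          thermal resistance of the sink (K/W)
#   T_j   = T_a + theta * P_D    resulting junction temperature

def thermal_resistance(length_m, area_m2, conductivity):
    """Thermal resistance of a slab: l / (c * A)."""
    return length_m / (conductivity * area_m2)

def junction_temperature(t_ambient, theta, power):
    """T_j = T_a + theta * P_D."""
    return t_ambient + theta * power

# Hypothetical sink: 5 mm aluminium slab (c ~ 205 W/m.K), 25 cm^2 area.
theta = thermal_resistance(5e-3, 25e-4, 205.0)
t_j = junction_temperature(25.0, theta, 10.0)   # 10 W dissipated, 25 C ambient
print(theta, t_j)   # theta is ~0.01 K/W, so the rise at 10 W is negligible
```

In practice the package-to-sink and junction-to-package resistances add in series with θ, so the real junction temperature is higher than this single-slab estimate.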
26.4 Components of Power Dissipation
Unlike bipolar technologies, where a majority of the power dissipation is static, the bulk
of power dissipation in properly designed CMOS circuits is due to the dynamic charging
and discharging of capacitances. Thus, a majority of low power design methodology is
dedicated to reducing this predominant factor of power dissipation.
There are three main sources of power dissipation:
Static power dissipation (P_S)
Dynamic power dissipation (P_D)
Short circuit power dissipation (P_SC)
Thus the total power dissipation, P_total, is

P_total = P_S + P_D + P_SC (Eq 26.5)

26.5 Static Power Dissipation

Consider the complementary CMOS gate shown in Figure 26.51.

Fig 26.51: CMOS inverter model for static power dissipation evaluation

When the input = '0', the associated n-device is off and the p-device is on. The output
voltage is V_DD, or logic '1'. When the input = '1', the associated n-device is on and the
p-device turns off. The output voltage is 0 volts, or V_SS. It can be seen that one of the
transistors is always off when the gate is in either of these logic states. Since no current
flows into the gate terminal, and there is no DC current path from V_DD to V_SS, the
resultant quiescent (steady-state) current, and hence power, is zero.

However, there is some small static dissipation due to reverse bias leakage between
diffusion regions and the substrate. In addition, subthreshold conduction can contribute
to the static dissipation. A simple model that describes the parasitic diodes for a CMOS
inverter should be looked at in order to understand the leakage involved in the device.
The source-drain diffusions and the n-well diffusion form parasitic diodes. In the model,
a parasitic diode exists between the n-well and the substrate. Since the parasitic diodes
are reverse biased, only their leakage current contributes to static power dissipation.
The leakage current is described by the diode equation:

i = i_s·(e^(qV/kT) - 1) (Eq 26.6)

where,
i_s = reverse saturation current
V = diode voltage
q = electronic charge
k = Boltzmann's constant
T = temperature
The static power dissipation is the product of the device leakage currents and the supply
voltage:

P_S = Σ (leakage current) × V_DD (Eq 26.7)
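The junction-leakage contribution can be tabulated directly from the diode equation. A sketch, with a hypothetical saturation current and junction count (not values from the text):

```python
import math

# Static power from reverse-biased junction leakage:
#   i   = i_s * (exp(q*V / (k*T)) - 1)    diode equation
#   P_S = sum(leakage currents) * V_DD

Q = 1.602e-19      # electronic charge, C
K = 1.381e-23      # Boltzmann's constant, J/K

def diode_current(i_s, v, t=300.0):
    """Diode current at voltage v (negative v = reverse bias)."""
    return i_s * (math.exp(Q * v / (K * t)) - 1.0)

# Under reverse bias the exponential vanishes and |i| saturates at ~i_s.
i_leak = abs(diode_current(i_s=1e-15, v=-1.8))   # ~1 fA per junction
n_junctions = 1_000_000                          # hypothetical chip
p_static = n_junctions * i_leak * 1.8            # V_DD = 1.8 V
print(p_static)   # ~1.8e-9 W: tiny, as the text asserts
```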
26.6 Dynamic Power Dissipation
During switching, either from '0' to '1' or, alternatively, from '1' to '0', both the n- and
p-transistors are on for a short period of time. This results in a short current pulse from
V_DD to V_SS. Current is also required to charge and discharge the output capacitive
load, and this latter term is usually the dominant one. The current pulse from V_DD to
V_SS results in a 'short-circuit' dissipation that is dependent on the input rise/fall time,
the load capacitance and the gate design.

Fig 26.61: Power dissipation due to charging/discharging of capacitor

The dynamic dissipation can be modeled by assuming that the rise and fall time of the
step input is much less than the repetition period. The average dynamic power, P_d,
dissipated during switching for a square-wave input, V_in, having a repetition frequency
f_p = 1/t_p, is given by

P_d = (1/t_p)·∫[0, t_p/2] i_n(t)·V_out dt + (1/t_p)·∫[t_p/2, t_p] i_p(t)·(V_DD - V_out) dt (Eq 26.8)

where
i_n = n-device transient current
i_p = p-device transient current
For a step input and with i_n(t) = C_L·(dV_out/dt),

P_d = (C_L/t_p)·[ ∫[0, V_DD] V_out dV_out + ∫[0, V_DD] (V_DD - V_out) dV_out ] (Eq 26.9)

P_d = C_L·V_DD² / t_p (Eq 26.10)

with f_p = 1/t_p, resulting in

P_d = C_L·V_DD²·f_p (Eq 26.11)

Thus for a repetitive step input, the average power that is dissipated is proportional to
the energy required to charge and discharge the circuit capacitance. The important
point to be noted here is that Eq 26.11 shows power to be proportional to switching
frequency but independent of the device parameters. The power dissipation also depends
on the switching activity, denoted by α. The equation can then be written as

P_d = α·C_L·V_DD²·f (Eq 26.12)
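Eq 26.12 is worth evaluating numerically to see the quadratic leverage of the rail voltage. A sketch with hypothetical node values:

```python
# Dynamic power P_d = alpha * C_L * V_DD^2 * f.

def dynamic_power(alpha, c_load, v_dd, freq):
    return alpha * c_load * v_dd ** 2 * freq

# Hypothetical node: 100 fF switched at 1 GHz, activity factor 0.1.
p = dynamic_power(0.1, 100e-15, 1.8, 1e9)
print(p)   # ~3.2e-05 W for this node

# The quadratic V_DD dependence is the key lever: halving the rail
# cuts this node's dynamic power by a factor of four.
assert abs(dynamic_power(0.1, 100e-15, 0.9, 1e9) - p / 4) < 1e-15
```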

26.7 Methods to Reduce Dynamic Power Dissipation

As can be seen from Eq 26.12, the dissipated power can be reduced by reducing the
clock frequency f, the load capacitance C_L, the rail voltage V_DD, or the switching
activity parameter α. Reducing the clock frequency is the easiest thing to do, but it
seriously affects the performance of the chip; in applications where power is paramount
and performance is secondary, this approach can be used satisfactorily. Another method
to reduce the dissipated power is to lower the load capacitance C_L. But this method is
more difficult than the previous approach because it involves conscientious system
design, so that there are fewer wires, smaller pins, smaller fan-out, smaller devices, etc.
Power dissipation can also be reduced by reducing the rail voltage V_DD. But this can be
done only through device technology. Also, the rail voltage is a standard agreed to in
many cases by the semiconductor industry, hence we do not have much control over this
parameter. Furthermore, the rail voltage is strongly constrained by the threshold voltage
and the noise margin.
Some special techniques are also used to reduce power dissipation. The first one
involves the use of pipelining to operate the internal logic at a lower clock than the I/O
frequency. The other technique is to reduce the switching activity, α, by optimizing
algorithms, architecture, logic topology and using special encoding techniques.
26.8 Short-Circuit Power Dissipation
The short-circuit power dissipation is given by

P_SC = I_mean·V_DD (Eq 26.13)

For the input waveform shown in Fig 26.81a, which depicts the short-circuit current
(Fig 26.81b) in an unloaded inverter,

I_mean = 2·(2/t_p)·∫[t1, t2] (β/2)·(V_in(t) - V_T)² dt (Eq 26.14)

assuming that β_n = β_p = β and V_Tn = -V_Tp = V_T, and that the behavior is
symmetrical around t2.

With t_r = t_f = t_rf, we thus obtain for an inverter without load

P_SC = (β/12)·(V_DD - 2V_T)³·(t_rf / t_p)

where t_p is the period of the waveform. This derivation is for an unloaded inverter. It
shows that the short-circuit dissipation depends on β and on the input waveform rise and
fall times. Slow rise times on nodes can result in significant (around 20%) short-circuit
power dissipation for loaded inverters. Thus it is good practice to keep all edges fast if
power dissipation is a concern. As the load capacitance is increased, the significance of
the short-circuit dissipation is reduced relative to the capacitive dissipation P_D.
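A closed-form expression of the Veendrick type for the unloaded inverter, P_SC = (β/12)·(V_DD − 2V_T)³·(t_rf/t_p), can be explored numerically; all device values here are hypothetical:

```python
# Short-circuit power of an unloaded inverter (Veendrick-style expression):
#   P_SC = (beta / 12) * (V_DD - 2*V_T)^3 * (t_rf / t_p)

def short_circuit_power(beta, v_dd, v_t, t_rf, t_p):
    return (beta / 12.0) * (v_dd - 2.0 * v_t) ** 3 * (t_rf / t_p)

# Hypothetical device: beta = 100 uA/V^2, 1.8 V rail, 0.4 V thresholds,
# compared for 100 ps and 500 ps input edges on a 1 ns period.
p_fast = short_circuit_power(100e-6, 1.8, 0.4, 100e-12, 1e-9)
p_slow = short_circuit_power(100e-6, 1.8, 0.4, 500e-12, 1e-9)
print(p_fast, p_slow)   # slower edges dissipate proportionally more
```

The linear dependence on t_rf is exactly the text's advice: keep all edges fast when power matters.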
Recap
In this lecture you have learnt the following
Motivation
Effects of Power Dissipation
How to Reduce Temperature
Components of Power Dissipation
Static Power Dissipation
Dynamic Power Dissipation
Methods to Reduce Power Dissipation
Short-Circuit Power Dissipation
Congratulations, you have finished Lecture 26.

Module 6 : Semiconductor Memories


Lecture 27 : Basics of Semiconductor Memories
Objectives
In this lecture you will learn the following
Introduction
Memory Classification
Memory Architectures and Building Blocks
Introduction to Static and Dynamic RAMs
27.1 Introduction
Semiconductor-based electronics is the foundation of the information technology society
we live in today. Ever since the first transistor was invented back in 1948, the
semiconductor industry has been growing at a tremendous pace. Semiconductor
memories and microprocessors are two major fields which have benefited from the
growth in semiconductor technology.

Fig 27.11: Increasing memory capacity over the years

The technological advancement has improved the performance as well as the packing
density of these devices over the years. Gordon Moore made his famous observation in
1965, just four years after the first planar integrated circuit was introduced. He
observed an exponential growth in the number of transistors per integrated circuit, in
which the number of transistors nearly doubled every couple of years. This observation,
popularly known as Moore's Law, has been maintained and still holds true today.
Keeping up with this law, semiconductor memory capacity has also been increasing by a
factor of two nearly every year.
27.2 Memory Classification

Size: Depending upon the level of abstraction, different means are used to
express the size of the memory unit. A circuit designer usually expresses memory
in terms of bits, which is equivalent to the number of individual cells needed to
store the data. Going up one level in the hierarchy to the chip design level, it is
common to express memory in terms of bytes, where a byte is a group of 8 bits.
And on a system level, it can be expressed in terms of words or pages, which are
in turn collections of bytes.

Function: Semiconductor memories are most often classified on the basis of
access patterns, memory functionality and the nature of the storage mechanism.
Based on the access patterns, they can be classified into random access and serial
access memories. A random access memory can be accessed for read/write in a
random fashion. On the other hand, in serial access memories, the data can be
accessed only in a serial fashion. FIFO (First In First Out) and LIFO (Last In First
Out) buffers are examples of serial memories. Most memories fall under the
random access type.
Based on their functionality, memories can be broadly classified into Read/Write
memories and Read-only memories. As the name suggests, a Read/Write memory
offers both read and write operations and hence is more flexible. SRAM (Static
RAM) and DRAM (Dynamic RAM) come under this category. A Read-only memory,
on the other hand, encodes the information into the circuit topology. Since the
topology is hardwired, the data cannot be modified; it can only be read. However,
ROM structures belong to the class of nonvolatile memories: removal of the
supply voltage does not result in a loss of the stored data. Examples of such
structures include PROMs, ROMs and PLDs. The most recent entries in the field are
memory modules that can be classified as nonvolatile, yet offer both read and
write functionality. Typically, their write operation takes a substantially longer time
than the read operation. EPROM, EEPROM and Flash memories fall under this
category.

Fig 27.21: Classification of memories

Timing Parameters: The timing properties of a memory are illustrated in Fig
27.22. The time it takes to retrieve data from the memory is called the read-access
time. This is equal to the delay between the read request and the moment
the data is available at the output. Similarly, the write-access time is the time
elapsed between a write request and the final writing of the input data into the
memory. Finally, there is another important parameter, the cycle time (read or
write), which is the minimum time required between two successive read or write
operations. This time is normally greater than the access time.

27.3 Memory Architecture and Building Blocks

The straightforward way of implementing an N-word memory is to stack the words in a
linear fashion and select one word at a time for a read or write operation by means of a
select signal. Only one such select signal can be high at a time. Though this approach,
as shown in Fig 27.31, is quite simple, one runs into a number of problems when trying
to use it for larger memories. The number of interface pins in the memory module
varies linearly with the size of the memory and can easily grow huge.

Fig 27.31: Basic Memory Organization

To overcome this problem, the address provided to the memory module is generally
encoded as shown in Fig 27.32. A decoder is used internally to decode this address and
drive the appropriate select line high. With k address pins, 2^k select lines can be
driven, so the N select pins are replaced by just log2(N) address pins.

Fig 27.32: Memory with decoder logic

Fig 27.32: Memory with decoder logic

Though this approach resolves the select problem, it does not address the issue of the
memory aspect ratio. For an N-word memory with a word length of M, the aspect ratio
will be nearly N:M, which is very difficult to implement for large values of N. Such a
design also slows down the circuit considerably, because the vertical wires connecting
the storage cells to the inputs/outputs become excessively long. To address this
problem, memory arrays are organized so that the vertical and horizontal dimensions
are of the same order of magnitude, making the aspect ratio close to unity. To route the
correct word to the input/output terminals, an extra circuit called a column decoder is
needed. The address word is partitioned into a column address (A0 to AK-1) and a row
address (AK to AL-1). The row address enables one row of the memory for read/write,
while the column address picks one particular word from the selected row.

Fig 27.33: Memory with row and column decoders
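The pin-count saving and the row/column split described above can be sketched in Python (the memory dimensions are illustrative):

```python
import math

# Address decoding for an N-word memory: k = log2(N) address pins drive
# 2^k select lines; the address is then split into row and column fields
# to keep the array's aspect ratio near unity.

def address_fields(n_words, words_per_row):
    """Return (total address bits, row bits, column bits)."""
    k = int(math.log2(n_words))
    col_bits = int(math.log2(words_per_row))
    return k, k - col_bits, col_bits

def split_address(addr, col_bits):
    """Partition an address into (row address, column address)."""
    return addr >> col_bits, addr & ((1 << col_bits) - 1)

# A hypothetical 64K-word memory arranged 256 words per row:
k, row_bits, col_bits = address_fields(65536, 256)
print(k, row_bits, col_bits)             # k = 16 pins instead of 65536 selects
print(split_address(0xABCD, col_bits))   # -> (171, 205): row 0xAB, column 0xCD
```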


27.4 Static and Dynamic RAMs
RAMs are of two types: static and dynamic. Circuits similar to the basic D flip-flop are
used internally to construct static RAMs (SRAMs). A typical SRAM cell consists of six
transistors connected in such a way as to form regenerative feedback. In contrast to
DRAM, the stored information is stable and does not require clocking or refresh cycles
to sustain it. Compared to DRAMs, SRAMs are much faster, having typical access times
of the order of a few nanoseconds. Hence SRAMs are used as level-2 cache memory.
Dynamic RAMs do not use flip-flops, but instead are an array of cells, each containing a
transistor and a tiny capacitor. '0's and '1's can be stored by charging or discharging the
capacitors. The electric charge tends to leak out, and hence each bit in a DRAM must be
refreshed every few milliseconds to prevent loss of data. This requires external logic to
take care of the refreshing, which makes interfacing DRAMs more complex than SRAMs.
This disadvantage is compensated by their larger capacities. A high packing density is
achieved since DRAMs require only one transistor and one capacitor per bit. This makes
them ideal for building main memories. But DRAMs are slower, having delays of the
order of tens of nanoseconds. Thus the combination of a static RAM cache and a
dynamic RAM main memory attempts to combine the good properties of each.
Recap

In this lecture you have learnt the following


Introduction
Memory Classification
Memory Architectures and Building Blocks
Introduction to Static and Dynamic RAMs

Congratulations, you have finished Lecture 27.

Module 6 : Semiconductor Memories


Lecture 28 : Static Random Access Memory (SRAM)
Objectives
In this lecture you will learn the following
SRAM Basics
CMOS SRAM Cell
CMOS SRAM Cell Design
READ Operation
WRITE Operation
28.1 SRAM Basics
The memory circuit is said to be static if the stored data can be retained indefinitely, as
long as the power supply is on, without any need for a periodic refresh operation. The
data storage cell, i.e., the one-bit memory cell in static RAM arrays, invariably consists
of a simple latch circuit with two stable operating points. Depending on the preserved
state of the two-inverter latch circuit, the data being held in the memory cell will be
interpreted either as logic '0' or as logic '1'. To access the data contained in the
memory cell via a bit line, we need at least one switch, which is controlled by the
corresponding word line, as shown in Figure 28.11.

Fig 28.11: SRAM Cell


28.2 CMOS SRAM Cell
A low power SRAM cell may be designed by using cross-coupled CMOS inverters. The
most important advantage of this circuit topology is that the static power dissipation is
very small; essentially, it is limited by small leakage current. Other advantages of this
design are high noise immunity due to larger noise margins, and the ability to operate at
lower power supply voltage. The major disadvantage of this topology is larger cell size.
The circuit structure of the full CMOS static RAM cell is shown in Figure 28.21. The
memory cell consists of simple CMOS inverters connected back to back, and two access

transistors. The access transistors are turned on whenever a word line is activated for
read or write operation, connecting the cell to the complementary bit line columns.

Fig 28.21: Full CMOS SRAM cell


28.3 CMOS SRAM Cell Design
To determine the W/L ratios of the transistors, a number of design criteria must be
taken into consideration. The two basic requirements which dictate the W/L ratios are:
the data read operation should not destroy the stored information in the cell, and the
cell should allow the stored information to be modified during the write operation. In
order to analyze the operation of the SRAM, we have to take into account the relatively
large parasitic column capacitances C and C', and the column pull-up transistors, as
shown in Figure 28.31.

Fig 28.31: CMOS SRAM cell with precharge transistors

When none of the word lines is selected, the pass transistors M3 and M4 are turned off
and the data is retained in all memory cells. The column capacitances are charged by
the pull-up transistors P1 and P2. The voltages across the column capacitors reach
VDD - VT.
28.4 READ Operation
Consider a data read operation, shown in Figure 28.41, assuming that logic '0' is stored
in the cell. The transistors M2 and M5 are turned off, while the transistors M1 and M6
operate in the linear mode. Thus the internal node voltages are V1 = 0 and V2 = VDD
before the cell access transistors are turned on. The transistors that are active at the
beginning of the data read operation are shown in Figure 28.41.

Fig 28.41: Read Operation

After the pass transistors M3 and M4 are turned on by the row selection circuitry, the
voltage of the column capacitance on the V2 side will not show any significant variation,
since no current flows through M4. On the other hand, M1 and M3 will conduct a
nonzero current, and the voltage level of the other column capacitance will begin to drop
slightly. Correspondingly, the node voltage V1 will increase from its initial value of 0 V.
If V1 were to exceed the threshold voltage of M2 during this process, an unintended
change of the stored state could be forced. Therefore V1 must not exceed the threshold
voltage of M2, so that the transistor M2 remains turned off during the read phase, i.e.,

V1,max ≤ VT,M2 (Eq 28.1)

The transistor M3 operates in saturation whereas M1 operates in the linear region;
equating their current equations we get

(Eq 28.2)

Substituting Eq 28.1 into Eq 28.2, we obtain a limit on the ratio of the (W/L) values of
M3 and M1:

(Eq 28.3)
28.5 WRITE Operation
Consider the write '0' operation, assuming that logic '1' is stored in the SRAM cell
initially. Figure 28.51 shows the voltage levels in the CMOS SRAM cell at the beginning
of the data write operation. The transistors M1 and M6 are turned off, while M2 and M5
are operating in the linear mode. Thus the internal node voltages are V1 = VDD and
V2 = 0 before the access transistors are turned on. The column voltage Vb is forced to
'0' by the write circuitry. Once M3 and M4 are turned on, we expect the node voltage V2
to remain below the threshold voltage of M1, since M2 and M4 are designed according to
Eq 28.1.

Fig 28.51: SRAM start of write '0'

The voltage at node 2 would not be sufficient to turn on M1. To change the stored
information, i.e., to force V1 = 0 and V2 = VDD, the node voltage V1 must be reduced
below the threshold voltage of M2, so that M2 turns off. While V1 is being pulled down,
the transistor M3 operates in the linear region and M5 operates in the saturation region.
Equating their current equations we get

(Eq 28.4)

Rearranging, with the condition of Eq 28.1 applied to the result, we obtain the required
relationship between the (W/L) values of M5 and M3:

(Eq 28.5)
28.6 WRITE Circuit
The principle of the write circuit is to pull the voltage of one of the two columns to a low
level. This is achieved by connecting either bit line to ground through the transistor M3
and either M1 or M2. The transistor M3 is driven by the column decoder selecting the
specified column. The transistor M1 is on only in the presence of the write enable signal
and when the data bit to be written is '0'. The transistor M2 is on only in the presence
of the write enable signal and when the data bit to be written is '1'. The circuit for the
write operation is shown in Figure 28.61.

Fig 28.61: Circuit for write operation

Recap
In this lecture you have learnt the following
SRAM Basics
CMOS SRAM Cell
CMOS SRAM Cell Design
READ Operation
WRITE Operation

Congratulations, you have finished Lecture 28.

Module 6 : Semiconductor Memories


Lecture 29 : Basics Of DRAM Cell And Access Time Consideration
Objectives
In this lecture you will learn the following
DRAM Basics
Differential Operation In Dynamic RAMs
DRAM Read Process With Dummy Cell
Operation Of The Read Circuit
Calculation Of Change In Bitline Voltage
Area Considerations
Metal Gate Diffusion Storage
29.1 DRAM Basics
A typical 1-bit DRAM cell is shown in Figure 29.11

Fig 29.11: DRAM Cell


The CS capacitor stores the charge for the cell. Transistor M1 gives the R/W access to
the cell. CB is the capacitance of the bit line per unit length.
Memory cells are etched onto a silicon wafer in an array of columns (bit lines) and rows
(word lines). The intersection of a bit line and word line constitutes the address of the
memory cell.
DRAM works by sending a charge through the appropriate column (CAS) to activate the
transistor at each bit in the column. When writing, the row lines carry the state the
capacitor should take on. When reading, the sense amplifier determines the level of
charge in the capacitor: if it is more than 50%, it reads it as '1'; otherwise it reads it as
'0'. A counter tracks the refresh sequence based on which rows have been accessed in
what order. The length of time necessary to do all this is so short that it is expressed in
nanoseconds. For example, a memory chip rating of 70 ns means that it takes 70
nanoseconds to completely read and recharge each cell.
The capacitor in a dynamic RAM memory cell is like a leaky bucket. Dynamic RAM has to
be dynamically refreshed all of the time or it forgets what it is holding. This refreshing
takes time and slows down the memory.
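The "leaky bucket" picture is what sets the refresh interval. A toy RC-decay sketch (every number here is hypothetical, chosen only to land in the millisecond range the text mentions):

```python
import math

# A stored '1' on the cell capacitor decays roughly as V(t) = V0*exp(-t/RC)
# through the leakage path. The cell must be refreshed before V drops below
# what the sense amplifier can still resolve as a '1'.

def refresh_deadline(v0, v_min, r_leak, c_storage):
    """Latest refresh time: solve v0*exp(-t/(R*C)) = v_min for t."""
    return r_leak * c_storage * math.log(v0 / v_min)

# Hypothetical cell: 30 fF storage cap, 1e12 ohm leakage path, and a sense
# limit of 60% of the stored 1.8 V.
t = refresh_deadline(1.8, 1.08, 1e12, 30e-15)
print(t)   # ~0.015 s: consistent with millisecond-scale refresh
```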
29.2 Differential Operation In Dynamic RAMs
The sense amplifier responds to the difference in the signals appearing on the bit lines.
It is capable of rejecting interference signals that are common to both lines, such as
those caused by capacitive coupling from the word lines. For this common-mode
rejection to be effective, both sides of the amplifier must be matched, taking into
account the circuits that feed each side. This is required in order to make the inherently
single-ended output of the DRAM cell appear differential.
Single To Differential Conversion:
Large memories (>1 Mbit), which are exceedingly prone to noise disturbances, resort to
translating the single-ended sensing problem into a differential one. The basic concept
behind the single-to-differential conversion is demonstrated in Figure 29.21.

Fig 29.21: Single to differential conversion


A differential sense amplifier is connected to a single-ended bit line on one side and to a
reference voltage, positioned between the '0' and '1' levels, on the other. Depending on
the value of BL, the flip-flop toggles in one or the other direction. Voltage levels tend to
vary from die to die, or even across a single die, so the reference source must track
those variations. A popular way of doing so is illustrated in Figure 29.31 for the case of
the 1T DRAM. The memory array is divided into two halves, with the differential
amplifier placed in the middle. On each side, a column called the Dummy cell is added;
these are 1T memory cells that are similar to the others, but whose sole purpose is to
serve as a reference. This approach is often called the Open bit line architecture.

29.3 DRAM Read Process With Dummy Cell

Circuit Construction:
The circuit is illustrated in Figure 29.31. Each bit line is split into two identical halves.
Each half line is connected to half of the cells in the column plus an additional cell,
known as the Dummy cell, having a capacitor C_D. When a word line on the left side is
selected for reading, the Dummy cell on the right side (controlled by XR) is also
selected, and vice versa; i.e., when a word line on the right side is selected, the Dummy
cell on the left (controlled by XL) is also selected. In effect, then, the Dummy cell serves
as the other half of a differential DRAM cell. When the left half bit line is in operation,
the right half bit line acts as its complement, and vice versa. The cells shown here are
the cells of a column, though they are drawn like a row. The select lines are distributed
such that all the even X's are in the right half and all the odd X's are in the left half.

Fig 29.31: Arrangement for obtaining differential operation


from the single ended DRAM cell
29.4 Operation Of The Read Circuit
The circuit is shown in Figure 29.31. The two halves of the line are precharged to VDD/2
and their voltages are equalized. At the same time, the capacitors of the two Dummy
cells are precharged to VDD/2. Then a word line is selected, and the Dummy cell of the
other side is enabled (with XL or XR raised to VDD). Thus the half line connected to the
selected cell will develop a voltage increment (relative to VDD/2) of Δv(1) or Δv(0),
depending on whether a '1' or a '0' is stored in the cell. Meanwhile, the other half of the
line will have its voltage held equal to that of C_D (i.e., VDD/2). The result is a
differential signal that the sense amplifier detects and amplifies when it is enabled. As
usual, by the end of the regenerative process, the amplifier will cause the voltage on
one half of the line to become VDD and that on the other half to become 0.

Fig 29.41: Timing diagram of DRAM operation

If the X1 cell is accessed, then the dummy cell on the right side is also selected. The
actual voltage of the bit line is then set by charge sharing between the bit-line
capacitance and the storage capacitance,

V = (CB·VP + CS·VS) / (CB + CS)

where CB is the bit-line capacitance, CS is the storage cell capacitance, and VS is the
voltage across the storage capacitor. The storage capacitance, together with a dummy
cell, is present on both sides of the column.
If VS equals the precharge voltage VP, then V = VP; this shows that there is no charge
sharing. If VS = 0, the bit line is pulled below the precharge level, while the reference
half line, held by the discharged-then-precharged Dummy cell, remains at the reference
level, and the sense amplifier resolves the difference.


The tricky parts of a DRAM cell lie in the design of the circuitry to read out the stored
value and in the design of the capacitor, to maximise the stored charge while minimising
the storage capacitor size. Stored values in DRAM cells are read out using sense
amplifiers, which are extremely sensitive comparators that compare the value stored in
the DRAM cell with that of a reference cell. The reference cell used is a dummy cell
which stores a voltage halfway between the two voltage levels used in the memory cell
(experimental multilevel cells use slightly different techniques). Improvements in sense
amplifiers reduce sensitivity to noise and compensate for differences in threshold
voltages among devices.
29.5 Calculation Of Change In Bitline Voltage
An array of DRAM cells is laid out as in Figure 29.21

Fig 29.21: Array of DRAM Cells

Let the capacitance per unit length of bitline be CB and the storage capacitance of the
DRAM cell be CS. If there are n such sections (as shown) then the net bit capacitance of
the bitline is nCB.
Let the bitline be precharged to VP. When the bitline is connected to the capacitance of the DRAM cell, the net voltage settles at an intermediate value due to charge sharing, given by:

V = (nCB·VP + CS·VS) / (nCB + CS)    (Eq 29.1)

A term, the Charge Transfer Ratio (CTR), is defined in this context as

CTR = CS / (CS + nCB)    (Eq 29.2)

For a particular technology, CB is fixed, so only CS can be changed. When the bitline is connected to the storage capacitor, the change of voltage at the bitline is given by

ΔV = V - VP = (VS - VP)·CTR = (VS - VP)·CS / (CS + nCB)    (Eq 29.3)

For a good design, the value of ΔV (the change in voltage at the bitline) should be as high as possible, so that the sense amplifier can sense the bit correctly and quickly. An increase in ΔV requires the CTR to increase, which limits the value of n. Since CTR depends on CB and CS, a larger n can be supported if the storage capacitance CS is increased, or the bitline capacitance CB is decreased, or both. However, increasing the value of the storage capacitance requires a larger area.
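The arithmetic of Eqs 29.1-29.3 can be checked numerically. A minimal sketch; the capacitance values, section count and precharge voltage below are illustrative assumptions, not data for any particular process:

```python
def bitline_voltage(vp, vs, cs, cb, n):
    """Charge-sharing voltage when a cell joins a precharged bit line (Eq 29.1)."""
    return (n * cb * vp + cs * vs) / (n * cb + cs)

def ctr(cs, cb, n):
    """Charge Transfer Ratio, CTR = CS / (CS + n*CB) (Eq 29.2)."""
    return cs / (cs + n * cb)

def delta_v(vp, vs, cs, cb, n):
    """Change in bit-line voltage on access (Eq 29.3)."""
    return (vs - vp) * ctr(cs, cb, n)

# Illustrative (assumed) values: 30 fF cell, 0.5 fF of bit-line capacitance
# per section, n = 256 sections, bit line precharged to VDD/2 = 2.5 V.
cs, cb, n, vp = 30e-15, 0.5e-15, 256, 2.5
swing_1 = delta_v(vp, 5.0, cs, cb, n)  # reading a stored '1': positive swing
swing_0 = delta_v(vp, 0.0, cs, cb, n)  # reading a stored '0': negative swing
```

With these numbers the read swing is roughly ±0.47 V; shrinking CS or lengthening the bit line (larger n) shrinks the swing, which is exactly the trade-off discussed above.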

29.6 Area Considerations


To reduce the area requirement while still allowing a larger storage capacitance, the retrofit technique is used. In this case, the capacitors are laid out as shown below in Figure 29.31.

Fig 29.31: Retrofit Technique


A standard DRAM cell uses diffusion, poly or metal as the bit line; this will be discussed later.
Typical area calculation for storage capacitance is illustrated in Figure 29.32

Fig 29.32: Area Calculation


Area, A, consumed by the capacitor is given by:

(Eq 29.4)
A large area leads to a large storage capacitance, and a large storage capacitance leads to a large change in bit-line voltage ΔV; therefore the access time will be small. A DRAM cell with a small access time can be designed by improving both the cell itself and the sense amplifier.

29.4 Metal Gate Diffusion Storage


The cross section of a Metal Gate Diffusion Storage is shown below in Figure 29.41.

Fig 29.41: Metal Gate Diffusion Storage


In this case the charge is stored in the depletion capacitance between the substrate and the diffusion region. The problems with this design are:
The diffusion line has a higher capacitance; thus CB increases and hence the CTR decreases.
Parasitic capacitances are higher and the gate is not self-aligned.
There is a routing problem associated with this kind of design.
A very similar configuration can be used for inversion storage. In this case, when the gate input goes HIGH, the inversion charge stored in the capacitor is drained out and the potential at the channel drops, indicating that a '1' was stored. The reverse phenomenon occurs if a '0' is stored.
Several DRAM cells are in use now. A few of them are as follows:
SPDB - Single Poly Diffused Bit
SPPB - Single Poly Poly Bit
SPMB - Single Poly Metal Bit
Similarly it is possible to have:
DPDB - Dual Poly Diffused Bit
DPPB - Dual Poly Poly Bit
DPMB - Dual Poly Metal Bit
Layout of a SPDB is shown in Figure 29.42

Fig 29.42: Layout of SPDB Cell


Recap
In this lecture you have learnt the following
DRAM Basics
Calculation Of Change In Bitline Voltage
Area Considerations
Metal Gate Diffusion Storage

Congratulations, you have finished Lecture 29.

Module 2 : MOSFET
Lecture 3 : Introduction to MOSFET
Objectives
In this lecture you will learn the following:
Basic MOS Structure
Types of MOSFET
MOSFET I-V Modelling
3.1 Basic MOSFET Structure
In the introduction to systems, we got an overview of the various levels of design, viz. architectural-level design, program-level design, functional-level design and logic-level design. However, we cannot go further into these levels of design unless we are exposed to the basics of operation of the device currently used to realize logic circuits, viz. the MOSFET (Metal Oxide Semiconductor Field Effect Transistor). So in this section, we will study the basic structure of the MOSFET.
The cross-sectional and top/bottom view of MOSFET are as in figures 3.11 and 3.12
given below :

fig 3.11 Cross-sectional view of MOSFET

fig 3.12 Top/Bottom View of MOSFET

An n-type MOSFET consists of a source and a drain, two highly conducting n-type semiconductor regions which are separated from the p-type substrate by reverse-biased p-n diodes. A metal or polycrystalline-silicon gate covers the region between the source and drain, but is isolated from the semiconductor by the gate oxide.
3.2 Types of MOSFET
MOSFETs are divided into two types, viz. p-MOSFET and n-MOSFET, depending upon the type of their source and drain regions.

Fig. 3.21: p-MOSFET

Fig. 3.22: n-MOSFET

Fig. 3.23: c-MOSFET

The combination of an n-MOSFET and a p-MOSFET (as shown in figure 3.23) is called a cMOSFET, which is the most widely used MOSFET configuration. We will look at it in more detail later.
3.3 MOSFET I-V Modelling
We are interested in finding the output characteristics (ID versus VDS) and the transfer characteristics (ID versus VGS) of the MOSFET. In other words, we can find out both if we can formulate a mathematical equation of the form:

ID = f(VGS, VDS; material parameters, W, L)

Intuitively, we can say that the voltage-level specifications and the material parameters cannot be altered by designers. So the only tools in the designer's hands with which he/she can improve the performance of the device are its dimensions, W and L (shown in the top view of the MOSFET, fig 3.12). In fact, the most important parameter in device design is the ratio of W and L.
The equations governing the output and transfer characteristics of an n-MOSFET and a p-MOSFET are:

n-MOSFET (VGS > VTn):
Linear (VDS < VGS - VTn):
ID = µn·Cox·(W/L)·[(VGS - VTn)·VDS - VDS^2/2]
Saturation (VDS >= VGS - VTn):
ID = (µn·Cox/2)·(W/L)·(VGS - VTn)^2

p-MOSFET: the same equations hold with all voltage and current polarities reversed (the device conducts for VGS < VTp, with VTp negative).
The output characteristics plotted for few fixed values of for p-MOSFET and n-MOSFET
are shown next :

fig 3.31 p-MOSFET

fig 3.32 n-MOSFET

The transfer characteristics of both p-MOSFET and n-MOSFET are plotted for a fixed
value of as shown next :

fig 3.33 p-MOSFET

fig 3.34 n-MOSFET
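As a sketch of the standard long-channel (square-law) MOSFET model behind these characteristics; the threshold voltage, process transconductance kp = µ·Cox and W/L values below are illustrative assumptions:

```python
def nmos_id(vgs, vds, vt=0.7, kp=100e-6, w_over_l=2.0):
    """Long-channel n-MOSFET drain current, square-law model.

    vt (V), kp = mu*Cox (A/V^2) and W/L are illustrative assumptions.
    """
    vov = vgs - vt                         # overdrive voltage
    if vov <= 0:
        return 0.0                         # cutoff: no channel formed
    if vds < vov:                          # linear (triode) region
        return kp * w_over_l * (vov * vds - vds ** 2 / 2)
    return 0.5 * kp * w_over_l * vov ** 2  # saturation region

# Output characteristic: sweep VDS from 0 to 3 V at a fixed VGS = 2 V
output_curve = [nmos_id(2.0, v / 10.0) for v in range(31)]
```

A p-MOSFET curve follows by negating all voltages and the resulting current.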

Note: From now onwards in the lectures, we will symbolize MOSFET by MOS.
3.4 C-V Characteristics of a MOS Capacitor
As we have seen earlier, there is an oxide layer below Gate terminal. Since oxide is a
very good insulator, it contributes an oxide capacitance to the circuit. Normally, the capacitance of a capacitor does not change with the voltage applied across its terminals. However, this is not the case with the MOS capacitor: we find that its capacitance changes with the gate voltage. This is because the application of a gate voltage results in band bending in the silicon substrate and hence a variation in the charge concentration at the Si-SiO2 interface. Also we can see (from fig 3.42) that the curve splits into two branches after a certain voltage (the reason will be explained later), depending upon the frequency (high or low) of the AC voltage applied at the gate. This voltage is called the threshold voltage (Vth) of the MOS capacitor.

fig 3.41 Cross section view of MOS Capacitor

Fig 3.42: plot of MOS Capacitor

Recap
In this lecture you have learnt the following
Basic MOS Structure
Types of MOSFET
MOSFET I-V Modelling
Congratulations, you have finished Lecture 3.

Module 6 : Semiconductor Memories


Lecture 30 : SRAM and DRAM Peripherals
Objectives
In this lecture you will learn the following
Introduction
SRAM and its Peripherals
DRAM and its Peripherals
30.1 Introduction
Even though a lot of the concepts here have been discussed earlier, they are repeated
for convenience.
Broadly memories can be classified into
RAM (Random Access Memory)
Serial Memory
A RAM is one in which the time required for accessing and retrieving information is independent of the physical location of that information. In contrast, in a serial memory, the data is available only in the same sequence in which it was stored.
The following diagram shows the organization of a Memory

Fig 30.11: Organization of Memory


This memory consists of two address decoders, viz. row and column decoders, to select a particular bit in the memory. If there are M rows and N columns, then the number of bits that can be accessed is M x N. Either a read operation or a write operation can be done on any selected bit by the use of control signals.
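As a behavioural sketch of the row/column selection (not a transistor-level decoder), a flat address in an M x N array splits as:

```python
def decode(addr, m_rows, n_cols):
    """Split a flat bit address into (row, column) selects for an M x N array."""
    if not 0 <= addr < m_rows * n_cols:
        raise ValueError("address out of range")
    # The row decoder takes the high-order part, the column decoder the low-order part.
    return addr // n_cols, addr % n_cols

# 8 rows x 16 columns = 128 addressable bits; one address picks one cross point.
row, col = decode(37, m_rows=8, n_cols=16)
```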
RAMs are once again classified into two types:
SRAM (Static RAM)
DRAM (Dynamic RAM)
30.2 SRAM and Its Peripherals

Fig 30.21: SRAM Cell


Figure 30.21 shows a standard 6-transistor SRAM cell. The signal designated WL is the word line used to read from or write into the cell; BIT and BIT' carry the data to be written into the cell.

Fig 30.22: Circuit for reading and writing data into cell
The circuits shown in Figure 30.22 are used to write the data to and read the data from the cell. When a read operation is to be performed, the READ signal is made HIGH and, at the same time, WRITE is made LOW. As a result, the data present on the BIT and BIT' lines are transferred to the inputs of the sense amplifier (sense-amplifier operation will be discussed shortly). The sense amplifier then senses the data and gives the output. During the write operation, READ is made LOW and WRITE is made HIGH; thus the input data and its complement are written onto the BIT and BIT' lines respectively.

However, the read and write operations on a particular cell take place only if the cell is enabled by the corresponding row (word) and column (digit) lines. It is important to remember that before every read operation, the BIT and BIT' lines are precharged to a voltage (usually VDD/2). During a read operation, one of the two bit lines (BIT or BIT') discharges slightly, whereas the other charges to a voltage slightly greater than its precharged value. The difference between these voltages is detected by the sense amplifier to produce an output voltage corresponding to the value stored in the cell being read. Care should be taken in sizing the transistors to ensure that the data stored in the cell does not change its value during a read.
30.3 Sense Amplifier
The circuit shown in Figure 30.31 is the sense amplifier used to read data from the cell. As soon as the SE signal goes HIGH, the amplifier senses the difference between the BIT and BIT' voltages and produces an output voltage appropriately. The access time of the memory, which is defined as the time between the initiation of the read operation and the appearance of the output, depends mainly on the performance of the sense amplifier. The design of the sense amplifier therefore forms the main criterion in the design of memories. The one shown here is a simple sense amplifier.

Fig 30.31: Differential Sense Amplifier


Figure 30.32 shows the block diagram of a memory cell with all the peripherals

Fig 30.32: Block Diagram Of A Memory Cell With All Its Peripherals
30.4 Another Type of Sensing

Figure 30.41 illustrates the SRAM sensing scheme.

Fig 30.41: SRAM Sensing Scheme


In the above figure, the precharge signal is used to precharge the BIT and BIT' lines before every read operation. The transistor labelled EQ is the equalization transistor, which ensures equal voltages on the BIT and BIT' lines after precharge. SE is the sense-enable signal used to sense the voltage difference between the BIT and BIT' lines.

As mentioned earlier, the access time of the memory mainly depends on the performance of the sense amplifier. In contrast with the simple sense amplifier shown earlier, Figure 30.42 shows an amplifier which is somewhat more complicated, in order to improve the performance.

Fig 30.42: Two Stage Differential Amplifier


30.5 DRAM and Its Peripherals
The circuit shown in Figure 30.51 is the simple one-transistor DRAM cell. Charge sharing takes place between the bit-line and storage capacitances during read and write operations in the following manner. During the write cycle, CS is charged or discharged by asserting WL and BL. During the read cycle, charge redistribution takes place between the bit line and the storage capacitance, giving a bit-line voltage change of

ΔV = (VS - VPRE)·CS / (CS + CB)    (Eq 30.1)

where VPRE is the bit-line precharge voltage. The voltage swing is small, typically around 250 mV.
Figure 30.52 shows a simple 3-transistor DRAM cell.

Fig 30.52: 3-Transistor DRAM Cell


Figure 30.53 shows a very simple address decoder. Such address decoders are compulsory in main memories, although cache memories can avoid the use of full address decoders. Many other architectures are available for address decoding.

Fig 30.53: A Simple Address Decoder

Recap
In this lecture you have learnt the following
Introduction
SRAM and its Peripherals
DRAM and its Peripherals
Congratulations, you have finished Lecture 30.

Module 6 : Semiconductor Memories


Lecture 31 : Semiconductor ROMs
Objectives
In this lecture you will learn the following
Introduction to Semiconductor Read Only Memory (ROM)
NOR based ROM Array
NAND based ROM Array
31.1 Introduction
Read only memories are used to store constants, control information and program
instructions in digital systems. They may also be thought of as components that provide
a fixed, specified binary output for every binary input.
The read only memory can also be seen as a simple combinational Boolean network,
which produces a specified output value for each input combination, i.e. for each
address. Thus storing binary information at a particular address location can be achieved
by the presence or absence of a data path from the selected row (word line) to the
selected column (bit line), which is equivalent to the presence or absence of a device at
that particular location.
The two different types of implementations of ROM array are:
NOR-based ROM array
NAND-based ROM array
31.2 NOR-based ROM Array
There are two different ways to implement MOS ROM arrays. Consider first the 4-bit x 4-bit memory array shown in Figure 31.21. Here, each column consists of a pseudo-nMOS NOR gate driven by some of the row signals, i.e., the word lines.

Fig 31.21: NOR-based ROM array


As we know, only one word line is activated at a time by raising its voltage to VDD, while all other rows are held at a low voltage level. If an active transistor exists at the cross point of a column and the selected row, the column voltage is pulled down to the logic LOW level by that transistor. If no active transistor exists at the cross point, the column voltage is pulled HIGH by the pMOS load device. Thus, a logic "1"-bit is stored as the absence of an active transistor, while a logic "0"-bit is stored as the presence of an active transistor at the cross point. The truth table is shown in Figure 31.22.

Fig 31.22: Truth Table for Figure 31.21
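This read rule, a '0' wherever an active transistor is present and a '1' wherever it is absent, can be modelled directly (the 4 x 4 contents below are invented for illustration and do not reproduce Figure 31.22):

```python
# True = active nMOS transistor present at the row/column cross point
# (invented programming pattern, purely for illustration).
present = [
    [True,  False, True,  False],
    [False, True,  False, True ],
    [True,  True,  False, False],
    [False, False, True,  True ],
]

def read_word_nor(row):
    """Raise one word line: a present transistor pulls its column LOW ('0');
    where no transistor exists, the pMOS load pulls the column HIGH ('1')."""
    return [0 if t else 1 for t in present[row]]
```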


In an actual ROM layout, the array can be initially manufactured with nMOS transistors at every row-column intersection. The "1"-bits are then realized by omitting the drain or source connection, or the gate electrode, of the corresponding nMOS transistors in the final metallization step. Figure 31.23 shows nMOS transistors in a NOR ROM array, at the intersections of two metal bit lines and two polysilicon word lines. To save silicon area, the transistors in every two rows are arranged to share a common ground line, also routed in n-type diffusion. To store a "0"-bit at a particular address location, the drain diffusion of the corresponding transistor must be connected to the metal bit line via a metal-to-diffusion contact. Omission of this contact, on the other hand, results in a stored "1"-bit.

Fig 31.23: Metal column line to load devices


The layout of the ROM array is shown below in Figure 31.24.

Fig 31.24: Programming using the Active Layer Only

Figure 31.25 shows a different type of NOR ROM layout implementation, which is based on deactivating nMOS transistors by raising their threshold voltages through channel implants. In this case, all nMOS transistors are already connected to the column lines; therefore, storing a "1"-bit at a particular location by omitting the corresponding drain contact is not possible. Instead, the nMOS transistor corresponding to the stored "1"-bit is deactivated, i.e. permanently turned off, by raising its threshold voltage above the VCH level through a selective channel implant during fabrication.

Fig 31.25: Programming using the Contact Layer Only

31.3 NAND-based ROM Array


In this type of ROM array, shown in Figure 31.31, each bit line consists of a depletion-load NAND gate driven by some of the row signals, i.e. the word lines. In normal operation, all word lines are held at the logic HIGH voltage level except for the selected line, which is pulled down to the logic LOW level. If a transistor exists at the cross point of a column and the selected row, that transistor is turned off and the column voltage is pulled HIGH by the load device. On the other hand, if no transistor exists (i.e. the device is shorted) at that particular cross point, the column voltage is pulled LOW by the other nMOS transistors in the multi-input NAND structure. Thus, a logic "1"-bit is stored by the presence of a transistor that can be deactivated, while a logic "0"-bit is stored by a shorted or normally-ON transistor at the cross point.

Fig 31.31: NAND-based ROM
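The complementary NAND read rule (a '1' where a deactivatable transistor sits at the selected cross point, a '0' where the implant makes it normally ON) can be sketched the same way; the contents are again invented for illustration:

```python
# True = enhancement device (can be turned off) at the cross point;
# False = depletion implant (normally ON). Invented programming pattern.
enhancement = [
    [True,  False, True,  True ],
    [False, True,  False, True ],
]

def read_word_nand(row):
    """Pull one word line LOW: only an enhancement device there breaks the
    series path, letting the load pull the column HIGH ('1'); a depletion
    device keeps conducting, so the column stays LOW ('0')."""
    return [1 if enh else 0 for enh in enhancement[row]]
```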


As in the NOR ROM case, the NAND-based ROM array can be fabricated initially with a
transistor connection present at every row-column intersection. A "0"-bit is then stored
by lowering the threshold voltage of the corresponding nMOS transistor at the cross
point through a channel implant, so that the transistor remains ON regardless of the
gate voltage. The availability of this process step is also the reason why depletion-type
nMOS load transistors are used instead of pMOS loads.

Fig 31.32: Truth table for Fig 31.31


Figures 31.33 and 31.34 show two different layout implementations of the NAND ROM array. In the implant-mask NAND ROM array, vertical columns of n-type diffusion intersect at regular intervals with horizontal rows of polysilicon, which results in an nMOS transistor at each intersection point. The transistors with a threshold-voltage implant operate as normally-ON depletion devices, thereby providing a continuous current path regardless of the gate voltage level. Since this structure has no contacts embedded in the array, it is much more compact than the NOR ROM array. However, the access time is usually slower than that of the NOR ROM, due to the multiple series-connected nMOS transistors in each column.

Fig 31.33: Programming using the Metal-1 Layer Only

Fig 31.34: Programming using Implants Only

Recap
In this lecture you have learnt the following
Introduction to Semiconductor Read Only Memory (ROM)
NOR based ROM Array
NAND based ROM Array
Congratulations, you have finished Lecture 31.

Module 6 : Semiconductor Memories


Lecture 32 : Few special Examples of Memories
Objectives
In this lecture you will learn the following
Non-Volatile READ-WRITE Memory
The Floating Gate Transistor
Erasable Programmable Read Only Memory (EPROM)
Electrically Erasable Programmable Read Only Memory ( E2PROM)
32.1 Non-Volatile Read-Write Memory
The architecture of Non-Volatile Read-Write (NVRW) Memory is virtually identical to the
ROM structure. The memory core consists of an array of transistors placed on the
wordline/bitline grid. Selectively disabling or enabling some of the devices programs the
memory. In a ROM, this is accomplished by mask-level alterations. In a NVRW memory, a modified transistor that permits its threshold to be altered electrically is used. This modified threshold is retained indefinitely (or at least for the device lifetime, typically of the order of 10 years) even when the supply is turned off. To reprogram the memory, the
programmed values must be erased, after which a new programming round must be
started. The method of erasing is the main differentiating factor between the various
classes of reprogrammable nonvolatile memories. The programming of the memory is
typically an order of magnitude slower than the reading operation.
32.2 Floating Gate Transistor
Over the years, various attempts have been made to create a device with electrically
alterable characteristics and enough reliability to support a multitude of write cycles. For
example, the MNOS (Metal Nitride Oxide Semiconductor) transistor held promise, but
has been unsuccessful until now. In this device, threshold-modifying electrons are
trapped in a Si3N4 layer deposited on the top of the gate SiO2. A more accepted solution
is offered by the floating gate transistor shown in Figure 32.21, which forms the core of
virtually every NVRW memory built today.

Fig 32.21: FAMOS Structure


The structure is similar to a traditional MOS device, except that an extra polysilicon strip is inserted between the gate and the channel. This strip is not connected to anything and is called the floating gate. The most obvious impact of inserting this extra gate is to double the gate-oxide thickness tox, which results in a reduced device transconductance as well as an increased threshold voltage. Neither of these properties is desirable.
This device has the property that its threshold voltage is programmable. Applying a high voltage (about 10V) between the source and drain terminals creates a high electric field and causes avalanche injection to occur. Electrons acquire sufficient energy to become 'hot' and traverse the first oxide insulator, so that they get trapped on the floating gate. This phenomenon can occur with oxides as thick as 100nm, which makes the device relatively easy to fabricate. In reference to this programming mechanism, the floating-gate transistor is often called the Floating Gate Avalanche Injection MOS (FAMOS).
The trapping of electrons on the floating gate effectively drops the voltage on that gate.
This process is self-limiting and the negative charge accumulated on the floating gate
reduces the electrical field over the oxide so that ultimately it becomes incapable of
accelerating more hot electrons. Removing the voltage leaves the induced negative
charge in place, and results in a negative voltage on the intermediate gate. From the
device point of view, this translates into an effective increase in threshold voltage. To
turn on the device, a higher voltage is needed to overcome the effect of the induced
negative charge. Typically, the resulting threshold voltage is around 7V; thus a 5V gate-to-source voltage is not sufficient to turn on the transistor, and the device is effectively disabled.
Since the floating gate is surrounded by SiO2, which is an excellent insulator, the
trapped charge can be stored for many years, even when the supply voltage is removed,
creating the nonvolatile storage mechanism. One of the major concerns of the floating
gate approach is the need for high programming voltages. By tailoring the impurity
profiles, technologists have been able to reduce the required voltage from the original
25V to approximately 12.5V in today's memories.

32.3 Erasable Programmable Read Only Memory (EPROM)


An EPROM is erased by shining ultraviolet light on the cells through a transparent
window in the package. The UV radiation renders the oxide slightly conductive by direct
generation of electron-hole pair in the material. The erasure process is slow and can
take from seconds to several minutes, depending on the intensity of the UV source.
Programming takes several (5-10) microseconds/word. Another problem with the
process is limited endurance, that is, the number of erase/program cycles is generally
limited to maximum of 1000, mainly as a result of UV erase procedure. Reliability is also
an issue. The device threshold might vary with repeated programming cycles. Most
EPROM memories therefore contain on-chip circuitry to control the value of thresholds to
within a specified range during programming. Finally, the injection always entails a large
channel current, as high as 0.5mA at a control gate voltage of 12.5V. This causes high
power dissipation during programming. The EPROM cell is extremely simple and dense,
making it possible to fabricate large memories at a low cost. EPROMs were therefore
attractive in applications that do not require regular programming. Due to cost and
reliability issues, EPROMs have fallen out of favor and have been replaced by Flash
Memories.
32.4 Electrically Erasable Programmable Read Only Memory
(EEPROM)
The major disadvantage of the EPROM approach is that erasure procedure has to occur
"off system". This means the memory must be removed from the board and placed in
the EPROM programmer for programming. The EEPROM approach avoids this labor
intensive and annoying procedure by using another mechanism to inject or remove
charges from the floating gate viz. tunneling. A modified floating gate device called the
FLOTOX (Floating Gate Tunnel Oxide) transistor is used as a programmable device that
supports an electrical erasure procedure. A cross section of the FLOTOX structure is
shown in Figure 32.41.

Fig 32.41: FLOTOX Structure


It resembles the FAMOS device, except that a portion of the dielectric separating the
floating gate from the channel and drain is reduced in thickness to about 10nm or less.

When a voltage of approximately 10V is applied over the thin insulator, electrons can
move to and from the floating gate through tunneling.
The main advantage of this programming approach is that it is reversible; that is,
erasing is simply achieved by reversing the voltage applied during the writing process.
Injecting electrons onto the floating gate raises the threshold, while the reverse
operation lowers it. This bidirectionality, however, introduces a threshold control
problem: removing too much charge from the floating gate results in a depletion device
that cannot be turned off by the standard wordline signals. Notice that the resulting
threshold voltage depends on initial charge on the gate, as well as the applied
programming voltages. It is a strong function of the oxide thickness, which is subject to
non-negligible variations over the die. To remedy this problem, an extra transistor
connected in series with the floating gate transistor is added to the EEPROM cell. This
transistor acts as the access device during the read operation, while the FLOTOX
transistor performs the storage function. This is in contrast to the EPROM cell, where the
FAMOS transistor acts as both the programming and access device.
The EEPROM cell with its two transistors is larger than its EPROM counterpart. This area
penalty is further aggravated by the fact that the FLOTOX device is intrinsically larger
than the FAMOS transistor due to the extra area of the tunneling oxide. Additionally,
fabrication of very thin oxide is a challenging and costly manufacturing step. Thus
EEPROM components pack fewer bits at a higher cost than EPROMs. On the positive side, EEPROMs offer high versatility and tend to last longer, as they can support up to 100,000 erase/write cycles. Even so, repeated programming causes a drift in the threshold voltage due to permanently trapped charges in the SiO2. This finally leads to malfunction or the inability to reprogram the device.
Recap
In this lecture you have learnt the following
Non-Volatile READ-WRITE Memory
The Floating Gate Transistor
Erasable Programmable Read Only Memory (EPROM)
Electrically Erasable Programmable Read Only Memory (E2PROM)

Congratulations, you have finished Lecture 32.

Module 7 : I/O PADs


Lecture 33 : I/O PADs
Objectives
In this lecture you will learn the following
Introduction
Electrostatic Discharge
Output Buffer
Tri-state Output Circuit
Latch-Up
Prevention of Latch-Up
33.1 Introduction
Pad cells surround the rectangular metal patches where external bonds are made. Pads must be sufficiently large and sufficiently spaced apart from each other. There are three main types of pad cells: input, output and power (plus tristate and analog variants). A typical pad cell should have
sufficient connection area (eg. 85 x 85 microns) in the pad,
electrostatic discharge (ESD) protection structures
interface to internal circuitry
circuitry specific to input and output pads
Pads are generally arranged around the chip perimeter in a "pad frame"; in smaller designs the pad frame has a single ring of pads. The lower limit on pad size is the minimum size to which a bond wire can be attached, typically 100-150 micrometers; this is also the minimum pitch at which bonding machines can operate.
The gates of the input-buffer transistors behind input pads are susceptible to high-voltage build-up, so ESD protection is needed for them. Output pads are expected to drive large capacitive loads, so the load characteristics must be met by proper sizing of the output buffer. Because of the large transistors, I/O currents are high and latch-up may occur; to prevent this, guard rings are used in the layout. For area efficiency, I/O transistors should be constructed from several small transistors in parallel. Long gates must be provided to reduce the tendency toward avalanche breakdown.
33.2 Electrostatic Discharge (ESD)
ESD damage is usually caused by poor handling procedures, and is especially severe in low-humidity environments. Electrostatic discharge is a pervasive reliability concern in VLSI circuits. It is a short-duration (<200 ns), high-current (>1 A) event that causes irreparable damage. The most common manifestation is the human-body ESD event, where a charge of about 0.6 uC can be induced on a body capacitance of 100 pF, leading to electrostatic potentials of 4 kV or greater.

Whenever the body comes in contact with plastic or other insulating materials, static charge is generated. It can be a very small charge, as low as a few nanocoulombs, but it can cause potential damage to MOS devices, as the resulting voltages are quite high.
We know that

Q = CV, so V = Q/C = I·t/C

Let us consider a modest 1 pF capacitance onto which this 1 nC of charge is put (for example, through a 100 uA current flowing for 10 microseconds). This results in

V = 1 nC / 1 pF = 1000 V

The SiO2 breakdown field is about 10^9 volts/meter. If the gate oxide is about 0.1 um thick, say, the maximum allowable voltage is

10^9 V/m x 0.1 um = 100 V

Such a charge can easily be generated by walking across a carpet!! A human touch can produce instantaneous voltages of 20,000 volts!
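These back-of-the-envelope numbers follow directly from V = Q/C and the oxide-field limit; a quick check:

```python
def esd_voltage(charge_c, cap_f):
    """Voltage developed when a charge lands on a node capacitance: V = Q/C."""
    return charge_c / cap_f

def oxide_limit(thickness_m, breakdown_field=1e9):
    """Maximum voltage a gate oxide survives, assuming ~1e9 V/m breakdown field."""
    return breakdown_field * thickness_m

v_node = esd_voltage(1e-9, 1e-12)  # 1 nC on 1 pF: about 1000 V
v_max = oxide_limit(0.1e-6)        # 0.1 um oxide: about 100 V
```

The discharge voltage exceeds the oxide limit by an order of magnitude, which is why unprotected gate inputs fail.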
A typical solution of the ESD protection problem is to use clamping diodes implemented
using MOS transistors with gates tied up to either GND for nMOS transistors, or to VDD
for pMOS transistors as shown in Figure 33.21. For normal range of input voltages these
transistors are in the OFF state. If the input voltage builds up above (or below) a certain
level, one of the transistors starts to conduct clamping the input voltage at the same
level.

Fig 33.21: Clamping Transistors

These clamping transistors are very big structures consisting of a number of transistors connected in parallel, and are able to sustain significant current. The thick-field-oxide NMOS design is not suitable for deep submicron processes, and the thin-field-oxide NMOS presents oxide-breakdown problems when interfacing between blocks with higher power-supply voltages.
Scaling of VLSI devices has reduced the dimensions of all structures used in ICs, and this has increased their susceptibility to ESD damage. Hence ESD protection issues are becoming increasingly important for deep submicron technologies. Gate-oxide thicknesses are approaching the tunneling regime of around 35 Angstroms. From an ESD perspective, the important issue is whether oxide breakdown is reached before the protection devices are able to turn on and protect the circuit!
33.3 Output Buffer
The intra-chip buffer circuits are relatively well known. They are fast, and need only be
as big as needed to drive their particular load capacitances. However, in the inter-chip
buffer design case, there are some very important limitations. First, these buffers must
be able to drive large capacitive loads, as they are driving off-chip signals, which
means driving I/O pads, parasitic board capacitances, and capacitances on other chips.
Adding a few picofarads of capacitance at the output node is really inconsequential, and shouldn't significantly degrade the propagation delay through this structure. So the output load for worst-case design is taken to be about 50 times the normal load, approximately 50 pF.
transistors in addition to the standard ESD protection circuitry. The driver must be able
to supply enough current (must have enough driving capability) to achieve satisfactory
rise and fall times (tr, tf) for a given capacitive load. In addition the driver must meet
any required DC characteristics regarding the levels of output voltages for a given load
type, viz. CMOS or TTL.

Fig 33.31: Output Buffer

The design method is the same as we have already discussed in previous lectures: the optimum number of stages is found for a load capacitance assumed a priori. The logical-effort method can be used to decide the sizing.
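The optimum-stage-count calculation referred to here is the classic tapered-buffer result, N ≈ ln(CL/Cin) for a per-stage taper of e (in practice tapers of 3-4 are common). A sketch, with assumed capacitance values:

```python
import math

def buffer_chain(c_in, c_load, taper=math.e):
    """Stage count and realized per-stage scale factor for a tapered pad driver."""
    ratio = c_load / c_in
    n = max(1, round(math.log(ratio) / math.log(taper)))  # N ~ ln(CL/Cin)
    f = ratio ** (1.0 / n)                                # actual taper per stage
    return n, f

# Illustrative assumption: 10 fF first-stage input driving a 50 pF pad load.
n_stages, stage_ratio = buffer_chain(c_in=10e-15, c_load=50e-12)
```

Passing taper=4 trades a little delay for fewer, smaller stages, which is the usual practical choice for pad drivers.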
Second, the voltage across any oxide at any time should not be greater than the supply
voltage, which ensures oxide reliability; most process design engineers will not
guarantee oxide reliability for oxide voltages greater than the chip VDD. If a low voltage
chip is tied to a bus which connects several chips, some with higher supply voltages,
then the input buffer must be designed such that there is no chance of a problem with
the oxide.
33.4 Tri-State Output Circuit
The output circuits of VLSI chips are designed to be tri-statable, as shown in Figure 33.41:
the pad is driven only when the output enable signal is asserted. The circuit
implementation requires 12 transistors. In terms of silicon area, however, this
implementation may require a relatively small area, since the last-stage transistors need
not be sized large.

Fig 33.41 Tri-State Output Circuit


33.5 Latch-Up
Large MOS transistors are susceptible to the latch-up effect. In the chip substrate, at the
junctions of the p and n material, parasitic pnp and npn bipolar transistors are formed,
as in the cross-sectional view shown in Figure 33.51.

Fig 33.51: Latch-Up


These bipolar transistors form a silicon-controlled rectifier (SCR) with positive feedback,
as in the circuit model shown in Figure 33.52.

Fig 33.52: SCR With Positive Feedback

The final result of latch-up is the formation of a short circuit (a low-impedance path)
between VDD and GND, which can result in the destruction of the MOS transistors.
33.6 Prevention of Latch-Up

Fig 33.61: Latch-Up Prevention Techniques


The following techniques can be used to prevent latch-up:
Use p+ guard rings connected to ground around nMOS transistors and n+ guard rings
connected to VDD around pMOS transistors, to reduce Rwell and Rsub and to
capture injected minority carriers before they reach the base of the parasitic BJTs.

Place substrate and well contacts as close as possible to the source connections.

Use minimum-area p-wells (in the case of twin-tub technology or an n-type substrate) so
that the p-well photocurrent can be minimized during transient pulses.

Place the source diffusion regions of pMOS transistors so that they lie along
equipotential lines when currents flow between VDD and the p-wells. In some n-well I/O
circuits, wells are eliminated by using only nMOS transistors.

Avoid forward biasing of source/drain junctions so as not to inject high currents;
the use of a lightly doped epitaxial layer on top of a heavily doped
substrate has the effect of shunting lateral currents from the vertical transistor
through the low-resistance substrate.

Lay out n- and p-channel transistors such that all nMOS transistors are placed close
to the GND rail and all pMOS transistors close to the VDD rail. Also maintain
sufficient spacing between pMOS and nMOS transistors.

Recap
In this lecture you have learnt the following
Introduction
Electrostatic Discharge
Output Buffer
Tri-state Output Circuit
Latch-Up
Prevention of Latch-Up

Congratulations, you have finished Lecture 33.

Module 2 : MOSFET
Lecture 4 : MOS Capacitor
Objectives
In this lecture you will learn the following
MOS as Capacitor
Modes of operation
Capacitance calculation of MOS capacitor
4.1 MOS as Capacitor
Referring to Fig. 4.1, we can see that there is an oxide layer below the gate terminal. Since
oxide is a very good insulator, it contributes an oxide capacitance to the circuit.

Fig 4.1: Cross-section view of MOS Capacitor

Normally, the capacitance of a capacitor doesn't change with the voltage applied across
its terminals. However, this is not the case with the MOS capacitor: its capacitance
changes with the gate voltage. This is because the applied gate voltage causes band
bending in the silicon substrate, and hence a variation in the charge concentration at
the Si-SiO2 interface.

4.2 Modes of operation


Depending upon the value of the gate voltage applied, the MOS capacitor works in three
modes:

Fig 4.2a: Accumulation mode (grey layer - strong hole concentration)

Fig 4.2b: Depletion Mode (light grey layer depletion region)


1. Accumulation: In this mode, there is an accumulation of holes (assuming an n-MOSFET,
i.e. a p-type substrate) at the Si-SiO2 interface. All the field lines emanating from the
gate terminate on this layer, giving an effective dielectric thickness equal to the oxide
thickness (shown in Fig. 4.2a). In this mode, Vg < 0.
2. Depletion: As we move from negative to positive gate voltages, the holes at the
interface are repelled and pushed back into the bulk, leaving a depleted layer. This
layer counters the positive charge on the gate and keeps growing while the gate
voltage is below the threshold voltage. As shown in Fig. 4.2b, we see a larger effective
dielectric thickness and hence a lower capacitance.
3. Strong Inversion: When Vg crosses the threshold voltage, the depletion region width
stops increasing and further gate charge is countered by mobile electrons at the
Si-SiO2 interface. This is called inversion because the mobile charges are
opposite in type to the majority carriers in the substrate; here the inversion layer
is formed by electrons. Field lines now terminate on this layer, thereby
reducing the effective dielectric thickness, as shown in Fig. 4.2c.

Fig 4.2c: Strong Inversion mode


(grey layer - strong electron concentration, light grey - depletion region)
4.3 Capacitance calculation of MOS Capacitor
In the last section, we gave an introduction to the MOS capacitor. In this section, we
will see how the MOS structure works as a capacitor, with derivation of some related equations.

By Gauss's law, the gate charge and the semiconductor charge must balance:

Qm = -Qs

Also, by thermal equilibrium:

p . n = ni^2

where p and n are the hole and electron concentrations of the substrate and ni is the
carrier concentration of the corresponding intrinsic semiconductor.
We see that if we keep making the gate voltage more and more negative, the charges Qs
and Qm keep increasing in magnitude. Thus, the structure acts like a good parallel-plate
capacitor, whose capacitance (per unit area) can be given as

C = Cox = eps_ox / tox

Fig 4.3: Gate and Depletion charge of MOS Capacitor


For a positive bias voltage on the gate, increasing Vg will increase Qm and Qs (now a
depletion charge).

Using the depletion approximation, we can write the depletion width Wd as a function of
the surface potential psi_s as

Wd = sqrt( 2 eps_si psi_s / (q NA) )

where NA is the substrate acceptor density, eps_si is the dielectric constant
(permittivity) of the substrate, and psi_s is the surface potential at the substrate.

The depletion region grows with increased voltage across the capacitor until strong
inversion is reached. After that, further increase in the voltage results in inversion rather
than more depletion. Thus the maximum depletion width is:

Wd,max = sqrt( 2 eps_si (2 phi_F) / (q NA) )

Also, at the onset of strong inversion the surface potential is psi_s = 2 phi_F.
Therefore at psi_s = 2 phi_F the depletion charge saturates at QD = -q NA Wd,max.

But by Gauss's law, electrons must compensate for the still-increasing gate charge.

So,

Qs = QD + Qi

where the charge Qi is due to electrons in the inversion layer.

Earlier, at low electric field, the electron-hole pairs generated below the oxide interface
recombine. However, once the electric field increases, the generated electron-hole pairs
are separated before they can recombine, so the free electron concentration increases.
By Kirchhoff's voltage law, the gate voltage Vg is given by:

Vg = Vox + psi_s = -Qs/Cox + psi_s
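The quantities in this section can be put into numbers. The sketch below evaluates the oxide capacitance, Fermi potential and maximum depletion width for a hypothetical example (NA = 1e17 cm^-3, tox = 10 nm); the doping and oxide thickness are illustrative assumptions, not values from the text.

```python
import math

# Physical constants (SI units)
Q = 1.602e-19              # electron charge, C
K = 1.381e-23              # Boltzmann constant, J/K
T = 300.0                  # temperature, K
NI = 1.0e10 * 1e6          # intrinsic carrier conc. of Si at 300 K, m^-3
EPS_SI = 11.7 * 8.854e-12  # permittivity of silicon, F/m
EPS_OX = 3.9 * 8.854e-12   # permittivity of SiO2, F/m

def mos_cap(Na_cm3, tox_m):
    """Cox, Fermi potential and max depletion width of an ideal MOS
    capacitor on a p-type substrate, per the equations in this section."""
    Na = Na_cm3 * 1e6                                   # cm^-3 -> m^-3
    cox = EPS_OX / tox_m                                # F/m^2
    phi_f = (K * T / Q) * math.log(Na / NI)             # Fermi potential, V
    wd_max = math.sqrt(4 * EPS_SI * phi_f / (Q * Na))   # Wd,max, m
    return cox, phi_f, wd_max

# Hypothetical example: NA = 1e17 cm^-3, tox = 10 nm
cox, phi_f, wd_max = mos_cap(Na_cm3=1e17, tox_m=10e-9)
```

For these values phi_F comes out near 0.42 V and the maximum depletion width near 0.1 um, typical textbook magnitudes.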

Recap
In this lecture you have learnt the following
MOS as Capacitor
Modes of operation
Capacitance calculation of MOS capacitor

Congratulations, you have finished Lecture 4.

Module 2 : MOSFET
Lecture 5 : MOS Capacitor (Contd...)
Objectives
In this lecture you will learn the following
Threshold Voltage Calculation
C-V characteristics
Oxide Charge Correction
5.1 Threshold Voltage Calculation
Threshold voltage is that gate voltage at which the surface band bending psi_s is twice
the Fermi potential phi_F:

psi_s = 2 phi_F,  where  phi_F = (kT/q) ln(NA/ni)

We know that the depth of the depletion region, for psi_s between 0 and 2 phi_F, is given
by

Wd = sqrt( 2 eps_si psi_s / (q NA) )

The charge in the depletion region at threshold (psi_s = 2 phi_F) is given by

QD = -q NA Wd,max = -sqrt( 4 q eps_si NA phi_F )

Beyond threshold, the total charge in the semiconductor, QD + Qi, has to balance the
charge on the gate electrode, where we define the charge Qi in the inversion
layer as a quantity which needs to be determined.
This leads to the following expression for the gate voltage:

Vg = psi_s - (QD + Qi)/Cox

In the case of depletion, there is no inversion-layer charge, so Qi = 0, i.e. the gate voltage
becomes

Vg = psi_s - QD/Cox

but in the case of inversion, the gate voltage will be given by:

Vg = 2 phi_F - QD(max)/Cox - Qi/Cox = VT - Qi/Cox

The second term in the second equality of the last expression states our basic assumption,
namely that any change in gate voltage beyond the threshold requires a change in
inversion-layer charge. Also from the same expression, we obtain the threshold voltage as:

VT = 2 phi_F + sqrt( 4 q eps_si NA phi_F ) / Cox
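The ideal threshold-voltage expression (2 phi_F plus the depletion-charge term over Cox) can be evaluated numerically. This is a sketch with hypothetical example values (NA = 1e17 cm^-3, tox = 10 nm), neglecting the flat-band voltage and oxide charge, which the ideal derivation above also omits.

```python
import math

Q = 1.602e-19              # electron charge, C
KT_Q = 0.02585             # thermal voltage kT/q at 300 K, V
NI = 1.0e16                # intrinsic carrier conc. of Si, m^-3
EPS_SI = 11.7 * 8.854e-12  # permittivity of Si, F/m
EPS_OX = 3.9 * 8.854e-12   # permittivity of SiO2, F/m

def ideal_vt(Na_cm3, tox_m):
    """Ideal nMOS threshold voltage VT = 2*phi_F + |QD,max|/Cox,
    neglecting flat-band voltage and oxide charge."""
    Na = Na_cm3 * 1e6                                   # cm^-3 -> m^-3
    phi_f = KT_Q * math.log(Na / NI)                    # Fermi potential
    cox = EPS_OX / tox_m                                # oxide capacitance
    qd_max = math.sqrt(4 * Q * EPS_SI * Na * phi_f)     # |depletion charge|
    return 2 * phi_f + qd_max / cox

vt = ideal_vt(Na_cm3=1e17, tox_m=10e-9)
```

For this example the ideal VT comes out around 1.3 V; a real device would be shifted by the flat-band voltage, which is the subject of the oxide charge correction below.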

5.2 C-V Characteristics


The low frequency and high frequency C-V characteristics curves of a MOS capacitor are
shown in fig 5.2.

Fig 5.2 : Low & High Frequency C-V curves


The low frequency or quasi-static measurement maintains thermal equilibrium at all
times. This capacitance is the ratio of the change in charge to the change in gate
voltage, measured while the capacitor is in equilibrium. A typical measurement is
performed with an electrometer, which measures the charge added per unit time as one
slowly varies the applied gate voltage.
The high frequency capacitance is obtained from a small-signal capacitance
measurement at high frequency. The bias voltage on the gate is varied slowly to obtain
the capacitance versus voltage. Under such conditions, one finds that the charge in the
inversion layer does not change from the equilibrium value corresponding to the applied
DC voltage. The high frequency capacitance therefore reflects only the charge variation
in the depletion layer and the (rather small) movement of the inversion layer charge.
5.3 Oxide Charge Correction
To keep the value of VT within -1 Volt and +1 Volt, an n-channel device has a high
doping (similarly, a p-channel device has a high doping).
Recap
In this lecture you have learnt the following
Threshold Voltage Calculation
C-V characteristics
Oxide Charge Correction
Congratulations, you have finished Lecture 5.

Module 2 : MOSFET
Lecture 6 : MOSFET I-V characteristics
Objectives
In this lecture you will learn the following
Derivation of I-V relationship
Channel length modulation and body bias effect
6.1 Derivation of I-V relationship
In this section, the relation between the drain current IDS and the drain-source voltage
VDS is discussed. We assume that the gate-body voltage drop is more than the threshold
voltage VT, so that mobile electrons are created in the channel. This implies that the
transistor is either in the linear or the saturation region.
Here we will derive some simple I-V characteristics of the MOSFET, assuming that the
device essentially acts as a variable resistor between source and drain, and that only the
drift (ohmic) current needs to be calculated. Also note that the MOSFET is basically a
two-dimensional device. The gate voltage VGS produces a field in the vertical (x) direction,
which induces charge in the silicon, including charge in the inversion layer. The drain
voltage VDS produces a field in the lateral (y) direction, and current flows (predominantly)
in the y-direction. Strictly speaking, we must solve the 2-D Poisson and continuity
equations to evaluate the I-V characteristics of the device. These are analytically
intractable. We therefore resort to the gradual channel approximation described below.
To find the current flowing in the MOS transistor, we need to know the charge in the
inversion layer. This charge, Qn(y) (per sq. cm), is a function of position along the
channel, since the potential varies going from source to drain. We assume that Qn(y)
can be found at any point y by solving the Poisson equation only in the x direction, that
is, treating the gate-oxide-silicon system in the channel region very much like a MOS
capacitor. This is equivalent to assuming that the vertical electric field Ex is much larger
than the horizontal electric field Ey, so that the solution of the 1-dimensional Poisson
equation is adequate. This gradual channel approximation (the voltage varies only
gradually along the channel) is quite valid for long-channel MOSFETs since Ey is small.
For Qn(y), using the charge control relation at location y, we have:

Qn(y) = Cox [ VGS - VT - V(y) ]    ...(6.21)
Now we turn our attention to evaluating the resistance of an infinitesimal element of
length dy along the channel (as shown in Fig. 6.21).
Assuming that only drift current is present, and hence applying Ohm's law, we get:

dR = rho dy / A    ...(6.22)

Fig 6.21: Cross Sectional View of channel

Here we have rho = 1/(q mu_n n) and A = W xi, where xi = inversion layer thickness.

Now using equation (6.22), we have:

dR = dy / (q mu_n n W xi)    ...(6.23)

Since the electron concentration n varies along the transverse (x) direction, we define
the inversion charge per unit area Qn(y) as:

Qn(y) = q integral of n(x, y) dx  (approx.)  q n xi

Now using this in eqn (6.23) and rearranging the terms, we get:

IDS dy = W mu_n Qn(y) dV    ...(6.26)

Neglecting recombination-generation implies IDS(y) = IDS, i.e. the current is constant
throughout the channel.
Integrating the RHS of eqn (6.26) from 0 to VDS and the LHS from 0 to L, we get

IDS . L = W mu_n integral of Qn(y) dV  (V from 0 to VDS)    ...(6.27)

Now substituting Qn(y) from eqn (6.21) in eqn (6.27), we get:

IDS = mu_n Cox (W/L) [ (VGS - VT) VDS - VDS^2 / 2 ]    ...(6.29)

Eqn (6.29) holds true for VDS <= VGS - VT.

The drain current first increases linearly with the applied drain-to-source voltage, but
then reaches a maximum value. This occurs due to the formation of a depletion region
between the pinch-off point and the drain. This behavior is known as drain saturation,
which is observed for VDS >= VGS - VT, as shown in the figure below.

Fig 6.22: IDS-VDS graph

The saturation current IDSsat is given by eqn (6.210):

IDSsat = (mu_n Cox / 2) (W/L) (VGS - VT)^2    ...(6.210)
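The square-law model of eqns (6.29) and (6.210) can be sketched directly. The device parameters below (VT, the process transconductance mu_n*Cox, and W/L) are hypothetical example values, not taken from the text.

```python
def ids_long_channel(vgs, vds, vt=0.7, k=200e-6, w_over_l=10.0):
    """Long-channel nMOS drain current per the square-law model:
    triode current for VDS < VGS - VT, saturated current beyond.
    vt [V], k = mu_n*Cox [A/V^2] and W/L are illustrative values."""
    if vgs <= vt:
        return 0.0                        # cut-off (subthreshold ignored)
    beta = k * w_over_l
    vdsat = vgs - vt
    if vds < vdsat:                       # linear (triode) region, eqn 6.29
        return beta * ((vgs - vt) * vds - 0.5 * vds ** 2)
    return 0.5 * beta * vdsat ** 2        # saturation, eqn 6.210

i_lin = ids_long_channel(vgs=1.8, vds=0.1)   # small VDS: triode
i_sat = ids_long_channel(vgs=1.8, vds=1.8)   # VDS > VGS - VT: saturated
```

Sweeping vds at fixed vgs reproduces the IDS-VDS curve of Fig. 6.22: linear rise followed by a flat saturation region.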

6.2 Channel length modulation and body bias effect


The observed current IDS does not actually saturate, but has a small finite slope, as
shown in Fig. 6.31. This is attributed to channel length modulation.

Fig 6.31: Actual vs Ideal IDS-VDS graph

Channel length modulation in a MOSFET is caused by the increase in the depletion layer
width at the drain as the drain voltage is increased. This leads to a shorter effective
channel length (reduced by delta_L) and an increased drain current. When the channel
length of the MOSFET is decreased and the MOSFET is operated beyond channel
pinch-off, the relative importance of the pinch-off length delta_L with respect to the
physical length L is increased. This effect can be included in the saturation current as:

IDSsat = (mu_n Cox / 2) (W/L) (VGS - VT)^2 (1 + lambda VDS)

Here lambda is called the channel length modulation coefficient.

Till now we assumed the body of the MOSFET to be grounded. We will now take the effect
of body bias into account, i.e. the body being applied a negative voltage (with respect to
the source) in the case of an n-MOSFET. Application of VSB > 0 increases the potential
built up across the semiconductor. The depletion region widens in order to supply the
extra required field, which implies a higher VT. Viewed from the point of view of the
energy band diagram, a higher potential needs to be applied to the gate in order to bend
the bands by the same amount and create the same electron concentration in the
channel. Thus the body bias modulates the threshold voltage, governed by the following
equation:

VT = VT0 + gamma ( sqrt(2 phi_F + VSB) - sqrt(2 phi_F) )

where

gamma = sqrt(2 q eps_si NA) / Cox

is known as the body coefficient.
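The body-effect equation above is easy to evaluate. The gamma and phi_F values below are hypothetical but typical magnitudes; the sketch shows how a 1 V source-body reverse bias raises VT.

```python
import math

def vt_with_body_bias(vt0, vsb, gamma=0.4, phi_f=0.4):
    """Threshold voltage with body bias:
    VT = VT0 + gamma*(sqrt(2*phi_f + VSB) - sqrt(2*phi_f)).
    gamma [V^0.5] and phi_f [V] are illustrative example values."""
    return vt0 + gamma * (math.sqrt(2 * phi_f + vsb) - math.sqrt(2 * phi_f))

vt_0 = vt_with_body_bias(vt0=0.7, vsb=0.0)   # no body bias: VT = VT0
vt_1 = vt_with_body_bias(vt0=0.7, vsb=1.0)   # VSB = 1 V raises VT
```

With these numbers VT shifts from 0.70 V to roughly 0.88 V, illustrating why stacked transistors (whose sources sit above ground) conduct less than the bottom device.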

Recap
In this lecture you have learnt the following:
Derivation of I-V relationship
Channel length modulation and body bias effect

Module 2 : MOSFET
Lecture 7: Advanced Topics
Objectives
In this lecture you will learn the following
Motivation for Scaling
Types of Scaling
Short channel effect
Velocity saturation
7.1 Motivation for Scaling
The reduction of the dimensions of a MOSFET has been dramatic during the last three
decades. Starting at a minimum feature length of 10 um in 1970, the gate length was
gradually reduced to a 0.15 um minimum feature size in 2000, resulting in a 13%
reduction per year. Proper scaling of MOSFET however requires not only a size reduction
of the gate length and width but also requires a reduction of all other dimensions
including the gate/source and gate/drain alignment, the oxide thickness and the
depletion layer widths. Scaling of the depletion layer widths also implies scaling of the
substrate doping density.
In short, we will study simplified guidelines for shrinking device dimensions to increase
transistor density & operating frequency and reduction in power dissipation & gate
delays.
7.2 Types of Scaling
Two types of scaling are common:
1) constant field scaling and
2) constant voltage scaling.
Constant field scaling yields the largest reduction in the power-delay product of a single
transistor. However, it requires a reduction in the power supply voltage as one
decreases the minimum feature size.
Constant voltage scaling does not have this problem and is therefore the preferred
scaling method since it provides voltage compatibility with older circuit technologies. The
disadvantage of constant voltage scaling is that the electric field increases as the
minimum feature length is reduced. This leads to velocity saturation, mobility
degradation, increased leakage currents and lower breakdown voltages.
After scaling, the different MOSFET parameters are transformed as given by the table
below:

Parameter                 Before Scaling   Constant Field   Constant Voltage
Dimensions (L, W, tox)         1               1/s               1/s
Supply voltage VDD             1               1/s                1
Electric field                 1                1                 s
Substrate doping NA            1                s                s^2
Drain current I                1               1/s                s
Gate capacitance               1               1/s               1/s
Gate delay (CV/I)              1               1/s              1/s^2
Power dissipation (VI)         1              1/s^2               s
Power density                  1                1                s^3

where s > 1 is the scaling factor of the MOS process.
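The first-order scaling rules can be applied mechanically to a parameter set. The sketch below encodes the two columns for a few representative parameters (the starting device values are hypothetical examples).

```python
def scale(params, s, constant_field=True):
    """Apply first-order MOS scaling rules to a parameter dictionary.
    s > 1 shrinks dimensions by 1/s; the per-parameter factors follow
    the standard constant-field / constant-voltage scaling tables."""
    if constant_field:
        factors = {"length": 1 / s, "voltage": 1 / s, "current": 1 / s,
                   "capacitance": 1 / s, "power": 1 / s ** 2}
    else:  # constant-voltage scaling
        factors = {"length": 1 / s, "voltage": 1.0, "current": s,
                   "capacitance": 1 / s, "power": s}
    return {k: v * factors[k] for k, v in params.items()}

# Hypothetical starting device: 1 um gate, 5 V supply, 1 mA, 100 fF, 5 mW
dev = {"length": 1.0, "voltage": 5.0, "current": 1e-3,
       "capacitance": 1e-13, "power": 5e-3}
cf = scale(dev, s=2, constant_field=True)    # power falls by s^2
cv = scale(dev, s=2, constant_field=False)   # voltage held, power grows
```

Comparing `cf` and `cv` makes the trade-off concrete: constant-field scaling quarters the power per transistor at s = 2, while constant-voltage scaling doubles it.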


7.3 Short Channel Effect
So far our discussion was based upon the assumptions that channel was long and wide
enough, so that edge effects along the four sides was negligible, longitudinal field was
negligible and electric field at every point was perpendicular to the surface. So we could
perform one-dimensional analysis using gradual channel approximation. But in devices
where channel is short longitudinal field will not be negligible compared to perpendicular
field. So in that case one-dimensional analysis gives wrong results and we will have to
perform dimensional analysis taking into account both longitudinal and vertical fields.
(which is out of the scope this course)
When is a channel called a short channel?
(i) When the junction (source/drain) depth is of the order of the channel length.
(ii) When L is not much larger than the sum of the drain and source depletion widths.
We have shown below the comparative graphs of I-V characteristics for both long
channel and short channel length MOSFETs. From the graphs, it can be clearly concluded
that when the channel becomes short, the current in the saturation region becomes
linearly dependent on the applied gate drive rather than quadratically dependent.

Figure 7.3: Comparison of ID vs VDS characteristics


for long and short channel MOSFET devices

7.4 Velocity Saturation


As we were assuming the longitudinal field in the channel to be very small, the magnitude
of the carrier velocity |vd| was proportional to the field magnitude. But it has been
observed that for high field values the carrier velocity tends to saturate; it is no longer
proportional to the field. This lack of proportionality at high fields is known as velocity
saturation.
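A common empirical form for this behaviour is v = mu*E / (1 + E/Ec), which is linear in E at low fields and saturates at v_sat = mu*Ec for E >> Ec. The mobility and critical field below are hypothetical silicon-like magnitudes in SI units, used only to illustrate the shape of the curve.

```python
def drift_velocity(e_lat, mu=0.06, e_crit=1.5e6):
    """Two-region velocity-saturation model: v = mu*E / (1 + E/Ec).
    mu [m^2/Vs] and Ec [V/m] are illustrative Si-like values; the
    model saturates at v_sat = mu*Ec for large lateral fields."""
    return mu * e_lat / (1.0 + e_lat / e_crit)

v_low = drift_velocity(1e4)    # low field: nearly mu*E (ohmic regime)
v_high = drift_velocity(1e9)   # very high field: approaches v_sat = mu*Ec
```

At 1e4 V/m the velocity is within 1% of the ohmic value mu*E, while at very high field it approaches the saturation velocity mu*Ec, reproducing the flattening described above.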

Recap
In this lecture you have learnt the following
Motivation for Scaling
Types of Scaling
Short channel effect
Velocity saturation

Module 2 : MOSFET
Lecture 8 : Short Channel Effects
Objectives
In this lecture you will learn the following
Motivation
Mobility degradation
Subthreshold current
Threshold voltage variation
Drain induced barrier lowering (DIBL)
Drain punch through
Hot carrier effect
Surface states and interface trapped charge
8.1 Motivation
As seen in the last lecture, as the channel length is reduced, departures from long-channel
behaviour may occur. These departures, which are called Short Channel Effects, arise
as a result of a two-dimensional potential distribution and high electric fields in the
channel region.
For a given channel doping concentration, as the channel length is reduced, the
depletion layer widths of the source and drain junctions become comparable to the channel
length. The potential distribution in the channel now depends on both the transverse field
Ex (controlled by the gate voltage and back-surface bias) and the longitudinal field
Ey (controlled by the drain bias). In other words, the potential distribution becomes two
dimensional, and the gradual channel approximation (i.e. Ex >> Ey) is no longer valid.
This two-dimensional potential results in degradation of the threshold behaviour,
dependence of the threshold voltage on the channel length and biasing voltages, and failure
of current saturation due to the punch-through effect.
In further sections, we will study various effects due to short channel length in MOSFET.
8.2 Mobility Degradation
Mobility is important because the current in MOSFET depends upon mobility of charge
carriers(holes and electrons).

We can describe this mobility degradation by two effects:

Figure 8.2: Mobilty degradation graph


i.

Lateral Field Effect: In case of short channels, as the lateral field is increased,
the channel mobility becomes field-dependent and eventually velocity saturation
occurs (which was referred to in the previous lecture). This results in current
saturation.

ii.

Vertical Field Effect: As the vertical electric field also increases on shrinking the
channel lengths, it results in scattering of carriers near the surface. Hence the
surface mobility reduces (Also explained by the mobility dependence equation
given below).

Thus for short channels, we can see (in the figure 8.2) the mobility degradation which
occurs due to velocity saturation and scattering of carriers.

8.3 Subthreshold Current


An effect that is exacerbated by short-channel designs is the subthreshold current, which
arises from the fact that some electrons are induced in the channel even before strong
inversion is established. For the low electron concentrations typical of the subthreshold
regime, we expect diffusion current (proportional to carrier gradients) to dominate over
drift current (proportional to carrier concentrations). For very short channel lengths,
such carrier diffusion from source to drain can make it impossible to turn off the device
below threshold. The subthreshold current is made worse by the DIBL effect (explained
in a later section), which increases the injection of electrons from the source.
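Below threshold the diffusion-dominated current falls off exponentially with gate voltage, I = I0 * exp((VGS - VT)/(n kT/q)), where n is the subthreshold ideality factor. The sketch below uses hypothetical I0 and n values to show the exponential turn-off and the resulting subthreshold slope.

```python
import math

def subthreshold_current(vgs, vt=0.5, i0=1e-7, n=1.5, kt_q=0.02585):
    """Exponential (diffusion-dominated) subthreshold drain current:
    I = I0 * exp((VGS - VT) / (n * kT/q)). I0 [A] and the ideality
    factor n are hypothetical example values."""
    return i0 * math.exp((vgs - vt) / (n * kt_q))

# Subthreshold slope: gate swing needed for a 10x change in current,
# S = n * (kT/q) * ln(10)  (about 60 mV/decade at the ideal n = 1).
s_mv_per_decade = 1.5 * 0.02585 * math.log(10) * 1000
```

With n = 1.5 the slope is roughly 90 mV/decade, i.e. each 90 mV reduction in VGS only cuts the leakage tenfold, which is why a lowered VT (charge sharing, DIBL) translates directly into exponentially larger off-state leakage.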
8.4 Threshold Voltage variation with Channel Length

Figure 8.41: Dependence of VT on L for MOSFET

Figure 8.42: IDS Vs VGS for short channel


In the case of long-channel MOSFETs, the gate has control over the channel and supports
most of the charge. As we go to short channel lengths, as seen in the graph above, the
threshold voltage begins to decrease, since the charge in the depletion region is now
partly supported by the drain and the source as well. Thus the gate needs to support less
charge in this region and, as a result, VT falls. This phenomenon is known as the charge
sharing effect.
Now since IDS is proportional to (VGS - VT), as VT begins to fall in the case of
short channels, IDS starts increasing, resulting in larger drain currents. Also, when VGS
is zero and the MOSFET is in cut-off mode, since VT is small, (VGS - VT) is only a
small negative value and results in a leakage current, which multiplied by the
drain voltage gives leakage power. In the case of long-channel MOSFETs, VT is large
enough that (VGS - VT) is a comparatively larger negative value in cut-off mode, and the
leakage power is very small.

Transit Time: As seen in the previous lecture, a short channel results in velocity
saturation over part of the channel, so the argument used to derive the transit time for a
long-channel MOSFET is no longer valid for short-channel MOSFETs. We note that the
transit time will be larger than it would be if electrons were moving at the maximum
(saturated) speed over the whole channel; thus the transit time is bounded below by
L/vsat.

Figure 8.42 shows a device operating in the 'flat' part of the IDS-VGS characteristic,
from which we conclude that the transit time cannot be decreased by further increasing
VGS.
Quantum Mechanical Effect: Another quantum-mechanical effect that also increases with
scaling is a shift in the surface potential required for strong inversion. This effect arises
from the so-called "energy quantization" of confined particles, which precludes electrons
and holes from existing at zero energy in the conduction or valence bands. It is a direct
consequence of the coupled Poisson-Schrodinger equation solution. This surface-potential
shift manifests itself as an increase in |VT|, and the increase grows as devices are scaled
down.
8.5 Drain Induced Barrier Lowering (DIBL)

Figure 8.5: Surface potential graph with constant gate voltage (VDS and L are varied)
The source and drain depletion regions can intrude into the channel even without bias,
as these junctions are brought closer together in short-channel devices. This effect is
called charge sharing (as mentioned earlier), since the source and drain in effect take
part of the channel charge which would otherwise be controlled by the gate. As the
drain depletion region continues to grow with the drain bias, it can actually interact with
the source-to-channel junction and lower the potential barrier there. This problem is
known as Drain Induced Barrier Lowering (DIBL). When the source junction barrier
is reduced, electrons are easily injected into the channel and the gate voltage no longer
has any control over the drain current.
From figure 8.5, we can observe that under extreme conditions of encroaching source and
drain depletion regions, the two curves can meet.
8.6 Drain Punch Through
When the drain is at a high enough voltage with respect to the source, the depletion
region around the drain may extend all the way to the source, causing current to flow
irrespective of the gate voltage (i.e. even if the gate voltage is zero). This is known as the
Drain Punch Through condition, and the punch-through voltage VPT is given, to first
order for a one-sided step junction, by:

VPT = q NA L^2 / (2 eps_si)

So when the channel length L decreases (i.e. the short-channel case), the punch-through
voltage rapidly decreases.
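The quadratic dependence on L is what makes punch-through so punishing at short channel lengths. The sketch below uses the one-sided step-junction estimate with a hypothetical substrate doping to show the L^2 fall-off.

```python
def punch_through_voltage(Na_cm3, L_m):
    """Order-of-magnitude punch-through voltage, one-sided step-junction
    estimate: VPT ~ q * NA * L^2 / (2 * eps_si). Falls as L^2, so
    halving L cuts VPT by 4x."""
    Q = 1.602e-19                    # electron charge, C
    EPS_SI = 11.7 * 8.854e-12        # permittivity of Si, F/m
    Na = Na_cm3 * 1e6                # cm^-3 -> m^-3
    return Q * Na * L_m ** 2 / (2 * EPS_SI)

# Hypothetical doping NA = 1e16 cm^-3 at two channel lengths:
v_long = punch_through_voltage(1e16, 1.0e-6)    # 1 um channel
v_short = punch_through_voltage(1e16, 0.25e-6)  # 0.25 um channel
```

Shrinking L from 1 um to 0.25 um reduces VPT by a factor of 16, which is why short-channel devices need higher channel doping (or halo implants) to hold punch-through off.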
8.7 Hot Carrier Effect
Electric fields tend to be increased at smaller geometries, since device voltages are
difficult to scale to arbitrarily small values. As a result, various hot carrier effects appear
in short-channel devices. The field in the reverse-biased drain junction can lead to
impact ionization and carrier multiplication. The resulting holes contribute to the substrate
current, and some may move to the source, where they lower the source barrier and result
in electrons being injected from the source into the p-region. In effect, an n-p-n transistor
can form within the source-channel-drain configuration and prevent gate control of the
current.
Another hot electron effect is the transport of the energetic electrons over (or tunneling
through) the barrier into the oxide. Such electrons become trapped in the oxide, where
they change the threshold voltage and I-V characteristics of the device. Hot electron
effects can be reduced by reducing the doping in the source and drain regions, so that
the junction fields are smaller. However lightly doped source and drain regions are
incompatible with small geometry devices because of contact resistances and other
similar problems. A compromise MOSFET design, called the Lightly Doped Drain (LDD),
uses two doping levels: heavy doping over most of the source and drain areas, with
light doping in a region adjacent to the channel. The LDD structure decreases the field
between the drain and channel regions, thereby reducing injection into the oxide, impact
ionization and other hot-electron effects.

8.8 Surface States and Interface Trapped Charge


At Si-SiO2 interface, the lattice of bulk silicon and all the properties associated with its
periodicity terminate. As a result, localized states with energy in the forbidden energy
gap of silicon are introduced at or very near to the Si-SiO2 interface. Interface trapped
charges are electrons or holes trapped in these states. The probability of occupation of a
surface state by an electron or by a hole is determined by the surface state energy
relative to the Fermi level. An electron in the conduction band can contribute readily to
the conduction current, while an interface-trapped electron cannot, except by hopping
among the surface states. Thus, by trapping electrons and holes, surface states can
reduce the conduction current in MOSFETs.
Surface states can also act as localized generation-recombination centers and lead to
leakage currents.
8.9 Conclusion
Because short channel effects complicate device operation and degrade device
performance, these effects should be eliminated or minimized, so that a physical short
channel device can preserve the electrical long channel behaviour.
Recap
In this lecture you have learnt the following
Motivation
Mobility degradation
Subthreshold current
Threshold voltage variation
Drain induced barrier lowering (DIBL)
Drain punch through
Hot carrier effect
Surface states and interface trapped charge

Congratulations, you have finished Lecture 8.

Module 3 : Fabrication Process and Layout Design Rules


Lecture 9 : Introduction to Fabrication Process
Objectives
In this lecture you will learn the following
Motivation
Photolithography
Fabrication Process
9.1 Motivation
In the previous module, we studied the MOSFET in detail. VLSI circuits are
very complex circuits, i.e. we cannot make them by interconnecting a few individual
MOSFET transistors; a VLSI circuit consists of millions to billions of transistors. For this
purpose we use photolithography, which is a method/technology to create the circuit
patterns on a silicon wafer surface, and the overall process is called fabrication.
In this lecture, we will study in detail photolithography, how it is done and what sort of
materials are used for this purpose.
9.2 Photolithography
Photolithography is the method that sets the surface (horizontal) dimensions of the
various parts of devices and circuits. Its goal is twofold. The first goal is to create, in and
on the wafer surface, a pattern whose dimensions are as close to the device requirements
as possible. This is known as the resolution of images on the wafer, and the pattern
dimensions are known as the feature or image sizes of the circuit. The second goal is the
correct placement, called alignment or registration, of the circuit patterns on the wafer.
The entire circuit pattern must be correctly placed on the wafer surface, because
misaligned mask layers can cause the entire circuit to fail.

Figure 9.1: Clear Field mask

Figure 9.2: Dark Field mask

In order to create patterns on the wafer, the required pattern is first formed in
reticles or photomasks. The pattern on the reticle or mask is then transferred into a layer
of photoresist. Photoresist is a light-sensitive material similar to the coating on a
regular photographic film; exposure to light causes changes in its structure and
properties. If the exposure to light causes the photoresist to change from soluble to
insoluble, it is known as negative acting and the chemical change is called
polymerization. Similarly, if exposure to light causes it to change from relatively
insoluble to much more soluble, it is known as positive acting and the change is called
photosolubilization. The exposure radiation is generally UV light or an electron beam.
Removing the soluble portions with chemical solvents called developers leaves a
pattern in the photoresist, depending upon the type of mask used. A mask whose
pattern exists in the opaque regions is called a clear field mask. The pattern could also
be coded in reverse, and such masks are known as dark field masks.
The result obtained from the photomasking process for different combinations of mask
and resist polarities is shown in the following table:

Mask type      Resist type    Image left in the resist
Clear field    Positive       Same as mask pattern
Clear field    Negative       Reverse of mask pattern
Dark field     Positive       Reverse of mask pattern
Dark field     Negative       Same as mask pattern
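The mask/resist polarity combinations follow a simple rule: a positive resist on a clear-field mask reproduces the mask pattern, and flipping either the mask polarity or the resist polarity reverses the image. A small sketch of that logic (function and string labels are illustrative, not process terminology from any standard):

```python
def resist_image(mask, resist):
    """Polarity of the developed resist image for a given mask type
    ('clear' or 'dark' field) and resist type ('positive' or 'negative').
    Positive resist dissolves where exposed; negative resist remains
    where exposed."""
    assert mask in ("clear", "dark") and resist in ("positive", "negative")
    # Clear field + positive resist duplicates the mask; flipping either
    # polarity reverses the image, flipping both restores it.
    same = (mask == "clear") == (resist == "positive")
    return "same as mask" if same else "reverse of mask"

r1 = resist_image("clear", "positive")
r2 = resist_image("dark", "positive")
```

This is why a process flow can choose mask polarity freely: switching to the opposite-polarity resist recovers the same wafer image.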

The second transfer takes place from the photoresist layer into the wafer's surface layer.
The transfer occurs when etchants remove the portion of the wafer's top layer that is not
covered by the photoresist. The chemistry of the photoresists is such that they do not
dissolve in the chemical etching solutions; they are etch-resistant, hence the name
photoresists. The etchant generally used to remove silicon dioxide is hydrofluoric acid
(HF).

The choice of mask and resist polarity is a function of the level of dimensional control
and defect protection required to make the circuit work. For example, sharp lines are not
obtainable with negative photoresists, while etchants are difficult to handle with positive
photoresists.
After the pattern has been transferred to the resist, the underlying thin layer needs to be
etched. The etching process etches into a specific layer the circuit pattern that was
defined during the photomasking process. For example, aluminium connections are
obtained after etching of the aluminium layer.

9.3 Fabrication Process


Why polysilicon gate?
The most significant aspect of using polysilicon as the gate electrode is its ability to be
used as a further mask to allow precise definition of source and drain regions. This is
achieved with minimum gate to source/drain overlap, which leads to lower overlap
capacitances and improved circuit performance.
Procedure:
1. A thick layer of oxide is grown on the wafer surface, known as the field oxide
(FOX). It is much thicker than the gate oxide. It acts as a shield, protecting the
underlying substrate from impurities when other processes are being carried out
on the wafer. Besides, it also aids in preventing conduction between unrelated
transistor sources/drains. In fact, the thick FOX can act as a gate oxide for a
parasitic MOS transistor. The threshold voltage of this transistor is much higher
than that of a regular transistor due to thick field oxide. The high threshold voltage
is further ensured by introducing channel-stop diffusion underneath the field
oxide, which raises the impurity concentration in the substrate in the areas where
transistors are not required.
2. A window is opened in the field oxide over the area where the transistor is to
   be made. A thin, highly controlled layer of oxide, called gate oxide or thinox,
   is grown where active transistors are desired. The thick layer of silicon dioxide
   is retained elsewhere to isolate the individual transistors.
3. The thin gate oxide is etched to open windows for the source and drain diffusions.
   Ion implantation or diffusion is used for the doping; the former tends to produce
   shallower junctions, which are compatible with fine-dimension processes. Because
   diffusion proceeds in all directions, the deeper a diffusion is driven, the more it
   spreads laterally. This lateral spread determines the overlap between the gate and
   the source/drain regions.
4. Next, a gate delineation mask is used to define the gate area, keeping the overlap
   between the gate and the source/drain regions to a minimum. This is referred to as
   a self-aligned process because the source and drain do not extend under the gate.
   Polysilicon is then deposited over the oxide.
5. The complete structure is then covered with silicon dioxide, and contact holes are
   etched down to the surfaces to be contacted using the contact window mask. These
   holes allow metal to contact the diffusion or polysilicon regions.
6. Metallization is then applied to the surface and selectively etched, using the
   interconnect mask, to produce the circuit interconnections.

7. As a final step, the wafer is passivated, and openings to the bond pads are etched
   to allow for wire bonding. Passivation protects the silicon surface against the
   ingress of contaminants that can modify circuit behavior.
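Returning to the parasitic field-oxide transistor of step 1, its high threshold can be understood through the oxide capacitance per unit area, C_ox = eps_ox / t_ox: a field oxide tens of times thicker than the gate oxide gives correspondingly weaker gate control over the channel. The thicknesses below are assumed, typical-order values, not from the lecture:

```python
# Illustrative sketch (assumed, typical-order thicknesses): the parasitic
# MOS transistor formed under the thick field oxide has a much smaller
# oxide capacitance C_ox = eps_ox / t_ox, hence far weaker gate control
# and a much higher threshold voltage than a regular transistor.

EPS_OX = 3.9 * 8.85e-12  # permittivity of SiO2 in F/m

def c_ox(t_ox_m: float) -> float:
    """Oxide capacitance per unit area in F/m^2."""
    return EPS_OX / t_ox_m

t_gate, t_field = 10e-9, 500e-9  # assumed: 10 nm gate oxide, 500 nm FOX
print(round(c_ox(t_gate) / c_ox(t_field)))  # prints 50: ~50x weaker gate control
```

Together with the channel-stop diffusion, this keeps the parasitic device off at normal supply voltages.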

Recap
In this lecture you have learnt the following:
Motivation
Photolithography
Fabrication Process

Congratulations, you have finished Lecture 9.
