Sub-System Design

Topics

Architectural issues.
Switch logic, Gate logic.
Examples of Structured Design: Combinational logic.
Design of an ALU subsystem: consideration of Adders, Multipliers.
Sequential Circuits.
Semiconductor Memories.

Introduction

Large systems are composed of sub-systems, known as leaf cells.
The most basic leaf cell is the common logic gate (inverter, NAND, etc.).

Structured design: high regularity. Leaf cells are replicated many times and interconnected to form the system.

A logical and systematic approach to VLSI design is essential.

Dealing with Complexity

Divide and conquer: limit the number of components you deal with at any one time.
Group several components into larger components:

transistors form gates
gates form functional units
functional units form processing elements

A System-on-a-Chip

Courtesy: Philips

Major Levels of Design

Specification: description of requirements.
Systems Level: placing and interconnecting major functional units.
Function Level: specification and design of major functional units.
Logic/Circuit Level: gate-level design, gate interconnection design.
Layout Level: what will actually be patterned onto the chip, how the chip will be processed.
Physics Level: the physics of gate and switch operation.

Sub-System Design Guidelines

Define the requirements.
Partition the overall architecture into subsystems.
Consider interconnection paths between the subsystems.
Draw up the system floor plan on the silicon chip.
Look for regular structures that can be replicated.
Draw a stick diagram for each leaf cell (module) in the system.
Convert the stick diagram of each leaf cell into layout and run a design rule check.
Simulate the performance of each cell.

Design Validation

Check at every step that errors have not been introduced: the longer an error remains, the more expensive it becomes to remove.

Chip Architecture Floor plan


After the high-level design is complete, it is necessary to decide how the design is to be implemented in silicon.
The implementation plan is known as the floor plan.
The first step in laying out a floor plan is routing the supply and clock rails.
In doing this, sufficient space must be left between power rails for data buses and combinational logic cells.
Decide on the relative positions of the major functional blocks.
Use a routing algorithm (software); the routing algorithm will minimize the total routing area.

Alpha 21364 Microprocessor Floor plan

Major Levels of Design

Switch Logic

How do we build switches from MOS transistors?
1) Pass transistors
2) Transmission gates

Pass Transistors

So far we have assumed that the source is grounded. What if the source voltage is greater than 0, e.g. a pass transistor passing VDD?

NMOS Pass Transistors

Require one transistor and one gate signal.
Transmit 0 well, but when VDD is applied to the drain (with the gate at VDD), the voltage at the source is only VDD - Vtn.
When switch logic drives gate logic, n-type switches can cause electrical problems.

An n-type switch driving a complementary gate causes the gate to run slower when the switch input is 1. The degraded high level puts a lower gate voltage on the complementary gate's n-type pull-down, so its pull-down current is weaker and it does not drain the output capacitance as fast as it should.

PMOS Pass Transistors


When Vin = VDD, Vout = VDD.
When Vin = 0, CL is discharged through the p-transistor only until Vout = |Vtp|; at that point the p-device stops conducting.
Logic 0 is therefore somewhat degraded through a p-device.

Voltage degradation of Pass Transistors

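Pass Transistor Ckts

[Figures: pass transistor circuits with labelled node voltages. An n-type pass transistor passing VDD gives Vs = VDD - Vtn; a p-type pass transistor passing VSS gives Vs = |Vtp|. A series chain of n-type pass transistors with all gates at VDD delivers VDD - Vtn at every node, whereas a pass transistor whose gate is driven by an already degraded level VDD - Vtn delivers VDD - 2Vtn.]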
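The threshold drops listed above can be reproduced with a first-order model. The sketch below is only illustrative: it ignores body effect, and the supply and threshold values (5 V, 1 V) and the function names are assumptions, not taken from the slides.

```python
def nmos_pass(v_in, v_gate, vtn):
    """Level delivered by an nMOS pass transistor (first-order, no body effect)."""
    # The source cannot rise above v_gate - vtn before the device cuts off.
    return min(v_in, max(v_gate - vtn, 0.0))


def pmos_pass(v_in, v_gate, vtp_abs):
    """Level delivered by a pMOS pass transistor (first-order, no body effect)."""
    # The output cannot fall below v_gate + |Vtp| before the device cuts off.
    return max(v_in, v_gate + vtp_abs)


VDD, VTN, VTP = 5.0, 1.0, 1.0  # illustrative values only

# Series chain with every gate tied to VDD: one threshold drop in total.
series_out = VDD
for _ in range(3):
    series_out = nmos_pass(series_out, VDD, VTN)

# Output of one pass transistor drives the GATE of the next: drops accumulate.
first = nmos_pass(VDD, VDD, VTN)       # VDD - Vtn
second = nmos_pass(VDD, first, VTN)    # VDD - 2*Vtn

# pMOS passing a 0 with its gate at 0 V stops at |Vtp|.
low_level = pmos_pass(0.0, 0.0, VTP)

print(series_out, first, second, low_level)
```

With these assumed values the series chain ends at 4 V, the gate-driven stage at 3 V, and the p-device leaves 1 V on the output.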

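Complementary Pass Transistor Logic

[Figure (a): the true and complemented inputs A, A', B, B' drive a pass-transistor network producing F and an inverse pass-transistor network producing F'.]

[Figure (b): example gates built this way: AND/NAND (F = AB and its complement), OR/NOR (F = A + B and its complement), EXOR/NEXOR (F = A ⊕ B and its complement).]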

Advantages and Disadvantages

Advantages:

Fewer transistors.
No static power consumption.

Disadvantages:

Output voltage degrades.
Not an ideal switch, because of its series resistance.
The delay of series-connected pass transistors is large.

Transmission gates

Transmission gates (contd..)

Logic with Transmission gates

Advantages and Disadvantages

Advantages:

Fewer transistors than static CMOS.
No static power consumption.
Efficient building of complex gates.

Disadvantages:

Not an ideal switch, because of its series resistance.
The delay of series-connected transmission gates is large.

Gate Logic

Sizing of NMOS inverter

The following parameters are calculated for different sizes of nMOS inverter:

Zp.u / Zp.d ratio.
Pull-up and pull-down resistance.
Power dissipation.
Standard unit of capacitance.
Gate delay.

Sizing of NMOS Nand Gate.

The ratio of the pull-up impedance to the total impedance of the n series pull-down transistors, Zp.u / (n Zp.d), must be at least 4:1 to give a correct output voltage level.

The nMOS NAND gate geometry reveals two factors:

The area of a NAND gate is greater than that of an inverter, because there are more pull-down transistors and the pull-up transistor must be made correspondingly longer.

The delay also increases in direct proportion to the number of inputs added.

NMOS Nor Gate

The pull-down transistors of a NOR gate are in parallel, so the pull-up to pull-down ratio seen through any one of them is the same as for an inverter, and the gate has the same characteristics as an inverter. The area occupied is reasonable, since the pull-up transistor does not need to be lengthened. The NOR gate is therefore preferable to the NAND gate.

CMOS Logic

Properties of CMOS Gates


High noise margins: VOH and VOL are at VDD and GND, respectively.
No static power consumption: there never exists a direct path between VDD and VSS (GND) in steady-state mode.
Comparable rise and fall times (under appropriate sizing conditions).

CMOS Nand and Nor Characteristics

A CMOS NAND gate has none of the ratio restrictions of the nMOS NAND gate, but the geometry should be kept symmetrical by:

allowing for the extended fall time caused by the resistance of the series nMOS transistors;
keeping the transfer characteristic centred on VDD/2.

A CMOS NOR gate has series p-transistors, which increase the resistance and the delay. This affects the transfer characteristic and reduces noise immunity, so the geometries of the nMOS and pMOS transistors should be adjusted.

CMOS Logic

Static CMOS Logic
Pseudo NMOS Logic
Dynamic CMOS Logic
Domino CMOS Logic
Clocked CMOS Logic
n-p CMOS Logic

Static CMOS: Switch Delay Model


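[Figure: switch-level RC models of the INV, NAND2 and NOR2 gates. Each transistor is replaced by a switch with on-resistance Rp or Rn (equivalent resistance Req), with internal node capacitance Cint and load capacitance CL.]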

Input Pattern Effects on Delay

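[Figure: NAND2 switch model with two parallel pull-up resistances Rp, two series pull-down resistances Rn, internal capacitance Cint and load capacitance CL.]

Delay is dependent on the pattern of inputs.

Low-to-high transition:
both inputs go low: delay is 0.69 (Rp/2) CL
one input goes low: delay is 0.69 Rp CL

High-to-low transition:
both inputs go high: delay is 0.69 (2Rn) CL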
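The three cases above follow from one 0.69·R·C expression with a different equivalent resistance. A minimal sketch, assuming illustrative values of 10 kΩ for both Rn and Rp and 100 fF for CL, and ignoring Cint; the function name and the transition labels are hypothetical:

```python
def nand2_delay(r_n, r_p, c_load, transition):
    """First-order 0.69*R*C delay of a 2-input NAND, ignoring Cint."""
    r_eq = {
        "both_low": r_p / 2,    # two pMOS in parallel charge CL
        "one_low": r_p,         # a single pMOS charges CL
        "both_high": 2 * r_n,   # two nMOS in series discharge CL
    }[transition]
    return 0.69 * r_eq * c_load


R_N = R_P = 10e3   # ohms, assumed
C_L = 100e-15      # farads, assumed

for name in ("both_low", "one_low", "both_high"):
    print(name, nand2_delay(R_N, R_P, C_L, name))
```

With equal Rn and Rp the pull-down through two series devices is the slowest case and the pull-up through two parallel devices the fastest, a factor of four apart.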

Delay Dependence on Input Patterns


[Figure: NAND2 output voltage [V] versus time [ps] for the input patterns A=B=1→0; A=1→0, B=1; and A=1, B=1→0.]

Input data pattern    Delay (psec)
A = B = 0→1           67
A = 1, B = 0→1        64
A = 0→1, B = 1        61
A = B = 1→0           45
A = 1, B = 1→0        80
A = 1→0, B = 1        81

NMOS = 0.5 µm/0.25 µm, PMOS = 0.75 µm/0.25 µm, CL = 100 fF

Pseudo NMOS Logic


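Pseudo-nMOS logic is obtained from static CMOS logic by reducing the pull-up network from N input-driven p-type transistors to a single p-type transistor.

[Figure: static CMOS gate in which inputs In1, In2, ..., InN drive both the pull-up network (PUN, connected to VDD) and the pull-down network (PDN), with output F between them.]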

Pseudo NMOS Operation


The pull-down network of the gate is the same as for a fully complementary gate.
The pull-up network is replaced by a single p-type transistor whose gate is connected to VSS, leaving the transistor permanently on. The p-type transistor is used as a resistor.
When the gate inputs are at 0 V, the n-type transistors are off and the p-type transistor pulls the gate's output up to VDD.
When the gate inputs are at VDD, both the p-type and the n-type transistors are on, and together they determine the gate's output voltage.

Pseudo NMOS Characteristics

Pseudo NMOS VTC


[Figure: pseudo-nMOS voltage transfer characteristics, Vout [V] versus Vin [V], for W/Lp = 4, 2, 1, 0.5 and 0.25.]

Advantages and Disadvantages


Advantages:

The main advantage of the pseudo-nMOS gate is the small size of the pull-up network, both in number of devices and in wiring complexity.

Disadvantages:

The pull-up resistance is higher, so the delay is longer and the circuit is slower.
Static power is dissipated through the conduction path between VDD and VSS whenever the output is low.

Dynamic CMOS Logic

The static power dissipation of pseudo-nMOS logic motivates an alternative, dynamic CMOS logic. It avoids static power dissipation by adding a clock input that defines precharge and conditional evaluation phases.

[Figure: dynamic gate with clocked p-type precharge transistor Mp, pull-down network PDN with inputs In1, In2, In3, clocked n-type evaluation transistor Me, and output Out loaded by CL.]

Two-phase operation:
Precharge (CLK = 0)
Evaluate (CLK = 1)

Dynamic CMOS Logic operation

In static circuits, at every point in time (except when switching) the output is connected to either GND or VDD via a low-resistance path.

A fan-in of n requires 2n devices (n N-type + n P-type).

Dynamic circuits rely on the temporary storage of signal values on the capacitance of high-impedance nodes.

A fan-in of n requires only n + 2 transistors (n + 1 N-type + 1 P-type).

Dynamic CMOS Logic operation

Precharge. When CLK goes low, the p-type transistor starts charging the precharge capacitance. The pull-down transistor controlled by the clock keeps the precharge node from being drained. The length of the CLK = 0 phase is adjusted to ensure that the storage node is charged to a solid logic 1.

Evaluate. When CLK goes high, precharging stops, i.e. the p-type pull-up turns off, and the evaluation phase begins, i.e. the clocked n-type pull-down turns on. The input signals must rise monotonically: if an input goes from 0 to 1 and back to 0, it will inadvertently discharge the precharge capacitance. If the inputs create a conducting path through the pull-down network, the precharge capacitance is discharged, forcing the gate's output to 0. If they do not, the gate's output is left charged at logic 1.

Conditions on Output

Once the output of a dynamic gate is discharged, it cannot be charged again until the next precharge operation.
Inputs to the gate can make at most one transition during evaluation.
The output can be in the high-impedance state during and after evaluation (PDN off); the state is stored on CL.

[Figure: dynamic gate implementing Out = NOT((A·B) + C). During precharge (Clk = 0) Mp is on and Me is off, so Out is charged to 1; during evaluation (Clk = 1) Mp is off and Me is on.]
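The conditions above can be checked with a small behavioural model. This is a sketch only: it models logic values rather than voltages, ignores leakage and charge sharing, and the class and method names are hypothetical.

```python
class DynamicGate:
    """Behavioural model of an n-type dynamic gate (logic values only)."""

    def __init__(self, pdn):
        self.pdn = pdn   # function: True when the pull-down network conducts
        self.out = 1     # value stored on CL

    def precharge(self):
        # CLK = 0: Mp on, Me off, so the output node is charged to 1.
        self.out = 1

    def evaluate(self, **inputs):
        # CLK = 1: Mp off, Me on. The node can be discharged but not recharged.
        if self.pdn(**inputs):
            self.out = 0
        return self.out


# Pull-down network for Out = NOT((A and B) or C).
gate = DynamicGate(lambda A, B, C: bool((A and B) or C))

gate.precharge()
first = gate.evaluate(A=1, B=1, C=0)    # PDN conducts, node discharges
second = gate.evaluate(A=0, B=0, C=0)   # inputs drop again, node stays low
gate.precharge()
third = gate.evaluate(A=1, B=0, C=0)    # no path, node keeps its charge
print(first, second, third)
```

The second call shows why an input may make at most one transition during evaluation: once the node has been discharged, changing the inputs back cannot restore it before the next precharge.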

Properties of Dynamic Gates

The logic function is implemented by the PDN only.
Full-swing outputs (VOL = GND and VOH = VDD).
Non-ratioed: sizing of the devices does not affect the logic levels.
Faster switching speeds due to reduced load capacitance.
Overall power dissipation is usually higher than static CMOS: there is no glitching, but transition probabilities are higher and there is extra load on Clk.
The PDN starts to work as soon as the input signals exceed VTn, so VM, VIH and VIL are all equal to VTn, which gives a low noise margin (NML).

Advantages and Disadvantages

Advantages:

Low area.
Higher speed than static complementary gates.

Disadvantages:

Precharged gates introduce functional complexity, because they must be operated in two distinct phases.
They require a clock signal.
They are more sensitive to noise.
Their clocking signals consume power and are difficult to turn off to save power.

Cascading Dynamic Gates


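[Figure: two cascaded dynamic inverters, each with precharge transistor Mp and evaluation transistor Me clocked by Clk. In drives the first stage, whose output Out1 drives the second stage, which produces Out2. The accompanying waveforms show Clk, In, Out1 and Out2 against time t, with the level VTn marked.]

Only 0 → 1 transitions allowed at inputs!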

Cascading Dynamic Gates- Problem

Out2 should remain at VDD, since Out1 transitions to 0 during evaluation. However, because Out1 takes a finite propagation delay to discharge to GND, the second output also starts to discharge. The PDN of the second dynamic inverter turns off only when Out1 falls to VTn. Correct operation is guaranteed (ignoring charge redistribution and leakage) as long as the inputs can only make a single 0 -> 1 transition during the evaluation period. Setting all inputs of the second gate to 0 during precharge fixes the problem.

Domino CMOS Logic


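[Figure: dynamic gate (precharge transistor Mp, pull-down network PDN with inputs In1, In2, In3, evaluation transistor Me) followed by a static inverter producing Out1. The dynamic node either stays at 1 or makes a 1 → 0 transition.]

Domino logic is the combination of a dynamic logic gate and a static inverter.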

Domino CMOS Logic operation

Precharge phase: same as dynamic logic.
Evaluate phase: modified from dynamic logic.

When CLK goes high, precharging stops, i.e. the p-type pull-up turns off, and the evaluation phase begins, i.e. the clocked n-type pull-down turns on. The input signals must rise monotonically: if an input goes from 0 to 1 and back to 0, it will inadvertently discharge the precharge capacitance. If the inputs create a conducting path through the pull-down network, the precharge capacitance is discharged, forcing its value to 0 and the gate's output (through the inverter) to 1. If they do not, the storage node is left charged at logic 1 and the gate's output is 0.
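The evaluate phase described above can be sketched as a behavioural model over discrete time steps. This is an illustration only (logic values, no delays or charge sharing), and the function name is hypothetical.

```python
def domino_evaluate(pdn_conducts_over_time):
    """Output of a domino gate during one evaluate phase.

    pdn_conducts_over_time: sequence of booleans, True at each time step in
    which the inputs form a conducting path through the pull-down network.
    """
    node = 1          # storage node value left by the precharge phase
    outputs = []
    for conducts in pdn_conducts_over_time:
        if conducts:
            node = 0  # discharge is irreversible until the next precharge
        outputs.append(1 - node)  # static inverter
    return outputs


print(domino_evaluate([False, False, False]))  # no path: output stays 0
print(domino_evaluate([False, True, True]))    # path forms: output rises once
print(domino_evaluate([False, True, False]))   # input glitch: output stuck at 1
```

In every case the output either stays at 0 or makes a single 0 → 1 transition, which is the property that lets one domino gate safely drive the next.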

Properties of Domino Logic


Only non-inverting logic can be implemented.
Smaller area compared to static CMOS.
Free of glitches, because each dynamic node can only make a transition from logic 1 to logic 0.
Very high speed:

the static inverter can be skewed, since only the L-H transition matters;
input capacitance is reduced, giving a smaller logical effort.

Cascading Domino Logic Gates


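[Figure: two cascaded domino gates. The first has inputs In1, In2, In3 and output Out1 (dynamic node 1 → 1 or 1 → 0); the second has inputs In4, In5 together with Out1 and produces Out2 (0 → 0 or 0 → 1). Each gate has a precharge transistor Mp, a pull-down network PDN and an evaluation transistor Me clocked by Clk.]

The static inverter ensures that all inputs to the next domino gate are set to 0 at the end of the precharge period. Hence, the only possible transition during evaluation is 0 -> 1.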

Clocked CMOS Logic

It is a combination of:

static CMOS logic;
one additional nMOS transistor with Clk as its input;
one additional pMOS transistor with a clock input.

A gate with fan-in n requires 2n + 2 transistors.
When Clk goes high, the logic of the circuit is evaluated, because the clocked nMOS is on.
When Clk goes low, the logic is not evaluated, because the clocked nMOS is off.

Advantages and Disadvantages


Advantages:

Clocked CMOS logic has been used for very low power CMOS and/or for minimizing hot-electron problems in N-FET devices.

Disadvantages:

More area.
More complexity.

N-P CMOS Logic

An elegant solution to the dynamic CMOS logic erroneous evaluation problem is to use NP Domino Logic (also called NORA logic) as shown below.

Alternate stages of N logic with stages of P logic

N-logic stages use the true clock and have normal precharge and evaluation phases, with the N logic tree in the pull-down leg.
P-logic stages use the complement clock, with the P logic tree tied above the output node.
During precharge, clk is low (-clk is high): the P-logic outputs precharge to ground while the N-logic outputs precharge to Vdd.
During evaluate, clk is high (-clk is low) and both types of stage evaluate: the N logic tree conditionally pulls its output to ground while the P logic tree conditionally pulls its output to Vdd.

Inverter outputs can be used to feed other N-blocks from N-blocks, or to feed other P-blocks from P-blocks.

Cascading N-P Logic

Example

Logic below:
Stage 1 is X = (A B)
Stage 2 is G = X + Y
Stage 3 is Z = (F G + H)

BiCMOS Technology

CMOS properties:

Low power.
Slower.

BJT properties:

Faster.
High power.

Implementation: CMOS and BJT combined, with CMOS for the core logic and BJT for the interface.

Basic Circuits

Inverter
NAND gate
NOR gate

BiCMOS Circuits

Inverter

Nand Gate

Structured Design

A digital block or standard cell is designed at the following levels:

Logic level
Circuit level
Stick diagram
Layout

Examples : Combinational Logic


5-way selector.
Multiplexers and demultiplexers.
Encoders and decoders.
Parity generator.
Bus arbitration logic.
Gray code to binary code converter.

Adders

1-bit full adder
Manchester carry chain adder
Enhancement techniques:

Carry select adders
Carry skip adders
Carry look-ahead adders

32-bit adders

1-bit full adder

CMOS Full Adder

Sum Output

Carry output

Complementary Static CMOS Adder

Standard cells dimensions for Adder

The adder is made up of the following standard cells (dimensions in λ):

Multiplexers: L = 11, W = 7.
i) nMOS inverter (8:1 and 4:1 ratio):
with butting contact: L = 22, W = 10;
with buried contact: L = 30, W = 10.
ii) CMOS inverter: L = 35, W = 18.
Communication paths: 3λ spacing between metal contacts.

So, for the adder standard cell: L = 190, W = 150.

Layout of full adder

Layout2 of full adder

Propagate and Generate Signal

For a full adder, define what happens to carries (C is the carry input):

Generate: Cout = 1 independent of C
G = A · B

Propagate: Cout = C
P = A ⊕ B

Kill: Cout = 0 independent of C
K = ~A · ~B
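These three definitions can be checked exhaustively against ordinary binary addition. The short sketch below is illustrative only; the function name is hypothetical.

```python
from itertools import product


def carry_out(a, b, c_in):
    """Carry output of a full adder expressed with generate/propagate/kill."""
    g = a & b               # generate: Cout = 1 regardless of c_in
    p = a ^ b               # propagate: Cout = c_in
    k = (1 - a) & (1 - b)   # kill: Cout = 0 regardless of c_in
    assert g + p + k == 1   # exactly one of the three conditions holds
    return g | (p & c_in)


# Compare with the arithmetic definition of the carry for all 8 input cases.
for a, b, c in product((0, 1), repeat=3):
    assert carry_out(a, b, c) == (a + b + c) // 2

print("G/P/K carry matches binary addition for all inputs")
```

Because exactly one of G, P and K is true for any pair of input bits, the carry can be written as Cout = G + P·C.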

Manchester Carry Chain

Build from switch logic using propagate, generate, & kill

Manchester Chain adders

The carry node is precharged by the clock signal instead of being passed through logic.
The carry path is gated by the propagate signal (P = A ⊕ B) through a single n-type pass transistor.
The generate signal is G = A · B.

4-stage Dynamic Manchester Chain

Carry-Ripple Adder

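Simplest design: cascade full adders.
The critical path goes from Cin to Cout.
Design the full adder to have a fast carry delay.

[Figure: 4-bit carry-ripple adder built from four full adders, with inputs A1..A4, B1..B4 and Cin, internal carries C1..C3, sum outputs S1..S4 and carry output Cout.]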

Carry Look-Ahead Adder (CLA)

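To avoid the linear growth of the carry delay, we use a Carry Look-Ahead Adder (CLA), in which the carries can be generated in parallel. Each output carry is expressed in terms of the generate and propagate signals. The expressions for a 4-bit CLA are:

C1 = G0 + P0C0
C2 = G1 + P1G0 + P1P0C0
C3 = G2 + P2G1 + P2P1G0 + P2P1P0C0
C4 = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0C0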

4-Stage Carry look-ahead adder

1-bit CMOS CLA

C1=G0+P0C0
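Applying C1 = G0 + P0C0 recursively gives every carry of a 4-bit CLA directly in terms of the G and P signals and C0. The sketch below writes out those standard expansions and compares them with a rippled carry; the function names and the LSB-first bit ordering are assumptions made for the example.

```python
def cla_carries(a_bits, b_bits, c0):
    """Carries C1..C4 of a 4-bit CLA; bit lists are LSB first."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate signals
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate signals
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


def ripple_carries(a_bits, b_bits, c0):
    """Reference: the same carries computed one stage after another."""
    carries, c = [], c0
    for a, b in zip(a_bits, b_bits):
        c = (a & b) | ((a ^ b) & c)
        carries.append(c)
    return carries


a = [1, 1, 0, 1]   # 11, LSB first
b = [1, 0, 1, 0]   # 5, LSB first
print(cla_carries(a, b, 0), ripple_carries(a, b, 0))
```

Every carry depends only on the inputs, not on the previous carry, which is what allows the hardware to produce them in parallel.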

4-bit CMOS CLA

Carry Select Adder

It is also referred to as a conditional sum adder.
Each stage of the adder contains two blocks:

one block computed with Cin = logical 0;
another block computed with Cin = logical 1.

S and Cout are then selected by the actual Cin using a 2:1 multiplexer.
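The selection scheme can be sketched behaviourally as follows. The block size, the word width and the function names are assumptions chosen for illustration, and each block is modelled with integer addition rather than gate-level full adders.

```python
def block_add(a, b, c_in, width):
    """Add two width-bit values; return (sum, carry_out)."""
    total = a + b + c_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, c_in, width=8, block=4):
    """Carry-select addition with equal blocks (width must divide by block)."""
    mask = (1 << block) - 1
    result, carry = 0, c_in
    for shift in range(0, width, block):
        xa, xb = (a >> shift) & mask, (b >> shift) & mask
        s0, c0 = block_add(xa, xb, 0, block)   # result assuming Cin = 0
        s1, c1 = block_add(xa, xb, 1, block)   # result assuming Cin = 1
        s, carry = (s1, c1) if carry else (s0, c0)   # 2:1 multiplexer
        result |= s << shift
    return result, carry


print(carry_select_add(0x5A, 0xC7, 0))   # 8-bit sum and carry out
```

In hardware both candidate results of every block are computed at the same time, so only the multiplexer delay is added per block once the actual carry arrives.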

Carry Select Adder Structure

8-bit Carry Select Adder

Delay of Carry Select Adder

The delay of an n-bit adder is given by

T = P·K1 + (M - 1)·K2

where
M = number of blocks in the adder,
P = number of cells in each block,
K1 = delay through one adder cell,
K2 = delay through a multiplexer.
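The formula makes the trade-off between block size and block count easy to explore. The sketch below evaluates it for a 32-bit adder; the delay values K1 = 1.0 and K2 = 0.5 are arbitrary illustrative units, not figures from the slides, and the function name is hypothetical.

```python
def carry_select_delay(n_bits, block_size, k1, k2):
    """T = P*K1 + (M - 1)*K2 for an n-bit adder split into equal blocks."""
    if n_bits % block_size:
        raise ValueError("n_bits must be a multiple of block_size")
    m = n_bits // block_size   # M: number of blocks
    return block_size * k1 + (m - 1) * k2


K1, K2 = 1.0, 0.5   # assumed cell and multiplexer delays
delays = {p: carry_select_delay(32, p, K1, K2) for p in (1, 2, 4, 8, 16, 32)}
best_p = min(delays, key=delays.get)
print(delays, best_p)
```

With these assumed delays the minimum falls at P = 4 (T = 7.5 units): larger blocks spend longer rippling inside a block, while smaller blocks add more multiplexer stages.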

Carry Skip Adder (CSA)


It is also called a carry bypass adder.
It looks for cases in which the carry out of a set of bits is identical to the carry in.
It is typically organized into m-bit stages.
If Ai ≠ Bi (i.e. Pi = Ai ⊕ Bi = 1) for every bit in a stage, a bypass gate sends the stage's carry input directly to its carry output.

Simplified Carry-Skip Adder Logic


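[Figure: two full adders (FA) for bits i and i+1, with inputs ai, bi, ai+1, bi+1, carries ci and ci+1, and sums si and si+1. The propagate signals pi and pi+1 are combined to form the skip signal.]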

2-Stage Carry Skip Adder

4-stage carry skip adder

If (P0 and P1 and P2 and P3) = 1, then C3 = C0; else C3 = output carry of the stage-4 adder.

Delay of CSA

The worst-case delay T is given by

T = 2(P - 1)K1 + (M - 2)K2

where
P = number of cells in each block,
M = number of blocks in the adder,
K1 = delay through one adder cell,
K2 = delay for skipping the carry over a block.
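A quick numerical check of the worst-case formula is sketched below. The ripple-adder reference delay of one cell delay per bit, the values K1 = 1.0 and K2 = 0.5, and the function names are assumptions for illustration only.

```python
def carry_skip_delay(n_bits, block_size, k1, k2):
    """Worst case T = 2(P - 1)K1 + (M - 2)K2 for equal-sized blocks."""
    if n_bits % block_size or n_bits // block_size < 2:
        raise ValueError("need at least two equal blocks")
    m = n_bits // block_size   # M: number of blocks
    return 2 * (block_size - 1) * k1 + (m - 2) * k2


def ripple_delay(n_bits, k1):
    """Assumed reference: the carry ripples through every cell."""
    return n_bits * k1


K1, K2 = 1.0, 0.5   # assumed cell delay and skip delay
for p in (2, 4, 8, 16):
    print(p, carry_skip_delay(32, p, K1, K2), ripple_delay(32, K1))
```

The worst case has the carry rippling through the first block, skipping the M - 2 middle blocks, and rippling again through the last block, which is why only the two end blocks contribute cell delays.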

Delay for Ripple and Carry Skip Adder

Comparison of CLA and CSK


Using 32-bit operands, a multi-level carry-skip adder was 14% faster than the carry-lookahead adder, and its power dissipation was 58% of that of the carry-lookahead adder.
Using 64-bit operands, a one-level carry-skip adder was 38% slower, and its power consumption was 68% of that of the carry-lookahead adder.

Comparison of Adders

MULTIPLIERS

Multipliers: Basics

4-bit Multiplier

Multiplier structure using Shift and Add

Multipliers

Serial-parallel multiplier
Braun array
2's complement multiplication using the Baugh-Wooley method
Pipelined multiplier array
Modified Booth's algorithm
Wallace tree multipliers
Recursive decomposition of multiplication
Dadda's method

Serial-Parallel Multiplier

Braun Array

Modified Booth Algorithm

Booth Multiplier: Procedure

Structure of Booth's Multiplier

Wallace Tree Multiplier

A Wallace tree is an implementation of an adder tree designed for minimum propagation delay.
Completion time is proportional to log2 n.
It is an optimized column adder tree.
It combines all partial products into 2 vectors (carry and sum).
The carry and sum outputs are combined using a conventional adder.
It compresses the number of stages of partial products.
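The logarithmic growth can be seen by counting reduction levels. The sketch below assumes the tree is built from 3:2 compressors (full adders), each turning three rows into two, and counts the levels needed to reach the final two vectors; the function name is hypothetical.

```python
def wallace_levels(n_partial_products):
    """Number of 3:2 compression levels needed to reduce n rows to 2."""
    rows, levels = n_partial_products, 0
    while rows > 2:
        # Every full group of 3 rows becomes 2; leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


for n in (4, 8, 16, 32):
    print(n, wallace_levels(n))
```

Doubling the operand width from 8 to 16 to 32 bits adds only two levels each time (4, 6, 8), whereas a linear array of adders would double in depth.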

Wallace Tree Multiplier Structure

Sequential Logic

D flip-flop
Two-phase clocking
Dynamic shift register
RAM
ROM

Design of Subsystem - ALU

Data path of a Processor

Designing the complex systems - ALU

Complex systems are designed using a top-down approach with the help of CAD tools.
Partition the system sensibly, aiming for simple interconnection and high regularity between sub-systems.
Generate and verify each section of the design.
Calculate the dimensions of the layout of each subsystem and check its proportion of the total chip area.

Bit Slice design of ALU

Design of Adder and Shifter is essential for ALU

Barrel Shifter in 4-bit ALU

Barrel Shifter- Operation

4x4 Barrel Shifter Pass Transistor Circuit

Routing Power rails for ALU
