2.1 Advanced Processor Technology

Advanced Processor Technology
Architectural families of modern computers are

• CISC
• RISC
• Superscalar
• VLIW
• Super pipelined
• Vector processors
• Symbolic processors
Design Space of processors
 Various processor families can be mapped onto a
coordinated space of clock rate versus cycles per
instruction(CPI).
 As implementation technology evolves rapidly, the
clock rates of various processors are gradually
moving from low to higher speeds toward the right of
the design space.
 Manufacturers are trying to lower the CPI rate using
hardware and software approaches.
CISC processors:
Intel i486,M68040,VAX/8600,IBM 390.
Clock rate: 33 to 50 MHz.

CPI:varies from 1 to 2 cycles.
Microprogrammed control.
RISC processors:
Intel i860,SPARC,MIPS R3000,IBM RS/6000.

CPI:1 to 2 cycles.
Hardwired control.
Super scalar processors:
Intel i960CA,IBM RS/6000,DEC 21064.
Multiple instructions are issued simultaneously

during each cycle.
CPI: .2 to .5 cycles.
Subclass of RISC processors

Very long instruction word(VLIW):

Uses more functional units than superscalar
processor.
CPI: .1 to .2 cycles.
Uses very long instructions (256 to 1024 bits per
instruction.)
Implemented with micro programmed control.
Super pipelined processors:
Uses multiphase clocks with a much increased
clock rate.
CPI: 1 to 5 cycles.
Vector supercomputers:
Uses multiple functional units for concurrent
scalar and vector operations.
Processors are super pipelined.
Very high clock rate:100 to 1000 MHz.
Very low CPI:.1 to .2 cycles.

Instruction pipelines
The execution cycle of a typical instruction includes
four phases.
• Fetch
• Decode
• Execute
• Write-back
Pipeline cycle: It is defined as the time required for

each phase to complete its operation assuming
equal delay in all phases.
 Instruction pipeline cycle: the clock period of the
instruction pipeline.
 Instruction issue latency: the time ( in cycles)
required between the issuing of two adjacent
instructions.
 Instruction issue rate: the number of instructions
issued per cycle.
 Simple operation latency: simple operations are
integer adds, loads, stores, etc.
 Complex operations are divides, cache misses.
 Resource Conflicts: two or more instructions

demand use of the same functional unit at the same
time.
• Base scalar processor
– One instruction issued per cycle
– One cycle latency for simple operation

– One cycle latency b/w instruction issues
• Instruction pipeline fully utilized
– Successive can enter it continuously at the rate of

one per cycle
• If the instruction issue latency is 2 cycles per
instruction –pipeline is underutilized
• Pipeline cycle time is doubled by combining pipeline
stages
• Fetch and decode phases are combined into one
pipeline stage
• Execute and write back are combined into other
stage
• Poor pipeline utilization
Processors and Co-processors
• The central processor of a computer is called the
CPU.
• The CPU is a scalar processor consist of multiple
functional units such as integer arithmetic and logic
unit, a floating point accelerator.
• Floating point unit can be built on a coprocessor
which is attached to the CPU.
• Coprocessor executes instructions dispatched from
the CPU.
• Coprocessor may be a
– floating point accelerator executing scalar data,
– a vector processor executing vector operands,
– a digital signal processor,
– or a Lisp processor executing AI programs.
• Coprocessors cannot handle I/O operations.

• Coprocessors cannot be used alone.
• Processor and Coprocessor operate with a

host-back-end relationship.
• Coprocessor may be more powerful than
its host
Instruction set architecture
• RISC architecture
• CISC architecture
• Instruction set of a computer specifies

– Primitive commands
– Machine instructions
• Complexity of instruction depends on

– Data formats &Addressing modes
– Registers &Opcode specification

– Flow control mechanism
Complex Instruction Sets
• More functions built into hardware –instruction set
are very large & complex
• User defined instructions were implemented using
microcodes
• CISC instruction set contains – 120 to 350
instructions
• Variable instruction & data formats
• 8-24 general purpose register
Complex Instruction Sets
• Executes large number of memory reference
operations
• HLL –implemented in hardware/firmware
• Simply complier development

• Improve execution efficiency
• Extension from scalar to vector
• Symbolic instruction
Reduced Instruction sets
• Only 25% of the instructions of complex instructions

are frequently used for 95% of time
• 75% of hardware supported instructions are not
used
• Pushing rarely used instructions into software will
vacate chip are for building more powerful RISC
architecture
• Less than 100 instructions
• Fixed instruction format -32 bits
• 3-5 addressing modes
• Register based instructions
• Memory access- load/store instruction only

• Larger register file
– Fast context switching
– More instruction execute in one cycle

• Reduction in instruction set complexity, entire
processor on single VLSI chip
• High clock rate
• Lower CPI
• Higher MIPS ratings
Architectural Distinctions
CISC RISC
Unified cache –holding both instruction Separate instruction & data cache
& cache
Same data/instruction path Different access path
Use split codes
Digital Equipment VAX 8600
• Introduced by Digital Equipment
Corporation in 1985.
• This implements a typical CISC architecture
with micro programmed control.
• Instruction set Contains about 300
instructions and 20 addressing modes.
• Consists of two functional units for
concurrent execution of integer and floating
point instructions.
• Unified cache is use for holding both instructions
and data.
• 16 GPRs in instruction unit.
• Translation lookaside buffer(TLB) is used in the

memory unit for fast generation of physical
address from virtual address.
• Both integer and floating point units are

pipelined.
• CPI varies from 2 cycles to 20 cycles.

Intel i860 processor architecture
• Introduced by Intel Corporation in 1989.
• It is a 64 bit RISC processor fabricated on a
single chip containing more than one million
transistors.
• There are 9 functional units interconnected by
multiple data paths with widths ranging from 32
to 128 bits.
• All external or internal address buses are 32 bit
wide.
• All external or internal data bus is 64 bits wide.
• Instruction cache has 4Kbytes organized
as a two way set-associative memory with
32 bytes per cache block. It transfers 64
bits per clock cycle.
• Data cache is a two way set-associative

memory of 8Kbytes.It transfers 128 bits
per clock cycle. Write-back policy is used.
• Bus control co-ordinates the 64 bit data

transfer between chip and outside world.
• MMU implements protected 4Kbyte paged virtual
memory of 232 bytes via TLB.
• There are two floating point units: multiplier-unit

and adder-unit which can be used separately or
simultaneously under coordination of the floating
point control unit.
• Both integer unit and floating point control unit

can execute concurrently.
• Graphics unit executes integer operations

corresponding to 8,126,32 bit pixel data types.
• This unit supports three-dimensional
drawing in a graphics frame buffer with
color intensity, shading and hidden surface
elimination.
• Merge register is used only by vector
integer instructions.
• I860 executes 82 instructions including 42
RISC integer, 24 floating point,10 graphics
and 6 assembler pseudo operations.
• All these instructions execute in one cycle.

2.1 Advanced Processor Technology

Загружено:

Сведения о документе

Оригинальное название

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

2.1 Advanced Processor Technology

Загружено:

Авторское право:

Доступные форматы

Advanced Processor Technology

Architectural families of modern computers are

Clock rate: 33 to 50 MHz.

Clock rate: 20 to 120 MHz.

Multiple instructions are issued simultaneously

Subclass of RISC processors

Very long instruction word(VLIW):

Very high clock rate:100 to 1000 MHz.

Very low CPI:.1 to .2 cycles.

Pipeline cycle: It is defined as the time required for

 Resource Conflicts: two or more instructions

– One cycle latency for simple operation

• Instruction pipeline fully utilized

– Successive can enter it continuously at the rate of

• Coprocessors cannot handle I/O operations.

• Coprocessors cannot be used alone.

• Processor and Coprocessor operate with a

• Instruction set of a computer specifies

• Complexity of instruction depends on

– Registers &Opcode specification

• Simply complier development

• Extension from scalar to vector

• Only 25% of the instructions of complex instructions

• Memory access- load/store instruction only

– More instruction execute in one cycle

• 16 GPRs in instruction unit.

• Translation lookaside buffer(TLB) is used in the

• Both integer and floating point units are

• CPI varies from 2 cycles to 20 cycles.

• Data cache is a two way set-associative

• Bus control co-ordinates the 64 bit data

• There are two floating point units: multiplier-unit

• Both integer unit and floating point control unit

• Graphics unit executes integer operations

Вам также может понравиться