Mr. A. B. Shinde
Assistant Professor,
Electronics Engineering,
PVPIT, Budhgaon.
shindesir.pvp@gmail.com
CISC
CISC are chips that are easy to program and which make efficient use of
memory.
Examples of CISC processor families are System/360, PDP-11, VAX,
68000, and x86.
CISC History
The first PC microprocessors developed were CISC chips, because
all the instructions the processor could execute were built into the
chip.
Memory was expensive in the early days of PCs, and CISC chips
saved memory because their programming could be fed directly into
the processor.
CISC was developed to make compiler development simpler. It shifts
most of the burden of generating machine instructions to the
processor.
For example, instead of having to make a compiler write long
machine instructions to calculate a square-root, a CISC processor
would have a built-in ability to do this.
CISC Philosophy
The three decisions that led to the CISC philosophy, which drove all
computer designs until the late 1980s and is still in major use today,
were to:
use microcode,
build rich instruction sets, and
build high-level instruction sets.
Use Microcode:
Microcode uses simple logic to control the data paths between the
various elements of the processor.
In a microprogrammed system, the main processor has some built-in
memory (typically ROM) that contains groups of microcode
instructions which correspond with each machine-language
instruction.
Since the microcode memory can be much faster than main
memory, an instruction set can be implemented in microcode
without losing much speed over a purely hard-wired
implementation.
Characteristics
Equation
(the number of cycles per instruction * instruction cycle time) = execution time.
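As a minimal sketch of the equation above, the snippet below computes the execution time of one instruction and, as an assumed extension not stated on the slide, multiplies it by an instruction count to estimate a whole program. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def instruction_time(cycles_per_instruction: float, cycle_time_s: float) -> float:
    """Execution time of one instruction: cycles per instruction * cycle time."""
    return cycles_per_instruction * cycle_time_s


def program_time(instruction_count: int, cycles_per_instruction: float,
                 cycle_time_s: float) -> float:
    """Assumed extension: total time = instruction count * time per instruction."""
    return instruction_count * instruction_time(cycles_per_instruction, cycle_time_s)


# Hypothetical workloads on a 100 MHz clock (10 ns cycle time).
cisc = program_time(instruction_count=1_000, cycles_per_instruction=4.0,
                    cycle_time_s=10e-9)
risc = program_time(instruction_count=1_500, cycles_per_instruction=1.2,
                    cycle_time_s=10e-9)
print(f"CISC-style: {cisc * 1e6:.1f} us, RISC-style: {risc * 1e6:.1f} us")
```

The made-up numbers only show the trade-off: fewer, slower instructions against more, faster ones.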
Addressing modes
Register Addressing Mode
Memory Addressing Modes
Displacement Only Addressing Mode
Register Indirect Addressing Modes
Indexed Addressing Modes
Based Indexed Addressing Modes
Based Indexed Plus Displacement Addressing
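The memory addressing modes listed above differ only in which components are summed to form the effective address. The sketch below illustrates this with a hypothetical 16-bit machine. The register names, register contents, and memory contents are assumptions made for the example. Register addressing mode is not shown because the operand is held in a register and no memory address is formed.

```python
# Hypothetical machine state used only for this illustration.
registers = {"bx": 0x1000, "si": 0x0020}
memory = {0x0040: 11, 0x1000: 22, 0x0028: 33, 0x1020: 44, 0x1028: 55}


def effective_address(base=None, index=None, disp=0):
    """Sum whichever components the addressing mode supplies."""
    ea = disp
    if base is not None:
        ea += registers[base]
    if index is not None:
        ea += registers[index]
    return ea & 0xFFFF  # assume a 16-bit address space


modes = {
    "displacement only": effective_address(disp=0x0040),
    "register indirect": effective_address(base="bx"),
    "indexed": effective_address(index="si", disp=8),
    "based indexed": effective_address(base="bx", index="si"),
    "based indexed plus displacement": effective_address(base="bx", index="si", disp=8),
}

for name, ea in modes.items():
    print(f"{name:32s} EA={ea:#06x} value={memory[ea]}")
```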
RISC
RISC is a type of microprocessor architecture that utilizes a small, highly
optimized set of instructions, rather than a more specialized set of
instructions found in other types of architectures.
RISC represents a CPU design to make instructions execute very quickly.
Well-known RISC families include Alpha, ARC, ARM, AVR, MIPS, PA-RISC,
Power Architecture (including PowerPC), SuperH, and SPARC.
CHARACTERISTICS OF RISC
A RISC chip will typically have far fewer transistors dedicated to the core
logic, which originally allowed designers to increase the size of the
register set and increase internal parallelism.
Few data types in hardware; by contrast, some CISCs have byte-string instructions.
History
The first RISC projects came from IBM, Stanford, and UC-Berkeley in the late
70s and early 80s.
The IBM 801, Stanford MIPS, and Berkeley RISC 1 and 2 were all designed with
a similar philosophy which has become known as RISC.
Pipelining:
CISC Vs RISC
CISC | RISC
Emphasis on hardware | Emphasis on software
Memory-to-memory: "LOAD" and "STORE" incorporated in instructions | Register to register: "LOAD" and "STORE" are independent instructions
The CISC Approach
Instruction:
MULT 2:3, 5:2
The RISC Approach
Instructions:
LW A, 2:3
LW B, 5:2
MULT A, B
SW 2:3, A
Operations:
1. Load operand1 into register A
2. Load operand2 into register B
3. Multiply the operands in the execution unit and store result in A
4. Store value of A back to memory location 2:3
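The difference between the single memory-to-memory MULT and the LW/LW/MULT/SW sequence can be sketched with a toy interpreter. This is only an illustration: the dictionary-based memory, the operand values 6 and 7, and the function names are assumptions, not a model of any real processor.

```python
def run_cisc(memory):
    """MULT 2:3, 5:2 as one memory-to-memory instruction."""
    memory["2:3"] = memory["2:3"] * memory["5:2"]
    return 1  # number of instructions issued


def run_risc(memory):
    """The same work as separate load, multiply, and store instructions."""
    registers = {}
    program = [
        ("LW", "A", "2:3"),    # load operand1 into register A
        ("LW", "B", "5:2"),    # load operand2 into register B
        ("MULT", "A", "B"),    # multiply, result kept in A
        ("SW", "2:3", "A"),    # store A back to memory location 2:3
    ]
    for op, first, second in program:
        if op == "LW":
            registers[first] = memory[second]
        elif op == "MULT":
            registers[first] = registers[first] * registers[second]
        elif op == "SW":
            memory[first] = registers[second]
    return len(program)


cisc_memory = {"2:3": 6, "5:2": 7}   # hypothetical operand values
risc_memory = dict(cisc_memory)
cisc_count = run_cisc(cisc_memory)
risc_count = run_risc(risc_memory)
print(cisc_count, cisc_memory["2:3"], risc_count, risc_memory["2:3"])
```

Both versions leave the same product in location 2:3. The CISC version issues one instruction and the RISC version issues four simpler ones.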
HARVARD ARCHITECTURE
The term originated from the Harvard Mark I relay-based computer, which
stored instructions on punched tape (24 bits wide) and data in electromechanical counters. These early machines had limited data storage,
entirely contained within the central processing unit, and provided no
access to the instruction storage as data.
In a von Neumann architecture, by contrast, an instruction fetch and a
data operation cannot occur at the same time since the instructions and
data use the same bus system.
Also, a Harvard architecture machine has distinct code and data address
spaces: instruction address zero is not the same as data address zero.
Instruction address zero might identify a twenty-four bit value, while data
address zero might indicate an eight bit byte that isn't part of that
twenty-four bit value.
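The idea of distinct code and data address spaces can be sketched as two independent arrays with different word widths. The class name, memory sizes, and stored values below are hypothetical and serve only to show that address zero means something different in each space.

```python
class HarvardMachine:
    """Toy model: separate instruction and data memories."""

    def __init__(self, size=16):
        self.instruction_memory = [0] * size  # 24-bit instruction words
        self.data_memory = [0] * size         # 8-bit data bytes

    def write_instruction(self, address, word):
        self.instruction_memory[address] = word & 0xFFFFFF

    def write_data(self, address, byte):
        self.data_memory[address] = byte & 0xFF


machine = HarvardMachine()
machine.write_instruction(0, 0xABCDEF)  # instruction address zero
machine.write_data(0, 0x12)             # data address zero, unrelated storage
print(hex(machine.instruction_memory[0]), hex(machine.data_memory[0]))
```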
Soft processors
Most systems, if they use a soft processor at all, only use a single soft
processor. However, a few designers tile as many soft cores onto an
FPGA as will fit.
IBM's PowerPC
PPC405Fx Features
The PPC405Fx core provides high performance and low power
consumption.
The PPC405Fx RISC CPU executes at sustained speeds
approaching one cycle per instruction.
On-chip instruction and data cache arrays can be implemented to
reduce chip count and design complexity in systems and improve
system throughput.
The PowerPC RISC fixed-point CPU features:
Storage control:
Memory Management
Translation of the 4GB logical address space into physical addresses
Independent enabling of instruction and data translation/protection
Page level access control using the translation mechanism
Software control of page replacement strategy
WIU0GE (write-through, cachability, compressed user-defined 0,
guarded, endian) storage attribute control for each virtual memory
region
WIU0GE storage attribute control for thirty-two real 128MB regions in
real mode
Support for OCM that provides memory access performance identical
to cache hits
Full PowerPC floating-point unit (FPU) support using the auxiliary
processor unit (APU) interface
(the PPC405Fx does not include an FPU)
Debug Support
PowerPC Architecture
PowerPC User Instruction Set Architecture (UISA), including the base user-level
instruction set, user level registers, programming model, data types, and
addressing modes.
The instruction cache unit (ICU) and data cache unit (DCU) enable
concurrent accesses and minimize pipeline stalls.
The storage capacity of the cache units, which can range from 0KB to 32KB,
depends upon the implementation. Both cache units are two-way
set-associative and use a 32-byte line size.
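The cache geometry described above (two-way set-associative, 32-byte lines) determines how an address splits into tag, set index, and byte offset. The sketch below is a generic textbook model of that split. The 16KB cache size and the example address are assumptions, and the way the PPC405Fx hardware actually forms its tags may differ.

```python
LINE_SIZE = 32  # bytes per cache line (from the text)
WAYS = 2        # two-way set-associative (from the text)


def number_of_sets(cache_bytes):
    """Each set holds WAYS lines of LINE_SIZE bytes."""
    return cache_bytes // (LINE_SIZE * WAYS)


def split_address(address, cache_bytes):
    """Return (tag, set index, byte offset) for a generic set-associative cache."""
    sets = number_of_sets(cache_bytes)
    offset = address % LINE_SIZE
    index = (address // LINE_SIZE) % sets
    tag = address // (LINE_SIZE * sets)
    return tag, index, offset


cache_bytes = 16 * 1024  # hypothetical 16KB implementation
tag, index, offset = split_address(0x12345678, cache_bytes)
print(number_of_sets(cache_bytes), hex(tag), hex(index), hex(offset))
```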
The DCU uses a two-line flush queue to minimize pipeline stalls caused by
cache misses. Line flushes are postponed until after a line fill is completed.
Registers comprise the first position of the flush queue; the line buffer built
into the output of the array for manufacturing test serves as the second
position of the flush queue.
The 4GB address space of the PPC405Fx is presented as a flat address space.
The MMU provides address translation, protection functions, and storage attribute
control for embedded applications.
Working with appropriate system level software, the MMU provides the following
functions:
Storage attributes for cache policy and speculative memory access control
The MMU can be disabled under software control. If the MMU is not used, the
PPC405Fx core provides other storage control mechanisms.
Timer Facilities
Watchdog timer
The time base is a 64-bit counter incremented either by an internal signal equal
to the CPU clock rate or by a separate external timer clock signal.
The PIT is a 32-bit register that is decremented at the same rate as the time
base is incremented. The user loads the PIT register with a value to create the
desired delay. When a decrement occurs on a PIT count of 1, the timer stops
decrementing, a bit is set in the Timer Status Register (TSR), and a PIT interrupt
is generated. Optionally, the PIT can be programmed to reload automatically the
last value written to the PIT register, after which the PIT begins decrementing
again.
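The PIT behaviour described above can be sketched as a small state machine. This is a simplified software model, not the PPC405Fx register interface: the class and attribute names are hypothetical, and a single boolean stands in for the PIT bit in the TSR.

```python
class ProgrammableIntervalTimer:
    """Toy model of a decrementing timer with optional auto-reload."""

    def __init__(self, auto_reload=False):
        self.count = 0
        self.reload_value = 0
        self.auto_reload = auto_reload
        self.status_bit = False  # stands in for the PIT status bit in the TSR
        self.interrupts = 0

    def load(self, value):
        """Writing the PIT register sets the count and remembers the value."""
        self.count = self.reload_value = value & 0xFFFFFFFF  # 32-bit register

    def tick(self):
        """One time-base increment, that is, one PIT decrement."""
        if self.count == 0:
            return  # timer is stopped
        if self.count == 1:
            # A decrement on a count of 1 raises the interrupt.
            self.status_bit = True
            self.interrupts += 1
            self.count = self.reload_value if self.auto_reload else 0
        else:
            self.count -= 1


one_shot = ProgrammableIntervalTimer()
one_shot.load(3)
periodic = ProgrammableIntervalTimer(auto_reload=True)
periodic.load(3)
for _ in range(9):
    one_shot.tick()
    periodic.tick()
print(one_shot.interrupts, periodic.interrupts)
```

In this model a loaded value of 3 produces an interrupt on the third tick. The one-shot timer then stays stopped, while the auto-reload timer repeats every three ticks.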
PowerPC 7xx
The 7xx family is also widely used in embedded devices like printers,
routers, storage devices, spacecraft and video game consoles.
The 7xx family had its shortcomings, namely lack of SMP (Symmetric
multiprocessing) support and SIMD capabilities and a relatively weak
FPU (Floating-point unit).
Overview
Spartan-3 FPGA
Features
Low-cost, high-performance logic solution for high-volume,
consumer-oriented applications
SelectIO interface signaling
Up to 633 I/O pins
622+ Mb/s data transfer rate per I/O
DDR, DDR2 SDRAM support up to 333 Mb/s
Logic resources
logic cells with shift register capability
Wide, fast multiplexers
Dedicated 18 x 18 multipliers
JTAG logic compatible with IEEE 1149.1/1532
MicroBlaze and PicoBlaze processor, PCI, PCI Express PIPE Endpoint, and
other IP cores.
Architectural Overview
Configurable Logic Blocks (CLBs) contain flexible Look-Up Tables (LUTs) that
implement logic plus storage elements used as flip-flops or latches. CLBs perform a wide
variety of logical functions as well as store data.
Input/Output Blocks (IOBs) control the flow of data between the I/O pins and the
internal logic of the device. IOBs support bidirectional data flow plus 3-state operation.
They support a variety of signal standards, including several high-performance differential
standards. Double Data-Rate (DDR) registers are included.
Block RAM provides data storage in the form of 18-Kbit dual-port blocks.
Multiplier Blocks accept two 18-bit binary numbers as inputs and calculate the product.
The Spartan-3A DSP platform includes special DSP multiply-accumulate blocks.
Digital Clock Manager (DCM) Blocks provide self-calibrating, fully digital solutions for
distributing, delaying, multiplying, dividing, and phase-shifting clock signals.
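How a Look-Up Table implements logic can be sketched in software: the configuration holds a truth table, and the inputs simply select one bit of it. The example below assumes a 4-input LUT, and the helper names are hypothetical.

```python
def make_lut(function, inputs=4):
    """Precompute the truth table bits, as device configuration would load them."""
    table = 0
    for address in range(2 ** inputs):
        bits = [(address >> i) & 1 for i in range(inputs)]
        if function(*bits):
            table |= 1 << address
    return table


def lut_read(table, *bits):
    """The inputs form an address that selects one bit of the table."""
    address = sum(bit << i for i, bit in enumerate(bits))
    return (table >> address) & 1


# Configure the LUT as a 4-input XOR (parity) function.
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
print(hex(xor4), lut_read(xor4, 1, 0, 0, 0), lut_read(xor4, 1, 1, 0, 0))
```

Any other 4-input function is obtained by loading a different 16-bit table. The lookup logic itself does not change.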
IOB Overview
There are three main signal paths within the IOB: the output path, input path,
and 3-state path. Each path has its own pair of storage elements that can act as
either registers or latches. The three main signal paths are as follows:
The input path carries data from the pad, which is bonded to a package pin, through an
optional programmable delay element directly to the I line. The IOB outputs I, IQ1, and IQ2
all lead to the FPGA's internal logic.
The output path, starting with the O1 and O2 lines, carries data from the FPGA's
internal logic through a multiplexer and then a three-state driver to the IOB pad.
The 3-state path determines when the output driver is high impedance. The T1 and T2
lines carry data from the FPGA's internal logic through a multiplexer to the output
driver. The output driver is active-Low enabled.
All signal paths entering the IOB, including those associated with the storage elements,
have an inverter option.
There are three pairs of storage elements in each IOB, one pair for each
of the three paths.
It is possible to configure each of these storage elements as an
edge-triggered D-type flip-flop (FD) or a level-sensitive latch (LD).
The storage elements in the upper and lower portions of the slice are
called FFY and FFX, respectively.
The carry chain, together with various dedicated arithmetic logic gates,
supports fast and efficient implementations of math operations.
Five multiplexers control the chain: CYINIT, CY0F, and CYMUXF in the
lower portion as well as CY0G and CYMUXG in the upper portion.
PicoBlaze
It also provides support for the Virtex-5, Spartan-6, and Virtex-6 FPGA
families.
The PicoBlaze microcontroller provides cost-efficient
microcontroller-based control and simple data processing.
Furthermore, the PicoBlaze microcontroller is provided as a free,
source-level VHDL file with royalty-free re-use within Xilinx FPGAs.
The PicoBlaze microcontroller reduces system cost because it is a
single-chip solution, integrated within the FPGA and sometimes only occupying
leftover FPGA resources.
For example, a microcontroller cannot respond to events much faster than a few
microseconds. The FPGA logic can respond to multiple, simultaneous events in just a
few to tens of nanoseconds. Conversely, a microcontroller is cost-effective and simple
for performing format or protocol conversions.
FPGA Logic Vs PicoBlaze Microcontroller
Weaknesses of a microcontroller:
Executes sequentially
Performance degrades with increasing complexity
Program memory requirements increase with increasing complexity
Slower response to simultaneous inputs
PicoBlaze Microcontroller
General-Purpose Register
The PicoBlaze microcontroller includes 16 byte-wide general-purpose
registers, designated as registers s0 through sF. For better program
clarity, registers can be renamed using an assembler directive. All
register operations are completely interchangeable.
There is no dedicated accumulator; each result is computed in a
specified register.
Flags
Input/Output
During an INPUT operation, the PicoBlaze microcontroller reads data from the
IN_PORT port to a specified register, sX.
The Program Counter (PC) points to the next instruction to be executed. By default, the
PC automatically increments to the next instruction location when executing an
instruction.
Only the JUMP, CALL, RETURN instructions and the Interrupt and Reset Events
modify the default behavior. The PC cannot be directly modified by the application
code. The 10-bit PC supports a maximum code space of 1,024 instructions (000 to 3FF
hex). If the PC reaches the top of the memory at 3FF hex, it rolls over to location 000.
The default execution sequence of the program can be modified using conditional and
non-conditional program flow control instructions.
CALL and RETURN instructions provide subroutine facilities for commonly used
sections of code.
If the interrupt input is enabled, an Interrupt Event also preserves the address of the
preempted instruction on the CALL/RETURN stack while the PC is loaded with the
interrupt vector, 3FF hex.
CALL/RETURN Stack
The stack is implemented as a separate cyclic buffer. When the stack is full, it
overwrites the oldest value. No program memory is required for the stack.
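The program counter roll-over and the cyclic CALL/RETURN stack described above can be sketched as follows. This is a behavioural illustration only. The stack depth of 31, the class and method names, and the choice to store the address of the CALL itself (with RETURN resuming at the following instruction) are assumptions made for the example.

```python
PC_MASK = 0x3FF           # 10-bit program counter (from the text)
INTERRUPT_VECTOR = 0x3FF  # interrupt vector (from the text)
STACK_DEPTH = 31          # assumed depth of the cyclic stack


class FlowControl:
    def __init__(self):
        self.pc = 0
        self.stack = [0] * STACK_DEPTH
        self.pointer = 0

    def step(self):
        """Default behaviour: advance to the next location, wrapping at 3FF."""
        self.pc = (self.pc + 1) & PC_MASK

    def _push(self, address):
        # Cyclic buffer: when full, the oldest entry is silently overwritten.
        self.stack[self.pointer] = address
        self.pointer = (self.pointer + 1) % STACK_DEPTH

    def _pop(self):
        self.pointer = (self.pointer - 1) % STACK_DEPTH
        return self.stack[self.pointer]

    def call(self, target):
        self._push(self.pc)
        self.pc = target & PC_MASK

    def ret(self):
        """Resume at the instruction after the CALL."""
        self.pc = (self._pop() + 1) & PC_MASK

    def interrupt(self):
        """Save the preempted instruction address and jump to the vector."""
        self._push(self.pc)
        self.pc = INTERRUPT_VECTOR

    def return_from_interrupt(self):
        """Resume at the preempted instruction itself."""
        self.pc = self._pop()


cpu = FlowControl()
cpu.pc = 0x3FF
cpu.step()
print(hex(cpu.pc))
```

Because the stack is a separate buffer, none of these operations touch program memory.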
Interrupts
The PicoBlaze microcontroller responds to interrupts quickly in just five clock cycles.
Reset
PicoBlaze Architecture
MicroBlaze Processor
With few exceptions, the MicroBlaze can issue a new instruction every
cycle, maintaining single-cycle throughput under most circumstances.
Configurable aspects of the MicroBlaze include cache size and
embedded peripherals.
The performance-optimized version expands the execution pipeline to
5 stages, allowing top speeds of 210 MHz.
Also, key processor instructions which are rarely used but more expensive
to implement in hardware can be selectively added or removed.
MicroBlaze
Features
32-bit instruction word with three operands and two addressing modes
MicroBlaze Architecture
MicroBlaze
Instructions
All MicroBlaze instructions are 32 bits and are defined as either Type A or Type
B.
Type A instructions have up to two source register operands and one destination
register operand.
Type B instructions have one source register and a 16-bit immediate operand
(which can be extended to 32 bits by preceding the Type B instruction with an
imm instruction).
Type B instructions have a single destination register operand.
Instructions are provided in the following functional categories:
arithmetic,
logical,
branch,
load/store, and
special.
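The way a Type B immediate is widened to 32 bits can be sketched as below. Combining the 16 bits supplied by a preceding imm instruction with the 16-bit immediate follows the description above. Sign-extending the immediate when no imm prefix is present is an assumption of this sketch, and the function names are hypothetical.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def type_b_operand(imm16, imm_prefix=None):
    """Return the 32-bit operand used by a Type B instruction.

    imm_prefix is the 16-bit value supplied by a preceding imm instruction,
    or None when the Type B instruction stands alone.
    """
    if imm_prefix is None:
        # Assumed behaviour: the 16-bit immediate is sign-extended.
        return sign_extend_16(imm16) & 0xFFFFFFFF
    # The imm prefix supplies the upper half of the operand.
    return ((imm_prefix & 0xFFFF) << 16) | (imm16 & 0xFFFF)


print(hex(type_b_operand(0x0010)))
print(hex(type_b_operand(0xFFFF)))
print(hex(type_b_operand(0x5678, imm_prefix=0x1234)))
```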
MicroBlaze
Registers
MicroBlaze has an orthogonal instruction set architecture. It has
thirty-two 32-bit general purpose registers and up to eighteen 32-bit special
purpose registers, depending on configured options.
1. General Purpose Registers
The thirty-two 32-bit General Purpose Registers are numbered
R0 through R31. The register file is reset on bit stream download
(reset value is 0x00000000).
2. Special Purpose Registers
Program Counter (PC)
The Program Counter (PC) is the 32-bit address of the instruction
being executed.
When used with the MFS instruction the PC register is specified by
setting Sa = 0x0000.
Machine Status Register (MSR)
The Machine Status Register contains control and status bits for the
processor.
When reading the MSR, bit 29 is replicated in bit 0 as the carry copy.
When writing to the MSR, the Carry bit takes effect immediately
and the remaining bits take effect one clock cycle later.
The MSR is specified by setting Sx = 0x0001.
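The carry copy on MSR reads can be sketched with bit masks. The sketch assumes the bit-reversed numbering in which bit 0 is the most significant bit of the 32-bit register, so bit 29 corresponds to the mask 0x4 and bit 0 to 0x80000000. The function names are hypothetical.

```python
def msr_mask(bit):
    """Mask for a bit numbered with bit 0 as the most significant bit."""
    return 1 << (31 - bit)


CARRY = msr_mask(29)       # carry bit
CARRY_COPY = msr_mask(0)   # carry copy, visible on reads


def read_msr(stored_value):
    """Return the value seen by a read: bit 29 is replicated in bit 0."""
    value = stored_value & ~CARRY_COPY & 0xFFFFFFFF
    if stored_value & CARRY:
        value |= CARRY_COPY
    return value


print(hex(CARRY), hex(CARRY_COPY), hex(read_msr(CARRY)))
```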
A data TLB miss exception that specifies the (virtual) effective address accessed
An instruction TLB miss exception that specifies the (virtual) effective address read
Exception Status Register (ESR)
The Exception Status Register contains status bits for the processor.
The ESR is specified by setting Sa = 0x0005.
Branch Target Register (BTR)
The Branch Target Register only exists if the MicroBlaze processor is
configured to use exceptions.
The register stores the branch target address for all delay slot branch
instructions executed while MSR[EIP] = 0.
The BTR is specified by setting Sa = 0x000B.
Floating Point Status Register (FSR)
The Floating Point Status Register contains status bits for the floating
point unit.
The register is specified by setting Sa = 0x0007.
Exception Data Register (EDR)
The Exception Data Register stores data read on an FSL link that
caused an FSL exception.
Zone Protection Register (ZPR)
The Zone Protection Register is used to override MMU memory
protection defined in TLB entries.
MicroBlaze
Pipeline Architecture
MicroBlaze instruction execution is pipelined. For most instructions, each
stage takes one clock cycle to complete.
Consequently, the number of clock cycles necessary for a specific
instruction to complete is equal to the number of pipeline stages, and
one instruction is completed on every cycle.
A few instructions require multiple clock cycles in the execute stage to
complete.
When executing from slower memory, instruction fetches may take
multiple cycles.
MicroBlaze implements an instruction prefetch buffer that reduces the
impact of such multi-cycle instruction memory latency.
When the pipeline resumes execution, the fetch stage can load new
instructions directly from the prefetch buffer instead of waiting for the
instruction memory access to complete.
With the five-stage pipeline, the stages are Fetch (IF), Decode (OF),
Execute (EX), Access Memory (MEM), and Writeback (WB).
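The pipeline timing described above (each instruction takes as many cycles as there are stages, yet one instruction completes per cycle) can be sketched with a simple cycle count. The model below is idealized. Treating each extra execute-stage cycle as one added stall cycle is an assumption of this sketch, and the function name is hypothetical.

```python
def pipeline_cycles(instruction_count, stages, extra_execute_cycles=()):
    """Total cycles for an ideal pipeline, plus optional stall cycles.

    The first instruction needs `stages` cycles to pass through the pipeline.
    Each later instruction completes one cycle after the previous one.
    extra_execute_cycles lists additional execute-stage cycles, each assumed
    to stall the whole pipeline for one cycle.
    """
    if instruction_count == 0:
        return 0
    return stages + (instruction_count - 1) + sum(extra_execute_cycles)


print(pipeline_cycles(1, 5))          # latency of a single instruction
print(pipeline_cycles(10, 5))         # ten instructions, five stages
print(pipeline_cycles(10, 5, [2]))    # one instruction needs 2 extra cycles
```

For long instruction streams the count approaches one cycle per instruction, which matches the single-cycle throughput noted earlier.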
MicroBlaze
Memory Architecture
Both instruction and data interfaces of MicroBlaze are 32 bits wide and
use big endian, bit-reversed format.
Any questions?