Suppose that we are considering an enhancement that runs 10 times faster than the
original machine but is only usable 40% of the time. What is the overall speedup gained
by incorporating the enhancement?
Answer:
Fraction_enhanced = 0.4
Speedup_enhanced = 10
Speedup_overall = 1 / [(1 - 0.4) + (0.4 / 10)] = 1 / 0.64 ≈ 1.56
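The calculation above follows the relation commonly known as Amdahl's Law. A minimal sketch of it is shown below; the function name overall_speedup and its parameter names are chosen here for illustration and do not come from the slides.

```python
def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced.

    fraction_enhanced: share of the original run time that can use the enhancement (0..1)
    speedup_enhanced:  how many times faster the enhanced part runs
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    # Unenhanced share keeps its time; enhanced share shrinks by the speedup factor.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# The worked example: 10x faster, usable 40% of the time.
print(round(overall_speedup(0.4, 10), 2))  # 1.56
```

Note that even an infinitely fast enhancement could not push the overall speedup past 1 / 0.6 here, because 60% of the time is untouched.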
Locality of Reference
Programs tend to reuse data and instructions they have used recently.
A program may spend 90% of its execution time in only 10% of the code.
Based on a program's recent past, one can predict with reasonable
accuracy what instructions and data it will use in the near future.
Two Types of Locality
Temporal Locality
recently accessed items are likely to be accessed in the near future
Spatial Locality
items whose addresses (or location) are near one another tend to be
referenced close together in time
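To see why locality matters, the sketch below counts hits in a toy direct-mapped cache. The cache model, its sizes (8 lines of 4 addresses each) and the three access patterns are assumptions made for illustration; they are not part of the slides.

```python
def hit_rate(addresses, num_lines=8, block_size=4):
    """Fraction of accesses that hit in a toy direct-mapped cache (hypothetical sizes)."""
    cache = [None] * num_lines  # each entry remembers which block the line currently holds
    hits = 0
    for addr in addresses:
        block = addr // block_size   # neighbouring addresses share a block
        line = block % num_lines     # direct mapping: each block has one possible line
        if cache[line] == block:
            hits += 1
        else:
            cache[line] = block      # miss: load the block, evicting the old one
    return hits / len(addresses)


sequential = list(range(64))              # spatial locality: neighbours used together
repeated = [0, 1, 2, 3] * 16              # temporal locality: same items reused
scattered = [i * 32 for i in range(64)]   # neither kind of locality

print(hit_rate(sequential))  # 0.75
print(hit_rate(repeated))    # 0.984375
print(hit_rate(scattered))   # 0.0
```

The scattered pattern never reuses a block and always maps to the same line, so every access misses in this model.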
MIPS Benchmark
millions of instructions per second
• easy to understand and straightforward
• dependent on instruction set
• varies between programs on the same computer
• MIPS can vary inversely to performance!
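The last point can be illustrated with the usual definition, MIPS = instruction count / (execution time x 10^6). The two program versions and their numbers below are hypothetical, chosen only to show a higher MIPS rating paired with a longer run time.

```python
def mips(instruction_count: int, execution_time_s: float) -> float:
    """Millions of instructions per second for one program run."""
    return instruction_count / (execution_time_s * 1e6)


# Hypothetical: the same task compiled two ways on the same computer.
# Version A uses fewer, more complex instructions; version B uses many simple ones.
version_a = {"instructions": 10_000_000, "time_s": 2.0}
version_b = {"instructions": 30_000_000, "time_s": 3.0}

mips_a = mips(version_a["instructions"], version_a["time_s"])
mips_b = mips(version_b["instructions"], version_b["time_s"])

print(mips_a)  # 5.0
print(mips_b)  # 10.0
# B has the higher MIPS rating, yet A finishes the task sooner.
```

Execution time, not the MIPS rating, is what the user experiences, which is why the two can disagree.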
MFLOPS Benchmark
millions of floating-point operations per second (megaflops)
• intended to measure floating-point performance, but some programs don't use any floating-point operations
• the set of floating-point operations is not consistent across machines
• MFLOPS ratings for the same machine may differ depending on instruction mix
Programs as Evaluators
Four types (in decreasing order of accuracy):
Real programs
Kernels
Toy Benchmarks
Synthetic Benchmarks
Synthetic Benchmarks
programs which try to match the average number and frequency of operations of a
typical workload, e.g. dhrystone, whetstone, etc.
not real programs, may not reflect program behavior for factors not measured
• A standard set of programs is hard to obtain because each program runs differently
on each machine, and companies want to use programs that run fast on
their machines.
What is VLSI
-Very Large Scale Integration
refers to a technology through which it is possible to implement large circuits
consisting of up to or more than a million transistors on semiconductor
wafers, primarily silicon.
-VLSI technology is used in
Microprocessors
digital signal processors (DSPs)
systolic arrays
large capacity memories
memory controllers
I/O controllers
interconnection networks.
Von Neumann Machines
Three basic hardware subsystems
(CPU, memory and I/O)
Stored-program computer
Sequential operation
Single path between main memory and the control unit of the CPU, i.e. the Von
Neumann bottleneck
Harvard Architecture
class of Von Neumann architectures that provide independent pathways for data
addresses, data, instruction addresses and instructions
allows the CPU to access instructions and data simultaneously
Registers
A component used for data storage
(can be read from or written to).
High speed memory locations used to store important information during CPU
operations.
Program Counter (PC)
A register which holds the address of the next instruction to be executed.
Instruction Register (IR)
Holds a copy of the currently executing instruction.
Status/Flags Register
Holds data about system events/states.
Memory Address Register (MAR)
Holds the memory address of the data to be read from OR written to memory.
Connected to the address bus.
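Memory Data Register (MDR)
Holds the data to be written to OR just read from memory.
Connected to the data bus.
General Purpose Registers
Programmable registers for various uses (“scratch pad”).
Number varies between processors.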
Decoder
Also known as the Instruction Decoder.
Translates instructions from the machine level code to the sequence of digital
control signals which carry out the instruction.
Control Unit (CU)
Issues each control signal in the sequence determined by the decoder.
Connected to the control bus.
Writing to Memory
<address> to MAR
<data> to MDR
Control Unit issues WRITE signal to Memory
• Memory responds by reading data on the data bus and storing it into the
location specified by the information on the address bus.
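The write sequence above can be sketched as a small model. The class ToyMachine, its 256-location memory and the method names are hypothetical; the buses are represented simply by the MAR and MDR values that drive them.

```python
class ToyMachine:
    """Toy model of a CPU writing to memory through MAR and MDR (illustrative only)."""

    def __init__(self, size=256):
        self.memory = [0] * size  # assumed memory size, not from the slides
        self.mar = 0              # Memory Address Register, drives the address bus
        self.mdr = 0              # Memory Data Register, drives the data bus

    def write(self, address, data):
        self.mar = address        # <address> to MAR
        self.mdr = data           # <data> to MDR
        self._write_signal()      # Control Unit issues WRITE signal to Memory

    def _write_signal(self):
        # Memory reads the data bus and stores the value at the location
        # given by the address bus.
        self.memory[self.mar] = self.mdr


machine = ToyMachine()
machine.write(0x20, 9)
print(machine.memory[0x20])  # 9
```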
ALU Operation
<operand1> to ALU
<operand2> to ALU
Control Unit issues <ALU operation> signal to the ALU
result from ALU to <anywhere in CPU>
How Instructions Are Processed: An Illustration
A program called SMPLADD.EXE was executed by a user. The operating
system searches for the file in the hard disk and places the program in main
memory. The program contains the following code:
Loc Instruction
100 mov (010A), R0
103 add (010B), R0
106 mov R0, 0200
109 end
10A 09
10B 03
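A rough simulation of this program is sketched below. It rests on several assumptions not stated in the slides: locations and values are hexadecimal, (010A) means "the contents of memory location 010A", the second operand is the destination, each instruction occupies 3 locations, and mov R0, 0200 stores R0 into location 0200. Instructions are held as symbolic tuples (load, add, store, end) rather than real machine code.

```python
def run(program, data, start=0x100, instruction_size=3):
    """Fetch, decode and execute symbolic instructions until 'end' is reached."""
    memory = dict(data)          # data memory: address -> value
    registers = {"R0": 0}
    pc = start                   # program counter
    while True:
        op, *args = program[pc]  # fetch the instruction at PC and decode it
        if op == "end":
            break
        if op == "load":         # mov (addr), Rn : memory -> register
            addr, reg = args
            registers[reg] = memory[addr]
        elif op == "add":        # add (addr), Rn : register += memory
            addr, reg = args
            registers[reg] += memory[addr]
        elif op == "store":      # mov Rn, addr : register -> memory
            reg, addr = args
            memory[addr] = registers[reg]
        else:
            raise ValueError(f"unknown operation: {op}")
        pc += instruction_size   # advance to the next instruction
    return registers, memory


program = {
    0x100: ("load", 0x10A, "R0"),
    0x103: ("add", 0x10B, "R0"),
    0x106: ("store", "R0", 0x200),
    0x109: ("end",),
}
data = {0x10A: 0x09, 0x10B: 0x03}

registers, memory = run(program, data)
print(hex(registers["R0"]))   # 0xc
print(hex(memory[0x200]))     # 0xc
```

Under these assumptions the program adds 09 and 03 and leaves the sum, 0C, in location 0200.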
What is a WORD?
CPU: the smallest unit of data that can be processed at a time, e.g. size of ALU
operands or data in the registers.
Memory: the smallest unit of addressable data.
Bus: the smallest unit of data that can be sent through the bus at a time.
Word Length
Size of a word specified in bits.
Possible benefits of a large word length:
• faster speed
(more data and/or instructions can be fetched at a time)
• greater numeric precision
• more powerful instructions
(Instructions can have more operands and modes of operation.)
Recap: Instruction Cycle
• Fetch - fetch the instruction from the memory
• Decode - decode the instruction
• Operand Fetch - get the necessary operands
• Execute - execute the instruction
• Store - store the result to the appropriate location