
CPSC 321 Computer Architecture

Fall 2006
Lecture 1 Introduction and Five Components of a Computer

Adapted from CS 152 Spring 2002 UC Berkeley Copyright (C) 2001 UCB

Course Instructor
Rabi Mahapatra
E-mail: rabi@cs.tamu.edu
Sections 501-503: MWF 12:40-1:30 PM
Office: 520B, HRBB, tel: 845-5787
Office Hours: After the Class

TA Information
Suman K Mandal
Email:
Office:
Office Hours:

Lei Wu
Phone:
E-mail: leiwu@tamu.edu
Office: 526, HRBB, tel: 571-2640
Office Hour: TBD

Course Information [contd]


Grading: Projects, Assignments, Exams
  Assignments 20%
  Mid Term 25%
  Finals 25%
  Projects 30%
Labs
  MIPS (Assembly Programming), Verilog (HDL)

Projects
  Project 1: MIPS
  Projects 2 & 3: Verilog (Datapath Design)

Course Information [contd]


Book (Required)
Computer Organization and Design: The Hardware/Software Interface, Third Edition, David A. Patterson and John L. Hennessy, Morgan Kaufmann Publishers. Do not get the second edition.

REFERENCES:
  Digital Design, M. Morris Mano, 3rd Edition, Prentice Hall
  The Verilog Hardware Description Language, Thomas & Moorby, 5th Edition, Kluwer Academic Publishers
  Check the course webpage for other materials and links

Course Information [contd]


Course Webpage
http://courses.cs.tamu.edu/rabi/cpsc321/

CS Accounts
Use your CS account to turn in work and to check any email regarding the course

Course Overview
Topics: Computer Arithmetic; Single/multicycle Datapaths
[Figure: multiplier datapath with multiplicand register, 32=>34 sign extension, 34x2 MUX, 34-bit ALU (Sub/Add), HI and LO registers (16x2 bits each), Booth encoder (ENC[2:0]), and control logic]

Course Overview [contd]


Topics: Performance; Pipelining; Memory Systems
[Figure: overlapped instruction execution through the pipeline stages IFetch, Dcd, Exec, Mem, WB]

What's In It For Me?
In-depth understanding of the inner workings of modern computers, their evolution, and the tradeoffs present at the hardware/software boundary.
  Insight into fast/slow operations that are easy/hard to implement in hardware

Experience with the design process in the context of a large, complex (hardware) design.
  Functional Spec --> Control & Datapath --> Physical implementation
  Modern CAD tools

Computer Architecture - Definition


Computer Architecture = ISA + MO
Instruction Set Architecture (ISA)
  What the executable can see as the underlying hardware
  Logical View

Machine Organization (MO)
  How the hardware implements the ISA
  Physical View

Computer Architecture Changing Definition


1950s to 1960s: Computer Architecture Course:
  Computer Arithmetic
1970s to mid 1980s: Computer Architecture Course:
  Instruction Set Design, especially ISA appropriate for compilers
1990s: Computer Architecture Course:
  Design of CPU, memory system, I/O system, Multiprocessors, Networks
2000s: Computer Architecture Course:
  Non Von-Neumann architectures, Reconfiguration
  DNA Computing, Quantum Computing ????

Some Examples
Architecture                                              Years       Note
Digital Alpha (v1, v3)                                    1992-97     RIP soon
HP PA-RISC (v1.1, v2.0)                                   1986-96     RIP soon
Sun SPARC (v8, v9)                                        1987-95
SGI MIPS (MIPS I, II, III, IV, V)                         1986-96
IA-16/32 (8086, 286, 386, 486, Pentium, MMX, SSE, ...)    1978-1999
IA-64 (Itanium)                                           1996-now
AMD64/EMT64                                               2002-now
IBM POWER (PowerPC, ...)                                  1990-now

Many dead processor architectures live on in microcontrollers

The MIPS R3000 ISA (Summary)


Instruction Categories
  Load/Store
  Computational
  Jump and Branch
  Floating Point (coprocessor)
  Memory Management
  Special

Registers
  R0 - R31
  PC
  HI, LO

3 Instruction Formats: all 32 bits wide
  OP | rs | rt | rd | sa | funct
  OP | rs | rt | immediate
  OP | jump target
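The three formats can be sketched as bit-packing functions. This is a minimal illustration, not part of the original slides: it assumes the standard MIPS field widths (6-bit OP, 5-bit rs/rt/rd/sa, 6-bit funct, 16-bit immediate, 26-bit jump target), and the function names `encode_r`, `encode_i`, and `encode_j` are hypothetical.

```python
# Sketch: packing the three 32-bit MIPS instruction formats.
# Field widths assumed: OP=6, rs=rt=rd=sa=5, funct=6, immediate=16, target=26.

def encode_r(op, rs, rt, rd, sa, funct):
    """R-format: OP | rs | rt | rd | sa | funct."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct

def encode_i(op, rs, rt, imm):
    """I-format: OP | rs | rt | 16-bit immediate (masked to 16 bits)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def encode_j(op, target):
    """J-format: OP | 26-bit jump target (masked to 26 bits)."""
    return (op << 26) | (target & 0x3FFFFFF)

# lw $15, 0($2): opcode 0x23, base register rs=2, destination rt=15, offset 0
lw_word = encode_i(0x23, 2, 15, 0)
# add $3, $1, $2: opcode 0, funct 0x20
add_word = encode_r(0, 1, 2, 3, 0, 0x20)

print(f"{lw_word:032b}")
print(f"{add_word:08x}")
```

Every instruction fits in one 32-bit word, so the fetch hardware never has to work out instruction length before decoding.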

What is Computer Architecture ?


[Figure, labeled CPSC 321: levels of abstraction, top to bottom]
  Application
  Operating System
  Compiler, Firmware
  Instruction Set Architecture
  Instr. Set Proc., I/O system
  Datapath & Control
  Digital Design
  Circuit Design
  Layout

Coordination of many levels of abstraction
Under a rapidly changing set of forces
Design, Measurement, and Evaluation

Impact of changing ISA


In the early 1990s Apple switched the instruction set architecture of the Macintosh
  From Motorola 68000-based machines
  To the PowerPC architecture

Intel 80x86 Family: many implementations of the same architecture
  A program written in 1978 for the 8086 can be run on the latest Pentium chip

Factors affecting ISA ???


[Figure: forces acting on Computer Architecture]
  Technology
  Applications
  Programming Languages
  Operating Systems
  History
  Cleverness

ISA: Critical Interface


[Figure: the instruction set is the interface between software (above) and hardware (below)]

Examples: 80x86 50,000,000 vs. MIPS 5500,000 ???

The Big Picture


[Figure: Processor (Control and Datapath), Memory, Input, Output]

Since 1946 all computers have had 5 components!!!

Example Organization
TI SuperSPARC(TM) TMS390Z50 in Sun SPARCstation20
[Figure: block diagram. Components shown: MBus Module with SuperSPARC (Floating-point Unit, Integer Unit, Inst Cache, Data Cache, Ref MMU, Store Buffer, Bus Interface), L2 $, CC; MBus; L64852 MBus control; M-S Adapter; DRAM Controller; SBus with DMA, SCSI, Ethernet, STDIO (serial, kbd, mouse, audio, RTC, Floppy), and SBus Cards]

Technology Trends
Processor
  logic capacity: about 30% per year
  clock rate: about 20% per year
Memory
  DRAM capacity: about 60% per year (4x every 3 years)
  Memory speed: about 10% per year
  Cost per bit: improves about 25% per year
Disk
  capacity: about 60% per year
  Total use of data: 100% per 9 months!
Network Bandwidth
  Bandwidth increasing more than 100% per year!
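The "60% per year (4x every 3 years)" figure is compound growth. The sketch below checks that arithmetic and derives the doubling time implied by a yearly rate; the function names are hypothetical, and the rates are the approximate ones quoted on the slide.

```python
import math

def growth_factor(rate_per_year, years):
    """Total multiplicative growth after compounding a yearly rate."""
    return (1.0 + rate_per_year) ** years

def doubling_time_years(rate_per_year):
    """Years needed for a quantity growing at rate_per_year to double."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)

# DRAM capacity at about 60% per year: roughly 4x over 3 years
print(round(growth_factor(0.60, 3), 2))
# Doubling time at 60% per year, in months
print(round(12 * doubling_time_years(0.60), 1))
```

At 60% per year the doubling time comes out slightly under 18 months, while memory speed at 10% per year takes more than seven years to double. That mismatch is the reason memory systems receive so much attention later in the course.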

Technology Trends
DRAM chip capacity
  Year   DRAM Size
  1980   64 Kb
  1983   256 Kb
  1986   1 Mb
  1989   4 Mb
  1992   16 Mb
  1996   64 Mb
  1999   256 Mb
  2002   1 Gb

Microprocessor Logic Density
[Figure: transistors per chip (log scale, 1,000 to 100,000,000) versus year (1965 to 2005). Processors plotted: i4004, i8086, i80286, i80386, i80486, Pentium, SU MIPS, R3010, R4400, R10000. Families: i80x86, M68K, MIPS, Alpha]

In ~1985 the single-chip processor (32-bit) and the single-board computer emerged
In the 2002+ timeframe, these may well look like mainframes compared to a single-chip computer (maybe 2 chips)

Technology Trends

Smaller feature sizes => higher speed, density

ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)

Technology Trends

Number of transistors doubles every 18 months (amended to 24 months)



Levels of Representation
High Level Language Program
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;

  Compiler

Assembly Language Program
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)

  Assembler

Machine Language Program
    0000 1010 1100 0101 1001 1111 0110 1000
    1100 0101 1010 0000 0110 1000 1111 1001
    1010 0000 0101 1100 1111 1001 1000 0110
    0101 1100 0000 1010 1000 0110 1001 1111

  Machine Interpretation

Control Signal Specification
    ALUOP[0:3] <= InstReg[9:11] & MASK
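To see that the two loads and two stores on this slide really implement the high-level swap, the sequence can be traced by hand. The sketch below does so with a Python dict standing in for word-addressed memory and another for the register file. The function name, the base address 100, and the stored values are all made up for illustration.

```python
def swap_via_loads_and_stores(memory, base):
    """Trace: lw $15,0($2); lw $16,4($2); sw $16,0($2); sw $15,4($2)."""
    regs = {2: base}                    # $2 holds the byte address of v[k]
    regs[15] = memory[regs[2] + 0]      # lw $15, 0($2)  -> $15 = v[k]
    regs[16] = memory[regs[2] + 4]      # lw $16, 4($2)  -> $16 = v[k+1]
    memory[regs[2] + 0] = regs[16]      # sw $16, 0($2)  -> v[k]   = old v[k+1]
    memory[regs[2] + 4] = regs[15]      # sw $15, 4($2)  -> v[k+1] = old v[k]
    return memory

# v[k] lives at byte address 100, v[k+1] four bytes later (32-bit words)
mem = {100: 11, 104: 22}
swap_via_loads_and_stores(mem, 100)
print(mem)
```

The offsets 0 and 4 reflect 4-byte words, and register $15 plays the role of `temp` in the high-level version.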

Execution Cycle
Instruction Fetch: Obtain instruction from program storage
Instruction Decode: Determine required actions and instruction size
Operand Fetch: Locate and obtain operand data
Execute: Compute result value or status
Result Store: Deposit results in storage for later use
Next Instruction: Determine successor instruction
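The six steps of the execution cycle map directly onto an interpreter loop. The sketch below uses a made-up toy machine, not MIPS: instructions are `(op, rd, rs, rt)` tuples, registers are a dict, and only `add`, `sub`, and `halt` exist. It is meant only to show where each step sits in the loop.

```python
def run(program, regs):
    """Execute a list of (op, rd, rs, rt) tuples on a register dict."""
    pc = 0
    while pc < len(program):
        instr = program[pc]              # 1. instruction fetch
        op, rd, rs, rt = instr           # 2. instruction decode
        if op == "halt":
            break
        a, b = regs[rs], regs[rt]        # 3. operand fetch
        if op == "add":                  # 4. execute
            result = a + b
        elif op == "sub":
            result = a - b
        else:
            raise ValueError(f"unknown opcode: {op}")
        regs[rd] = result                # 5. result store
        pc += 1                          # 6. next instruction (sequential)
    return regs

program = [
    ("add", "r3", "r1", "r2"),
    ("sub", "r4", "r3", "r1"),
    ("halt", None, None, None),
]
final = run(program, {"r1": 5, "r2": 7, "r3": 0, "r4": 0})
print(final)
```

In this toy machine the successor is always the next instruction. A branch or jump would change only step 6, by writing a different value into `pc`.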

The Role of Performance

Example of Performance Measure

Performance Metrics
Response Time
Delay between the start and end time of a task

Throughput
Number of tasks completed per unit of time

New: Power/Energy
Energy per task, power

Examples (Throughput/Performance)
Replace the processor with a faster version?
3.8 GHz instead of 3.2 GHz

Add an additional processor to a system?


Core Duo instead of P4

Measuring Performance
Wall-clock time (or total execution time)
CPU Time
  User Time
  System Time

Try using the time command on a UNIX system
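The same three quantities that `time` reports (real, user, sys) can be observed from inside a program. The sketch below uses Python's `time.perf_counter` for wall-clock time and `os.times` for user and system CPU time. The helper name `measure` and the workload are made up for illustration.

```python
import os
import time

def measure(fn):
    """Run fn() once and return wall-clock, user CPU, and system CPU seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    fn()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return {
        "wall": wall_after - wall_before,
        "user": cpu_after.user - cpu_before.user,
        "system": cpu_after.system - cpu_before.system,
    }

def busy_work():
    # CPU-bound loop: almost all of its time should show up as user time
    total = 0
    for i in range(200_000):
        total += i * i
    return total

print(measure(busy_work))
```

Replacing `busy_work` with something that sleeps or waits on I/O makes wall-clock time grow while user and system time stay near zero, which is the distinction the slide draws.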

Relating the Metrics


Performance = 1 / Execution Time
CPU Execution Time = CPU clock cycles for program x Clock cycle time
CPU clock cycles = Instructions for a program x Average clock cycles per instruction
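A short worked example makes the chain of relations concrete. The numbers below (one billion instructions, an average of 2 cycles per instruction, a 1 GHz clock) are invented for illustration, and the function names are hypothetical.

```python
def cpu_clock_cycles(instruction_count, avg_cpi):
    """CPU clock cycles = instructions x average clock cycles per instruction."""
    return instruction_count * avg_cpi

def cpu_time_seconds(instruction_count, avg_cpi, clock_rate_hz):
    """CPU time = clock cycles x clock cycle time (cycle time = 1 / clock rate)."""
    return cpu_clock_cycles(instruction_count, avg_cpi) / clock_rate_hz

def performance(execution_time_seconds):
    """Performance = 1 / execution time."""
    return 1.0 / execution_time_seconds

t = cpu_time_seconds(1e9, 2.0, 1e9)   # 1e9 instructions, CPI 2, 1 GHz clock
print(t, performance(t))
```

Halving the CPI or doubling the clock rate each halve CPU time in this model. Real designs rarely allow one factor to change without affecting the others.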

Amdahl's Law
Pitfall: Expecting the improvement of one aspect of a machine to increase performance by an amount proportional to the size of improvement

Amdahl's Law [contd]


A program runs in 100 seconds on a machine, with multiply operations responsible for 80 seconds of this time. How much do I have to improve the speed of multiplication if I want my program to run five times faster?
Execution time after improvement = (exec time affected by improvement / amount of improvement) + exec time unaffected
  exec time after improvement = (80 seconds / n) + (100 - 80) seconds

We want performance to be 5 times faster, i.e. an execution time of 20 seconds:
  20 seconds = 80 seconds / n + 20 seconds
  0 = 80 seconds / n !!!!
No finite n satisfies this: the 20 seconds not spent on multiplication already use up the entire target time.
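Plugging numbers into the formula shows the limit directly. The sketch below reuses the slide's 80-second and 20-second split. The function names and the sample improvement factors are chosen for illustration.

```python
def time_after_improvement(affected, unaffected, n):
    """Amdahl's Law: execution time when the affected part runs n times faster."""
    return affected / n + unaffected

def overall_speedup(affected, unaffected, n):
    """Ratio of the original execution time to the improved execution time."""
    return (affected + unaffected) / time_after_improvement(affected, unaffected, n)

# Multiply takes 80 s of a 100 s run; try ever larger multiply speedups.
for n in (2, 10, 100, 1_000_000):
    t = time_after_improvement(80.0, 20.0, n)
    print(n, t, overall_speedup(80.0, 20.0, n))
```

However large n becomes, the execution time stays above 20 seconds, so the overall speedup approaches 5 without ever reaching it.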

Amdahl's Law [contd]


Opportunity for improvement is affected by how much time the event consumes
Make the common case fast
Very high speedup requires making nearly every case fast
Focus on overall performance, not one aspect

Summary
Computer Architecture = Instruction Set Architecture + Machine Organization
All computers consist of five components
  Processor: (1) datapath and (2) control
  (3) Memory
  (4) Input devices and (5) Output devices
Not all memory is created equal
  Cache: fast (expensive) memory is placed closer to the processor
  Main memory: less expensive memory, so we can have more
Interfaces are where the problems are: between functional units and between the computer and the outside world
Need to design against constraints of performance, power, area and cost

Summary
Performance is in the "eye of the beholder"
  Seconds/program = (Instructions/program) x (Clock cycles/instruction) x (Seconds/clock cycle)

Amdahl's Law: "Make the Common Case Faster"
